%0 Journal Article %J PLoS ONE %D 2020 %T Multimodal mental health analysis in social media %A Amir Hossein Yazdavar %A Mohammad Saeid Mahdavinejad %A Goonmeet Bajaj %A William Romine %A Amit Sheth %A Amir Hassan Monadjemi %A Krishnaprasad Thirunarayan %A John M. Meddar %A Annie Myers %A Jyotishman Pathak %A Pascal Hitzler %K Explainable Machine Learning %K Hypothesis Testing %K Natural Language Processing %K Prediction %K Regression %X

Depression is a major public health concern in the U.S. and globally. While successful early identification and treatment can lead to many positive health and behavioral outcomes, depression remains undiagnosed, untreated, or undertreated for several reasons, including denial of the illness as well as cultural and social stigma. With the ubiquity of social media platforms, millions of people are now sharing their online persona by expressing their thoughts, moods, emotions, and even their daily struggles with mental health on social media. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of depressive symptoms from tweets obtained unobtrusively. In particular, we examine and exploit multimodal big (social) data to discern depressive behaviors using a wide variety of features, including individual-level demographics. By developing a multimodal framework and employing statistical techniques to fuse heterogeneous sets of features obtained through the processing of visual, textual, and user interaction data, we significantly enhance the current state-of-the-art approaches for identifying depressed individuals on Twitter (improving the average F1-Score by 5 percent) as well as facilitate demographic inferences from social media. Besides providing insights into the relationship between demographics and mental health, our research assists in the design of a new breed of demographic-aware health interventions.

%B PLoS ONE %G eng %U https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0226248&type=printable %0 Conference Paper %B IEEE, ICHI %D 2018 %T Mental Health Analysis Via Social Media Data, IEEE ICHI 2018 %A Amir Hossein Yazdavar %A Mohammad Saeid Mahdavinejad %A Goonmeet Bajaj %A Krishnaprasad Thirunarayan %A Jyotishman Pathak %A Amit Sheth %B IEEE, ICHI %G eng %0 Generic %D 2017 %T Challenges of Sentiment Analysis for Dynamic Events %A Monireh Ebrahimi %A Amir Hossein Yazdavar %A Amit Sheth %X

Efforts to assess people's sentiments on Twitter have suggested that Twitter could be a valuable resource for studying political sentiment and that it reflects the offline political landscape. Many opinion mining systems and tools provide users with people's attitudes toward products, people, or topics and their attributes/aspects. However, although it may appear simple, using sentiment analysis to predict election results is difficult, since it is empirically challenging to train a successful model to conduct sentiment analysis on tweet streams for a dynamic event such as an election. This article highlights some of the challenges related to sentiment analysis encountered during monitoring of the presidential election using Kno.e.sis's Twitris system.

%G eng %0 Conference Paper %B IJCAI %D 2017 %T Relatedness-based Multi-Entity Summarization %A Kalpa Gunaratna %A Amir Hossein Yazdavar %A Krishnaprasad Thirunarayan %A Amit Sheth %A Gong Cheng %X

Representing world knowledge in a machine processable format is important as entities and their descriptions have fueled tremendous growth in knowledge-rich information processing platforms, services, and systems. Prominent applications of knowledge graphs include search engines (e.g., Google Search and Microsoft Bing), email clients (e.g., Gmail), and intelligent personal assistants (e.g., Google Now, Amazon Echo, and Apple’s Siri). In this paper, we present an approach that can summarize facts about a collection of entities by analyzing their relatedness in preference to summarizing each entity in isolation. Specifically, we generate informative entity summaries by selecting: (i) inter-entity facts that are similar and (ii) intra-entity facts that are important and diverse. We employ a constrained knapsack problem solving approach to efficiently compute entity summaries. We perform both qualitative and quantitative experiments and demonstrate that our approach yields promising results compared to two other stand-alone state-of-the-art entity summarization approaches.

%B IJCAI %G eng %0 Conference Paper %B ASONAM %D 2017 %T Semi-Supervised Approach to Monitoring Clinical Depressive Symptoms in Social Media %A Amir Hossein Yazdavar %E Hussein S. Al-Olimat %E Monireh Ebrahimi %E Goonmeet Bajaj %E Tanvi Banerjee %E Krishnaprasad Thirunarayan %E Jyotishman Pathak %E Amit Sheth %X

With the rise of social media, millions of people are routinely expressing their moods, feelings, and daily struggles with mental health issues on social media platforms like Twitter. Unlike traditional observational cohort studies conducted through questionnaires and self-reported surveys, we explore the reliable detection of clinical depression from tweets obtained unobtrusively. Based on the analysis of tweets crawled from users with self-reported depressive symptoms in their Twitter profiles, we demonstrate the potential for detecting clinical depression symptoms which emulate the PHQ-9 questionnaire clinicians use today. Our study uses a semi-supervised statistical model to evaluate how the duration of these symptoms and their expression on Twitter (in terms of word usage patterns and topical preferences) align with the medical findings reported via the PHQ-9. Our proactive and automatic screening tool is able to identify clinical depressive symptoms with an accuracy of 68% and precision of 72%.

%B ASONAM %G eng %U https://dl.acm.org/doi/abs/10.1145/3110025.3123028 %0 Conference Paper %B 2013 IEEE/WIC/ACM International Conferences on Web Intelligence, WI 2013 %D 2013 %T Automatic Domain Identification for Linked Open Data %A Sarasi Lalithsena %A Pascal Hitzler %A Amit Sheth %A Prateek Jain %K dataset search %K Domain Identification %K Linked Open Data Cloud %X

Linked Open Data (LOD) has emerged as one of the largest collections of interlinked structured datasets on the Web. Although the adoption of such datasets for applications is increasing, identifying relevant datasets for a specific task or topic is still challenging. As an initial step to make such identification easier, we provide an approach to automatically identify the topic domains of given datasets. Our method utilizes existing knowledge sources, more specifically Freebase, and we present an evaluation which validates the topic domains we can identify with our system. Furthermore, we evaluate the effectiveness of identified topic domains for the purpose of finding relevant datasets, thus showing that our approach improves reusability of LOD datasets.

%B 2013 IEEE/WIC/ACM International Conferences on Web Intelligence, WI 2013 %C Atlanta, GA, USA %P 205–212 %G eng %U http://dx.doi.org/10.1109/WI-IAT.2013.206 %R 10.1109/WI-IAT.2013.206 %0 Book Section %B On the Move to Meaningful Internet Systems: OTM 2012 %D 2012 %T Alignment-based querying of linked open data %A Joshi, Amit Krishna %A Prateek Jain %A Pascal Hitzler %A Peter Z. Yeh %A Kunal Verma %A Amit Sheth %A Mariana Damova %B On the Move to Meaningful Internet Systems: OTM 2012 %I Springer %P 807–824 %G eng %0 Conference Paper %B 23rd ACM Conference on Hypertext and Social Media, HT '12 %D 2012 %T Moving beyond SameAs with PLATO: Partonomy detection for Linked Data %A Prateek Jain %A Pascal Hitzler %A Kunal Verma %A Peter Z. Yeh %A Amit Sheth %E Ethan V. Munson %E Markus Strohmaier %K Linked Open Data Cloud %K Mereology %K Part of Relation %X

The Linked Open Data (LOD) Cloud has gained significant traction over the past few years. With over 275 interlinked datasets across diverse domains such as life science, geography, politics, and more, the LOD Cloud has the potential to support a variety of applications ranging from open domain question answering to drug discovery.

Despite its significant size (approx. 30 billion triples), the data is relatively sparsely interlinked (approx. 400 million links). A semantically richer LOD Cloud is needed to fully realize its potential. Data in the LOD Cloud are currently interlinked mainly via the owl:sameAs property, which is inadequate for many applications. Additional properties capturing relations based on causality or partonomy are needed to enable the answering of complex questions and to support applications.

In this paper, we present a solution to enrich the LOD Cloud by automatically detecting partonomic relationships, which are well-established, fundamental properties grounded in linguistics and philosophy. We empirically evaluate our solution across several domains, and show that our approach performs well on detecting partonomic properties between LOD Cloud data.

%B 23rd ACM Conference on Hypertext and Social Media, HT '12 %I ACM %C Milwaukee, WI, USA %P 33–42 %G eng %U http://doi.acm.org/10.1145/2309996.2310004 %R 10.1145/2309996.2310004 %0 Report %D 2012 %T Semantic Aspects of EarthCube %A Pascal Hitzler %A Krzysztof Janowicz %A Gary Berg-Cross %A Leo Obrst %A Amit Sheth %A Timothy Finin %A Isabel Cruz %X

In this document, we give a high-level overview of selected Semantic (Web) technologies, methods, and other important considerations that are relevant for the success of EarthCube. The goal of this initial document is to provide entry points and references for discussions between the Semantic Technologies experts and the domain experts within EarthCube. The selected topics are intended to ground the EarthCube roadmap in the state of the art in semantics research and ontology engineering.

We anticipate that this document will evolve as EarthCube progresses. Indeed, all EarthCube parties are asked to provide topics of importance that should be treated in future versions of this document.

%B EarthCube report of the Technology Subcommittee of the EarthCube Semantics and Ontologies Group %G eng %0 Conference Paper %B Workshop on GIScience in the Big Data Age, In conjunction with the seventh International Conference on Geographic Information Science 2012 (GIScience 2012) %D 2012 %T Semantics and Ontologies for EarthCube %A Gary Berg-Cross %A Isabel Cruz %A Mike Dean %A Tim Finin %A Mark Gahegan %A Pascal Hitzler %A Hook Hua %A Krzysztof Janowicz %A Naicong Li %A Philip Murphy %A Bryce Nordgren %A Leo Obrst %A Mark Schildhauer %A Amit Sheth %A Krishna Sinha %A Anne Thessen %A Nancy Wiegand %A Ilya Zaslavsky %E Krzysztof Janowicz %E C. Kessler %E T. Kauppinen %E Dave Kolas %E Simon Scheider %X

Semantic technologies and ontologies play an increasing role in scientific workflow systems and knowledge infrastructures. While ontologies are mostly used for the semantic annotation of metadata, semantic technologies enable searching metadata catalogs beyond simple keywords, with some early evidence of semantics used for data translation. However, the next generation of distributed and interdisciplinary knowledge infrastructures will require capabilities beyond simple subsumption reasoning over subclass relations. In this work, we report from the EarthCube Semantics Community by highlighting the role that semantics and ontologies should play in the EarthCube knowledge infrastructure. We target the interested domain scientist and, thus, introduce the value proposition of semantic technologies in non-technical language. Finally, we commit ourselves to some guiding principles for the successful implementation and application of semantic technologies and ontologies within EarthCube.

%B Workshop on GIScience in the Big Data Age, In conjunction with the seventh International Conference on Geographic Information Science 2012 (GIScience 2012) %C Columbus, Ohio, USA %G eng %0 Conference Paper %B The Semantic Web: Research and Applications - 8th Extended Semantic Web Conference, ESWC 2011 %D 2011 %T Contextual Ontology Alignment of LOD with an Upper Ontology: A Case Study with Proton %A Prateek Jain %A Peter Z. Yeh %A Kunal Verma %A Reymonrod G. Vasquez %A Mariana Damova %A Pascal Hitzler %A Amit Sheth %E Grigoris Antoniou %E Marko Grobelnik %E Elena Paslaru Bontas Simperl %E Bijan Parsia %E Dimitris Plexousakis %E Pieter De Leenheer %E Jeff Z. Pan %X

Linked Open Data (LOD) is a major milestone towards realizing the Semantic Web vision, and can enable applications such as robust Question Answering (QA) systems that can answer queries requiring multiple, disparate information sources. However, realizing these applications requires relationships at both the schema and instance levels, whereas the LOD currently provides relationships only for the latter. To address this limitation, we present a solution for automatically finding schema-level links between two LOD ontologies – in the sense of ontology alignment. Our solution, called BLOOMS+, extends our previous solution (i.e., BLOOMS) in two significant ways. BLOOMS+ 1) uses a more sophisticated metric to determine which classes between two ontologies to align, and 2) considers contextual information to further support (or reject) an alignment. We present a comprehensive evaluation of our solution using schema-level mappings from LOD ontologies to Proton (an upper level ontology) – created manually by human experts for a real world application called FactForge. We show that our solution performed well on this task. We also show that our solution significantly outperformed existing ontology alignment solutions (including our previously published work on BLOOMS) on this same task.

%B The Semantic Web: Research and Applications - 8th Extended Semantic Web Conference, ESWC 2011 %I Springer %C Heraklion, Crete, Greece %V 6643 %P 80–92 %G eng %U http://dx.doi.org/10.1007/978-3-642-21034-1_6 %R 10.1007/978-3-642-21034-1_6 %0 Conference Paper %B Linked Data Meets Artificial Intelligence, Papers from the 2010 AAAI Spring Symposium %D 2010 %T Linked Data Is Merely More Data %A Prateek Jain %A Pascal Hitzler %A Peter Z. Yeh %A Kunal Verma %A Amit Sheth %E Dan Brickley %E Vinay K. Chaudhri %E Harry Halpin %E Deborah McGuinness %X

In this position paper, we argue that the Linked Open Data (LoD) Cloud, in its current form, is only of limited value for furthering the Semantic Web vision. Being merely a weakly linked “triple collection,” it will only be of very limited benefit for the AI or Semantic Web communities. We describe the corresponding problems with the LoD Cloud and give directions for research to remedy the situation.

%B Linked Data Meets Artificial Intelligence, Papers from the 2010 AAAI Spring Symposium %I AAAI %C Stanford, California, USA %G eng %U http://www.aaai.org/ocs/index.php/SSS/SSS10/paper/view/1130 %0 Conference Paper %B The Semantic Web - ISWC 2010 - 9th International Semantic Web Conference, ISWC 2010 %D 2010 %T Ontology Alignment for Linked Open Data %A Prateek Jain %A Pascal Hitzler %A Amit Sheth %A Kunal Verma %A Peter Z. Yeh %E Peter F. Patel-Schneider %E Yue Pan %E Pascal Hitzler %E Peter Mika %E Lei Zhang %E Jeff Z. Pan %E Ian Horrocks %E Birte Glimm %X

The Web of Data currently coming into existence through the Linked Open Data (LOD) effort is a major milestone in realizing the Semantic Web vision. However, the development of applications based on LOD faces difficulties because the different LOD datasets are rather loosely connected pieces of information. In particular, links between LOD datasets are almost exclusively on the level of instances, and schema-level information is being ignored. In this paper, we therefore present a system for finding schema-level links between LOD datasets in the sense of ontology alignment. Our system, called BLOOMS, is based on the idea of bootstrapping information already present on the LOD cloud. We also present a comprehensive evaluation which shows that BLOOMS outperforms state-of-the-art ontology alignment systems on LOD datasets. At the same time, BLOOMS is also competitive compared with these other systems on the Ontology Alignment Evaluation Initiative Benchmark datasets.

%B The Semantic Web - ISWC 2010 - 9th International Semantic Web Conference, ISWC 2010 %I Springer %C Shanghai, China %V 6496 %P 402–417 %G eng %U http://dx.doi.org/10.1007/978-3-642-17746-0_26 %R 10.1007/978-3-642-17746-0_26 %0 Conference Paper %B Scientific and Statistical Database Management, 22nd International Conference, SSDBM 2010 %D 2010 %T Provenance Context Entity (PaCE): Scalable Provenance Tracking for Scientific RDF Data %A Satya S. Sahoo %A Olivier Bodenreider %A Pascal Hitzler %A Amit Sheth %A Krishnaprasad Thirunarayan %E Michael Gertz %E Bertram Ludäscher %K Biomedical knowledge repository %K Context theory %K Provenance context entity %K Provenance Management Framework. %K Provenir ontology %K RDF reification %X

The Semantic Web Resource Description Framework (RDF) format is being used by a large number of scientific applications to store and disseminate their datasets. The provenance information, describing the source or lineage of the datasets, is playing an increasingly significant role in ensuring data quality, computing trust value of the datasets, and ranking query results. Current Semantic Web provenance tracking approaches using the RDF reification vocabulary suffer from a number of known issues, including lack of formal semantics, use of blank nodes, and application-dependent interpretation of reified RDF triples that hinders data sharing. In this paper, we introduce a new approach called Provenance Context Entity (PaCE) that uses the notion of provenance context to create provenance-aware RDF triples without the use of RDF reification or blank nodes. We also define the formal semantics of PaCE through a simple extension of the existing RDF(S) semantics that ensures compatibility of PaCE with existing Semantic Web tools and implementations. We have implemented the PaCE approach in the Biomedical Knowledge Repository (BKR) project at the US National Library of Medicine to support provenance tracking on RDF data extracted from multiple sources, including biomedical literature and the UMLS Metathesaurus. The evaluations demonstrate a minimum of 49% reduction in total number of provenance-specific RDF triples generated using the PaCE approach as compared to RDF reification. In addition, using the PaCE approach improves the performance of complex provenance queries by three orders of magnitude and remains comparable to the RDF reification approach for simpler provenance queries.

%B Scientific and Statistical Database Management, 22nd International Conference, SSDBM 2010 %I Springer %C Heidelberg, Germany %V 6187 %P 461–470 %G eng %U http://dx.doi.org/10.1007/978-3-642-13818-8_32 %R 10.1007/978-3-642-13818-8_32 %0 Conference Paper %B Ohio Collaborative Conference on BioInformatics (OCCBIO 2009), Posters & Demos %D 2009 %T Ontology Driven Integration of Biology Experiment Data %A Raghava Mutharaju %A Satya S. Sahoo %A D. Brent Weatherly %A Pramod Anantharam %A Flora Logan %A Amit Sheth %A Rick Tarleton %B Ohio Collaborative Conference on BioInformatics (OCCBIO 2009), Posters & Demos %C Cleveland, OH, USA %G eng %0 Conference Paper %B On the Move to Meaningful Internet Systems: OTM 2009, Confederated International Conferences, CoopIS, DOA, IS, and ODBASE 2009, Proceedings, Part II %D 2009 %T Ontology-Driven Provenance Management in eScience: An Application in Parasite Research %A Satya S. Sahoo %A D. Brent Weatherly %A Raghava Mutharaju %A Pramod Anantharam %A Amit Sheth %A Rick Tarleton %E Robert Meersman %E Tharam S. Dillon %E Pilar Herrero %X

Provenance, from the French word “provenir”, describes the lineage or history of a data entity. Provenance is critical information in scientific applications to verify experiment process, validate data quality and associate trust values with scientific results. Current industrial scale eScience projects require an end-to-end provenance management infrastructure. This infrastructure needs to be underpinned by formal semantics to enable analysis of large scale provenance information by software applications. Further, effective analysis of provenance information requires well-defined query mechanisms to support complex queries over large datasets. This paper introduces an ontology-driven provenance management infrastructure for biology experiment data, as part of the Semantic Problem Solving Environment (SPSE) for Trypanosoma cruzi (T.cruzi). This provenance infrastructure, called T.cruzi Provenance Management System (PMS), is underpinned by (a) a domain-specific provenance ontology called Parasite Experiment ontology, (b) specialized query operators for provenance analysis, and (c) a provenance query engine. The query engine uses a novel optimization technique based on materialized views called materialized provenance views (MPV) to scale with increasing data size and query complexity. This comprehensive ontology-driven provenance infrastructure not only allows effective tracking and management of ongoing experiments in the Tarleton Research Group at the Center for Tropical and Emerging Global Diseases (CTEGD), but also enables researchers to retrieve the complete provenance information of scientific results for publication in literature.

%B On the Move to Meaningful Internet Systems: OTM 2009, Confederated International Conferences, CoopIS, DOA, IS, and ODBASE 2009, Proceedings, Part II %I Springer %C Vilamoura, Portugal %V 5871 %P 992–1009 %G eng %U http://dx.doi.org/10.1007/978-3-642-05151-7_18 %R 10.1007/978-3-642-05151-7_18 %0 Conference Paper %B Web Information Systems Engineering - WISE 2009, 10th International Conference %D 2009 %T Spatio-Temporal-Thematic Analysis of Citizen Sensor Data: Challenges and Experiences %A Meenakshi Nagarajan %A Karthik Gomadam %A Amit Sheth %A Ajith Ranabahu %A Raghava Mutharaju %A Ashutosh Jadhav %E Gottfried Vossen %E Darrell D. E. Long %E Jeffrey Xu Yu %X

We present work in the spatio-temporal-thematic analysis of citizen-sensor observations pertaining to real-world events. Using Twitter as a platform for obtaining crowd-sourced observations, we explore the interplay between these three dimensions in extracting insightful summaries of social perceptions behind events. We present our experiences in building a web mashup application, Twitris [1], that extracts and facilitates the spatio-temporal-thematic exploration of event descriptor summaries.

%B Web Information Systems Engineering - WISE 2009, 10th International Conference %I Springer %C Poznan, Poland %V 5802 %P 539–553 %G eng %U http://dx.doi.org/10.1007/978-3-642-04409-0_52 %R 10.1007/978-3-642-04409-0_52 %0 Conference Paper %B Ohio Collaborative Conference on BioInformatics (OCCBIO 2009), Posters & Demos %D 2009 %T Trykipedia: Collaborative Bio-Ontology Development using Wiki Environment %A Pramod Anantharam %A Satya S. Sahoo %A D. Brent Weatherly %A Flora Logan %A Raghava Mutharaju %A Amit Sheth %A Rick Tarleton %B Ohio Collaborative Conference on BioInformatics (OCCBIO 2009), Posters & Demos %C Cleveland, OH, USA %G eng %0 Conference Paper %B Semantic Web Challenge at the 8th International Semantic Web Conference (ISWC 2009) %D 2009 %T Twitris: Socially Influenced Browsing %A Ashutosh Jadhav %A Wenbo Wang %A Raghava Mutharaju %A Pramod Anantharam %A Vinh Nguyen %A Amit Sheth %A Karthik Gomadam %A Meenakshi Nagarajan %A Ajith Ranabahu %X

In this paper, we present Twitris, a semantic Web application that facilitates browsing for news and information, using social perceptions as the fulcrum. In doing so we address challenges in large-scale crawling, processing of real-time information, and preserving spatio-temporal-thematic properties central to observations pertaining to real-time events. We extract metadata about events from Twitter and bring related news and Wikipedia articles to the user. In developing Twitris, we have used the DBpedia ontology.

%B Semantic Web Challenge at the 8th International Semantic Web Conference (ISWC 2009) %C Washington DC, USA %G eng