Objective: The purpose of this article is to explore data visualization as a consulting service offered by a research library with particular attention to uses of visualization at various places within the research lifecycle.
Methods: Lessons learned from a year of offering data visualization as a consulting service are presented, along with two general case studies.
Results: Data visualization consulting services involve a few unique considerations, including setting clear expectations, weighing proprietary versus open-source technologies, and ensuring that the consulting experience is also a learning experience. In addition, the profiled case studies show that data visualization requests can be clearly placed in multiple parts of the research lifecycle.
Numerous web co-link studies have analyzed a wide variety of websites ranging from those in the academic and business arena to those dealing with politics and governments. Such studies uncover rich information about these organizations. In recent years, however, there has been a dearth of co-link analysis, mainly due to the lack of sources from which co-link data can be collected directly. Although several commercial services such as Alexa provide inlink data, none provide co-link data. We propose a new approach to web co-link analysis that can alleviate this problem so that researchers can continue to mine the valuable information contained in co-link data. The proposed approach has two components: (a) generating co-link data from inlink data using a computer program; (b) analyzing co-link data at the site level in addition to the page level that previous co-link analyses have used. The site-level analysis has the potential of expanding co-link data sources. We tested this proposed approach by analyzing a group of websites focused on vaccination using Moz inlink data. We found that the approach is feasible, as we were able to generate co-link data from inlink data and analyze the co-link data with multidimensional scaling.
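The core of the proposed approach, deriving co-link data from inlink data and aggregating it to the site level, can be sketched as follows. This is a minimal illustration, not the authors' program: the function name `colink_counts` and the input shape (a mapping from target sites to the pages that link to them) are assumptions, and the real Moz data would need preprocessing into this form.

```python
from itertools import combinations
from urllib.parse import urlparse

def colink_counts(inlinks):
    """Count co-links: two target sites are co-linked when the same
    source site links to both. `inlinks` maps each target site to the
    set of page URLs that link to it (hypothetical input format)."""
    # Collapse linking pages to their host, giving site-level inlinks.
    site_inlinks = {
        target: {urlparse(url).netloc for url in sources}
        for target, sources in inlinks.items()
    }
    # A shared linking site between two targets is one co-link.
    counts = {}
    for a, b in combinations(sorted(site_inlinks), 2):
        shared = site_inlinks[a] & site_inlinks[b]
        if shared:
            counts[(a, b)] = len(shared)
    return counts
```

The resulting pairwise counts form the symmetric matrix that multidimensional scaling would take as input.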
Geolocated social media data provide a powerful source of information about places and regional human behavior. Because only a small amount of social media data have been geolocation-annotated, inference techniques play a substantial role in increasing the volume of annotated data. Conventional research in this area has been based on the text content of posts from a given user or on the user's social network, with some recent crossovers between the text- and network-based approaches. This paper proposes a novel approach that categorizes highly mentioned users (celebrities) into Local and Global types and consequently uses Local celebrities as location indicators. A label propagation algorithm is then run over the refined social network for geolocation inference. Finally, we propose a hybrid approach that merges a text-based method into our network-based approach as a back-off strategy. Empirical experiments over three standard Twitter benchmark data sets demonstrate that our approach outperforms state-of-the-art user geolocation methods.
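The label propagation step can be illustrated with a minimal sketch: unlabeled users iteratively adopt the majority location among their labeled neighbours, while seed labels (here standing in for the Local-celebrity indicators) stay fixed. The function name, graph format, and iteration count are assumptions for illustration, not the paper's implementation.

```python
from collections import Counter

def propagate_locations(graph, seeds, iterations=5):
    """Propagate location labels over a mention graph.
    `graph` maps each user to a list of neighbours; `seeds` maps
    already-located users (e.g., via Local celebrities) to a location."""
    labels = dict(seeds)
    for _ in range(iterations):
        updated = dict(labels)
        for user, neighbours in graph.items():
            if user in seeds:
                continue  # seed labels are never overwritten
            # Vote: take the most common location among labeled neighbours.
            votes = Counter(labels[n] for n in neighbours if n in labels)
            if votes:
                updated[user] = votes.most_common(1)[0][0]
        labels = updated
    return labels
```

A text-based back-off, as in the paper's hybrid approach, would then assign locations to any users the propagation leaves unlabeled.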
The assessment of information literacy (IL) at the school level is mainly dependent on measurement tools developed in the Western world. These tools need to be carefully adapted, and in most cases translated, before they can be used in other cultures, languages, and countries. To date, there have been no standard guidelines for adapting these tools; hence, the results can be cross-culturally generalized only to a certain extent. Furthermore, most data analyses produce generic outcomes without taking into account the ability of the students or the difficulty of the test items. The present study proposes a systematic approach for the context adaptation and language translation of a preexisting IL assessment tool, TRAILS-9, for use in different languages and contexts, particularly a Malaysian public secondary school. The study further applies a less common psychometric approach, Rasch analysis, to validate the adapted instrument. This technique produces a hierarchy of item difficulty within the assessment domain, enabling the ability levels of students to be differentiated on the basis of item difficulty. The recommended scale adaptation guidelines can reduce the misinterpretation of scores from instruments in multiple languages and contribute to the parallel development of IL assessment among secondary school students from different populations.
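At the heart of Rasch analysis is a simple logistic model: the probability that a student answers an item correctly depends only on the difference between the student's ability and the item's difficulty, both on the same logit scale. A minimal sketch of that model (the function name is ours; estimating the parameters from response data is a separate fitting step not shown here):

```python
import math

def rasch_probability(ability, difficulty):
    """Rasch model: probability of a correct response, given person
    ability and item difficulty on a shared logit scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))
```

When ability equals difficulty the probability is exactly 0.5, which is why fitted item difficulties yield the hierarchy against which student abilities can be differentiated.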
Finding similarity between concepts based on semantics has become a new trend in many applications (e.g., biomedical informatics, natural language processing). Measuring Semantic Similarity (SS) with high accuracy is a challenging task. In this context, the Information Content (IC)-based SS measure has gained popularity over the others. The notion of IC comes from information theory, which has great potential to characterize the semantics of concepts. Designing an IC-based SS framework comprises (i) an IC calculator and (ii) an SS calculator. In this article, we propose a generic intrinsic IC-based SS calculator. We also introduce a new structural aspect of an ontology called DCS (Disjoint Common Subsumers) that plays a significant role in deciding the similarity between two concepts. We evaluated our proposed similarity calculator against existing intrinsic IC-based similarity calculators, as well as corpora-dependent similarity calculators, using several benchmark data sets. The experimental results show that the proposed similarity calculator achieves a higher correlation with human evaluation than the existing state-of-the-art IC-based similarity calculators.
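To make the two components of an IC-based SS framework concrete, here is a sketch of a standard intrinsic IC calculator (Seco-style, based on a concept's descendant count in the ontology) paired with Lin's similarity measure. This is a common baseline of the kind the article compares against, not the article's DCS-based calculator, whose definition is not given in the abstract; the function names are ours.

```python
import math

def intrinsic_ic(num_descendants, total_concepts):
    """Intrinsic IC: concepts with fewer descendants in the ontology
    are more specific, hence more informative. No corpus needed."""
    return 1.0 - math.log(num_descendants + 1) / math.log(total_concepts)

def lin_similarity(ic_a, ic_b, ic_lcs):
    """Lin's measure: similarity of two concepts as the IC of their
    least common subsumer, normalized by their own ICs."""
    return 2.0 * ic_lcs / (ic_a + ic_b) if (ic_a + ic_b) else 0.0
```

A leaf concept (zero descendants) gets the maximum IC of 1, and two identical concepts (where the least common subsumer is the concept itself) get similarity 1.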
As Social Network Sites (SNSs) increasingly become part of people's everyday lives, the implications of their use need to be investigated and understood. We conducted a systematic literature review to lay the groundwork for understanding the relationship between SNS use and users' psychological well-being and for devising strategies to take advantage of this relationship. The review included articles published between 2003 and 2016, extracted from major academic databases. Findings revealed that the use of SNSs is both positively and negatively related to users' psychological well-being. We discuss the factors that moderate this relationship and their implications for users' psychological well-being. Many of the studies we reviewed lacked a sound theoretical justification for their findings, and most involved young and healthy students, leaving other cohorts of SNS users neglected. The paper concludes with the presentation of a platform for future investigation.
In this article, we investigate the consequences of choosing different classification systems—namely, the way publications (or journals) are assigned to scientific fields—for the ranking of research units. We study the impact of this choice on the ranking of 500 universities in the 2013 edition of the Leiden Ranking in two cases. First, we compare a Web of Science (WoS) journal-level classification system, consisting of 236 subject categories, and a publication-level algorithmically constructed system, denoted G8, consisting of 5,119 clusters. The result is that the consequences of moving from the WoS to the G8 system are much greater under the Top 1% citation impact indicator than under the Top 10% indicator. Second, we compare the G8 classification system and a publication-level alternative of the same family, the G6 system, consisting of 1,363 clusters. The result is that, although less pronounced than in the previous case, the consequences of moving from the G6 to the G8 system under the Top 1% indicator are still substantial.
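The Top 1% and Top 10% indicators both measure the share of a unit's publications that fall among the most cited publications of their field, which is why the field classification matters: it determines the citation threshold each paper is compared against. A simplified sketch for a single field (the function name is ours, and real implementations handle ties at the threshold and fractional counting more carefully):

```python
def top_share(unit_citations, field_citations, percentile=10):
    """Fraction of a unit's papers at or above the citation count that
    marks the field's top `percentile`% most cited papers."""
    ranked = sorted(field_citations, reverse=True)
    # Index of the last paper inside the top slice.
    cutoff_index = max(1, len(ranked) * percentile // 100)
    threshold = ranked[cutoff_index - 1]
    top = sum(1 for c in unit_citations if c >= threshold)
    return top / len(unit_citations)
```

Because the Top 1% threshold sits in the extreme tail of the citation distribution, it is far more sensitive to how field boundaries are drawn than the Top 10% threshold, consistent with the article's first finding.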
The explosive growth of social networks has the potential for online health communities to create social value for users. This study integrates online community with urban–rural health inequality in China to empirically explore whether online communities reduce health disparities between urban and rural areas in China. By collecting a unique data set from an online community in China that focuses on one disease, an exponential random graph model was used to empirically analyze the network structures and relationships formed in this community. The results indicate that technology-mediated online health communities can alleviate health disparities in China by exchanging information and improving the health capabilities of rural residents. We discuss the implications and guidelines for future research.
In contemporary urban settings, information seekers may face challenges assessing and making use of the large quantity of information to which they have access. Such challenges may be particularly acute when laypeople are considering specialized or technical information pertaining to topics over which knowledge is contested. Within a constructivist grounded theory study of the health information practices of 39 young parents in urban Canada, a complex practice of information triangulation was observed. Triangulation comprised an iterative process of seeking, assessment, and sense-making, and typically resulted in a decision or action. This paper examines the emergent concept of information triangulation in everyday life, using data from the young parent study. Triangulation processes in this study could be classified as one of four types, and functioned as an exercise of agency in the face of structures of expertise and exclusion. Although triangulation has long been described and discussed as a practice among scientific researchers wishing to validate and enrich their data, it has rarely been identified as an everyday practice in information behavior research. Future investigations should consider the use of information triangulation for other types of information, including by other populations and in other areas of contested knowledge.
Textual entailment is a relationship that obtains between fragments of text when one fragment in some sense implies the other fragment. The automation of textual entailment recognition supports a wide variety of text-based tasks, including information retrieval, information extraction, question answering, text summarization, and machine translation. Much ingenuity has been devoted to developing algorithms for identifying textual entailments, but relatively little to saying what textual entailment actually is. This article is a review of the logical and philosophical issues involved in providing an adequate definition of textual entailment. We show that many natural definitions of textual entailment are refuted by counterexamples, including the most widely cited definition of Dagan et al. We then articulate and defend the following revised definition: T textually entails H = df typically, a human reading T would be justified in inferring the proposition expressed by H from the proposition expressed by T. We also show that textual entailment is context-sensitive, nontransitive, and nonmonotonic.