Publication date: July 2022
Source: The Journal of Academic Librarianship, Volume 48, Issue 4
Author(s): Joyce Kasman Valenza, Heather Dalal, Gihan Mohamad, Brenda Boyer, Cara Berg, Leslin H. Charles, Rebecca Bushby, Megan Dempsey, Joan Dalrymple, Ewa Dziedzic-Elliott
In recent years, Community Question Answering (CQA) has become increasingly prevalent because it provides platforms for users to collect information and share knowledge. However, a question in a CQA system often has many paired answers, and it is almost impossible for users to view them one by one and select the most relevant. Hence, answer selection has become an important task in CQA. In this paper, we propose a novel solution - BertHANK, a hierarchical attention network with enhanced knowledge and a pre-trained model for answer selection. Specifically, in the encoding stage, knowledge enhancement and a pre-trained model are used for questions and answers, respectively. Further, we adopt a multi-attention mechanism, including cross-attention on question-answer pairs, inner attention on questions at the word level, and hierarchical inner attention on answers at both the word and sentence levels, to capture more subtle semantic features. In more detail, the cross-attention captures interactive information between encoded questions and answers, while the hierarchical inner attention assigns different weights to words in sentences and to sentences in answers, thereby obtaining both global and local information about question-answer pairs. The hierarchical inner attention helps select the best-matched answers for specific questions. Finally, we integrate attention-questions and attention-answers to make predictions. The results show that our model achieves state-of-the-art performance on two corpora, the SemEval-2015 and SemEval-2017 CQA datasets, outperforming the advanced baselines by a large margin.
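The word-then-sentence weighting described in this abstract can be illustrated with a minimal sketch. This is not the BertHANK implementation: the dot-product scoring, the fixed query vectors, and the names `attend` and `hierarchical_attend` are assumptions chosen only to show how attention weights at one level feed pooled vectors to the next.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(vectors, query):
    """Pool `vectors` into one vector, weighted by softmax(dot(vector, query))."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return pooled, weights

def hierarchical_attend(answer, word_query, sentence_query):
    """`answer` is a list of sentences; each sentence is a list of word vectors.

    Word-level attention pools each sentence, then sentence-level attention
    pools the sentence vectors into a single answer vector.
    """
    sentence_vectors = [attend(words, word_query)[0] for words in answer]
    return attend(sentence_vectors, sentence_query)

# Toy answer: two sentences with 2-dimensional (hypothetical) word vectors.
answer = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0]],
]
answer_vector, sentence_weights = hierarchical_attend(
    answer, word_query=[1.0, 0.0], sentence_query=[0.0, 1.0]
)
```

In a trained model the query vectors would be learned parameters and the word vectors would come from an encoder; here they are hard-coded so the pooling step can be read in isolation.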
Records are persistent representations of activities created by partakers, observers, or their authorized proxies. People are generally willing to trust vital records such as birth, death, and marriage certificates. However, conspiracy theories and other misinformation may negatively impact perceptions of such documents, particularly when they are associated with a significant person or event. This paper explores the relationship between archival records and trustworthiness by reporting results of a survey that asked genealogists about their perceptions of 44th U.S. President Barack Obama's birth certificate, which was then at the center of the “birtherism” conspiracy. We found that although most participants perceived the birth certificate as trustworthy, others engaged in a biased review, considering it not trustworthy because of the news and politics surrounding it. These findings suggest that a conspiracy theory can act as a moderating variable that undermines the efficacy of normal or recommended practices and procedures for evaluating online information such as birth certificates. We provide recommendations and propose strategies for archivists to disseminate correct information to counteract the spread of misinformation about the authenticity of vital records, and we discuss future directions for research.
Entrepreneurs are showing an increasing focus on understanding and managing their social media strategies to optimize the success of their crowdfunding campaigns. While much of the current crowdfunding literature focuses on the roles of social media engagement in the funding performance of crowdfunded projects, in this study, drawing on social media engagement theory, we examine how product and entrepreneur characteristics moderate the influence of social media engagement on funding performance in reward-based crowdfunding. Using a dataset of technology crowdfunded projects, we investigated whether Facebook and Twitter engagements affect funding outcomes and, if so, how the two interact with each other and how their influences vary by project type (hardware and software) and entrepreneur characteristics (gender, experience, and social capital). We found that both Facebook and Twitter engagements positively affect funding, but the two weaken each other's impact, particularly for hardware products. Additionally, Facebook engagements had a larger effect on funding outcomes in the early days of a campaign driven by its prelaunch efforts, whereas Twitter engagements had a larger impact in later days. Furthermore, our findings indicated that Facebook engagement is more influential for hardware products. An entrepreneur's internal social capital built inside the crowdfunding platform also weakened the effects of Facebook engagements generated outside the platform, whereas Twitter engagements on subsequent funding had less influence on experienced entrepreneurs. Our findings suggest that Facebook mainly serves as a channel to show the significant commitment of entrepreneurs to their projects and increase persuasiveness, while Twitter helps to raise awareness by broadcasting crowdfunding campaigns among potential investors.
Information researchers can further social justice and social equity to meet the needs of minority and underserved populations experiencing intersecting modes of cultural marginalization. Scholars of information and communication technologies for development (ICT4D) can find overlooked intersections with social justice in “community networking” research since the 1980s to overcome the digital divides between the haves and have-nots. To frame social justice initiatives within a consolidated vision of ICT4D in the field of information, this article proposes an impact-driven framework, expounded through five interrelated elements: why (motivations), with who (engaged constituencies), how (at external and internal levels to change traditional practices), and toward what (goal). It is explicated through select historical instances of “community networking” and digital divides, ICT4D, and social justice intersections. Significance of the elements is also demonstrated via this author's select information-related social justice research conducted in the United States. The urgency for critical and reflective conversations is important owing to historically abstracted human information behavior theory development within information research outdated in multiple contextualized needs of contemporary times. Historically situating impact-driven social justice research is important to further the relevance, existence, and growth of the information field as it strengthens its ties with ICT4D.
As a healthcare ICT4D solution, mobile health (mHealth) can potentially improve users' well-being during pandemics, especially in developing countries with limited healthcare resources. Recent ICT4D research reveals that providing end-users with access to ICT is insufficient for improving well-being and, thus, understanding how mHealth empowers end-users to enhance well-being against stressful events is important. However, prior research has rarely discussed the issue of empowerment in the domain of mHealth or the context of major disruptive events. This paper contributes to the literature by conceptualizing the psychological empowerment of mHealth users (PEMU) and investigating its nomological network during pandemics. Drawing upon theories of psychological empowerment and event characteristics, we developed a research model and tested it through a mixed-methods investigation, containing a quantitative study with 602 Chinese mHealth users during COVID-19 and a follow-up qualitative study of 326 online articles and reviews. We found that PEMU, driven by three technological characteristics (perceived response efficacy, ease of use, and mHealth quality), affects well-being through both (a) a stress-buffering effect, which counterbalances the detrimental, stress-increasing effects of event criticality and disruption, and (b) a vitality-stimulating effect, which is intensified by event criticality. These findings have important implications for ICT4D research and practice.
This research reports on qualitative interviews with 31 participants who are Irish parents, identify as lesbian, gay, bisexual, or queer (LGBQ), and expressed difficulty in the process of obtaining birth certificates for their children. Our aim was to use personal information management (PIM) and personal digital archiving (PDA) as a lens to explore the invisible work that the Irish government requires of a sexual minority parent group to obtain “equal” treatment in the birth registration and birth certificate process. Our findings suggest overlap with existing information behavior (IB) research that explores invisible information work, IB as a burden, information marginalization, information vulnerability, information overload, and the everyday in IB. We propose a new framework: personal information burden (PIM-B), which is characterized by additional PIM activities, negative affect, lack of identity self-extension to the personal information, and additional information seeking. We propose that PIM-B may be used as an indicator of inequality in future research.
Contributing to the literature on knowledge infrastructure maintenance, this article describes a historical longitudinal analysis of revenue streams employed by four social science data organizations: the Roper Center for Public Opinion, the Inter-university Consortium for Political and Social Research (ICPSR), the UK Data Archive (UKDA), and the LIS Cross-National Data Center in Luxembourg (LIS). Drawing on archival documentation and interviews, we describe founders' assumptions about revenue, changes to revenue streams over the long term, practices for developing and maintaining revenue streams, the importance of financial support from host organizations, and how the context of each data organization shaped revenue possibilities. We extend conversations about knowledge infrastructure revenue streams by showing the types of change that have occurred over time and how it occurs. We provide examples of the types of flexibility needed for data organizations to remain sustainable over 40–60 years of revenue changes. We distinguish between Type A flexibilities, or development of new products and services, and Type B flexibilities, or continuous smaller adjustments to existing revenue streams. We argue that Type B flexibilities are as important as Type A, although they are easily overlooked. Our results are relevant to knowledge infrastructure managers and stakeholders facing similar revenue challenges.
This study examines what happens when an online community (OC) platform is shut down. In particular, it builds on recent interest from information science on everyday life information seeking, providing insights into the socio-emotional roles enacted by users following community closure. A qualitative study is undertaken on 12 months of social media comments relating to the closure of an OC platform. We identify and discuss the socio-emotional information roles that manifest, and present a model of their relationship to different aspects of the closure. We make theoretical connections between the notion of socio-emotional information roles and both the information behavior and practice literature, as well as research on community and participant roles. Theoretical and practical implications are discussed.
An uncertain graph (also known as a probabilistic graph) is a generic model for representing many real-world networks, from social to biological. In recent times, the analysis and mining of uncertain graphs have drawn significant attention from researchers in the data management community. Several novel problems have been introduced, and efficient methodologies have been developed to solve them. Hence, there is a need to summarize the existing results on this topic in an organized way. In this paper, we present a comprehensive survey on uncertain graph mining focusing mainly on three aspects: (i) the different problems studied, (ii) the computational challenges in solving those problems, and (iii) the proposed methodologies. Finally, we list important future research directions.
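To make the uncertain-graph model concrete, here is a minimal sketch of one common query on such graphs: estimating the probability that two nodes are connected. It assumes the usual possible-world reading, in which each edge exists independently with its own probability; the abstract does not define the model, and the function names and toy edge list below are illustrative only.

```python
import random

def sample_world(edges, rng):
    """Draw one possible world: keep each edge independently with its probability."""
    return [(u, v) for (u, v, p) in edges if rng.random() < p]

def reachable(edge_list, source, target):
    """Depth-first search on an undirected, deterministic edge list."""
    adjacency = {}
    for u, v in edge_list:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return False

def estimate_reliability(edges, source, target, samples=20000, seed=0):
    """Monte Carlo estimate of P(source and target are connected)."""
    rng = random.Random(seed)
    hits = sum(
        reachable(sample_world(edges, rng), source, target) for _ in range(samples)
    )
    return hits / samples

# Toy uncertain graph: a path a-b-c whose two edges each exist with probability 0.5.
edges = [("a", "b", 0.5), ("b", "c", 0.5)]
estimate = estimate_reliability(edges, "a", "c")
```

For this path the exact answer is 0.5 × 0.5 = 0.25, so the estimate should land close to that value. The number of possible worlds grows as 2 to the number of edges, which is why sampling is used here instead of enumeration.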
Community detection is a prominent task for vast and complex networks such as social networks, collaborative networks, and web graphs. In the modern era, new users are continually added to these complex networks, which results in an expansion of application-generated networks. Extracting relevant information from these large networks has become one of the most prominent research areas. Community detection tries to reduce the application-generated graph into smaller communities in which the nodes within a community are similar. Most recent proposals focus on detecting overlapping communities in the network with higher accuracy. An integral issue in graph theory is the enumeration of cliques in a larger graph. A clique is a group of completely connected nodes; it represents an explicit community, meaning that its nodes share the same types of information. Clique-based community detection algorithms, which utilize the clique property of the graph, also identify implicit communities that are not directly shown in the graph. Researchers have proposed many overlapping community detection algorithms that rely on cliques. The goal of this paper is to offer a comparative analysis of clique-based community detection algorithms. This paper provides a pervasive survey of research works that identify the cliques in a network for detecting overlapping communities. We bring together most of the state-of-the-art clique-based community detection algorithms into a single article with their accessible benchmark data sets. The survey presents a detailed description of methods based on K-cliques, maximal cliques, and triad percolation, and addresses the challenges of these approaches. Finally, a comparative analysis of overlapping community detection methodologies is also reported.
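As a rough illustration of the clique enumeration that these methods build on, the sketch below lists maximal cliques with the basic Bron-Kerbosch recursion (without pivoting) and treats each one as a candidate community. It is not any specific algorithm from the survey; the toy graph and the minimum clique size of 3 are assumptions.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques with basic Bron-Kerbosch (no pivoting).

    `adjacency` maps each node to the set of its neighbors.
    """
    cliques = []

    def expand(current, candidates, excluded):
        # A clique is maximal when no node can extend it.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for node in list(candidates):
            neighbors = adjacency[node]
            expand(current | {node}, candidates & neighbors, excluded & neighbors)
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques

# Toy graph: two triangles, {a, b, c} and {c, d, e}, sharing node c.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d", "e"},
    "d": {"c", "e"},
    "e": {"c", "d"},
}
cliques = maximal_cliques(adjacency)

# Treat every maximal clique with at least 3 nodes as one community.
communities = [clique for clique in cliques if len(clique) >= 3]

# Nodes that belong to more than one community are the overlapping ones.
overlapping = {
    node
    for node in adjacency
    if sum(node in community for community in communities) > 1
}
```

Here node `c` belongs to both triangles, which is the kind of overlap that clique-based methods are designed to expose. Percolation-style methods go further by merging cliques that share enough nodes, a step this sketch leaves out.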
With rapid advances in web technology, the number of web services offered on the internet is growing quickly, making it challenging for users to choose a web service that fits their needs. Recommender systems save users the hassle of going through a range of products by making recommendations through analytical techniques applied to historical data on user experiences of the available items. Research efforts provide several methods for web-service recommendation in which QoS-related attributes, such as response time, throughput, security, privacy, and web-service delivery, play a primary role. Derivable attributes, including user trustworthiness and web-service reputation in the contexts of users and web services, can also affect QoS prediction. The proposed research focuses on a web-service recommendation model, S-RAP, which uses derivable attributes to predict the QoS that a user would experience from a web service they have not invoked before. A Services-Relevance attribute is proposed in this publication, which emphasizes employing historical data and extracting the degree of relevance in the user and web-service contexts to predict QoS values for a user. In experiments evaluated with the Mean Absolute Error and Normalized Mean Absolute Error metrics, the proposed system produces satisfactorily accurate rating predictions. Compared with state-of-the-art models, the results show a relative improvement of 4.0%.
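The two evaluation metrics named in this abstract can be sketched as follows. The abstract does not give formulas, so this assumes one common convention from the QoS-prediction literature, in which NMAE is MAE divided by the mean of the observed values; other work normalizes by the value range instead. The sample numbers are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def normalized_mae(actual, predicted):
    """MAE divided by the mean observed value (an assumed normalization)."""
    mean_actual = sum(actual) / len(actual)
    return mean_absolute_error(actual, predicted) / mean_actual

# Hypothetical response times in seconds: observed vs. predicted.
observed = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.0]

mae = mean_absolute_error(observed, predicted)   # (0.5 + 0.0 + 1.0) / 3 = 0.5
nmae = normalized_mae(observed, predicted)       # 0.5 / 2.0 = 0.25
```

Normalizing makes errors comparable across QoS attributes with different scales, such as response time and throughput.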
Random-walk-based sampling is an efficient way to extract and analyze the properties of large and complex graphs representing social networks. However, it is almost impractical for existing random-walk-based sampling schemes to reach the desired node distribution because of the indeterministic sampling budget (i.e., the number of samples or sampling steps) required for doing so with large volumes of data in graphs. On the other hand, under a small sampling budget, these methods produce low-quality samples with many repeats and high correlations (i.e., many common attributes), which leads to a large deviation from the desired node distribution and large estimation errors. In this paper, we propose a new random-walk sampling scheme based on node cliques (a subset of cliques), called node-clique random walk, or NCRW, to strike a good balance between the estimation error and the sampling budget, by producing unique samples with low correlations. Meanwhile, both the deviation from the desired node distribution and the estimation errors under the constraint of the sampling budget are reduced both theoretically and experimentally. Thus, the sampling costs which are closely related to the sampling budget are reduced. Our extensive experimental evaluation driven by real-world datasets further confirms that NCRW significantly increases the quality of samples and accuracy of estimations with much lower costs than those of existing random-walk-based sampling schemes especially in estimating the higher-order node attributes.
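The repeated-sample problem that motivates this work can be seen in a plain random walk. The sketch below is only that baseline, not the NCRW scheme, whose details the abstract does not give; the toy ring graph, the budget of 10 steps, and the function name are assumptions.

```python
import random

def random_walk_sample(adjacency, start, budget, seed=0):
    """Collect `budget` samples by repeatedly moving to a uniformly chosen neighbor."""
    rng = random.Random(seed)
    node = start
    samples = [node]
    for _ in range(budget - 1):
        # Sort the neighbor set so the walk is reproducible for a given seed.
        node = rng.choice(sorted(adjacency[node]))
        samples.append(node)
    return samples

# Toy graph: a ring of five nodes.
adjacency = {
    "a": {"b", "e"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d", "a"},
}
samples = random_walk_sample(adjacency, "a", budget=10)

# Number of samples that repeat an already visited node.
repeats = len(samples) - len(set(samples))
```

With 10 samples over 5 nodes, at least 5 samples must be repeats, and consecutive samples are always neighbors, which is the source of the correlation the abstract mentions.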
TV program recommendation is very important to avoid confusing users with large amounts of information. The existing methods are mainly based on collaborative filtering, which utilizes the interactions between users and items. However, they ignore auxiliary information that contains rich semantic information. In this paper, we propose a neural TV program recommendation method with heterogeneous attention, which incorporates the multi-level features of auxiliary information and attention-based neural networks to obtain accurate program and user representations. In the program encoder module, we learn the different semantic information of the labels and titles contained in each program through a neural network with heterogeneous attention to identify multi-hierarchical program information. In the user encoder module, we incorporate auxiliary information and interactions between users and programs. In addition, we utilize a personalized attention mechanism to learn the importance of different programs for each user to reveal user preferences. Specifically, we collect and process user viewing data in the capital of China to provide a real scenario for personalized recommendation. Experiments on a real dataset show that our method produces more effective TV program recommendations than other existing models.
Despite the high accuracy offered by state-of-the-art deep natural-language models (e.g., LSTM, BERT), their application in real-life settings is still widely limited, as they behave like a black box to the end-user. Hence, explainability is rapidly becoming a fundamental requirement of future-generation data-driven systems based on deep-learning approaches. Several attempts have been made to fill the existing gap between accuracy and interpretability. However, robust and specialized eXplainable Artificial Intelligence solutions, tailored to deep natural-language models, are still missing. We propose a new framework, named T-EBAnO, which provides innovative prediction-local and class-based model-global explanation strategies tailored to deep learning natural-language models. Given a deep NLP model and the textual input data, T-EBAnO provides an objective, human-readable, domain-specific assessment of the reasons behind the automatic decision-making process. Specifically, the framework extracts sets of interpretable features by mining the inner knowledge of the model. Then, it quantifies the influence of each feature during the prediction process by exploiting the normalized Perturbation Influence Relation index at the local level and the novel Global Absolute Influence and Global Relative Influence indexes at the global level. The effectiveness and quality of the local and global explanations obtained with T-EBAnO are demonstrated in an extensive set of experiments addressing different tasks, such as a sentiment-analysis task performed by a fine-tuned BERT model and a toxic-comment classification task performed by an LSTM model. The quality of the explanations proposed by T-EBAnO, and, specifically, the correlation between the influence index and human judgment, has been evaluated by humans in a survey with more than 4000 judgments.
To demonstrate the generality of T-EBAnO and its model/task-independent methodology, experiments with other models (ALBERT, ULMFit) on popular public datasets (AG News and CoLA) are also discussed in detail.
Publication date: September 2022
Source: The Journal of Academic Librarianship, Volume 48, Issue 5
Author(s): Diana Ramirez, Margaret J. Foster, Ashlynn Kogut, Daniel Xiao