Research Data Management (RDM) promises to make research outputs more transparent, findable, and reproducible. Strategies to streamline data management across disciplines are of key importance. This paper presents results of an institutional survey (N = 258) at a medium-sized Austrian university with a STEM focus, supplemented with interviews (N = 18), to give an overview of the state of play of RDM practices across faculties and disciplinary contexts. RDM services are on the rise but remain somewhat behind leading countries such as the Netherlands and the UK, showing only the beginnings of a culture attuned to RDM. There is considerable variation between faculties and institutes with respect to data amounts, complexity of data sets, data collection and analysis, and data archiving. Data sharing practices within fields tend to be inconsistent. RDM is predominantly regarded as an administrative task, to the detriment of considerations of good research practice. Problems with RDM fall into two categories: generic problems transcend specific research interests, infrastructures, and departments, while discipline-specific problems need a more targeted approach. The paper extends the state of the art on RDM practices by combining in-depth qualitative material with quantified, detailed data about RDM practices and needs. The findings should be of interest to any comparable research institution with a similar agenda.
This study examines the relationship between a team's gender composition and the outputs of funded projects using a large data set of National Institutes of Health (NIH) R01 grants and their associated publications between 1990 and 2017. It finds that while women investigators' presence in NIH grants is generally low, higher women investigator presence is on average related to a slightly lower number of publications. The study finds empirically that women investigators elect to work in fields in which fewer publications per million dollars of funding are the norm. In fields where women investigators are relatively well represented, they are as productive as men. The overall lower productivity of women investigators may be attributed to the low representation of women in high-productivity fields dominated by men investigators. The findings shed light on possible reasons for gender disparity in grant productivity.
Due to their confidence and dominance, narcissistic leaders oftentimes can be perceived favorably by followers, in particular during times of uncertainty. In this study, we propose and examine the relationship between narcissistic leaders and followers who are prone to experience uncertainty intensely and frequently in general, namely highly anxious followers. We do so by applying machine learning algorithms to account for personality traits in a large sample of leaders and followers on Twitter. We find that highly anxious followers are more likely to interact with narcissistic leaders in general, and male narcissistic leaders in particular. Finally, we also examined these interactions in the context of highly popular leaders and found that as leaders become more popular, they begin to attract less anxious followers, regardless of leader gender. We interpret and discuss these findings in relation to previous work and outline limitations and future research recommendations based on our approach.
When writing a research paper, the author has to select information to include in the paper to support various arguments. The information has to be organized and synthesized into a coherent whole through relationships and information structures. There is hardly any research on the information structure of research papers, and how information structure supports rhetorical and argument structures. Thus, this study is focused on information organization in the Abstract and Introduction sections of sociology research papers, analyzing the information structure of research objective, question, hypothesis, and result statements. The study is limited to research papers reporting research that investigated cause–effect relations between two concepts. Two semantic frames were developed to specify the types of information associated with cause–effect and comparison relations, and used as coding schemes to annotate the text for different information types. Six link patterns between the two frames were identified—showing how comparisons are used to support the claim that the cause-effect relation is valid. This study demonstrated how semantic frames can be incorporated in discourse analysis to identify deep structures underlying the argument structure. The results carry implications for the knowledge representation of academic research in knowledge graphs, for semantic relation extraction, and teaching of academic writing.
This paper examines how shared affiliations within an institution (e.g., same primary appointment, same secondary appointment, same research center, same laboratory/facility) and physical proximity (e.g., walking distance between collaborator offices) shape knowledge creation through biomedical science collaboration in general, and interdisciplinary collaboration in particular. Using archival and publication data, we examine pairwise research collaborations among 1,138 faculty members over a 12-year period at a medical school in the United States. Modeling at the dyadic level, we find that the number of institutional affiliations shared by a pair of faculty members is positively associated with knowledge creation and knowledge impact, and that this association is moderated by the physical proximity of the collaborators. We further find that the positive influence of disciplinary diversity (e.g., collaborators from different fields) on knowledge impact is stronger among pairs that share more affiliations and is significantly reduced as the physical distance between collaborators increases. These results support the idea that shared institutional affiliations and physical proximity can increase interpersonal contact, providing more opportunities to develop trust and mutual understanding, and thus alleviating some of the coordination issues that can arise with higher disciplinary diversity. We discuss the implications for future research on scientific collaborations, managerial practice regarding office space allocation, and strategic planning of initiatives aimed at promoting interdisciplinary collaboration.
This brief communication aims to reveal whether the spillover effect of recommendation information decays with geographical distance. A unique dataset of Airbnb listings in Beijing is collected to perform an empirical analysis and a model simulation. The results demonstrate that the spillover of listings' recommendation information decays with spatial distance and that this decay follows a certain attenuation mode (i.e., it has a certain attenuation rate and attenuation point). As one of the first studies to investigate the attenuation mechanism of information spillover, this communication enriches the current research on information transmission and brings a novel topic to the attention of the community.
This study examines the documents circulated among biomedical equipment repair technicians in order to build a conceptual model that accounts for multilayered temporality in technical healthcare professional communities. A metadata analysis informed by digital forensics and trace ethnography is employed to model the overlapping temporal, format-related, and annotation characteristics present in a corpus of repair manual files crowdsourced during collaborations between volunteer archivists and professional technicians. The corpus originates within iFixit.com's Medical Device Repair collection, a trove of more than 10,000 manuals contributed by working technicians in response to the strain placed on their colleagues and institutions due to the COVID-19 pandemic. The study focuses in particular on the Respiratory Analyzer subcategory of documents, which aid in the maintenance of equipment central to the care of COVID-19 patients experiencing respiratory symptoms. The 40 Respiratory Analyzer manuals in iFixit's collection are examined in terms of their original publication date, the apparent status of their original paper copies, the version of PDF used to encode them, and any additional metadata that is present. Based on these characteristics, the study advances a conceptual model accounting for circulation among multiple technicians, as well as alteration of documents during the course of their lifespans.
Machine learning methods, especially deep learning models, have achieved impressive performance in various natural language processing tasks including sentiment analysis. However, deep learning models require more training data. Data augmentation techniques, which generate new instances by modifying existing data or by relying on external knowledge bases, are widely used to address the scarcity of annotated data, which hinders the full potential of machine learning techniques. This paper presents our work using part-of-speech (POS) focused lexical substitution for data augmentation (PLSDA) to enhance the performance of machine learning algorithms in sentiment analysis. We exploit POS information to identify words to be replaced and investigate different augmentation strategies to find semantically related substitutions when generating new instances. The choice of POS tags as well as a variety of strategies such as semantic-based substitution methods and sampling methods are discussed in detail. Performance evaluation focuses on the comparison between PLSDA and two previous lexical substitution-based data augmentation methods, one of which is thesaurus-based and the other lexicon manipulation-based. Our approach is tested on five English sentiment analysis benchmarks: SST-2, MR, IMDB, Twitter, and AirRecord. Hyperparameters such as the candidate similarity threshold and the number of newly generated instances are optimized. Results show that six classifiers (SVM, LSTM, BiLSTM-AT, bidirectional encoder representations from transformers [BERT], XLNet, and RoBERTa) trained with PLSDA achieve an accuracy improvement of more than 0.6% compared to the two previous lexical substitution methods, averaged over the five benchmarks. Introducing a POS constraint and well-designed augmentation strategies can improve the reliability of lexical data augmentation methods. Consequently, PLSDA significantly improves the performance of sentiment analysis algorithms.
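The general idea of POS-constrained lexical substitution can be illustrated with a minimal sketch. This is not the authors' PLSDA implementation: the abstract does not specify the tagger, the substitution resource, or the sampling method, so the toy POS lookup, the toy candidate table with similarity scores, and the function names eligible_slots and augment are all hypothetical stand-ins chosen for illustration.

```python
import random

# Hypothetical toy resources (assumptions for illustration only).
# A real system would use a POS tagger and a thesaurus or embedding model.
TOY_POS_TAGS = {
    "a": "DET", "the": "DET", "good": "ADJ", "bad": "ADJ",
    "movie": "NOUN", "plot": "NOUN", "was": "VERB",
}
TOY_CANDIDATES = {
    "good": [("great", 0.9), ("fine", 0.6)],
    "bad": [("awful", 0.85), ("poor", 0.8)],
    "movie": [("film", 0.95)],
}


def eligible_slots(tokens, target_pos, threshold):
    """Return (index, candidates) pairs for tokens whose POS tag is targeted
    and that have at least one substitute at or above the similarity threshold."""
    slots = []
    for i, tok in enumerate(tokens):
        if TOY_POS_TAGS.get(tok) not in target_pos:
            continue  # POS constraint: skip words outside the chosen tags
        cands = [w for w, sim in TOY_CANDIDATES.get(tok, []) if sim >= threshold]
        if cands:
            slots.append((i, cands))
    return slots


def augment(tokens, target_pos, threshold, n_new, seed=0):
    """Generate n_new instances by sampling one substitute per eligible slot.
    Returns an empty list when no token can be replaced."""
    rng = random.Random(seed)  # seeded for reproducibility
    slots = eligible_slots(tokens, target_pos, threshold)
    if not slots:
        return []
    new_instances = []
    for _ in range(n_new):
        out = list(tokens)
        for i, cands in slots:
            out[i] = rng.choice(cands)  # uniform sampling (an assumption)
        new_instances.append(out)
    return new_instances
```

In this sketch, calling augment(["a", "good", "movie"], {"ADJ"}, 0.8, 1) replaces only the adjective, because "movie" is a noun and "fine" falls below the threshold. The two arguments threshold and n_new correspond loosely to the candidate similarity threshold and the number of newly generated instances that the abstract lists as tuned hyperparameters.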
Although Washington state sanctuary policies of 2017 prohibit collaboration between local law enforcement and federal immigration enforcement in noncriminal cases, compliance with sanctuary policies has not been systematically studied. We explore information practices and collaboration between local law enforcement and federal immigration enforcement in Grant County, Washington, based on records from November 2017 to May 2019 obtained by the University of Washington Center for Human Rights through Public Records Act (PRA) requests. Qualitative analysis of over 8,000 pages reveals a baseline of passive and active information sharing and collaboration between local law enforcement and federal immigration agencies before Washington sanctuary laws went into effect in May 2019, a practice that needs to stop if agencies are to comply with the laws. We employ a systematic methodology to obtain (through PRA and other Access to Information requests) and analyze official records through qualitative content analysis, to monitor and hold local law enforcement accountable in their compliance with sanctuary laws. This method can be used to examine law enforcement information behaviors in other counties in Washington, and in other states that offer sanctuary protections, as a way to monitor compliance with sanctuary laws and strengthen the protection of immigrants' rights.
This article discusses how letters to the editor boost publishing metrics for journals and authors, and then examines letters published since 2015 in six elite journals, including the Journal of the Association for Information Science and Technology. The initial findings identify some potentially anomalous use of letters and unusual self-citation patterns. The article proposes that Clarivate Analytics consider slightly reconfiguring the Journal Impact Factor to more fairly account for letters and that journals transparently explain their letter submission policies.
The COVID-19 pandemic emptied classrooms across the globe and pushed administrators, students, educators, and parents into an uneasy alliance with online learning systems that were already committing serious privacy and intellectual property violations and actively promoting the precarity of educational labor. In this article, we use methods and theories derived from critical informatics to examine [anonymized] University's deployment of seven online learning platforms commonly used in higher education to uncover five themes that result from the deployment of corporate learning platforms. We conclude by suggesting ways ahead to meaningfully address the structural power and vulnerabilities extended by higher education's use of these platforms.
The spread of misinformation on social media has become a major societal issue in recent years. In this work, we used the ongoing COVID-19 pandemic as a case study to systematically investigate factors associated with the spread of multi-topic misinformation related to one event on social media based on the heuristic-systematic model. Among factors related to systematic processing of information, we discovered that the topics of a misinformation story matter, with conspiracy theories being the most likely to be retweeted. As for factors related to heuristic processing of information, such as when citizens look up to their leaders during such a crisis, our results demonstrated that behaviors of a political leader, former US President Donald J. Trump, may have nudged people's sharing of COVID-19 misinformation. Outcomes of this study help social media platforms and users better understand and prevent the spread of misinformation on social media.
Information studies have identified numerous needs and barriers to the integration of asylum seekers and refugees; however, little emphasis has been placed thus far on their need to keep their own culture, values, and traditions alive. In this work, we use ethnographic constructivist grounded theory to explore the place of heritage in the information experience of people who have sought asylum in the United Kingdom. Based on our findings, we propose to conceptualize heritage as an affective and meaningful information literacy practice. Such conceptualization fosters integration by allowing people to simultaneously maintain their own ways of knowing and adapt to local ones. Our research approach provides scholars with a conceptual tool to holistically explore affective, meaningful, and cultural information practices. This study also reveals implications for policymakers, third sector organizations, and cultural institutions working toward the more sustainable integration of asylum seekers and refugees.
This research analyzes human-generated clarification questions to provide insights into how they are used to disambiguate and provide a better understanding of information needs. A set of clarification questions is extracted from posts on the Stack Exchange platform. A novel taxonomy is defined for the annotation of the questions and their responses. We investigate the clarification questions in terms of whether they add any information to the post (the initial question posted by the asker) and the accepted answer, which is the answer chosen by the asker. After identifying which clarification questions are more useful, we investigate the characteristics of these questions in terms of their types and patterns. Non-useful clarification questions are identified, and their patterns are compared with those of useful clarification questions. Our analysis indicates that the most useful clarification questions have similar patterns, regardless of topic. This research contributes to an understanding of clarification in conversations and can provide insight for clarification dialogues in conversational search scenarios and for the possible system generation of clarification requests in information-seeking conversations.
Open research data repositories are promoted as one of the cornerstones in the open research paradigm, promoting collaboration, interoperability, and large-scale sharing and reuse. There is, however, a lack of research investigating what these sharing platforms actually share and a more critical interface analysis of the norms and practices embedded in this datafication of academic practice is needed. This article takes image data sharing in the humanities as a case study for investigating the possibilities and constraints in 5 open research data repositories. By analyzing the visual and textual content of the interface along with the technical means for metadata, the study shows how the platforms are differentiated in terms of signifiers of research paradigms, but that beneath the rhetoric of the interface, they are designed in a similar way, which does not correspond well with the image researchers' need for detailed metadata. Combined with the problem of copyright limitations, these data-sharing tools are simply not sophisticated enough when it comes to sharing and reusing images. The result also corresponds with previous research showing that these tools are used not so much for sharing research data, but more for promoting researcher personas.
This study explores the use of PubPeer by the scholarly community, to understand the issues discussed in an online journal club, the disciplines most commented on, and the characteristics of the most prolific users. A sample of 39,985 posts about 24,779 publications was extracted from PubPeer in 2019 and 2020. These comments were divided into seven categories according to their degree of seriousness (Positive review, Critical review, Lack of information, Honest errors, Methodological flaws, Publishing fraud, and Manipulation). The results show that more than two-thirds of comments are posted to report some type of misconduct, mainly about image manipulation. These comments generate the most discussion and take longer to be posted. By discipline, Health Sciences and Life Sciences are the most discussed research areas. The results also reveal “super commenters,” users who access the platform to systematically review publications. The study ends by discussing how various disciplines use the site for different purposes.
As part of the effort to digitally transform, organizations are seeking more and better solutions to long-standing enterprise content management challenges. Such solutions rarely investigate the relationship between knowledge workers' daily work to capture information and the perceived or actual value of that information to the enterprise per established content management strategy. The study described in this paper seeks to identify gaps in content management practices versus policy by modeling the conventions by which one organization's knowledge workers typically generate, store, and later recover their daily work products. Thirty-five interviews with knowledge workers at the Jet Propulsion Laboratory were conducted on this subject. The results of these interviews provide an insight as to how knowledge workers interact with enterprise content in their dual roles as both the primary creators and primary consumers of enterprise content. This paper, which outlines various permutations of the digital object creation, description, and storage (CDS) model, provides basic strategies for bringing the value perceptions of knowledge workers into alignment with institutional directives related to improving content findability and reuse in the enterprise.
The emphasis on research evaluation has brought scrutiny to the role of self-citations in the scholarly communication process. While author self-citations have been studied at length, little is known about national-level self-references (SRs). This paper analyzes the citation context of national SRs, using the full text of 184,859 papers published in PLOS journals. It investigates the differences between national SRs and nonself-references (NSRs) in terms of their in-text mention, presence in enumerations, and location features. For all countries, national SRs exhibit a higher level of engagement than NSRs. NSRs are more often found in enumerative citances than SRs, which suggests that researchers pay more attention to domestic than foreign studies. There are more mentions of national research in the methods section, which provides evidence that methodologies developed in a nation are more likely to be used by other researchers from the same nation. Publications from the United States are cited at a higher rate in each of the sections, indicating that the country still maintains a dominant position in science. On the whole, this paper contributes to a better understanding of the role of national SRs in the scholarly communication system, and how it varies across countries and over time.
This opinion piece takes Google's response to the so-called COVID-19 infodemic as a starting point to argue for the need to consider societal relevance as a complement to other types of relevance. The authors maintain that if information science wants to be a discipline at the forefront of research on relevance, search engines, and their use, then the information science research community needs to address the challenges and conditions that commercial search engines create. The article concludes with a tentative list of related research topics.
This article considers the interdisciplinary opportunities and challenges of working with digital cultural heritage, such as digitized historical newspapers, and proposes an integrated digital hermeneutics workflow to combine purely disciplinary research approaches from computer science, humanities, and library work. Common interests and motivations of the above-mentioned disciplines have resulted in interdisciplinary projects and collaborations such as the NewsEye project, which is working on novel solutions for how digital heritage data is (re)searched, accessed, used, and analyzed. We argue that collaborations between different disciplines can benefit from a good understanding of the workflows and traditions of each of the disciplines involved but must find integrated approaches to successfully exploit the full potential of digitized sources. The paper furthermore provides insight into digital tools, methods, and hermeneutics in action, showing that integrated interdisciplinary research needs to build something in between the disciplines while respecting and understanding each other's expertise and expectations.
Virtue epistemology offers a yet-untapped path for ethical development in information science. This paper presents two empirical studies on intellectual humility (IH), a cornerstone intellectual virtue. Centrally, IH is a matter of being open to the possibility that one may be misinformed or uninformed; it involves accurately valuing one's beliefs according to the evidence. The studies presented in this paper explore the relationship between IH and people's information seeking and use. First, a correlational questionnaire study was conducted with 201 participants considering a recent, real-life task; second, a concurrent thinkaloud study was conducted with 8 participants completing 3 online search tasks. These studies give further color to prior assertions that people with higher IH engage in more information seeking. The results show, for instance, that those with higher IH may actually favor more easily accessible information sources and that some dimensions of IH, such as modesty and engagement, may be most important to information seeking. These findings offer a nuanced understanding of the relationship between IH and information behavior and practices. They suggest avenues for further research, and they may be applied in educational contexts and sociotechnical design.
This article examines the benefits of putting Indigenous perspectives and the digital humanities (DH) in conversation with each other in order to elaborate a DH approach that is suitable for Indigenous research and to suggest critical perspectives for a more sustainable DH. For this purpose, the article examines practices of data harvesting, categorizing, and sharing from the perspectives of groups in the margin, more specifically in relation to Sámi research. Previous research has emphasized the role of cultural and social contexts in the design, use, and adaptation of technologies in general, and digital technologies in particular (Douglas, 1987. Inventing American broadcasting; Nissenbaum, 2001. Computer, 34, 118–120; Powell & Aitken, 2011. The American literature scholar in the digital age), and several scholars have argued that the application of critical studies makes a fruitful contribution to the DH (Liu, 2012. Debates in the digital humanities; McPherson, 2012. Debates in the digital humanities). This article suggests an approach that addresses a need to acknowledge the diversity of technoscientific traditions. The perspectives of Indigenous groups bring this matter to a head. In order to make the DH more sustainable and inclusive, the development of the DH should be driven by cultural studies to a greater extent than it has been so far. A sustainable DH also means a better rendering of the plurality of the cultural values, perspectives, and ethics that characterize our fieldwork and research subjects.