News on eLiteracias

The Code4Lib Journal

“You could use the API!”: A Crash Course in Working with the Alma APIs using Postman

By Rebecca Hyams and Tamara Pilko
While there are those within libraries who are able to take vendor APIs and use them to power applications and innovative workflows in their respective systems, there are those of us who may have heard of APIs but have only the slightest idea of how to actually make use of them. Often colleagues in various forums will mention that a task could be “just done with the API” but provide little information to take us from “this is what an API is” or “here’s the API documentation” to actually putting them to use. Looking for a way to automate tasks in Alma, the authors of this article both found themselves in such a position and then discovered Postman, an API platform with a user-friendly interface that simplifies sending API calls as well as using bulk and chained requests. This article gives a basic primer on how to set up Postman and how to use it to work with ExLibris’ Alma APIs, and presents the authors’ use cases in working with electronic inventory and course reserves.
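
For readers who want to see what a single Alma API call looks like outside Postman, the following is a minimal sketch in Python. It only builds the request and does not send it. The regional host, the MMS ID, and the key are placeholders, and the function name `build_bib_request` is made up for this illustration; the Ex Libris documentation is the authority on endpoints and parameters.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: North American gateway; other regions use a different host.
ALMA_BASE = "https://api-na.hosted.exlibrisgroup.com/almaws/v1"


def build_bib_request(mms_id: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for one bibliographic record."""
    query = urlencode({"format": "json"})  # ask for JSON instead of XML
    url = f"{ALMA_BASE}/bibs/{mms_id}?{query}"
    # The API key travels in the Authorization header.
    headers = {"Authorization": f"apikey {api_key}"}
    return Request(url, headers=headers, method="GET")


# Placeholder identifiers, for illustration only.
request = build_bib_request("991234567890", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call; in Postman the same URL and header are entered in the request builder.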

Archiving an Early Web-Based Journal: Addressing Issues of Workflow, Authenticity, and Bibliodiversity

By Nick Szydlowski, Rhonda Holberton, Erika Johnson
SWITCH is a journal of new media art that has been published in an online-only format since 1995 by the CADRE Laboratory for New Media at San José State University (SJSU). The journal is distinctive in its commitment to presenting scholarship and criticism on new media art in a visual format that reflects and enhances its engagement with the subject. This approach, which includes the practice of redesigning the journal’s platform and visual presentation for each issue, raises significant challenges for the long-term preservation of the journal, as well as immediate issues related to indexing and discovery. This article describes the initial stages of a collaboration between the Martin Luther King, Jr. Library and the CADRE Laboratory at SJSU to archive and index SWITCH and to host a copy of the journal on SJSU’s institutional repository, SJSU ScholarWorks. It will describe the process of harvesting the journal, share scripts used to extract metadata and modify files to address accessibility and encoding issues, and discuss an ongoing curricular project that engages CADRE students in the process of augmenting metadata for SWITCH articles. The process reflects the challenges of creating an authentic version of this journal that is also discoverable and citable within the broader scholarly communication environment. This effort is part of a growing multi-institutional project to archive the new media art community in the Bay Area in a 3D web exhibition format.

Building CyprusArk a Web Content Management System for Small Museums Collections Online

By Avgoustinos Avgousti, Georgios Papaioannou, and Feliz Ribeiro Gouveia
This article introduces CyprusArk, a work-in-progress solution to the problems that small museums in Cyprus have in providing online access to their collections. CyprusArk is an open-source web content management system for small museums’ online collections. It was developed as part of Avgousti’s Ph.D. thesis and is based on qualitative data collected from six small museums in Cyprus.

Predictable Book Shifting

By Joshua Lambert
There are many methods to carry out a library book shift, but those methods allow for varying degrees of predictability. The script described in this article, when used in conjunction with accurate measurements of a library's collection and shelving, provides library staff with predictability, flexibility, and the ability to shift in parallel. For every shelf, the script outputs a phrase such as the following: "The last book from this shelf goes 12.3 in/cm into shelf 776." While complicated shifts can still create surprises, using this or similar methods typically makes those surprises easy to correct.
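
The kind of calculation behind such a phrase can be sketched as follows. This is not the author's script; it is a simplified illustration that assumes every destination shelf receives the same length of books (`fill_per_shelf`) and that the collection can be treated as one continuous run of measured lengths. The function and parameter names are hypothetical.

```python
from itertools import accumulate


def plan_shift(occupied, fill_per_shelf, first_dest_shelf=1):
    """Locate where the last book of each current shelf lands after a shift.

    occupied       -- measured length of books on each current shelf, in order
    fill_per_shelf -- length of books to place on every destination shelf
    Returns one (destination_shelf, offset_into_that_shelf) pair per shelf.
    """
    plan = []
    for running_total in accumulate(occupied):
        full_shelves, offset = divmod(running_total, fill_per_shelf)
        if offset == 0 and running_total > 0:
            # The last book exactly fills the previous destination shelf.
            full_shelves -= 1
            offset = fill_per_shelf
        plan.append((first_dest_shelf + int(full_shelves), offset))
    return plan


# Made-up measurements in a single unit (inches or centimetres).
for shelf, (dest, offset) in enumerate(plan_shift([30, 28, 32], 24), start=1):
    print(f"The last book from shelf {shelf} goes {offset:.1f} in/cm into shelf {dest}.")
```

Because each line of the plan depends only on the running total, different teams can start at different shelves and still meet at the predicted positions, which is what makes shifting in parallel possible.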

Simplifying ARK ID management for persistent access to digital objects

By Kyle Huynh, Natkeeran Ledchumykanthan, Kirsta Stapelfeldt, Irfan Rahman
This article will provide a brief overview of considerations made by the UTSC Library in selecting a persistent identifier scheme for digital collections in a mid-sized Canadian library. ARKs were selected for their early support of digital object management, the low-cost, decentralized capabilities of the ARK system, and the usefulness of ARK URLs during system migration projects. In the absence of a subscription to a centralized resolver service for ARKs, the UTSC Library Digital Scholarship Unit built an open source PHP-based application for minting, binding, managing, and tracking ARK IDs. This article will introduce the application's architecture and affordances, which may be useful to others in the library community with similar use cases, as well as the approach to using ARKs planned for an Islandora 2.x system.
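
To make "minting" and "binding" concrete, here is a small in-memory sketch in Python. It is not the UTSC application (which the abstract describes as PHP-based); the class name `ArkRegistry`, the NAAN `99999`, the shoulder `fk4`, and the example URL are all placeholders chosen for illustration.

```python
import secrets

# Digits plus consonants (no vowels, no "l"), a character set often used
# for ARK names so that identifiers do not accidentally spell words.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


class ArkRegistry:
    """Toy registry: mint ARK identifiers and bind them to target URLs."""

    def __init__(self, naan, shoulder=""):
        self.naan = naan          # Name Assigning Authority Number
        self.shoulder = shoulder  # optional prefix for a sub-collection
        self.bindings = {}        # ARK -> target URL (None until bound)

    def mint(self, length=8):
        """Create a new, unused ARK with a random name."""
        while True:
            name = "".join(secrets.choice(ALPHABET) for _ in range(length))
            ark = f"ark:/{self.naan}/{self.shoulder}{name}"
            if ark not in self.bindings:
                self.bindings[ark] = None
                return ark

    def bind(self, ark, url):
        """Associate a previously minted ARK with the object's current URL."""
        if ark not in self.bindings:
            raise KeyError(f"unknown ARK: {ark}")
        self.bindings[ark] = url

    def resolve(self, ark):
        """Return the URL currently bound to the ARK, or None."""
        return self.bindings.get(ark)


registry = ArkRegistry("99999", shoulder="fk4")
ark = registry.mint()
registry.bind(ark, "https://example.org/objects/1")
print(ark, "->", registry.resolve(ark))
```

During a system migration only the binding changes: calling `bind` again with the object's new URL leaves the published ARK untouched.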

Preservation and Visualization of the Rural Route Nomad Photo and Video Collection

By Alan Webber
This article documents the steps taken in the preservation of a personal photo and video project, “Rural Route Nomad,” consisting of 14,058 born-digital objects from over a dozen different digital cameras used on world travels throughout all seven continents from the end of 2008 through 2009. Work was done independently, “DIY” if you will, with professional standards implemented in a manageable way sans the more extensive resources of a larger institution. Efforts were undertaken in three main stages: preservation, dataset generation, and visualization.

Ontology for Voice, Instruments, and Ensembles (OnVIE): Revisiting the Medium of Performance Concept for Enhanced Discoverability

By Kimmy Szeto
Medium of performance—instruments, voices, and devices—is a frequent starting point in library users’ search for music resources. However, content and encoding standards for library cataloging have not been developed in a way that enables clear and consistent recording of medium of performance information. Consequently, unless specially configured, library discovery systems do not display medium of performance or provide this access point. Despite efforts to address this issue in the past decade in RDA, MARC, and the linked data environment, medium of performance information continues to be imprecise, dispersed across multiple fields or properties, and implied in other data elements. This article proposes revised definitions for “part,” “medium,” “performer,” and “ensemble,” along with a linked data model, the Ontology for Voice, Instruments, and Ensembles (OnVIE), that captures precise and complete medium of performance data reflecting music compositional practices, performance practices, and publishing conventions. The result is an independent medium of performance framework for recording searchable and machine-actionable metadata that can be hooked on to established library metadata ontologies and is widely applicable to printed and recorded classical, popular, jazz, and folk music. The clarity, simplicity, and extensibility of this model enable machine parsing so that the data can be searched, filtered, sorted, and displayed in multiple, creative ways.

Teaching AI when to care about gender

By James Powell, Kari Sentz, Elizabeth Moyer, Martin Klein
Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) concerned with solving language tasks by modeling large amounts of textual data. Some NLP techniques use word embeddings, which are semantic models in which machine learning (ML) is used to cluster semantically related words by learning about word co-occurrences in the original training text. Unfortunately, these models tend to reflect or even exaggerate biases that are present in the training corpus. Here we describe the Word Embedding Navigator (WEN), which is a tool for exploring word embedding models. We examine a specific potential use case for this tool: interactive discovery and neutralization of gender bias in word embedding models, and compare this human-in-the-loop approach to reducing bias in word embeddings with a debiasing post-processing technique.
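
The abstract does not say which post-processing technique is used for comparison. As an illustration of what "neutralizing" a word vector can mean, the sketch below shows one commonly used idea: estimate a gender direction from a word pair and remove each word's component along it. The three-dimensional vectors are toy values invented for the example, and none of this is WEN's own code.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def gender_direction(she, he):
    """Unit vector pointing from 'he' toward 'she' in embedding space."""
    return normalize([s - h for s, h in zip(she, he)])


def neutralize(word_vec, direction):
    """Remove the component of word_vec that lies along the direction."""
    scale = dot(word_vec, direction)
    return [w - scale * d for w, d in zip(word_vec, direction)]


# Toy 3-dimensional embeddings, invented for illustration.
he = [1.0, 0.0, 0.2]
she = [-1.0, 0.0, 0.2]
nurse = [-0.6, 0.5, 0.3]

direction = gender_direction(she, he)
neutral_nurse = neutralize(nurse, direction)
print("before:", dot(nurse, direction))
print("after: ", dot(neutral_nurse, direction))
```

A human-in-the-loop tool differs mainly in who chooses the words to neutralize: a person inspects neighbours in the model and decides case by case, rather than applying the projection to a predefined list.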

Annif Analyzer Shootout: Comparing text lemmatization methods for automated subject indexing

By Osma Suominen, Ilkka Koskenniemi
Automated text classification is an important function for many AI systems relevant to libraries, including automated subject indexing and classification. When implemented using the traditional natural language processing (NLP) paradigm, one key part of the process is the normalization of words using stemming or lemmatization, which reduces the amount of linguistic variation and often improves the quality of classification. In this paper, we compare the output of seven different text lemmatization algorithms as well as two baseline methods. We measure how the choice of method affects the quality of text classification using example corpora in three languages. The experiments have been performed using the open source Annif toolkit for automated subject indexing and classification, but the results should also generalize to other NLP toolkits and similar text classification tasks. The results show that lemmatization methods in most cases outperform baseline methods in text classification, particularly for Finnish and Swedish text, but not for English, where baseline methods are most effective. The differences between lemmatization methods are quite small. The systematic comparison will help optimize text classification pipelines and inform the further development of the Annif toolkit to incorporate a wider choice of normalization methods.

Editorial: On FOSS in Libraries

By Andrew Darby
Some thoughts on the state of free and open source software in libraries.

Works, Expressions, Manifestations, Items: An Ontology

By Karen Coyle
The concepts first introduced in the FRBR document and known as "WEMI" have been employed in situations quite different from the library bibliographic catalog. This is evidence that a definition of similar classes that are more general than those developed for library usage would benefit metadata developers broadly. This article proposes a minimally constrained set of classes and relationships that could form the basis for a useful model of created works.

Lantern: A Pandoc Template for OER Publishing

By Chris Diaz
Lantern is a template and workflow for using Pandoc and GitHub to create and host multi-format open educational resources (OER) online. It applies minimal computing methods to OER publishing practices. The purpose is to minimize the technical footprint for digital publishing while maximizing control over the form, content, and distribution of OER texts. Lantern uses Markdown and YAML to capture an OER’s source content and metadata and Pandoc to transform it into HTML, PDF, EPUB, and DOCX formats. Pandoc’s options and arguments are pre-configured in a Bash script to simplify the process for users. Lantern is available as a template repository on GitHub. The template repository is set up to run Pandoc with GitHub Actions and serve output files on GitHub Pages for convenience; however, GitHub is not a required dependency. Lantern can be used on any modern computer to produce OER files that can be uploaded to any modern web server.
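
The multi-format conversion described above can be pictured as one Pandoc invocation per output format. The sketch below builds those command lines in Python without running them. It is not Lantern's Bash script; the file names and the function name `pandoc_commands` are placeholders, and only widely documented Pandoc options are used.

```python
def pandoc_commands(source, metadata, basename,
                    formats=("html", "pdf", "epub", "docx")):
    """Build one Pandoc command (as an argument list) per output format.

    Pandoc infers the output format from the file extension. PDF output
    additionally requires a PDF engine to be installed on the machine.
    """
    return [
        [
            "pandoc", source,
            "--metadata-file", metadata,  # YAML file with title, author, etc.
            "--standalone",               # produce a complete document
            "--output", f"{basename}.{ext}",
        ]
        for ext in formats
    ]


# Placeholder file names, for illustration only.
for command in pandoc_commands("text/chapter.md", "metadata.yml", "public/book"):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run` on a machine where Pandoc is installed; keeping the options in one place is the same idea as pre-configuring them in a script.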

Fractal in detail: What information is in a file format identification report?

By Ross Spencer
A file format identification report, such as those generated by the digital preservation tools DROID, Siegfried, or FIDO, contains an incredible wealth of information. Used to scan discrete sets of files comprising a part of, or the entirety of, a digital collection, these datasets can serve as entry points for further activities including appraisal, identification of future work efforts, and the facilitation of transfer of digital objects into preservation storage. The information contained in them is fractal in detail and there are numerous outputs that can be generated from that detail. This paper describes the purpose of a file format identification report and the extensive information that can be extracted from one. It summarizes a number of ways of transforming them into the inputs for other systems and describes a handful of the tools already doing so. The paper concludes that a format identification report is a pivotal artefact in the digital transfer process, and asks the reader to consider how they might leverage them and the benefits doing so might provide.
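
As a small illustration of extracting information from such a report, the sketch below summarizes a CSV export by format identifier. The column names are modeled on a DROID-style CSV export but should be treated as assumptions, the sample rows are invented, and the function name `summarize` is hypothetical.

```python
import csv
import io
from collections import Counter

# Invented sample rows; a real report has many more columns and files.
SAMPLE_REPORT = """NAME,TYPE,SIZE,PUID,FORMAT_NAME
report.pdf,File,10240,fmt/18,Acrobat PDF 1.4
photo1.jpg,File,20480,fmt/43,JPEG File Interchange Format
photo2.jpg,File,30720,fmt/43,JPEG File Interchange Format
notes.xyz,File,512,,
"""


def summarize(report_text):
    """Count files per format identifier and list unidentified files."""
    rows = list(csv.DictReader(io.StringIO(report_text)))
    by_puid = Counter(row["PUID"] or "unidentified" for row in rows)
    return {
        "files": len(rows),
        "total_bytes": sum(int(row["SIZE"]) for row in rows),
        "by_puid": dict(by_puid),
        "unidentified": [row["NAME"] for row in rows if not row["PUID"]],
    }


summary = summarize(SAMPLE_REPORT)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Even this coarse summary supports the activities the abstract mentions: the list of unidentified files is a starting point for appraisal, and the per-format counts indicate where future preservation work is likely to be needed.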

Editorial — New name change policy

By Ron Peterson
The Code4Lib Journal Editorial Committee is implementing a new name change policy that aims to facilitate the process and ensure timely and comprehensive name changes for anyone who needs to change their name within the Journal.

Automating reference consultation requests with JavaScript and a Google Form

By Stephen Zweibel
At the CUNY Graduate Center Library, reference consultation requests were previously sent to a central email address, then manually directed by our head of reference to the appropriate subject expert. This process was cumbersome, and because the inbox was not checked every day, responses were delayed and messages were occasionally missed. In order to streamline this process, I created a form and wrote a script that uses the answers in the form to automatically forward any consultation requests to the correct subject specialist. This was done using JavaScript, Google Sheets, and the Google Apps Script backend. When a patron requesting a consultation fills out the form, they include their field of research. This field is associated in my script with a particular subject specialist librarian, who then receives an email with the pertinent information. Rather than requiring either that patrons themselves search for the right subject specialist, or that library faculty spend time distributing messages to the right liaison, this enables a smoother, more direct interaction. In this article, I will describe the steps I took to write this script, using only freely available online software.
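
The routing logic described above amounts to a lookup from field of research to specialist. The author's script is written in JavaScript for Google Apps Script; the Python sketch below only illustrates the same idea. All addresses, field names, and the function name `route_request` are invented, and no email is actually sent.

```python
# Hypothetical mapping from field of research to subject specialist.
SPECIALISTS = {
    "history": "history.librarian@example.edu",
    "psychology": "psychology.librarian@example.edu",
    "music": "music.librarian@example.edu",
}
# Fallback when a field has no assigned specialist.
DEFAULT_RECIPIENT = "reference@example.edu"


def route_request(form_response):
    """Turn one form submission into an email message description."""
    field = form_response["field_of_research"].strip().lower()
    recipient = SPECIALISTS.get(field, DEFAULT_RECIPIENT)
    body = (
        f"Patron: {form_response['name']} <{form_response['email']}>\n"
        f"Field of research: {form_response['field_of_research'].strip()}\n"
        f"Question: {form_response['question']}"
    )
    return {
        "to": recipient,
        "subject": f"Consultation request: {field}",
        "body": body,
    }


message = route_request({
    "name": "A. Patron",
    "email": "patron@example.edu",
    "field_of_research": " History ",
    "question": "Where can I find digitized 19th-century newspapers?",
})
print(message["to"])
print(message["body"])
```

The fallback address matters in practice: without it, a request in a field that has no assigned specialist would be lost, which is the failure the original central inbox was prone to.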

The DSA Toolkit Shines Light Into Dark and Stormy Archives

By Shawn M. Jones, Himarsha R. Jayanetti, Alex Osborne, Paul Koerbin, Martin Klein, Michele C. Weigle, Michael L. Nelson
Themed web archive collections exist to make sense of archived web pages (mementos). Some collections contain hundreds of thousands of mementos. There are many collections about the same topic. Few collections on platforms like Archive-It include standardized metadata. Reviewing the documents in a single collection thus becomes an expensive proposition. Search engines help find individual documents but do not provide an overall understanding of each collection as a whole. Visitors need to be able to understand what individual collections contain so they can make decisions about individual collections and compare them to each other. The Dark and Stormy Archives (DSA) Project applies social media storytelling to a subset of a collection to facilitate collection understanding at a glance. As part of this work, we developed the DSA Toolkit, which helps archivists and visitors leverage this capability. As part of our recent International Internet Preservation Consortium (IIPC) grant, Los Alamos National Laboratory (LANL) and Old Dominion University (ODU) piloted the DSA toolkit with the National Library of Australia (NLA). Collectively we have made numerous improvements, from better handling of NLA mementos to native Linux installers to more approachable Web User Interfaces. Our goal is to make the DSA approachable for everyone so that end-users and archivists alike can apply social media storytelling to web archives.

Citation Needed: Adding Citations to CONTENTdm Records

By Jenn Randles & Andrew Bullen
The Tennessee State Library and Archives and the Illinois State Library identified a need to add citation information to individual image records in OCLC’s CONTENTdm. Experience with digital archives at both institutions showed that citation information was one of the most requested features. Unfortunately, CONTENTdm does not natively display citation information about image records; to add this functionality, custom JavaScript had to be written that would interact with the underlying React environment and parse out or retrieve the appropriate metadata to dynamically build record citations. Detailed code and a description of methods for building two different models of citation generators are presented.

Supporting open access, integrating distributed research platforms, and building a research information management platform

By Daniel M. Coughlin, Cynthia Hudson Vitale

Academic libraries are often called upon by their university communities to collect, manage, and curate information about the research activity produced at their campuses. Proper research information management (RIM) can be leveraged for multiple institutional contexts, including networking, reporting activities, building faculty profiles, and supporting the reputation management of the institution.

In the last ten to fifteen years the adoption and implementation of RIM infrastructure has become widespread throughout the academic world. Approaches to developing and implementing this infrastructure have varied, from commercial and open-source options to locally developed instances. Each piece of infrastructure has its own functionality, features, and metadata sources. There is no single application or data source to meet all the needs of these varying pieces of research information; instead, many of these systems together create an ecosystem to provide for the diverse set of needs and contexts.

This paper examines the systems at Pennsylvania State University that contribute to our RIM ecosystem, how and why we developed another piece of supporting infrastructure for our Open Access policy, and the successes and challenges of this work.

Strategies for Preserving Digital Scholarship / Humanities Projects

By Kirsta Stapelfeldt, Sukhvir Khera, Natkeeran Ledchumykanthan, Lara Gomez, Erin Liu, and Sonia Dhaliwal
The Digital Scholarship Unit (DSU) at the University of Toronto Scarborough library frequently partners with faculty for the creation of digital scholarship (DS) projects. However, managing a completed project can be challenging when it is no longer under active development by the original project team, and resources allocated to its ongoing maintenance are scarce. Maintaining inactive projects on the live web bloats staff workloads or is not possible due to limited staff capacity. As technical obsolescence meets a lack of staff capacity, the gradual disappearance of digital scholarship projects forms a gap in the scholarly record. This article discusses the Library DSU’s experimentations with using web archiving technologies to capture and describe digital scholarship projects, with the goal of accessioning the resulting web archives into the Library’s digital collections. In addition to comparing some common technologies used for crawling and replay of archives, this article describes aspects of the technical infrastructure the DSU is building with the goal of making web archives discoverable and playable through the library’s digital collections interface.

Automated 3D Printing in Libraries

By Brandon Patterson, Ben Engel, and Willis Holle
This article highlights an automated 3D printing system created at a health sciences library at a large research university. As COVID-19 limited in-person interaction with 3D printers, a group of library staff came together to code a form that took users’ 3D print files and connected them to machines automatically. A ticketing system and payment form were also automated via this system. The only in-person interaction is by dedicated staff members who unload the prints. This article will describe the journey in getting to an automated system and share code and strategies so others can try it for themselves.

Core Concepts and Techniques for Library Metadata Analysis

By Stacie Traill and Martin Patrick
Metadata analysis is a growing need in libraries of all types and sizes, as demonstrated in many recent job postings. Data migration, transformation, enhancement, and remediation all require strong metadata analysis skills. But there is no well-defined body of knowledge or competencies list for library metadata analysis, leaving library staff with analysis-related responsibilities largely on their own to learn how to do the work effectively. In this paper, two experienced metadata analysts will share what they see as core knowledge areas and problem solving techniques for successful library metadata analysis. The paper will also discuss suggested tools, though the emphasis is intentionally not to prescribe specific tools, software, or programming languages, but rather to help readers recognize tools that will meet their analysis needs. The goal of the paper is to help library staff and their managers develop a shared understanding of the skill sets required to meet their library’s metadata analysis needs. It will also be useful to individuals interested in pursuing a career in library metadata analysis and wondering how to enhance their existing knowledge and skills for success in analysis work.

Leveraging a Custom Python Script to Scrape Subject Headings for Journals

By Shelly R. McDavid, Eric McDavid, and Neil E. Das
In our current library fiscal climate, with yearly inflationary cost increases of 2-6+% for many journals and journal package subscriptions, it is imperative that libraries strive to make our budgets go further to expand our suite of resources. As a result, most academic libraries annually undertake some form of electronic journal review, employing factors such as cost per use to inform budgetary decisions. In this paper we detail some tech-savvy processes we created to leverage a Python script to automate journal subject heading generation within OCLC’s WorldCat catalog, the MOBIUS (A Missouri Library Consortium) Catalog, and the VuFind Library Catalog, a now retired catalog for the CARLI (Consortium for Academic and Research Libraries in Illinois). We also describe the rationale for the inception of this project, the methodology we utilized, the current limitations, and details of our future work in automating our annual analysis of journal subject headings by use of an OCLC API.

Editorial: The Cost of Knowing Our Users

By Mark Swenson
Some musings on the difficulty of wanting to know our users' secrets and simultaneously wanting to not know them.

Closing the Gap between FAIR Data Repositories and Hierarchical Data Formats

By Connor B. Bailey, Fedor F. Balakirev, and Lyudmila L. Balakireva
Many in the scientific community, particularly in publicly funded research, are pushing to adhere to more accessible data standards to maximize the findability, accessibility, interoperability, and reusability (FAIR) of scientific data, especially with the growing prevalence of machine learning augmented research. Online FAIR data repositories, such as the Open Science Framework (OSF), help facilitate the adoption of these standards by providing frameworks for storage, access, search, APIs, and other features that create organized hubs of scientific data. However, the wider acceptance of such repositories is hindered by the lack of support of hierarchical data formats, such as Technical Data Management Streaming (TDMS) and Hierarchical Data Format 5 (HDF5), that many researchers rely on to organize their datasets. Various tools and strategies should be used to allow hierarchical data formats, FAIR data repositories, and scientific organizations to work more seamlessly together. A pilot project at Los Alamos National Laboratory (LANL) addresses the disconnect between them by integrating the OSF FAIR data repository with hierarchical data renderers, extending support for additional file types in their framework. The multifaceted interactive renderer displays a tree of metadata alongside a table and plot of the data channels in the file. This allows users to quickly and efficiently load large and complex data files directly in the OSF webapp. Users who are browsing files can quickly and intuitively see the files in the way they or their colleagues structured the hierarchical form and immediately grasp their contents. This solution helps bridge the gap between hierarchical data storage techniques and FAIR data repositories, making both of them more viable options for scientific institutions like LANL which have been put off by the lack of integration between them.

Using Low Code to Automate Public Service Workflows: Three Cases

By Dianna Morganti and Jess Williams
Public service librarians without coding experience or technical education may not always be aware of or consider automation to be an option to streamline their regular work tasks, but the new prevalence of enterprise-level low code solutions allows novices to take advantage of technology to make their work more efficient and effective. Low code applications apply a graphic user interface on top of a coding platform to make it easy for novices to leverage automation at work. This paper presents three cases of using low code solutions for automating public service problems using the prevalent Microsoft Power Automate application, available in many library workplaces that use the Microsoft Office ecosystem. From simplifying the communication and scheduling process for instruction classes to connecting our student workers’ hourly floor counts to our administrators’ dashboard of building occupancy, we’ve leveraged simple low code automation in a scalable and replicable manner. Pseudo-code examples are provided.