News on eLiteracias


A Conversation about LibGuides and SubjectsPlus

By Monica Ruane Rogers
Volume 25, Issue 4, October-December 2021, Page 121-126
  • 26 May 2021, 06:41

Automatic metadata extraction via image processing using Migne's Patrologia Graeca

By Evagelos Varthis

Evagelos Varthis; Marios Poulos; Ilias Giarenis; Sozon Papavlasopoulos
International Journal of Metadata, Semantics and Ontologies, Vol. 14, No. 4 (2020) pp. 265 - 278
A wealth of knowledge is kept in libraries and cultural institutions in various digital forms without, however, the possibility of a simple term search, let alone a substantial semantic search. In this study, a novel approach is proposed which strives to recognise words and automatically generate metadata from large machine-printed corpora such as Migne's Patrologia Graeca (PG). The proposed framework first applies an efficient word segmentation and then transforms the word-images into special compact shapes. For the comparison, we use Hu's invariant moments for discarding unlikely matches, Shape Context (SC) for the contour similarity and Pearson's Correlation Coefficient (PCC) for final verification. Comparative results are presented by using the Long Short-Term Memory (LSTM) Neural Network (NN) engine of the Tesseract Optical Character Recognition (OCR) system instead of PCC. In addition, an intelligent scenario is proposed for automatic generation of PG metadata by librarians.
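
The final verification step mentioned in the abstract can be illustrated with a minimal sketch of Pearson's correlation coefficient between two word images. This is not the authors' implementation: the tiny binary images, the `flatten` helper and the function name `pearson_correlation` are assumptions made for illustration, and the sketch assumes both images have already been normalised to the same size.

```python
from math import sqrt


def pearson_correlation(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance and variances, left unnormalised because the factors cancel.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_a * var_b)


def flatten(image):
    """Turn a 2D list of pixel values into a flat list (hypothetical helper)."""
    return [pixel for row in image for pixel in row]


# Hypothetical 3x4 binary word images, assumed to be the same size.
query = [[0, 1, 1, 0],
         [1, 0, 0, 1],
         [1, 1, 1, 1]]
inverted = [[1 - pixel for pixel in row] for row in query]

same_score = pearson_correlation(flatten(query), flatten(query))
inverted_score = pearson_correlation(flatten(query), flatten(inverted))
```

A score near 1 indicates that the two images vary together pixel by pixel, while a score near -1 indicates an inverted pattern. Where to place an acceptance threshold is a design choice the abstract does not specify.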

Data aggregation lab: an experimental framework for data aggregation in cultural heritage

By Ioanna Polychronou

Nuno Freire
International Journal of Metadata, Semantics and Ontologies, Vol. 14, No. 4 (2020) pp. 315 - 324
This paper describes the Data Aggregation Lab software, a system that implements the metadata aggregation workflow of cultural heritage, based on the underlying concepts and technologies of the Web of Data. These aggregation technologies can fulfil the functional requirements for metadata harvesting in cultural heritage and at the same time allow cultural heritage data to be globally interoperable with internet search engines and the Web of Data. The Data Aggregation Lab provides a framework to support several research activities within the Europeana network, such as conducting case studies, providing reference implementations, and supporting technology adoption. It provides working implementations for metadata aggregation methods with which our research has obtained positive results. These methods apply linked data, Schema.org, IIIF, Sitemaps and RDF-related technologies for innovation in data aggregation, data analysis and data conversion for cultural heritage data. The software is available for reuse and is open-sourced.

Stress-testing big data platform to extract smart and interoperable food safety analytics

By Ioanna Polychronou

Ioanna Polychronou; Giannis Stoitsis; Mihalis Papakonstantinou; Nikos Manouselis
International Journal of Metadata, Semantics and Ontologies, Vol. 14, No. 4 (2020) pp. 306 - 314
One of the significant challenges for the future is to guarantee safe food for all inhabitants of the planet. During the last 15 years, very important fraud issues like the '2013 horse meat scandal' and the '2008 Chinese milk scandal' have greatly affected the food industry and public health. One of the alternatives for this issue consists of increasing production, but to accomplish this, it is necessary that innovative options be applied to enhance the safety of the food supply chain. For this reason, it is quite important to have the right infrastructure in order to manage data of the food safety sector and provide useful analytics to Food Safety Experts. In this paper, we describe Agroknow's Big Data Platform architecture and examine its scalability for data management and experimentation.

An ontology-based method for improving the quality of process event logs using database bin logs

By Shokoufeh Ghalibafan

Shokoufeh Ghalibafan; Behshid Behkamal; Mohsen Kahani; Mohammad Allahbakhsh
International Journal of Metadata, Semantics and Ontologies, Vol. 14, No. 4 (2020) pp. 279 - 289
The main goal of process mining is discovering models from event logs. The usefulness of these discovered models is directly related to the quality of event logs. Researchers have proposed various solutions to detect deficiencies and improve the quality of event logs; however, only a few have considered using a reliable external source to improve the quality of event data. In this paper, we propose a method to repair the event log using the database bin log. We show that database operations can be employed to overcome the inadequacies of the event logs, including incorrect and missing data. To this end, we first extract an ontology from each of the event logs and the bin log. Then, we match the extracted ontologies and remove inadequacies from the event log. The results show the stability of our proposed model and its superiority over related works.

A survey study on Arabic WordNet: baring opportunities and future research directions

By Abdulmohsen S. Albesher

Abdulmohsen S. Albesher; Osama B. Rabie
International Journal of Metadata, Semantics and Ontologies, Vol. 14, No. 4 (2020) pp. 290 - 305
WordNet (WN) plays an essential role in knowledge management and information retrieval - as it allows for a better understanding of word relationships, which leads to more accurate text processing. The success of WN for the English language encouraged researchers to develop WNs for other languages. One of the most common of such languages is Arabic. However, the current state of affairs of Arabic WN (AWN) has not been properly studied. Thus, this paper presents a survey study on AWN conducted to explore opportunities and possible future research directions. The results involve the synthesis of over 100 research papers on AWN. These research papers were divided into categories and subcategories.

Can We Trust Social Media?

By Ingrid Hsieh-Yee
Volume 25, Issue 1-2, January-March - April-June 2021
  • 29 July 2021, 05:33

Content analysis of medical college library websites in Pakistan indicates necessary improvements

By Midrar Ullah

Abstract

Background

Library websites are important for marketing library services and providing access to electronic resources.

Objectives

To determine the extent and quality of medical college (school) library websites in Pakistan, according to predetermined criteria.

Methods

A checklist of 40 items was developed from the literature on academic library website evaluation as well as observation of known best practice. The checklist was used on the 45 medical college websites that fitted initial inclusion criteria.

Results

Of the possible 114 candidates for inclusion, 52 institution websites contained no information about the library and 17 provided only minimal details, leaving 45 medical college library websites that could be included. Library websites lack uniformity and most of the important features: only three library websites contained more than 20 items from the checklist. The Aga Khan University Medical College, Karachi library website contained the highest number of items (27).

Discussion

The findings indicate the design of medical college library websites is generally inadequate in Pakistan. The websites are not performing a useful role in communicating with faculty and students. The findings point to inadequate website design skills among librarians or the lack of co-operation with professional website designers.

Conclusions

Marketing of library services and good customer relations demand improvements in the information architecture of medical college library websites as well as continued maintenance of the content to ensure that it is up to date.

Development of a validated search filter for Ovid Embase for degenerative cervical myelopathy

By Maaz A. Khan, Oliver D. Mowforth, Isla Kuhn, Mark R. N. Kotter, Benjamin M. Davies

Abstract

Background

Degenerative cervical myelopathy (DCM) is a recently proposed umbrella term for symptomatic cervical spinal cord compression secondary to degeneration of the spine. Currently, literature searching for DCM is challenged by the inconsistent uptake of the term ‘DCM’, with many overlapping keywords and numerous synonyms.

Objectives

Here, we adapt our previous Ovid MEDLINE search filter for the Ovid Embase database, to support comprehensive literature searching. Both Embase and MEDLINE are recommended as a minimum for systematic reviews.

Methods

References contained within Embase identified in our prior study formed a ‘development gold standard’ reference database (N = 220). The search filter was adapted for Embase and checked against the reference database. The filter was then validated against the ‘validation gold standard’.

Results

A direct translation was not possible, as MEDLINE indexing for DCM and the keywords search field were not available in Embase. We also used the ‘focus’ function to improve precision. The resulting search filter has 100% sensitivity in testing.

Discussion and Conclusion

We have developed a validated search filter capable of retrieving DCM references in Embase with high sensitivity. In the absence of consistent terminology and indexing, this will support more efficient and robust evidence synthesis in the field.
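
The sensitivity figure reported above is the share of gold-standard references that a filter retrieves. The following is a minimal sketch of that calculation, not the authors' code: the function name `sensitivity` and the record identifiers are hypothetical.

```python
def sensitivity(retrieved_ids, gold_standard_ids):
    """Fraction of gold-standard references that the search filter retrieved."""
    gold = set(gold_standard_ids)
    if not gold:
        raise ValueError("the gold standard must contain at least one reference")
    # Extra retrieved records lower precision, not sensitivity.
    return len(gold & set(retrieved_ids)) / len(gold)


# Hypothetical record identifiers, for illustration only.
gold_standard = ["rec1", "rec2", "rec3", "rec4"]
full_hit = sensitivity(["rec1", "rec2", "rec3", "rec4", "other9"], gold_standard)
partial_hit = sensitivity(["rec1", "rec2"], gold_standard)
```

Here `full_hit` is 1.0 (100% sensitivity) even though one irrelevant record was retrieved, and `partial_hit` is 0.5.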

COVID‐19 information seeking needs and behaviour among citizens in Isfahan, Iran: A qualitative study

By Mohammad Reza Soleymani, Maedeh Esmaeilzadeh, Faezeh Taghipour, Hasan Ashrafi‐rizi

Abstract

Background

Access to reliable and credible health information improves individuals’ personal care level in crises, such as the coronavirus disease 2019 (COVID-19) pandemic. It subsequently results in enhancing the community's health and reducing the health system's costs.

Objectives

This study aimed to investigate the COVID-19 related information seeking behaviour demonstrated by citizens in Isfahan, Iran.

Methods

This research was conducted in 2020 and employed a qualitative approach using conventional content analysis. The research population was selected from varied social classes in Iran using purposive sampling. The saturation point was reached at 24 semi-structured interviews. The soundness of the data was confirmed based on the criteria of credibility, confirmability, dependability and transferability proposed by Guba and Lincoln.

Results

The findings revealed five subcategories and 25 codes within the information seeking behaviour. The subcategories included attitude towards the COVID-19 crisis, information needs, information resources, information validation and information seeking barriers.

Conclusion

People seek information from various resources to update their knowledge and become more prepared in the face of COVID-19. The findings can be used to develop policies on informing and preventing the dissemination of false information in crises, such as the COVID-19 crisis.

Documenting flooding areas calculation: a PROV approach

By Monica De Martino

Monica De Martino; Alfonso Quarati; Sergio Rosim; Laércio Massaru Namikawa
International Journal of Metadata, Semantics and Ontologies, Vol. 15, No. 1 (2021) pp. 50 - 59
Flooding events related to waste-lake dam ruptures are one of the most threatening natural disasters in Brazil. They must be managed in advance by public institutions through the use of adequate hydrographic and environmental information. Although the Open Data paradigm offers an opportunity to share hydrographic data sets, their actual reuse is still low because of metadata quality. Our previous work highlighted a lack of detailed provenance information. The paper presents an Open Data approach to improve the release of hydrographic data sets. We discuss a methodology, based on W3C recommendations, for documenting the provenance of hydrographic data sets, considering the workflow activities related to the study of flood areas caused by the waste-lakes breakdowns. We provide an illustrative example that documents, through W3C PROV metadata model, the generation of flooding area maps by integrating land use classification, from Sentinel images, with hydrographic data sets produced by the Brazilian National Institute for Space Research.

Persons, GLAM institutes and collections: an analysis of entity linking based on the COURAGE registry

By Ghazal Faraj

Ghazal Faraj; András Micsik
International Journal of Metadata, Semantics and Ontologies, Vol. 15, No. 1 (2021) pp. 39 - 49
It is an important task to connect encyclopaedic knowledge graphs by finding and linking the same entity nodes. Various available automated linking solutions cannot be applied in situations where data is sparse, private or a high degree of correctness is expected. Wikidata has grown into a leading linking hub collecting entity identifiers from various registries and repositories. To get a picture of connectability, we analysed the linking methods and results between the COURAGE registry and Wikidata, VIAF, ISNI and ULAN. This paper describes our investigations and solutions while mapping and enriching entities in Wikidata. Each possible mapped pair of entities received a numeric score of reliability. Using this score-based matching method, we tried to minimise the need for human decisions, hence we introduced the term human decision window for the mappings where neither acceptance nor refusal can be made automatically and safely. Furthermore, Wikidata has been enriched with related COURAGE entities and bi-directional links between mapped persons, organisations, collections, and collection items. We also describe the findings on coverage and quality of mapping among the above mentioned authority databases.
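
The score-based matching with a human decision window described above can be sketched as a simple three-way classification. The two thresholds and the function name `classify_mapping` are hypothetical: the abstract does not state the actual score range or cut-off values.

```python
def classify_mapping(score, reject_below=0.4, accept_from=0.8):
    """Sort a candidate entity mapping by its reliability score.

    The thresholds are illustrative assumptions, not values from the paper.
    Scores between them fall into the human decision window.
    """
    if not reject_below < accept_from:
        raise ValueError("reject_below must be lower than accept_from")
    if score >= accept_from:
        return "accept"
    if score < reject_below:
        return "reject"
    return "human decision"


# Hypothetical candidate pairs with made-up reliability scores.
candidates = {"pair A": 0.93, "pair B": 0.55, "pair C": 0.12}
decisions = {name: classify_mapping(score) for name, score in candidates.items()}
```

Narrowing the gap between the two thresholds reduces the number of mappings sent to a human reviewer, at the cost of more automatic decisions that may be wrong.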

An ontology-driven perspective on the emotional human reactions to social events

By Danilo Cavaliere

Danilo Cavaliere; Sabrina Senatore
International Journal of Metadata, Semantics and Ontologies, Vol. 15, No. 1 (2021) pp. 23 - 38
Social media has become a fulcrum for sharing information on everyday-life events: people, companies, and organisations express opinions about new products, political and social situations, football matches, and concerts. The recognition of feelings and reactions to events from social networks requires dealing with great amounts of data streams, especially for tweets, to investigate the main sentiments and opinions that justify some reactions. This paper presents an emotion-based classification model to extract feelings from tweets related to an event or a trend, described by a hashtag, and build an emotional concept ontology to study human reactions to events in a context. From the tweet analysis, terms expressing a feeling are selected to build a topological space of emotion-based concepts. The extracted concepts serve to train a multi-class SVM classifier that is used to perform soft classification aimed at identifying the emotional reactions towards events. Then, an ontology allows arranging classification results, enriched with additional DBpedia concepts. SPARQL queries on the final knowledge base provide specific insights to explain people's reactions towards events. Practical case studies and test results demonstrate the applicability and potential of the approach.

Analysis of structured data on Wikipedia

By Johny Moreira

Johny Moreira; Everaldo Costa Neto; Luciano Barbosa
International Journal of Metadata, Semantics and Ontologies, Vol. 15, No. 1 (2021) pp. 71 - 86
Wikipedia has been widely used for information consumption or for implementing solutions using its content. It contains primarily unstructured text about entities, but it can also contain infoboxes, which are structured attributes describing these entities. Owing to their structured nature, infoboxes have been shown to be useful for many applications. In this work, we perform an extensive data analysis on different aspects of Wikipedia structured data: infoboxes, templates and categories, aiming to uncover data issues and limitations, and to guide researchers in the use of these structured data. We devise a framework to process, index and query the Wikipedia data, using it to analyse different scenarios such as the popularity of infoboxes, their size distribution and usage across categories. Some of our findings are: only 54% of Wikipedia articles have infoboxes; there is a considerable amount of geographical and temporal information in infoboxes; and there is great heterogeneity of infoboxes within the same category.

Children's art museum collections as Linked Open Data

By Konstantinos Kotis

Konstantinos Kotis; Sotiris Angelis; Maria Chondrogianni; Efstathia Marini
International Journal of Metadata, Semantics and Ontologies, Vol. 15, No. 1 (2021) pp. 60 - 70
It has been recently argued that it is rather beneficial to cultural institutions to provide their datasets as Linked Open Data, to achieve cross-referencing, interlinking, and integration with other datasets in the LOD cloud. In this paper, we present the Greek Children's Art Museum (GCAM) linked dataset, along with dataset and vocabulary statistics, as well as lessons learned from the process of transforming the collections to HTML-embedded structured data using the Europeana Data Model and the Schema.org model. The dataset consists of three cultural collections of 121 child artworks (paintings), including detailed descriptions and interlinks to external datasets. In addition to the presentation of GCAM data and the lessons learned from the experimentation of non-ICT experts with LOD paradigm, the paper introduces a new metric for measuring datasets quality in terms of links to and from other datasets.

Applying cross-data set identity reasoning for producing URI embeddings over hundreds of RDF data sets

By Michalis Mountantonakis

Michalis Mountantonakis; Yannis Tzitzikas
International Journal of Metadata, Semantics and Ontologies, Vol. 15, No. 1 (2021) pp. 1 - 22
There is a proliferation of approaches that exploit RDF data sets for creating URI embeddings, i.e., embeddings that are produced by taking as input URI sequences (instead of simple words or phrases), since they can be of primary importance for several tasks (e.g., machine learning tasks). However, existing techniques exploit either a single or a few data sets for creating URI embeddings. For this reason, we introduce a prototype, called LODVec, which exploits LODsyndesis for enabling the creation of URI embeddings by using hundreds of data sets simultaneously, after enriching them with the results of cross-data set identity reasoning. By using LODVec, it is feasible to produce URI sequences by following paths of any length (according to a given configuration), and the produced URI sequences are used as input for creating embeddings through word2vec model. We provide comparative results for evaluating the gain of using several data sets for creating URI embeddings, for the tasks of classification and regression, and for finding the most similar entities to a given one.

Engaging with uncertainty: Information practices in the context of disease surveillance in Burkina Faso

Publication date: September 2021

Source: Information and Organization, Volume 31, Issue 3

Author(s): Stine Loft Rasmussen, Sundeep Sahay

  • 4 October 2021, 17:05

Editorial Board

Publication date: September 2021

Source: Information and Organization, Volume 31, Issue 3

Author(s):

  • 4 October 2021, 17:05