Data Quality and Data Access for Research

A special issue of Data (ISSN 2306-5729). This special issue belongs to the section "Information Systems and Data Management".

Deadline for manuscript submissions: closed (31 January 2021) | Viewed by 28971

Special Issue Editors


Guest Editor
University's Research Data Officer, HUIT; Chief Data Science and Technology Officer, IQSS, Harvard University, Cambridge, MA 02138, USA

Guest Editor
Department of Sociology, Philosophy and Anthropology, College of Social Sciences and International Studies, University of Exeter, Exeter EX4 4RJ, UK
Interests: data-intensive science; open science; open data; history of data collection and sharing; science policy; globalization of scientific research; epistemology of bioinformatics; bio-ontologies; abstraction and modelling processes in biology

Guest Editor
Arcadia Fund, London W8 6EH, UK
Interests: phyloinformatics; biodiversity informatics; open data; data mining; reproducible research

Special Issue Information

Dear colleagues,

Today's research and the advance of knowledge depend heavily on data. In recent years, there has been significant progress in making research data findable and accessible through generalist and domain-specific repositories, partly thanks to the increasing number of data-sharing policies from funders and journals. Many efforts are currently under way to make data FAIR (findable, accessible, interoperable, and reusable by machines and humans), including better-defined FAIR metrics, tools that help implement standards, and supporting guidelines. However, more work by data professionals, data providers, and the research community is still needed to make a difference. Much of the data that could be useful to infer new knowledge are not well described, lack provenance, or are inaccessible and are therefore hard or impossible to use for further research or for validating previously published work. For example, even when data and code are shared in archival repositories and verification is attempted during peer review, reviewers often still need to contact the authors to be able to reuse the data and code. In other cases, the data that could be used for research are private (either containing sensitive information or owned by an organization) and remain inaccessible unless appropriate data use agreements and privacy-preserving mechanisms for analyzing the data are put in place.

This Special Issue is intended to present discussions of, and advances in, making research data more (re)usable, whether the data have been generated by researchers or by private organizations. Topics include, but are not limited to, the following areas:

Software tools to improve data curation and data quality;
Technological mechanisms and/or policies to improve access to private data for research;
Tools to validate the quality and completeness of data and code used for research;
Mechanisms to provide well-documented, standardized data and code.
 

Dr. Mercè Crosas
Dr. Sabina Leonelli
Dr. Ross Mounce
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Data is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies can be found here.

Published Papers (6 papers)


Research


17 pages, 3876 KiB  
Article
Information System for Selection of Conditions and Equipment for Mammalian Cell Cultivation
by Natalia Menshutina, Elena Guseva, Diana Batyrgazieva and Igor Mitrofanov
Data 2021, 6(3), 23; https://doi.org/10.3390/data6030023 - 25 Feb 2021
Cited by 1 | Viewed by 3038
Abstract
Over the past few decades, animal cell culture technology has advanced significantly. It is now considered a reliable, functional, and relatively well-developed technology. At present, biotherapeutic drugs are synthesized using cell culture techniques by large manufacturing enterprises that produce products for commercial use and clinical research. The reliable implementation of mammalian cell culture technology requires the optimization of a number of variables, including the culture environment and bioreactor conditions, suitable cell lines, operating costs, efficient process management and, most importantly, quality. Successful implementation also requires an appropriate process development strategy, industrial scale, and characteristics, as well as the certification of sustainable procedures that meet the requirements of current regulations. All of this has led to a trend of increasing research in the field of biotechnology and, as a result, to a great accumulation of scientific information which, however, remains fragmentary and non-systematic. The development of information and network technologies allow us to solve this problem. Information system creation allows for implementation of the modern concept of integrating various structured and unstructured data, as well as the collection of information from internal and external sources. We propose and develop an information system which contains the conditions and various parameters of cultivation processes. The associated ranking system is the result of the set of recommendations—both from technological and hardware solutions—which allow for choosing the optimal conditions for the cultivation of mammalian cells at the stage of scientific research, thereby significantly reducing the time and cost of work. The proposed information system allows for the accumulation of experience regarding existing technologies for the cultivation of mammalian cells, along with application to the development of new technologies. 
The main goal of the present work is to discuss information systems, the organizational support of scientific research in the field of mammalian cell cultivation, and to provide a detailed description of the developed system and its main modules, including the conceptual and logical scheme of the database.
(This article belongs to the Special Issue Data Quality and Data Access for Research)

12 pages, 586 KiB  
Article
Repository Approaches to Improving the Quality of Shared Data and Code
by Ana Trisovic, Katherine Mika, Ceilyn Boyd, Sebastian Feger and Mercè Crosas
Data 2021, 6(2), 15; https://doi.org/10.3390/data6020015 - 3 Feb 2021
Cited by 18 | Viewed by 5606
Abstract
Sharing data and code for reuse has become increasingly important in scientific work over the past decade. However, in practice, shared data and code may be unusable, or published results obtained from them may be irreproducible. Data repository features and services contribute significantly to the quality, longevity, and reusability of datasets. This paper presents a combination of original and secondary data analysis studies focusing on computational reproducibility, data curation, and gamified design elements that can be employed to indicate and improve the quality of shared data and code. The findings of these studies are sorted into three approaches that can be valuable to data repositories, archives, and other research dissemination platforms.

13 pages, 1391 KiB  
Article
Guidelines for a Standardized Filesystem Layout for Scientific Data
by Florian Spreckelsen, Baltasar Rüchardt, Jan Lebert, Stefan Luther, Ulrich Parlitz and Alexander Schlemmer
Data 2020, 5(2), 43; https://doi.org/10.3390/data5020043 - 24 Apr 2020
Cited by 3 | Viewed by 5237
Abstract
Storing scientific data on the filesystem in a meaningful and transparent way is no trivial task. In particular, when the data have to be accessed after their originator has left the lab, the importance of a standardized filesystem layout cannot be underestimated. It is desirable to have a structure that allows for the unique categorization of all kinds of data from experimental results to publications. They have to be accessible to a broad variety of workflows, e.g., via graphical user interface as well as via command line, in order to find widespread acceptance. Furthermore, the inclusion of already existing data has to be as simple as possible. We propose a three-level layout to organize and store scientific data that incorporates the full chain of scientific data management from data acquisition to analysis to publications. Metadata are saved in a standardized way and connect original data to analyses and publications as well as to their originators. A simple software tool to check a file structure for compliance with the proposed structure is presented.

24 pages, 4752 KiB  
Article
Multiple Regression Analysis and Frequent Itemset Mining of Electronic Medical Records: A Visual Analytics Approach Using VISA_M3R3
by Sheikh S. Abdullah, Neda Rostamzadeh, Kamran Sedig, Amit X. Garg and Eric McArthur
Data 2020, 5(2), 33; https://doi.org/10.3390/data5020033 - 29 Mar 2020
Cited by 11 | Viewed by 5235
Abstract
Medication-induced acute kidney injury (AKI) is a well-known problem in clinical medicine. This paper reports the first development of a visual analytics (VA) system that examines how different medications associate with AKI. In this paper, we introduce and describe VISA_M3R3, a VA system designed to assist healthcare researchers in identifying medications and medication combinations that associate with a higher risk of AKI using electronic medical records (EMRs). By integrating multiple regression models, frequent itemset mining, data visualization, and human-data interaction mechanisms, VISA_M3R3 allows users to explore complex relationships between medications and AKI in such a way that would be difficult or sometimes even impossible without the help of a VA system. Through an analysis of 595 medications using VISA_M3R3, we have identified 55 AKI-inducing medications, 24,212 frequent medication groups, and 78 medication groups that are associated with AKI. The purpose of this paper is to demonstrate the usefulness of VISA_M3R3 in the investigation of medication-induced AKI in particular and other clinical problems in general. Furthermore, this research highlights what needs to be considered in the future when designing VA systems that are intended to support gaining novel and deep insights into massive existing EMRs.

10 pages, 1342 KiB  
Article
Influence of Information Quality via Implemented German RCD Standard in Research Information Systems
by Otmane Azeroual, Joachim Schöpfel and Dragan Ivanovic
Data 2020, 5(2), 30; https://doi.org/10.3390/data5020030 - 27 Mar 2020
Cited by 2 | Viewed by 3464
Abstract
With the steady increase in the number of data sources to be stored and processed by higher education and research institutions, it has become necessary to develop Research Information Systems, which will store this research information in the long term and make it accessible for further use, such as reporting and evaluation processes, institutional decision making and the presentation of research performance. In order to retain control while integrating research information from heterogeneous internal and external data sources and disparate interfaces into RIS and to maximize the benefits of the research information, ensuring data quality in RIS is critical. To facilitate a common understanding of the research information collected and to harmonize data collection processes, various standardization initiatives have emerged in recent decades. These standards support the use of research information in RIS and enable compatibility and interoperability between different information systems. This paper examines the process of securing data quality in RIS and the impact of research information standards on data quality in RIS. We focus on the recently developed German Research Core Dataset standard as a case of application.

Other


10 pages, 197 KiB  
Essay
Towards a Contextual Approach to Data Quality
by Stefano Canali
Data 2020, 5(4), 90; https://doi.org/10.3390/data5040090 - 25 Sep 2020
Cited by 11 | Viewed by 4230
Abstract
In this commentary, I propose a framework for thinking about data quality in the context of scientific research. I start by analyzing conceptualizations of quality as a property of information, evidence and data and reviewing research in the philosophy of information, the philosophy of science and the philosophy of biomedicine. I identify a push for purpose dependency as one of the main results of this review. On this basis, I present a contextual approach to data quality in scientific research, whereby the quality of a dataset is dependent on the context of use of the dataset as much as the dataset itself. I exemplify the approach by discussing current critiques and debates of scientific quality, thus showcasing how data quality can be approached contextually.