Journal Description
Data is a peer-reviewed, open access journal on data in science, with the aim of enhancing data transparency and reusability. The journal publishes in two sections: one on the collection, treatment and analysis methods of data in science, and one publishing descriptions of scientific and scholarly datasets (one dataset per paper). The journal is published monthly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- High Visibility: indexed within Scopus, ESCI (Web of Science), Ei Compendex, dblp, Inspec, RePEc, and other databases.
- Journal Rank: JCR - Q2 (Multidisciplinary Sciences) / CiteScore - Q2 (Information Systems and Management)
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 26.8 days after submission; acceptance to publication is undertaken in 3.6 days (median values for papers published in this journal in the second half of 2024).
- Recognition of Reviewers: reviewers who provide timely, thorough peer-review reports receive vouchers entitling them to a discount on the APC of their next publication in any MDPI journal, in appreciation of the work done.
Impact Factor: 2.2 (2023); 5-Year Impact Factor: 2.4 (2023)
Latest Articles
Open Georeferenced Field Data on Forest Types and Species for Biodiversity Assessment and Remote Sensing Applications
Data 2025, 10(3), 30; https://doi.org/10.3390/data10030030 - 21 Feb 2025
Abstract
Forest ecosystems are important for biodiversity conservation, climate regulation and climate change mitigation, soil and water protection, recreation, and the provision of raw materials. This paper presents a dataset on forest type and tree species composition for 934 georeferenced plots located in Italy. The forest type is classified in the field consistently with the Italian National Forest Inventory (NFI) based on the dominant tree species or species group. Tree species composition is provided by the percent crown cover of the main five species in the plot. Additional data on conifer and broadleaves pure/mixed condition, total tree and shrub cover, forest structure, silvicultural system, development stage, and local land position are provided. The surveyed plots are distributed in the central–eastern Alps, in the central Apennines, and in the southern Apennines; they represent a wide range of species composition, ecological conditions, and silvicultural practices. Data were collected as part of a project aimed at developing a classification algorithm based on hyperspectral data. The dataset was made publicly available as it refers to forest types and species widespread in many countries of Central and Southern Europe and is potentially useful to other researchers for the study of forest biodiversity or for remote sensing applications.
Open Access Data Descriptor
HOSPI Application to Portuguese Hospitals’ Websites
by
Delfina Soares, Joana Carvalho and Dimitrios Sarantis
Data 2025, 10(3), 29; https://doi.org/10.3390/data10030029 - 21 Feb 2025
Abstract
The Health Online Service Provision Index (HOSPI) is an instrument to assess and monitor hospitals’ websites. The index comprises four criteria—Content, Services, Community Interaction and Technology Features—each with a subset of indicators and sub-indicators. HOSPI was applied to the Portuguese hospitals’ websites in 2023, originating the dataset described in this article. The article also provides a detailed account of the data collection process, which involved direct observation of the websites and specific treatment methods, ensuring the reliability and validity of the dataset. It underscores the relevance of having this data available and how it can improve service provision online in health facilities and support policymaking.
Open Access Article
A Directory of Datasets for Mining Software Repositories
by
Themistoklis Diamantopoulos and Andreas L. Symeonidis
Data 2025, 10(3), 28; https://doi.org/10.3390/data10030028 - 20 Feb 2025
Abstract
The amount of software engineering data is constantly growing, as more and more developers employ online services to store their code, keep track of bugs, or even discuss issues. The data residing in these services can be mined to address different research challenges; therefore, certain initiatives have been established to encourage the sharing of research datasets that collect them. In this work, we investigate the effect of such an initiative; we create a directory that includes the papers and the corresponding datasets of the data track of the Mining Software Repositories (MSR) conference. Specifically, our directory includes metadata and citation information for the papers of all data tracks, throughout the last twelve years. We also annotate the datasets according to the data source and further assess their compliance with the FAIR principles. Using our directory, researchers can find useful datasets for their research, or even design methodologies for assessing their quality, especially in the software engineering domain. Moreover, the directory can be used for analyzing the citations of data papers, especially with regard to different data categories, as well as for examining their FAIRness score throughout the years, along with its effect on the usage/citation of the datasets.
(This article belongs to the Section Information Systems and Data Management)
Open Access Article
SAPEx-D: A Comprehensive Dataset for Predictive Analytics in Personalized Education Using Machine Learning
by
Muhammad Adnan Aslam, Fiza Murtaza, Muhammad Ehatisham Ul Haq, Amanullah Yasin and Numan Ali
Data 2025, 10(3), 27; https://doi.org/10.3390/data10030027 - 20 Feb 2025
Abstract
Education is crucial for leading a productive life and obtaining necessary resources. Higher education institutions are progressively incorporating artificial intelligence into conventional teaching methods as a result of innovations in technology. As a high academic record raises a university’s ranking and increases student career chances, predicting learning success has been a central focus in education. Both performance analysis and providing high-quality instruction are challenges faced by modern schools. Maintaining high academic standards, juggling life and academics, and adjusting to technology are problems that students must overcome. In this study, we present a comprehensive dataset, SAPEx-D (Student Academic Performance Exploration), designed to predict student performance, encompassing a wide array of personal, familial, academic, and behavioral factors. Our data collection effort at Air University, Islamabad, Pakistan, involved both online and paper questionnaires completed by students across multiple departments, ensuring diverse representation. After meticulous preprocessing to remove duplicates and entries with significant missing values, we retained 494 valid responses. The dataset includes detailed attributes such as demographic information, parental education and occupation, study habits, reading frequencies, and transportation modes. To facilitate robust analysis, we encoded ordinal attributes using label encoding and nominal attributes using one-hot encoding, expanding our dataset from 38 to 88 attributes. Feature scaling was performed to standardize the range and distribution of data, using a normalization technique. Our analysis revealed that factors such as degree major, parental education, reading frequency, and scholarship type significantly influence student performance. 
The machine learning models applied to this dataset, including Gradient Boosting and Random Forest, demonstrated high accuracy and robustness, underscoring the dataset’s potential for insightful academic performance prediction. In terms of model performance, Gradient Boosting achieved an accuracy of 68.7% and an F1-score of 68% for the eight-class classification task. For the three-class classification, Random Forest outperformed other models, reaching an accuracy of 80.8% and an F1-score of 78%. These findings highlight the importance of comprehensive data in understanding and predicting academic outcomes, paving the way for more personalized and effective educational strategies.
(This article belongs to the Special Issue Data Mining and Computational Intelligence for E-Learning and Education—3rd Edition)
Open Access Article
Consistency and Stability in Feature Selection for High-Dimensional Microarray Survival Data in Diffuse Large B-Cell Lymphoma Cancer
by
Kazeem A. Dauda and Rasheed K. Lamidi
Data 2025, 10(2), 26; https://doi.org/10.3390/data10020026 - 18 Feb 2025
Abstract
High-dimensional survival data, such as microarray datasets, present significant challenges in variable selection and model performance due to their complexity and dimensionality. Identifying important genes and understanding how these genes influence the survival of patients with cancer are of great interest and a major challenge to biomedical scientists, healthcare practitioners, and oncologists. Therefore, this study combined the strengths of two complementary feature selection methodologies: a filtering (correlation-based) approach and a wrapper method based on Iterative Bayesian Model Averaging (IBMA). This new approach, termed Correlation-Based IBMA, offers a highly efficient and effective means of selecting the most important and influential genes for predicting the survival of patients with cancer. The efficiency and consistency of the method were demonstrated using diffuse large B-cell lymphoma cancer data. The results revealed that the 15 most important genes out of 3835 gene features were consistently selected at a threshold p-value of 0.001, with genes with posterior probabilities below 1% being removed. The influence of these 15 genes on patient survival was assessed using the Cox Proportional Hazards (Cox-PH) Model. The results further revealed that eight genes were highly associated with patient survival at a 0.05 level of significance. Finally, these findings underscore the importance of integrating feature selection with robust modeling approaches to enhance accuracy and interpretability in high-dimensional survival data analysis.
Open Access Article
CropsDisNet: An AI-Based Platform for Disease Detection and Advancing On-Farm Privacy Solutions
by
Mohammad Badhruddouza Khan, Salwa Tamkin, Jinat Ara, Mobashwer Alam and Hanif Bhuiyan
Data 2025, 10(2), 25; https://doi.org/10.3390/data10020025 - 18 Feb 2025
Abstract
Crop failure is defined as crop production that is significantly lower than anticipated, resulting from plants that are harmed, diseased, destroyed, or influenced by climatic circumstances. With the rise in global food security concern, the earliest detection of crop diseases has proven to be pivotal in agriculture industries to address the needs of the global food crisis and on-farm data protection, which can be met with a privacy-preserving deep learning model. However, deep learning seems to be a largely complex black box to interpret, necessitating a prerequisite for the groundwork of the model’s interpretability. Considering this, the aim of this study was to follow up on the establishment of a robust deep learning custom model named CropsDisNet, evaluated on a large-scale dataset named “New Bangladeshi Crop Disease Dataset (corn, potato and wheat)”, which contains a total of 8946 images. The integration of a differential privacy algorithm into our CropsDisNet model could establish the benefits of automated crop disease classification without compromising on-farm data privacy by reducing training data leakage. To classify corn, potato, and wheat leaf diseases, we used three representative CNN models for image classification (VGG16, Inception Resnet V2, Inception V3) along with our custom model, and the classification accuracy for these three different crops varied from 92.09% to 98.29%. In addition, demonstration of the model’s interpretability gave us insight into our model’s decision making and classification results, which can allow farmers to understand and take appropriate precautions in the event of early widespread harvest failure and food crises.
(This article belongs to the Topic Decision-Making and Data Mining for Sustainable Computing)
Open Access Article
Visual Footprint of Separation Through Membrane Distillation on YouTube
by
Ersin Aytaç and Mohamed Khayet
Data 2025, 10(2), 24; https://doi.org/10.3390/data10020024 - 8 Feb 2025
Abstract
Social media has revolutionized the dissemination of information, enabling the rapid and widespread sharing of news, concepts, technologies, and ideas. YouTube is one of the most important online video sharing platforms of our time. In this research, we investigate the trace of separation through membrane distillation (MD) on YouTube using statistical methods and natural language processing. The dataset collected on 04.01.2024 included 212 videos with key characteristics such as durations, views, subscribers, number of comments, likes, etc. The results show that the number of videos is not sufficient, but there is an increasing trend, especially since 2019. The high number of channels offering information about MD technology in countries such as the USA, India, and Canada indicates that these countries recognized the practical benefits of this technology, especially in areas such as water treatment, desalination, and industrial applications. This suggests that MD could play a pivotal role in finding solutions to global water challenges. Word cloud analysis showed that terms such as “water”, “treatment”, “desalination”, and “separation” were prominent, indicating that the videos focused mainly on the principles and applications of MD. The sentiment of the comments is mostly positive, and the dominant emotion is neutral, revealing that viewers generally have a positive attitude towards MD. The narrative intensity metric evaluates the information transfer efficiency of the videos and provides a guide for effective content creation strategies. The results of the analyses revealed that social media awareness about MD technology is still not sufficient and that content development and sharing strategies should focus on bringing the technology to a wider audience.
Open Access Article
A Bayesian State-Space Approach to Dynamic Hierarchical Logistic Regression for Evolving Student Risk in Educational Analytics
by
Moeketsi Mosia
Data 2025, 10(2), 23; https://doi.org/10.3390/data10020023 - 7 Feb 2025
Abstract
Early detection of academically at-risk students is crucial for designing timely interventions that improve educational outcomes. However, many existing approaches either ignore the temporal evolution of student performance or rely on “black box” models that sacrifice interpretability. In this study, we develop a dynamic hierarchical logistic regression model in a fully Bayesian framework to address these shortcomings. Our method leverages partial pooling across students and employs a state-space formulation, allowing each student’s log-odds of failure to evolve over multiple assessments. By using Markov chain Monte Carlo for inference, we obtain robust posterior estimates and credible intervals for both population-level and individual-specific effects, while posterior predictive checks ensure model adequacy and calibration. Results from simulated and real-world datasets indicate that the proposed approach more accurately tracks fluctuations in student risk compared to static logistic regression, and it yields interpretable insights into how engagement patterns and demographic factors influence failure probability. We conclude that a Bayesian dynamic hierarchical model not only enhances prediction of at-risk students but also provides actionable feedback for instructors and administrators seeking evidence-based interventions.
(This article belongs to the Special Issue Data Mining and Computational Intelligence for E-Learning and Education—3rd Edition)
Open Access Article
Stress Factors in Higher Education: A Data Analysis Case
by
Rodolfo Bojorque, Fernando Moscoso, Fernando Pesántez and Ángela Flores
Data 2025, 10(2), 22; https://doi.org/10.3390/data10020022 - 7 Feb 2025
Abstract
This study investigates stressors in higher education, focusing on their impact on students and faculty at Universidad Politécnica Salesiana (UPS) and using eight years of comprehensive data. Employing data mining techniques, the research analyzed enrollment, retention, graduation, employability, socioeconomic status, academic performance, and faculty workload to uncover patterns affecting academic outcomes. The study found that UPS exhibits a stable educational system, maintaining consistent metrics across student success indicators. However, the COVID-19 pandemic presented unique stressors, evidenced by a paradoxical increase in student grades during heightened faculty stress levels. This anomaly suggests a potential link between academic rigor and faculty well-being during systemic disruptions. Stressors affecting students directly correlated with reduced academic performance, highlighting the importance of early detection and intervention. Conversely, faculty stress was reflected in adjustments to grading practices, raising questions about institutional pressures and faculty motivation. These findings emphasize the value of proactive data analytics in identifying stress-induced anomalies to support student success and faculty well-being. The study advocates for further research on faculty burnout, motivation, and institutional strategies to mitigate stressors, underscoring the potential of data-driven approaches to enhance the quality and sustainability of higher education ecosystems.
(This article belongs to the Special Issue Data Mining and Computational Intelligence for E-Learning and Education—3rd Edition)
Open Access Data Descriptor
An Open Database of the Internal and Surface Temperatures of a Reinforced-Concrete Slab-on-I-Beam Section
by
Pedro Cavadia, José M. Benjumea, Oscar Begambre, Edison Osorio and María A. Mantilla
Data 2025, 10(2), 21; https://doi.org/10.3390/data10020021 - 4 Feb 2025
Abstract
Due to climate change, the temperature monitoring of reinforced-concrete (RC) structures is becoming critical for preventive maintenance and extending their lifespan. Significant temperature variations in RC elements can affect their natural frequencies and modulus of elasticity or generate abnormal stress levels, potentially leading to structural damage. Data from thermal monitoring systems are invaluable for testing and validating numerical methodologies for estimating internal thermal responses and aiding in prevention/maintenance decision making. Despite its importance, few experimental outdoor data on the internal and external temperatures of concrete structures are available. This study presents a comprehensive dataset from a 120-day temperature-monitoring campaign on a 1.2 m long reinforced-concrete slab-on-I-beam model under tropical conditions in Bucaramanga, Colombia. The monitoring system measured the internal temperatures at 40 points using embedded thermocouples, while the surface temperatures were recorded with handheld and drone-mounted thermal cameras. Simultaneously, the ambient temperature, solar radiation, rainfall, wind velocity, and other parameters were monitored using a weather station. The instrumentation ensured the synchronization and high spatial resolution of the thermal data. The data, collected at 30 min intervals, are openly available in CSV format, offering valuable resources for validating numerical models, studying thermal gradients, and enhancing structural health-monitoring frameworks.
Open Access Article
Seaweed-Based Bioplastics: Data Mining Ingredient–Property Relations from the Scientific Literature
by
Fernanda Véliz, Thulasi Bikku, Davor Ibarra-Pérez, Valentina Hernández-Muñoz, Alysia Garmulewicz and Felipe Herrera
Data 2025, 10(2), 20; https://doi.org/10.3390/data10020020 - 1 Feb 2025
Abstract
Automated analysis of the scientific literature using natural language processing (NLP) can accelerate the identification of potentially unexplored formulations that enable innovations in materials engineering with fewer experimentation and testing cycles. This strategy has been successful for specific classes of inorganic materials, but their general application in broader material domains such as bioplastics remains challenging. To begin addressing this gap, we explore correlations between the ingredients and physicochemical properties of seaweed-based biofilms from a corpus of 2000 article abstracts from the scientific literature since 1958, using a supervised word co-occurrence analysis and an unsupervised approach based on the language model MatBERT without fine-tuning. Using known relations between ingredients and properties for test scenarios, we discuss the potential and limitations of these NLP approaches for identifying novel combinations of polysaccharides, plasticizers, and additives that are related to the functionality of seaweed biofilms. The model demonstrates a valuable predictive ability to identify ingredients associated with increased water vapor permeability, suggesting its potential utility in optimizing formulations for future research. Using the model further revealed alternative combinations that are underrepresented in the literature. This automated method facilitates the mapping of relationships between ingredients and properties, guiding the development of seaweed bioplastic formulations. The unstructured and heterogeneous nature of the literature on bioplastics represents a particular challenge that demands ad hoc fine-tuning strategies for state-of-the-art language models for advancing the field of seaweed bioplastics.
Open Access Article
Impact of Various Land Cover Transformations on Climate Change: Insights from a Spatial Panel Analysis
by
Mohsen Khezri
Data 2025, 10(2), 19; https://doi.org/10.3390/data10020019 - 31 Jan 2025
Abstract
This study introduces an innovative empirical methodology by integrating spatial panel models with satellite imagery data from 1970 to 2019. This innovative approach illuminates the effects of greenhouse gas emissions, deforestation, and various global variables on regional temperature shifts and the environmental repercussions of land-use alterations, establishing a substantial empirical basis for climate change. The results revealed that global variables such as sunspot activity, the length of day (LOD), and the Global Mean Sea Level (GMSL) have negligible impacts on global temperature variations. This model uncovers the nuanced effect of deforestation on global temperatures, highlighting a decrease in temperature following deforestation above 40° N latitude, contrary to the warming effect observed in lower latitudes. Exceptionally, deforestation within the 10° N to 10° S tropical bands results in a temperature decrease, challenging the established theories. The results suggest that converting forests to grass/shrublands and croplands plays a significant role in these temperature dynamics.
Open Access Article
Statistical Approach in Personalized Nutrition Exemplified by Reanalysis of Public Datasets
by
Paola G. Ferrario, Maik Döring and Christian Ritz
Data 2025, 10(2), 18; https://doi.org/10.3390/data10020018 - 30 Jan 2025
Abstract
In clinical nutrition, it is regularly observed that individuals respond differently to a dietary treatment. Personalized nutrition aims to consider such variability in response by delivering personalized nutritional recommendations. Ideally, the optimal treatment for each individual will be selected and then dispensed according to the specific individual’s characteristics. The aim of this paper is to discuss and apply existing statistical methods, which can be adequately used in the context of personalized nutrition. We discuss the estimation of individualized treatment rules (ITRs) as we wish to favor one out of two interventions. The applicability of the methods is demonstrated by reusing two public datasets: one in the context of a parallel group design and one in the context of a crossover design. The bias of the estimator of the ITRs’ underlying parameters is evaluated in a simulation study.
(This article belongs to the Section Computational Biology, Bioinformatics, and Biomedical Data Science)
Open Access Data Descriptor
Rainfall Intensity–Duration–Frequency Curves Dataset for Brazil
by
Ivana Patente Torres, Roberto Avelino Cecílio, Laura Thebit de Almeida, Marcel Carvalho Abreu, Demetrius David da Silva, Sidney Sara Zanetti and Alexandre Cândido Xavier
Data 2025, 10(2), 17; https://doi.org/10.3390/data10020017 - 29 Jan 2025
Abstract
This is a database containing rainfall intensity–duration–frequency equations (IDF equations) for 6550 pluviographic and pluviometric stations in Brazil. The database was compiled from 370 different publications and contains the following information: station identification, geographic position, size and period of the rainfall series used, parameters of the IDF equations, and literature references. The database is available on Mendeley Data (DOI: 10.17632/378bdcmnc8.1) in the form of spreadsheets and vector files. Since the launch of the Pluvio 2.1 software in 2006, which included 549 IDF equations obtained in the country, this is the largest and most accessible database of IDF equations in Brazil. The data provided may be useful, among other purposes, for designing hydraulic structures, controlling water erosion, planning land use, and water resource planning and management.
Open Access Article
Data-Driven Scheduling Optimization for SMT Lines Using SMD Reel Commonality
by
Jorge Quijano, Nohemi Torres Cruz, Leslie Quijano-Quian, Eduardo Rafael Poblano-Ojinaga and Salvador Anacleto Noriega Morales
Data 2025, 10(2), 16; https://doi.org/10.3390/data10020016 - 29 Jan 2025
Abstract
Optimizing production efficiency in Surface-Mount Technology (SMT) manufacturing is a critical challenge, particularly in high-mix environments where frequent product changeovers can lead to significant downtime. This study presents a scheduling algorithm that minimizes changeover times on SMT lines by leveraging the commonality of Surface-Mount Device (SMD) reel part numbers across product Bills of Materials (BOMs). The algorithm’s capabilities were demonstrated through both simulated datasets and practical validation trials, providing a comprehensive evaluation framework. In the practical implementation, the algorithm successfully aligned predicted and measured changeover times, highlighting its applicability and accuracy in operational settings. The proposed approach integrates heuristic and optimization techniques to identify scheduling strategies that not only minimize reel changes but also support production scalability and operational flexibility. This framework offers a robust solution for optimizing SMT workflows, enhancing productivity, and reducing resource inefficiencies in both greenfield projects and established manufacturing environments.
(This article belongs to the Special Issue Cutting-Edge Datasets and Algorithms for Enhancing Industrial Processes and Supply Chain Optimization)

Open Access Data Descriptor
Global Dataset of Extreme Sea Levels and Coastal Flood Impacts over the 21st Century
by
Ebru Kirezci, Ian Young, Roshanka Ranasinghe, Yiqun Chen, Yibo Zhang and Abbas Rajabifard
Data 2025, 10(2), 15; https://doi.org/10.3390/data10020015 - 28 Jan 2025
Abstract
A global database of coastal flooding impacts resulting from extreme sea levels is developed for the present day and for the years 2050 and 2100. The database consists of three sub-datasets: the extreme sea levels, the coastal areas flooded by these extreme sea levels, and the resulting socioeconomic implications. The extreme sea levels consider the processes of storm surge, tide levels, breaking wave setup and relative sea level rise. The socioeconomic implications are expressed in terms of Expected Annual Population Affected (EAPA) and Expected Annual Damage (EAD), and presented at the global, regional and national scales. The EAPA and EAD are determined both for existing coastal defence levels and assuming two plausible adaptation scenarios, along with socioeconomic development narratives. All the sub-datasets can be visualized with a Digital Twin platform based on a GIS-based mapping host. This publicly available database provides a first-pass assessment, enabling users to extract and identify global and national coastal hotspots under different projections of sea level rise and socioeconomic developments.

Open Access Data Descriptor
Data on Stark Broadening of Sn II Spectral Lines
by
Milan S. Dimitrijević, Magdalena D. Christova, Cristina Yubero and Sylvie Sahal-Bréchot
Data 2025, 10(2), 14; https://doi.org/10.3390/data10020014 - 28 Jan 2025
Abstract
Data on spectral line widths and shifts broadened by interactions with charged particles, for 44 lines in the spectrum of ionized tin, for collisions with electrons and H II and He II ions, are presented as tables available online. We obtained them by employing the semiclassical perturbation theory for temperatures, T, within the 5000–100,000 K range, and for a grid of perturber densities from 10¹⁴ cm⁻³ to 10²⁰ cm⁻³. The presented Stark broadening data are of interest for the analysis and synthesis of ionized tin lines in the spectra of hot and dense stars, such as white dwarfs and hot subdwarfs, and for the modelling of their atmospheres. They are also useful for the diagnostics of laser-induced plasmas for high-order harmonics generation in ablated materials.
(This article belongs to the Special Issue Data in Astrophysics and Geophysics: Research and Applications, 3rd Edition)
Open Access Article
A Multimodal Dataset of Fact-Checked News from Chile’s Constitutional Processes: Collection, Processing, and Analysis
by
Ignacio Molina, Brian Keith and Mauricio Matus
Data 2025, 10(2), 13; https://doi.org/10.3390/data10020013 - 28 Jan 2025
Abstract
This paper presents a multimodal dataset capturing fact-checked news coverage of Chile’s constitutional processes from 2019 to 2023. The collection comprises 300 articles from three sources: Fast Check, Fact Checking UC, and BioBioChile, containing 242,687 words of text and visual content in 168 entries. The dataset implements advanced natural language processing through RoBERTa and computer vision techniques via EfficientNet, with unified multimodal analysis using the CLIP model. Technical validation through clustering analysis and expert review demonstrates the dataset’s effectiveness in identifying narrative patterns within constitutional process coverage. The structured format includes verification metadata, precomputed embeddings, and documented relationships between textual and visual elements. This enables research into how misinformation propagates through multiple channels during significant political events. This paper details the dataset’s composition, collection methodology, and validation while acknowledging specific limitations. This contribution addresses a gap in current research resources by providing verified multimodal content spanning two constitutional processes, supporting investigations in computational social science and misinformation studies.
(This article belongs to the Section Information Systems and Data Management)

Open Access Data Descriptor
Portable Analyses of Strategic Metal-Rich Minerals Using pXRF and pLIBS: Methodology and Database Development
by
Marjolène Jatteau, Jean Cauzid, Cécile Fabre, Panagiotis Voudouris, Georgios Soulamidis and Alexandre Tarantola
Data 2025, 10(2), 12; https://doi.org/10.3390/data10020012 - 27 Jan 2025
Abstract
Strategic metals are indispensable for meeting the needs of modern society. It is therefore necessary to reassess the potential of such metals in Europe. For the exploration of strategic metals, portable XRF (X-Ray Fluorescence) and LIBS (Laser-Induced Breakdown Spectroscopy) are powerful techniques allowing their multi-element analysis. This paper presents a database providing more than 2000 pXRF analyses and more than 4000 pLIBS spectra acquired on minerals from the Mineralogy and Petrology Museum of National and Kapodistrian University of Athens (NKUA), selected based on their potential for bearing strategic metals. The combination of these two portable techniques, along with an expanding dataset on strategic metal-rich minerals, provides valuable insights into strategic metal affinities and demonstrates the effectiveness of portable tools for exploring strategic raw materials. Indeed, such a database strengthens knowledge of strategic metals by supporting statistical and chemometric analyses (e.g., boxplots, PCA, PLS) of their distribution.

Open Access Data Descriptor
RNA Sequencing Dataset of Drosophila Nociceptor Translatomic Response to Injury
by
Christine M. Hale, Kyle J. Beauchemin, Courtney L. Brann, Julie K. Moulton, Ramaz Geguchadze, Benjamin J. Harrison and Geoffrey K. Ganter
Data 2025, 10(2), 11; https://doi.org/10.3390/data10020011 - 21 Jan 2025
Abstract
To prepare to address the mechanisms of injury-induced nociceptor sensitization, we sequenced the translatome of the nociceptors of injured Drosophila larvae and those of uninjured larvae. Third-instar larvae expressing a green fluorescent protein (GFP)-tagged ribosomal subunit specifically in Class 4 dendritic arborization neurons, recognized as pickpocket-expressing primary nociceptors, via the GAL4/UAS method, were injured by ultraviolet light or sham-injured. Larvae were subjected to translating ribosome affinity purification for the GFP tag, and nociceptor-specific ribosome-bound RNA was sequenced.

Topics
Topic in
BDCC, Data, Environments, Geosciences, Remote Sensing
Database, Mechanism and Risk Assessment of Slope Geologic Hazards
Topic Editors: Chong Xu, Yingying Tian, Xiaoyi Shao, Zikang Xiao, Yulong Cui
Deadline: 28 February 2025
Topic in
Data, Energies, Sensors, Sustainability, Water
Water and Energy Monitoring and Their Nexus
Topic Editors: Lucas Pereira, Hugo Morais, Wolf-Gerrit Früh
Deadline: 31 March 2025
Topic in
Algorithms, Data, Earth, Geosciences, Mathematics, Land, Water, IJGI
Applications of Algorithms in Risk Assessment and Evaluation
Topic Editors: Yiding Bao, Qiang Wei
Deadline: 31 July 2025
Topic in
AI, Data, Economies, Mathematics, Risks
Advanced Techniques and Modeling in Business and Economics
Topic Editors: José Manuel Santos-Jaén, Ana León-Gomez, María del Carmen Valls Martínez
Deadline: 30 September 2025

Special Issues
Special Issue in
Data
New Progress in Big Earth Data
Guest Editors: Aditya Chakravarty, Juanle Wang
Deadline: 30 March 2025
Special Issue in
Data
Cutting-Edge Datasets and Algorithms for Enhancing Industrial Processes and Supply Chain Optimization
Guest Editors: Luis Alberto Rodríguez-Picón, Iván Pérez-Olguín, Luis Carlos Méndez González
Deadline: 30 April 2025
Special Issue in
Data
Data-Driven Approaches for Safety in Industrial Sites
Guest Editors: Francesca Mauro, Mara Lombardi, Mario Fargnoli
Deadline: 30 June 2025
Special Issue in
Data
Benchmarking Datasets in Bioinformatics, 2nd Edition
Guest Editor: Pufeng Du
Deadline: 31 July 2025
Topical Collections
Topical Collection in
Data
Modern Geophysical and Climate Data Analysis: Tools and Methods
Collection Editors: Vladimir Sreckovic, Zoran Mijic