Applied Statistics in Real-World Problems

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "Probability and Statistics".

Deadline for manuscript submissions: 20 May 2025 | Viewed by 4080

Special Issue Editors


Guest Editor
Department of Statistics, Universidade Federal da Bahia, Salvador 40110-909, Brazil
Interests: causality; statistical learning; computational statistics; image processing; econometrics; epidemiology; complex systems

Guest Editor
Centre of Mathematics, University of Minho, 4710-057 Braga, Portugal
Interests: Bayesian modeling; epidemiological models; fuzzy logic and decision-making; machine learning and data analytics; nonparametric inference; optimization techniques

Special Issue Information

Dear Colleagues,

The convergence of statistical methodologies and real-world challenges across the sciences offers a rich tapestry of innovation. From unraveling complexity to decoding information, statistical techniques serve as the linchpin connecting theory to application. This Special Issue is a canvas for articles that explore how statistics intersects with diverse scientific domains in real-world settings.

Contributions are welcome from a spectrum of disciplines, including causality, medicine, economics, epidemiology, sociology, biochemistry, biophysics, neurology, psychology, etc. Furthermore, we invite researchers from other branches, fields, or sub-disciplines to share their insights into the latest statistical methodologies.

An inherent criterion for each submission is accessibility. Craft your articles with clarity in mind, catering to an audience that may not be intimately familiar with the specific terminology of your field. Consider employing aids such as tables featuring key concepts or a glossary to enhance reader understanding.

In the spirit of practicality, we encourage the incorporation of real datasets to illustrate the statistical methods applied. To promote transparency and facilitate scientific rigor, authors are kindly asked to share the code and datasets used in their analyses, written in languages such as Python or R.

We anticipate a collection of articles that not only showcases statistical innovations but also inspires a cross-disciplinary dialogue, enriching our collective understanding of the intricate relationship between statistical methodologies and real-world challenges.

This Special Issue is seeking submissions in applied data science with potential applications in, but not limited to, the following:

(a) Artificial intelligence;

(b) Bayesian methods;

(c) Big data, high dimensionality, and large-scale data analysis;

(d) Causality;

(e) Deep and statistical learning;

(f) Machine learning;

(g) Statistical learning.

Dr. Raydonal Ospina
Prof. Dr. Victor Leiva
Dr. Cecília Castro
Guest Editors

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • artificial intelligence
  • big data, big data analytics, and big data science
  • bioinformatics, health informatics, and bio-computing
  • causality
  • data analytics, data mining, and expert systems
  • decision support systems and knowledge discovery in databases
  • deep learning, machine learning, and statistical learning
  • differential privacy
  • digital transformation and digitization
  • monitoring/recognizing/forecasting of emotions and sentiment analysis
  • multivariate analysis
  • optimization algorithms
  • predictive models and analytics using artificial intelligence
  • quality control
  • statistical analysis/modeling and its diagnostics
  • survey sampling

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information on MDPI's Special Issue policies is available on the MDPI website.

Published Papers (3 papers)


Research

14 pages, 339 KiB  
Article
OUCH: Oversampling and Undersampling Cannot Help Improve Accuracy in Our Bayesian Classifiers That Predict Preeclampsia
by Franklin Parrales-Bravo, Rosangela Caicedo-Quiroz, Elena Tolozano-Benitez, Víctor Gómez-Rodríguez, Lorenzo Cevallos-Torres, Jorge Charco-Aguirre and Leonel Vasquez-Cevallos
Mathematics 2024, 12(21), 3351; https://doi.org/10.3390/math12213351 - 25 Oct 2024
Viewed by 701
Abstract
Unbalanced data can have an impact on the machine learning (ML) algorithms that build predictive models. This manuscript studies the influence of oversampling and undersampling strategies on the learning of the Bayesian classification models that predict the risk of suffering preeclampsia. Given the properties of our dataset, only the oversampling and undersampling methods that operate with numerical and categorical attributes will be taken into consideration. In particular, synthetic minority oversampling techniques for nominal and continuous data (SMOTE-NC), SMOTE—Encoded Nominal and Continuous (SMOTE-ENC), random oversampling examples (ROSE), random undersampling examples (UNDER), and random oversampling techniques (OVER) are considered. According to the results, when balancing the class in the training dataset, the accuracy percentages do not improve. However, in the test dataset, both positive and negative cases of preeclampsia were accurately classified by the models, which were built on a balanced training dataset. In contrast, models built on the imbalanced training dataset were not good at detecting positive cases of preeclampsia. We can conclude that while imbalanced training datasets can be addressed by using oversampling and undersampling techniques before building prediction models, an improvement in model accuracy is not always guaranteed. Despite this, the sensitivity and specificity percentages improve in binary classification problems in most cases, such as the one we are dealing with in this manuscript. Full article
(This article belongs to the Special Issue Applied Statistics in Real-World Problems)
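
The distinction the abstract above draws between accuracy and sensitivity/specificity can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' method: the function names (random_oversample, metrics), the 90/10 class split, and both sets of predictions are hypothetical and chosen only to show how accuracy can stay flat while sensitivity improves. Plain random oversampling stands in for the more elaborate SMOTE variants named in the abstract.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (plain random oversampling)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def metrics(y_true, y_pred):
    """Accuracy, sensitivity (true positive rate) and specificity
    (true negative rate) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Hypothetical imbalanced data: 90 negative cases, 10 positive cases.
y_true = [0] * 90 + [1] * 10
rows = list(range(len(y_true)))

# Balancing the training labels by random oversampling.
_, balanced_labels = random_oversample(rows, y_true)
balanced_counts = Counter(balanced_labels)

# Hypothetical predictions of a model that always predicts "negative".
pred_majority = [0] * 100
# Hypothetical predictions of a model that finds 8 of 10 positives
# at the cost of 10 false alarms among the 90 negatives.
pred_balanced = [0] * 80 + [1] * 10 + [1] * 8 + [0] * 2

m_majority = metrics(y_true, pred_majority)
m_balanced = metrics(y_true, pred_balanced)
print(balanced_counts)
print(m_majority)
print(m_balanced)
```

With these illustrative numbers, the always-negative model reaches higher accuracy than the second model even though it detects no positive case, which is why accuracy alone can be a misleading summary on imbalanced data.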

23 pages, 438 KiB  
Article
Skew-Normal Inflated Models: Mathematical Characterization and Applications to Medical Data with Excess of Zeros and Ones
by Guillermo Martínez-Flórez, Roger Tovar-Falón, Víctor Leiva and Cecilia Castro
Mathematics 2024, 12(16), 2486; https://doi.org/10.3390/math12162486 - 12 Aug 2024
Cited by 1 | Viewed by 1007
Abstract
The modeling of data involving proportions, confined to a unit interval, is crucial in diverse research fields. Such data, expressing part-to-whole relationships, span from the proportion of individuals affected by diseases to the allocation of resources in economic sectors and the survival rates of species in ecology. However, modeling these data and interpreting information obtained from them present challenges, particularly when there is high zero–one inflation at the extremes of the unit interval, which indicates the complete absence or full occurrence of a characteristic or event. This inflation limits traditional statistical models, which often fail to capture the underlying distribution, leading to biased or imprecise statistical inferences. To address these challenges, we propose and derive the skew-normal zero–one inflated (SNZOI) models, a novel class of asymmetric regression models specifically designed to accommodate zero–one inflation present in the data. By integrating a continuous-discrete mixture distribution with covariates in both continuous and discrete parts, SNZOI models exhibit superior capability compared to traditional models when describing these complex data structures. The applicability and effectiveness of the proposed models are demonstrated through case studies, including the analysis of medical data. Precise modeling of inflated proportion data unveils insights representing advancements in the statistical analysis of such studies. The present investigation highlights the limitations of existing models and shows the potential of SNZOI models to provide more accurate and precise inferences in the presence of zero–one inflation. Full article
(This article belongs to the Special Issue Applied Statistics in Real-World Problems)
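
The continuous-discrete mixture mentioned in the abstract above can be written generically as follows. This is a sketch of the usual form of a zero–one inflated density, not the paper's exact SNZOI specification: the symbols \pi_0, \pi_1, and g are assumptions introduced here for illustration, with g standing for whatever continuous density on (0, 1) the model uses.

```latex
f(y) =
\begin{cases}
\pi_0, & y = 0,\\
(1 - \pi_0 - \pi_1)\, g(y), & 0 < y < 1,\\
\pi_1, & y = 1,
\end{cases}
\qquad \pi_0, \pi_1 \ge 0, \quad \pi_0 + \pi_1 < 1 .
```

Here \pi_0 and \pi_1 are the point masses at zero and one, and the remaining probability is spread over the open interval by g. In a regression setting, covariates can enter both the discrete part (through \pi_0 and \pi_1) and the continuous part (through the parameters of g).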

33 pages, 1468 KiB  
Article
Modeling Residential Energy Consumption Patterns with Machine Learning Methods Based on a Case Study in Brazil
by Lucas Henriques, Cecilia Castro, Felipe Prata, Víctor Leiva and René Venegas
Mathematics 2024, 12(13), 1961; https://doi.org/10.3390/math12131961 - 25 Jun 2024
Viewed by 1632
Abstract
Developing efficient energy conservation strategies is relevant in the context of climate change and rising energy demands. The objective of this study is to model and predict the electrical power consumption patterns in Brazilian households, considering the thresholds for energy use. Our methodology utilizes advanced machine learning methods, such as agglomerative hierarchical clustering, k-means clustering, and self-organizing maps, to identify such patterns. Gradient boosting, chosen for its robustness and accuracy, is used as a benchmark to evaluate the performance of these methods. Our methodology reveals consumption patterns from the perspectives of both users and energy providers, assessing the corresponding effectiveness according to stakeholder needs. Consequently, the methodology provides a comprehensive empirical framework that supports strategic decision making in the management of energy consumption. Our findings demonstrate that k-means clustering outperforms other methods, offering a more precise classification of consumption patterns. This finding aids in the development of targeted energy policies and enhances resource management strategies. The present research shows the applicability of advanced analytical methods in specific contexts, showing their potential to shape future energy policies and practices. Full article
(This article belongs to the Special Issue Applied Statistics in Real-World Problems)
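
For readers unfamiliar with k-means clustering, which the abstract above reports as the best-performing method, the following is a minimal one-dimensional sketch of Lloyd's algorithm. It is not the authors' pipeline: the function name kmeans_1d, the consumption values, and the initial centroids are all hypothetical, and a real analysis would use several features and a tested library implementation.

```python
def kmeans_1d(values, initial_centroids, max_iter=100):
    """Lloyd's algorithm in one dimension with caller-supplied initial centroids.

    Alternates an assignment step (each value joins its nearest centroid)
    and an update step (each centroid moves to the mean of its cluster)
    until the centroids stop changing or max_iter is reached.
    """
    centroids = list(initial_centroids)

    def nearest(v):
        # Index of the centroid closest to v.
        return min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))

    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for v in values:
            clusters[nearest(v)].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated

    labels = [nearest(v) for v in values]
    return centroids, labels


# Hypothetical monthly household consumption in kWh (illustrative values only).
consumption_kwh = [95, 110, 120, 240, 255, 260, 480, 500, 520]

centroids, labels = kmeans_1d(consumption_kwh, initial_centroids=[100, 300, 500])
print(centroids)
print(labels)
```

The three resulting groups can be read as low, medium, and high consumption profiles. In practice the number of clusters and the initialization are modeling choices that need to be justified, for example by comparing against alternative methods as the study does.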
