Statistical Methods in Bioinformatics and Health Informatics

A special issue of Mathematics (ISSN 2227-7390). This special issue belongs to the section "E1: Mathematics and Computer Science".

Deadline for manuscript submissions: 30 September 2025

Special Issue Editor


Dr. Cheng Zheng
Guest Editor
Department of Biostatistics, University of Nebraska Medical Center, Omaha, NE, USA
Interests: causal inference; measurement error modeling; biomarker evaluation; metabolomics; nutritional epidemiology; risk of chronic diseases

Special Issue Information

Dear Colleagues,

It is our pleasure to invite you to submit a paper to our upcoming Special Issue on "Statistical Methods in Bioinformatics and Health Informatics". We aim to feature original research articles, reviews, and perspectives that focus on the development and application of sophisticated statistical and machine learning methods to analyze bioinformatics and health informatics data, including, but not limited to, genomics, transcriptomics, proteomics, metabolomics, microbiomics, radiomics, and electronic health record data.

This Special Issue will highlight recent advancements in statistical methods and their applications to bioinformatics and health informatics data. We welcome submissions that cover a wide range of topics, such as:

  • Dimension reduction and feature selection methods;
  • Machine learning and artificial intelligence methods;
  • Federated or distributed learning to handle big data;
  • The early diagnosis of disease and prediction of treatment response utilizing high-dimensional biomarkers;
  • Precision medicine using omics data;
  • The integration of multi-omics data.

We hope that this Special Issue will provide a platform for researchers from various disciplines to share their latest findings, discuss emerging challenges, and propose novel solutions.

We look forward to your contribution and participation.

Dr. Cheng Zheng
Guest Editor

Manuscript Submission Information

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the Special Issue website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Mathematics is an international peer-reviewed open access semimonthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 2600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

Keywords

  • bioinformatics
  • precision medicine
  • health informatics
  • big data
  • electronic health records
  • artificial intelligence
  • high-dimensional data
  • machine learning
  • distributed learning
  • federated learning
  • online learning
  • genomics
  • transcriptomics
  • proteomics
  • metabolomics
  • radiomics
  • microbiomics
  • computational biology

Benefits of Publishing in a Special Issue

  • Ease of navigation: Grouping papers by topic helps scholars navigate broad-scope journals more efficiently.
  • Greater discoverability: Special Issues support the reach and impact of scientific research. Articles in Special Issues are more discoverable and cited more frequently.
  • Expansion of research network: Special Issues facilitate connections among authors, fostering scientific collaborations.
  • External promotion: Articles in Special Issues are often promoted through the journal's social media, increasing their visibility.
  • e-Book format: Special Issues with more than 10 articles can be published as dedicated e-books, ensuring wide and rapid dissemination.

Further information can be found in MDPI's Special Issue policies.

Published Papers (3 papers)


Research

21 pages, 1101 KiB  
Article
On Data-Enriched Logistic Regression
by Cheng Zheng, Sayan Dasgupta, Yuxiang Xie, Asad Haris and Ying-Qing Chen
Mathematics 2025, 13(3), 441; https://doi.org/10.3390/math13030441 - 28 Jan 2025
Abstract
Biomedical researchers typically investigate the effects of specific exposures on disease risks within a well-defined population. The gold standard for such studies is to design a trial with an appropriately sampled cohort. However, due to the high cost of such trials, the collected sample sizes are often limited, making it difficult to accurately estimate the effects of certain exposures. In this paper, we discuss how to leverage the information from external “big data” (datasets with significantly larger sample sizes) to improve the estimation accuracy at the risk of introducing a small amount of bias. We propose a family of weighted estimators to balance bias increase and variance reduction when incorporating the big data. We establish a connection between our proposed estimator and the well-known penalized regression estimators. We derive optimal weights using both second-order and higher-order asymptotic expansions. Through extensive simulation studies, we demonstrate that the improvement in mean square error (MSE) for the regression coefficient can be substantial even with finite sample sizes, and that our weighted method outperforms existing approaches such as penalized regression and the James–Stein estimator. Additionally, we provide a theoretical guarantee that, in general, the proposed estimators will never yield an asymptotic MSE larger than that of the maximum likelihood estimator using the small data only. Finally, we apply our proposed methods to the Asia Cohort Consortium China cohort data to estimate the relationships between age, BMI, smoking, alcohol use, and mortality.
(This article belongs to the Special Issue Statistical Methods in Bioinformatics and Health Informatics)
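
The bias-variance trade-off described in the abstract above can be illustrated with a toy scalar case. The sketch below is not the authors' estimator (which targets logistic regression coefficients and uses asymptotic expansions); it assumes two independent estimates of one parameter, an unbiased small-data estimate with variance var_small and a big-data estimate with variance var_big and bias bias_big. The function names and all numeric values are hypothetical. Under these assumptions, the combined estimate (1 - w) * small + w * big has MSE (1 - w)^2 * var_small + w^2 * (var_big + bias_big^2), which is minimized at w = var_small / (var_small + var_big + bias_big^2).

```python
def combined_mse(w, var_small, var_big, bias_big):
    """MSE of (1 - w) * small + w * big for independent estimates.

    Assumes the small-data estimate is unbiased and the big-data
    estimate carries a fixed bias (illustrative assumption only).
    """
    return (1.0 - w) ** 2 * var_small + w ** 2 * (var_big + bias_big ** 2)


def optimal_weight(var_small, var_big, bias_big):
    """Weight on the big-data estimate that minimizes combined_mse.

    Requires var_small + var_big + bias_big**2 > 0.
    """
    return var_small / (var_small + var_big + bias_big ** 2)


# Hypothetical numbers: a noisy small study and a precise but biased big dataset.
var_small, var_big, bias_big = 1.0, 0.01, 0.3

w_star = optimal_weight(var_small, var_big, bias_big)
mse_star = combined_mse(w_star, var_small, var_big, bias_big)
mse_small_only = combined_mse(0.0, var_small, var_big, bias_big)
mse_big_only = combined_mse(1.0, var_small, var_big, bias_big)

print(f"optimal weight on big data: {w_star:.4f}")
print(f"MSE small only: {mse_small_only:.4f}")
print(f"MSE big only:   {mse_big_only:.4f}")
print(f"MSE combined:   {mse_star:.4f}")
```

In this toy setting the minimized MSE equals var_small * (var_big + bias_big^2) / (var_small + var_big + bias_big^2), which never exceeds var_small, so the combined estimate is never worse than using the small data alone. In practice the bias is unknown and must be estimated, which is where the approach summarized in the abstract goes beyond this sketch.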

12 pages, 933 KiB  
Article
Improving the Diagnosis of Systemic Lupus Erythematosus with Machine Learning Algorithms Based on Real-World Data
by Meeyoung Park
Mathematics 2024, 12(18), 2849; https://doi.org/10.3390/math12182849 - 13 Sep 2024
Abstract
This study addresses the diagnostic challenges of Systemic Lupus Erythematosus (SLE), an autoimmune disease with a complex etiology and varied symptoms. The ANA (antinuclear antibody) test, currently the primary diagnostic tool for SLE, exhibits high sensitivity but low specificity, often leading to inaccurate diagnoses. To enhance diagnostic precision, we propose integrating machine learning algorithms with existing clinical classification guidelines to improve SLE diagnosis accuracy, potentially reducing diagnostic errors and healthcare costs. We analyzed real-world data from a cohort of 24,990 patients over a 10-year period at the hospitals, excluding those previously diagnosed with SLE. Patients were categorized into three groups: negative ANA, positive ANA with non-SLE, and positive ANA with SLE. Feature selection was conducted to identify key factors influencing SLE diagnosis, and machine learning algorithms were employed to develop the clinical decision support system (CDSS). Performance analysis of three machine learning algorithms (decision tree, random forest, and gradient boosting) based on feature sets of 10, 20, and all available features revealed accuracy rates of 70%, 88%, and 87%, respectively, for the 20-feature set. The proposed system, utilizing real-world medical data, demonstrated modest performance in SLE diagnosis, highlighting the potential of machine learning-based CDSS in real clinical settings.

11 pages, 451 KiB  
Article
High-Dimensional Mediation Analysis for Time-to-Event Outcomes with Additive Hazards Model
by Meng An and Haixiang Zhang
Mathematics 2023, 11(24), 4891; https://doi.org/10.3390/math11244891 - 6 Dec 2023
Abstract
Mediation analysis plays an increasingly crucial role in identifying potential causal pathways between exposures and outcomes. However, there is currently a lack of developed mediation approaches for high-dimensional survival data, particularly when considering additive hazard models. The present study introduces two novel approaches for identifying statistically significant mediators in high-dimensional additive hazard models: a multiple testing-based mediator selection method and a knockoff filter procedure. The simulation results demonstrate the outstanding performance of these two proposed methods. Finally, we employ the proposed methodology to analyze The Cancer Genome Atlas (TCGA) cohort in order to identify DNA methylation markers that mediate the association between smoking and survival time among lung cancer patients.