MDPI - Publisher of Open Access Journals

23 pages, 4343 KB

Open Access Article

Integrative Analysis of Biomarkers for Cancer Stem Cells in Bladder Cancer and Their Therapeutic Potential

by Jing Wu and Wei Liu

Genes 2025, 16(10), 1146; https://doi.org/10.3390/genes16101146 (registering DOI) - 27 Sep 2025

Background: Cancer stem cells (CSCs) are key drivers of tumorigenesis and metastasis. However, the precise roles of CSC-associated genes in these processes remain unclear. Methods: This study integrates cancer stem cell biomarkers and clinical data from The Cancer Genome Atlas (TCGA) specific to bladder cancer (BLCA). By combining differentially expressed genes (DEGs) from TCGA-BLCA samples with CSC-related biomarkers, we conducted comprehensive functional analyses and developed an 8-gene prognostic signature through Cox regression, least absolute shrinkage and selection operator (LASSO) analysis, and multivariate Cox regression. This model was validated with GEO datasets (GSE13507 and GSE32894), and the single-cell RNA-seq dataset GSE222315 was subsequently analyzed to characterize the signature genes and elucidate their interactions. A nomogram was created to stratify TCGA-BLCA patients into risk categories. The ‘oncoPredict’ algorithm, based on the GDSC2 dataset, was used to assess drug sensitivity in BLCA. Results: From the TCGA cohort, 665 CSC-related genes were identified, with 120 showing significant differential expression. The 8-gene signature (ALDH1A1, CBX7, CSPG4, DCN, FASN, INHBB, MYC, NCAM1) demonstrated strong predictive power for overall survival in both TCGA and GEO cohorts, as confirmed by Kaplan–Meier and ROC analyses. The nomogram, integrating age, tumor stage, and risk scores, demonstrated high predictive accuracy. Additionally, the oncoPredict algorithm indicated varying drug sensitivities across patient groups. Conclusions: Based on retrospective data, we identified a novel CSC-related prognostic signature for BLCA. This finding suggests that targeting these genes could offer promising therapeutic strategies.

(This article belongs to the Section Human Genomics and Genetic Diseases)

38 pages, 2285 KB

Open Access Article

Short-Term Forecasting of Unplanned Power Outages Using Machine Learning Algorithms: A Robust Feature Engineering Strategy Against Multicollinearity and Nonlinearity

by Khathutshelo Steven Sivhugwana and Edmore Ranganai

Energies 2025, 18(18), 4994; https://doi.org/10.3390/en18184994 - 19 Sep 2025

Viewed by 163

Abstract

Efficient power grid operations and effective business strategies require accurate prediction of power outages. However, predicting outages is a difficult task due to the large amount of heterogeneous, random, intermittent, and non-linear power grid data characterised by highly complex variable relationships. Attempting to simultaneously quantify these characteristics using a conventional single (linear or nonlinear) model may lead to inaccurate and costly results. To address this, we propose a hybrid RVM-WT-AdaBoostRT-RF framework using power grid data from the Electricity Supply Commission (Eskom) of South Africa. To achieve model interpretability, the least absolute shrinkage and selection operator (LASSO) is first applied to remedy the adverse effects of multicollinearity through regularisation and variable selection. Secondly, a random forest (RF) is used to select the top 10 most influential variables for each season for further analysis. A relevance vector machine (RVM) captures complex nonlinear relationships separately for each season, while the wavelet transform (WT) decomposes residuals generated from RVM into different frequency subseries (with reduced noise). These subseries are predicted with minimal bias using AdaBoost with regression and threshold (AdaBoostRT). Finally, we stack RVM, AdaBoostRT, RF, and residual individual predictions using RF as a meta-model to produce the final forecast efficiently and with minimal error accumulation. The comparative study, based on point forecast metrics, the Diebold-Mariano test, and prediction interval widths, shows that the proposed model outperforms vector autoregressive (VAR), RF, AdaBoostRT, RVM, and Naïve models. The study results can be utilised for optimising resource allocation, effective power grid management, and customer alerts.

(This article belongs to the Special Issue Machine Learning Algorithms for Power Systems and Renewable Energy Applications)

10 pages, 860 KB

Open Access Article

A Machine Learning Approach to Modify the Neurocognitive Frailty Index for the Prediction of Cognitive Status in the Canadian Population

by Nader Fallah, Sarah Pakzad, Paul-Émile Bourque and Hamidreza Goodarzynejad

J. Clin. Med. 2025, 14(18), 6509; https://doi.org/10.3390/jcm14186509 - 16 Sep 2025

Viewed by 278

Abstract

Background/Objective: Frailty, a geriatric syndrome characterized by decreased reserve and resistance to stressors in older adults, has been established as a robust predictor of health outcomes. Recently, the Neurocognitive Frailty Index (NFI) was introduced, including 42 physical and cognitive elements that collectively assess an individual’s vulnerability to age-related health decline. While this multidimensional approach improves predictive accuracy for cognitive decline, its high dimensionality might be a barrier to widespread adoption. Methods: We employed several machine learning techniques to reduce the dimensions of NFI while maintaining its predictive power. We trained five models: Network Analysis, neural networks, Least Absolute Shrinkage and Selection Operator Regression (LASSO), Random Forest, and eXtreme Gradient Boosting (XGBoost). Each model was calibrated using a dataset from the Canadian Study of Health and Aging, which included various cognitive, health, and functional variables. Results: Six variables had minimal impact on the outcome. Removing them yielded a modified NFI scale of 36 elements that maintained good predictive performance for cognitive change, similar to the original NFI. Conclusions: Our findings support the feasibility of applying machine learning techniques to modify predictive models in neurodegenerative diseases beyond frailty assessment. We recommend exploring the application of this scale using other data. The results also emphasize the potential of machine learning approaches for improving predictive models, highlighting their value as a tool for advancing our understanding of aging and its complexities.

(This article belongs to the Section Geriatric Medicine)

14 pages, 4621 KB

Open Access Article

Radiomics for Detecting Metaplastic Histology in Triple-Negative Breast Cancer: A Step Towards Personalized Therapy

by Rana Gunoz Comert, Gorkem Durak, Ravza Yilmaz, Halil Ertugrul Aktas, Zeynep Tuz, Hongyi Pan, Jun Zeng, Aysel Bayram, Baran Mollavelioglu, Sukru Mehmet Erturk and Ulas Bagci

Bioengineering 2025, 12(9), 973; https://doi.org/10.3390/bioengineering12090973 - 12 Sep 2025

Viewed by 474

Abstract

This study aims to develop and validate a multisequence MRI-based radiomics approach for distinguishing metaplastic breast cancer (MBC) from non-metaplastic triple-negative breast cancer (TNBC) at the initial diagnosis, which could facilitate optimal treatment selection. In this retrospective study, we analyzed 105 patients (27 MBC, 78 non-metaplastic TNBC) who underwent standardized breast magnetic resonance imaging (MRI), which included T1-weighted contrast-enhanced (T1W-CE) and short-tau inversion recovery (STIR) sequences. Two radiologists performed ground truth lesion segmentation, verified by a senior radiologist. We extracted 214 radiomic features (using PyRadiomics) and used least absolute shrinkage and selection operator (LASSO) regression for feature selection. Seven machine learning classifiers were evaluated using five-fold cross-validation, with performance assessed through ROC analysis and accuracy metrics. The combined T1W-CE and STIR analysis demonstrated superior diagnostic performance for distinguishing MBC from non-metaplastic TNBC (AUC = 0.845; accuracy = 81%) compared with either sequence alone (T1W-CE only: AUC = 0.805, accuracy = 80%; STIR only: AUC = 0.768, accuracy = 77%). Multisequence MRI radiomics can reliably distinguish between MBC and non-metaplastic TNBC at the time of initial diagnosis. This could potentially facilitate the selection of more appropriate treatments and help avoid ineffective chemotherapy for MBC patients.

(This article belongs to the Special Issue Breast Cancer: From Precision Medicine to Diagnostics)

20 pages, 1655 KB

Open Access Article

Predicting Academic Performance from Future-Oriented Daily Time Management Behavior: A LASSO-Based Study of First-Year College Students

by Mingzhang Zuo, Kunyu Wang, Pengxuan Tang, Meng Xiao, Xiaotang Zhou and Heng Luo

Behav. Sci. 2025, 15(9), 1242; https://doi.org/10.3390/bs15091242 - 12 Sep 2025

Viewed by 811

Abstract

This study examined how the time management behavior of first-year college students predicted their academic performance. Data on 44 objective indicators of daily time management behaviors were collected from 110 first-year students via a WeChat Mini Program, through one month of consecutive daily tracking. To identify stable predictors, Least Absolute Shrinkage and Selection Operator (LASSO) regression with 5000 bootstrap resamples was conducted, and variables with high selection frequency were subsequently entered into Elastic Net regression to examine explanatory relationships. Six key behavioral indicators were found to predict overall academic performance. Subject-specific models revealed varying associations: time management behaviors appeared more influential in subjects such as Physical Education and English, while their role was less evident in Mathematics. The number and nature of retained predictors also differed across disciplines.
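
The bootstrap-plus-LASSO idea in the abstract above can be sketched roughly as follows. This is an illustrative, pure-Python sketch and not the authors' pipeline: the function names, the penalty `lam`, the absence of an intercept, and the plain coordinate-descent solver are all assumptions made for the example.

```python
import random


def soft_threshold(rho, lam):
    """Soft-thresholding operator; this is what sets LASSO coefficients to exactly zero."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1, without an intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Prediction from every feature except j
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            # A column that is all zeros in this sample cannot be selected
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


def selection_frequency(X, y, lam, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each feature gets a nonzero coefficient."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]
```

Features whose selection frequency stays high across resamples would be the "stable predictors" carried forward. The abstract reports 5000 resamples; the smaller default here only keeps the sketch fast.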

(This article belongs to the Section Developmental Psychology)

13 pages, 853 KB

Open Access Article

Risk Factors and Development of a Predictive Model for In-Hospital Mortality in Hemodynamically Stable Older Adults with Urinary Tract Infection

by Tzu-Heng Cheng, Wei Lu, Chen-Bin Chen, Chen-June Seak and Chieh-Ching Yen

Medicina 2025, 61(9), 1625; https://doi.org/10.3390/medicina61091625 - 8 Sep 2025

Viewed by 250

Abstract

Background and Objectives: Urinary tract infections (UTIs) are a major cause of emergency department (ED) visits and hospital admissions among older adults. Although most seniors present hemodynamically stable, a sizeable fraction deteriorate during hospitalization, and no ED-specific tool exists to identify those at greatest risk. We sought to determine risk factors for in-hospital mortality in this population and to develop a predictive model. Materials and Methods: We analyzed the MIMIC-IV-ED database (2011–2019) and enrolled culture-confirmed UTI patients aged ≥ 65 years who were hemodynamically stable, defined as a systolic blood pressure ≥ 100 mm Hg without vasopressor support. Demographics, comorbidities, triage vital signs, and initial laboratory tests were extracted. Least Absolute Shrinkage and Selection Operator (LASSO) regression with 10-fold cross-validation was performed for variable selection. Discrimination was quantified with the C-statistic, calibration with the Hosmer–Lemeshow test, and clinical utility with decision curve analysis. Internal validation was assessed via 1000-sample bootstrap resampling. Results: Among 1571 eligible encounters (median age 79 years, 33% male), in-hospital mortality was 4.5%. LASSO selected eight variables; six remained significant in multivariable analysis: age, systolic blood pressure, oxygen saturation, white blood cell count, red cell distribution width, and blood urea nitrogen. The predictive nomogram demonstrated a C-statistic of 0.73 (95% CI 0.66–0.79) and outperformed traditional early warning scores. Conclusions: A six-variable nomogram may stratify mortality risk in hemodynamically stable older adults with UTI. Because the model was developed in a single U.S. tertiary-care ED, it remains hypothesis-generating until validated in external, multicenter cohorts to confirm generalizability.
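
For a binary outcome such as in-hospital mortality, the C-statistic mentioned above equals the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor. A minimal sketch of that pairwise definition follows; the function name and the toy inputs are hypothetical, and ties are counted as half-concordant by convention.

```python
def c_statistic(scores, outcomes):
    """Concordance (C-statistic / AUC) for binary outcomes coded 1 = event, 0 = no event."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0   # event ranked above non-event
            elif p == q:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which puts the reported 0.73 in context.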

(This article belongs to the Section Urology & Nephrology)

20 pages, 2325 KB

Open Access Article

The Predictive Role of the Systemic Inflammation Response Index in the Prognosis of Hepatitis B Virus-Related Acute-on-Chronic Liver Failure: A Multicenter Study

by Jing Yuan, Jing Chen, Haibin Su, Yu Chen, Tao Han, Tao Chen, Xiaoyan Liu, Qi Wang, Pengbin Gao, Jinjun Chen, Jingjing Tong, Chen Li and Jinhua Hu

Healthcare 2025, 13(17), 2199; https://doi.org/10.3390/healthcare13172199 - 2 Sep 2025

Viewed by 579

Abstract

Background/Objectives: The prognosis of patients with hepatitis B virus-related acute-on-chronic liver failure (HBV-ACLF) is significantly affected by inflammatory state and immune dysregulation. The systemic inflammatory response index (SIRI), which reflects neutrophil, monocyte, and lymphocyte dynamics, has emerged as a potential marker of immune-inflammatory status. However, its role in predicting HBV-ACLF outcomes remains unclear. This research aims to elucidate the prognostic value of SIRI and its dynamic changes combined with disease severity scores in predicting the outcomes of HBV-ACLF. Methods: The study included HBV-ACLF patients enrolled in a multicenter clinical study between July 2019 and April 2024. Based on 90-day outcomes, the participants were categorized into survival and death groups. Clinical data and SIRI values were collected on days 0 (baseline), 3, 7, and 14. Independent prognostic factors were identified using Cox regression and least absolute shrinkage and selection operator (LASSO) analysis. The predictive value of dynamic SIRI changes combined with disease severity scores was evaluated using receiver operating characteristic (ROC) curves. Results: A total of 153 patients with HBV-ACLF were analyzed, including 104 in the survival group and 49 in the death group. SIRI values were significantly lower in the survival group than in the death group across all time points. Multivariate Cox regression analysis showed that an increased ΔSIRI at day 3 (ΔSIRI3), a higher MELD score, and a lower albumin level were independently associated with increased 90-day mortality. The combination of SIRI on day three (SIRI3) and MELD-Na score on day three (MELD-Na3) demonstrated the highest predictive performance, with an AUC of 0.817 (95% CI: 0.750–0.883). Conclusions: The combination of the SIRI and MELD-Na score on day three provides a strong predictive value for the short-term prognosis of HBV-ACLF, highlighting its potential utility in early prognostic evaluation.
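
SIRI is conventionally computed as neutrophil count × monocyte count / lymphocyte count. The short sketch below illustrates that formula. The abstract does not spell out how ΔSIRI3 is calculated, so treating it as the day-3 value minus the baseline value is an assumption, as are the function names and the choice of 10^9 cells/L as the unit.

```python
def siri(neutrophils, monocytes, lymphocytes):
    """Systemic inflammation response index from absolute counts (assumed 10^9 cells/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils * monocytes / lymphocytes


def delta_siri(baseline, follow_up):
    """Change in SIRI relative to baseline (assumed definition: follow-up minus baseline)."""
    return follow_up - baseline
```

Under this assumed definition, a positive change at day 3 would correspond to the "increased ΔSIRI3" that the abstract associates with higher 90-day mortality.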

19 pages, 8289 KB

Open Access Article

Machine Learning Integration of Bulk and Single-Cell RNA-Seq Data Reveals Cathepsin B as a Central PANoptosis Regulator in Influenza

by Bin Liu, Lin Zhu, Caijuan Zhang, Dunfang Wang, Haifan Liu, Jianyao Liu, Jingwei Sun, Xue Feng and Weipeng Yang

Int. J. Mol. Sci. 2025, 26(17), 8533; https://doi.org/10.3390/ijms26178533 - 2 Sep 2025

Viewed by 623

Abstract

Influenza A virus (IAV) infection triggers excessive activation of PANoptosis, a coordinated form of programmed cell death integrating pyroptosis, apoptosis, and necroptosis, which contributes to severe immunopathology and acute lung injury. However, the molecular regulators that drive PANoptosis during IAV infection remain poorly understood. In this study, we integrated bulk and single-cell RNA sequencing (scRNA-seq) datasets to dissect the cellular heterogeneity and transcriptional dynamics of PANoptosis in the influenza-infected lung. PANoptosis-related gene activity was quantified using the AUCell, ssGSEA, and AddModuleScore algorithms. Machine learning approaches, including Support Vector Machine (SVM), Random Forest (RF), and Least Absolute Shrinkage and Selection Operator (LASSO) regression, were employed to identify key regulatory genes. scRNA-seq analysis revealed that PANoptosis activity was primarily enriched in macrophages and neutrophils. Integration of transcriptomic and computational data identified cathepsin B (CTSB) as a central regulator of PANoptosis. In vivo validation in an IAV-infected mouse model confirmed elevated expression of PANoptosis markers and upregulation of CTSB. Mechanistically, CTSB may facilitate NLRP3 inflammasome activation and promote lysosomal dysfunction-associated inflammatory cell death. These findings identify CTSB as a critical mediator linking lysosomal integrity to innate immune-driven lung injury and suggest that targeting CTSB could represent a promising therapeutic strategy to alleviate influenza-associated immunopathology.

(This article belongs to the Section Molecular Informatics)

20 pages, 5360 KB

Open Access Article

Identification of Key Biomarkers Related to Lipid Metabolism in Acute Pancreatitis and Their Regulatory Mechanisms Based on Bioinformatics and Machine Learning

by Liang Zhang, Yujie Jiang, Taojun Jin, Mingxian Zheng, Yixuan Yap, Xuanyang Min, Jiayue Chen, Lin Yuan, Feng He and Bingduo Zhou

Biomedicines 2025, 13(9), 2132; https://doi.org/10.3390/biomedicines13092132 - 31 Aug 2025

Viewed by 627

Abstract

Background: Acute pancreatitis (AP) is characterized by the abnormal activation of pancreatic enzymes due to various causes, leading to local pancreatic inflammation. This can trigger systemic inflammatory response syndrome and multi-organ dysfunction. Hyperlipidemia, mainly resulting from lipid metabolism disorders and elevated triglyceride levels, is a major etiological factor in AP. This study aims to investigate the role of lipid metabolism-related genes in the pathogenesis of AP and to propose novel strategies for its prevention and treatment. Methods: We obtained AP-related datasets GSE3644, GSE65146, and GSE121038 from the GEO database. Differentially expressed genes (DEGs) were identified using DEG analysis and gene set enrichment analysis (GSEA). To identify core lipid metabolism genes in AP, we performed least absolute shrinkage and selection operator (LASSO) regression and support vector machine recursive feature elimination (SVM-RFE) analysis. Gene and protein interactions were predicted using GeneMANIA and AlphaFold. Finally, biomarker expression levels were quantified using real-time quantitative polymerase chain reaction (RT-qPCR) in an AP mouse model. Results: Seven lipid metabolism-related genes were identified as key biomarkers in AP: Amacr, Cyp39a1, Echs1, Gpd2, Osbpl9, Acsl4, and Mcee. The biological roles of these genes mainly involve fatty acid metabolism, cholesterol metabolism, lipid transport across cellular membranes, and mitochondrial function. Conclusions: Amacr, Cyp39a1, Echs1, Gpd2, Osbpl9, Acsl4, and Mcee are characteristic biomarkers of lipid metabolism abnormalities in AP. These findings are crucial for a deeper understanding of lipid metabolism pathways in AP and for the early implementation of preventive clinical measures, such as the control of blood lipid levels.

(This article belongs to the Section Cancer Biology and Oncology)

14 pages, 5071 KB

Open Access Article

Radiomics Features from Different Prostatic Zones on ¹⁸F-PSMA-1007 PET/CT for Predicting Persistent PSA in Prostate Cancer Patients: A Multicenter Study

by Licong Li, Jian Xu, Shuying Bian, Fei Yao, Qi Lin, Meiyan Zhou, Yunjun Yang, Meiyao Song, Yixuan Pan, Qinyang Shen, Yuandi Zhuang and Jie Lin

Cancers 2025, 17(17), 2807; https://doi.org/10.3390/cancers17172807 - 28 Aug 2025

Viewed by 519

Abstract

Objectives: This study aims to explore the role of radiomics features (RFs) from prostate subregions, including the tumor microenvironment (TME), in predicting persistent PSA. Methods: In this retrospective analysis, we divided 354 patients with pathologically confirmed localized prostate cancer (PCa) into training, internal validation, and external validation cohorts. The prostate on ¹⁸F-prostate-specific membrane antigen (PSMA)-1007 positron emission tomography/computed tomography (PET/CT) was partitioned into three zones based on the maximum standardized uptake value (SUVmax) (zone-intra: 45–100% SUVmax; zone-peri: 20–45% SUVmax; zone-norm: 0–20% SUVmax). RFs from these zones were used to develop five radiomics models [model-intra; model-peri; model-norm; model-ip; model-ipn]. Three optimal radiomics models were further integrated with the PSA model to construct combined models. Model performance was evaluated using receiver operating characteristic (ROC) curves and the area under the curve (AUC). Results: Using least absolute shrinkage and selection operator (LASSO) and logistic regression, five radiomics models were constructed, with model-ip, model-ipn, and model-intra showing superior performance [training cohort AUCs: 0.76 (0.68–0.83), 0.75 (0.68–0.83), 0.76 (0.68–0.83); internal validation cohort AUCs: 0.76 (0.65–0.88), 0.72 (0.57–0.86), 0.70 (0.55–0.86); external validation cohort AUCs: 0.70 (0.50–0.86), 0.55 (0.36–0.73), 0.53 (0.34–0.72)]. Notably, the combined model incorporating model-ip and the PSA model exhibited optimal performance [training cohort AUC: 0.78 (0.71–0.85); internal validation cohort AUC: 0.78 (0.67–0.90); external validation cohort AUC: 0.89 (0.72–0.98)]. Conclusions: The RFs in different subregions on ¹⁸F-PSMA-1007 PET/CT have varying effectiveness in predicting persistent PSA. A radiomics model that encompasses the 20–45% SUVmax and 45–100% SUVmax zones, when combined with the PSA model, enhances predictive accuracy.
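
The SUVmax-based zoning described above amounts to thresholding each voxel's uptake as a fraction of the maximum. A rough sketch on a flat list of voxel values is given below. The function names are hypothetical, and the handling of values that fall exactly on the 20% and 45% boundaries is an assumption, since the abstract does not say which zone they belong to.

```python
def zone_for_voxel(suv, suv_max):
    """Assign one voxel to a zone by its fraction of SUVmax (boundary handling assumed)."""
    frac = suv / suv_max
    if frac >= 0.45:
        return "zone-intra"   # 45-100% of SUVmax
    if frac >= 0.20:
        return "zone-peri"    # 20-45% of SUVmax
    return "zone-norm"        # 0-20% of SUVmax


def partition_by_suvmax(suvs):
    """Group voxel indices by zone, using the largest value in `suvs` as SUVmax."""
    suv_max = max(suvs)
    if suv_max <= 0:
        raise ValueError("SUVmax must be positive")
    zones = {"zone-intra": [], "zone-peri": [], "zone-norm": []}
    for i, v in enumerate(suvs):
        zones[zone_for_voxel(v, suv_max)].append(i)
    return zones
```

In a real pipeline the same thresholds would be applied to a 3D image mask restricted to the prostate; the flat list here only keeps the example self-contained.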

(This article belongs to the Section Methods and Technologies Development)

19 pages, 1087 KB

Open Access Article

Exploring Sarcopenic Obesity in the Cancer Setting: Insights from the National Health and Nutrition Examination Survey on Prognosis and Predictors Using Machine Learning

by Yinuo Jiang, Wenjie Jiang, Qun Wang, Ting Wei and Lawrence Wing Chi Chan

Bioengineering 2025, 12(9), 921; https://doi.org/10.3390/bioengineering12090921 - 27 Aug 2025

Viewed by 625

Abstract

Objective: Sarcopenic obesity (SO) is a combination of depleted skeletal muscle mass and obesity, with a high prevalence, undetected onset, challenging diagnosis, and poor prognosis. However, studies on SO in cancer settings are limited. We aimed to explore the association between SO and mortality and to investigate potential predictors involved in the development of SO, with a further objective of constructing a model to detect its occurrence in cancer patients. Methods: The data of 1432 cancer patients from the National Health and Nutrition Examination Survey (NHANES) from the years 1999 to 2006 and 2011 to 2016 were included. For survival analysis, univariable and multivariable Cox proportional hazard models were used to examine the associations of SO with overall survival, adjusting for potential confounders. For machine learning, six algorithms, including logistic regression, stepwise logistic regression, least absolute shrinkage and selection operator (LASSO), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost), were utilized to build models to predict the presence of SO. The predictive performance of each model was evaluated. Results: In the survival analysis, cancer patients with SO had a significantly higher risk of all-cause mortality (adjusted HR 1.368, 95% CI 1.107–1.690) than individuals without SO. Among the six machine learning algorithms, the optimal LASSO model achieved the highest area under the curve (AUC), 0.891 on the training set and 0.873 on the test set, outperforming the other five algorithms. Conclusions: SO is a significant risk factor for the prognosis of cancer patients. Our constructed LASSO model to predict the presence of SO is an effective tool for clinical practice. This study is the first to utilize machine learning to explore the predictors of SO among cancer populations, providing valuable insights for future research.

(This article belongs to the Special Issue Computer Vision and Machine Learning in Medical Applications, 2nd Edition)

13 pages, 1357 KB

Open Access Article

Decision Tree Modeling to Predict Myopia Progression in Children Treated with Atropine: Toward Precision Ophthalmology

by Jun-Wei Chen, Chi-Jie Lu, Chieh-Han Yu, Tzu-Chi Liu and Tzu-En Wu

Diagnostics 2025, 15(16), 2096; https://doi.org/10.3390/diagnostics15162096 - 20 Aug 2025

Viewed by 541

Abstract

Background/Objectives: Myopia is a growing global health concern, especially among school-aged children in East Asia. Topical atropine is a key treatment for pediatric myopia control, but individual responses vary, with some children showing rapid progression despite higher doses. This retrospective observational study aims to develop an interpretable machine learning model to predict individualized treatment responses and support personalized clinical decisions, based on data collected over a 3-year period without a control group. Methods: A total of 1545 pediatric eyes treated with topical atropine for myopia control at a single tertiary medical center were analyzed. A classification and regression tree (CART) model was constructed to predict changes in spherical equivalent (SE) and identify influencing risk factors. These factors mainly concern the myopia treatment received, including atropine dosage records, treatment duration, and ophthalmic examinations. Furthermore, decision rules that closely resemble the clinical diagnosis process are provided to give clinicians more interpretable insights into personalized treatment decisions. The performance of CART was compared with that of a benchmark least absolute shrinkage and selection operator (Lasso) regression model to confirm the practicality of CART. Results: Both the CART and Lasso models demonstrated comparable predictive performance. The CART model identified baseline SE as the primary determinant of myopia progression. Children with a baseline SE more negative than −3.125 D exhibited greater myopic progression, particularly those with prolonged treatment duration and higher cumulative atropine dosage. Conclusions: Baseline SE was identified as the key factor affecting SE difference. The decision rules generated by CART demonstrate the use of explainable machine learning in precision myopia management.

(This article belongs to the Special Issue Global Perspectives on Myopia—Epidemiology, Pathophysiology, and Emerging Assessment Technologies)

21 pages, 12853 KB

Open Access Article

Identification of Novel Lactylation-Related Biomarkers for COPD Diagnosis Through Machine Learning and Experimental Validation

by Chundi Hu, Weiliang Qian, Runling Wei, Gengluan Liu, Qin Jiang, Zhenglong Sun and Hui Li

Biomedicines 2025, 13(8), 2006; https://doi.org/10.3390/biomedicines13082006 - 18 Aug 2025

Viewed by 788

Abstract

Objective: This study aims to identify clinically relevant lactylation-related biomarkers in chronic obstructive pulmonary disease (COPD) and investigate their potential mechanistic roles in COPD pathogenesis. Methods: Differentially expressed genes (DEGs) were identified from the GSE21359 dataset, followed by weighted gene co-expression network analysis (WGCNA) to detect COPD-associated modules. Least absolute shrinkage and selection operator (LASSO) regression and support vector machine–recursive feature elimination (SVM–RFE) algorithms were applied to screen lactylation-related biomarkers, with diagnostic performance evaluated using receiver operating characteristic (ROC) curves. Candidates were validated in the GSE76925 dataset for expression and diagnostic robustness. Immune cell infiltration patterns were characterized using EPIC deconvolution. Single-cell transcriptomic data (from GSE173896) were processed with the ‘Seurat’ package, encompassing quality control, dimensionality reduction, and cell type annotation. Cell-type-specific markers and intercellular communication networks were delineated using the ‘FindAllMarkers’ function and the ‘CellChat’ R package, respectively. In vitro validation was conducted using a cigarette smoke extract (CSE)-induced COPD model. Results: Integrated transcriptomic approaches and multi-algorithm screening (LASSO/Boruta/SVM–RFE) revealed carbonyl reductase 1 (CBR1) and peroxiredoxin 1 (PRDX1) as core COPD biomarkers enriched in oxidation–reduction and inflammatory pathways, with high diagnostic accuracy (AUC > 0.85). Immune profiling and scRNA-seq delineated macrophage and cancer-associated fibroblast (CAF) infiltration with oxidative-redox transcriptional dominance in COPD. CBR1 was significantly upregulated in T cells, neutrophils, and mast cells, and PRDX1 was significantly upregulated in endothelial cells, macrophages, and ciliated cells. Experimental validation in CSE-induced models confirmed significant upregulation of both biomarkers via quantitative reverse transcription PCR (qRT-PCR) and immunofluorescence. Conclusions: CBR1 and PRDX1 are lactylation-associated diagnostic markers, with lactylation-driven redox imbalance implicated in COPD progression.
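The LASSO screening step named in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a plain linear LASSO solved by coordinate descent on centered toy data, and the function names, the penalty value, and the data are all hypothetical. It shows only how the L1 penalty shrinks an uninformative feature's coefficient to exactly zero, which is the property that makes LASSO usable for feature selection.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 without an intercept.

    Assumes X (list of rows) and y are already centered (illustrative only).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return w


# Toy data (hypothetical): y depends only on the first feature.
X_toy = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y_toy = [2.0, 2.0, -2.0, -2.0]
weights = lasso_coordinate_descent(X_toy, y_toy, lam=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

In this toy case the second feature is dropped and the first coefficient is shrunk below its unpenalised value of 2, which is the usual trade-off of L1 regularisation.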

(This article belongs to the Section Molecular and Translational Medicine)

8 pages, 529 KB

Open Access Data Descriptor

An Extended Dataset of Educational Quality Across Countries (1970–2023)

by Hanol Lee and Jong-Wha Lee

Data 2025, 10(8), 130; https://doi.org/10.3390/data10080130 - 15 Aug 2025

Viewed by 688

Abstract

This study presents an extended dataset on educational quality covering 101 countries from 1970 to 2023. While existing international assessments, such as the Programme for International Student Assessment (PISA) and Trends in International Mathematics and Science Study (TIMSS), offer valuable snapshots of student performance, their limited coverage across countries and years constrains broader analyses. To address this limitation, we harmonized observed test scores across assessments and imputed missing values using both linear interpolation and machine learning, specifically least absolute shrinkage and selection operator (LASSO) regression. The dataset includes (i) harmonized test scores for 15-year-olds, (ii) annual educational quality indicators for the 15–19 age group, and (iii) educational quality indexes for the working-age population (15–64). These measures are provided in machine-readable formats and support empirical research on human capital, economic development, and global education inequalities across economies.
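The linear interpolation step mentioned above can be sketched as follows. This is an illustrative reading, not the dataset's actual procedure: the function name, the years, and the scores are hypothetical, and gaps before the first or after the last observed value are deliberately left unfilled here (the abstract indicates that LASSO regression was also used for imputation, which this sketch does not cover).

```python
def interpolate_missing(years, scores):
    """Fill None entries that lie between two observed scores.

    years  -- sorted list of assessment years
    scores -- list of floats or None, aligned with years
    Leading and trailing gaps are returned unchanged (still None).
    """
    filled = list(scores)
    known = [i for i, s in enumerate(scores) if s is not None]
    for left, right in zip(known, known[1:]):
        span = years[right] - years[left]
        for i in range(left + 1, right):
            # Fraction of the way from the left to the right observation.
            frac = (years[i] - years[left]) / span
            filled[i] = scores[left] + frac * (scores[right] - scores[left])
    return filled


# Hypothetical series for one country: two observed scores, three gaps.
toy_years = [1995, 2000, 2005, 2010, 2015]
toy_scores = [None, 480.0, None, 500.0, None]
toy_filled = interpolate_missing(toy_years, toy_scores)
```

Only the 2005 value is filled in this example, because it is the only gap bracketed by observations on both sides.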

(This article belongs to the Special Issue Data Mining and Computational Intelligence for E-Learning and Education—3rd Edition)

14 pages, 2727 KB

Open Access Article

A Multimodal MRI-Based Model for Colorectal Liver Metastasis Prediction: Integrating Radiomics, Deep Learning, and Clinical Features with SHAP Interpretation

by Xin Yan, Furui Duan, Lu Chen, Runhong Wang, Kexin Li, Qiao Sun and Kuang Fu

Curr. Oncol. 2025, 32(8), 431; https://doi.org/10.3390/curroncol32080431 - 30 Jul 2025

Viewed by 971

Abstract

Purpose: Predicting colorectal cancer liver metastasis (CRLM) is essential for prognostic assessment. This study aims to develop and validate an interpretable multimodal machine learning framework based on multiparametric MRI for predicting CRLM, and to enhance the clinical interpretability of the model through SHapley Additive exPlanations (SHAP) analysis and deep learning visualization. Methods: This multicenter retrospective study included 463 patients with pathologically confirmed colorectal cancer from two institutions, divided into training (n = 256), internal testing (n = 111), and external validation (n = 96) sets. Radiomics features were extracted from manually segmented regions on axial T2-weighted imaging (T2WI) and diffusion-weighted imaging (DWI). Deep learning features were obtained from a pretrained ResNet101 network using the same MRI inputs. Least absolute shrinkage and selection operator (LASSO) logistic regression classifiers were developed for the clinical, radiomics, deep learning, and combined models. Model performance was evaluated by AUC, sensitivity, specificity, and F1-score. SHAP was used to assess feature contributions, and Grad-CAM was applied to visualize deep feature attention. Results: The combined model integrating features across the three modalities achieved the highest performance across all datasets, with AUCs of 0.889 (training), 0.838 (internal test), and 0.822 (external validation), outperforming single-modality models. Decision curve analysis (DCA) revealed enhanced clinical net benefit from the integrated model, while calibration curves confirmed its good predictive consistency. SHAP analysis revealed that radiomic features related to T2WI texture (e.g., LargeDependenceLowGrayLevelEmphasis) and clinical biomarkers (e.g., CA19-9) were among the most predictive for CRLM. Grad-CAM visualizations confirmed that the deep learning model focused on tumor regions consistent with radiological interpretation. Conclusions: This study presents a robust and interpretable multiparametric MRI-based model for noninvasively predicting liver metastasis in colorectal cancer patients. By integrating handcrafted radiomics and deep learning features, and enhancing transparency through SHAP and Grad-CAM, the model provides both high predictive performance and clinically meaningful explanations. These findings highlight its potential value as a decision-support tool for individualized risk assessment and treatment planning in the management of colorectal cancer.
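For readers unfamiliar with the evaluation metrics listed in the Methods (AUC, sensitivity, specificity, F1-score), the sketch below computes them from binary labels and predicted scores. It is a generic illustration under stated assumptions, not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical. AUC is computed here as the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity, and F1 at a fixed (assumed) decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Toy example (hypothetical): 1 = metastasis, 0 = no metastasis.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_labels, toy_scores)
toy_metrics = threshold_metrics(toy_labels, toy_scores)
```

Note that AUC is threshold-free, whereas sensitivity, specificity, and F1 all change with the chosen cut-off, which is why studies typically report them together.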

(This article belongs to the Section Gastrointestinal Oncology)

Search Results (361)
