Open Access Article

Peer-Review Record

Bayesian Learning Strategies for Reducing Uncertainty of Decision-Making in Case of Missing Values

Mach. Learn. Knowl. Extr. 2025, 7(3), 106; https://doi.org/10.3390/make7030106

by Vitaly Schetinin and Livija Jakaite

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 10 July 2025 / Revised: 11 September 2025 / Accepted: 16 September 2025 / Published: 22 September 2025

(This article belongs to the Section Learning)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper presents a well-motivated and methodologically sound study addressing the critical challenge of missing data in predictive modeling for liquidity crisis forecasting. The integration of Bayesian Model Averaging (BMA) with Decision Trees (DTs) via RJ MCMC sampling is innovative, and the proposed "sweeping strategy" effectively mitigates overfitting while maintaining model interpretability. The novel Ext preprocessing technique (extending features with binary missing-value indicators) demonstrates significant empirical advantages over established baselines, particularly in handling non-random missingness. The real-world financial application enhances practical relevance, and the rigorous validation (synthetic benchmarks, AUC-PRC, Hosmer-Lemeshow tests) strengthens credibility.
1. The RJ MCMC birth/death/change moves (Algorithms 1–2) lack sufficient pseudocode detail. How is the Metropolis-Hastings acceptance ratio calculated (e.g., which proposal distributions are used for the parameters)? How are the priors (e.g., on tree size and node parameters) defined beyond uniform sampling? Please include mathematical formulations for the key transition probabilities and a complete RJ MCMC sampling flowchart.
2. The computational cost of RJ MCMC (burn-in: 100k samples; post-burn: 5k samples) is non-trivial but underexplored. Please benchmark runtime against simpler methods (e.g., single DT, Random Forest) and discuss scalability. Could parallelization or variational approximations expedite sampling?
3. Only one synthetic (XOR) and one real-world dataset were tested. Please validate on additional UCI/standard datasets with controlled missingness mechanisms (MCAR, MAR, MNAR) to generalize claims beyond finance. The importance of Ext’s 14 binary indicators (Fig. 4) is not analyzed. Please discuss whether specific indicators (e.g., for debt ratios) drive predictions, enhancing model interpretability.
4. Table 2 (30+ rows) is overwhelming. Please condense Table 2 to critical thresholds (e.g., max F1, Youden’s index) and move full data to supplementary material.
5. Limited comparison with modern missing-data techniques (e.g., MICE, GAIN, or deep learning imputers). Please add a baseline using XGBoost/Random Forest with built-in missing-value handling to contextualize the 92.2% accuracy gain.
6. Uncertainty quantification (Fig. 5) is insightful but lacks guidance on translating posterior distributions into actionable decisions (e.g., risk thresholds for intervention). Please include a case study showing how uncertainty estimates directly impact a financial decision pipeline.
7. Critical implementation details (e.g., the tuning protocol for the RJ MCMC proposal variances) are omitted. Please publish code/data (or a synthetic data generator) and specify hyperparameter search ranges.
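
To make the request in comment 3 concrete, the following is a minimal sketch of how controlled missingness (MCAR, MAR, MNAR) could be injected into a dataset and how features could then be extended with binary missing-value indicators in the spirit of Ext. The function names `make_missing` and `extend_with_indicators`, the 30% missingness rate, the quantile-based masking rule, and the constant fill value are all illustrative assumptions; they are not taken from the manuscript and may differ from the authors' implementation.

```python
import numpy as np

def make_missing(X, mechanism, rate=0.3, seed=0):
    """Return a copy of X with NaNs injected into column 0 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    X = X.astype(float).copy()
    n = X.shape[0]
    if mechanism == "MCAR":
        # Missingness is independent of all values.
        mask = rng.random(n) < rate
    elif mechanism == "MAR":
        # Missingness in column 0 depends only on the observed column 1.
        mask = X[:, 1] > np.quantile(X[:, 1], 1 - rate)
    elif mechanism == "MNAR":
        # Missingness in column 0 depends on the (unobserved) value itself.
        mask = X[:, 0] > np.quantile(X[:, 0], 1 - rate)
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    X[mask, 0] = np.nan
    return X

def extend_with_indicators(X, fill_value=0.0):
    """Append one binary indicator per column that contains missing values.

    Assumption: missing entries are replaced by a constant so that a
    downstream model can still split on the original column.
    """
    missing = np.isnan(X)
    cols_with_missing = missing.any(axis=0)
    indicators = missing[:, cols_with_missing].astype(float)
    filled = np.where(missing, fill_value, X)
    return np.hstack([filled, indicators])

# Example: three synthetic features, missingness injected under each mechanism.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
for mechanism in ("MCAR", "MAR", "MNAR"):
    X_ext = extend_with_indicators(make_missing(X, mechanism))
    print(mechanism, X_ext.shape)
```

Under MNAR the indicator column itself carries information about the masked value, which is the setting where an indicator-based extension would be expected to matter most; comparing the three mechanisms on the same data is one way to test that expectation.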

Author Response

Please see attached

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Please see the attached PDF file.

Comments for author File: Comments.pdf

Author Response

Please see attached

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

I congratulate the authors for the excellent improvements made. They have addressed all the points raised in the first review.

I would like to point out the following minor issues for your consideration:

Lines 86–97 contain information that also appears in lines 98–106 regarding the definition of μj and M. Please check whether this is a typo or a rephrasing issue.
Line 125. One sentence ends with "in this paper" and the next begins with the same phrase. If possible, please rephrase the beginning of the following sentence.
Figure 1. The rectangles overlap, and the arrows are small and in some places not clearly visible. Please check Figure 1.
Line 516. A truncated sentence reads "Tuning protocol". Is this a header or something else?

Author Response

Please see attached.

Author Response File: Author Response.pdf
