DCBAN: A Dynamic Confidence Bayesian Adaptive Network for Reconstructing Visual Images from fMRI Signals
Round 1
Reviewer 1 Report (Previous Reviewer 3)
Comments and Suggestions for Authors
The authors have made significant revisions compared with the original version of the paper (Brainsci-3782921). However, several questions remain:
- Table 2: Does 1e-3 mean 10⁻³? If so, 10⁻³ should be used in the text. The same applies to other similar values in the text.
- The article is very long (42 pages) and should be shortened (it is not a review), for example by removing repetitive figures. After revision, the paper grew from 37 to 42 pages instead of shrinking. How could it be shortened? Fig. 7 is a 6x6 matrix of images in which each row demonstrates the same idea as the other rows. Therefore, only the first (or any other) row of Fig. 7 could be kept and all other rows deleted. Similar revisions could be made for Fig. 8 (6x4 images), Fig. 9 (5x3 images), and Fig. 10 (6x3 images). Moreover, Appendix A (pages 34–39) should be moved to a website, with its link given directly in the paper text.
Author Response
The main corrections in the paper and the responses to the reviewers’ comments are in reviewer1.pdf. The reviewers’ comments are in black font, our responses are shown in blue font, and their corresponding content in the new manuscript is marked in red.
Author Response File:
Author Response.pdf
Reviewer 2 Report (New Reviewer)
Comments and Suggestions for Authors
The authors revised the manuscript accordingly. So, it can be published.
Author Response
Comments: The authors revised the manuscript accordingly. So, it can be published.
Response: We sincerely appreciate your positive feedback and your acknowledgment that no further revisions are needed for the current version. The constructive comments you provided throughout the review process have greatly helped us improve the quality and clarity of the manuscript. Once again, we thank you for your time and support.
Reviewer 3 Report (New Reviewer)
Comments and Suggestions for Authors
This paper introduces DCBAN (Dynamic Confidence Bayesian Adaptive Network), a framework for reconstructing visual images from fMRI signals. The model integrates three core innovations: (1) Deep Nested Singular Value Decomposition (DeepSVD) to embed low-rank constraints for fine-grained feature extraction, (2) Bayesian Adaptive Fractional Ridge Regression (BAFRR) for adaptive regularization and enhanced generalization, and (3) a Dynamic Confidence Adaptive Diffusion Model (DCAF) that adjusts semantic injection strength using a confidence network and time decay strategy. Applied to the Natural Scenes Dataset (NSD), DCBAN achieves superior performance in both structural and semantic fidelity, outperforming previous state-of-the-art methods in PixCorr, Inception, and CLIP metrics.
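For readers unfamiliar with the PixCorr metric named above, the following is a minimal sketch, assuming PixCorr is computed as the Pearson correlation between the flattened pixel values of a reconstruction and its ground-truth image. The function name `pixcorr`, the handling of constant images, and the toy pixel lists are illustrative assumptions, not the authors' evaluation code.

```python
import math


def pixcorr(recon, target):
    """Pearson correlation between two flattened images of equal size."""
    if len(recon) != len(target) or not recon:
        raise ValueError("images must be non-empty and of the same size")
    n = len(recon)
    mean_r = sum(recon) / n
    mean_t = sum(target) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(recon, target))
    var_r = sum((r - mean_r) ** 2 for r in recon)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_r == 0 or var_t == 0:
        # Assumption: correlation is undefined for a constant image; report 0.
        return 0.0
    return cov / math.sqrt(var_r * var_t)


# Toy flattened "images" (hypothetical values in [0, 1]).
target = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
scaled = [0.5 * p + 0.1 for p in target]   # same structure, lower contrast
inverted = [1.0 - p for p in target]       # structure flipped

print(pixcorr(scaled, target), pixcorr(inverted, target))
```

Because the correlation ignores overall brightness and contrast, a metric of this kind rewards matching spatial structure rather than matching absolute pixel intensities.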
With the inclusion of the suggested analyses and minor editorial corrections, this work has the potential to make a meaningful contribution to the field of neural decoding and fMRI-based image reconstruction.
The following comments should be considered in the revised manuscript.
Comments to the Authors:
- Please ensure that each figure and table is self-explanatory, with sufficiently descriptive captions and clear labeling of axes, metrics, and experimental conditions. This will improve readability and help the reader understand the results without referring back to the main text.
- Include an explicit ablation study to quantify the individual contributions of the DeepSVD, BAFRR, and DCAF modules.
- Provide qualitative comparisons (reconstructed images) alongside ground truth and baseline methods to illustrate the visual improvements more clearly.
- Clarify the computational complexity and training time of DCBAN relative to prior models to assess scalability and practical feasibility.
- Discuss potential limitations and generalization challenges when applying DCBAN to other datasets or non-natural visual stimuli.
- There are minor typographical and formatting inconsistencies, including instances of mixed text color. These are observed around lines 280, 340, and 510, among others. Please review the manuscript carefully to ensure uniform formatting and correction of typographical errors.
- The provided link for reproducibility verification is not accessible.
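The ablation study requested in the comments above could be organized as in the following sketch, which only enumerates which modules are switched on in each run. The module names come from the manuscript; the helper names `ablation_configs` and `leave_one_out` are hypothetical, and no training or evaluation is performed here.

```python
from itertools import product

# Module names as used in the manuscript.
MODULES = ("DeepSVD", "BAFRR", "DCAF")


def ablation_configs(modules=MODULES):
    """Every on/off combination of the modules, as a list of dicts."""
    return [
        dict(zip(modules, flags))
        for flags in product((False, True), repeat=len(modules))
    ]


def leave_one_out(modules=MODULES):
    """The cheaper variant: the full model minus exactly one module."""
    return [{m: (m != dropped) for m in modules} for dropped in modules]


configs = ablation_configs()
for cfg in configs:
    enabled = [name for name, on in cfg.items() if on]
    print(enabled if enabled else ["baseline"])
```

The full grid has 2^3 = 8 runs and also exposes interactions between modules, whereas the leave-one-out variant needs only three runs in addition to the full model.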
Author Response
The main corrections in the paper and the responses to the reviewers’ comments are in reviewer3.pdf. The reviewers’ comments are in black font, our responses are shown in blue font, and their corresponding content in the new manuscript is marked in red.
Author Response File:
Author Response.pdf
This manuscript is a resubmission of an earlier submission. The following is a list of the peer review reports and author responses from that submission.
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The introduction is too long, and it can be reduced by only highlighting the motivation of the study.
More experimental results should be reported to understand the contribution of different components of the proposed model in the reconstruction. Similar to the statistics in Table 2, a visual comparison is required.
Clear details about the dataset should be given, such as how it is split into cross-validation and testing sets, and how you ensured there was no data leakage.
Limitations and future work should be discussed.
Author Response
Details can be found in the attachment.
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Review Report for MDPI Brain Sciences
(DCBAN: A Dynamic Confidence Bayesian Adaptive Network for Reconstructing Visual Images from fMRI Signals)
1. This study develops the Dynamic Confidence Bayesian Adaptive Network model, which aims to address structural accuracy, model generalization, and visual naturalness in fMRI-based visual image reconstruction. In experiments on the Natural Scenes Dataset, the authors report the highest structural and semantic reconstruction performance in the field, with a significant improvement over existing methods.
2. The introduction section discusses reconstructing visual images from fMRI signals, the importance of the topic, its place in the literature, studies related to traditional machine learning-based methods and deep learning-based methods, studies based on generative adversarial networks, and the study's main contributions to the literature. This section provides a sufficient overview of the importance of the topic, the study's place in the literature, and its main contributions. However, to further highlight this study in relation to the literature, a literature table containing specific columns such as "dataset, proposed method, originality, pros and cons, results" should be added for the studies in the literature. The study should then be linked to this newly added table, detailing the gaps in the literature that it addresses.
3. A detailed examination of the Natural Scenes Dataset used in the study reveals that it is sufficient in terms of type, quantity, and content. However, to ensure the integrity of the article, it would be more appropriate to give a fuller description of the dataset under the "Materials and Methods" heading rather than the "Results" heading.
4. Analyzing the architectural details of the Dynamic Confidence Bayesian Adaptive Network model proposed in the study reveals that it consists of feature extraction and feature decoding sections, as well as semantic feature and text feature prediction models. This proposed model fundamentally possesses a certain level of originality. However, although there are many alternatives to principal component analysis and deep nested singular value decomposition, particularly those used in the feature extraction section, more detail should be provided to explain why these were chosen and/or whether different experiments were conducted.
5. The hardware used in the study is adequately described. However, it is recommended to include a table detailing the types, quantities, and reasons for choosing hyperparameters. Furthermore, more in-depth explanations are recommended regarding the programming language, toolbox, and frameworks used, as well as the algorithm and/or pseudocode associated with the proposed model.
6. The results obtained in relation to the proposed model in this study are sufficient when compared to similar studies in the literature and, in this particular study, when considered in terms of controllability of the results. The results obtained and their comparison with the literature demonstrate the quality of the study.
In conclusion, the study is both interesting and has a high potential to contribute to the literature. However, all the sections mentioned above need to be fully addressed.
Author Response
Details can be found in the attachment.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This study proposes a Dynamic Confidence Bayesian Adaptive Network (DCBAN). In this network model, deep nested Singular Value Decomposition is introduced to embed low-rank constraints into the deep learning model layers for fine-grained feature extraction, improving structural fidelity. The proposed Bayesian Adaptive Fractional Ridge Regression module, based on singular value space, dynamically adjusts the regularization parameters, enhancing the decoder's generalization ability under complex stimulus conditions. The constructed Dynamic Confidence Adaptive Diffusion Model module incorporates a confidence network and time decay strategy, dynamically adjusting the semantic injection strength during the generation phase, further enhancing the details and naturalness of the generated images.
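As background for the ridge regression module summarized above, the following is a minimal sketch of plain (non-Bayesian) fractional ridge regression carried out in singular value space. It is not the authors' BAFRR module. It assumes the design matrix has already been decomposed as X = U S Vᵀ, that `s` holds the strictly positive singular values, and that `z` holds the targets rotated into that basis (Uᵀy); the function names are illustrative.

```python
import math


def shrunk_norm(s, z, alpha):
    """Norm of the ridge solution expressed in the singular-vector basis.

    Each coefficient is s_i * z_i / (s_i**2 + alpha); alpha = 0 gives the
    ordinary least-squares solution. Rotating back by V preserves the norm.
    """
    return math.sqrt(
        sum((si * zi / (si * si + alpha)) ** 2 for si, zi in zip(s, z))
    )


def alpha_for_fraction(s, z, fraction, iterations=100):
    """Find alpha so that ||beta_ridge|| / ||beta_ols|| equals `fraction`."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    if any(si <= 0 for si in s):
        raise ValueError("singular values must be strictly positive")
    ols_norm = shrunk_norm(s, z, 0.0)
    if ols_norm == 0.0:
        raise ValueError("targets are orthogonal to the design; nothing to fit")
    if fraction == 1.0:
        return 0.0
    # The norm decreases monotonically in alpha, so bracket and then bisect.
    lo, hi = 0.0, 1.0
    while shrunk_norm(s, z, hi) / ols_norm > fraction:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if shrunk_norm(s, z, mid) / ols_norm > fraction:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical singular values and rotated targets.
s = [3.0, 1.0, 0.5]
z = [1.0, 1.0, 1.0]
alpha = alpha_for_fraction(s, z, 0.5)
print(alpha, shrunk_norm(s, z, alpha) / shrunk_norm(s, z, 0.0))
```

Parameterizing by the fraction rather than by alpha directly makes the regularization strength comparable across targets with very different scales, which is the usual motivation for the fractional formulation.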
Regrettably, the paper was very poorly proofread and formatted before submission.
The main shortcomings and omissions of the paper are the following:
- At the end of Section 1 (Introduction), the subsequent sections of the paper should be briefly outlined.
- All formulae taken from other authors must be accompanied by the corresponding citations, for example Equations (1) and (5).
- Equation (1): What does ||…|| mean?
- The section numbering is thoroughly confused. For example, there are two Sections 2.1.1 (Lines 385 and 489), two Sections 2.2.2 (Lines 466 and 555), and two Sections 3.1 (Lines 476 and 729). Meanwhile, Section 2.2.1 is missing.
- (Line 500): What does []52 mean? The paper includes only 49 references.
- Equation (7) is missing, while there are two Equations (6).
- (Line 845): The designation “e-8” is incorrect.
- Fig. 11 is missing.
- Figs. 10–13 show different (not the same) pictures for the various methods. How can the comparative quality of these different pictures be assessed and the advantage of a particular method established?
- (Line 1127): “Table 2. This is a table. Tables should be placed in the main text near to the first time they are cited.” This is a very strange phrase for a table title.
- Section 4 (Discussion) is very short and reads more like Section 5 (Conclusions); the text of Section 3 (Results) should be divided between the Results and Discussion sections.
- Section 5 (Conclusions) is missing from the paper.
- The list of references should be formatted according to the MDPI rules, and every paper must be accompanied by its DOI.
- The paper is very long (37 pages) and should be shortened (it is not a review), for example by cutting figures of the same type.
Author Response
Details can be found in the attachment.
Author Response File:
Author Response.pdf
