Dynamic PSO-Optimized XGBoost–RFE with Cross-Domain Hierarchical Transfer: A Small-Sample Feature Selection Approach for Equipment Health Management
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper presents a novel methodology for feature selection in equipment health management (EHM), which is a multi-stage approach that integrates Principal Component Analysis (PCA) for initial dimensionality reduction, a dynamic Particle Swarm Optimization (PSO) algorithm to optimize XGBoost hyperparameters, and a Recursive Feature Elimination (RFE) framework for feature selection. A key contribution is the introduction of a cross-domain hierarchical transfer learning strategy to leverage knowledge from a data-rich source domain to a data-scarce target domain. The proposed model is validated on the Case Western Reserve University (CWRU) bearing dataset, which demonstrates significant improvements in classification accuracy, reduction in overfitting, and enhanced feature selection stability compared to traditional methods.
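For orientation, the staged pipeline summarized above (PCA reduction, a boosted-tree classifier, and RFE) can be sketched as follows. This is a minimal illustration on synthetic data, using scikit-learn's GradientBoostingClassifier as a stand-in for the PSO-tuned XGBoost model; it is not the authors' implementation, and all sizes and parameters are placeholders:

```python
# Minimal sketch of the pipeline's stages (PCA -> boosted trees -> RFE)
# on synthetic data; GradientBoostingClassifier stands in for XGBoost.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 30))            # 120 samples, 30 raw features
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # labels driven by two features

# Stage 1: PCA for initial dimensionality reduction.
X_red = PCA(n_components=10, random_state=0).fit_transform(X)

# Stages 2-3: a boosted-tree classifier wrapped in recursive feature
# elimination, which ranks and prunes components by importance.
est = GradientBoostingClassifier(n_estimators=50, random_state=0)
rfe = RFE(est, n_features_to_select=5).fit(X_red, y)

print(rfe.support_.sum())  # prints 5: the number of retained components
```

In the manuscript the classifier's hyperparameters are tuned by the dynamic PSO at each elimination round; here they are fixed for brevity.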
Here are my concerns:
1. The primary limitation of the study is its reliance on a single dataset (CWRU bearing dataset). While this is a standard benchmark, the performance of the proposed model on other EHM datasets with different characteristics is not validated. The authors should acknowledge this limitation and suggest that future work will involve testing the model on various datasets to establish its generalizability.
2. The authors conduct multiple comparative experiments to validate each component of their proposed model, including a comparison of the improved PSO with traditional PSO and grid search, as well as ablation studies that demonstrate the contribution of each element of the improved PSO. I appreciate this comprehensive experimental analysis; however, more PSO variants should be included as competitor algorithms, such as the competitive swarm optimizer and the self-adaptive competitive swarm optimizer.
3. The paper is well-written, but a few minor grammatical errors were noted (e.g., "Convergent algebra" in Table 2, which likely means "Convergence Iterations").
Overall, this paper is well-written and well-structured, and I think it can be accepted after minor revisions.
Author Response
Concern # 1: The primary limitation of the study is its reliance on a single dataset (CWRU bearing dataset). While this is a standard benchmark, the performance of the proposed model on other EHM datasets with different characteristics is not validated. The authors should acknowledge this limitation and suggest that future work will involve testing the model on various datasets to establish its generalizability.
Author response:
(1) Limitations discussion added to the "5. Conclusion" section
A new paragraph on research limitations was added after the existing content of the conclusion, explicitly acknowledging the limitation of relying on a single dataset.
(2) Dataset expansion plan added to the future-work part of "5. Conclusion"
Building on the existing "Future research can explore..." passage, we refined the specific directions for multi-dataset validation.
(3) Applicability note on the dataset added to the "4. Case Study" section
To keep the limitations discussion coherent, a transitional statement was added at the end of the case study section to lay the groundwork for the limitations discussion in the conclusion.
Concern # 2: The authors conduct multiple comparative experiments to validate each component of their proposed model, including a comparison of the improved PSO with traditional PSO and grid search, as well as ablation studies that demonstrate the contribution of each element of the improved PSO. I appreciate this comprehensive experimental analysis; however, more PSO variants should be included as competitor algorithms, such as the competitive swarm optimizer and the self-adaptive competitive swarm optimizer.
Author response:
PSO variant comparison experiments were added in Section 4.2.2, Comparison of Parameter Optimization Strategies.
(1) Supplementary explanation of the comparison algorithms
After the existing definitions of "traditional PSO" and "grid search", core-mechanism explanations of the Competitive Swarm Optimizer (CSO) and the Self-adaptive Competitive Swarm Optimizer (SACSO) were added, maintaining the same experimental control-variable logic as the original text.
(2) Updated experimental parameter settings
In the parameter configuration part of Section 4.1.2, Improved PSO Feature Selection, we supplemented the key parameter settings of CSO and SACSO to ensure experimental fairness.
(3) Expanded Table 2
Columns "CSO" and "SACSO" were added to the original Table 2, together with the corresponding experimental data.
(4) Supplementary result analysis
A comparative discussion of CSO, SACSO, and the improved PSO was added at the end of the original results-analysis paragraph, highlighting the advantages of the improved PSO.
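For readers unfamiliar with CSO, its pairwise-competition mechanism (particles are randomly paired; the loser of each pair learns from the winner and from the swarm mean, while the winner passes through unchanged) can be sketched as follows. This is an illustrative, self-contained toy implementation on a sphere function; the population size, iteration budget, and phi coefficient are placeholders, not the settings used in the paper:

```python
# Toy Competitive Swarm Optimizer (CSO): random pairwise competitions,
# losers updated toward winners and the swarm mean; winners unchanged.
import numpy as np

def cso(f, dim=5, pop=40, iters=100, phi=0.1, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(-5.0, 5.0, (pop, dim))  # positions
    V = np.zeros((pop, dim))                # velocities
    for _ in range(iters):
        mean = X.mean(axis=0)
        pairs = rng.permutation(pop).reshape(-1, 2)  # random pairings
        for a, b in pairs:
            win, lose = (a, b) if f(X[a]) < f(X[b]) else (b, a)
            r1, r2, r3 = rng.random((3, dim))
            # Loser learns from the winner and (weighted by phi) the mean.
            V[lose] = (r1 * V[lose]
                       + r2 * (X[win] - X[lose])
                       + phi * r3 * (mean - X[lose]))
            X[lose] = X[lose] + V[lose]
    return min(float(f(x)) for x in X)

best = cso(lambda x: float(np.sum(x ** 2)))
print(best)  # best sphere-function value found
```

SACSO additionally adapts coefficients such as phi during the run; that extension is omitted here for brevity.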
Concern # 3: The paper is well-written, but a few minor grammatical errors were noted (e.g., "Convergent algebra" in Table 2, which likely means "Convergence Iterations").
Author response:
Unified revision of table and text terminology
(1) "Convergent algebra" in Table 2 has been changed to "Convergence iterations".
(2) The corresponding descriptions in the main text were revised accordingly.
We thank the reviewer for these suggestions; the remaining directions will be pursued in future work.
Author Response File: Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript suggests a new algorithm for bearing diagnosis using vibration analysis. It uses transfer learning to address the problem of small dataset sample size. It combines a dynamic PSO-optimized XGBoost model with cross-domain transfer learning based on hierarchical transfer, and applies feature selection during the process. I do not understand on what dataset this study is demonstrated; I am not familiar with CWU.
- I do not understand what the CWU dataset is. Maybe I missed it, but did you add a reference to this dataset? Maybe you are using the CWRU dataset and not CWU? Please add more description of the dataset you are using.
- Please also add a quantitative summary of the dataset, including the number of samples, operating conditions, fault types, etc.
- Please also cover in the introduction conventional techniques for bearing diagnosis that do not use machine learning.
- What is really special about your new algorithm? Or is it just a combination of other known techniques?
- What are the crucial components of your algorithm? Please perform an ablation study on them.
- What is the data availability of the dataset you have used?
- Why did you not compare with more machine learning techniques for bearing diagnosis? I think you have used a very limited comparison.
Author Response
Concern # 1: I do not understand what the CWU dataset is. Maybe I missed it, but did you add a reference to this dataset? Maybe you are using the CWRU dataset and not CWU? Please add more description of the dataset you are using.
Author response:
(1) Name correction (unified throughout the text)
"CWU" was a typo; all instances of "CWU dataset" in the text have been revised to "CWRU dataset" (full name: Case Western Reserve University Bearing Dataset), including the Section 4 title and the body descriptions of the case study using the CWRU bearing dataset.
(2) Supplement to Detailed Dataset Description (added at the beginning of Section 4.1)
Add background information and references to the CWRU dataset in section 4.1 Experimental Data and Model Parameter Setting.
Concern # 2: Please also add a quantitative summary of the dataset, including the number of samples, operating conditions, fault types, etc.
Author response:
We have revised Section 4.1.
- Clarified quantitative information
We now specify, with concrete numerical values, the sample size, the operating-condition categories (with corresponding speeds), the fault types (with fault sizes), and the changes in data dimensionality, meeting the requirement for a quantitative summary.
- Correspondence with the experimental data
We clarified the sample-allocation logic for the 2-HP (source domain, 300 samples) and 3-HP (target domain, 50 samples) data used in this study, consistent with the small-sample scenario design.
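The source-to-target allocation described here (a data-rich 2-HP source domain and a data-scarce 3-HP target domain) can be illustrated with a minimal pre-train/fine-tune sketch. All data below are synthetic, and scikit-learn's SGDClassifier with partial_fit stands in for the paper's hierarchical transfer scheme:

```python
# Sketch of source->target transfer: pre-train on a large source set,
# then fine-tune the same model on a small, distribution-shifted target set.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(1)
# Synthetic source domain ("2-HP"): 300 labeled samples.
Xs = rng.normal(size=(300, 8))
ys = (Xs[:, 0] > 0.0).astype(int)
# Synthetic target domain ("3-HP"): only 50 samples, shifted distribution.
Xt = rng.normal(0.3, 1.0, size=(50, 8))
yt = (Xt[:, 0] > 0.3).astype(int)

clf = SGDClassifier(random_state=0)
for _ in range(5):                         # pre-train epochs on the source
    clf.partial_fit(Xs, ys, classes=[0, 1])
for _ in range(5):                         # fine-tune epochs on the target
    clf.partial_fit(Xt, yt)

print(clf.score(Xt, yt))  # target-domain accuracy after transfer
```

The point of the sketch is only the train-then-adapt ordering; the manuscript's hierarchical strategy operates on selected features rather than a linear model.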
Concern # 3: Please also cover in the introduction conventional techniques for bearing diagnosis that do not use machine learning.
Author response:
We have systematically introduced non-machine-learning methods such as time-domain analysis, frequency-domain analysis, time-frequency analysis, and oil/visual inspection, clarifying their principles and application scenarios; pointed out the shortcomings of traditional methods, such as reliance on expert experience and poor adaptability, to motivate the subsequent introduction of machine learning methods; and added relevant classic references so that the description of traditional techniques is evidence-based while remaining coherent with the paper's topic.
Concern # 4: What is really special about your new algorithm? Or is it just a combination of other known techniques?
Author response:
We have added clear explanations of the algorithm's core innovations in Section 1 (Introduction) and Section 3 (Health Feature Selection Model), emphasizing that it is not a simple combination of existing techniques but a deep integration and improvement tailored to small-sample equipment health management. We explain the coupling mechanism among the technical components (such as the dynamic guidance relationship between PSO and RFE), showing that the algorithm is an organic whole designed for a specific problem rather than a stack of isolated techniques. We highlight innovative designs, such as physically constrained feature alignment and small-sample overfitting-prevention strategies, which address challenges unique to EHM (small sample size, high dimensionality, and cross-domain differences) rather than transplanting generic techniques. Finally, we use the experimental results to demonstrate the performance gains brought by these designs, confirming an effect beyond a "simple combination".
Concern # 5: What are the crucial components of your algorithm? Please perform an ablation study on them.
Author response:
- A definition of the algorithm's key components was added at the beginning of Section 4.2.5.
- A targeted ablation plan for the key components was added to the original ablation experiment design, replacing the PSO-only ablation in Section 4.2.5 and extending it to all key components while keeping the experimental variable-control logic consistent.
- Analysis of the ablation results and of each key component's contribution was supplemented.
- A conclusion of the ablation experiment was added at the end of Section 4.2.5.
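The one-component-at-a-time protocol described above can be sketched generically. The component names below mirror the model's stages, but the scoring function and all numbers are illustrative placeholders, not the paper's results:

```python
# Generic ablation harness: disable one component at a time and measure
# the drop relative to the full configuration. Scores are placeholders.
components = ["pca", "dynamic_pso", "rfe", "transfer"]

def evaluate(enabled):
    # Placeholder scoring: in the actual study each configuration is
    # retrained and scored on held-out data; here every component simply
    # adds a fixed illustrative gain over a 0.60 baseline.
    gains = {"pca": 0.05, "dynamic_pso": 0.10, "rfe": 0.08, "transfer": 0.12}
    return 0.60 + sum(gains[c] for c in enabled)

full = evaluate(components)
for c in components:
    ablated = evaluate([x for x in components if x != c])
    print(f"without {c}: accuracy drop = {full - ablated:.2f}")
```

The useful property of this design is that each reported drop isolates one component's contribution while every other variable is held fixed.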
Concern # 6: What is the data availability of the dataset you have used?
Author response:
We have added the data-availability information of the dataset in Section 4.1, Experimental Data and Model Parameter Setting.
Concern # 7: Why did you not compare with more machine learning techniques for bearing diagnosis? I think you have used a very limited comparison.
Author response:
- In response to the comment that the scope of the machine learning comparison was limited, and following the mainstream technical routes in bearing fault diagnosis, we expanded the comparison from the original 3 techniques to 6 in the experimental design of Section 4.2.4, covering three categories (traditional machine learning, deep learning, and small-sample-specific models) to ensure the comprehensiveness and representativeness of the comparison.
- We added the key parameter settings for the newly added models.
- We expanded the experimental results table and supplemented the analysis of the results.
We thank the reviewer for these suggestions; the remaining directions will be pursued in future work.
Author Response File: Author Response.docx
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
No more comments.