**6. Conclusions**

This study proposed a vehicle interior noise multi-classification model based on the XGBoost method and onboard smartphone data. Based on the Shannon entropy, a 1-second time window was selected for data segmentation. Comparing the performance before and after balancing the training data showed that data balancing improves the recall of minority classes but reduces their precision. The feature importance analysis shows that features calculated from the spectrum of the Fourier transform and the first 12 MFCCs are the most important among all features. By comparing the results of the importance-based and mutual information-based methods, this study selected the 10 features with the highest importance scores to form the feature set, with which *F*1<sub>*wm*</sub> reached 0.91. A comparison between XGBoost and other commonly used classifiers then showed that the proposed XGBoost-based classification model computes faster while maintaining good performance. The case studies verified that the proposed multi-classification model has the potential to investigate the correlation between abnormal vehicle interior noise and the dynamic responses of the train. Moreover, the capacity of the model to monitor abnormal noise events and evaluate the effect of rail grinding was also demonstrated.

There are several directions for future research. A more detailed classification of vehicle interior noise could be developed based on specific track-vehicle conditions so that the model would be suitable for general cases. More experiments are also needed to explain the performance across different vehicles and track slabs. Another interesting option is to investigate the relationship between abnormal noise and wheel-rail contact conditions. Finally, the authors intend to set up a data collection system with high-quality sensors to obtain more accurate and reliable data.

**Author Contributions:** Conceptualization, P.W. and Q.H.; data curation, Y.W. and Q.H.; formal analysis, Y.W.; methodology, Y.W. and Q.W.; validation, Y.W. and Z.C.; writing—original draft preparation, Y.W.; writing—review and editing, Y.W. and Q.H. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by the National Natural Science Foundation of China, grant numbers 51878576 and U1934214, and the China Scholarship Council, file No. 201907000077.

**Acknowledgments:** The authors would like to thank Huajiang Ouyang, from the University of Liverpool, for his support when this study was being finished.

**Conflicts of Interest:** The authors declare no conflict of interest.
