Towards Accurate Identification of Antibiotic-Resistant Pathogens through the Ensemble of Multiple Preprocessing Methods Based on MALDI-TOF Spectra
Abstract
1. Introduction
2. Results
2.1. Peak Counts
2.2. Distribution of Benchmark Peaks
2.3. Performances of Models
2.4. Feature Selection on RF
3. Discussion
4. Materials and Methods
4.1. Bacterial Isolates
4.2. MALDI-TOF MS
4.3. Proposed Signal Preprocessing Method
4.4. Feature Extraction by a Two-Stage Alignment Method
4.5. Alignment and Featurization
4.6. Machine Learning Models
4.7. Evaluation Metrics
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
Method | #Features | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy | AUROC |
---|---|---|---|---|---|---|---|---|---|
(a) A. baumannii | | | | | | | | | |
FlexAnalysis | 101 | 243.1 ± 9.60 | 177.7 ± 6.82 | 18.1 ± 6.57 | 38.6 ± 9.45 | 0.8630 ± 0.0337 | 0.9075 ± 0.0337 | 0.8812 ± 0.0154 | 0.9460 ± 0.0098 |
MALDIquant | 117 | 234.3 ± 14.48 | 169.4 ± 8.15 | 26.4 ± 8.13 | 47.4 ± 14.23 | 0.8317 ± 0.0507 | 0.8652 ± 0.0415 | 0.8454 ± 0.0272 | 0.9218 ± 0.0152 |
CWT | 88 | 250.2 ± 8.23 | 179.6 ± 4.03 | 16.2 ± 3.94 | 31.5 ± 8.11 | 0.8882 ± 0.0288 | 0.9173 ± 0.0201 | 0.9001 ± 0.0158 | 0.9564 ± 0.0076 |
Ensemble method | 63 | 246.7 ± 6.96 | 185.0 ± 4.78 | 10.8 ± 4.96 | 35.0 ± 7.02 | 0.8758 ± 0.0249 | 0.9449 ± 0.0253 | 0.9041 ± 0.0137 | 0.9616 ± 0.0073 |
(b) A. nosocomialis | | | | | | | | | |
FlexAnalysis | 73 | 93.4 ± 5.89 | 92.3 ± 3.33 | 8.5 ± 3.44 | 15.5 ± 5.95 | 0.8577 ± 0.0545 | 0.9157 ± 0.0340 | 0.8855 ± 0.0223 | 0.9377 ± 0.0157 |
MALDIquant | 50 | 90.0 ± 2.67 | 90.4 ± 3.41 | 10.4 ± 3.17 | 18.9 ± 2.60 | 0.8264 ± 0.0240 | 0.8968 ± 0.0318 | 0.8603 ± 0.0175 | 0.9186 ± 0.0176 |
CWT | 32 | 92.7 ± 3.65 | 95.9 ± 1.97 | 4.9 ± 2.02 | 16.2 ± 3.82 | 0.8513 ± 0.0349 | 0.9514 ± 0.0201 | 0.8994 ± 0.0184 | 0.9417 ± 0.0160 |
Ensemble method | 20 | 91.9 ± 4.33 | 96.5 ± 3.87 | 4.3 ± 3.68 | 17.0 ± 4.47 | 0.8439 ± 0.0409 | 0.9573 ± 0.0367 | 0.8984 ± 0.0204 | 0.9318 ± 0.0099 |
(c) E. faecium | | | | | | | | | |
FlexAnalysis | 86 | 222.5 ± 11.65 | 215.5 ± 16.35 | 67.2 ± 16.49 | 53.0 ± 11.58 | 0.8076 ± 0.0421 | 0.7623 ± 0.0582 | 0.7847 ± 0.0198 | 0.8474 ± 0.0222 |
MALDIquant | 74 | 222.0 ± 6.88 | 206.2 ± 6.76 | 76.5 ± 7.06 | 53.5 ± 6.84 | 0.8058 ± 0.0248 | 0.7294 ± 0.0247 | 0.7671 ± 0.0163 | 0.8279 ± 0.0221 |
CWT | 85 | 225.5 ± 12.66 | 212.1 ± 9.45 | 70.6 ± 9.22 | 50.0 ± 12.58 | 0.8185 ± 0.0457 | 0.7502 ± 0.0328 | 0.7839 ± 0.0172 | 0.8493 ± 0.0197 |
Ensemble method | 62 | 219.2 ± 12.93 | 221.9 ± 8.61 | 60.8 ± 8.68 | 56.3 ± 12.93 | 0.7956 ± 0.0470 | 0.7849 ± 0.0307 | 0.7902 ± 0.0212 | 0.8565 ± 0.0203 |
(d) Group B Streptococci | | | | | | | | | |
FlexAnalysis | 33 | 152.5 ± 21.09 | 211.3 ± 25.54 | 75.7 ± 25.54 | 83.3 ± 20.91 | 0.6467 ± 0.0889 | 0.7362 ± 0.0890 | 0.6959 ± 0.0265 | 0.7442 ± 0.0272 |
MALDIquant | 78 | 151.1 ± 14.98 | 211 ± 12.36 | 76.0 ± 12.36 | 84.7 ± 14.83 | 0.6408 ± 0.0631 | 0.7352 ± 0.0431 | 0.6926 ± 0.0152 | 0.7414 ± 0.0226 |
CWT | 90 | 154.8 ± 14.86 | 211.9 ± 21.66 | 75.1 ± 21.66 | 81.0 ± 14.97 | 0.6565 ± 0.0633 | 0.7383 ± 0.0755 | 0.7014 ± 0.0191 | 0.7594 ± 0.0207 |
Ensemble method | 39 | 155.2 ± 12.66 | 208.3 ± 19.12 | 78.7 ± 19.12 | 80.6 ± 12.85 | 0.6582 ± 0.0542 | 0.7258 ± 0.0666 | 0.6953 ± 0.0222 | 0.7463 ± 0.0252 |
Species | #Features | TP | TN | FP | FN | Sensitivity | Specificity | Accuracy | AUROC |
---|---|---|---|---|---|---|---|---|---|
A. baumannii | 63 | 485 | 373 | 58 | 101 | 0.8276 | 0.8654 | 0.8437 | 0.9264 |
A. nosocomialis | 20 | 73 | 88 | 4 | 12 | 0.8588 | 0.9565 | 0.9096 | 0.9391 |
E. faecium | 62 | 927 | 767 | 229 | 234 | 0.7984 | 0.7701 | 0.7854 | 0.8569 |
Group B Streptococci | 39 | 330 | 461 | 200 | 137 | 0.7066 | 0.6974 | 0.7012 | 0.7611 |
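The sensitivity, specificity, and accuracy columns in the table above follow directly from the TP, TN, FP, and FN counts. The minimal Python sketch below recomputes them as a consistency check. It assumes that resistant isolates form the positive class, which is inferred from the counts rather than stated in the tables; the class name `ConfusionCounts` and the dictionary `independent_test` are hypothetical names introduced only for illustration. AUROC is not recomputed because it requires per-isolate prediction scores, which the table does not report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion-matrix counts (assumption: resistant = positive class)."""
    tp: int
    tn: int
    fp: int
    fn: int

    @property
    def sensitivity(self) -> float:
        # Fraction of resistant isolates that were predicted resistant.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Fraction of susceptible isolates that were predicted susceptible.
        return self.tn / (self.tn + self.fp)

    @property
    def accuracy(self) -> float:
        # Fraction of all isolates that were classified correctly.
        total = self.tp + self.tn + self.fp + self.fn
        return (self.tp + self.tn) / total


# Counts copied from the per-species table above.
independent_test = {
    "A. baumannii": ConfusionCounts(tp=485, tn=373, fp=58, fn=101),
    "A. nosocomialis": ConfusionCounts(tp=73, tn=88, fp=4, fn=12),
    "E. faecium": ConfusionCounts(tp=927, tn=767, fp=229, fn=234),
    "Group B Streptococci": ConfusionCounts(tp=330, tn=461, fp=200, fn=137),
}

for species, counts in independent_test.items():
    print(
        f"{species}: sensitivity={counts.sensitivity:.4f}, "
        f"specificity={counts.specificity:.4f}, "
        f"accuracy={counts.accuracy:.4f}"
    )
```

The same counts also give the class sizes: TP + FN is the number of resistant isolates and TN + FP is the number of susceptible isolates for each species.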
Species | Antibiotics | Resistant | Susceptible | Total |
---|---|---|---|---|
Training | | | | |
A. nosocomialis | CIP | 1089 | 1008 | 2097 |
A. baumannii | CIP | 2817 | 1958 | 4775 |
E. faecium | VA | 2755 | 2827 | 5582 |
Group B Streptococci | CC | 2358 | 2870 | 5228 |
Independent Testing | | | | |
A. nosocomialis | CIP | 85 | 92 | 177 |
A. baumannii | CIP | 586 | 431 | 1017 |
E. faecium | VA | 1161 | 996 | 2157 |
Group B Streptococci | CC | 467 | 661 | 1128 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chung, C.-R.; Wang, H.-Y.; Chou, P.-H.; Wu, L.-C.; Lu, J.-J.; Horng, J.-T.; Lee, T.-Y. Towards Accurate Identification of Antibiotic-Resistant Pathogens through the Ensemble of Multiple Preprocessing Methods Based on MALDI-TOF Spectra. Int. J. Mol. Sci. 2023, 24, 998. https://doi.org/10.3390/ijms24020998