High-Speed Videoendoscopy and Stiffness Mapping for AI-Assisted Glottic Lesion Differentiation
Simple Summary
Abstract
1. Introduction
2. Materials and Methods
2.1. Study Group
2.2. Equipment and Instrumentation
2.3. Endoscopic Procedure
2.4. Laryngotopographic (LTG) Analysis
2.4.1. Identify Fundamental Frequency (F0) and Harmonic Components
2.4.2. Map Spatial Vibratory Patterns
2.4.3. Quantify Oscillatory Behavior in a Standardized Manner
2.5. Kymographic Analysis
2.6. Statistical Analysis
2.7. Machine Learning Model
- Parameter standardization. For each parameter, the mean and standard deviation were calculated. Each value was then centered by subtracting the mean and scaled by dividing by the standard deviation, so that every parameter had mean = 0 and standard deviation = 1. As a result, parameters with a wide range of values had the same influence on the distance metric computed between data vectors as parameters with a narrow range.
- Selection of significant attributes. This step was performed using the sequential floating forward search (SFFS) algorithm [17]. SFFS explores the multidimensional parameter space sequentially, searching for the features that give the selected classifier its best performance. First, the single parameter that minimizes the classification error is found; further features are then added or removed depending on their effect on classification accuracy.
- Training of the final classifier model. The SFFS algorithm terminated when adding or removing further attributes no longer improved classification quality. The final feature subset was used to develop and test a classifier of multidimensional data vectors. The following SVM models were developed:
- A model classifying into all 3 classes simultaneously.
- A binary model discriminating between Norm and any organic lesion.
- A binary model discriminating between Malignant and Benign lesions.
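The standardization and feature-selection steps listed above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `standardize`, `sffs`, `score`, `max_size`, and `toy_score` are hypothetical, the population standard deviation is assumed (the text does not say which estimator was used), and `score` stands in for whatever quality measure is chosen, for example the cross-validated accuracy of an SVM trained on the candidate feature subset.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score every parameter (column) to mean 0 and standard deviation 1.

    Assumption: the population standard deviation is used.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    if any(sd == 0 for sd in stds):
        raise ValueError("a constant parameter cannot be standardized")
    return [[(x - m) / sd for x, m, sd in zip(row, means, stds)] for row in rows]


def sffs(n_features, score, max_size):
    """Sequential floating forward search over feature indices 0..n_features-1.

    `score(subset)` is a caller-supplied (hypothetical) function that returns
    a quality value for a list of feature indices; higher is better.
    Requires n_features >= 1 and max_size >= 1.
    Returns (best_score, best_subset).
    """
    selected = []
    best = {}  # subset size -> (score, subset) of the best subset seen at that size
    top = float("-inf")  # best score seen at any size
    while len(selected) < max_size:
        remaining = [f for f in range(n_features) if f not in selected]
        if not remaining:
            break
        # Forward step: add the feature that helps the most.
        add = max(remaining, key=lambda f: score(selected + [f]))
        candidate = selected + [add]
        s = score(candidate)
        if s <= top:
            break  # adding a feature no longer improves quality: terminate
        selected, top = candidate, s
        best[len(selected)] = (s, sorted(selected))
        # Floating step: drop features while that beats the best smaller subset.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: score([x for x in selected if x != f]))
            reduced = [x for x in selected if x != drop]
            s_red = score(reduced)
            if s_red <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (s_red, sorted(reduced))
            top = max(top, s_red)
    return max(best.values(), key=lambda t: t[0])


# Toy example: features 1 and 3 are informative, and every feature has a cost.
def toy_score(subset):
    return 10 * len({1, 3} & set(subset)) - len(subset)


print(sffs(5, toy_score, max_size=5))  # prints (18, [1, 3])
```

In practice, `score` would be evaluated on the standardized data, and the returned subset would then be used to train each of the three SVM models separately.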
3. Results
3.1. Univariate Analysis
3.2. Multidimensional Data Analysis
3.2.1. Classification to 3 Classes
3.2.2. Multidimensional Data Analysis–Classification to 2 Classes: Any Organic Lesion–Norm
3.2.3. Multidimensional Data Analysis–Classification to 2 Classes: Malignant Lesion–Benign Lesion
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
SAI | Stiffness Asymmetry Index |
HSV | High-Speed Videoendoscopy |
TPR | True Positive Rate |
FPR | False Positive Rate |
ROC | Receiver Operating Characteristic |
AUC | Area Under the Curve |
SVM | Support Vector Machine |
SFFS | Sequential Floating-Forward Search |
MCT | Multiple Comparisons Test |
Abbreviations of the parameters are described in Table 1 |
References
- Mody, M.D.; Rocco, J.W.; Yom, S.S.; Haddad, R.I.; Saba, N.F. Head and neck cancer. Lancet 2021, 398, 2289–2299.
- Laryngeal Cancer—Cancer Stat Facts. Available online: https://seer.cancer.gov/statfacts/html/laryn.html (accessed on 23 December 2024).
- Arthur, C.; Huangfu, H.; Li, M.; Dong, Z.; Asamoah, E.; Shaibu, Z.; Zhang, D.; Ja, L.; Obwoya, R.T.; Zhang, C.; et al. The Effectiveness of White Light Endoscopy Combined With Narrow Band Imaging Technique Using Ni Classification in Detecting Early Laryngeal Carcinoma in 114 Patients: Our Clinical Experience. J. Voice, 2023; Epub ahead of print.
- Piazza, C.; Del Bon, F.; Peretti, G.; Nicolai, P. Narrow band imaging in endoscopic evaluation of the larynx. Curr. Opin. Otolaryngol. Head Neck Surg. 2012, 20, 472–476.
- Arens, C.; Piazza, C.; Andrea, M.; Dikkers, F.G.; Gi, R.E.A.T.P.; Voigt-Zimmermann, S.; Peretti, G. Proposal for a descriptive guideline of vascular changes in lesions of the vocal folds by the committee on endoscopic laryngeal imaging of the European Laryngological Society. Eur. Arch. Oto-Rhino-Laryngol. 2016, 273, 1207–1214.
- Tanaka, S.; Hirano, M. Fiberscopic estimation of vocal fold stiffness in vivo using the sucking method. Arch. Otolaryngol. Head Neck Surg. 1990, 116, 721–724.
- Tseng, W.; Chang, C.; Yang, T.; Hsiao, T. Estimating vocal fold stiffness: Using the relationship between subglottic pressure and fundamental frequency of phonation as an analog. Clin. Otolaryngol. 2020, 45, 40–46.
- Döllinger, M.; Gómez, P.; Patel, R.R.; Alexiou, C.; Bohr, C.; Schützenberger, A. Biomechanical simulation of vocal fold dynamics in adults based on laryngeal high-speed videoendoscopy. PLoS ONE 2017, 12, e0187486.
- Yamauchi, A.; Yokonishi, H.; Imagawa, H.; Sakakibara, K.-I.; Nito, T.; Tayama, N.; Yamasoba, T. Quantification of Vocal Fold Vibration in Various Laryngeal Disorders Using High-Speed Digital Imaging. J. Voice 2016, 30, 205–214.
- Sakakibara, K.I.; Imagawa, H.; Kimura, M.; Yokonishi, H.; Tayama, N. Modal analysis of vocal fold vibrations using laryngotopography. In Proceedings of the 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Chiba, Japan, 26–30 September 2010; pp. 917–920.
- Kaluza, J.; Niebudek-Bogusz, E.; Malinowski, J.; Strumillo, P.; Pietruszewska, W. Assessment of Vocal Fold Stiffness by Means of High-Speed Videolaryngoscopy with Laryngotopography in Prediction of Early Glottic Malignancy: Preliminary Report. Cancers 2022, 14, 4697.
- Kałuża, J.; Strumiłło, P.; Niebudek-Bogusz, E.; Pietruszewska, W. Preprocessing of Laryngeal Images from High-Speed Videoendoscopy. In Information Technology in Biomedicine, Proceedings of the 9th International Conference, ITIB 2022, Kamień Śląski, Poland, 20–22 June 2022; Springer: Cham, Switzerland, 2022; pp. 132–142.
- Veltrup, R.; Kniesburges, S.; Semmler, M. Influence of Perspective Distortion in Laryngoscopy. J. Speech Lang. Hearing Res. 2023, 66, 3276–3289.
- Lechien, J.R.; Geneid, A.; Bohlender, J.E.; Cantarella, G.; Avellaneda, J.C.; Desuter, G.; Sjogren, E.V.; Finck, C.; Hans, S.; Hess, M.; et al. Consensus for voice quality assessment in clinical practice: Guidelines of the European Laryngological Society and Union of the European Phoniatricians. Eur. Arch. Oto-Rhino-Laryngol. 2023, 280, 5459–5473.
- Pietruszewska, W.; Just, M.; Morawska, J.; Malinowski, J.; Hoffman, J.; Racino, A.; Barańska, M.; Kowalczyk, M.; Niebudek-Bogusz, E. Comparative analysis of high-speed videolaryngoscopy images and sound data simultaneously acquired from rigid and flexible laryngoscope: A pilot study. Sci. Rep. 2021, 11, 20480.
- Vapnik, V.N. The Nature of Statistical Learning Theory; Springer: New York, NY, USA, 2000.
- Pudil, P.; Novovičová, J.; Kittler, J. Floating search methods in feature selection. Pattern Recognit. Lett. 1994, 15, 1119–1125.
- Yamauchi, A.; Imagawa, H.; Yokonishi, H.; Sakakibara, K.-I.; Tayama, N. Multivariate Analysis of Vocal Fold Vibrations on Various Voice Disorders Using High-Speed Digital Imaging. Appl. Sci. 2021, 11, 6284.
- Yamauchi, A.; Imagawa, H.; Yokonishi, H.; Sakakibara, K.-I.; Tayama, N. Multivariate Analysis of Vocal Fold Vibrations in Normal Speakers Using High-Speed Digital Imaging. J. Voice 2024, 38, 10–17.
- Tsuji, D.H.; Hachiya, A.; Dajer, M.E.; Ishikawa, C.C.; Takahashi, M.T.; Montagnoli, A.N. Improvement of Vocal Pathologies Diagnosis Using High-Speed Videolaryngoscopy. Int. Arch. Otorhinolaryngol. 2014, 18, 294–302.
- Krausert, C.R.; Olszewski, A.E.; Taylor, L.N.; McMurray, J.S.; Dailey, S.H.; Jiang, J.J. Mucosal wave measurement and visualization techniques. J. Voice 2011, 25, 395–405.
- Powell, M.E.; Deliyski, D.D.; Zeitels, S.M.; Burns, J.A.; Hillman, R.E.; Gerlach, T.T.; Mehta, D.D. Efficacy of Videostroboscopy and High-Speed Videoendoscopy to Obtain Functional Outcomes From Perioperative Ratings in Patients with Vocal Fold Mass Lesions. J. Voice 2020, 34, 769–782.
- Zacharias, S.R.C.; Deliyski, D.D.; Gerlach, T.T. Utility of Laryngeal High-Speed Videoendoscopy in Clinical Voice Assessment. J. Voice 2018, 32, 216.
- Malinowski, J.; Pietruszewska, W.; Kowalczyk, M.; Niebudek-Bogusz, E. Value of high-speed videoendoscopy as an auxiliary tool in differentiation of benign and malignant unilateral vocal lesions. J. Cancer Res. Clin. Oncol. 2024, 150, 10.
- Fehling, M.K.; Grosch, F.; Schuster, M.E.; Schick, B.; Lohscheller, J. Fully automatic segmentation of glottis and vocal folds in endoscopic laryngeal high-speed videos using a deep Convolutional LSTM Network. PLoS ONE 2020, 15, e0227791.
- Gandhi, S.; Bhatta, S.; Ganesuni, D.; Ghanpur, A.D.; Saindani, S.J. High-speed videolaryngoscopy in early glottic carcinoma patients following transoral CO2 LASER cordectomy. Eur. Arch. Otorhinolaryngol. 2021, 278, 1119–1127.
- Mehlum, C.S.; Rosenberg, T.; Groentved, A.M.; Dyrvig, A.-K.; Godballe, C. Can videostroboscopy predict early glottic cancer? A systematic review and meta-analysis. Laryngoscope 2016, 126, 2079–2084.
- Mehlum, C.S.; Kjaergaard, T.; Grøntved, Å.M.; Lyhne, N.M.; Jørkov, A.P.S.; Homøe, P.; Tvedskov, J.F.; Bork, K.H.; Möller, S.; Jørgensen, G.; et al. Value of pre- and intraoperative diagnostic methods in suspected glottic neoplasia. Eur. Arch. Otorhinolaryngol. 2020, 277, 207–215.
- Volgger, V.; Felicio, A.; Lohscheller, J.; Englhard, A.S.; Al-Muzaini, H.; Betz, C.S.; Schuster, M.E. Evaluation of the combined use of narrow band imaging and high-speed imaging to discriminate laryngeal lesions. Lasers Surg. Med. 2017, 49, 609–618.
- Zhao, W.; Zhi, J.; Zheng, H.; Du, J.; Wei, M.; Lin, P.; Li, L.; Wang, W. Construction of prediction model of early glottic cancer based on machine learning. Acta Otolaryngol. 2025, 145, 72–80.
- You, Z.; Han, B.; Shi, Z.; Zhao, M.; Du, S.; Yan, J.; Liu, H.; Hei, X.; Ren, X.; Yan, Y. Vocal cord leukoplakia classification using deep learning models in white light and narrow band imaging endoscopy images. Head Neck 2023, 45, 3129–3145.
- Wellenstein, D.J.; Woodburn, J.; Marres, H.A.M.; van den Broek, G.B. Detection of laryngeal carcinoma during endoscopy using artificial intelligence. Head Neck 2023, 45, 2217–2226.
- Syed, S.A.; Rashid, M.; Hussain, S. Meta-analysis of voice disorders databases and applied machine learning techniques. Math. Biosci. Eng. 2020, 17, 7958–7979.
No. | Parameter (Unit) | Description | No. | Parameter (Unit) | Description |
---|---|---|---|---|---|
1 | AmpMaxInd (%FL) | Index of maximum glottal gap amplitude | 20 | OQAvg_3/3 (%) | Open Quotient 1/3 anterior part of the glottis |
2 | AmpRMaxInd (%FL) | Index of maximum right vocal fold edge movement amplitude | 21 | AmplAsymAvg (%) | Average amplitude asymmetry |
3 | AmpLMaxInd (%FL) | Index of maximum left vocal fold edge movement amplitude | 22 | AmplAsymAvg_1/3 (%) | Average amplitude asymmetry 1/3 posterior part of the glottis |
4 | AmpCenter (%FL) | Index of net glottal gap amplitude | 23 | AmplAsymAvg_2/3 (%) | Average amplitude asymmetry 1/3 middle part of the glottis |
5 | AmpRCenter (%FL) | Index of net right vocal fold edge movement amplitude | 24 | AmplAsymAvg_3/3 (%) | Average amplitude asymmetry 1/3 anterior part of the glottis |
6 | AmpLCenter (%FL) | Index of net left vocal fold edge movement amplitude | 25 | AmplAsymWeighted (%) | Weighted amplitude asymmetry |
7 | AmpAvg (%FL) | Average glottal gap amplitude | 26 | PhaseAsymAvg (%) | Average phase asymmetry |
8 | AmpAvg_1/3 (%FL) | Glottal gap amplitude 1/3 posterior part of the glottis | 27 | PhaseAsymAvg_1/3 (%) | Average phase asymmetry 1/3 posterior part of the glottis |
9 | AmpAvg_2/3 (%FL) | Glottal gap amplitude 1/3 middle part of the glottis | 28 | PhaseAsymAvg_2/3 (%) | Average phase asymmetry 1/3 middle part of the glottis |
10 | AmpAvg_3/3 (%FL) | Glottal gap amplitude 1/3 anterior part of the glottis | 29 | PhaseAsymAvg_3/3 (%) | Average phase asymmetry 1/3 anterior part of the glottis |
11 | AmpLAvg (%FL) | Average left vocal fold amplitude | 30 | PhaseAsymWeighted (%) | Weighted average phase asymmetry |
12 | AmpRAvg (%FL) | Average right vocal fold amplitude | 31 | Rel2CommonAmplAvg (%) | Average relative to common amplitude ratio |
13 | Nonclosing (%FL) | Non-closing part of vocal folds | 32 | PhaseDiffAvg (°) | Average phase difference |
14 | Nonopening (%FL) | Non-opening part of vocal folds | 33 | PhaseDiffAvg_1/3 (°) | Average phase difference 1/3 posterior part of the glottis |
15 | EffectiveArea (%) | Glottal effective area | 34 | PhaseDiffAvg_2/3 (°) | Average phase difference 1/3 middle part of the glottis |
16 | RGGA (%) | Relative Glottal Gap Area | 35 | PhaseDiffAvg_3/3 (°) | Average phase difference 1/3 anterior part of the glottis |
17 | OQAvg (%) | Average Open Quotient | 36 | PhaseDiffWeighted (°) | Weighted phase difference |
18 | OQAvg_1/3 (%) | Open Quotient 1/3 posterior part of the glottis | 37 | AbsPhaseDiffAvg (°) | Average absolute phase difference |
19 | OQAvg_2/3 (%) | Open Quotient 1/3 middle part of the glottis | 38 | AbsPhaseDiffWeighted (°) | Weighted absolute phase difference |
Parameter | Groups | p-Value | Statistic |
---|---|---|---|
AbsPhaseDiffAvg | Benign vs. Malignant | 0.0096 | 544.5 |
AbsPhaseDiffAvg | Norm vs. Benign | 0.0002 | 169.0 |
AbsPhaseDiffAvg | Norm vs. Malignant | <0.0001 | 132.5 |
AbsPhaseDiffWeighted | Benign vs. Malignant | 0.0097 | 545.0 |
AbsPhaseDiffWeighted | Norm vs. Benign | 0.0004 | 178.5 |
AbsPhaseDiffWeighted | Norm vs. Malignant | <0.0001 | 143.0 |
AmplAsymAvg | Benign vs. Malignant | 0.0001 | 416.0 |
AmplAsymAvg | Norm vs. Benign | <0.0001 | 124.5 |
AmplAsymAvg | Norm vs. Malignant | <0.0001 | 62.0 |
AmplAsymAvg_2/3 | Benign vs. Malignant | 0.0010 | 505.0 |
AmplAsymAvg_2/3 | Norm vs. Benign | 0.0001 | 160.0 |
AmplAsymAvg_2/3 | Norm vs. Malignant | <0.0001 | 112.5 |
AmplAsymWeighted | Benign vs. Malignant | 0.0001 | 412.0 |
AmplAsymWeighted | Norm vs. Benign | <0.0001 | 93.0 |
AmplAsymWeighted | Norm vs. Malignant | <0.0001 | 47.0 |
SAI | Benign vs. Malignant | <0.0001 | 337.5 |
SAI | Norm vs. Benign | <0.0001 | 123.5 |
SAI | Norm vs. Malignant | <0.0001 | 34.0 |
No. | Parameter | Youden’s Index (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
---|---|---|---|---|
1 | SAI | 0.735 (0.612–0.882) | 0.832 (0.623–0.966) | 0.903 (0.737–1.0) |
2 | AmplAsymAvg | 0.715 (0.581–0.833) | 0.777 (0.662–0.903) | 0.938 (0.786–1.0) |
3 | AmplAsymAvg_2/3 | 0.633 (0.482–0.788) | 0.764 (0.5–0.911) | 0.869 (0.667–1.0) |
4 | AmplAsymWeighted | 0.595 (0.395–0.779) | 0.827 (0.697–0.952) | 0.921 (0.765–1.0) |
5 | AbsPhaseDiffAvg | 0.627 (0.435–0.8) | 0.808 (0.587–0.919) | 0.819 (0.636–1.0) |
6 | AbsPhaseDiffWeighted | 0.595 (0.395–0.779) | 0.783 (0.625–0.938) | 0.812 (0.571–1.0) |
No. | Parameter | Youden’s Index (95% CI) | Sensitivity (95% CI) | Specificity (95% CI) |
---|---|---|---|---|
1 | SAI | 0.545 (0.379–0.705) | 0.733 (0.54–0.913) | 0.812 (0.621–0.955) |
2 | AmplAsymAvg | 0.538 (0.362–0.705) | 0.763 (0.538–0.889) | 0.775 (0.634–0.941) |
3 | AmplAsymAvg_2/3 | 0.440 (0.259–0.606) | 0.624 (0.375–0.825) | 0.816 (0.629–0.974) |
4 | AmplAsymWeighted | 0.503 (0.334–0.667) | 0.725 (0.442–0.895) | 0.778 (0.605–0.976) |
5 | AbsPhaseDiffAvg | 0.366 (0.2–0.532) | 0.668 (0.341–0.923) | 0.698 (0.424–0.956) |
6 | AbsPhaseDiffWeighted | 0.369 (0.189–0.544) | 0.648 (0.355–0.895) | 0.72 (0.432–0.968) |
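Youden's index combines the two accuracy measures reported in the table above as J = sensitivity + specificity − 1. The short sketch below recomputes it for the first two rows of that table; the function name `youden_index` and the `rows` dictionary are illustrative only, and the calculation covers the point estimates, not the confidence intervals.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: 0 for a useless test, 1 for a perfect one."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) point estimates copied from the table above.
rows = {
    "SAI": (0.733, 0.812),
    "AmplAsymAvg": (0.763, 0.775),
}

for name, (sens, spec) in rows.items():
    print(name, round(youden_index(sens, spec), 3))
# prints:
# SAI 0.545
# AmplAsymAvg 0.538
```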
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pietrzak, M.M.; Kałuża-Olszewska, J.; Niebudek-Bogusz, E.; Klepaczko, A.; Pietruszewska, W. High-Speed Videoendoscopy and Stiffness Mapping for AI-Assisted Glottic Lesion Differentiation. Cancers 2025, 17, 1376. https://doi.org/10.3390/cancers17081376