Open Access Article
Clinical Validation of a Computed Tomography Image-Based Machine Learning Model for Segmentation and Quantification of Shoulder Muscles
by
Hamidreza Rajabzadeh-Oghaz, Josie Elwell, Bradley Schoch, William Aibinder, Bruno Gobbato, Daniel Wessell, Vikas Kumar and Christopher P. Roche
Algorithms 2025, 18(7), 432; https://doi.org/10.3390/a18070432 (registering DOI) - 14 Jul 2025
Abstract
Introduction: We developed a computed tomography (CT)-based tool designed for automated segmentation of deltoid muscles, enabling quantification of radiomic features and muscle fatty infiltration. Prior to use in a clinical setting, this machine learning (ML)-based segmentation algorithm requires rigorous validation. The aim of this study is to conduct shoulder expert validation of a novel deltoid ML auto-segmentation and quantification tool.
Materials and Methods: A SwinUnetR-based ML model trained on labeled CT scans was validated by three expert shoulder surgeons on 32 unique patients. The validation evaluated the quality of the auto-segmented deltoid images. Specifically, each of the three surgeons reviewed the auto-segmented masks relative to the CT images, rated the masks for clinical acceptance, and corrected the ML-generated deltoid mask if it did not contain the full deltoid muscle or if it included any tissue other than the deltoid. Non-inferiority of the ML model was assessed by comparing the differences between ML-generated and surgeon-corrected deltoid masks against the inter-surgeon variation in metrics such as volume and fatty infiltration.
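As a rough illustration of the mask-derived metrics named above, the following minimal sketch computes deltoid volume and a fatty infiltration fraction from a binary mask and a CT volume in Hounsfield units (HU). The abstract does not state how fatty infiltration is defined, so the HU fat window, the function names, and the array layout used here are all assumptions for illustration, not the authors' method.

```python
import numpy as np

def mask_volume_ml(mask, spacing_mm):
    """Volume of a binary mask in millilitres, given voxel spacing in mm."""
    voxel_mm3 = float(np.prod(spacing_mm))          # volume of one voxel in mm^3
    return float(mask.astype(bool).sum()) * voxel_mm3 / 1000.0

def fat_fraction(ct_hu, mask, fat_range_hu=(-190.0, -30.0)):
    """Fraction of masked voxels whose HU lies in an ASSUMED fat window.

    The (-190, -30) HU window is a hypothetical default, not taken from the study.
    """
    values = ct_hu[mask.astype(bool)]
    if values.size == 0:
        return 0.0
    lo, hi = fat_range_hu
    return float(np.mean((values >= lo) & (values <= hi)))

# Toy example (synthetic data, not patient data): a 4x4x4 mask at 1 mm spacing,
# where half of the voxels have fat-like HU and half have muscle-like HU.
mask = np.ones((4, 4, 4), dtype=np.uint8)
ct = np.full((4, 4, 4), 50.0)
ct[:2] = -100.0

volume_ml = mask_volume_ml(mask, (1.0, 1.0, 1.0))
fat = fat_fraction(ct, mask)
```

The same two functions could be applied to an ML-generated mask and to each surgeon-corrected mask, so that the resulting measurements can be compared pairwise.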
Results: The results of our expert shoulder surgeon validation demonstrate that 97% of ML-generated deltoid masks were clinically acceptable. Only two of the ML-generated deltoid masks required major corrections, and only one was deemed clinically unacceptable. These corrections had little impact on the deltoid measurements, as the median error in the volume and fatty infiltration measurements was <1% between the ML-generated deltoid masks and the surgeon-corrected deltoid masks. The non-inferiority analysis demonstrates no significant difference between the ML-generated and surgeon-corrected masks relative to inter-surgeon variation.
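To make the "median error" figure concrete, here is a minimal sketch of how a median percent error between ML-derived and surgeon-corrected measurements could be computed across patients. The function name and the sample numbers are hypothetical and purely illustrative; they are not study data, and the authors' exact error definition is not given in the abstract.

```python
import numpy as np

def median_percent_error(ml_values, corrected_values):
    """Median of per-patient absolute percent error, using corrected values as reference."""
    ml = np.asarray(ml_values, dtype=float)
    ref = np.asarray(corrected_values, dtype=float)
    per_patient = 100.0 * np.abs(ml - ref) / np.abs(ref)
    return float(np.median(per_patient))

# Illustrative volumes in mL for three made-up patients (NOT study data).
ml_volumes = [100.0, 102.0, 99.0]
corrected_volumes = [100.0, 100.0, 100.0]

error_pct = median_percent_error(ml_volumes, corrected_volumes)
```

Using the median rather than the mean keeps a small number of heavily corrected masks from dominating the summary, which fits a setting where only a few masks needed major corrections.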
Conclusions: Shoulder expert validation of this CT image analysis tool demonstrates clinically acceptable performance for deltoid auto-segmentation, with no significant differences observed between deltoid image-based measurements derived from the ML-generated masks and those corrected by surgeons. These findings suggest that this CT image analysis tool has the potential to reliably quantify deltoid muscle size, shape, and quality. Incorporating these CT image-based measurements into the pre-operative planning process may facilitate more personalized treatment decision making and help orthopedic surgeons make more evidence-based clinical decisions.