Review

Deep Learning Techniques and Imaging in Otorhinolaryngology—A State-of-the-Art Review

by Christos Tsilivigkos 1,*, Michail Athanasopoulos 2, Riccardo di Micco 3, Aris Giotakis 1, Nicholas S. Mastronikolis 2, Francesk Mulita 4,*, Georgios-Ioannis Verras 4, Ioannis Maroulis 4 and Evangelos Giotakis 1
1 1st Department of Otolaryngology, National and Kapodistrian University of Athens, Hippocrateion Hospital, 115 27 Athens, Greece
2 Department of Otolaryngology, University Hospital of Patras, 265 04 Patras, Greece
3 Department of Otolaryngology and Head and Neck Surgery, Medical School of Hannover, 30625 Hannover, Germany
4 Department of Surgery, University Hospital of Patras, 265 04 Patras, Greece
* Authors to whom correspondence should be addressed.
J. Clin. Med. 2023, 12(22), 6973; https://doi.org/10.3390/jcm12226973
Submission received: 14 October 2023 / Revised: 2 November 2023 / Accepted: 6 November 2023 / Published: 8 November 2023
(This article belongs to the Section Otolaryngology)

Abstract:
Over the last decades, the field of medicine has witnessed significant progress in artificial intelligence (AI), the Internet of Medical Things (IoMT), and deep learning (DL) systems. Otorhinolaryngology, and imaging in its various subspecialties, has not remained untouched by this transformative trend. As the medical landscape evolves, the integration of these technologies becomes imperative in augmenting patient care, fostering innovation, and actively participating in the ever-evolving synergy between computer vision techniques in otorhinolaryngology and AI. To that end, we conducted a thorough search on MEDLINE for papers published until June 2023, utilizing the keywords ‘otorhinolaryngology’, ‘imaging’, ‘computer vision’, ‘artificial intelligence’, and ‘deep learning’, and simultaneously performed a manual search of the reference sections of the included articles. Our search culminated in the retrieval of 121 related articles, which were subsequently subdivided into the following categories: imaging in head and neck, otology, and rhinology. Our objective is to provide a comprehensive introduction to the burgeoning field of deep learning algorithms in otorhinolaryngologic imaging, tailored for both experienced specialists and aspiring residents.

1. Introduction

Artificial intelligence (AI) refers to the simulation of human intelligence in computer systems. It involves the development of algorithms and models that enable machines to perform tasks that typically require human intelligence, such as problem-solving, learning from experience, recognizing patterns, and making decisions. AI encompasses various subfields, including machine learning, natural language processing, computer vision, and robotics, all aimed at creating systems that can mimic human cognitive functions and behaviors [1].
These new technologies have been evolving during the last decades in all areas of medicine, including otorhinolaryngology, but it was the COVID-19 pandemic that led to the widespread adoption of such novel tools. During that time, AI applications assisted in raising awareness regarding the health and safety of both patients and healthcare practitioners and in driving behavioral changes. At the same time, more sophisticated and accurate algorithms were developed [2].
The next contribution that complex AI systems promise is the Internet of Medical Things (IoMT). The recent COVID-19 pandemic, but also the demands of everyday life, can make visits to healthcare facilities for minor health issues inconvenient. An IoMT system, comprising different medical devices interconnected through the internet, facilitates medical monitoring [3]. Basic IoMT architectures require the acquisition of patient medical information through smart sensors embedded in wearable devices. These devices are interconnected through a body sensor network (BSN) or a wireless sensor network (WSN), and the collected data are transmitted via the internet to the next stage, where analysis and data evaluation take place through sophisticated AI algorithms. The final step is medical intervention in case of a possible serious medical issue [4].
Machine learning techniques constitute a large category of AI models. These systems are granted the ability to learn and enhance themselves through experience, gradually becoming adept at performing specific tasks, although human involvement remains necessary during their training phase. A categorization of AI systems is depicted in Figure 1.
Novel deep learning (DL) systems, a subset of machine learning models, employ intricate algorithms and neural networks with multiple complex layers for training, often without requiring significant direct human input [5,6]. The tasks that these algorithms undertake include, in general, intricate computations [7], predictions [8], and repetitive analytical activities [9]. The combination of computer vision with DL applications offers the capability to manage extensive medical image datasets, enabling precise and effective diagnosis. Additionally, it has the potential to mitigate the considerable intra- and inter-observer variability that can compromise the reliability of clinical assessments [10].
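For readers unfamiliar with how such multilayer networks are expressed in code, the following minimal sketch, written by us for illustration in PyTorch rather than drawn from any cited study, defines a toy convolutional classifier; the layer sizes, grayscale input, and two-class output are illustrative assumptions:

```python
# A minimal sketch of the kind of convolutional neural network used for
# medical image classification. Architecture and sizes are illustrative.
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # grayscale input, e.g. a CT slice
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, n_classes),  # e.g. benign vs. malignant
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

model = TinyCNN()
logits = model(torch.randn(4, 1, 128, 128))  # batch of 4 single-channel images
print(logits.shape)                          # torch.Size([4, 2])
```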
Deep learning systems, as revolutionary tools in the field of medical imaging, have significantly enhanced diagnostic accuracy and efficiency. These systems employ intricate neural networks to autonomously analyze complex medical images of the head and neck area, such as CT scans, MRIs, PET scans, and ultrasound (US) images.
This wide area of available techniques offers several possibilities, spanning from improving the quality of medical images and segmenting specific structures or lesions to detecting anomalies. Their ability to detect patterns and abnormalities within images has led to the early and precise identification of various conditions, including both benign and malignant diseases. Moreover, deep learning algorithms continually refine their performance through exposure to vast amounts of medical data, making them increasingly adept at recognizing subtle variations that might escape human observation. As a result, these systems hold the potential to revolutionize diagnostics and decision-making in otorhinolaryngology, offering ear, nose, and throat (ENT) specialists invaluable support in delivering timely and accurate diagnoses to improve patient outcomes.
The future significance of AI in the practice of medicine, and specifically in our specialty, is undisputed, and it is therefore now necessary for otolaryngologists to be familiar with the concepts and existing techniques in computer vision and DL algorithms. The goal of this narrative review is to serve as an introduction to the domain of these models for specialists and residents in ENT medicine. To our knowledge, this is the first paper addressing this issue in its entirety in the form of a review.

2. Head and Neck

2.1. Head and Neck Imaging

Head and neck surgery relies heavily on imaging, which is often a prerequisite before any further management. Different techniques offer significant advantages in both disease diagnosis and follow-up. Over the last decades, computed tomography (CT) and magnetic resonance imaging (MRI) have usually been used in combination in a large variety of medical conditions to acquire both bone and soft tissue information.
Lately, deep learning algorithms have emerged that enable the conversion of one imaging modality to another. For example, bone MRI sequences enable a subsequent MRI-to-CT reconstruction, avoiding exposure to ionizing radiation and aiding non-experts in diagnosis at the same time [11]. A combination of two generative adversarial networks has also been implemented to generate accurate synthetic CT images from MRI scans [12]. On the other hand, non-contrast CT scans can be converted to PET-like images with generative models, eliminating the need for radioactive tracers. The generated PET images demonstrate comparable accuracy to actual FDG-PET images in predicting clinical outcomes [13]. It seems rational to hypothesize that such deep learning pipelines can transform head and neck imaging into a one-step procedure in the future.
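To make the adversarial training idea behind such modality conversion concrete, below is a deliberately compressed sketch, assuming toy networks and paired MRI/CT patches, of a single generator/discriminator update in PyTorch; none of the shapes, losses, or hyperparameters come from the cited studies:

```python
# Compressed, illustrative sketch (not the cited architectures) of the
# adversarial objective behind MRI-to-CT synthesis: a generator G maps MRI
# patches to CT-like patches while a discriminator D learns to tell them
# from real CT. All shapes, losses, and hyperparameters are assumptions.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                  nn.Conv2d(16, 1, 3, padding=1))           # MRI -> synthetic CT
D = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                  nn.Linear(16, 1))                          # real/fake logit
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

mri, real_ct = torch.randn(4, 1, 64, 64), torch.randn(4, 1, 64, 64)  # paired patches

# Discriminator step: push real CT toward "real" (1), synthetic CT toward "fake" (0).
fake_ct = G(mri).detach()
loss_d = bce(D(real_ct), torch.ones(4, 1)) + bce(D(fake_ct), torch.zeros(4, 1))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator while staying close to the paired CT.
synth = G(mri)
loss_g = bce(D(synth), torch.ones(4, 1)) + nn.functional.l1_loss(synth, real_ct)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```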
Next, convolutional neural networks (CNNs) are believed to exhibit superior performance compared to traditional radiomic frameworks regarding their ability to detect image patterns, often undetectable by the latter, while systems such as ultra-high-resolution CT with a DL-based image reconstruction engine offer significant amelioration in subjective and objective image quality, with a higher signal-to-noise ratio, lower noise, and lower radiation exposure [14].
DL techniques for the analysis of medical images allow the incorporation of both qualitative and quantitative imaging characteristics to create prediction models characterized by exceptional diagnostic accuracy. These principles have been applied to head and neck squamous cell carcinoma (HNSCC) imaging in general, but also to specific HNSCC subtypes. Notably, in the imaging of oral and oropharyngeal cancer, FDG-PET/CT scans can be processed by DL systems to predict local treatment outcomes [15], disease-free survival with high sensitivity and specificity [16], and overall survival [17], and they can even assist in differentiating human papillomavirus-positive from human papillomavirus-negative oropharyngeal carcinomas [18].
At the same time, progress in computer vision and deep learning provides potent techniques for creating supplementary tools capable of automatically screening the oral cavity. These cost-effective and non-invasive tools can offer real-time insights for healthcare practitioners during patient assessments and can also facilitate self-examinations for individuals. The automated diagnosis of oral cancer through images is predominantly focused on the utilization of specialized imaging technologies, namely optical coherence tomography [19,20], hyperspectral imaging [21], and autofluorescence imaging [22], but also white-light photographs [23]. Such DL techniques can come in the form of mHealth applications, assisting in oral and oropharyngeal lesion detection in both hospitals and resource-limited areas, and enabling telediagnosis [24]. Finally, systems offering a real-time estimation of cancer risk and biopsy assistance maps on the oral mucosa are very promising [25].
Furthermore, diseases of the nasopharynx have been an area of focus for DL system developers in recent years. From MRI-based applications focusing on the differential diagnosis between benign and malignant nasopharyngeal diseases [26,27] to the automatic detection of pathological lymph nodes and assessment of the peritumoral area in nasopharyngeal carcinoma, DL algorithms can significantly assist in disease prognosis and treatment planning [28]. Interestingly, peritumoral information, especially the largest areas of tumor invasion, has been shown to provide valuable insights for distant metastasis prediction in individuals with nasopharyngeal carcinoma [29].
Imaging of the salivary glands constitutes another significant challenge for radiologists and otolaryngologists, who have many different imaging modalities at their disposal. Specialized DL algorithms have been developed to assist in the differential diagnosis between benign and malignant parotid gland tumors on contrast-enhanced CT images [30] and ultrasonography [31]. MRI remains the gold standard in the diagnosis of salivary gland diseases, where DL models aim to automatically classify salivary gland tumors with very high accuracy [32,33].
Regarding thyroid disease diagnosis, ultrasound (US) is widely acknowledged as the primary diagnostic technique for examining thyroid nodules and assessing papillary thyroid carcinomas (PTCs) before surgery [34]. DL networks with excellent diagnostic efficiency have been deployed to distinguish between benign nodules and thyroid carcinoma [35], improve the detection of follicular carcinoma, differentiate between atypical and typical medullary carcinoma [36], and assess for gross extrathyroidal extension in thyroid cancer [37]. AI systems can be very useful in eliminating the operator dependence of US and improving diagnostic precision, especially for inexperienced radiologists.
Nevertheless, plenty of other DL techniques are associated with thyroid gland evaluation. Thus, apart from thyroid gland contouring in non-contrast-enhanced CT images [38], special applications have been designed for intraoperative use, assisting surgeons in recurrent laryngeal nerve [39] and parathyroid gland identification. Such algorithms have the potential to improve surgical workflows in the intricate environment of open surgery.
The head and neck region is among the most common locations for cancer, with a substantial occurrence of lymph node involvement and metastases observed in both nearby and distant regions. The identification of distant metastases is linked to an unfavorable prognosis, often resulting in a median survival period of around 10 months [40]. The role of imaging in metastasis diagnosis is uncontroversial, and novel convolutional neural networks have been developed in this direction. For example, extended 2D-CNN and 3D-CNN models have been deployed to perform time-to-event analysis for the binary classification of distant metastasis in head and neck cancer patients. These models generate distant metastasis-free probability curves and stratify patients into high- and low-risk groups [41]. CNNs are generally able to detect image patterns that may be undetectable with traditional methods. Thus, it has been shown that CNNs can be trained to forecast treatment results for individuals with HNSCC, relying exclusively on information from CT scans conducted prior to treatment [42].
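As a simple illustration of the final stratification step such models perform, the snippet below thresholds per-patient predicted metastasis probabilities into high- and low-risk groups; the patient identifiers, scores, and 0.5 cutoff are illustrative assumptions rather than values from the cited work:

```python
# Minimal illustration of risk stratification: patients are split into
# high- and low-risk groups by thresholding the model's predicted
# probability of distant metastasis. All values here are made up.
predicted_risk = {"pt01": 0.82, "pt02": 0.15, "pt03": 0.55, "pt04": 0.08}
threshold = 0.5  # e.g., chosen on a validation cohort

high_risk = sorted(p for p, r in predicted_risk.items() if r >= threshold)
low_risk = sorted(p for p, r in predicted_risk.items() if r < threshold)
print("high-risk:", high_risk)  # candidates for intensified treatment or follow-up
print("low-risk:", low_risk)
```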
CNNs assessing pre-treatment MRI scans to predict the possibility of distant metastases in individuals with nasopharyngeal carcinoma can also be useful, since the occurrence of a metastasis is the main reason for radiotherapy failure in this patient group. Predicting a high risk of distant metastasis in a patient can lead to a more aggressive treatment approach [29]. Moreover, pre-therapy MRI scans have been used in patients with advanced (T3N1M0) nasopharyngeal carcinoma to guide clinicians in deciding between induction chemotherapy plus concurrent chemoradiotherapy or concurrent chemoradiotherapy alone [43].
DL models diagnosing lymph node involvement can boost clinical decision-making in the future. One such model detects pathological lymph nodes in individuals with oral cancer [44], while another predicts lymph node involvement in patients with thyroid cancer through the interpretation of their multiphase dual-energy spectral CT images [45].
The utilization of deep learning techniques allows for the complete automation of image analysis, providing the user with multiple possibilities (Table 1). Nevertheless, it demands a substantial volume of accurately labeled images. Additionally, prediction-making necessitates detailed patient endpoint data, a process that is both expensive and time-intensive. Developing more effective models with constrained datasets stands as a critical challenge in the field of AI today.

2.2. Head and Neck Radiotherapy

Radiotherapy (RT) stands as a fundamental pillar in head and neck cancer (HNC) treatment, whether administered independently, post-surgery, or concurrently with chemotherapy. Defining organs at risk (OARs) and clinical target volumes represents a crucial phase in the treatment protocol. This process typically involves manual work, is time-consuming, and necessitates substantial training. Ideally, these tasks would be substituted by automated procedures requiring minimal clinician involvement, and AI appears competent to undertake this role.
A major challenge and the primary drawback of radiation therapy is that, apart from the cancerous mass, it unavoidably exposes nearby healthy tissues, known as OARs, to some level of radiation. This can potentially result in various adverse effects and toxicities, since contouring organs like the parotid and submandibular glands and excluding them from the radiation field can be quite arduous [46]. Additionally, DL-based automated segmentation of the masticatory area has successfully reduced the incidence of RT-associated trismus [47].
Several applications exist that aim to auto-segment normal tissue structures from CT images [48,49]. These can include three-dimensional segmentation models and convolutional neural networks for final OAR identification [50]. DL pipelines focusing on tumor segmentation in specific organs, such as the oropharynx [51] and the salivary glands, promise to gradually automate the RT procedure and, at the same time, reduce post-segmentation editing [52].
On the other hand, 3D CNNs aim to consistently and precisely generate clinical target volume contours for the different lymph node levels in HNSCC RT [53,54]. Adjusting such automated delineations is quicker than manual contouring, closely aligns with corrected delineations for specific levels, and reduces interobserver variability.
The possibilities that DL systems offer are countless, ranging from distant metastasis and overall survival prediction in HNSCC using PET-only models without gross tumor volume segmentation [55] to the automatic delineation of the gross tumor volume in FDG-PET/CT images of HNSCC patients [56]. Overall, DL systems have the potential to offer personalized RT guidance in HNSCC patients with limited contribution from medical experts.

2.3. Endoscopy and Laryngoscopy

Machine learning has recently been applied experimentally to diagnostic ENT endoscopy to leverage meaningful information from digital images, videos, and other visual inputs and to take actions or make recommendations based on that information. Building on the early experience acquired in the more standardized field of gastrointestinal endoscopy, AI-based video analysis, or videomics [57], has been variously applied to improve automatic image quality control, classification, optical detection, and the segmentation of images. After numerous proof-of-concept studies, videomics is rapidly moving toward viable clinical approaches for detecting pathological patterns in real time during the endoscopic evaluation of the upper aerodigestive tract.
A deep learning model consists of complex multilayer artificial neural networks, among which convolutional neural networks are the most popular in the image analysis field. A CNN does not require instructions on which features describe an object and can autonomously learn how to identify it by observing a sufficient number of examples. Various AI models exist and have been applied [58], although a specific comparison between the various algorithm architectures for the task is still lacking. After this preliminary conceptualization phase, the model undergoes a supervised learning session, in which expert otolaryngologists provide the AI with human-annotated images to transfer their ability to recognize the lesions. The higher the quality and quantity of items in the training set, the more accurate the model will be. After training and validation, the performance of the system is measured on the testing set by comparing the model's predictions with the original human annotations. The performance is evaluated using diagnostic metrics relative to the task analyzed.
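A hedged sketch of this train/validate/test workflow, using synthetic stand-in data and a toy classifier rather than the models of the cited studies, might look as follows in PyTorch:

```python
# Minimal sketch of the supervised workflow described above: fit on a
# training split, monitor a validation split each epoch, and report once
# on a held-out test split. Data, model, and split sizes are assumptions.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for an annotated endoscopic image dataset.
images, labels = torch.randn(300, 1, 64, 64), torch.randint(0, 2, (300,))
train_set, val_set, test_set = random_split(TensorDataset(images, labels), [200, 50, 50])

model = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 2))  # toy classifier
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

def evaluate(loader):
    """Fraction of correctly classified images (the 'accuracy' metric)."""
    model.eval()
    with torch.no_grad():
        hits = sum((model(x).argmax(1) == y).sum().item() for x, y in loader)
    return hits / len(loader.dataset)

for epoch in range(3):  # supervised learning session
    model.train()
    for x, y in DataLoader(train_set, batch_size=32, shuffle=True):
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    print(epoch, "val acc:", evaluate(DataLoader(val_set, batch_size=64)))

print("test acc:", evaluate(DataLoader(test_set, batch_size=64)))  # final report
```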
AI can be used to classify endoscopic images. In that case, the diagnostic metrics of interest are accuracy (the percentage of correctly classified images); precision (the positive predictive value); sensitivity (the percentage of correctly identified images out of all those that should have been recognized); the F1 score (the harmonic mean of precision and sensitivity); and the receiver operating characteristic curve (graphically plotting the true positive rate against the false positive rate) [59]. In this framework, it is possible to apply AI to classify videos based on their image quality, selecting only the most informative frames for further analysis [60,61]. Another classification task is the optical biopsy [62], predicting the histology of a lesion based on its appearance. At the current state, AI is more accurate in binary classification, e.g., premalignant/malignant [63], whereas it loses diagnostic power in multiclass operation [64]. By expanding and diversifying the training dataset, it is possible to achieve high accuracy in simultaneously identifying different conditions such as glottic carcinoma, leucoplakia, nodules, and vocal cord polyps [65], outperforming otolaryngologist trainees in terms of AUC and F1 score [66].
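A worked example of these classification metrics, computed with scikit-learn on made-up labels and scores, may help fix the definitions:

```python
# The diagnostic metrics listed above, computed with scikit-learn on
# illustrative, invented predictions (no data from the cited studies).
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, roc_auc_score)

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                    # 1 = malignant, 0 = benign
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]                    # binarized model output
y_score = [0.9, 0.2, 0.8, 0.4, 0.1, 0.6, 0.7, 0.3]   # model probabilities

print("accuracy:   ", accuracy_score(y_true, y_pred))   # fraction correct
print("precision:  ", precision_score(y_true, y_pred))  # positive predictive value
print("sensitivity:", recall_score(y_true, y_pred))     # recall / true positive rate
print("F1 score:   ", f1_score(y_true, y_pred))         # harmonic mean of the two
print("ROC AUC:    ", roc_auc_score(y_true, y_score))   # area under the ROC curve
```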
Another task for which AI has been devised is the automatic detection of lesions during endoscopic evaluation. The main diagnostic metrics for this function are the F1 score, the intersection over union (IoU, how well the predicted area overlaps with the originally annotated area), and the mean average precision (precision and sensitivity according to the chosen IoU threshold). Using narrow band imaging, AI can be trained to localize mucosal cancerous lesions in the pharynx and larynx during endoscopy [67,68,69]. This concept has recently been applied to automatically detect laryngeal cancer in real-time video-laryngoscopy using the open-source YOLO CNN, achieving 67% precision, 62% sensitivity, and 0.63 mean average precision at 0.5 IoU [70], which could be implemented in a self-screening approach for early tumor recurrence detection [71]. Based on simple diagnostic endoscopy, the same approach can be applied intraoperatively to detect pathological tissues, such as in endoscopic parathyroid surgery [72,73].
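The IoU criterion itself is simple enough to state in a few lines; the following function and the example boxes are purely illustrative:

```python
# Intersection over union (IoU) for two bounding boxes, the overlap
# criterion behind the 0.5 IoU threshold cited above.
def iou(box_a, box_b):
    """Boxes as (x1, y1, x2, y2); returns intersection area / union area."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((10, 10, 50, 50), (30, 30, 70, 70)))  # 0.1429: would fail a 0.5 IoU cutoff
```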
Finally, CNNs have been used to automatically delineate the boundaries of anatomical structures and lesions in the upper aerodigestive tract. Segmentation performance is evaluated with the IoU and the dice similarity coefficient (DSC, the similarity between the predicted segmentation mask and the ground truth mask). The rationale of segmentation in videomics is to improve lesion follow-up, the definition of tumor resection margins in the operating room, and the delineation of areas of interest for general laryngology. The automated segmentation of cancer tissue has been successfully attempted in the nasopharynx (DSC 0.78) [74], the oropharynx (DSC 0.76) [75], and laryngeal lesions (DSC 0.814) [76]. Aside from cancer pathology, segmentation may be used to select the region of interest for automated functional laryngeal analysis, such as the identification of the glottic angle [77,78], the glottal midline [79], vocal cord paralysis [80], postintubation granuloma [81], and vocal cord dynamics [82,83], or in the endoscopic evaluation of aspiration and penetration risk in dysphagia (FESS-CAD, DSC 0.925) [84].
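The dice similarity coefficient reported by these studies can likewise be sketched in a few lines of NumPy; the masks below are synthetic stand-ins:

```python
# Dice similarity coefficient for binary segmentation masks.
import numpy as np

def dice(pred: np.ndarray, truth: np.ndarray) -> float:
    """DSC = 2 * |A & B| / (|A| + |B|) for binary masks."""
    intersection = np.logical_and(pred, truth).sum()
    return 2.0 * intersection / (pred.sum() + truth.sum())

pred = np.zeros((64, 64), dtype=bool); pred[10:40, 10:40] = True
truth = np.zeros((64, 64), dtype=bool); truth[15:45, 15:45] = True
print(round(dice(pred, truth), 3))  # two 30x30 squares offset by 5 px -> 0.694
```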
Building a sufficiently large and heterogeneous training image dataset is a necessary task for improving deep learning-based image classifiers. The main obstacles remain the lack of standardization of endoscopic techniques and study structures, hampering comparisons between the different experiences, and the complex anatomy of the upper aerodigestive tract, which makes image acquisition and standardization difficult. Although deep learning models can be very good at analyzing images belonging to the same group as the training cohort, they may lack accuracy when tested on different populations. To effectively apply videomics in real-world situations, future research should focus on validating the trained models with external datasets, acquired at different institutions and thus diverse in terms of acquisition technique and population demographics. Although AI-aided endoscopy is still in a preclinical state, the results are promising, and it may soon efficiently assist the otolaryngologist in many tasks, such as the quality assessment of endoscopic examination, the detection of mucosal lesions during endoscopy, the optical biopsy of selected lesions, the segmentation of cancer margins, and the assessment of laryngeal mobility.

3. Otology

3.1. Computer Vision in Otoscopy

Otoscopic ear inspection remains the first and most important step in the diagnosis of ear disease, especially otitis media and its variants. However, otoscopy requires extensive training, and error rates remain high even among experienced otolaryngologists [85]. The growing use of video-otoscopy provides reliable data for developing deep learning models for automated image recognition, potentially assisting less experienced physicians in the identification and classification of pathological findings [86].
The most common machine learning approach used in automated video-otoscopy is the convolutional neural network, a type of deep learning model that undergoes supervised training on human-labeled data in order to recognize specific pathological patterns in datasets of retrospectively collected images. CNNs can be used for image classification, detection, and segmentation. As in videomics, one application is the optimization of the diagnostic image, such as the selection of the best-quality frames in an otoscopic video recording in order to create an informative composite image, stitching together only the best frames and excluding the less informative ones [87]. Moreover, CNNs can provide automated segmentation of the eardrum from otoscopic images, orienting the clinician and future CNNs to special areas of interest for the diagnosis [88]. Automatic image preprocessing, reducing imperfections such as motion artifacts or earwax [89] and selecting the proper color wavelength [90], may further enhance the informativeness of the pictures. Once the images have been properly selected, the final step of building an AI image classifier requires training on a large, annotated image dataset and validation of the results. The accuracy of the system is usually determined by comparison with the diagnosis of a panel of experienced physicians or otolaryngologists. AI image classifiers have mostly been studied for the automatic diagnosis of otitis media. Many different CNN models have been variously trained and compared (ResNet-50, Inception-V3, Inception-ResNet-V2, MobileNetV2) to binarily differentiate between normal and abnormal images [91] or to attempt multiclass classification [92,93,94,95], achieving on average 90.47% accuracy in differentiating between normal and abnormal images and 97.6% in differentiating between normal ears, acute otitis media, and otitis media with effusion [96]. AI algorithms have experimentally outperformed human assessors in classifying otoscopy images, achieving 93.4% versus 73.2% accuracy [93,94,97,98]. The same approach has been investigated for other otologic conditions, such as eardrum perforation [99], attic atelectasis [100], and otomycosis [101]. Comparing different studies and approaches is, however, difficult owing to the heterogeneity of the collected datasets, making the standardization of otoscopic image acquisition and annotation an important step for future developments [102].
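As a hedged sketch of the transfer-learning setup these comparisons typically rely on, the snippet below loads an ImageNet-pretrained ResNet-50 from torchvision and swaps its final layer for a three-class otoscopic task; the class count and the decision to freeze the backbone are our illustrative assumptions:

```python
# Sketch of adapting a pretrained ResNet-50 to otoscopic images, e.g. for
# normal / acute otitis media / otitis media with effusion (assumed classes).
import torch.nn as nn
from torchvision import models

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
for param in model.parameters():
    param.requires_grad = False                 # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 3)   # new, trainable classification head
```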
It is also possible to apply deep learning models, which cluster data based on similarity to provide predictions and reveal common themes, to make clinical predictions based on the collected images. For example, the optical recognition of pathological tympanic membranes can be paired with hearing loss predictions. In a preliminary study, a deep learning algorithm created to analyze video pneumatic otoscopy images accurately detected the presence of conductive hearing loss caused by middle ear effusion, ossicular fixation, otosclerosis, and adhesive otitis media, outperforming experienced otologists [103]. Similarly, CNNs proved better than clinicians and logistic regression models in predicting a conductive hearing loss greater than 10 dB, focusing on the retraction pockets in otitis media images [104].
Although AI image classifiers can experimentally achieve accuracies comparable to those of experienced otologists, separately trained CNNs still fail to maintain the same high internal performance when applied to a cohort different from the one used for training, even when it is organized in the same way, as demonstrated in a multicenter study [105]. The accuracy of CNN image classifiers heavily relies on the quality and quantity of the image dataset used for training. The lack of access to large, high-quality datasets remains a persistent hindrance in many pilot studies; consequently, internal solutions such as data augmentation with rotation and cropping have generally been applied, with still-debatable consequences. A possible solution to the problem is represented by transfer learning procedures, in which knowledge learned from one task is extracted and re-used to boost performance in a related task. A CNN pretrained on another huge image database performs better on a small number of subjects than one trained from scratch [106]. Potentially, creating a shared virtual imaging database could offer a better training dataset for future deep learning models, enhancing accuracy even in real-world applications. Although the bulk of the available literature is still in its infancy in terms of practical implementation, the experimental outcomes show overall diagnostic accuracies not inferior to those of an experienced clinician.
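The rotation-and-cropping augmentation mentioned above can be expressed, for illustration, as a torchvision transform pipeline; the exact parameter values are assumptions:

```python
# Rotation-and-cropping augmentation as a torchvision pipeline, applied
# on-the-fly to each training image; parameter values are illustrative.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),                # small random rotations
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),  # random crop, resized to 224x224
    transforms.ToTensor(),                                # PIL image -> float tensor
])
```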

3.2. Imaging in Otology

Deep learning is poised to integrate with and assist the clinician in complex diagnosis by identifying patterns often imperceptible to humans, providing innovative healthcare solutions, especially in the field of telemedicine and early diagnosis. Artificial intelligence applications in otology are rapidly moving from simple proofs of concept to preliminary clinical applications in the field of applied radiomics.
A promising application of artificial intelligence in otology is the automatic segmentation and analysis of specific radiological images of the temporal bone. As in other cases of radiomics, the current approach consists of training the AI to automatically identify regions of interest based on sets of training images previously annotated by experienced radiologists. In this case, the measure of interest is the dice score, which compares the manual segmentations of human experts with those of the AI.
Highly accurate automatic segmentation of radiological images has been demonstrated in CT scans [107], MRI scans [108], and cone-beam scans [109,110]. After selecting the region of interest, deep learning models can be specifically trained to classify specific diseases, such as chronic otitis media [111], cholesteatoma [112,113], otosclerosis [114,115], mastoiditis [116,117], and Ménière's disease [118,119], achieving detection results comparable to those of subspecialty-trained radiologists. AI-assisted radiomics can be extremely useful in the follow-up of specific diseases, such as vestibular schwannoma, whose surveillance is nowadays performed through manual segmentation and analysis of serial MRI scans to detect tumor enlargement. A deep learning approach can be applied for tumor detection and segmentation in treatment-naïve patients [120] and after radiosurgery [121], in evaluating residual disease [122], and in predicting tumor enlargement based on radiomics parameters during follow-up [123].
Although highly successful in experimental settings, all these studies have so far been performed on numbers of patients that are small by AI standards, making real-world clinical applications challenging. There is a growing need for image collection standardization and a multicenter approach to pool more diverse data and better approximate real-life situations.
Although promising and quickly expanding (Table 2), the use of AI in otology is still associated with difficult translation into clinical practice. As soon as a trained AI is applied to a group of patients or data different from the experimental setting, prediction accuracy decreases. The greatest limitation remains that AI relies on training with massive datasets. Building a sufficiently large and reliable database to encompass real-world clinical situations is challenging and time-consuming in a still-unstandardized clinical practice. Another obstacle is the difficulty encountered in interpreting how the AI draws its conclusions, as it is impossible to evaluate which features the AI uses to make its predictions. The learning mechanisms remain mostly unclear.

4. Imaging in Rhinology

The field of rhinology, as a subspecialty, has witnessed numerous technological advancements, from the endoscopic diagnosis and treatment of paranasal disease some decades ago to innovations like image-guided surgical navigation more recently. In an effort to provide personalized treatment and ameliorate surgical practice and accuracy, it is no wonder that a growing amount of research has focused on computer vision in rhinologic diseases (Table 3).
Imaging in rhinology has attracted a plethora of DL systems aiming to augment diagnostic accuracy in particular domains of plain radiography, CT, and MRI imaging. As a general rule, the dependability of radiography in assessing sinusitis is debatable, since the documented sensitivity is relatively low for all sinuses except the maxillary sinus. An algorithm capable of identifying and categorizing individual paranasal sinuses using both Waters’ and Caldwell’s views, all without requiring manual cropping, is available and could be useful, especially in areas and health facilities with limited resources [124]. In another study, panoramic imaging was applied to help dentists diagnose maxillary sinusitis [125]. Additionally, a generative adversarial network system offers improvement in the diagnostic efficacy of sinus radiography, since it requires considerably fewer real healthcare datasets [126].
Computed tomography, which constitutes the gold standard in the imaging of the paranasal sinuses, presents a variety of challenges that DL algorithms are called upon to face. Firstly, preoperative sinus CT scans have been utilized to train a system to differentiate between non-eosinophilic and eosinophilic chronic rhinosinusitis (CRS), relying solely on CT imaging [127]. Secondly, CRS has a tendency to recur and a poor prognosis even following surgery. DL techniques aim to confront this problem by predicting, pre-operatively, the risk of disease recurrence [128]. Such a solution could augment patient-oriented therapy modalities.
CRS constitutes a diverse range of conditions defined by chronic inflammation of the paranasal sinuses. Although clinical examination has a key role in the diagnosis of the disease, CT is of vital importance in appraising sinusitis. Thus, the need for objective, enhanced, and standardized evaluation has led to the development of systems that enable the prompt evaluation of paranasal sinus opacification on CT images in patients with chronic rhinosinusitis [129,130]. The efficacy of these algorithms correlates strongly with the Lund–Mackay score, and, like the Lund–Mackay score, they are moderately correlated with the Lund–Kennedy endoscopy score [130]. Similarly, a CNN system can assess occlusion of the osteomeatal complex in individuals with chronic rhinosinusitis, relying on coronal CT images [131].
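For intuition, the kind of opacification measure such systems output can be illustrated as the percentage of a segmented sinus volume flagged as opacified; the binary masks below are synthetic stand-ins, not data from the cited studies:

```python
# Illustrative opacification measure: percentage of a segmented sinus
# volume labeled as opacified, given synthetic binary voxel masks.
import numpy as np

sinus_mask = np.zeros((32, 64, 64), dtype=bool)  # voxels inside the sinus
sinus_mask[:, 20:50, 20:50] = True
opacified = np.zeros_like(sinus_mask)            # voxels labeled as opacified
opacified[:16, 20:50, 20:50] = True

percent = 100.0 * (opacified & sinus_mask).sum() / sinus_mask.sum()
print(f"{percent:.1f}% of the sinus volume is opacified")  # prints 50.0%
```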
Applications involving MRI techniques have also been developed, such as a three-dimensional CNN that can differentiate between benign and malignant inverted papilloma [132].

5. Conclusions

DL systems utilize complex algorithms and neural networks featuring numerous intricate layers in order to make decisions and solve advanced problems. Their application in medicine, and specifically in otorhinolaryngology, has increased rapidly, with a plethora of different and often overlapping algorithms appearing in every subspecialty of ENT surgery. Given their already wide utilization in everyday clinical practice, which is expected to grow rapidly in the coming years, the modern otolaryngologist must be aware of and familiar with their many uses.

Author Contributions

Conceptualization, C.T. and E.G.; methodology, C.T.; writing—original draft preparation, C.T., M.A., R.d.M. and A.G.; writing—review and editing, F.M., A.G., N.S.M. and G.-I.V.; supervision, E.G. and I.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Bur, A.M.; Shew, M.; New, J. Artificial Intelligence for the Otolaryngologist: A State of the Art Review. Otolaryngol. Head. Neck Surg. 2019, 160, 603–611. [Google Scholar] [CrossRef] [PubMed]
  2. Petrone, P.; Birocchi, E.; Miani, C.; Anzivino, R.; Sciancalepore, P.I.; Di Mauro, A.; Dalena, P.; Russo, C.; De Ceglie, V.; Masciavè, M.; et al. Diagnostic and Surgical Innovations in Otolaryngology for Adult and Paediatric Patients during the COVID-19 Era. Acta Otorhinolaryngol. Ital. 2022, 42 (Suppl. S1), S46–S57. [Google Scholar] [CrossRef]
  3. Islam, M.M.; Rahaman, A.; Islam, M.R. Development of Smart Healthcare Monitoring System in IoT Environment. SN Comput. Sci. 2020, 1, 185. [Google Scholar] [CrossRef] [PubMed]
  4. Srivastava, J.; Routray, S.; Ahmad, S.; Waris, M.M. Internet of Medical Things (IoMT)-Based Smart Healthcare System: Trends and Progress. Comput. Intell. Neurosci. 2022, 2022, 7218113. [Google Scholar] [CrossRef] [PubMed]
  5. Bulfamante, A.M.; Ferella, F.; Miller, A.M.; Rosso, C.; Pipolo, C.; Fuccillo, E.; Felisati, G.; Saibene, A.M. Artificial Intelligence, Machine Learning, and Deep Learning in Rhinology: A Systematic Review. Eur. Arch. Otorhinolaryngol. 2023, 280, 529–542. [Google Scholar] [CrossRef] [PubMed]
  6. Shen, D.; Wu, G.; Suk, H.-I. Deep Learning in Medical Image Analysis. Annu. Rev. Biomed. Eng. 2017, 19, 221–248. [Google Scholar] [CrossRef]
  7. Lamassoure, L.; Giunta, J.; Rosi, G.; Poudrel, A.-S.; Meningaud, J.-P.; Bosc, R.; Haïat, G. Anatomical Subject Validation of an Instrumented Hammer Using Machine Learning for the Classification of Osteotomy Fracture in Rhinoplasty. Med. Eng. Phys. 2021, 95, 111–116. [Google Scholar] [CrossRef]
  8. Kim, H.-G.; Lee, K.M.; Kim, E.J.; Lee, J.S. Improvement Diagnostic Accuracy of Sinusitis Recognition in Paranasal Sinus X-Ray Using Multiple Deep Learning Models. Quant. Imaging Med. Surg. 2019, 9, 942–951. [Google Scholar] [CrossRef]
  9. Kim, D.-K.; Lim, H.-S.; Eun, K.M.; Seo, Y.; Kim, J.K.; Kim, Y.S.; Kim, M.-K.; Jin, S.; Han, S.C.; Kim, D.W. Subepithelial Neutrophil Infiltration as a Predictor of the Surgical Outcome of Chronic Rhinosinusitis with Nasal Polyps. Rhinology 2021, 59, 173–180. [Google Scholar] [CrossRef]
  10. Olveres, J.; González, G.; Torres, F.; Moreno-Tagle, J.C.; Carbajal-Degante, E.; Valencia-Rodríguez, A.; Méndez-Sánchez, N.; Escalante-Ramírez, B. What Is New in Computer Vision and Artificial Intelligence in Medical Image Analysis Applications. Quant. Imaging Med. Surg. 2021, 11, 3830–3853. [Google Scholar] [CrossRef]
  11. Bambach, S.; Ho, M.-L. Deep Learning for Synthetic CT from Bone MRI in the Head and Neck. AJNR Am. J. Neuroradiol. 2022, 43, 1172–1179. [Google Scholar] [CrossRef] [PubMed]
  12. Klages, P.; Benslimane, I.; Riyahi, S.; Jiang, J.; Hunt, M.; Deasy, J.O.; Veeraraghavan, H.; Tyagi, N. Patch-Based Generative Adversarial Neural Network Models for Head and Neck MR-Only Planning. Med. Phys. 2020, 47, 626–642. [Google Scholar] [CrossRef] [PubMed]
  13. Chandrashekar, A.; Handa, A.; Ward, J.; Grau, V.; Lee, R. A Deep Learning Pipeline to Simulate Fluorodeoxyglucose (FDG) Uptake in Head and Neck Cancers Using Non-Contrast CT Images without the Administration of Radioactive Tracer. Insights Imaging 2022, 13, 45. [Google Scholar] [CrossRef] [PubMed]
  14. Altmann, S.; Abello Mercado, M.A.; Ucar, F.A.; Kronfeld, A.; Al-Nawas, B.; Mukhopadhyay, A.; Booz, C.; Brockmann, M.A.; Othman, A.E. Ultra-High-Resolution CT of the Head and Neck with Deep Learning Reconstruction—Assessment of Image Quality and Radiation Exposure and Intraindividual Comparison with Normal-Resolution CT. Diagnostics 2023, 13, 1534. [Google Scholar] [CrossRef] [PubMed]
  15. Fujima, N.; Andreu-Arasa, V.C.; Meibom, S.K.; Mercier, G.A.; Salama, A.R.; Truong, M.T.; Sakai, O. Deep Learning Analysis Using FDG-PET to Predict Treatment Outcome in Patients with Oral Cavity Squamous Cell Carcinoma. Eur. Radiol. 2020, 30, 6322–6330. [Google Scholar] [CrossRef] [PubMed]
  16. Fujima, N.; Andreu-Arasa, V.C.; Meibom, S.K.; Mercier, G.A.; Truong, M.T.; Hirata, K.; Yasuda, K.; Kano, S.; Homma, A.; Kudo, K.; et al. Prediction of the Local Treatment Outcome in Patients with Oropharyngeal Squamous Cell Carcinoma Using Deep Learning Analysis of Pretreatment FDG-PET Images. BMC Cancer 2021, 21, 900. [Google Scholar] [CrossRef] [PubMed]
  17. Cheng, N.-M.; Yao, J.; Cai, J.; Ye, X.; Zhao, S.; Zhao, K.; Zhou, W.; Nogues, I.; Huo, Y.; Liao, C.-T.; et al. Deep Learning for Fully Automated Prediction of Overall Survival in Patients with Oropharyngeal Cancer Using FDG-PET Imaging. Clin. Cancer Res. 2021, 27, 3948–3959. [Google Scholar] [CrossRef] [PubMed]
  18. Fujima, N.; Andreu-Arasa, V.C.; Meibom, S.K.; Mercier, G.A.; Truong, M.T.; Sakai, O. Prediction of the Human Papillomavirus Status in Patients with Oropharyngeal Squamous Cell Carcinoma by FDG-PET Imaging Dataset Using Deep Learning Analysis: A Hypothesis-Generating Study. Eur. J. Radiol. 2020, 126, 108936. [Google Scholar] [CrossRef]
  19. Yuan, W.; Cheng, L.; Yang, J.; Yin, B.; Fan, X.; Yang, J.; Li, S.; Zhong, J.; Huang, X. Noninvasive Oral Cancer Screening Based on Local Residual Adaptation Network Using Optical Coherence Tomography. Med. Biol. Eng. Comput. 2022, 60, 1363–1375. [Google Scholar] [CrossRef]
  20. Wilder-Smith, P.; Lee, K.; Guo, S.; Zhang, J.; Osann, K.; Chen, Z.; Messadi, D. In Vivo Diagnosis of Oral Dysplasia and Malignancy Using Optical Coherence Tomography: Preliminary Studies in 50 Patients. Lasers Surg. Med. 2009, 41, 353–357. [Google Scholar] [CrossRef]
  21. Jeyaraj, P.R.; Samuel Nadar, E.R. Computer-Assisted Medical Image Classification for Early Diagnosis of Oral Cancer Employing Deep Learning Algorithm. J. Cancer Res. Clin. Oncol. 2019, 145, 829–837. [Google Scholar] [CrossRef] [PubMed]
  22. Song, B.; Sunny, S.; Uthoff, R.D.; Patrick, S.; Suresh, A.; Kolur, T.; Keerthi, G.; Anbarani, A.; Wilder-Smith, P.; Kuriakose, M.A.; et al. Automatic Classification of Dual-Modalilty, Smartphone-Based Oral Dysplasia and Malignancy Images Using Deep Learning. Biomed. Opt. Express 2018, 9, 5318–5329. [Google Scholar] [CrossRef] [PubMed]
  23. Fu, Q.; Chen, Y.; Li, Z.; Jing, Q.; Hu, C.; Liu, H.; Bao, J.; Hong, Y.; Shi, T.; Li, K.; et al. A Deep Learning Algorithm for Detection of Oral Cavity Squamous Cell Carcinoma from Photographic Images: A Retrospective Study. EClinicalMedicine 2020, 27, 100558. [Google Scholar] [CrossRef] [PubMed]
  24. Birur, N.P.; Song, B.; Sunny, S.P.; Mendonca, P.; Mukhia, N.; Li, S.; Patrick, S.; AR, S.; Imchen, T.; Leivon, S.T.; et al. Field Validation of Deep Learning Based Point-of-Care Device for Early Detection of Oral Malignant and Potentially Malignant Disorders. Sci. Rep. 2022, 12, 14283. [Google Scholar] [CrossRef] [PubMed]
  25. Coole, J.B.; Brenes, D.; Mitbander, R.; Vohra, I.; Hou, H.; Kortum, A.; Tang, Y.; Maker, Y.; Schwarz, R.A.; Carns, J.; et al. Multimodal Optical Imaging with Real-Time Projection of Cancer Risk and Biopsy Guidance Maps for Early Oral Cancer Diagnosis and Treatment. J. Biomed. Opt. 2023, 28, 016002. [Google Scholar] [CrossRef] [PubMed]
  26. Li, S.; Hua, H.-L.; Li, F.; Kong, Y.-G.; Zhu, Z.-L.; Li, S.-L.; Chen, X.-X.; Deng, Y.-Q.; Tao, Z.-Z. Anatomical Partition-Based Deep Learning: An Automatic Nasopharyngeal MRI Recognition Scheme. J. Magn. Reson. Imaging 2022, 56, 1220–1229. [Google Scholar] [CrossRef]
  27. Ji, L.; Mao, R.; Wu, J.; Ge, C.; Xiao, F.; Xu, X.; Xie, L.; Gu, X. Deep Convolutional Neural Network for Nasopharyngeal Carcinoma Discrimination on MRI by Comparison of Hierarchical and Simple Layered Convolutional Neural Networks. Diagnostics 2022, 12, 2478. [Google Scholar] [CrossRef]
  28. Li, S.; Wan, X.; Deng, Y.-Q.; Hua, H.-L.; Li, S.-L.; Chen, X.-X.; Zeng, M.-L.; Zha, Y.; Tao, Z.-Z. Predicting Prognosis of Nasopharyngeal Carcinoma Based on Deep Learning: Peritumoral Region Should Be Valued. Cancer Imaging 2023, 23, 14. [Google Scholar] [CrossRef]
  29. Hua, H.-L.; Deng, Y.-Q.; Li, S.; Li, S.-T.; Li, F.; Xiao, B.-K.; Huang, J.; Tao, Z.-Z. Deep Learning for Predicting Distant Metastasis in Patients with Nasopharyngeal Carcinoma Based on Pre-Radiotherapy Magnetic Resonance Imaging. Comb. Chem. High. Throughput Screen. 2023, 26, 1351–1363. [Google Scholar] [CrossRef]
  30. Shen, X.-M.; Mao, L.; Yang, Z.-Y.; Chai, Z.-K.; Sun, T.-G.; Xu, Y.; Sun, Z.-J. Deep Learning-Assisted Diagnosis of Parotid Gland Tumors by Using Contrast-Enhanced CT Imaging. Oral. Dis. 2022. [CrossRef]
  31. Tu, C.-H.; Wang, R.-T.; Wang, B.-S.; Kuo, C.-E.; Wang, E.-Y.; Tu, C.-T.; Yu, W.-N. Neural Network Combining with Clinical Ultrasonography: A New Approach for Classification of Salivary Gland Tumors. Head. Neck 2023, 45, 1885–1893. [Google Scholar] [CrossRef] [PubMed]
  32. Liu, X.; Pan, Y.; Zhang, X.; Sha, Y.; Wang, S.; Li, H.; Liu, J. A Deep Learning Model for Classification of Parotid Neoplasms Based on Multimodal Magnetic Resonance Image Sequences. Laryngoscope 2023, 133, 327–335. [Google Scholar] [CrossRef] [PubMed]
  33. Gunduz, E.; Alçin, O.F.; Kizilay, A.; Yildirim, I.O. Deep Learning Model Developed by Multiparametric MRI in Differential Diagnosis of Parotid Gland Tumors. Eur. Arch. Otorhinolaryngol. 2022, 279, 5389–5399. [Google Scholar] [CrossRef]
  34. Guan, Q.; Wang, Y.; Du, J.; Qin, Y.; Lu, H.; Xiang, J.; Wang, F. Deep Learning Based Classification of Ultrasound Images for Thyroid Nodules: A Large Scale of Pilot Study. Ann. Transl. Med. 2019, 7, 137. [Google Scholar] [CrossRef] [PubMed]
  35. Yang, Z.; Yao, S.; Heng, Y.; Shen, P.; Lv, T.; Feng, S.; Tao, L.; Zhang, W.; Qiu, W.; Lu, H.; et al. Automated Diagnosis and Management of Follicular Thyroid Nodules Based on the Devised Small-Datasets Interpretable Foreground Optimization Network Deep Learning: A Multicenter Diagnostic Study. Int. J. Surg. 2023, 109, 2732–2741. [Google Scholar] [CrossRef] [PubMed]
  36. Zhang, R.; Yi, G.; Pu, S.; Wang, Q.; Sun, C.; Wang, Q.; Feng, L.; Liu, X.; Li, Z.; Niu, L. Deep Learning Based on Ultrasound to Differentiate Pathologically Proven Atypical and Typical Medullary Thyroid Carcinoma from Follicular Thyroid Adenoma. Eur. J. Radiol. 2022, 156, 110547. [Google Scholar] [CrossRef] [PubMed]
  37. Qi, Q.; Huang, X.; Zhang, Y.; Cai, S.; Liu, Z.; Qiu, T.; Cui, Z.; Zhou, A.; Yuan, X.; Zhu, W.; et al. Ultrasound Image-Based Deep Learning to Assist in Diagnosing Gross Extrathyroidal Extension Thyroid Cancer: A Retrospective Multicenter Study. EClinicalMedicine 2023, 58, 101905. [Google Scholar] [CrossRef] [PubMed]
  38. He, X.; Guo, B.J.; Lei, Y.; Tian, S.; Wang, T.; Curran, W.J.; Zhang, L.J.; Liu, T.; Yang, X. Thyroid Gland Delineation in Noncontrast-Enhanced CTs Using Deep Convolutional Neural Networks. Phys. Med. Biol. 2021, 66, 055007. [Google Scholar] [CrossRef]
  39. Gong, J.; Holsinger, F.C.; Noel, J.E.; Mitani, S.; Jopling, J.; Bedi, N.; Koh, Y.W.; Orloff, L.A.; Cernea, C.R.; Yeung, S. Using Deep Learning to Identify the Recurrent Laryngeal Nerve during Thyroidectomy. Sci. Rep. 2021, 11, 14306. [Google Scholar] [CrossRef]
  40. Pisani, P.; Airoldi, M.; Allais, A.; Aluffi Valletti, P.; Battista, M.; Benazzo, M.; Briatore, R.; Cacciola, S.; Cocuzza, S.; Colombo, A.; et al. Metastatic Disease in Head & Neck Oncology. Acta Otorhinolaryngol. Ital. 2020, 40 (Suppl. 1), S1–S86. [Google Scholar] [CrossRef]
  41. Lombardo, E.; Kurz, C.; Marschner, S.; Avanzo, M.; Gagliardi, V.; Fanetti, G.; Franchin, G.; Stancanello, J.; Corradini, S.; Niyazi, M.; et al. Distant Metastasis Time to Event Analysis with CNNs in Independent Head and Neck Cancer Cohorts. Sci. Rep. 2021, 11, 6418. [Google Scholar] [CrossRef] [PubMed]
  42. Diamant, A.; Chatterjee, A.; Vallières, M.; Shenouda, G.; Seuntjens, J. Deep Learning in Head & Neck Cancer Outcome Prediction. Sci. Rep. 2019, 9, 2764. [Google Scholar] [CrossRef] [PubMed]
  43. Zhong, L.; Dong, D.; Fang, X.; Zhang, F.; Zhang, N.; Zhang, L.; Fang, M.; Jiang, W.; Liang, S.; Li, C.; et al. A Deep Learning-Based Radiomic Nomogram for Prognosis and Treatment Decision in Advanced Nasopharyngeal Carcinoma: A Multicentre Study. EBioMedicine 2021, 70, 103522. [Google Scholar] [CrossRef] [PubMed]
  44. Ariji, Y.; Fukuda, M.; Nozawa, M.; Kuwada, C.; Goto, M.; Ishibashi, K.; Nakayama, A.; Sugita, Y.; Nagao, T.; Ariji, E. Automatic Detection of Cervical Lymph Nodes in Patients with Oral Squamous Cell Carcinoma Using a Deep Learning Technique: A Preliminary Study. Oral. Radiol. 2021, 37, 290–296. [Google Scholar] [CrossRef] [PubMed]
  45. Jin, D.; Ni, X.; Zhang, X.; Yin, H.; Zhang, H.; Xu, L.; Wang, R.; Fan, G. Multiphase Dual-Energy Spectral CT-Based Deep Learning Method for the Noninvasive Prediction of Head and Neck Lymph Nodes Metastasis in Patients With Papillary Thyroid Cancer. Front. Oncol. 2022, 12, 869895. [Google Scholar] [CrossRef] [PubMed]
  46. Zhong, Y.; Yang, Y.; Fang, Y.; Wang, J.; Hu, W. A Preliminary Experience of Implementing Deep-Learning Based Auto-Segmentation in Head and Neck Cancer: A Study on Real-World Clinical Cases. Front. Oncol. 2021, 11, 638197. [Google Scholar] [CrossRef] [PubMed]
  47. Thor, M.; Iyer, A.; Jiang, J.; Apte, A.; Veeraraghavan, H.; Allgood, N.B.; Kouri, J.A.; Zhou, Y.; LoCastro, E.; Elguindi, S.; et al. Deep Learning Auto-Segmentation and Automated Treatment Planning for Trismus Risk Reduction in Head and Neck Cancer Radiotherapy. Phys. Imaging Radiat. Oncol. 2021, 19, 96–101. [Google Scholar] [CrossRef]
  48. Kawahara, D.; Tsuneda, M.; Ozawa, S.; Okamoto, H.; Nakamura, M.; Nishio, T.; Saito, A.; Nagata, Y. Stepwise Deep Neural Network (Stepwise-Net) for Head and Neck Auto-Segmentation on CT Images. Comput. Biol. Med. 2022, 143, 105295. [Google Scholar] [CrossRef]
  49. Oktay, O.; Nanavati, J.; Schwaighofer, A.; Carter, D.; Bristow, M.; Tanno, R.; Jena, R.; Barnett, G.; Noble, D.; Rimmer, Y.; et al. Evaluation of Deep Learning to Augment Image-Guided Radiotherapy for Head and Neck and Prostate Cancers. JAMA Netw. Open 2020, 3, e2027426. [Google Scholar] [CrossRef]
  50. Cubero, L.; Castelli, J.; Simon, A.; de Crevoisier, R.; Acosta, O.; Pascau, J. Deep Learning-Based Segmentation of Head and Neck Organs-at-Risk with Clinical Partially Labeled Data. Entropy 2022, 24, 1661. [Google Scholar] [CrossRef]
  51. De Biase, A.; Sijtsema, N.M.; van Dijk, L.V.; Langendijk, J.A.; van Ooijen, P.M.A. Deep Learning Aided Oropharyngeal Cancer Segmentation with Adaptive Thresholding for Predicted Tumor Probability in FDG PET and CT Images. Phys. Med. Biol. 2023, 68, 055013. [Google Scholar] [CrossRef] [PubMed]
  52. van Rooij, W.; Dahele, M.; Nijhuis, H.; Slotman, B.J.; Verbakel, W.F. Strategies to Improve Deep Learning-Based Salivary Gland Segmentation. Radiat. Oncol. 2020, 15, 272. [Google Scholar] [CrossRef] [PubMed]
  53. van der Veen, J.; Willems, S.; Bollen, H.; Maes, F.; Nuyts, S. Deep Learning for Elective Neck Delineation: More Consistent and Time Efficient. Radiother. Oncol. 2020, 153, 180–188. [Google Scholar] [CrossRef] [PubMed]
  54. Cardenas, C.E.; Beadle, B.M.; Garden, A.S.; Skinner, H.D.; Yang, J.; Rhee, D.J.; McCarroll, R.E.; Netherton, T.J.; Gay, S.S.; Zhang, L.; et al. Generating High-Quality Lymph Node Clinical Target Volumes for Head and Neck Cancer Radiation Therapy Using a Fully Automated Deep Learning-Based Approach. Int. J. Radiat. Oncol. Biol. Phys. 2021, 109, 801–812. [Google Scholar] [CrossRef] [PubMed]
  55. Wang, Y.; Lombardo, E.; Avanzo, M.; Zschaek, S.; Weingärtner, J.; Holzgreve, A.; Albert, N.L.; Marschner, S.; Fanetti, G.; Franchin, G.; et al. Deep Learning Based Time-to-Event Analysis with PET, CT and Joint PET/CT for Head and Neck Cancer Prognosis. Comput. Methods Programs Biomed. 2022, 222, 106948. [Google Scholar] [CrossRef]
  56. Moe, Y.M.; Groendahl, A.R.; Tomic, O.; Dale, E.; Malinen, E.; Futsaether, C.M. Deep Learning-Based Auto-Delineation of Gross Tumour Volumes and Involved Nodes in PET/CT Images of Head and Neck Cancer Patients. Eur. J. Nucl. Med. Mol. Imaging 2021, 48, 2782–2792. [Google Scholar] [CrossRef] [PubMed]
  57. Paderno, A.; Holsinger, F.C.; Piazza, C. Videomics: Bringing Deep Learning to Diagnostic Endoscopy. Curr. Opin. Otolaryngol. Head Neck Surg. 2021, 29, 143–148. [Google Scholar] [CrossRef]
Figure 1. Categorization of AI systems: deep learning systems constitute a subcategory of machine learning algorithms.
Table 1. The contributions of deep learning systems in head and neck imaging and radiotherapy.

Deep Learning Contributions in Head and Neck

Imaging:
- Generation of an imaging modality from another
- Prediction making based on imaging
- Automated diagnosis of malignant and benign diseases
- Automated diagnosis of pathological lymph nodes
- Automated diagnosis of metastases
- Analysis of specific tumor characteristics
- Contouring of significant structures
- Cancer risk assessment of a lesion
- Biopsy assistance mapping
- Intraoperative surgeon assistance

Radiotherapy:
- Auto-segmentation of structures based on imaging
- Automation of the procedure
- Automated clinical target volume contouring

Endoscopy and laryngoscopy:
- Image quality improvement
- Segmentation of images
- Optical detection
- Pathological pattern detection
- Endoscopic image classification
- Lesion histology (benign/malignant) prediction
- Self-screening tumor recurrence detection
- Intra-operative endoscopic lesion detection
- Anatomical structure and lesion automatic segmentation
- Automatic assessment of aspiration and dysphagia
- Evaluation of laryngeal mobility
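Many of the classification entries in Table 1 (endoscopic image classification, benign/malignant lesion prediction) rest on the same transfer-learning recipe: fine-tune an ImageNet-pretrained convolutional backbone on a comparatively small set of labeled endoscopic frames. The sketch below illustrates that general pattern in PyTorch; the ResNet-18 backbone, the two-class benign/malignant labels, and the training loop are illustrative assumptions, not a reconstruction of any specific published pipeline.

```python
"""Transfer-learning sketch for benign/malignant classification of
endoscopic (e.g., laryngoscopic) frames. Illustrative only."""
import torch
import torch.nn as nn
from torchvision import models, transforms

NUM_CLASSES = 2  # hypothetical labels: benign vs. malignant lesion

# Start from an ImageNet-pretrained backbone and swap the classifier head,
# the usual recipe when only a few thousand labeled frames are available.
# (The `weights=` API requires torchvision >= 0.13.)
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = nn.Linear(backbone.fc.in_features, NUM_CLASSES)

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),  # the backbone's expected input size
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],  # ImageNet statistics
                         std=[0.229, 0.224, 0.225]),
])

optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def training_step(batch_images: torch.Tensor, labels: torch.Tensor) -> float:
    """One supervised step; `batch_images` is a (N, 3, 224, 224) tensor."""
    backbone.train()
    optimizer.zero_grad()
    loss = criterion(backbone(batch_images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

In practice such a model is trained on institution-specific endoscopy datasets with heavy augmentation and validated against expert annotations before any clinical use.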
Table 2. The contributions of deep learning techniques in otology.

Deep Learning Contributions in Otology

Otoscopy:
- Automatic detection and categorization of ear lesions
- Automatic segmentation of anatomical structures and lesions
- Optimization of the diagnostic images
- Lesion-based predictions

Imaging:
- Complex pattern identification
- Tele-diagnosis
- Automatic analysis and segmentation of images
- Automatic recognition of region of interest
- Automatic diagnosis
- Imaging follow-up of complex diseases
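The segmentation rows in Table 2 (tympanic membrane, inner ear, temporal bone structures) are dominated by U-Net-style encoder-decoders with skip connections. The following is a minimal sketch of that architecture family, assuming grayscale 2D slices and a binary output mask; the depth and channel widths are hypothetical choices for illustration, far smaller than a production network.

```python
"""Minimal U-Net-style encoder-decoder, the architecture family behind the
tympanic-membrane and temporal-bone segmentation work summarized above."""
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """Two 3x3 convolutions with ReLU: the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class TinyUNet(nn.Module):
    def __init__(self, in_ch: int = 1, n_classes: int = 1):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, kernel_size=2, stride=2)
        self.dec1 = conv_block(64, 32)  # 64 = 32 upsampled + 32 from the skip
        self.head = nn.Conv2d(32, n_classes, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s1 = self.enc1(x)                  # full-resolution features
        s2 = self.enc2(self.pool(s1))      # half-resolution features
        up = self.up(s2)                   # upsample back to full resolution
        out = self.dec1(torch.cat([up, s1], dim=1))  # skip connection
        return self.head(out)              # per-pixel logits

# Usage: a (N, 1, H, W) grayscale batch in, a per-pixel logit map out;
# sigmoid + threshold yields the binary segmentation mask.
model = TinyUNet()
logits = model(torch.randn(2, 1, 128, 128))
print(logits.shape)  # torch.Size([2, 1, 128, 128])
```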
Table 3. The contributions of deep learning systems in rhinology.

Deep Learning Contributions in Rhinology
- Identification and categorization of paranasal sinuses
- Diagnosis of benign/malignant lesions
- Prediction of disease recurrence
- Detection of pathology related to chronic sinusitis
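Several of the rhinology applications in Table 3 amount to simple quantification on top of a CNN segmentation, for example the automated volumetric assessment of paranasal sinus opacification reviewed above. A minimal sketch of that downstream arithmetic follows; it assumes a boolean sinus-cavity mask produced by some segmentation model, and the -500 HU air threshold is an illustrative choice, not a validated clinical cutoff.

```python
"""Sketch of the quantitative step after CNN sinus segmentation: given the
CT volume and a predicted sinus mask, report the opacified fraction."""
import numpy as np

def opacified_fraction(ct_hu: np.ndarray,
                       sinus_mask: np.ndarray,
                       air_threshold_hu: float = -500.0) -> float:
    """Fraction of sinus voxels denser than air (i.e., opacified).

    ct_hu      -- CT volume in Hounsfield units, shape (D, H, W)
    sinus_mask -- boolean mask of the sinus cavities from a CNN, same shape
    """
    sinus_voxels = ct_hu[sinus_mask]
    if sinus_voxels.size == 0:
        return 0.0
    # Aerated sinus lies near -1000 HU; mucosal thickening and secretions
    # are markedly denser, so counting voxels above the threshold
    # approximates the opacified share of the cavity.
    return float(np.mean(sinus_voxels > air_threshold_hu))

# Toy example: a three-voxel "sinus" with one opacified voxel -> 1/3.
ct = np.array([[[-950.0, -900.0, 40.0]]])
mask = np.array([[[True, True, True]]])
print(round(opacified_fraction(ct, mask), 2))  # 0.33
```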