Review

Applied Artificial Intelligence in Healthcare: A Review of Computer Vision Technology Application in Hospital Settings

by Heidi Lindroth 1,2,3,*, Keivan Nalaie 1,4, Roshini Raghu 1, Ivan N. Ayala 1, Charles Busch 1,5, Anirban Bhattacharyya 6, Pablo Moreno Franco 7, Daniel A. Diedrich 4, Brian W. Pickering 4 and Vitaly Herasevich 4

1 Division of Nursing Research, Department of Nursing, Mayo Clinic, Rochester, MN 55905, USA
2 Center for Aging Research, Regenstrief Institute, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
3 Center for Health Innovation and Implementation Science, School of Medicine, Indiana University, Indianapolis, IN 46202, USA
4 Department of Anesthesiology and Perioperative Medicine, Mayo Clinic, Rochester, MN 55905, USA
5 College of Engineering, University of Wisconsin-Madison, Madison, WI 53705, USA
6 Department of Critical Care Medicine, Mayo Clinic, Jacksonville, FL 32224, USA
7 Department of Transplantation Medicine, Mayo Clinic, Jacksonville, FL 32224, USA
* Author to whom correspondence should be addressed.
J. Imaging 2024, 10(4), 81; https://doi.org/10.3390/jimaging10040081
Submission received: 31 January 2024 / Revised: 8 March 2024 / Accepted: 11 March 2024 / Published: 28 March 2024

Abstract: Computer vision (CV), a type of artificial intelligence (AI) that uses digital videos or a sequence of images to recognize content, has been used extensively across industries in recent years. However, in the healthcare industry, its applications are limited by factors like privacy, safety, and ethical concerns. Despite this, CV has the potential to improve patient monitoring and system efficiencies while reducing workload. In contrast to previous reviews, we focus on the end-user applications of CV. First, we briefly review and categorize CV applications in other industries (job enhancement, surveillance and monitoring, automation, and augmented reality). We then review the developments of CV in the hospital, outpatient, and community settings. The recent advances in monitoring delirium, pain and sedation, patient deterioration, mechanical ventilation, mobility, patient safety, surgical applications, quantification of workload in the hospital, and monitoring for patient events outside the hospital are highlighted. To identify opportunities for future applications, we also completed journey mapping at different system levels. Lastly, we discuss the privacy, safety, and ethical considerations associated with CV and outline processes in algorithm development and testing that limit CV expansion in healthcare. This comprehensive review highlights CV applications and ideas for its expanded use in healthcare.

1. Introduction

The use of technology to address inefficiencies within the healthcare system and optimize patient safety has an extensive history of development, starting with the documentation and recording of patient care events. The concept of the electronic health record (EHR) emerged in the 1970s, with the first official EHR built in 1972 by the Regenstrief Institute at Indiana University, and has grown rapidly over several decades [1,2]. The EHR provides a historical record of the patient care that was ordered and completed, making the EHR inherently retrospective. While this historical record is necessary for legal, administrative (billing), and diagnostic confirmation (post-test probability) purposes, it is cumbersome to use for real-time clinical decision-making such as prediction, detection, and prognosis. This limitation leads to an inability to anticipate the healthcare needs of the patient, as well as the disease process that may be occurring, otherwise known as pre-test probabilities [3]. One opportunity to obtain more granular, real-time data is to use ambient sensors such as CV. CV mimics human vision, integrating and interpreting visual information, and could potentially be used to create sophisticated real-time algorithms.
We recognize that much can be said about CV in terms of development trends and the internal causal relationships of the overall and individual applications of this technology. However, the purpose of this manuscript is to provide a high-level, comprehensive overview of the application of CV in healthcare settings, informed by its use in other industries. First, we briefly review the use of CV in industries outside of healthcare and categorize its application into themes. Following these themes, we review in greater detail how CV has been applied and/or developed for the hospital setting, and then we review outpatient and community settings. To identify future opportunities for the application of CV, we completed journey mapping at the patient, clinician, and system levels. Lastly, we discuss the privacy and safety considerations and ethical implications of the use of CV in the healthcare setting.

2. Overview of Computer Vision

CV is a type of artificial intelligence (AI) that uses digital videos or sequences of images. The goal of CV is to train computers to extract information from images, essentially enabling computers to “see” and recognize content [4]. The foundations of CV were established during the 1980s, marked by the development of algorithms such as optical flow and edge detection [5,6]. The advancement of machine learning and statistical techniques in the 1990s enabled computer applications to understand and process more intricate patterns within visual scenes [7,8]. During the 2000s, CV was applied in more practical domains, including the analysis of medical images and the detection of faces. Convolutional Neural Networks (CNNs) significantly advanced the field in 2012, when CNNs demonstrated high performance in the ImageNet Large Scale Visual Recognition Competition [9] and emerged as the predominant learning method in CV. CNNs have shown expert-level performance in image classification, object detection, and semantic segmentation across diverse fields, including medicine, surveillance, and autonomous driving [10,11,12,13,14,15,16]. As machine learning models advanced, obtaining a sufficient quantity of labeled data became a challenge. In response, unsupervised techniques like clustering and dimensionality reduction were developed; these approaches exploit the inherent structure of data without relying on explicit guidance, addressing the scarcity of labeled data [17,18,19]. However, the adoption, and therefore the application, of CV in hospital settings has been slow. Its utilization in healthcare comes with limitations, as ethical and privacy concerns take precedence when involving humans, particularly humans in a vulnerable state (i.e., patients). As CV continues to develop, including the potential to assist in many aspects of patient care (e.g., documentation, recognition of a deteriorating patient), it is important to revisit how CV could be utilized for the benefit of patients and providers.
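To make the unsupervised techniques mentioned above concrete, the following is a minimal sketch, assuming scikit-learn and toy random data standing in for unlabeled images, of dimensionality reduction followed by clustering. It is illustrative only, not a pipeline from any of the cited studies.

```python
# Minimal sketch: clustering unlabeled image vectors after dimensionality
# reduction. Toy random data only; illustrative, not from the cited studies.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
images = rng.random((200, 4096))  # 200 unlabeled images flattened to vectors

# Dimensionality reduction: compress each image to 50 principal components.
features = PCA(n_components=50).fit_transform(images)

# Clustering: group images by similarity in feature space, no labels required.
cluster_ids = KMeans(n_clusters=5, n_init=10).fit_predict(features)
print(cluster_ids[:10])
```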
CV encompasses a multidisciplinary domain integrating advanced machine learning techniques, pattern recognition, and image processing to enable computers to comprehend the visual content present in images and videos [16,20,21,22]. Typically, CV algorithms start with the acquisition of data through cameras or sensors, followed by preprocessing and image enhancement. CNNs play a pivotal role in automatically learning representations of visual scene content and contribute significantly to various CV tasks. Due to their robust feature representation capabilities, CNNs have found widespread application as an effective method for extracting meaningful patterns and features [23,24,25]. Although training CNNs is complex because of their numerous layers, the AI field addresses these challenges by adopting transfer learning and fine-tuning techniques to enhance model efficiency and representation power [26,27,28]. Object detection, crucial in applications like medical diagnostics and autonomous driving, relies heavily on CNNs: a CNN backbone network extracts image features, and candidate regions determine target category and location information. CNNs also find application in object classification across medical imaging, security, and agriculture, among other industries. For instance, in medical imaging, CNNs extract relevant features for categorizing structures such as tumors, aiding diagnostic processes [29,30,31]. Through parallel processing, Graphics Processing Units (GPUs) (i.e., specialized computer chips) efficiently handle the extensive matrix operations essential to CNNs; this substantially reduces training times, thereby streamlining the implementation of real-time CNN applications [32]. The evolution and implementation of CV have undergone notable transformations in the utilization of deep learning models and the expansion of machine learning methods. These deep learning models and methods are outlined further in Table 1.
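As an illustration of the transfer learning and fine-tuning techniques described above, the following is a minimal PyTorch/torchvision sketch in which an ImageNet-pretrained ResNet is frozen and only its final layer is retrained. The dataset path and the two-class task are hypothetical placeholders, not from any study reviewed here.

```python
# Minimal transfer learning sketch: freeze a pretrained backbone, retrain the
# final layer. Dataset path and class count are hypothetical placeholders.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing so inputs match the pretrained backbone.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical dataset organized as data/chest_xrays/<class_name>/<image>.png.
dataset = datasets.ImageFolder("data/chest_xrays", transform=preprocess)
loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)

# Transfer learning: load an ImageNet-pretrained ResNet and freeze its features.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False

# Fine-tuning: replace and retrain only the final layer for a two-class task.
model.fc = nn.Linear(model.fc.in_features, 2)
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:  # one illustrative epoch
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

Freezing the backbone keeps the number of trainable parameters small, which is the main reason this approach remains workable with limited labeled medical data.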

3. Application of Computer Vision in Industry Outside of Healthcare

The use of CV was identified in 24 major industries including agriculture, engineering and manufacturing, retail, and education, among many others. The application of CV largely fell into four different themes or categories: job enhancement, surveillance and monitoring, automation, and augmented reality. These themes are outlined in Table 2 and provide insight into how CV could be applied in healthcare. Examples of job enhancement include the use of CV to analyze sporting events to inform referee calls [33], the scoring of diving competitions [34], and insurance appraisals to assist with claim reporting [10]. The use of CV in surveillance included the detection of forgery in artwork [35,36,37] as well as other industries, the prevention of cheating in academic and educational settings, and the enforcement of speed limits [14]. The theme of monitoring was the largest, including the use of CV to monitor agricultural crops for disease or insect infestations [15,38,39], detect restocking needs in warehouses or retail stores [40], and identify defective products on assembly lines [41]. The final category of augmented reality included technology such as Apple Vision [42], the ability to try clothes on virtually [43,44,45], and tools to assist people with vision impairment and blindness [46,47,48].

4. Computer Vision Application in Hospital Settings

Current medical applications of CV largely focused on monitoring (detection and measurement) and were mostly in the development and testing phases. The application or use of CV in the hospital included several commercial companies that specialized in patient monitoring for falls (Artisight: https://artisight.com/, accessed on 18 January 2024; CareAI: https://www.care.ai/sensors.html, accessed on 18 January 2024; Inspiren: https://inspiren.com/solutions/, accessed on 18 January 2024; Ocuvera: https://ocuvera.com, accessed on 18 January 2024; VirtuSense: https://www.virtusense.ai/, accessed on 18 January 2024) [88,89,90,91,92], Magnetic Resonance Imaging (MRI) and Computed Tomography (CT) support (Philips: https://www.philips.com/a-w/about/artificial-intelligence/ai-enabled-solutions, accessed on 18 January 2024; Silo AI: https://www.silo.ai/, accessed on 18 January 2024) [93,94], and patient and protocol monitoring, including hand sanitation (CareAI). CareAI also advertises automated natural language processing [88]. Published peer-reviewed literature on the effectiveness of these models or documentation of the implementation process into clinical care is scarce. Ocuvera provides an overview of pre- to post-implementation fall data showing significant differences in fall rates [90]. Inspiren and VirtuSense provide case studies and white papers that overview the technology [91,92]. Philips outlines the science behind its algorithms in a series of research articles [93]. Outside of these companies, we identified several studies reporting on the development and application of CV to assist with radiology exams (i.e., X-rays, MRIs, CTs, and PET scans) in detecting abnormalities signaling a disease process such as breast cancer [95,96,97]. The use of CV in radiology and histology is discussed next, as these tools are either applied in practice or are closer to application. Table 3 emphasizes key domains of CV advancement in healthcare, detailing the types of images and deep learning models used.

4.1. Radiology

The use of CV in radiology has gained increased attention as it can support timely intervention and enhanced efficiencies within clinical workflows. For example, a recent study demonstrated how CV could support surgeons in diagnosing wrist fractures in pediatric patients [154]. The motivation for this type of CV application is to expedite surgical care in low-resource environments where specialized radiologists are not readily available. Another example of a clinical application is notification software (AIDOC), which is used to detect intracranial hemorrhage [141]. Teleradiology networks can employ this software to expedite stroke workups in critical access hospitals or lower-level trauma centers. Other CV radiology applications concern detecting anomalies within medical images. A recent study by Lakhani et al. (2017) [99] employed deep models such as 2D-CNN, AlexNet [9], and GoogleNet [227] in their CV approach to detecting pulmonary tuberculosis in chest radiographs. Other studies applied ensemble learning for the diagnosis of Alzheimer’s Disease using MRI brain images [139,140,142]. The typical image types utilized for radiology segmentation include X-rays, CT scans (i.e., liver tumor [138]), MRIs (i.e., brain tumors [149]), and 4D-CT (i.e., brain tissue for stroke workup [148]). While the utilization of deep learning has resulted in precise detection rates in the field of radiology, these approaches require extensive, well-annotated datasets. Without such datasets, deep learning methods may experience overfitting, leading to a reduction in their generalizability [228]. There are a multitude of other examples of CV applications in radiology, as shown by recent reviews, including emergency radiography, stroke workup, and workflow efficiencies [228,229,230]. The implementation and use of these algorithms have been slower than expected, which may be due to a lack of standard user interfaces and differing expectations among clinicians and administrators, as reported in a recent study [231].

4.2. Histology

The examination of histological images by pathologists provides diagnostic information crucial for influencing a patient’s clinical outcome. Traditional histological image representation involved extracting texture and color features and employing conventional machine-learning approaches. However, the CV landscape has evolved with the extensive representational power offered by CNNs [127,134]. For example, Bejnordi et al. (2017) [112] demonstrated that the application of CNNs for detecting lymph node metastases in breast cancer outperformed eleven pathologists in a simulated time-constrained setting. Tellez et al. (2018) [113] developed a CNN-based CV approach capable of effectively detecting mitosis in Hematoxylin and Eosin (H&E) whole-slide images. Additionally, Kather et al. (2019) [136] showed that microsatellite-instable tumors in gastrointestinal cancer could be predicted directly from H&E histology using CNNs to classify tumor versus normal tissue. The present challenge involves handling high-resolution histological images, which requires substantial computational resources and extensive training sets. Employing transfer learning and knowledge distillation approaches may partially mitigate this challenge [232].

5. Development and Testing of Computer Vision in the Hospital Setting

Several CV-based tools were identified that had been shown to be effective in the testing and development phases; however, they were not yet put into practice at scale. These include the detection of behaviors and signs related to delirium, pain detection and monitoring, monitoring of sedation depth and signs of patient deterioration, mechanical ventilation, and monitoring of the care setting aimed at improving patient safety and quantifying workload. We provide an overview of these studies in this section.

5.1. Detection and Monitoring of Brain Health

Delirium, a type of acute brain dysfunction, occurs in 50–80% of critically ill patients [233]. An observational pilot study (n = 22) reported that delirious patients had significantly different expressions of facial action units, head pose movement, and other environmental factors, such as visitation frequency, as measured by CV [156]. A different study examining the frequency of caregiver actions reported that delirious patients had more caregiver activity overall, most concentrated from 8:00 p.m. to 12:00 a.m. [195]. These observational differences between non-delirious and delirious patients could be automated to aid in the recognition of early warning signs of delirium, the measurement of delirium severity, and subtyping and phenotyping efforts.

5.2. Detection and Quantification of Pain, Agitation, and Level of Sedation

CV has been used to detect pain in a variety of patient populations (infants to aging adults), specific disease states (lung cancer, dementia, chronic back pain, shoulder pain), and after procedures (procedural pain in infants), mostly in community or outpatient settings. A recent scoping review identified one study that tested the feasibility of an automated approach to pain assessments using a deep-learning method in a population of critically ill patients [176,177]. The study tested the accuracy of models to dichotomize pain and rate pain on three levels (0–2), reporting >0.85 accuracy for dichotomized models compared to 0.56 for the three-level model [176].
Automated pain recognition and monitoring have been widely explored in neonatal populations. Several reviews report on CV developed over the past two decades aimed at the recognition of pain, monitoring, and the measurement of intensity [159,163,171,179]. These feasibility studies and developed models have culminated in a point-of-care mobile health application for procedural pain named PainChek Infant [172]. In a feasibility study of forty infants (age range 2.2–6.9 months), the mobile health application correlated significantly (r = 0.82–0.88, p < 0.0001) with clinician-administered pain scales (Neonatal Facial Coding System and Observer-administered Visual Analogue Scale) and demonstrated high interrater reliability (ICC = 0.81–0.97, p < 0.001) and high internal consistency (alpha = 0.82–0.97) [172]. This type of technology could be applied to adult hospitalized patients to improve pain monitoring and assist clinicians with follow-up pain assessments after the administration of analgesics, especially in patient populations that are not able to verbalize their discomfort and pain level. Furthermore, this type of state-of-the-art technology was emphasized by a recent narrative review that discussed the updated definition of pain from the International Association for the Study of Pain and how multidimensional technologies are needed to improve the identification and monitoring of pain [234]. Untreated pain can result in delirium, agitation, hostility, and other adverse consequences such as impaired healing and increased mortality risk [235,236,237]. The use of CV, paired with other artificial intelligence modalities, clinicians, and patients within a model, could improve the proactive recognition and monitoring of pain in hospital environments across populations.
Proof-of-concept CV models have been developed to recognize and monitor facial and body movements associated with agitation during sedation, such as grimacing. One study used a volunteer to simulate agitation through different levels of grimacing, producing a proof-of-concept algorithm that can now be tested in critically ill patients [158]. Another study used a volunteer to simulate limb movement during agitated episodes and then tested this model on five ICU patients; the model’s output correlated over time with a nurse-administered Riker Sedation-Agitation Scale and physiologic signs including heart rate, heart rate variability, systolic blood pressure, and blood pressure variability [157]. Lastly, one study used eye movements to facilitate the measurement of sedation and consciousness levels in young children and infants [168].

5.3. Patient Deterioration

Facial action units comprehensively encode facial movements and have been utilized to detect emotions such as anxiety, stress, fear, or pain [164,166,173]. Giannakakis et al. (2022) [175] illustrated that facial action units are associated with stress levels: during stressful situations, specific action units, such as cheek raising, lid tightening, and upper lip raising, intensify.
The early warning signs of an impending acute patient deterioration are often subtle and overlooked by busy clinical staff, leading to a delay in the escalation of care [238]. To address this limitation, a recent feasibility study examined how subtle changes in facial expressions were associated with future admission to the intensive care unit. This study used CV to identify facial action units and reported that combinations of the upper face, head position, eye position, lips and jaw, and the lower face were associated with an increased likelihood of admission to the intensive care unit (n = 34 patients) [181]. This algorithm could be used to proactively identify patients at risk of acute deterioration and support early intervention [181]. In a post hoc analysis of these data, a decrease in the number of facial expressions (per time unit) and an increase in the diversity of facial expressions predicted admission to the ICU (AUC = 0.81) [182]. Numerous opportunities exist to expand on this CV algorithm and investigate how other signs, such as the frequency of clinician visits to the patient’s room or the presence of certain respiratory devices [183], could improve early recognition of impending acute deterioration.

5.4. Mechanical Ventilation

CV has been applied to the field of mechanical ventilation to estimate regional lung volumes, using light to reconstruct the motion of the lungs and measure the regional pressure distribution [184]. This proof-of-concept model was developed by Zhou et al., who used a light projector and cameras to measure and monitor chest expansion on a mannequin. Their model, based on surface reconstruction of regional chest expansion, showed good accuracy, with an error of 8 mL at a 600 mL tidal volume. They compared their method with other frequently used computational models and reported a 40% reduction in computational cost paired with improved accuracy. This work still needs to be clinically tested and validated.

5.5. Mobility

The use of CV to monitor and document physical activity and early mobilization in critically ill patients was reported in two studies. In the first, CV was used to develop a tool for real-time patient mobility monitoring in the ICU [187]. Yeung et al. (2019) reported a mean sensitivity of 87.2% and specificity of 89.2%, with an AUROC of 0.938, for detecting mobility activities (i.e., getting out of bed, getting into bed, getting out of a chair, getting into a chair). The CV model had an accuracy of 68.8% for quantifying the number of healthcare personnel involved in each activity [187]. The second study, by Reiter et al. (2016), developed an automated mobility sensor to support the monitoring of patient activity in the surgical ICU. They compared the algorithm’s performance with clinician performance in identifying physical activity and reported high inter-rater reliability, with a weighted Kappa score of 0.86 [186]. These types of models could automate the documentation of physical activity in hospitalized patients, decreasing clinician documentation burden and increasing the accuracy of electronic health records.

5.6. Patient Safety

Several studies have developed models focused on improving patient safety using CV as a monitoring tool. One focus is hand hygiene, an essential component of infection prevention. While compliance is critical for patient safety, monitoring clinician performance is time-consuming and can be inaccurate, as it requires human observation of care procedures both inside and outside the patient room. CV provides an opportunity to automate monitoring. This use case has been demonstrated using depth sensors in a hospital unit and video and depth images in a simulated hospital room [192,220]. Both models achieved sensitivity and specificity greater than 90% in detecting hand hygiene dispenser use and performed better than human observers. Beyond monitoring, future studies could explore how the model could provide real-time feedback or hand hygiene reminders to clinicians, creating further opportunities to improve patient safety.
Surgical procedures and operating rooms (ORs) are the focus of a recent review that highlights how CV could improve patient safety and system efficiencies [198]. A recent study used off-the-shelf camera images to measure the level of situational awareness of surgical teams during timeout procedures in the OR. The model distinguished between teams with good and poor situational awareness, substantiating existing studies in the OR on the use of CV to augment traditional human-based observation assessments [194,198]. Other CV-based models have aided in surgical phase recognition, robot-assisted surgeries, surgical skill assessment, detection of instruments or lesions during surgery, enhanced visual displays in surgeries, and navigation during surgical procedures [194,198,199].

5.7. Quantification of Workload in the ICU

A few studies have demonstrated the feasibility of ambient monitoring of caregiving activities in the ICU using CV. The first study performed task recognition of caregiving activities over 5.5 h with an accuracy of 70% in a pediatric ICU [226]. The recognized tasks included documentation, observation, and procedures, among others, and were then examined over time for trends [226]. The second study recognized and then categorized patient and caregiver movement (i.e., workload) over the course of 24 h in an adult ICU [195]. The study reported significant differences in patient and caregiver movement throughout the 24 h period, between intubated and non-intubated patients, between delirious and non-delirious patients, and between settings (high-dependency unit vs. ICU). Another study developed and validated a Clinical Activity Tracking System (CATS), testing its use in both a simulated and an actual ICU room. As in the previous study, more caregiving activity was reported between 7:00 a.m. and 11:00 p.m. than between 11:00 p.m. and 7:00 a.m. [221]. This system was validated against manual observation with a correlation of r = 0.882 [222]. These studies focused on improving the quantification and understanding of caregiver workload over time, as existing monitoring systems are resource-intensive and subjective [195,222,226].

6. Computer Vision in Outpatient and Community Settings

CV has been developed in the community and outpatient settings to detect, measure, and monitor patient symptoms, signs of underlying illness or disease, and patient events such as falls. A recent survey identified over thirty different CV models developed to automatically detect underlying symptoms related to medical diagnoses [162]. These include monitoring vascular pulse, pain, facial paralysis, neurologic conditions, neurodevelopmental disorders, psychiatric disorders (i.e., attention deficit hyperactivity disorder (ADHD), autism, depression), and mandibular disorders, among others [162]. This type of computer-assisted diagnosis ranges from the detection of facial shape, facial features, and facial muscular response to voluntary emotion and facial motion analysis. In this section, we briefly overview the subject areas with the highest level of development.

6.1. Pain Detection and Monitoring in Community Settings

The automatic recognition and monitoring of pain in the community and outpatient setting is well established. A systematic review in 2021 (n = 76 studies) reported on the use of CV to diagnose and treat chronic lower back pain [167]. Also in 2021, a survey of automated pain detection summarized studies and CV methods across populations, providing an in-depth overview of the datasets, learning approaches, spatial representations, and machine learning methods used [170]. A narrative review highlighted the state-of-the-art technology published on pain detection and monitoring [234]. Lastly, a scoping review reported on several community and outpatient models to detect and monitor pain using CV [177]. A recent study used developments in CV automated segmentation and deep learning, along with the updated definition of pain from the International Association for the Study of Pain, to develop a sentiment analysis system within a Smart Healthcare System for pain monitoring [180]. This CV model, and most models mentioned in the reviews, still need to be prospectively tested [170].

6.2. Neurologic, Neurodevelopmental, and Psychiatric Disorders

Autism spectrum disorder, a neurodevelopmental disorder, is increasingly prevalent in pediatric populations [239,240]. Time to diagnosis and receipt of needed care and resources can be delayed by months, leading to deficiencies in care. To address this gap in clinical care, an interactive mobile health technology was developed through a series of studies that uses CV in a closed-loop system to automatically code signs and behaviors associated with autism [202]. The intent is for parents to use this technology at home to improve the early recognition of autism and access to needed resources and care. This use case and framework could be expanded to include additional neurodevelopmental disorders with similar impact. A different study developed a CV model that could differentiate between individuals with autism spectrum disorder, individuals with ADHD, and healthy controls, using head motion and facial expression to distinguish between these disorders [200].
The detection and severity rating of depression have been automated using CV in a few studies [174,178]. Depression is one of the most common psychiatric disorders and is often underrecognized, leading to delays in patient care and decreased quality of life. The application of CV to detect and measure depression could have widespread implications, leading to earlier detection and allocation of resources to improve patient care. For example, such an algorithm could be applied in telehealth outpatient visits where depression may not otherwise be discussed, or during a hospitalization where situational depression can increase patient stress and lead to prolonged hospitalization and readmissions. In addition to depression, one study compared clinician-rated and computerized recognition of blunted facial affect [209,241].
To detect facial weakness, Zhuang et al. (2020) developed a CV model using images and videos of people collected from Google Images and YouTube [203]. Six neurologists labeled the images using a rating scale (likelihood of facial weakness). Following model development, the authors concluded that the combination of landmark (i.e., shape) and intensity (i.e., gradient texture) features led to the highest accuracy; the neurologists’ labels made it possible to learn these features [203]. Facial palsy detection has been the focus of several studies. Guarin et al. (2018) retrained a CV facial landmark detection model, previously trained on healthy individuals, with images of facial palsy patients to develop a more accurate model [201]. Ruiter et al. (2023) studied the use of facial recognition software to identify patterns of facial weakness; a deep learning model was trained for classification and disease severity rating in a cohort of myasthenia gravis patients [204]. The images used for training were collected in the outpatient setting. The area under the curve was 0.82 for the diagnosis of myasthenia gravis and 0.88 for disease severity [204]. Another recent study assessed the intensity of facial weakness in patients with facial palsy, classifying intensity into three levels by focusing on specific facial landmarks. The accuracy of detecting palsy patients was 95.61%, and the accuracy for class assignment (intensity level) was 95.58% [205].
To improve virtual interactions and patient education efforts, a CV algorithm was developed to detect changes in facial expressions indicative of confusion, and its accuracy was compared to that of forty medical students [189]. The human raters identified confusion with 41% accuracy, compared to 72% accuracy for the CV algorithm, which used four different facial action units (lowered brow, raised cheek, stretched lips, and lip corner pulled).

6.3. Falls

With the increasing population of aging adults and their preference to live at home, falls at home contribute to a significant increase in the risk of morbidity and mortality [242]. To address the increasing incidence of falls, CV technology has been developed to detect and monitor risks and signs of falls in the home environment. A literature review completed in 2023 surveyed the use of ambient sensors to detect falls in the home environment; while some studies have used CV to detect falls, other systems are a combination, or hybrid, of wearable and ambient sensor technologies [216]. One example of CV fall detection was developed by Joshi et al. (2017) [215]. A single camera was used to detect four different movements indicative of a fall event, and notifications were sent via email to a designated individual if a fall was detected. The CV model achieved an accuracy of 91.8% [215]. This model, and the majority identified by the recent review, focused on the detection of falls rather than on the prediction or identification of early warning signs [216].

7. Journey Mapping and Future Computer Vision Application

To investigate how CV could be applied in the hospital setting, the temporal journeys of the clinician and patient through the healthcare system were mapped and analyzed for opportunities. The overview of the journey map is illustrated in Figure 1. Many identified opportunities to incorporate CV into patient care and the healthcare system overlapped. These identified opportunities are displayed in Figure 2. For example, the use of facial recognition technology to automate patient check-ins in the outpatient and inpatient settings improves the efficiency of the system while also providing a smoother process for the patient. The monitoring of parking lots for available patient parking and the use of interactive displays to provide directions to clinic or hospital appointments benefit the system and the patient. Individuals who were responsible for patient check-in could instead meet, greet, and accompany a patient on their clinic or hospital journey to improve the coordination of care. The integration of CV into clinic and hospital rooms could improve the monitoring of patient conditions, resulting in early detection of acute deterioration or patient discomfort, assist with diagnostic testing, provide real-time feedback on the effectiveness of interventions to ameliorate patient discomfort, and complete auto-documentation of patient care procedures. These examples benefit the clinician, patient, and efficiency of the system. When taking ideas such as these from concept to development, it is crucial to identify who the end-user is, who benefits from the model or service (which may be different than the end-user), what efficiencies are improved, and what unintended consequences may result once the algorithm is in production. Additionally, the privacy, safety, and ethical principles and values must be considered. These are discussed next.

8. Summary and Implications for Computer Vision Use in Healthcare

This rapid review identified a few applications of CV in the hospital setting. Most of the CV in hospitals is still in the feasibility and proof-of-concept stage, lagging behind other healthcare settings and industries. This gap in CV application in hospital settings is likely due to the limited availability of public datasets to train and develop models, data privacy and security needs, ethical considerations, and barriers inherent within a complex system. We will discuss these limitations, including ethical and economic considerations. Our journey mapping exercise identified many future opportunities for CV in the hospital and outpatient settings. As future opportunities are considered, it is critical to understand what problem the CV aims to solve, the stakeholders involved in using it, how privacy, safety, and ethical concerns are addressed, and the potential unintended consequences of its use in these settings.

9. Data Privacy and Safety Considerations

Before using a CV model in the hospital setting, it is crucial to consider the data privacy and patient safety requirements. Privacy has multiple meanings that depend on the stakeholder’s perspective [243]. A recently published meta-synthesis highlights the perspectives patients and health professionals share on the benefits and risks of artificial intelligence (AI) in healthcare [244]. A theme identified by patients and clinicians involved the importance of data security and use. Both stakeholder groups shared how the storage and protection of these data are essential to prevent records from being hacked and/or leaked. Further, the meta-synthesis reported that the unwarranted use of these data for commercial purposes was a significant concern [244]. These concerns relate to overall AI use in healthcare and are not specific to CV.
Privacy and data management concerns unique to CV center on the nature of ambient intelligence, how it is applied, and what information is captured in the video images [245]. As a recent perspectives article highlighted, it is important to collect the minimum amount of information needed to train and use the model [246]. This could mean using black-and-white images instead of color, or blurring or removing unnecessary regions that do not contribute needed information. It is also important to consider the inclusion of individuals who are not the focus of the model. For example, the patient may be the focus of data collection, but clinicians and visitors in the hospital room may also be included in the video capture. Everyone and everything within the image field is included in the data collection. It is important that all individuals who may enter the room are either consented (i.e., patient) or informed (i.e., clinician) of the data collection and of the privacy protections in place [246]. If the collected data are being considered for other purposes and bystanders could be reidentified, there should be a process in place to notify those individuals and gain their consent. These concerns emphasize the importance of data management (storage and use) within research studies, production teams, and the healthcare system.
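As a simple illustration of the data minimization described above, the following sketch blurs a bystander region in a captured frame using OpenCV. The file names and region coordinates are hypothetical placeholders; in practice, the region would come from a person or face detector.

```python
# Minimal sketch: redact a bystander region by blurring. File names and
# coordinates are hypothetical; a detector would supply the box in practice.
import cv2

frame = cv2.imread("frame.png")  # hypothetical captured video frame
x, y, w, h = 50, 80, 120, 160    # hypothetical bystander bounding box

# Replace the region with a heavy Gaussian blur so the person is not identifiable.
frame[y:y + h, x:x + w] = cv2.GaussianBlur(frame[y:y + h, x:x + w], (51, 51), 0)
cv2.imwrite("frame_redacted.png", frame)
```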
It is imperative to obtain informed consent or assent for data collection, detailing the why, how, and when behind the collection and the privacy protections in place, particularly when capturing images of patients in a very vulnerable state. Decision points can be built into the consent process, allowing the patient to opt in to the use of their image data for other purposes. For example, an opt-in for sharing their images with external institutions or scientists can help facilitate the development of public image datasets that may accelerate CV development. It is important to clarify that once a model is in production, images do not need to be retained, as the model can operate as a closed-loop system. This may limit transparency, or the ability to review the algorithm to understand its output, but it does improve privacy protections.
Safety considerations for CV in hospitals are multi-factorial. It is essential to consider the end-user of the model. How will the end-user use the information provided by the model in their decision-making? Who is responsible for maintaining the model to ensure its accuracy? CV models can improve patient and clinician safety. For example, a model could recognize the early warning signs of workplace violence and notify clinicians to improve their situational awareness and implement mitigation measures to prevent verbal or physical abuse. Another example, demonstrated by proof-of-concept models in the operating room setting, is the detection of missed care, poor situational awareness, or procedural errors. Both examples would improve patient and clinician safety [194,198]. On the other hand, CV models could decrease the safety of patient care. Previous studies have shown how bias is readily introduced into models if the training data are not representative of a diverse population [247]. These biases can lead to embedded stereotypes, discrimination, and the exclusion of certain patients [248]. Current deep learning models employed in CV tasks derive their knowledge directly from the training data; consequently, model performance is heavily influenced by the distribution of the training data. If bias is present in the training set, the model learns it as significant context, impairing its ability to generalize to unseen examples. Numerous studies in the literature have explored methods to extract bias-independent feature embeddings, resulting in enhanced performance of neural networks trained on biased datasets [249,250,251,252]. These methods can be integrated into model development, along with representative sampling, to minimize the risk of bias.
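One simpler, widely used baseline for mitigating imbalance-driven bias (distinct from the bias-independent embedding methods cited above) is to reweight the training loss by inverse class frequency. The following minimal PyTorch sketch shows this; the label counts are hypothetical.

```python
# Minimal sketch: inverse-frequency class weighting of the training loss.
# A common baseline, not the embedding methods cited; counts are hypothetical.
import torch
import torch.nn as nn

# Hypothetical label counts from an imbalanced training set (90% vs. 10%).
class_counts = torch.tensor([900.0, 100.0])

# Weight each class inversely to its frequency so the minority class is not ignored.
weights = class_counts.sum() / (len(class_counts) * class_counts)
criterion = nn.CrossEntropyLoss(weight=weights)

# Example: the weighted loss penalizes minority-class errors more heavily.
logits = torch.randn(8, 2)
labels = torch.randint(0, 2, (8,))
print(criterion(logits, labels))
```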

9.1. Ethical Considerations in Computer Vision

The use of CV in healthcare has broad ethical considerations that need to be addressed as algorithms and models are designed, developed, tested, and deployed. Each stage of the algorithm, including maintenance, should be considered and continually re-evaluated to ensure the medical ethics of autonomy, beneficence, non-maleficence, and justice are upheld for the end-user. It is also important to define and consider who the end-user is (i.e., patient, a decision-maker for the patient, clinicians, administration, support staff) and to proactively address ethical concerns [253]. Depending on the end-user and the circumstances of the technology’s use, different ethical principles or values may need to be considered. A recent scoping review identified eleven different ethical principles and values concerning the use of artificial intelligence [254]. These include transparency, justice and fairness, non-maleficence, responsibility, privacy, beneficence, freedom and autonomy, trust, sustainability, and solidarity [254]. Similar themes, along with societal implications, were summarized in a recent narrative review by Elendu et al. (2023) [255]. Inherent within these principles is the importance of placing the patient at the forefront and ensuring that every patient has a fair and equitable opportunity to benefit from the technology [255]. This priority encapsulates the responsibility to ensure that the model was built on a representative dataset that generalizes broadly, i.e., that any risk of bias, discrimination, and stereotyping is minimized and the welfare of the patient is prioritized. To accomplish these goals, it is important to partner with a medical ethicist, sociologist, or patient-community stakeholder group to evaluate the technology from multiple viewpoints within an ethical framework [255]. Essential questions to work through prior to the inception of the project include the intent of the model; who will use it and who will benefit from it; how it will be implemented and maintained; its acceptability and usability; the transparency of the algorithm and resulting decisions; who holds ultimate responsibility for its performance; and how unintended consequences will be identified, tracked, and evaluated.

9.2. Economic Considerations

Job replacement and loss are significant concerns regarding the application of artificial intelligence, including CV [256]. To ensure safety and ethical considerations are followed, it is important to build “human-in-the-loop” models that use CV as a tool to inform decisions; the human remains the critical decision-maker on how to use the information provided [257]. CV should enhance processes to improve decision-making, efficiencies within the system, and patient outcomes; it should not replace humans. A framework for evaluating the implications of automation in artificial intelligence was shared in a recent working paper by the National Bureau of Economic Research [256]. This paper discusses the balance between potential displacement and increased demand in non-automated labor tasks that could enhance the human experience. This type of framework is important for healthcare systems to use as the adoption of CV is considered. A recent review studied how artificial intelligence models could result in healthcare cost savings over several years [258]. Although the authors reported significant cost savings with the use of artificial intelligence for diagnosis and treatment, they highlighted that a major disadvantage of artificial intelligence is the prioritization of accuracy over clinical evaluation and scientific validation.

9.3. Acceptability and Readiness for Computer Vision

The implementation of artificial intelligence in healthcare is impacted by public opinion. A comprehensive review published by Bekbolatova et al. (2024) highlights the results of Pew Research surveys, emphasizing the correlation between familiarity with artificial intelligence and the expressed potential for it to benefit healthcare [259]. While readiness for artificial intelligence is growing, so is the need to address specific knowledge gaps within the community to increase familiarity with artificial intelligence tools. Parallel efforts are needed to develop a comprehensive understanding of legislation and guidelines for the responsible use of artificial intelligence in healthcare [259]. A recent 10-question survey focused on the use of CV in healthcare was completed by 233 providers and 50 patients and family members; the potential use of CV data in lawsuits (a concern for 81% of clinicians) and privacy breaches (a concern for 50% of patients) were major areas of concern [245]. Future work should focus on further exploring provider, patient, and public perceptions and knowledge needs regarding CV.

9.4. Data Needs and Considerations

Despite the impressive performance of deep learning models on general datasets, achieving accurate results in the medical domain remains challenging. This difficulty arises mainly from the substantial number of parameters in each layer of CNN models. When a sufficient amount of data is available, as in large CV datasets like ImageNet [260] (1 million images), the model generalizes better and overfitting is mitigated. However, acquiring a sufficient sample of labeled data for model training within the healthcare system, enough to produce models that are generalizable and statistically fit, can be prohibitively expensive. One potential solution to overfitting is employing models with fewer parameters [261,262], but these compact models often struggle to capture intricate features of the dataset, reducing detection or classification accuracy. To cope with the scarcity of labeled data, data augmentation is used to generate additional training data [114,263]; while this approach partially resolves the problem, the repetition of images may itself lead to overfitting. Another strategy is transfer learning, where the model is initially trained on a large dataset with available labels and then fine-tuned on the smaller medical dataset [115,264], leveraging pre-existing knowledge from the larger dataset to enhance performance on the medical data. Each of these solutions involves a trade-off in model performance and must be weighed in the development and testing stages.

Another option to scale the development of CV models in medicine is to use available deep learning frameworks to classify, segment, and detect specific structures or abnormalities. Detectron2 [265], developed by Facebook AI Research (FAIR), offers a high-quality implementation of state-of-the-art object detection and segmentation models. MMDetection [266], another open-source PyTorch library, facilitates the use of pre-trained state-of-the-art models and their training on medical datasets. Torchvision, an official PyTorch library, provides general models that can be tailored to medical domain datasets. OpenPose [267] stands out as one of the first open-source, real-time multi-person models designed to identify human body structures, body key points, and facial and hand features in visual footage. These are a sampling of available deep learning frameworks, and it is important to consider how they were developed and validated before using them to build subsequent CV models. Lastly, training CV models demands significant computing resources and expertise, including GPUs and AI specialists, which may not be readily available at every institution in resource-constrained environments. In light of these challenges, many clinician-scientists opt for traditional machine learning methods, like logistic regression, which limits CV model development. Future CV studies may explore how federated learning could expand datasets and computational resources [268].
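As a concrete illustration of the data augmentation strategy discussed above, the following minimal torchvision sketch generates randomly perturbed variants of a small labeled image set; the dataset path is a hypothetical placeholder.

```python
# Minimal data augmentation sketch for a scarce labeled image set.
# The dataset path is a hypothetical placeholder.
from torchvision import datasets, transforms

# Random flips, small rotations, and mild color jitter synthesize plausible
# variants of each training image, stretching a scarce labeled dataset.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=10),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Every epoch sees a differently perturbed copy of each image, which reduces
# (but does not eliminate) overfitting.
train_set = datasets.ImageFolder("data/histology_train", transform=augment)
```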

9.5. Computer Vision Datasets

Object detection datasets typically consist of images with annotated bounding boxes and segmented areas depicting objects of interest. The Pascal Visual Object Classes (VOC) [269] dataset stands out as a well-known benchmark, featuring 5000 images across 20 object classes with 12,000 annotations. Another widely used benchmark, the Common Objects in Context (COCO) [270] dataset, offers a substantial dataset of 164,000 images covering 80 object classes, accompanied by 897,000 annotations, encompassing both indoor and outdoor environments. However, in the context of hospital environments, there is currently a lack of sufficient datasets capturing diverse objects under various conditions. For example, the MCIndoor2000 [223] dataset includes 2055 images of three object classes including doors, stairs, and hospital signs. The MYNursingHome [224] dataset focuses on object classification and detection in nursing homes, containing 37,500 images featuring objects commonly found in elderly home care centers, such as toilet seats, tables, and wheelchairs. The Hospital Indoor Object Detection (HIOD) dataset comprises 4417 images covering 56 object categories, including items like surgical lights, IV poles, and bedside monitors, with a total of 51,869 annotations. On average, the images in this dataset contain 10 objects spanning 6.8 object categories [225]. There are several datasets available for medical imaging purposes. The website https://www.cancerimagingarchive.net/browse-collections/, accessed on 16 February 2024, holds several publicly available datasets.
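For readers who want to experiment with detection data in the COCO annotation convention described above, the following is a minimal torchvision loading sketch. The directory layout is a hypothetical placeholder; the hospital datasets cited may be distributed in other formats.

```python
# Minimal sketch: loading a COCO-format detection dataset with torchvision.
# Paths are hypothetical placeholders for any COCO-style annotated dataset.
from torchvision import datasets, transforms

coco = datasets.CocoDetection(
    root="data/hospital_objects/images",           # hypothetical image directory
    annFile="data/hospital_objects/annotations.json",  # COCO-style JSON file
    transform=transforms.ToTensor(),
)

image, targets = coco[0]
# Each target is one annotated object: a category id and an [x, y, width, height] box.
for t in targets:
    print(t["category_id"], t["bbox"])
```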
This dearth of public datasets is illustrated best by examining the large amount of literature and models developed in neonatal populations. The culmination of this work over the past two decades has led to a point-of-care mobile application for procedural pain that has passed the feasibility stage [172]. This type of technology could greatly improve pain management not only in neonatal populations but also in adult populations. Several studies aimed at identifying chronic or outpatient pain have used the UNBC-McMaster Pain Archive [160,161,165]. While these images have aided in the development of automated models for pain detection and monitoring in adult outpatient populations, they have not facilitated the expansion of such models into the acute care setting. Public datasets of hospitalized patients across age groups are needed to facilitate this type of modeling [172].

9.6. Limitations of This Review

This review used a broad search strategy. That being said, a systematic review approach was not used, and it is possible that some studies involving CV were not included. The CV field is rapidly expanding; as a result, this review is limited in scope and strives to highlight advances in CV for healthcare clinicians and clinician-scientists (i.e., end-users of the technology).

10. Conclusions

This review summarizes the application of CV in healthcare and highlights important considerations for its use, including privacy, safety, and ethical factors. The overall goal is to improve the patient and clinician journey within the industry. There continues to be a paucity of data with which to train CV, and substantial work is needed to overcome the barriers of privacy and safety before its application in healthcare catches up to that in other industries.

Funding

Heidi Lindroth is funded by the National Institutes of Health, National Institute on Aging, Career Development Award NIA/NIH K23AG076662.

Institutional Review Board Statement

This review is based on published research and did not collect any patient identifiers. IRB approval was not needed.

Informed Consent Statement

Not applicable. No prospective data collection was completed for this review paper.

Data Availability Statement

Search results and datasheets of extracted data are available upon request to the corresponding author.

Conflicts of Interest

All other authors declare no conflict of interest.

References

  1. McDonald, C.J.; Overhage, J.M.; Tierney, W.M.; Dexter, P.R.; Martin, D.K.; Suico, J.G.; Zafar, A.; Schadow, G.; Blevins, L.; Glazener, T.; et al. The Regenstrief Medical Record System: A quarter century experience. Int. J. Med. Inform. 1999, 54, 225–253. [Google Scholar] [CrossRef]
  2. Evans, R.S. Electronic Health Records: Then, Now, and in the Future. Yearb. Med. Inform. 2016, 25 (Suppl. S1), S48–S61. [Google Scholar] [CrossRef]
  3. Tobin, M.J. Why Physiology Is Critical to the Practice of Medicine: A 40-year Personal Perspective. Clin. Chest Med. 2019, 40, 243–257. [Google Scholar] [CrossRef] [PubMed]
  4. Prince, S.J. Computer vision: Models, Learning, and Inference; Cambridge University Press: Cambridge, UK, 2012. [Google Scholar]
  5. Burton, A.; Radford, J. Thinking in Perspective: Critical Essays in the Study of Thought Processes; Taylor & Francis: Oxfordshire, UK, 2022. [Google Scholar]
  6. Horn, B.K.; Schunck, B.G. Determining optical flow. Artif. Intell. 1981, 17, 185–203. [Google Scholar] [CrossRef]
  7. Belongie, S.; Carson, C.; Greenspan, H.; Malik, J. Color-and texture-based image segmentation using EM and its application to content-based image retrieval. In Proceedings of the Sixth International Conference on Computer Vision, Bombay, India, 7 January 1998; IEEE: Piscataway, NJ, USA, 1998; pp. 675–682. [Google Scholar]
  8. Kirby, M.; Sirovich, L. Application of the Karhunen-Loeve procedure for the characterization of human faces. IEEE Trans. Pattern Anal. Mach. Intell. 1990, 12, 103–108. [Google Scholar] [CrossRef]
  9. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105. [Google Scholar] [CrossRef]
  10. Hansen, U.S. 6 Use Cases for Computer Vision in Insurance. Available online: https://encord.com/blog/computer-vision-use-cases-insurance/ (accessed on 5 January 2024).
  11. Dong, S.; Wang, P.; Abbas, K. A survey on deep learning and its applications. Comput. Sci. Rev. 2021, 40, 100379. [Google Scholar] [CrossRef]
  12. Cui, Y.; Chen, R.; Chu, W.; Chen, L.; Tian, D.; Li, Y.; Cao, D. Deep learning for image and point cloud fusion in autonomous driving: A review. IEEE Trans. Intell. Transp. Syst. 2021, 23, 722–739. [Google Scholar] [CrossRef]
  13. Nawaratne, R.; Alahakoon, D.; De Silva, D.; Yu, X. Spatiotemporal anomaly detection using deep learning for real-time video surveillance. IEEE Trans. Ind. Inform. 2019, 16, 393–402. [Google Scholar] [CrossRef]
  14. Lad, A.; Kanaujia, P.; Soumya, P.; Solanki, Y. Computer Vision enabled Adaptive Speed Limit Control for Vehicle Safety. In Proceedings of the 2021 International Conference on Artificial Intelligence and Machine Vision (AIMV), Gandhinagar, India, 24–26 September 2021; pp. 1–5. [Google Scholar]
  15. Sinshaw, N.T.; Assefa, B.G.; Mohapatra, S.K.; Beyene, A.M. Applications of Computer Vision on Automatic Potato Plant Disease Detection: A Systematic Literature Review. Comput. Intell. Neurosci. 2022, 2022, 7186687. [Google Scholar] [CrossRef]
  16. Szeliski, R. Computer Vision: Algorithms and Applications; Springer Nature: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
  17. Le, Q.V. Building high-level features using large scale unsupervised learning. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 8595–8598. [Google Scholar]
  18. Baldi, P. Autoencoders, unsupervised learning, and deep architectures. In Proceedings of the ICML Workshop on Unsupervised and Transfer Learning, Bellevue, WA, USA, 2 July 2011; pp. 37–49. [Google Scholar]
  19. Srivastava, N.; Mansimov, E.; Salakhudinov, R. Unsupervised learning of video representations using LSTMs. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; PMLR: London, UK, 2015; pp. 843–852. [Google Scholar]
  20. Liu, L.; Ouyang, W.; Wang, X.; Fieguth, P.; Chen, J.; Liu, X.; Pietikäinen, M. Deep learning for generic object detection: A survey. Int. J. Comput. Vis. 2020, 128, 261–318. [Google Scholar] [CrossRef]
  21. Tan, M.; Pang, R.; Le, Q.V. Efficientdet: Scalable and efficient object detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10781–10790. [Google Scholar]
  22. Jaeger, P.F.; Kohl, S.A.; Bickelhaupt, S.; Isensee, F.; Kuder, T.A.; Schlemmer, H.-P.; Maier-Hein, K.H. Retina U-Net: Embarrassingly simple exploitation of segmentation supervision for medical object detection. In Proceedings of the Machine Learning for Health Workshop, Virtual, 11 December 2020; PMLR: London, UK, 2020; pp. 171–183. [Google Scholar]
  23. Jogin, M.; Madhulika, M.; Divya, G.; Meghana, R.; Apoorva, S. Feature extraction using convolution neural networks (CNN) and deep learning. In Proceedings of the 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bangalore, India, 18–19 May 2018; IEEE: Piscataway, NJ, USA; pp. 2319–2323. [Google Scholar]
  24. Cai, G.; Wei, X.; Li, Y. Privacy-preserving CNN feature extraction and retrieval over medical images. Int. J. Intell. Syst. 2022, 37, 9267–9289. [Google Scholar] [CrossRef]
  25. Yang, A.; Yang, X.; Wu, W.; Liu, H.; Zhuansun, Y. Research on feature extraction of tumor image based on convolutional neural network. IEEE Access 2019, 7, 24204–24213. [Google Scholar] [CrossRef]
  26. Tajbakhsh, N.; Shin, J.Y.; Gurudu, S.R.; Todd Hurst, R.; Kendall, C.B.; Gotway, M.B.; Liang, J. On the necessity of fine-tuned convolutional neural networks for medical imaging. In Deep Learning and Convolutional Neural Networks for Medical Image Computing: Precision Medicine, High Performance and Large-Scale Datasets; Springer: Berlin/Heidelberg, Germany, 2017; pp. 181–193. [Google Scholar]
  27. Dutta, P.; Upadhyay, P.; De, M.; Khalkar, R. Medical image analysis using deep convolutional neural networks: CNN architectures and transfer learning. In Proceedings of the 2020 International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India, 26–28 February 2020; IEEE: Piscataway, NJ, USA; pp. 175–180. [Google Scholar]
  28. Lee, K.-S.; Kim, J.Y.; Jeon, E.-t.; Choi, W.S.; Kim, N.H.; Lee, K.Y. Evaluation of scalability and degree of fine-tuning of deep convolutional neural networks for COVID-19 screening on chest X-ray images using explainable deep-learning algorithm. J. Pers. Med. 2020, 10, 213. [Google Scholar] [CrossRef] [PubMed]
  29. Kesav, N.; Jibukumar, M. Efficient and low complex architecture for detection and classification of Brain Tumor using RCNN with Two Channel CNN. J. King Saud Univ.-Comput. Inf. Sci. 2022, 34, 6229–6242. [Google Scholar] [CrossRef]
  30. Deepak, S.; Ameer, P. Brain tumor classification using deep CNN features via transfer learning. Comput. Biol. Med. 2019, 111, 103345. [Google Scholar] [CrossRef] [PubMed]
  31. Chiao, J.-Y.; Chen, K.-Y.; Liao, K.Y.-K.; Hsieh, P.-H.; Zhang, G.; Huang, T.-C. Detection and classification the breast tumors using mask R-CNN on sonograms. Medicine 2019, 98, e15200. [Google Scholar] [CrossRef] [PubMed]
  32. Jorda, M.; Valero-Lara, P.; Pena, A.J. Performance evaluation of cudnn convolution algorithms on nvidia volta gpus. IEEE Access 2019, 7, 70461–70473. [Google Scholar] [CrossRef]
  33. Ahramovich, A. Top Applications for Computer Vision in Sports. Available online: https://builtin.com/articles/computer-vision-sports (accessed on 5 January 2024).
  34. Hao, N.; Ruan, S.; Song, Y.; Chen, J.; Tian, L. The Establishment of a precise intelligent evaluation system for sports events: Diving. Heliyon 2023, 9, e21361. [Google Scholar] [CrossRef]
  35. Rodriguez-Ortega, Y.; Ballesteros, D.M.; Renza, D. Copy-Move Forgery Detection (CMFD) Using Deep Learning for Image and Video Forensics. J. Imaging 2021, 7, 59. [Google Scholar] [CrossRef]
  36. Tyagi, S.; Yadav, D. ForensicNet: Modern convolutional neural network-based image forgery detection network. J. Forensic Sci. 2023, 68, 461–469. [Google Scholar] [CrossRef] [PubMed]
  37. Auberson, M.; Baechler, S.; Zasso, M.; Genessay, T.; Patiny, L.; Esseiva, P. Development of a systematic computer vision-based method to analyse and compare images of false identity documents for forensic intelligence purposes-Part I: Acquisition, calibration and validation issues. Forensic Sci. Int. 2016, 260, 74–84. [Google Scholar] [CrossRef] [PubMed]
  38. Story, D.; Kacira, M. Design and implementation of a computer vision-guided greenhouse crop diagnostics system. Mach. Vis. Appl. 2015, 26, 495–506. [Google Scholar] [CrossRef]
  39. Liu, J.; Wang, X. Plant diseases and pests detection based on deep learning: A review. Plant Methods 2021, 17, 22. [Google Scholar] [CrossRef] [PubMed]
  40. Hussain, M.; Al-Aqrabi, H.; Munawar, M.; Hill, R.; Alsboui, T. Domain Feature Mapping with YOLOv7 for Automated Edge-Based Pallet Racking Inspections. Sensors 2022, 22, 6927. [Google Scholar] [CrossRef] [PubMed]
  41. Panahi, R.; Louis, J.; Podder, A.; Swanson, C.; Pless, S. Bottleneck Detection in Modular Construction Factories Using Computer Vision. Sensors 2023, 23, 3982. [Google Scholar] [CrossRef]
  42. Masalkhi, M.; Waisberg, E.; Ong, J.; Zaman, N.; Sarker, P.; Lee, A.G.; Tavakkoli, A. Apple Vision Pro for Ophthalmology and Medicine. Ann. Biomed. Eng. 2023, 51, 2643–2646. [Google Scholar] [CrossRef] [PubMed]
  43. Nakamura, R.; Izutsu, M.; Hatakeyama, S. Estimation Method of Clothes Size for Virtual Fitting Room with Kinect Sensor. In Proceedings of the 2013 IEEE International Conference on Systems, Man, and Cybernetics, Manchester, UK, 13–16 October 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 3733–3738. [Google Scholar]
  44. Yuan, M.; Khan, I.R.; Farbiz, F.; Yao, S.; Niswar, A.; Foo, M.H. A Mixed Reality Virtual Clothes Try-On System. IEEE Trans. Multimed. 2013, 15, 1958–1968. [Google Scholar] [CrossRef]
  45. Zhang, W.; Begole, B.; Chu, M.; Liu, J.; Yee, N. Real-time clothes comparison based on multi-view vision. In Proceedings of the 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras, Palo Alto, CA, USA, 7–11 September 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 1–10. [Google Scholar]
  46. Budrionis, A.; Plikynas, D.; Daniušis, P.; Indrulionis, A. Smartphone-based computer vision travelling aids for blind and visually impaired individuals: A systematic review. Assist. Technol. 2022, 34, 178–194. [Google Scholar] [CrossRef]
  47. Tapu, R.; Mocanu, B.; Zaharia, T. A computer vision system that ensure the autonomous navigation of blind people. In Proceedings of the 2013 E-Health and Bioengineering Conference (EHB), Iasi, Romania, 21–23 November 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 1–4. [Google Scholar]
  48. Sivan, S.; Darsan, G. Computer vision based assistive technology for blind and visually impaired people. In Proceedings of the 7th International Conference on Computing Communication and Networking Technologies, Dallas, TX, USA, 6–8 July 2016; pp. 1–8. [Google Scholar]
  49. Kraus, M. Keeping Track of Animals in the Wild with Computer Vision. Available online: https://www.vantage-ai.com/en/blog/keeping-track-of-animals-in-the-wild-with-computer-vision (accessed on 5 January 2024).
  50. Boesch, G. Animal Monitoring with Computer Vision—Case Study. Available online: https://viso.ai/applications/animal-monitoring/ (accessed on 10 January 2024).
  51. Spratt, E.L.; Elgammal, A. Computational beauty: Aesthetic judgment at the intersection of art and science. In Proceedings of the Computer Vision-ECCV 2014 Workshops, Zurich, Switzerland, 6–7 and 12 September 2014; Proceedings, Part I 13. Springer: Berlin/Heidelberg, Germany, 2014; pp. 35–53. [Google Scholar]
  52. Klingler, N. Viso Suite Guide: Develop a Computer Vision Parking Lot Occupancy Application—Viso.ai. Available online: https://viso.ai/product/computer-vision-parking-lot-occupancy-tutorial/ (accessed on 15 July 2023).
  53. Vasluianu, F.; Timofte, R. Efficient video enhancement transformer. In Proceedings of the 2022 IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 4068–4072. [Google Scholar]
  54. Huihui, Y.; Daoliang, L.; Yingyi, C. A state-of-the-art review of image motion deblurring techniques in precision agriculture. Heliyon 2023, 9, e17332. [Google Scholar] [CrossRef]
  55. Alsabhan, W. Student Cheating Detection in Higher Education by Implementing Machine Learning and LSTM Techniques. Sensors 2023, 23, 4149. [Google Scholar] [CrossRef]
  56. Thompson, B. ActionVFX|How To Transform Real-World Objects Into 3D Assets. Available online: https://www.actionvfx.com/blog/how-to-transform-real-world-objects-into-3d-assets (accessed on 15 December 2023).
  57. Combating Food Waste Using AI and Computer Vision—Cogniphi. Available online: https://cogniphi.com/combating-food-waste-using-ai-and-computer-vision/ (accessed on 15 July 2023).
  58. An Introduction to the Kinect Sensor|Microsoft Press Store. Available online: https://www.microsoftpressstore.com/articles/article.aspx?p=2201646 (accessed on 10 January 2024).
  59. Le, N.V.; Qarmout, M.; Zhang, Y.; Zhou, H.; Yang, C. Hand Gesture Recognition System for Games. In Proceedings of the 2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), Brisbane, Australia, 8–10 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 1–6. [Google Scholar]
  60. Nodado, J.T.G.; Morales, H.C.P.; Abugan, M.A.P.; Olisea, J.L.; Aralar, A.C.; Loresco, P.J.M. Intelligent traffic light system using computer vision with android monitoring and control. In Proceedings of the TENCON 2018—2018 IEEE Region 10 Conference, Jeju Island, Republic of Korea, 28–31 October 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 2461–2466. [Google Scholar]
  61. Rosebrock, A. Detecting Natural Disasters with Keras and Deep Learning—PyImageSearch. Available online: https://pyimagesearch.com/2019/11/11/detecting-natural-disasters-with-keras-and-deep-learning/ (accessed on 5 January 2024).
  62. Yilmaz, H. Top Applications of Computer Vision in Insurance (2022 Guide). Available online: https://www.plugger.ai/blog/top-applications-of-computer-vision-in-insurance-2022-guide (accessed on 5 January 2024).
  63. Stefanovskyi, O. Computer Vision in Insurance: Vehicle Damage Assessment Case. Available online: https://intelliarts.com/blog/computer-vision-in-insurance-vehicle-damage-assessment/ (accessed on 5 January 2024).
  64. Thakkar, P.; Patel, D.; Hirpara, I.; Jagani, J.; Patel, S.; Shah, M.; Kshirsagar, A. A comprehensive review on computer vision and fuzzy logic in forensic science application. Ann. Data Sci. 2023, 10, 761–785. [Google Scholar] [CrossRef]
  65. Facial Recognition in a Crowd. Available online: https://kintronics.com/solutions/ip-camera-systems/facial-recognition/ (accessed on 10 January 2024).
  66. Identity Verification with Deep Learning: ID-Selfie Matching Method. Available online: https://medium.com/coinmonks/identity-verification-with-deep-learning-id-selfie-matching-method-be56d72be632 (accessed on 10 January 2024).
  67. Wiggers, K. Ambient’s Computer Vision Detects Dangerous Behaviors, but Raises Privacy Concerns. Available online: https://venturebeat.com/uncategorized/ambients-computer-vision-detects-dangerous-behaviors-but-raises-privacy-concerns/ (accessed on 10 January 2024).
  68. Novacura. Quality Inspections of the Production Line Using Computer Vision and Novacura Flow. Available online: https://www.novacura.com/computer-vision-quality-inspections/ (accessed on 10 January 2024).
  69. CompScience. CompScience Workplace Safety Analytics. Available online: https://www.compscience.com/about-us/ (accessed on 10 January 2024).
  70. Simpson, S. Tactical Multi-Drone Mapping Demonstrated to US Military|Unmanned Systems Technology. Available online: https://www.unmannedsystemstechnology.com/2022/08/tactical-multi-drone-mapping-demonstrated-to-us-military/ (accessed on 15 July 2023).
  71. Maltsev, A. Drones at War and Computer Vision. Available online: https://medium.com/@zlodeibaal/drones-at-war-and-computer-vision-a16b8063be7b (accessed on 15 July 2023).
  72. Liu, L.; Catelli, E.; Katsaggelos, A.; Sciutto, G.; Mazzeo, R.; Milanic, M.; Stergar, J.; Prati, S.; Walton, M. Digital restoration of colour cinematic films using imaging spectroscopy and machine learning. Sci. Rep. 2022, 12, 21982. [Google Scholar] [CrossRef]
  73. Tabernik, D.; Šela, S.; Skvarč, J.; Skočaj, D. Deep-learning-based computer vision system for surface-defect detection. In Proceedings of the Computer Vision Systems: 12th International Conference, ICVS 2019, Thessaloniki, Greece, 23–25 September 2019; Proceedings 12. Springer: Berlin/Heidelberg, Germany, 2019; pp. 490–500. [Google Scholar]
  74. Canedo, D.; Fonseca, P.; Georgieva, P.; Neves, A.J. A deep learning-based dirt detection computer vision system for floor-cleaning robots with improved data collection. Technologies 2021, 9, 94. [Google Scholar] [CrossRef]
  75. Savit, A.; Damor, A. Revolutionizing Retail Stores with Computer Vision and Edge AI: A Novel Shelf Management System. In Proceedings of the 2023 2nd International Conference on Applied Artificial Intelligence and Computing (ICAAIC), Salem, India, 4–6 May 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 69–74. [Google Scholar]
  76. Smink, A.R.; Frowijn, S.; van Reijmersdal, E.A.; van Noort, G.; Neijens, P.C. Try online before you buy: How does shopping with augmented reality affect brand responses and personal data disclosure. Electron. Commer. Res. Appl. 2019, 35, 100854. [Google Scholar] [CrossRef]
  77. Vidal, C. Technology Primer: Social Media Recommendation Algorithms. Available online: https://www.belfercenter.org/publication/technology-primer-social-media-recommendation-algorithms (accessed on 20 July 2023).
  78. Yousaf, K.; Nawaz, T. A deep learning-based approach for inappropriate content detection and classification of youtube videos. IEEE Access 2022, 10, 16283–16298. [Google Scholar] [CrossRef]
  79. Yekkehkhany, B.; Shokri, P.; Zadeh, A. A Computer Vision Approach for Detection of Asteroids/Comets in Space Satellite Images. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2020, 43, 1185–1190. [Google Scholar] [CrossRef]
  80. Yang, X.; Bobkov, A. Development of a vision system for safe and high-precision soft landing on the Moon. Procedia Comput. Sci. 2021, 186, 503–511. [Google Scholar] [CrossRef]
  81. Li, S. Computer vision based autonomous navigation for pin-point landing robotic spacecraft on asteroids. In Intelligent Robotics and Applications: First International Conference, ICIRA 2008, Wuhan, China, 15–17 October 2008; Proceedings, Part II 1; Springer: Berlin/Heidelberg, Germany, 2008; pp. 1115–1126. [Google Scholar]
  82. Saskovec, P. AI-Generated Sports Highlights: Different Approaches—KDnuggets. Available online: https://www.kdnuggets.com/ai-generated-sports-highlights-different-approaches (accessed on 5 July 2023).
  83. Computer Vision in Sports: Applications and Challenges SuperAnnotate. Available online: https://www.superannotate.com/blog/computer-vision-in-sports (accessed on 5 July 2023).
  84. Cioppa, A.; Deliege, A.; Giancola, S.; Ghanem, B.; Van Droogenbroeck, M. Scaling up SoccerNet with multi-view spatial localization and re-identification. Sci. Data 2022, 9, 355. [Google Scholar] [CrossRef]
  85. About Face ID Advanced Technology—Apple Support. Available online: https://support.apple.com/en-us/102381 (accessed on 20 July 2023).
  86. Live Translate on Pixel Phones: Meet Your New Interpreter. Available online: https://store.google.com/intl/en/ideas/articles/meet-your-helpful-interpreter-pixel-6/ (accessed on 20 July 2023).
  87. Azure Kinect DK—Develop AI Models|Microsoft Azure. Available online: https://azure.microsoft.com/en-us/products/kinect-dk (accessed on 20 July 2023).
  88. CareAI. Sensor Technology. Available online: https://www.care.ai/sensors.html (accessed on 18 January 2024).
  89. Artisight. Available online: https://artisight.com/ (accessed on 18 January 2024).
  90. Ocuvera. Our Solution. Available online: https://ocuvera.com/our-solution/ (accessed on 18 January 2024).
  91. VirtuSense. Available online: https://www.virtusense.ai/ (accessed on 18 January 2024).
  92. Inspiren. Available online: https://inspiren.com/solutions/ (accessed on 18 January 2024).
  93. Philips. AI Enabled Solutions. Available online: https://www.philips.com/a-w/about/artificial-intelligence/ai-enabled-solutions (accessed on 18 January 2024).
  94. Silo AI. Philips and Silo AI Develop Computer Vision to Improve the Accuracy of Radiotherapy. Available online: https://www.silo.ai/blog/philips-and-silo-ai-develop-computer-vision-to-improve-the-accuracy-of-radiotherapy (accessed on 18 January 2024).
  95. Gao, J.; Yang, Y.; Lin, P.; Park, D.S. Computer Vision in Healthcare Applications. J. Healthc. Eng. 2018, 2018, 5157020. [Google Scholar] [CrossRef]
  96. Olveres, J.; González, G.; Torres, F.; Moreno-Tagle, J.C.; Carbajal-Degante, E.; Valencia-Rodríguez, A.; Méndez-Sánchez, N.; Escalante-Ramírez, B. What is new in computer vision and artificial intelligence in medical image analysis applications. Quant. Imaging Med. Surg. 2021, 11, 3830–3853. [Google Scholar] [CrossRef] [PubMed]
  97. Isikli Esener, I.; Ergin, S.; Yuksel, T. A New Feature Ensemble with a Multistage Classification Scheme for Breast Cancer Diagnosis. J. Healthc. Eng. 2017, 2017, 3895164. [Google Scholar] [CrossRef] [PubMed]
  98. Yin, G.; Song, Y.; Li, X.; Zhu, L.; Su, Q.; Dai, D.; Xu, W. Prediction of mediastinal lymph node metastasis based on 18 F-FDG PET/CT imaging using support vector machine in non-small cell lung cancer. Eur. Radiol. 2021, 31, 3983–3992. [Google Scholar] [CrossRef] [PubMed]
  99. Lakhani, P.; Sundaram, B. Deep learning at chest radiography: Automated classification of pulmonary tuberculosis by using convolutional neural networks. Radiology 2017, 284, 574–582. [Google Scholar] [CrossRef] [PubMed]
  100. Ardila, D.; Kiraly, A.P.; Bharadwaj, S.; Choi, B.; Reicher, J.J.; Peng, L.; Tse, D.; Etemadi, M.; Ye, W.; Corrado, G. End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography. Nat. Med. 2019, 25, 954–961. [Google Scholar] [CrossRef] [PubMed]
  101. Raza, R.; Zulfiqar, F.; Khan, M.O.; Arif, M.; Alvi, A.; Iftikhar, M.A.; Alam, T. Lung-EffNet: Lung cancer classification using EfficientNet from CT-scan images. Eng. Appl. Artif. Intell. 2023, 126, 106902. [Google Scholar] [CrossRef]
  102. Sun, R.; Pang, Y.; Li, W. Efficient Lung Cancer Image Classification and Segmentation Algorithm Based on an Improved Swin Transformer. Electronics 2023, 12, 1024. [Google Scholar] [CrossRef]
  103. Said, Y.; Alsheikhy, A.A.; Shawly, T.; Lahza, H. Medical images segmentation for lung cancer diagnosis based on deep learning architectures. Diagnostics 2023, 13, 546. [Google Scholar] [CrossRef] [PubMed]
  104. Samant, P.; Agarwal, R. Comparative analysis of classification based algorithms for diabetes diagnosis using iris images. J. Med. Eng. Technol. 2018, 42, 35–42. [Google Scholar] [CrossRef]
  105. Jena, P.K.; Khuntia, B.; Palai, C.; Nayak, M.; Mishra, T.K.; Mohanty, S.N. A novel approach for diabetic retinopathy screening using asymmetric deep learning features. Big Data Cogn. Comput. 2023, 7, 25. [Google Scholar] [CrossRef]
  106. Kothadiya, D.; Rehman, A.; Abbas, S.; Alamri, F.S.; Saba, T. Attention-based deep learning framework to recognize diabetes disease from cellular retinal images. Biochem. Cell Biol. 2023, 101, 550–561. [Google Scholar] [CrossRef]
  107. Pacal, I. MaxCerVixT: A Novel Lightweight Vision Transformer-Based Approach for Precise Cervical Cancer Detection. Knowl.-Based Syst. 2024, 289, 111482. [Google Scholar] [CrossRef]
  108. Abd-Alhalem, S.M.; Marie, H.S.; El-Shafai, W.; Altameem, T.; Rathore, R.S.; Hassan, T.M. Cervical cancer classification based on a bilinear convolutional neural network approach and random projection. Eng. Appl. Artif. Intell. 2024, 127, 107261. [Google Scholar] [CrossRef]
  109. Attallah, O. Cervical cancer diagnosis based on multi-domain features using deep learning enhanced by handcrafted descriptors. Appl. Sci. 2023, 13, 1916. [Google Scholar] [CrossRef]
  110. Payabvash, S.; Aboian, M.; Tihan, T.; Cha, S. Machine learning decision tree models for differentiation of posterior fossa tumors using diffusion histogram analysis and structural MRI findings. Front. Oncol. 2020, 10, 71. [Google Scholar] [CrossRef]
  111. Olberg, S.; Choi, B.S.; Park, I.; Liang, X.; Kim, J.S.; Deng, J.; Yan, Y.; Jiang, S.; Park, J.C. Ensemble learning and personalized training for the improvement of unsupervised deep learning-based synthetic CT reconstruction. Med. Phys. 2023, 50, 1436–1449. [Google Scholar] [CrossRef] [PubMed]
  112. Bejnordi, B.E.; Veta, M.; Van Diest, P.J.; Van Ginneken, B.; Karssemeijer, N.; Litjens, G.; Van Der Laak, J.A.; Hermsen, M.; Manson, Q.F.; Balkenhol, M. Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer. JAMA 2017, 318, 2199–2210. [Google Scholar] [CrossRef]
  113. Tellez, D.; Balkenhol, M.; Otte-Höller, I.; van de Loo, R.; Vogels, R.; Bult, P.; Wauters, C.; Vreuls, W.; Mol, S.; Karssemeijer, N. Whole-slide mitosis detection in H&E breast histology using PHH3 as a reference to train distilled stain-invariant convolutional networks. IEEE Trans. Med. Imaging 2018, 37, 2126–2136. [Google Scholar]
  114. Hussain, Z.; Gimenez, F.; Yi, D.; Rubin, D. Differential data augmentation techniques for medical imaging classification tasks. AMIA Annu. Symp. Proc. 2017, 2017, 979–984. [Google Scholar]
  115. Ravishankar, H.; Sudhakar, P.; Venkataramani, R.; Thiruvenkadam, S.; Annangi, P.; Babu, N.; Vaidya, V. Understanding the mechanisms of deep transfer learning for medical images. In Deep Learning and Data Labeling for Medical Applications: First International Workshop, LABELS 2016, and Second International Workshop, DLMIA 2016, Held in Conjunction with MICCAI 2016, Athens, Greece, 21 October 2016; Proceedings 1; Springer: Berlin/Heidelberg, Germany, 2016; pp. 188–196. [Google Scholar]
  116. Prinzi, F.; Insalaco, M.; Orlando, A.; Gaglio, S.; Vitabile, S. A YOLO-based model for breast cancer detection in mammograms. Cogn. Comput. 2024, 16, 107–120. [Google Scholar] [CrossRef]
  117. Luo, L.; Wang, X.; Lin, Y.; Ma, X.; Tan, A.; Chan, R.; Vardhanabhuti, V.; Chu, W.C.; Cheng, K.-T.; Chen, H. Deep learning in breast cancer imaging: A decade of progress and future directions. IEEE Rev. Biomed. Eng. 2024, 1–20. [Google Scholar] [CrossRef]
  118. Minh, T.C.; Quoc, N.K.; Cong Vinh, P.; Nhu Phu, D.; Chi, V.X.; Tan, H.M. UGGNet: Bridging U-Net and VGG for Advanced Breast Cancer Diagnosis. arXiv 2024. [Google Scholar] [CrossRef]
  119. Xu, W.; He, J.; Shu, Y. DeepHealth: Deep representation learning with autoencoders for healthcare prediction. In Proceedings of the 2020 IEEE Symposium Series on Computational Intelligence (SSCI), Canberra, Australia, 1–4 December 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 42–49. [Google Scholar]
  120. Chanda, D.; Onim, M.S.H.; Nyeem, H.; Ovi, T.B.; Naba, S.S. DCENSnet: A new deep convolutional ensemble network for skin cancer classification. Biomed. Signal Process. Control 2024, 89, 105757. [Google Scholar] [CrossRef]
  121. Akilandasowmya, G.; Nirmaladevi, G.; Suganthi, S.; Aishwariya, A. Skin cancer diagnosis: Leveraging deep hidden features and ensemble classifiers for early detection and classification. Biomed. Signal Process. Control 2024, 88, 105306. [Google Scholar] [CrossRef]
  122. Hu, B.; Zhou, P.; Yu, H.; Dai, Y.; Wang, M.; Tan, S.; Sun, Y. LeaNet: Lightweight U-shaped architecture for high-performance skin cancer image segmentation. Comput. Biol. Med. 2024, 169, 107919. [Google Scholar] [CrossRef] [PubMed]
  123. Zhang, L.; Zhang, J.; Gao, W.; Bai, F.; Li, N.; Ghadimi, N. A deep learning outline aimed at prompt skin cancer detection utilizing gated recurrent unit networks and improved orca predation algorithm. Biomed. Signal Process. Control 2024, 90, 105858. [Google Scholar] [CrossRef]
  124. Ponzio, F.; Macii, E.; Ficarra, E.; Di Cataldo, S. Colorectal cancer classification using deep convolutional networks. In Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies, Funchal, Portugal, 19–21 January 2018; pp. 58–66. [Google Scholar]
  125. Choi, K.; Choi, S.J.; Kim, E.S. Computer-Aided diagnosis for colorectal cancer using deep learning with visual explanations. In Proceedings of the 2020 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Montreal, QC, Canada, 20–24 July 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1156–1159. [Google Scholar]
  126. Khan, A.; Brouwer, N.; Blank, A.; Müller, F.; Soldini, D.; Noske, A.; Gaus, E.; Brandt, S.; Nagtegaal, I.; Dawson, H. Computer-assisted diagnosis of lymph node metastases in colorectal cancers using transfer learning with an ensemble model. Mod. Pathol. 2023, 36, 100118. [Google Scholar] [CrossRef] [PubMed]
  127. Su, Y.; Bai, Y.; Zhang, B.; Zhang, Z.; Wang, W. Hat-net: A hierarchical transformer graph neural network for grading of colorectal cancer histology images. In Proceedings of the British Machine Vision Conference, Online, 22–25 November 2021. [Google Scholar]
  128. Sabottke, C.F.; Breaux, M.A.; Spieler, B.M. Estimation of age in unidentified patients via chest radiography using convolutional neural network regression. Emerg. Radiol. 2020, 27, 463–468. [Google Scholar] [CrossRef]
  129. Khalif, K.M.N.K.; Chaw Seng, W.; Gegov, A.; Bakar, A.S.A.; Shahrul, N.A. Integrated Generative Adversarial Networks and Deep Convolutional Neural Networks for Image Data Classification: A Case Study for COVID-19. Information 2024, 15, 58. [Google Scholar] [CrossRef]
  130. Mezina, A.; Burget, R. Detection of post-COVID-19-related pulmonary diseases in X-ray images using Vision Transformer-based neural network. Biomed. Signal Process. Control 2024, 87, 105380. [Google Scholar] [CrossRef]
  131. Varde, A.S.; Karthikeyan, D.; Wang, W. Facilitating COVID recognition from X-rays with computer vision models and transfer learning. Multimed. Tools Appl. 2024, 83, 807–838. [Google Scholar] [CrossRef] [PubMed]
  132. Jin, C.; Chen, W.; Cao, Y.; Xu, Z.; Tan, Z.; Zhang, X.; Deng, L.; Zheng, C.; Zhou, J.; Shi, H. Development and evaluation of an artificial intelligence system for COVID-19 diagnosis. Nat. Commun. 2020, 11, 5088. [Google Scholar] [CrossRef] [PubMed]
  133. Yoo, S.H.; Geng, H.; Chiu, T.L.; Yu, S.K.; Cho, D.C.; Heo, J.; Choi, M.S.; Choi, I.H.; Cung Van, C.; Nhung, N.V. Deep learning-based decision-tree classifier for COVID-19 diagnosis from chest X-ray imaging. Front. Med. 2020, 7, 427. [Google Scholar] [CrossRef] [PubMed]
  134. Gao, Z.; Hong, B.; Zhang, X.; Li, Y.; Jia, C.; Wu, J.; Wang, C.; Meng, D.; Li, C. Instance-based vision transformer for subtyping of papillary renal cell carcinoma in histopathological image. In Medical Image Computing and Computer Assisted Intervention–MICCAI 2021: 24th International Conference, Strasbourg, France, 27 September–1 October 2021; Proceedings, Part VIII 24; Springer: Berlin/Heidelberg, Germany, 2021; pp. 299–308. [Google Scholar]
  135. Khosravi, P.; Lysandrou, M.; Eljalby, M.; Li, Q.; Kazemi, E.; Zisimopoulos, P.; Sigaras, A.; Brendel, M.; Barnes, J.; Ricketts, C. A deep learning approach to diagnostic classification of prostate cancer using pathology–radiology fusion. J. Magn. Reson. Imaging 2021, 54, 462–471. [Google Scholar] [CrossRef] [PubMed]
  136. Kather, J.N.; Pearson, A.T.; Halama, N.; Jäger, D.; Krause, J.; Loosen, S.H.; Marx, A.; Boor, P.; Tacke, F.; Neumann, U.P. Deep learning can predict microsatellite instability directly from histology in gastrointestinal cancer. Nat. Med. 2019, 25, 1054–1056. [Google Scholar] [CrossRef]
  137. Liu, M.; Zhang, D.; Shen, D.; Alzheimer’s Disease Neuroimaging Initiative. Hierarchical fusion of features and classifier decisions for Alzheimer’s disease diagnosis. Hum. Brain Mapp. 2014, 35, 1305–1319. [Google Scholar] [CrossRef] [PubMed]
  138. Li, W.; Jia, F.; Hu, Q. Automatic segmentation of liver tumor in CT images with deep convolutional neural networks. J. Comput. Commun. 2015, 3, 146–151. [Google Scholar] [CrossRef]
  139. Gao, S.; Lima, D. A review of the application of deep learning in the detection of Alzheimer’s disease. Int. J. Cogn. Comput. Eng. 2022, 3, 1–8. [Google Scholar] [CrossRef]
  140. Helaly, H.A.; Badawy, M.; Haikal, A.Y. Deep Learning Approach for Early Detection of Alzheimer’s Disease. Cogn. Comput. 2022, 14, 1711–1727. [Google Scholar] [CrossRef]
  141. Kundisch, A.; Hönning, A.; Mutze, S.; Kreissl, L.; Spohn, F.; Lemcke, J.; Sitz, M.; Sparenberg, P.; Goelz, L. Deep learning algorithm in detecting intracranial hemorrhages on emergency computed tomographies. PLoS ONE 2021, 16, e0260560. [Google Scholar] [CrossRef]
  142. Thung, K.-H.; Wee, C.-Y.; Yap, P.-T.; Shen, D.; Alzheimer’s Disease Neuroimaging Initiative. Neurodegenerative disease diagnosis using incomplete multi-modality data via matrix shrinkage and completion. NeuroImage 2014, 91, 386–400. [Google Scholar] [CrossRef] [PubMed]
  143. Jiang, X.; Hu, Z.; Wang, S.; Zhang, Y. Deep Learning for Medical Image-Based Cancer Diagnosis. Cancers 2023, 15, 3608. [Google Scholar] [CrossRef] [PubMed]
  144. John, J.; Ravikumar, A.; Abraham, B. Prostate cancer prediction from multiple pretrained computer vision model. Health Technol. 2021, 11, 1003–1011. [Google Scholar] [CrossRef]
  145. Kott, O.; Linsley, D.; Amin, A.; Karagounis, A.; Jeffers, C.; Golijanin, D.; Serre, T.; Gershman, B. Development of a deep learning algorithm for the histopathologic diagnosis and Gleason grading of prostate cancer biopsies: A pilot study. Eur. Urol. Focus 2021, 7, 347–351. [Google Scholar] [CrossRef] [PubMed]
  146. Rampun, A.; Chen, Z.; Malcolm, P.; Tiddeman, B.; Zwiggelaar, R. Computer-aided diagnosis: Detection and localization of prostate cancer within the peripheral zone. Int. J. Numer. Methods Biomed. Eng. 2016, 32, e02745. [Google Scholar] [CrossRef]
  147. Brunese, L.; Mercaldo, F.; Reginelli, A.; Santone, A. An ensemble learning approach for brain cancer detection exploiting radiomic features. Comput. Methods Programs Biomed. 2020, 185, 105134. [Google Scholar] [CrossRef] [PubMed]
  148. Leemput, S.C.V.D.; Meijs, M.; Patel, A.; Meijer, F.J.A.; Ginneken, B.V.; Manniesing, R. Multiclass Brain Tissue Segmentation in 4D CT Using Convolutional Neural Networks. IEEE Access 2019, 7, 51557–51569. [Google Scholar] [CrossRef]
  149. Kleesiek, J.; Urban, G.; Hubert, A.; Schwarz, D.; Maier-Hein, K.; Bendszus, M.; Biller, A. Deep MRI brain extraction: A 3D convolutional neural network for skull stripping. NeuroImage 2016, 129, 460–469. [Google Scholar] [CrossRef] [PubMed]
  150. Pei, L.; Vidyaratne, L.; Rahman, M.M.; Iftekharuddin, K.M. Context aware deep learning for brain tumor segmentation, subtype classification, and survival prediction using radiology images. Sci. Rep. 2020, 10, 19726. [Google Scholar] [CrossRef]
  151. Roth, H.R.; Lu, L.; Farag, A.; Shin, H.-C.; Liu, J.; Turkbey, E.B.; Summers, R.M. Deeporgan: Multi-level deep convolutional networks for automated pancreas segmentation. In Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015, Proceedings, Part I 18; Springer: Berlin/Heidelberg, Germany, 2015; pp. 556–564. [Google Scholar]
  152. Mehta, S.D.; Sebro, R. Computer-aided detection of incidental lumbar spine fractures from routine dual-energy X-ray absorptiometry (DEXA) studies using a support vector machine (SVM) classifier. J. Digit. Imaging 2020, 33, 204–210. [Google Scholar] [CrossRef]
  153. Kuo, R.Y.L.; Harrison, C.; Curran, T.A.; Jones, B.; Freethy, A.; Cussons, D.; Stewart, M.; Collins, G.S.; Furniss, D. Artificial Intelligence in Fracture Detection: A Systematic Review and Meta-Analysis. Radiology 2022, 304, 50–62. [Google Scholar] [CrossRef]
  154. Ju, R.-Y.; Cai, W. Fracture detection in pediatric wrist trauma X-ray images using YOLOv8 algorithm. Sci. Rep. 2023, 13, 20077. [Google Scholar] [CrossRef] [PubMed]
  155. Zech, J.R.; Carotenuto, G.; Igbinoba, Z.; Tran, C.V.; Insley, E.; Baccarella, A.; Wong, T.T. Detecting pediatric wrist fractures using deep-learning-based object detection. Pediatr. Radiol. 2023, 53, 1125–1134. [Google Scholar] [CrossRef] [PubMed]
  156. Davoudi, A.; Malhotra, K.R.; Shickel, B.; Siegel, S.; Williams, S.; Ruppert, M.; Bihorac, E.; Ozrazgat-Baslanti, T.; Tighe, P.J.; Bihorac, A.; et al. Intelligent ICU for Autonomous Patient Monitoring Using Pervasive Sensing and Deep Learning. Sci. Rep. 2019, 9, 8020. [Google Scholar] [CrossRef] [PubMed]
  157. Chase, J.G.; Agogue, F.; Starfinger, C.; Lam, Z.; Shaw, G.M.; Rudge, A.D.; Sirisena, H. Quantifying agitation in sedated ICU patients using digital imaging. Comput. Methods Programs Biomed. 2004, 76, 131–141. [Google Scholar] [CrossRef] [PubMed]
  158. Becouze, P.; Hann, C.E.; Chase, J.G.; Shaw, G.M. Measuring facial grimacing for quantifying patient agitation in critical care. Comput. Methods Programs Biomed. 2007, 87, 138–147. [Google Scholar] [CrossRef] [PubMed]
  159. Brahnam, S.; Nanni, L.; Sexton, R. Introduction to neonatal facial pain detection using common and advanced face classification techniques. In Advanced Computational Intelligence Paradigms in Healthcare–1; Springer: Berlin/Heidelberg, Germany, 2007; pp. 225–253. [Google Scholar]
  160. Hammal, Z.; Cohn, J.F. Automatic detection of pain intensity. In Proceedings of the 14th ACM International Conference on Multimodal Interaction, Santa Monica, CA, USA, 22–26 October 2012; Volume 2012, pp. 47–52. [Google Scholar] [CrossRef]
  161. Kharghanian, R.; Peiravi, A.; Moradi, F. Pain detection from facial images using unsupervised feature learning approach. In Proceedings of the 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Orlando, FL, USA, 16–20 August 2016; Volume 2016, pp. 419–422. [Google Scholar] [CrossRef]
  162. Thevenot, J.; López, M.B.; Hadid, A. A Survey on Computer Vision for Assistive Medical Diagnosis From Faces. IEEE J. Biomed. Health Inform. 2018, 22, 1497–1511. [Google Scholar] [CrossRef] [PubMed]
  163. Zamzmi, G.; Kasturi, R.; Goldgof, D.; Zhi, R.; Ashmeade, T.; Sun, Y. A Review of Automated Pain Assessment in Infants: Features, Classification Tasks, and Databases. IEEE Rev. Biomed. Eng. 2018, 11, 77–96. [Google Scholar] [CrossRef]
  164. Gavrilescu, M.; Vizireanu, N. Predicting depression, anxiety, and stress levels from videos using the facial action coding system. Sensors 2019, 19, 3693. [Google Scholar] [CrossRef]
  165. Erekat, D.; Hammal, Z.; Siddiqui, M.; Dibeklioğlu, H. Enforcing Multilabel Consistency for Automatic Spatio-Temporal Assessment of Shoulder Pain Intensity. In Proceedings of the ACM International Conference on Multimodal Interaction, Virtual, 25–29 October 2020; Volume 2020, pp. 156–164. [Google Scholar] [CrossRef]
  166. Xu, X.; de Sa, V.R. Exploring multidimensional measurements for pain evaluation using facial action units. In Proceedings of the 2020 15th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2020), Buenos Aires, Argentina, 16–20 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 786–792. [Google Scholar]
  167. D’Antoni, F.; Russo, F.; Ambrosio, L.; Vollero, L.; Vadalà, G.; Merone, M.; Papalia, R.; Denaro, V. Artificial Intelligence and Computer Vision in Low Back Pain: A Systematic Review. Int. J. Environ. Res. Public Health 2021, 18, 10909. [Google Scholar] [CrossRef]
  168. Prinsen, V.; Jouvet, P.; Al Omar, S.; Masson, G.; Bridier, A.; Noumeir, R. Automatic eye localization for hospitalized infants and children using convolutional neural networks. Int. J. Med. Inform. 2021, 146, 104344. [Google Scholar] [CrossRef] [PubMed]
  169. Versluijs, Y.; Moore, M.G.; Ring, D.; Jayakumar, P. Clinician Facial Expression of Emotion Corresponds with Patient Mindset. Clin. Orthop. Relat. Res. 2021, 479, 1914–1923. [Google Scholar] [CrossRef] [PubMed]
  170. Hassan, T.; Seus, D.; Wollenberg, J.; Weitz, K.; Kunz, M.; Lautenbacher, S.; Garbas, J.U.; Schmid, U. Automatic Detection of Pain from Facial Expressions: A Survey. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 1815–1831. [Google Scholar] [CrossRef] [PubMed]
  171. Oster, H. Automated facial analysis of infant pain expressions: Progress and future directions. Lancet Digit. Health 2021, 3, e613–e614. [Google Scholar] [CrossRef] [PubMed]
  172. Hoti, K.; Chivers, P.T.; Hughes, J.D. Assessing procedural pain in infants: A feasibility study evaluating a point-of-care mobile solution based on automated facial analysis. Lancet Digit. Health 2021, 3, e623–e634. [Google Scholar] [CrossRef] [PubMed]
  173. Bringuier, S.; Macioce, V.; Boulhais, M.; Dadure, C.; Capdevila, X. Facial expressions of pain in daily clinical practice to assess postoperative pain in children: Reliability and validity of the facial action summary score. Eur. J. Pain 2021, 25, 1081–1090. [Google Scholar] [CrossRef]
  174. Liu, D.; Liu, B.; Lin, T.; Liu, G.; Yang, G.; Qi, D.; Qiu, Y.; Lu, Y.; Yuan, Q.; Shuai, S.C.; et al. Measuring depression severity based on facial expression and body movement using deep convolutional neural network. Front. Psychiatry 2022, 13, 1017064. [Google Scholar] [CrossRef] [PubMed]
  175. Giannakakis, G.; Koujan, M.R.; Roussos, A.; Marias, K. Automatic stress analysis from facial videos based on deep facial action units recognition. Pattern Anal. Appl. 2022, 25, 521–535. [Google Scholar] [CrossRef]
  176. Wu, C.L.; Liu, S.F.; Yu, T.L.; Shih, S.J.; Chang, C.H.; Yang Mao, S.F.; Li, Y.S.; Chen, H.J.; Chen, C.C.; Chao, W.C. Deep Learning-Based Pain Classifier Based on the Facial Expression in Critically Ill Patients. Front. Med. 2022, 9, 851690. [Google Scholar] [CrossRef]
  177. Zhang, M.; Zhu, L.; Lin, S.Y.; Herr, K.; Chi, C.L.; Demir, I.; Dunn Lopez, K.; Chi, N.C. Using artificial intelligence to improve pain assessment and pain management: A scoping review. J. Am. Med. Inform. Assoc. 2023, 30, 570–587. [Google Scholar] [CrossRef]
  178. Ma, Y.; Shen, J.; Zhao, Z.; Liang, H.; Tan, Y.; Liu, Z.; Qian, K.; Yang, M.; Hu, B. What Can Facial Movements Reveal? Depression Recognition and Analysis Based on Optical Flow Using Bayesian Networks. IEEE Trans. Neural Syst. Rehabil. Eng. 2023, 31, 3459–3468. [Google Scholar] [CrossRef] [PubMed]
  179. Heiderich, T.M.; Carlini, L.P.; Buzuti, L.F.; Balda, R.C.X.; Barros, M.C.M.; Guinsburg, R.; Thomaz, C.E. Face-based automatic pain assessment: Challenges and perspectives in neonatal intensive care units. J. Pediatr. 2023, 99, 546–560. [Google Scholar] [CrossRef] [PubMed]
  180. Ghosh, A.; Umer, S.; Khan, M.K.; Rout, R.K.; Dhara, B.C. Smart sentiment analysis system for pain detection using cutting edge techniques in a smart healthcare framework. Clust. Comput. 2023, 26, 119–135. [Google Scholar] [CrossRef] [PubMed]
  181. Madrigal-Garcia, M.I.; Rodrigues, M.; Shenfield, A.; Singer, M.; Moreno-Cuesta, J. What Faces Reveal: A Novel Method to Identify Patients at Risk of Deterioration Using Facial Expressions. Crit. Care Med. 2018, 46, 1057–1062. [Google Scholar] [CrossRef] [PubMed]
  182. Madrigal-Garcia, M.I.; Archer, D.; Singer, M.; Rodrigues, M.; Shenfield, A.; Moreno-Cuesta, J. Do Temporal Changes in Facial Expressions Help Identify Patients at Risk of Deterioration in Hospital Wards? A Post Hoc Analysis of the Visual Early Warning Score Study. Crit. Care Explor. 2020, 2, e0115. [Google Scholar] [CrossRef] [PubMed]
  183. Do, Q.T.; Chaudri, J. Creating Computer Vision Models for Respiratory Status Detection. In Proceedings of the 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Glasgow, UK, 11–15 July 2022; Volume 2022, pp. 1350–1353. [Google Scholar] [CrossRef]
  184. Zhou, C.; Chase, J.G. Low-cost structured light imaging of regional volume changes for use in assessing mechanical ventilation. Comput. Methods Programs Biomed. 2022, 226, 107176. [Google Scholar] [CrossRef]
  185. Ebrahimian, S.; Homayounieh, F.; Rockenbach, M.A.; Putha, P.; Raj, T.; Dayan, I.; Bizzo, B.C.; Buch, V.; Wu, D.; Kim, K. Artificial intelligence matches subjective severity assessment of pneumonia for prediction of patient outcome and need for mechanical ventilation: A cohort study. Sci. Rep. 2021, 11, 858. [Google Scholar] [CrossRef]
  186. Reiter, A.; Ma, A.; Rawat, N.; Shrock, C.; Saria, S. Process Monitoring in the Intensive Care Unit: Assessing Patient Mobility Through Activity Analysis with a Non-Invasive Mobility Sensor. In Medical Image Computing and Computer-Assisted Intervention–MICCAI 2016: 19th International Conference, Athens, Greece, 17–21 October 2016; Springer: Berlin/Heidelberg, Germany, 2016; Volume 9900, pp. 482–490. [Google Scholar] [CrossRef]
  187. Yeung, S.; Rinaldo, F.; Jopling, J.; Liu, B.; Mehra, R.; Downing, N.L.; Guo, M.; Bianconi, G.M.; Alahi, A.; Lee, J.; et al. A computer vision system for deep learning-based detection of patient mobilization activities in the ICU. Npj Digit. Med. 2019, 2, 11. [Google Scholar] [CrossRef] [PubMed]
  188. Kumar, S.; Singhal, P.; Krovi, V.N. Computer-vision-based decision support in surgical robotics. IEEE Des. Test 2015, 32, 89–97. [Google Scholar] [CrossRef]
  189. Postma-Nilsenová, M.; Postma, E.; Tates, K. Automatic Detection of Confusion in Elderly Users of a Web-Based Health Instruction Video. Telemed. e-Health 2015, 21, 514–519. [Google Scholar] [CrossRef]
  190. Kennedy-Metz, L.R.; Mascagni, P.; Torralba, A.; Dias, R.D.; Perona, P.; Shah, J.A.; Padoy, N.; Zenati, M.A. Computer vision in the operating room: Opportunities and caveats. IEEE Trans. Med. Robot. Bionics 2020, 3, 2–10. [Google Scholar] [CrossRef] [PubMed]
  191. Gul, M.A.; Yousaf, M.H.; Nawaz, S.; Ur Rehman, Z.; Kim, H. Patient monitoring by abnormal human activity recognition based on CNN architecture. Electronics 2020, 9, 1993. [Google Scholar] [CrossRef]
  192. Singh, A.; Haque, A.; Alahi, A.; Yeung, S.; Guo, M.; Glassman, J.R.; Beninati, W.; Platchek, T.; Fei-Fei, L.; Milstein, A. Automatic detection of hand hygiene using computer vision technology. J. Am. Med. Inform. Assoc. 2020, 27, 1316–1320. [Google Scholar] [CrossRef] [PubMed]
  193. Ahmed, I.; Jeon, G.; Piccialli, F. A deep-learning-based smart healthcare system for patient’s discomfort detection at the edge of internet of things. IEEE Internet Things J. 2021, 8, 10318–10326. [Google Scholar] [CrossRef]
  194. Dias, R.D.; Kennedy-Metz, L.R.; Yule, S.J.; Gombolay, M.; Zenati, M.A. Assessing Team Situational Awareness in the Operating Room via Computer Vision. In Proceedings of the 2022 IEEE Conference on Cognitive and Computational Aspects of Situation Management (CogSIMA), Salerno, Italy, 6–10 June 2022; Volume 2022, pp. 94–96. [Google Scholar] [CrossRef]
  195. Chan, P.Y.; Tay, A.; Chen, D.; De Freitas, M.; Millet, C.; Nguyen-Duc, T.; Duke, G.; Lyall, J.; Nguyen, J.T.; McNeil, J.; et al. Ambient intelligence-based monitoring of staff and patient activity in the intensive care unit. Aust. Crit. Care 2023, 36, 92–98. [Google Scholar] [CrossRef] [PubMed]
  196. Bai, L.; Wang, G.; Wang, J.; Yang, X.; Gao, H.; Liang, X.; Wang, A.; Islam, M.; Ren, H. OSSAR: Towards Open-Set Surgical Activity Recognition in Robot-assisted Surgery. arXiv 2024. [Google Scholar] [CrossRef]
  197. Tseng, L.-A.; Lin, H.-C.; Bai, M.-Y.; Li, M.-F.; Lee, Y.-L.; Chiang, K.-J.; Wang, Y.-C.; Guo, J.-M. DeepVinci: A Semantic Segmentation Model with Edge Supervision and Densely Multi-scale Pyramid Module for DaVinci Gynecological Surgery. 2024. [Google Scholar] [CrossRef]
  198. Chadebecq, F.; Vasconcelos, F.; Mazomenos, E.; Stoyanov, D. Computer Vision in the Surgical Operating Room. Visc. Med. 2020, 36, 456–462. [Google Scholar] [CrossRef] [PubMed]
  199. Chen, Z.; Zhang, Y.; Yan, Z.; Dong, J.; Cai, W.; Ma, Y.; Jiang, J.; Dai, K.; Liang, H.; He, J. Artificial intelligence assisted display in thoracic surgery: Development and possibilities. J. Thorac. Dis. 2021, 13, 6994–7005. [Google Scholar] [CrossRef] [PubMed]
  200. Jaiswal, S.; Valstar, M.F.; Gillott, A.; Daley, D. Automatic Detection of ADHD and ASD from Expressive Behaviour in RGBD Data. In Proceedings of the 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), Washington, DC, USA, 30 May–3 June 2017; pp. 762–769. [Google Scholar]
  201. Guarin, D.L.; Dusseldorp, J.; Hadlock, T.A.; Jowett, N. A Machine Learning Approach for Automated Facial Measurements in Facial Palsy. JAMA Facial Plast. Surg. 2018, 20, 335–337. [Google Scholar] [CrossRef]
  202. Hashemi, J.; Dawson, G.; Carpenter, K.L.H.; Campbell, K.; Qiu, Q.; Espinosa, S.; Marsan, S.; Baker, J.P.; Egger, H.L.; Sapiro, G. Computer Vision Analysis for Quantification of Autism Risk Behaviors. IEEE Trans. Affect. Comput. 2021, 12, 215–226. [Google Scholar] [CrossRef]
  203. Zhuang, Y.; McDonald, M.; Uribe, O.; Yin, X.; Parikh, D.; Southerland, A.M.; Rohde, G.K. Facial Weakness Analysis and Quantification of Static Images. IEEE J. Biomed. Health Inform. 2020, 24, 2260–2267. [Google Scholar] [CrossRef]
  204. Ruiter, A.M.; Wang, Z.; Yin, Z.; Naber, W.C.; Simons, J.; Blom, J.T.; van Gemert, J.C.; Verschuuren, J.; Tannemaat, M.R. Assessing facial weakness in myasthenia gravis with facial recognition software and deep learning. Ann. Clin. Transl. Neurol. 2023, 10, 1314–1325. [Google Scholar] [CrossRef]
  205. Parra-Dominguez, G.S.; Garcia-Capulin, C.H.; Sanchez-Yanez, R.E. Automatic Facial Palsy Diagnosis as a Classification Problem Using Regional Information Extracted from a Photograph. Diagnostics 2022, 12, 1528. [Google Scholar] [CrossRef]
  206. Ardalan, A.; Yamane, N.; Rao, A.K.; Montes, J.; Goldman, S. Analysis of gait synchrony and balance in neurodevelopmental disorders using computer vision techniques. Health Inform. J. 2021, 27, 14604582211055650. [Google Scholar] [CrossRef]
  207. Chambers, C.; Seethapathi, N.; Saluja, R.; Loeb, H.; Pierce, S.R.; Bogen, D.K.; Prosser, L.; Johnson, M.J.; Kording, K.P. Computer vision to automatically assess infant neuromotor risk. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 28, 2431–2442. [Google Scholar] [CrossRef]
  208. He, L.; Li, H.; Wang, J.; Chen, M.; Gozdas, E.; Dillman, J.R.; Parikh, N.A. A multi-task, multi-stage deep transfer learning model for early prediction of neurodevelopment in very preterm infants. Sci. Rep. 2020, 10, 15072. [Google Scholar] [CrossRef]
  209. Cowan, T.; Masucci, M.D.; Gupta, T.; Haase, C.M.; Strauss, G.P.; Cohen, A.S. Computerized analysis of facial expressions in serious mental illness. Schizophr. Res. 2022, 241, 44–51. [Google Scholar] [CrossRef]
  210. Alhazmi, A.K.; Alanazi, M.A.; Alshehry, A.H.; Alshahry, S.M.; Jaszek, J.; Djukic, C.; Brown, A.; Jackson, K.; Chodavarapu, V.P. Intelligent Millimeter-Wave System for Human Activity Monitoring for Telemedicine. Sensors 2024, 24, 268. [Google Scholar] [CrossRef]
  211. Li, R.; St George, R.J.; Wang, X.; Lawler, K.; Hill, E.; Garg, S.; Williams, S.; Relton, S.; Hogg, D.; Bai, Q. Moving towards intelligent telemedicine: Computer vision measurement of human movement. Comput. Biol. Med. 2022, 147, 105776. [Google Scholar] [CrossRef] [PubMed]
  212. Zhang, X.; Ding, J.; Wu, M.; Wong, S.T.; Van Nguyen, H.; Pan, M. Adaptive privacy preserving deep learning algorithms for medical data. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 1169–1178. [Google Scholar]
  213. Deepika, J.; Rajan, C.; Senthil, T. Security and privacy of cloud-and IoT-based medical image diagnosis using fuzzy convolutional neural network. Comput. Intell. Neurosci. 2021, 2021, 6615411. [Google Scholar] [CrossRef] [PubMed]
  214. Ren, L.; Zhang, D. A New Data Model for the Privacy Protection of Medical Images. Comput. Intell. Neurosci. 2022, 2022, 5867215. [Google Scholar] [CrossRef] [PubMed]
  215. Joshi, N.B.; Nalbalwar, S.L. A fall detection and alert system for an elderly using computer vision and Internet of Things. In Proceedings of the 2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), Bengaluru, India, 19–20 May 2017; pp. 1276–1281. [Google Scholar]
  216. Orejel Bustos, A.S.; Tramontano, M.; Morone, G.; Ciancarelli, I.; Panza, G.; Minnetti, A.; Picelli, A.; Smania, N.; Iosa, M.; Vannozzi, G. Ambient assisted living systems for falls monitoring at home. Expert Rev. Med. Devices 2023, 20, 821–828. [Google Scholar] [CrossRef]
  217. Al Nahian, M.J.; Ghosh, T.; Uddin, M.N.; Islam, M.M.; Mahmud, M.; Kaiser, M.S. Towards artificial intelligence driven emotion aware fall monitoring framework suitable for elderly people with neurological disorder. In Proceedings of the International Conference on Brain Informatics, Padua, Italy, 19 September 2020; Springer: Berlin/Heidelberg, Germany, 2020; pp. 275–286. [Google Scholar]
  218. Chen, W.; Jiang, Z.; Guo, H.; Ni, X. Fall detection based on key points of human-skeleton using openpose. Symmetry 2020, 12, 744. [Google Scholar] [CrossRef]
  219. Ramirez, H.; Velastin, S.A.; Meza, I.; Fabregas, E.; Makris, D.; Farias, G. Fall detection and activity recognition using human skeleton features. IEEE Access 2021, 9, 33532–33542. [Google Scholar] [CrossRef]
  220. Awwad, S.; Tarvade, S.; Piccardi, M.; Gattas, D.J. The use of privacy-protected computer vision to measure the quality of healthcare worker hand hygiene. Int. J. Qual. Health Care 2019, 31, 36–42. [Google Scholar] [CrossRef]
  221. Guo, P.; Chiew, Y.S.; Shaw, G.M.; Shao, L.; Green, R.; Clark, A.; Chase, J.G. Clinical Activity Monitoring System (CATS): An automatic system to quantify bedside clinical activities in the intensive care unit. Intensive Crit. Care Nurs. 2016, 37, 52–61. [Google Scholar] [CrossRef]
  222. Peng, G.; Yeong Shiong, C.; Shaw, G.; Chase, G. Validation of clinical activity tracking system in Intensive Care Unit to assess nurse workload distribution. In Proceedings of the 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 25–29 August 2015; Volume 2015, pp. 458–461. [Google Scholar] [CrossRef]
  223. Bashiri, F.S.; LaRose, E.; Peissig, P.; Tafti, A.P. MCIndoor20000: A fully-labeled image dataset to advance indoor objects detection. Data Brief 2018, 17, 71–75. [Google Scholar] [CrossRef]
  224. Ismail, A.; Ahmad, S.A.; Soh, A.C.; Hassan, M.K.; Harith, H.H. MYNursingHome: A fully-labelled image dataset for indoor object classification. Data Brief 2020, 32, 106268. [Google Scholar] [CrossRef]
  225. Hu, D.; Li, S.; Wang, M. Object detection in hospital facilities: A comprehensive dataset and performance evaluation. Eng. Appl. Artif. Intell. 2023, 123, 106223. [Google Scholar] [CrossRef]
  226. Lea, C.; Facker, J.; Hager, G.; Taylor, R.; Saria, S. 3D Sensing Algorithms Towards Building an Intelligent Intensive Care Unit. AMIA Jt. Summits Transl. Sci. Proc. 2013, 2013, 136–140. [Google Scholar] [PubMed]
  227. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 8–10 June 2015; pp. 1–9. [Google Scholar]
  228. Yamashita, R.; Nishio, M.; Do, R.K.G.; Togashi, K. Convolutional neural networks: An overview and application in radiology. Insights Into Imaging 2018, 9, 611–629. [Google Scholar] [CrossRef]
  229. Rezazade Mehrizi, M.H.; van Ooijen, P.; Homan, M. Applications of artificial intelligence (AI) in diagnostic radiology: A technography study. Eur. Radiol. 2021, 31, 1805–1811. [Google Scholar] [CrossRef]
  230. Katzman, B.D.; van der Pol, C.B.; Soyer, P.; Patlas, M.N. Artificial intelligence in emergency radiology: A review of applications and possibilities. Diagn. Interv. Imaging 2023, 104, 6–10. [Google Scholar] [CrossRef] [PubMed]
  231. Kim, B.; Romeijn, S.; van Buchem, M.; Mehrizi, M.H.R.; Grootjans, W. A holistic approach to implementing artificial intelligence in radiology. Insights Into Imaging 2024, 15, 22. [Google Scholar] [CrossRef]
  232. Komura, D.; Ishikawa, S. Machine learning methods for histopathological image analysis. Comput. Struct. Biotechnol. J. 2018, 16, 34–42. [Google Scholar] [CrossRef]
  233. Stollings, J.L.; Kotfis, K.; Chanques, G.; Pun, B.T.; Pandharipande, P.P.; Ely, E.W. Delirium in critical illness: Clinical manifestations, outcomes, and management. Intensive Care Med. 2021, 47, 1089–1103. [Google Scholar] [CrossRef] [PubMed]
  234. Berger, S.E.; Baria, A.T. Assessing Pain Research: A Narrative Review of Emerging Pain Methods, Their Technosocial Implications, and Opportunities for Multidisciplinary Approaches. Front. Pain Res. 2022, 3, 896276. [Google Scholar] [CrossRef]
  235. Nordness, M.F.; Hayhurst, C.J.; Pandharipande, P. Current Perspectives on the Assessment and Management of Pain in the Intensive Care Unit. J. Pain Res. 2021, 14, 1733–1744. [Google Scholar] [CrossRef] [PubMed]
  236. Pisani, M.A.; Devlin, J.W.; Skrobik, Y. Pain and Delirium in Critical Illness: An Exploration of Key 2018 SCCM PADIS Guideline Evidence Gaps. Semin. Respir. Crit. Care Med. 2019, 40, 604–613. [Google Scholar] [CrossRef]
  237. Devlin, J.W.; Skrobik, Y.; Gélinas, C.; Needham, D.M.; Slooter, A.J.C.; Pandharipande, P.P.; Watson, P.L.; Weinhouse, G.L.; Nunnally, M.E.; Rochwerg, B.; et al. Clinical practice guidelines for the prevention and management of pain, agitation/sedation, delirium, immobility, and sleep disruption in adult patients in the ICU. Crit. Care Med. 2018, 46, e825–e873. [Google Scholar] [CrossRef]
  238. Herasevich, S.; Lipatov, K.; Pinevich, Y.; Lindroth, H.; Tekin, A.; Herasevich, V.; Pickering, B.W.; Barwise, A.K. The Impact of Health Information Technology for Early Detection of Patient Deterioration on Mortality and Length of Stay in the Hospital Acute Care Setting: Systematic Review and Meta-Analysis. Crit. Care Med. 2022, 50, 1198–1209. [Google Scholar] [CrossRef] [PubMed]
  239. Davidovitch, M.; Slobodin, O.; Weisskopf, M.G.; Rotem, R.S. Age-Specific Time Trends in Incidence Rates of Autism Spectrum Disorder Following Adaptation of DSM-5 and Other ASD-Related Regulatory Changes in Israel. Autism Res. 2020, 13, 1893–1901. [Google Scholar] [CrossRef] [PubMed]
  240. Christensen, D.L.; Maenner, M.J.; Bilder, D.; Constantino, J.N.; Daniels, J.; Durkin, M.S.; Fitzgerald, R.T.; Kurzius-Spencer, M.; Pettygrove, S.D.; Robinson, C.; et al. Prevalence and Characteristics of Autism Spectrum Disorder Among Children Aged 4 Years—Early Autism and Developmental Disabilities Monitoring Network, Seven Sites, United States, 2010, 2012, and 2014. MMWR Surveill. Summ. 2019, 68, 1–19. [Google Scholar] [CrossRef]
  241. Cowan, S.L.; Preller, J.; Goudie, R.J.B. Evaluation of the E-PRE-DELIRIC prediction model for ICU delirium: A retrospective validation in a UK general ICU. Crit. Care 2020, 24, 123. [Google Scholar] [CrossRef] [PubMed]
  242. Guirguis-Blake, J.M.; Michael, Y.L.; Perdue, L.A.; Coppola, E.L.; Beil, T.L.; Thompson, J.H. U.S. Preventive Services Task Force Evidence Syntheses, formerly Systematic Evidence Reviews. In Interventions to Prevent Falls in Community-Dwelling Older Adults: A Systematic Review for the U.S. Preventive Services Task Force; Agency for Healthcare Research and Quality (US): Rockville, MD, USA, 2018. [Google Scholar]
  243. Petersen, C. Through Patients’ Eyes: Regulation, Technology, Privacy, and the Future. Yearb. Med. Inform. 2018, 27, 10–15. [Google Scholar] [CrossRef]
  244. Vo, V.; Chen, G.; Aquino, Y.S.J.; Carter, S.M.; Do, Q.N.; Woode, M.E. Multi-stakeholder preferences for the use of artificial intelligence in healthcare: A systematic review and thematic analysis. Soc. Sci. Med. 2023, 338, 116357. [Google Scholar] [CrossRef] [PubMed]
  245. Glancova, A.; Do, Q.T.; Sanghavi, D.K.; Franco, P.M.; Gopal, N.; Lehman, L.M.; Dong, Y.; Pickering, B.W.; Herasevich, V. Are We Ready for Video Recognition and Computer Vision in the Intensive Care Unit? A Survey. Appl. Clin. Inform. 2021, 12, 120–132. [Google Scholar] [CrossRef]
  246. Gerke, S.; Yeung, S.; Cohen, I.G. Ethical and Legal Aspects of Ambient Intelligence in Hospitals. JAMA 2020, 323, 601–602. [Google Scholar] [CrossRef] [PubMed]
  247. Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event, 3–10 March 2021; pp. 610–623. [Google Scholar] [CrossRef]
  248. Fournier-Tombs, E.; McHardy, J. A Medical Ethics Framework for Conversational Artificial Intelligence. J. Med. Internet Res. 2023, 25, e43068. [Google Scholar] [CrossRef]
  249. Kim, B.; Kim, H.; Kim, K.; Kim, S.; Kim, J. Learning not to learn: Training deep neural networks with biased data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9012–9020. [Google Scholar]
  250. Karkkainen, K.; Joo, J. Fairface: Face attribute dataset for balanced race, gender, and age for bias measurement and mitigation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 1548–1558. [Google Scholar]
  251. Bahng, H.; Chun, S.; Yun, S.; Choo, J.; Oh, S.J. Learning de-biased representations with biased representations. In Proceedings of the International Conference on Machine Learning, Virtual, 13–18 July 2020; PMLR: London, UK, 2020; pp. 528–539. [Google Scholar]
  252. Nam, J.; Cha, H.; Ahn, S.; Lee, J.; Shin, J. Learning from failure: De-biasing classifier from biased classifier. Adv. Neural Inf. Process. Syst. 2020, 33, 20673–20684. [Google Scholar]
  253. Morley, J.; Floridi, L. An ethically mindful approach to AI for health care. Lancet 2020, 395, 254–255. [Google Scholar] [CrossRef] [PubMed]
  254. Jobin, A.; Ienca, M.; Vayena, E. The global landscape of AI ethics guidelines. Nat. Mach. Intell. 2019, 1, 389–399. [Google Scholar] [CrossRef]
  255. Elendu, C.; Amaechi, D.C.; Elendu, T.C.; Jingwa, K.A.; Okoye, O.K.; John Okah, M.; Ladele, J.A.; Farah, A.H.; Alimi, H.A. Ethical implications of AI and robotics in healthcare: A review. Medicine 2023, 102, e36671. [Google Scholar] [CrossRef] [PubMed]
  256. Acemoglu, D.; Restrepo, P. Artificial intelligence, automation, and work. In The Economics of Artificial Intelligence: An Agenda; University of Chicago Press: Chicago, IL, USA, 2018; pp. 197–236. [Google Scholar]
  257. Braun, M.; Hummel, P.; Beck, S.; Dabrock, P. Primer on an ethics of AI-based decision support systems in the clinic. J. Med. Ethics 2020, 47, e3. [Google Scholar] [CrossRef]
  258. Khanna, N.N.; Maindarkar, M.A.; Viswanathan, V.; Fernandes, J.F.E.; Paul, S.; Bhagawati, M.; Ahluwalia, P.; Ruzsa, Z.; Sharma, A.; Kolluri, R.; et al. Economics of Artificial Intelligence in Healthcare: Diagnosis vs. Treatment. Healthcare 2022, 10, 2493. [Google Scholar] [CrossRef] [PubMed]
  259. Bekbolatova, M.; Mayer, J.; Ong, C.W.; Toma, M. Transformative Potential of AI in Healthcare: Definitions, Applications, and Navigating the Ethical Landscape and Public Perspectives. Healthcare 2024, 12, 125. [Google Scholar] [CrossRef] [PubMed]
  260. Deng, J.; Dong, W.; Socher, R.; Li, L.-J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 248–255. [Google Scholar]
  261. Ying, X. An overview of overfitting and its solutions. Proc. J. Phys. Conf. Ser. 2019, 1168, 022022. [Google Scholar] [CrossRef]
  262. Montesinos López, O.A.; Montesinos López, A.; Crossa, J. Overfitting, model tuning, and evaluation of prediction performance. In Multivariate Statistical Machine Learning Methods for Genomic Prediction; Springer: Berlin/Heidelberg, Germany, 2022; pp. 109–139. [Google Scholar]
  263. Garcea, F.; Serra, A.; Lamberti, F.; Morra, L. Data augmentation for medical imaging: A systematic literature review. Comput. Biol. Med. 2023, 152, 106391. [Google Scholar] [CrossRef]
  264. Salehi, A.W.; Khan, S.; Gupta, G.; Alabduallah, B.I.; Almjally, A.; Alsolai, H.; Siddiqui, T.; Mellit, A. A Study of CNN and Transfer Learning in Medical Imaging: Advantages, Challenges, Future Scope. Sustainability 2023, 15, 5930. [Google Scholar] [CrossRef]
265. Wu, Y.; Kirillov, A.; Massa, F.; Lo, W.-Y.; Girshick, R. Detectron2. 2019. Available online: https://detectron2.readthedocs.io/en/latest/ (accessed on 15 July 2023).
266. Chen, K.; Wang, J.; Pang, J.; Cao, Y.; Xiong, Y.; Li, X.; Sun, S.; Feng, W.; Liu, Z.; Xu, J. MMDetection: Open MMLab detection toolbox and benchmark. arXiv 2019. [Google Scholar] [CrossRef]
267. Cao, Z.; Hidalgo, G.; Simon, T.; Wei, S.-E.; Sheikh, Y. OpenPose: Realtime multi-person 2D pose estimation using part affinity fields. arXiv 2018. [Google Scholar] [CrossRef]
  268. Chang, K.; Balachandar, N.; Lam, C.; Yi, D.; Brown, J.; Beers, A.; Rosen, B.; Rubin, D.L.; Kalpathy-Cramer, J. Distributed deep learning networks among institutions for medical imaging. J. Am. Med. Inform. Assoc. 2018, 25, 945–954. [Google Scholar] [CrossRef] [PubMed]
  269. Everingham, M.; Van Gool, L.; Williams, C.K.; Winn, J.; Zisserman, A. The pascal visual object classes (voc) challenge. Int. J. Comput. Vis. 2010, 88, 303–338. [Google Scholar] [CrossRef]
  270. Lin, T.-Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft coco: Common objects in context. In Proceedings of the Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, 6–12 September 2014; Proceedings, Part V 13. Springer: Berlin/Heidelberg, Germany, 2014; pp. 740–755. [Google Scholar]
Figure 1. Completed journey map of the patient and clinician through the healthcare system. The figure depicts a scenario demonstrating the application of computer vision in a hospital setting. Each data point marks a specific instance where computer vision could enhance the system, optimizing the patient's experience and the clinical workflow and directing time and resources more efficiently toward patient care management and improved patient outcomes.
Figure 2. Venn diagram of the main applications of computer vision in healthcare. Our review identified three central domains: monitoring, job enhancement, and automation; overlaps between the categories are shown. Bold text marks applications frequently reported in the literature, while non-bold text highlights potential uses that are still in development/testing or awaiting proof-of-concept work.
Table 1. Types of machine learning models used in computer vision.

| Approach | Supervision | Machine Learning Model | Description |
|---|---|---|---|
| Deep learning | Supervised | Convolutional neural network (CNN) | Mostly used for classification and segmentation; includes a wide range of architectures such as ResNet, VGG-Net, and AlexNet. |
| Deep learning | Supervised | Mask region-based CNN (Mask-RCNN) | CNN type primarily employed for detecting objects in input images and outlining them with instance masks. |
| Deep learning | Supervised | YOLO | Single-stage CNN family primarily employed for real-time object detection. |
| Deep learning | Supervised | U-Net | Type of CNN mainly used for image segmentation. |
| Deep learning | Supervised | Gated recurrent unit (GRU); long short-term memory (LSTM) | Recurrent neural networks tailored to time-dependent data, addressing long-range dependencies in sequential data. |
| Deep learning | Supervised | Vision transformer | Attention-based alternative to the CNN; adopts the transformer architecture common in NLP and shows high performance on image classification benchmarks. |
| Deep learning | Unsupervised | Convolutional deep belief network (CDBN) | Deep generative model constructed by stacking max-pooling convolutional restricted Boltzmann machines (CRBMs). |
| Deep learning | Unsupervised | Autoencoder | Neural network that learns a compact, efficient representation of its input; often employed for dimensionality reduction. |
| Traditional | Supervised | k-nearest neighbors | Assigns class labels or values according to the distance of the input to its k nearest neighbors in the training data. |
| Traditional | Supervised | Binary tree | Decision-making algorithm that navigates a tree from root to leaf, branching on specific features or attributes. |
| Traditional | Supervised | Naïve Bayes | Probabilistic algorithm that classifies data by assuming conditional independence between every pair of features. |
| Traditional | Supervised | Support vector machine (SVM) | Uses the kernel trick to find a linear decision boundary that separates the input data in a transformed space. |
| Traditional | Supervised | Fuzzy inference system | Computational model that uses fuzzy logic to reason over uncertain or imprecise information. |
| Traditional | Supervised | Fisher's linear discriminant analysis | Classifies input data using a linear combination of features that best separates the classes. |
| Traditional | Supervised | Linear mixed model (LMM) | Extension of simple linear models that allows both fixed and random effects; useful for complex, correlated data. |
| Traditional | Supervised | Logistic/linear regression | Statistical models fitting a linear relationship to the data; logistic regression applies the logistic function to predict the probability of a specific class. |
| Traditional | Supervised and unsupervised | Random forest | Ensemble method comprising multiple decision trees trained on random subsets of the data; the final prediction is aggregated across all trees. |
| Traditional | Supervised and unsupervised | Neural network | Conventional shallow network employed for classification and regression; generally less accurate than modern deep models. |
| Traditional | Supervised and unsupervised | Singular value decomposition (SVD) | Factorizes the input feature matrix into three component matrices; commonly used for dimensionality reduction. |
| Traditional | Unsupervised | Fuzzy C-means | Clustering algorithm that assigns each data point a degree of membership in every cluster rather than a single hard label. |
| Traditional | Unsupervised | Gaussian mixture model segmentation | Uses Gaussian distributions to partition pixels into similar segments. |
Table 1 outlines the types of machine learning models used in CV algorithm development. As computer vision has progressed, the use of deep supervised models has increased, including the transformers and autoencoders listed above.
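To make the contrast between the traditional and deep approaches in Table 1 concrete, the minimal sketch below pairs a classical supervised model (a scikit-learn support vector machine on flattened pixel features) with a small convolutional network in PyTorch. It is an illustrative sketch only: the random data, the 64 × 64 grayscale image size, and the two-class setup are hypothetical placeholders, not drawn from any study cited in this review.

```python
# Illustrative sketch only: a traditional supervised model (SVM) next to a
# minimal CNN of the kind listed in Table 1. All data and shapes are
# hypothetical placeholders.
import numpy as np
import torch
import torch.nn as nn
from sklearn.svm import SVC

# --- Traditional approach: SVM on flattened pixel features ---
X = np.random.rand(100, 64 * 64)   # 100 fake 64x64 grayscale images, flattened
y = np.random.randint(0, 2, 100)   # binary labels (e.g., finding present/absent)
svm = SVC(kernel="rbf").fit(X, y)  # kernel trick: linear boundary in a transformed space
print("SVM prediction:", svm.predict(X[:1]))

# --- Deep approach: minimal CNN for image classification ---
class TinyCNN(nn.Module):
    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # 64x64 input -> 16x16 maps

    def forward(self, x):
        x = self.features(x)                  # learned spatial filters
        return self.classifier(x.flatten(1))  # flatten feature maps, then classify

model = TinyCNN()
logits = model(torch.randn(1, 1, 64, 64))     # one fake grayscale image
print("CNN logits:", logits.detach().numpy())
```

The SVM requires hand-engineered or flattened input features, whereas the CNN learns its own spatial filters from the raw pixels; this difference is a key reason deep supervised models have displaced traditional ones in computer vision.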
Table 2. Uses and categorizations (themes) of computer vision in industries outside healthcare.

| Industry | Job Enhancement | Surveillance (S)/Monitoring (M) | Automation | Augmented Reality |
|---|---|---|---|---|
| Agriculture | | Crop monitoring (M) [15,38,39] | Weed detection and elimination (https://weedbot.eu/, accessed on 1 July 2023) | |
| Animal control | | Wildlife monitoring (M) [49]; farm animal monitoring (M) [50] | | |
| Art security | | Forgery detection (S) [35,36,37,51] | | |
| Automotive | | Parking lot analysis (M) [52] | Self-driving cars (https://tesla.com/, accessed on 15 July 2023) | |
| Digital design | Video enhancement [53]; image/video deblurring [54] | | | |
| Education | | Cheating prevention (S) [55] | | |
| Engineering | Importing real-life objects into modeling software [56] | | | |
| Food service | | Reducing food waste in restaurants (M) [57] | Robotic food delivery (https://starship.xyz/, accessed on 15 July 2023) | |
| Gaming | | | | Xbox Kinect [58]; gesture-based gaming [59] |
| Government | Traffic light control [60] | Detecting natural disasters (M) [61] | | |
| Insurance | Insurance appraisals [10,62,63] | | | |
| Law enforcement | Forensic analysis [64] | Facial recognition in large crowds (S) [65]; identity verification (S) [66]; detecting dangerous situations (S) [67] | Speeding enforcement [14] | |
| Manufacturing | Workplace inspection [69] | | Detecting defective products on an assembly line [68] | |
| Medical | See Table 3 | | | |
| Military | | Terrain reconnaissance (S) [70] | Automated military drones [71] | |
| Movies | Film restoration [72] | | | |
| Retail | Customer behavior analysis via traffic-volume heatmaps; staff-demand forecasting for optimal shift assignments (https://n-ix.com/, accessed on 1 July 2023) | Detecting defective products (M) [73]; detecting restocking needs [75]; identifying retail products at sale (https://n-ix.com/, accessed on 1 July 2023) | Floor-cleaning robots [74] | Trying on clothes virtually [43,44,45]; virtual testing and visualization of products in their intended space [76] |
| Robotics | | | Helping robots navigate their environment (https://inbolt.com/, accessed on 1 July 2023) | |
| Social media | Content recommendation [77] | Inappropriate content detection [78] | | |
| Space | | Tracking asteroids and debris (M) [79] | Landing spacecraft [80,81] | |
| Sports | Finding game highlights in videos [82]; sport performance analysis [33,34]; ball tracking [83,84] | | Refereeing automation [33] | |
| Tech | | Facial recognition on personal mobile devices (M) [85] | Language translation of video and images [86] | Azure Kinect [87]; Apple Vision Pro (https://apple.com/apple-vision-pro/, accessed on 15 July 2023) |
Table 2 outlines the use of computer vision across different industries (first column) and how each use maps onto the identified themes. The list is not exhaustive; it illustrates how other industries use computer vision to improve system efficiencies and offers insight into potential healthcare applications.
Table 3. Computer vision (CV) applications in healthcare.

| Areas | Citations | Image Type | Model(s) | Application |
|---|---|---|---|---|
| Medical imaging and diagnosis | [98,99,100,101,102,103] | CT, F-FDG PET/CT, chest X-rays | Mask-RCNN, CNN, transformer, SVM, random forest, k-nearest neighbor | Lung cancer, tuberculosis |
| | [104,105,106] | Iris, cellular retinal, fundus images | Binary tree, random forest, SVM, neural network, CNN | Changes in vision related to diabetes |
| | [107,108,109] | HD microscope | Vision transformer, CNN | Cervical cancer |
| | [97,110,111,112,113,114,115,116,117,118,119] | Mammogram, whole-slide images, hematoxylin and eosin | YOLO, CNN, random forest, SVM, decision tree, Naïve Bayes, logistic linear classifier, linear discriminant classifier, Fisher's linear discriminant analysis, k-nearest neighbor, autoencoders | Breast cancer, data augmentation |
| | [120,121,122,123] | Dermoscopic images | CNN, gated recurrent unit | Skin cancer detection/segmentation |
| | [124,125,126,127] | Endoscopic images, hematoxylin and eosin, whole-slide images | CNN, transformer, U-Net | Colorectal and gastrointestinal cancer |
| | [128,129,130,131,132,133] | Chest X-rays | CNN, transformer, logistic regression | COVID-19 diagnosis; age estimation in unidentified patients |
| | [134] | Whole-slide images | Vision transformer | Subtyping of papillary renal cell carcinoma |
| | [25,29,30,98,127,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151] | MRI, histogram, CT, X-ray, ultrasound, PET | CNN, Naïve Bayes, random forest, neural networks, SVM, k-nearest neighbor, decision tree, logistic function, fuzzy k-means | Cancers (brain, bladder, breast, liver, lung, pancreas, prostate, other); CT reconstruction; Alzheimer's disease; intracranial hemorrhage |
| | [152,153,154,155] | Dual-energy X-ray absorptiometry (DEXA), X-ray | SVM; YOLOv8.0, Detectron2, and several others (see systematic review) | Lumbar spine fractures; pediatric fractures; overall fracture identification |
| Delirium | [156] | Surveillance images | CNN, k-nearest neighbors | Delirium monitoring |
| Pain, agitation, stress, level of sedation | [157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180] | Surveillance images, depth images, face images, pain datasets | YOLO, Mask-RCNN, CNN, CDBN, SVM, LSTM, LMM, neural network | Activity recognition; detection of pain, discomfort, and stress; automated facial analysis for grimacing; agitation; eye localization; depression, anxiety, and stress levels; AUs |
| Patient deterioration | [181,182] | Color videos | Logistic/linear regression | Deterioration prediction using AUs |
| Mechanical ventilation | [183,184,185] | Chest X-rays, ICU videos | U-Net, YOLO, TL, feature descriptor | Predicting the need for mechanical ventilation; detecting and recognizing ventilation objects and positioning; estimating lung volume |
| Mobility | [186,187] | ICU video images | CNN, YOLOv2 | Patient mobilization activities in the ICU, NIMS |
| Patient safety | [188,189,190,191,192,193,194,195] | Surgical videos, depth images, video recordings | OpenPose, YOLO, CNN, Mask-RCNN | Surgical team behavioral analysis; patient mobilization activities; hand hygiene; ICU staff monitoring; assessing situational awareness |
| Surgical assistance | [194,196,197,198,199] | Surgical activity images, OR videos | CNN | Robot-assisted surgery; situational awareness in the OR |
| Neurological, neurodevelopmental, and psychiatric disorders | [142,162,174,178,200,201,202,203,204,205,206,207,208,209] | Whole-body video recordings, MRI, PET, patient images | Detectron2, OpenPose, CNN, k-nearest neighbor, SVM, K-SVD, Bayesian networks | Analysis of gait synchrony and balance; infant neuromotor risk; neurodegenerative disease; behavioral analysis in ASD and ADHD; facial expression in depression; facial weakness |
| Remote monitoring, telemedicine | [210,211] | Surveillance images | Deep reinforcement learning, CNN | In-home elbow rehabilitation |
| Data security and privacy | [114,212,213,214] | X-rays, MRI | CNN, fuzzy CNN | Privacy protections for deep learning algorithms containing medical data |
| Fall detection | [215,216,217,218,219] | Surveillance images | Gaussian mixture model, CNN segmentation, AlphaPose, OpenPose, LSTM | Human fall detection |
| Hospital scene recognition | [192,220,221,222,223,224,225,226] | Indoor images of ICU, hospital, and nursing home; pediatric ICU videos | YOLO, CNN, SVM, CATS | ICU and hospital indoor object detection; hand hygiene; ICU activity measurement |
Table 3 describes the varying uses of CV technology in healthcare, outlining the image types captured, the machine learning models used, and the focus areas. This is not an exhaustive list. Abbreviations: ADHD: attention-deficit/hyperactivity disorder; ASD: autism spectrum disorder; AU: action unit; CATS: Clinical Activity Tracking System; CNN: convolutional neural network; NIMS: Non-Invasive Mobility Sensor; OR: operating room.
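Many of the bedside applications in Table 3 (mobility, fall detection, hospital scene recognition) build on an off-the-shelf detector or pose estimator rather than a model trained from scratch. As a hedged illustration of that pattern, the sketch below runs a COCO-pretrained Faster R-CNN from torchvision on a single frame and keeps confident person detections; the image path is a hypothetical placeholder, and any real deployment would add the de-identification, consent, and privacy safeguards discussed in this review.

```python
# Minimal sketch: person detection with a pretrained detector, the kind of
# building block behind several Table 3 applications (mobility, falls,
# scene recognition). The image path below is a hypothetical placeholder.
import torch
from torchvision.io import read_image
from torchvision.models.detection import fasterrcnn_resnet50_fpn
from torchvision.transforms.functional import convert_image_dtype

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")  # COCO-pretrained weights
model.eval()

frame = read_image("icu_room_frame.jpg")         # placeholder path, CHW uint8 tensor
frame = convert_image_dtype(frame, torch.float)  # rescale to [0, 1] floats

with torch.no_grad():
    preds = model([frame])[0]  # dict of boxes, labels, and scores for this frame

# In the COCO label map, category 1 is "person"; keep confident detections only.
keep = (preds["labels"] == 1) & (preds["scores"] > 0.8)
print("Person bounding boxes:", preds["boxes"][keep])
```

A downstream application would then reason over these boxes across frames, for example flagging a patient detected on the floor (fall detection) or counting staff at the bedside (workload quantification).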