Review

Artificial Intelligence in Ophthalmology: Advantages and Limits

by Hariton-Nicolae Costin 1,*,†, Monica Fira 1,† and Liviu Goraș 2,†
1 Institute of Computer Science, Romanian Academy Iași Branch, 700481 Iași, Romania
2 Faculty of Electronics, Telecommunications & Information Technology, Gheorghe Asachi Technical University of Iași, 700050 Iași, Romania
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2025, 15(4), 1913; https://doi.org/10.3390/app15041913
Submission received: 7 December 2024 / Revised: 23 January 2025 / Accepted: 10 February 2025 / Published: 12 February 2025
(This article belongs to the Special Issue Recent Progress and Challenges of Digital Health and Bioengineering)

Abstract:
In recent years, artificial intelligence has begun to play a salient role in various medical fields, including ophthalmology. This extensive review is addressed to ophthalmologists and aims to capture the current landscape and future potential of AI applications for eye health. From automated retinal screening processes and machine learning models predicting the progression of ocular conditions to AI-driven decision support systems in clinical settings, this paper provides a comprehensive overview of the clinical implications of AI in ophthalmology. The development of AI has opened new horizons for ophthalmology, offering innovative solutions to improve the accuracy and efficiency of ocular disease diagnosis and management. The importance of this paper lies in its potential to strengthen collaboration between researchers, ophthalmologists, and AI specialists, leading to transformative findings in the early identification and treatment of eye diseases. By combining AI potential with cutting-edge imaging methods, novel biomarkers, and data-driven approaches, ophthalmologists can make more informed decisions and provide personalized treatment for their patients. Furthermore, this paper emphasizes the translation of basic research outcomes into clinical applications. We do hope this comprehensive review will act as a significant resource for ophthalmologists, researchers, data scientists, healthcare professionals, and managers in the healthcare system who are interested in the application of artificial intelligence in eye health.

1. General Introduction to AI for Clinicians

First, not every automation, however complex, is AI. So, what is not AI? Technologies that require humans to operate them, make the decisions, and control their functions are not AI; such technology merely executes fixed algorithms. AI can learn from its experience, whereas non-AI technology cannot improve itself.
One can appreciate that the essential features of AI are as follows:
  • Autonomy
Autonomy is the ability of AI software to perform tasks in complex environments without constant guidance from a user.
  • Adaptability
Adaptability is the ability to approach new situations in the environment and to improve performance by learning from experience.
Some top developments that have renewed interest in AI and enabled access to a wide scale in the last two decades are
-
the availability of large amounts of data (big data) for various applications and
-
increased computing power at lower cost through hardware such as graphics processing units (GPUs) within computing systems.
For example, the ImageNet dataset has been widely used to train popular models such as AlexNet, VGG16, Inception, and ResNet (www.image-net.org, accessed on 5 November 2024).
Other datasets are available for applications such as music, facial recognition, text, and natural language processing. Because AI is a rapidly developing field, new innovations and applications continue to appear, including in general medicine and ophthalmology. Definitions of frequently used AI terms are given in Appendix A.

1.1. Should a Clinician Worry About AI?

A quick search in the PubMed database shows that the number of papers published on medical AI has reached 226,403 in the last 25 years, and 153,776 (67.9%) of these articles were written from 2018 to date [1]!
Another piece of information comes from the Gartner Hype Cycle, which tracks and predicts how a technology will develop over time [2]. Machine learning (ML) and deep learning (DL) were at their hype peak from 2016 to 2020, and today, the number of applications and the demand for them are growing again (Figure 1).
Despite the lack of formal preparation to understand or evaluate such applications and the concepts behind them, the medical world has recently made substantial efforts to use AI in the delivery of patient care. However, failures of AI in medicine could erode public trust in healthcare, and such failures can happen in many ways. For instance, bias in AI can produce wrong medical assessments, while intentional “adversarial” attacks could destabilize AI systems unless detected by explicit algorithmic defenses. As the most visible ‘driver’ closest to where AI is used in a clinical setting, the clinician could easily end up being held liable for harmful decisions. AI systems developed under current models risk using clinicians as ‘liability sinks’, absorbing liability that could otherwise be shared across all those involved in the design, running, and use of the system. Alternative models can return the patient to the center of decision-making and also allow the clinician to do what they are best at, rather than simply acting as a final check on a software program [4].
New approaches such as Explainable AI increase user confidence in AI, ML, and DL.

1.2. How to Read and Understand a Medical Paper That Uses AI?

Jaeschke et al. proposed a framework for evaluating diagnostic tests in clinical medicine [5]. This framework can be extended to include relevant information about AI-based algorithms.
Step 1: Assess whether the study outcomes are valid.
Main guidelines
-
Has there been an independent, blind comparison with a reference standard?
-
Did the patient sample include an adequate spectrum of patients to whom the diagnostic test will be applied in clinical practice?
For AI-powered methods, they can be adjusted as follows:
  • Are the datasets adequate and described in sufficient detail?
  • Was the reference level for training the algorithm adequate and well-founded?
Subsidiary guidelines
-
Did the outcomes of the assessed test affect the decision to apply the “gold” standard?
-
Were the test methods described in sufficient detail to allow replication?
For AI-based algorithms, they can be adapted as follows:
  • Is the algorithm development methodology described in sufficient detail to allow replication?
  • Are the algorithm/datasets used available for external validation?
Step 2: Evaluate the presented results.
-
Are likelihood ratios presented for the test results, or are the data required to calculate them provided?
Note: In statistics, the likelihood ratio test evaluates the fit of two competing statistical models.
For AI-based algorithms, they can be adapted as follows:
  • Are appropriate performance metrics reported? [6]
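The likelihood ratios mentioned in the note above can be computed directly from a test’s reported sensitivity and specificity. The following minimal sketch (the numbers are invented for illustration, not taken from any cited study) shows the standard formulas:

```python
# Likelihood ratios of a binary diagnostic test:
#   LR+ = sensitivity / (1 - specificity)   (how much a positive result raises the odds of disease)
#   LR- = (1 - sensitivity) / specificity   (how much a negative result lowers them)

def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a binary diagnostic test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Example: a hypothetical test with 90% sensitivity and 95% specificity.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.95)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")  # LR+ = 18.0, LR- = 0.11
```

A paper that reports sensitivity and specificity therefore provides, at minimum, the data needed for these calculations.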
Step 3: Evaluate the usefulness of the results in the care of your patients.
-
Will the reproducibility of the test result and its interpretation be satisfactory in my setting?
-
Are the results applicable to my patients?
-
Will the results change my usual management?
-
Will patients be better off as a result of the new test?
For AI-powered methods, they may be adjusted as follows:
  • Are the method’s conclusions accountable and explainable?
  • Does the method generalize (can it be easily adapted to other input data)?
  • Was the reported performance of the initial method overly optimistic?
  • Has the method been validated in my local population?
  • Is there a cost-effectiveness evaluation for using the method?
  • Will there be a relevant impact on the patient’s quality of life after applying the AI method?
  • Is there any attempt to measure this impact?
Example: Framework for evaluating a study using AI in medicine
Title of the study and associated report: Clinically applicable deep learning for diagnosis in retinal disease [7].
Aim: To develop a patient triage system based on AI, by means of 3D OCT data.
Step 1: Assess whether the study outcomes are well-founded
  • Has there been an independent, blind comparison with a reference standard?
-
Are the datasets adequate and described in sufficient detail?
Yes: The authors describe in detail the training set for OCT segmentation (Topcon) (877 scans), the validation set for segmentation (224 scans), the training set for classification (14,884 scans), the validation set for classification (993 scans), and the test set (997 random scans) used to compare the method with the reference standard.
  • Did the patient group incorporate an adequate range of patients for the application of the diagnostic measurements in a clinical setting?
-
Was the benchmark for training/testing the algorithm adequate and reliable?
Yes: Segmentation of the training data was manually done by trained ophthalmologists. For the purpose of classification, the validation set was annotated by three experts, and the test set used the gold standard for the complete clinical records to establish the diagnosis of the patient. The performance of the method was confronted with four ophthalmologists specializing in the retina and with four optometrists.
  • Did the outcomes of the test in question affect the decision to apply the reference gold standard?
No, the reference gold standard was based on retrospective data from complete clinical records of patients undergoing the current standard of care.
  • Were the methods for performing the test described detailed enough to allow replication?
-
Is the algorithm development described in sufficient detail to permit replication?
-
Are the algorithm and datasets used available for external validation?
The researchers explained the application of the U-Net architecture in detail but noted that the data are not publicly accessible, though they may be made available upon request, subject to local and national ethics approval. In a later paper, the same researchers made the segmentation algorithm and dataset publicly available for validation [8].
Step 2: Assess the reported outcomes
  • Are the likelihood ratios of the outcomes given, or at least the data needed for their computation provided?
-
Are performance metrics adequately reported? [6]
Yes: The authors report receiver operating characteristic (ROC) curves (see Appendix B), confusion matrices, accuracies, and the influence of the extra data (OCT only, OCT + fundus + entire case abstract) on experts’ conclusions.
The method achieved an AUC (area under the ROC curve, Appendix B) of 99.21% and an error rate of 5.5% (55/997).
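The AUC reported in such studies has a direct probabilistic reading: it is the probability that a randomly chosen diseased case receives a higher score than a randomly chosen healthy one. The following sketch computes AUC this way from invented toy scores (a rank-based, pure-Python illustration, not the study’s evaluation code):

```python
# AUC via the Mann-Whitney U statistic: the fraction of (positive, negative)
# score pairs in which the positive case outscores the negative one
# (ties count as half a win).

def auc(scores_pos, scores_neg):
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Toy example: diseased eyes tend to receive higher model scores.
diseased = [0.9, 0.8, 0.75, 0.6]
healthy = [0.7, 0.4, 0.3, 0.2]
print(auc(diseased, healthy))  # 0.9375 -- one healthy eye outscores one diseased eye
```

An AUC of 1.0 means the two score distributions never overlap; 0.5 means the scores carry no diagnostic information.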
Step 3: Assess the usefulness of the outcomes for patient care
  • Will the test’s reproducibility and its explanation be adequate for the attending physician?
-
Are the outcomes of the method used explainable?
-
Does the method show generalization?
-
Was the performance of the original method overly optimistic?
The authors described information for generalizing the use of their method to other OCT equipment (Spectralis). Although the method initially performed poorly, with an error rate of 46.6% for classification, retraining the segmentation stage increased the AUC to 99.93% and decreased the error rate to 3.4% (4/116).
Thus, the software proved to be adaptable to another machine. The algorithm developers also tried to incorporate explainable elements into the AI pipeline by producing segmentation maps highlighting retinal structures, pathology, and artifacts, together with predictions of diagnostic likelihood and recommendations. Yet, in the videos provided as supplementary documents, the automatic segmentation was not always precise.
  • Are the outcomes relevant to my patients? Equivalently, are we in the realm of personalized medicine (a very popular approach today)?
-
Has the algorithm been validated in the local population?
No, the outcomes are from patients at Moorfields Hospital, London, UK. They will need further confirmation in different ethnicities and research centers before they can be applicable to your patients.
  • Will the results change my management?
-
Is there any comparison of the new method with current standards of care?
-
Is there any cost-effectiveness examination to justify using the algorithm?
The algorithm was analyzed by four ophthalmologists and four optometrists. It performed similarly to, or even better than, the specialists. However, there was no attempt to evaluate its cost-effectiveness against current standards of care.
  • Will patients be better cared for because of the test?
-
Will there be a relevant impact on the patient’s health after implementing the new method?
-
Is there any endeavor to estimate or calculate its impact?
The algorithm has the capacity to act as a decision aid for the clinician, but its immediate impact on the patient cannot yet be evaluated. No trial was carried out by the stakeholders to estimate its possible real-world impact.

1.3. Partial Conclusions

  • Exciting times lie ahead in medicine as well, because of the huge possibilities of AI as a tool of clinically aided decision-making.
  • Yet, possible legal and ethical problems, such as liability management, decreased clinical expertise due to the overuse of algorithms, inadequate representation of data (mainly for various minorities), the absence of individual privacy, “biomarking” because of intensive testing, and an inappropriate understanding of outcomes (with AI seen as a black box), can all hold back the implementation and acceptance of AI methods [9].
  • AI algorithms are fundamentally unbiased. Yet, bias in the training data, the inherent biases of software developers, and the diversity of medical schools of thought can ultimately create complex ethical problems that are hard to interpret.
  • A critical assessment by all stakeholders before any acquisition of new technologies will help separate reality from “hype”.
  • It should be kept in mind that the main goal of the medical world is to always provide patients with the “best” procedures of care on the market.
  • The accessibility, availability, and social impact of the “healthcare model” should also be taken into consideration when doing so.

2. Technical Introduction to AI for Clinicians

2.1. Brief Introduction to AI

The Institute of Electrical and Electronics Engineers (IEEE) associates specific areas with AI:
  • Artificial neural networks (ANNs), built on connectionist paradigms, which seek to imitate the human brain;
  • Evolutionary algorithms that use bio-inspired optimization methods, such as the mechanism of natural selection;
  • Fuzzy logic, which may emulate the natural language of humans, modifying classical logic.
These paradigms have been grouped under the concept of computational intelligence (CI). The principal feature of CI is the numerical representation of knowledge, as opposed to the symbolic representation used in traditional AI.
Vapnik and Chervonenkis developed the statistical foundations of systems that automatically learn from data, with different strategies for solving classification and regression problems. This is how the new field of machine learning (ML)—automatic learning—appeared. ML includes methods such as artificial neural networks, regression and classification trees, and Support Vector Machines (SVMs).
Rosenblatt introduced the perceptron, and the backpropagation (BP) algorithm, popularized by Rumelhart, Hinton, and Williams, made training multilayer perceptrons practical.
Neural networks were then extended by adding new layers. This model was called deep learning (DL), which is currently one of the most effective techniques for approaching and implementing complex applications, e.g., in image processing and computer (artificial) vision, including in medicine [10]. As the number of neural network parameters has increased, new problems have arisen in the training stage: many more synaptic weights need to be tuned, demanding huge training databases, more computing power, and increased processing time.
AI has been applied to various challenges and complex problems in medicine, research, and clinical decision-making: data mining, decision support systems, medical imaging, etc. In medical specialties that demand a lot of analysis time (radiology, ophthalmology), AI offers real help in increasing the quality of the medical act.

2.2. Difference Between Machine Learning and Deep Learning

To train a classical ML model, a preprocessing step must be added. This stage aims to reduce the amount of input data and select the most relevant characteristics.
This procedure is named feature extraction and is relevant for any classification or regression task that uses ML. Many feature extraction methods exist, and the steps chosen depend on the analyst’s experience and on the limitations and needs of the problem. This stage generally takes significant time in a well-implemented project, and network performance depends crucially on it.
The deep learning model tries to automate the feature extraction process (Figure 2b), turning it into a “black box” procedure. In this way, the first layers extract the characteristics to be used by the last layers in the decision (classification or regression) problem.
Crucially, these applications require much more data to improve the process and its outcomes. To cope with this, one can use refined techniques, e.g., bootstrapping or, in image processing applications, data augmentation.
Note: Bootstrapping is a technique for inferring results for a population from the results obtained on a collection of smaller random samples of that population, drawn with replacement during the sampling process. Data augmentation enriches training data by creating new data through random transformations of existing data. Thus, we artificially increase the size of the training set, reducing “overfitting”.
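Both techniques can be sketched on toy data; the dataset, the resampled statistic (the mean), and the flip transform below are invented purely for illustration:

```python
import random

# Bootstrapping: resample a dataset WITH replacement many times to estimate
# the spread of a statistic (here, the mean) without collecting new data.
def bootstrap_means(data, n_resamples=1000, seed=42):
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(data) for _ in data]  # sampling with replacement
        means.append(sum(sample) / len(sample))
    return means

# Data augmentation: enlarge a training set by randomly transforming existing
# examples; here, a horizontal flip of a tiny "image" given as rows of pixels.
def augment_flip(image):
    return [row[::-1] for row in image]

image = [[1, 2, 3],
         [4, 5, 6]]
print(augment_flip(image))  # [[3, 2, 1], [6, 5, 4]]

means = bootstrap_means([2.0, 4.0, 6.0, 8.0])
print(min(means), max(means))  # every resampled mean stays within the data range
```

In practice, augmentation for fundus or OCT images also includes rotations, crops, and brightness changes, each producing a plausible new training example from an existing one.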

2.3. Machine Learning (ML)

Briefly, machine learning is an AI paradigm in which software programs extract relationships and traits/characteristics through a learning procedure, often without being specifically programmed for the task. For example, automated processing of eye images as a diagnostic aid is a research challenge: finding the best paradigm with the least computational effort [12].
In ophthalmology, choosing the best method of representation, analysis and diagnosis using fundus images of the eyes is a complex computational problem [13].

Types of Learning Methods for ML

ML algorithms are mainly organized according to their learning model.
In this respect, supervised and unsupervised training are extensively used in relation to still images and videos. Similarly, reinforcement learning can be used without human intervention when performing eye surgery.
The characteristics of the three learning methods are shown in Figure 3:
  • Supervised learning: The network is given inputs and their desired outputs and learns the mapping between the inputs and those desired outputs;
  • Unsupervised learning: The network is given input data only and must group them into n classes based on the similarity between those data;
  • Reinforcement learning: Actions and states are fed as inputs, and the network learns a policy specifying which action to take in a particular state.
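The first two paradigms can be sketched on invented 1-D data: a nearest-centroid rule stands in for supervised learning, and a simple two-means grouping stands in for unsupervised learning (both are minimal illustrations, not clinical methods):

```python
# Supervised learning: labeled examples -> learn per-class centroids, then
# predict the label of a new point by its nearest centroid.
def fit_centroids(xs, labels):
    cents = {}
    for lab in set(labels):
        pts = [x for x, l in zip(xs, labels) if l == lab]
        cents[lab] = sum(pts) / len(pts)
    return cents

def predict(cents, x):
    return min(cents, key=lambda lab: abs(x - cents[lab]))

# Unsupervised learning: unlabeled data -> group into 2 clusters (two-means),
# alternating assignment and centre updates.
def two_means(xs, iters=10):
    c1, c2 = min(xs), max(xs)
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1, c2 = sum(g1) / len(g1), sum(g2) / len(g2)
    return sorted([c1, c2])

cents = fit_centroids([1.0, 1.2, 5.0, 5.4], ["healthy", "healthy", "sick", "sick"])
print(predict(cents, 4.8))              # sick
print(two_means([1.0, 1.2, 5.0, 5.4]))  # two cluster centres, near 1.1 and 5.2
```

The supervised model needs the labels up front; the unsupervised one discovers the same two groups from the values alone.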
ML methods have been applied with certain success to a number of eye diseases/conditions:
-
In [14], it is shown how retinal blood vessels are detected by means of an extreme learning machine (ELM) method and probabilistic neural networks;
-
Gurudath et al. [13] studied fundus images with a three-layer ANN and Support Vector Machine (SVM) to classify retinal images;
-
Priyadarshini et al. used a data-mining classification technique to provide useful predictions for diabetic patients diagnosed with retinopathy (DR) [15].
In spite of good outcomes, the principal issues are that the datasets were small and that annotation is expensive (in computing resources and in cumbersome human work).

2.4. Deep Learning (DL)

Deep learning represents a branch of the ML approach, gathering algorithms whose common feature is an architecture organized in hierarchical levels.
The basic structure of an ANN is a set of neurons organized into layers: an input layer, one or more hidden layers, and an output layer. Information moves through the network from input to output, transformed according to the connection weights and the activation function of each neuron. The activation function acts as a filter on the neuron’s output.
The purpose is to learn the characteristics that best express patterns in the input data. To achieve this goal, a cost function must be minimized; this function measures the accuracy of the network’s predictions and varies with the model and the learning task. In general, DL is implemented on deep neural networks with tens or hundreds of hidden layers. DL has many applications in research [16], helped by efficient open-source frameworks like TensorFlow 2.18.0, Keras 3.8.0, or PyTorch 2.0.
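The layered forward pass described above can be sketched in a few lines of plain Python; the weights below are hand-picked for illustration (in a real network they would be learned by minimizing the cost function):

```python
import math

# Forward pass of a tiny fully connected network: input -> hidden -> output.
# Each neuron computes a weighted sum of its inputs plus a bias, then applies
# an activation function (here the logistic sigmoid) that filters its output.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    # weights[i] is the list of incoming weights of neuron i in this layer
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x, network):
    for weights, biases in network:  # propagate layer by layer
        x = layer(x, weights, biases)
    return x

# A 2-input, 2-hidden-neuron, 1-output network with hand-picked weights.
net = [
    ([[0.5, -0.3], [0.8, 0.2]], [0.1, -0.1]),  # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
print(forward([1.0, 2.0], net))  # a single value in (0, 1)
```

A deep network is the same loop with many more (weights, biases) pairs; training consists of adjusting those numbers to reduce the cost function.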
A special type of DL is often used for applications in medical imaging: Convolutional Neural Networks (CNNs).

2.4.1. Convolutional Neural Network (CNN)

CNNs are a special type of neural network mainly used to classify visual shapes (Figure 4).
CNNs are made up of many hidden layers built on convolutional, subsampling, and/or normalization functions, which permit using the structural information of an image without requiring a vast quantity of adjustable parameters. The CNN input is a multidimensional array, with dimensions given by the image’s geometric resolution.
The basic building blocks of CNNs are convolutional layers, whose kernels are small matrices; their size is a hyperparameter that must be tuned, and their elements are characteristics learnable by the network. The kernels scan the image and extract a tensor from a linear combination of the kernel elements with the explored input tensor. The newly obtained tensor is the image filtered by geometric shapes found during learning. The complexity of the models increases with the size of the network.
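The kernel operation just described can be sketched as a “valid” 2-D convolution; the tiny image and the edge-detecting kernel below are invented for illustration (in a real CNN the kernel weights are learned, not hand-picked):

```python
# A 2-D "valid" convolution (cross-correlation) of a small kernel over an
# image: the core operation of a CNN's convolutional layer.

def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            # linear combination of kernel elements with the explored region
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
edge = [[-1, 1],
        [-1, 1]]
print(conv2d(image, edge))  # [[0, 2, 0], [0, 2, 0]] -- strongest where dark meets bright
```

The output “feature map” responds only where the pattern encoded by the kernel (here, a vertical dark-to-bright edge) appears in the image, which is exactly how early CNN layers pick out retinal structures.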
CNNs represent the most effective instruments in image examination, but they have only lately begun to be applied in the studies of eye diseases [17].
Figure 4. Example of feature selection by a CNN [18].

2.4.2. Learning by Transfer

In medical image processing, many DL architectures use transfer learning, a method that enables designers to use previously trained deep NNs.
Certain widely used DL structures in fundus examination are Inception V1 and V3 [19]. These networks were initially trained on ImageNet [20], one of the largest datasets of natural images. The above architectures have yielded satisfactory outcomes, but their direct application to medical image analysis is not feasible for several reasons:
-
Natural and medical images have different and particular statistical characteristics;
-
Moreover, training these deep NNs with medical image datasets from scratch is not efficient because these datasets are often small. For fundus images, for instance, one of the largest freely available databases is EyePACS, which has just 35,126 training images;
-
Due to the large number of parameters that deep NNs have to optimize, their success relies on large volumes of available data [21];
-
Transfer learning takes models with parameters learned from ImageNet (or another natural image set like CIFAR) and then performs a fine-tuning process.
This process is helpful for feature extraction, whose results may later be used in another ML paradigm, but also for classification itself.
While it is feasible to create our own models, research has demonstrated that under certain conditions (such as small training sets), fine-tuning offers better outcomes than training from scratch.
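The fine-tuning idea can be sketched with a frozen feature extractor and a trainable classification head. Everything below is an invented stand-in: the fixed matrix plays the role of pretrained convolutional filters, and a perceptron-style update is just one simple choice of training rule for the head:

```python
# Transfer-learning sketch: keep a "pretrained" feature extractor frozen and
# fine-tune only the final classification layer on a small new dataset.
# FROZEN stands in for filters pretrained on a large dataset such as ImageNet;
# its weights are hand-picked here and never updated.

FROZEN = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]

def features(x):
    # Frozen feature extractor with a ReLU nonlinearity; never trained.
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN]

def fine_tune(data, labels, epochs=50, lr=0.1):
    # Only the head's weights w and bias b are updated (perceptron rule).
    w, b = [0.0] * len(FROZEN), 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            f = features(x)
            pred = 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * fi for wi, fi in zip(w, f)]
            b += lr * err
    return w, b

def classify(x, w, b):
    f = features(x)
    return 1 if sum(wi * fi for wi, fi in zip(w, f)) + b > 0 else 0

data = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.2]]
labels = [0, 0, 1, 1]
w, b = fine_tune(data, labels)
print([classify(x, w, b) for x in data])  # [0, 0, 1, 1]
```

Because only the small head is trained, a few labeled examples suffice, which is precisely why fine-tuning suits small medical datasets.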
The best CNN models for medical imaging are classification models, segmentation models, and multimodal architectures [22,23].

2.4.3. Classification

Classification in medicine can be binary (healthy vs. sick patients) or multi-class (e.g., for disease progression). Eye disease classification from fundus images effectively uses fine-tuned CNNs. Inception V1 and Inception V3 are examples of good networks for this application [21]. Inception V1 applies convolutions of fixed sizes to a given input and includes an average pooling layer at its end.
Inception V3 represents an improved structure, which adds batch normalization and label smoothing techniques to hinder overfitting. Impressive performance in fine-tuned diabetic retinopathy (DR) classification has been reported [24]. It can also be used for learning data representations. For example, we can treat a hidden layer as the output layer; in this way, it is feasible to obtain a representation vector for each image and use this vector as input for traditional ML models. Thus, the special representational capacity of DL can be combined with the rigor of more traditional and well-founded models, like SVMs (Support Vector Machines) or probabilistic techniques. In this way, it is possible to obtain more interpretable models, which provide predictions and, for example, uncertainty indicators [21].

2.4.4. Segmentation

As a special case of classification, segmentation recognizes and isolates various anatomical structures.
In many cases, segmentation is a step in analytics applications before classification. For instance, CNNs are appropriate for optic disc segmentation, helpful for glaucoma diagnosis [17], blood vessel image segmentation, as well as exudates for DR screening [25], or segmentation for drusen detection, which is useful in age-related macular degeneration (AMD) diagnosis.
An efficient model for segmentation is U-Net [26], which may be trained with small image sets. It has two sections, a downsampling (contracting) path and an upsampling (expanding) path, connected in a symmetric model that enables efficient training. In this way, a direct correspondence between the original and segmented images is achieved.
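The encoder–decoder-with-skip idea behind U-Net can be sketched on a 1-D signal. This is a shape-level illustration of the symmetric connection, not the actual U-Net architecture:

```python
# U-Net-style shape sketch on a 1-D signal: the encoder halves resolution,
# the decoder restores it, and a skip connection concatenates encoder
# features with decoder features so fine detail from the input survives
# to the segmentation output.

def downsample(x):  # encoder step: average pooling, stride 2
    return [(a + b) / 2 for a, b in zip(x[::2], x[1::2])]

def upsample(x):    # decoder step: nearest-neighbour, factor 2
    return [v for v in x for _ in range(2)]

def unet_step(x):
    skip = x                  # features saved for the skip connection
    coarse = downsample(x)    # "bottleneck" at half resolution
    restored = upsample(coarse)
    # concatenate channel-wise: each position gets (decoder, skip) features
    return list(zip(restored, skip))

out = unet_step([1.0, 3.0, 2.0, 8.0])
print(out)  # [(2.0, 1.0), (2.0, 3.0), (5.0, 2.0), (5.0, 8.0)]
```

The output has the input’s resolution, and each position carries both a coarse, context-rich value and the original fine-grained value, which is what lets U-Net produce pixel-accurate segmentations.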
An additional efficient structure is the Recurrent Neural Network (RNN) [27], which uses recurrent connections to temporarily store information from recent inputs. RNNs are useful for analyzing volumetric images, such as those of Optical Coherence Tomography (OCT), where relationships exist between successive slices.

2.4.5. Multimodal Learning

Multimodal learning finds and combines information from different sources to enable improved results compared to those obtained by assessing the sources of information separately. Thus, it provides synergistic behavior.
The structure of CNNs can be changed to receive more images in the form of extra channels [28].
Multimodal strategies and DL have recently been used in the diagnosis of eye diseases. It has recently been shown that combining certain sources of information can yield models with higher predictive power and robustness [28]. They have applications in volumetric OCT image analysis or in techniques that merge fundus with OCT images for the diagnosis of age-related macular degeneration (AMD) [29].
Recent works have targeted the segmentation and classification of glaucoma and AMD. In [30], a method was proposed to register retinal vessels by combining information from fundus and OCT images. This combination has been studied for the diagnosis of AMD and for cup and optic disc segmentation for the diagnosis of glaucoma.
Some issues raised by these studies include the following: How can the right representation be found for each source of information? How and where should representational features be merged within the model? Data from different modalities may have different statistical properties, so simply concatenating representative features is not necessarily a good strategy. So-called latent spaces must be searched, and various strategies can be applied.
In terms of data fusion, there are three types: early, late, and hybrid fusion.
Early fusion combines the representations of each modality before solving the problem. Late fusion combines the outputs of the individual models.
An example was shown in [24], in which structural and non-structural characteristics were extracted from fundus images and afterwards combined in a late-fusion strategy.
Hybrid fusion may, for example, combine images and text. The first study on this topic was presented in [31], in which a CNN was combined with semantic information extracted from medical reports to analyze OCT images. The fusion consisted of a hybridization of characteristics in a fully connected layer. An improvement in the accuracy of retinal tissue analysis was reported.
Additionally, one can hybridize visual and morphological data by integrating deep neural networks with morphological characteristics for glaucoma identification and categorization [32]. Image processing and morphological feature extraction can be performed separately, then combined and passed to a fully connected layer.
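Early and late fusion can be sketched in a few lines; the feature vectors and probabilities below are invented, and the weighted average is only one simple choice of late-fusion rule:

```python
# Early vs. late fusion of two modalities (e.g., fundus and OCT features).
# Early fusion concatenates feature vectors before a single model sees them;
# late fusion lets one model per modality predict, then combines the outputs.

def early_fusion(fundus_feats, oct_feats):
    return fundus_feats + oct_feats               # one joint feature vector

def late_fusion(prob_fundus, prob_oct, w=0.5):
    return w * prob_fundus + (1 - w) * prob_oct   # combine per-model scores

fused = early_fusion([0.2, 0.7], [0.9, 0.1, 0.4])
print(fused)                    # [0.2, 0.7, 0.9, 0.1, 0.4]
print(late_fusion(0.75, 0.25))  # 0.5
```

Hybrid fusion sits between the two: intermediate representations from each modality are merged inside the network (for instance, in a fully connected layer), rather than at the raw-feature or final-score stage.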

3. AI and Ophthalmology: View of the Ensemble

3.1. Introduction

Traditional techniques for diagnosing eye diseases rely mainly on the professional expertise and skills of ophthalmologists, which, unfortunately, may lead to a higher rate of misdiagnosis and to underutilization of medical data.
The combination of ophthalmology and AI has the capability to objectively revolutionize monitoring, diagnostics, and models of management of many eye diseases.
Traditionally, an eye exam involves describing the results by means of words, drawings, and pictures accompanied by a medical verdict. However, this general procedure of diagnosis is subjective, qualitative, and often imprecise. Most diagnostics in ophthalmology are based on image processing; therefore, much depends on the analysis and quantification of various parameters in the images.
Medical image processing usually extracts characteristics (features), which can be hard to visually evaluate for the specialist. In general, there are two types of features:
  • The semantic (or syntactic) features, expressed by specialists;
  • The numerical characteristics, represented by mathematical formulas.
With the rise of big data and analytics and the development of neural networks, computers now help detect, extract, and learn combinations and permutations of important features.
Note: Big data refers to extremely large and diverse collections of structured, unstructured, and semi-structured data that continue to grow over time. Analytics is the process of discovering, interpreting, and communicating meaningful patterns in data. It helps us see insights, patterns, and meaningful data that we might not otherwise detect.

3.2. AI and Anterior Segment Diseases

3.2.1. AI and Corneal Ectasia

In this domain, methods like risk scores, linear models, and AI models have been used to develop a screening protocol.
For instance, Lopes et al. acquired Pentacam HR (Oculus, Wetzlar, Germany) data from three countries for three patient categories, namely stable LASIK (Laser-Assisted in Situ Keratomileusis), post-LASIK ectasia, and clinical keratoconus, to develop a “Pentacam Random Forest Index” (PRFI), which achieved a sensitivity of 94.2% and a specificity of 98.8% for detecting corneal ectasia, with an AUC of 0.992 [33].
Yoo et al. assessed suitability for corneal refractive surgery by studying subjects’ demographics, corneal tomography, and ophthalmic examinations with five different ML methods. They used an ensemble of classifiers, validated on internal and external data, and obtained AUCs of 0.983 and 0.972, respectively [34].

3.2.2. AI and Keratoconus

AI for keratoconus identification is a modern field of research, with various methods used over the years to identify the disease, differentiate normal from forme fruste cases, and even grade the disease [35].
Kamiya et al. used DL to classify the presence or absence of keratoconus based on numerical computations of output data by using color-coded corneal maps on anterior segment swept-source OCT (AS-OCT). An AUC of 0.991 was generated for keratoconus detection and classification of disease grade in keratoconic patients [36].
Valdés-Mas et al. researched corneal curvature and astigmatism in keratoconic patients after intracorneal ring implantation to foresee visual performance by means of ML [37].
Yousefi et al. analyzed corneal characteristics on the anterior and posterior surfaces, identified unique non-overlapping clusters by means of an unsupervised ML method, and performed a post hoc analysis for the corresponding clusters to predict the probability of needing further keratoplasty surgery [38].

3.3. AI and Posterior Segment Diseases

3.3.1. AI and Diabetic Retinopathy

Roughly 600 million people could have diabetes by 2040, and almost a third of them will have diabetic retinopathy (DR) [39].
Screening techniques like ophthalmoscopy, dilated slit-lamp biomicroscopy with a portable lens, mydriatic or non-mydriatic fundus imaging, teleretinal monitoring, and retinal video recording can be utilized.
A major obstacle to the wide implementation of DR screening is the limited availability of human specialists and of long-term funding.
DL has significantly improved the detection of DR. The main aim is to build DL applications that generalize across ethnic groups and are robust to retinal images acquired with different cameras.
For example, El Tanboly et al. implemented a DL-powered application to identify DR in 52 OCT images and obtained an AUC of 0.98 [40]. It is necessary to validate their system in larger subject sets.
A computer-aided diagnosis (CAD) application, built with a continuous machine learning (CML) model, uses optical coherence tomography angiography (OCTA) for automatic detection of nonproliferative DR (NPDR) and has also achieved high precision and AUC [41]. These OCTA results show the need for further methodological analysis and robust clinical assessment.
Applications for the detection of DR were among the first to obtain acceptance for usual clinical usage from the FDA [42].

3.3.2. AI and Retinal Vein Occlusion

Automatic diagnosis of Branch Retinal Vein Occlusion (BRVO) has been performed, e.g., by Zhang et al., using a hierarchical local binary pattern (HLBP) and maximal pooling for feature selection. The AUC was 0.961 [43].
Nagasato et al. [44] used ultrawide-field fundus photographs of central retinal vein occlusion (CRVO) and healthy subjects to train a deep CNN and an SVM to identify CRVO. The deep CNN outperformed the SVM: the AUC was 0.989, sensitivity was 98.4%, and specificity was 97.9%.
A Random Forest (RF) network was used to categorize images by vitreomacular adhesion (VMA) in BRVO patients. VMA is a salient biomarker that predicts response to anti-VEGF (Anti–Vascular Endothelial Growth Factor) treatment. Eyes with VMA undergoing therapy had better visual acuity than eyes without VMA [45].

3.3.3. AI and Retinopathy of Prematurity

Blindness caused by retinopathy of prematurity (ROP) is largely preventable through early detection and timely treatment. However, ROP screening is time-consuming and requires specially trained staff. Inter-clinician bias and geographic variability in ROP detection have been reported to lead to poorer outcomes.
Automated techniques using DL have recently been validated in the identification of ROP by means of retrospective data but have not yet been fully approved in the clinical environment [46]. To ameliorate the generalization of the DL algorithms used, some diversity in fundus imaging must be considered, including different camera models and lenses.

3.3.4. AI and Age-Related Macular Degeneration (ARMD)

ARMD is a chronic and untreatable macular disease characterized by retinal pigment changes, drusen, choroidal neovascularization, hemorrhage, and geographic atrophy. ARMD is a main source of vision damage in individuals over 50. Due to the aging population and the gravity of ARMD, regular screening is needed.
Automatic diagnosis of ARMD can, of course, reduce the workload of ophthalmologists and increase efficiency. In [47], a VGG16 CNN was used to analyze OCT images of normal and ARMD patients for the automatic diagnosis of ARMD. AUCs of 0.928, 0.938, and 0.975 were obtained at the level of individual images, the mean likelihood over the 11 central OCT images, and the mean likelihood over all of a patient’s scans, respectively.
DL was used by Burlina et al. to identify and evaluate ARMD. They implemented DL to categorize ARMD in accordance with the Age-Related Eye Disease Study (AREDS) Severity Scale and estimated the 5-year risk of progression using soft and hard prediction, as well as regression maps. The average five-year overall risk estimation error ranged from 3.4% to 5.8%, with larger average errors in the higher AREDS categories [48].
DL and CNNs were implemented by Grassmann et al. for the detection and classification of ARMD into 13 classes by means of color fundus photographs [49]. They implemented a random forest ensemble model of six CNNs with different architectures to achieve 92% accuracy in 13-class classification.
Peng et al. utilized the DeepSeeNet network to categorize fundus color images into ARMD severity grades [50]. This network had greater precision compared to retinal experts for ARMD grading (0.671 vs. 0.599), with AUCs of 0.94, 0.93, and 0.97 for large drusen, pigmentary anomalies, and identification of late ARMD, respectively.

3.3.5. AI and Glaucoma

Even if there is no gold standard for glaucoma detection, AI can help significantly improve glaucoma screening, diagnosis, and classification, as AI enables the automated processing of large datasets and the identification of specific and new disease patterns [51].
First, fundus photographs were analyzed by ML to detect optic nerve impairment due to glaucoma [52]. Afterwards, larger datasets were analyzed using DL [53]. Ting et al. processed a large dataset of 125,189 fundus images and obtained a sensitivity of 96.4% and a specificity of 87.2% [54].
In addition, AI systems have been developed using digital visual field data as well as OCT, and patients have been evaluated by fusing data from both imaging modalities.

3.3.6. AI and Retinal Detachment

Japanese researchers used a DL network to identify rhegmatogenous retinal detachment in fundus images acquired with the Optos ultrawide-field imaging system, reporting a sensitivity of 97.6% and a specificity of 96.5% [55].

3.4. AI and Various Eye Diseases

3.4.1. Ocular Oncology

A conditional probability estimation neural network (CPENN) for survival prediction in patients with choroidal melanoma was implemented by Damato et al., and they compared the outcomes with standard Kaplan–Meier analysis [56]. All-cause survival curves were similar for the two techniques (p < 0.05), except for older patients, for whom the NN estimated lower mortality than the Kaplan–Meier method did.
Nguyen et al. used a two-step model to obtain automatic segmentation of uveal melanoma images. First, they implemented a class activation map, a conditional random field, and an active shape model to identify tumors in magnetic resonance images. Secondly, a 2D-Unet CNN segmented the tumor [57].
Sun et al. classified uveal melanomas by BAP1 gene expression with an AUC of 0.99 using histopathological images. They created 8176 images of 256 × 256 pixels and processed them with a large CNN. BAP1 expression status is related to prognosis in these tumors because its loss is associated with metastatic capacity [58].

3.4.2. Pediatric Ophthalmology

Recent studies show that AI-based detection and classification of retinopathy of prematurity achieve high accuracy, very close to that of experts [59].
Other areas of pediatric ophthalmology that have benefited from AI-based applications include pediatric cataract identification, cataract classification, prediction of post-cataract surgery complications, strabismus detection, prognosis of high myopia progression, fundus vessel segmentation, and analysis of visual development.
In conclusion, by deploying systematic and interpretable AI applications by means of advanced methods with enough multimodal and high-class data, the applicability of AI in hospital settings can be clearly improved.

4. Representative Areas of Application of AI in Ophthalmology

4.1. Artificial Intelligence and Cataract

4.1.1. Scope of the Problem

Prevalence: There are about 67.2 million persons with visual impairment or blindness worldwide, and cataract is the main source of preventable blindness. This figure is projected to grow to 72.5 million by 2025. Diagnosis and identification of cases remain significant issues, mainly in countries with poor public healthcare. Cataract is estimated to cause half of all cases of blindness and 33% of vision impairment worldwide. By 2050, the prevalence of cataract will increase by 78% to reach more than 200 million cases. There is therefore a strong case for investment, research, and innovation in deploying instruments that can effectively address this severe public health issue.

4.1.2. Limitations of Current Clinical Procedures

There are four categories of such limitations in the case of cataract disease.
-
Case Detection Programs
Currently, cataract is clinically assessed by slit-lamp examination using the Lens Opacities Classification System III (LOCS III), which is also used to recommend the type of treatment required.
This assessment requires clinical expertise, training, and expensive equipment for clinical decision-making, an important challenge for poor or developing countries and for rural settings. Furthermore, image-based grading scores are not fully objective and can be notably influenced by inter- and intra-equipment variation. These limitations make cataract screening time-consuming and costly, so there is a significant need for new techniques that can overcome them and assist in screening.
-
Calculation of Intraocular Lens (IOL) Power
The usual procedure for cataract treatment is surgery followed by IOL implantation to restore vision. Cataract surgery planning is essentially determined by IOL power calculation, which depends on parameters such as axial length, effective lens position, corneal curvature, and the type of IOL chosen.
This calculation is performed quickly using biometric machines, but due to the large variation in eye biometrics between individuals, there is currently no single formula suitable for all patients. Additionally, it cannot be used for eyes with unusual corneal shapes, such as eyes with previous refractive surgery, keratoplasty, keratoconus, microcornea, or significant astigmatism.
Through the use of AI, ML, and databases, new IOL formulas have been studied that can be efficiently used for a subgroup of patients. In this respect, in the coming years, we may face major developments.
-
Workforce Training and Surgical Evaluation
To meet the needs of the aging population, the current number of ophthalmologists would need to increase by 75–100% over the next 20 years.
An important aspect of current training for surgeons in ophthalmology is the necessity of extensive patient exposure, which is often not available. This involves a significant investment in training costs and the time required to develop skills. Techniques such as virtual reality (VR) [60] and augmented reality (AR), “wet” laboratories, training courses, and didactic videos are the main instruments for teaching the best surgical techniques.
-
Postoperative Care and Quality of Life (QoL)
Appropriate postoperative care is a very important component of any surgery. With the increased use of digital technologies in patient diagnosis, treatment, and medication management, the importance of AI will grow. Live teleconsultations, chatbots, and tele-simulated assistants will be used for consultation and feedback collection, and mobile applications and digital health monitoring will expand in the coming years. For example, artificial neural networks (ANNs) can predict the risk of complications such as posterior capsular opacification after cataract surgery [61].

4.1.3. AI and Cataract Detection

There are several published studies reporting algorithms for automatic cataract detection and grading/classification. These software programs are different in methodology, input data type, and use case scenario.

Cataract Detection Based on Slit Lamp Photographs

The studies in the table below exclusively use slit-lamp photographs as training data for algorithms that automatically detect and classify nuclear cataracts (Table 1).

Based on Color Fundus Photographs

Recently, the use of fundus photographs to screen for diabetic retinopathy [66] has provided an opportunity to use fundus images to detect other sources of visual impairment, such as cataracts (Table 2).

4.1.4. Calculation of the Power of Intraocular Lenses

With AI, one can model complex non-linear relations between ocular parameters and calculate IOL power customized to each individual’s eye. Postoperative outcomes, such as postoperative refractive status, can also be used to calibrate surgeon-specific parameters such as SIA (surgeon-induced astigmatism).
Two formulas that use AI are the Hill Radial Basis Function (Hill-RBF) and the Kane formula. The Hill-RBF technique was developed using 12,000 eyes with measurements taken with the Haag-Streit Lenstar optical biometer [69]. The Kane method was developed by means of high-quality cloud computing, which employed regression and ML models to refine IOL power predictions [70].
Accuracy results (comparing the postoperative refractive errors estimated by the formulas with the actual ones) are shown in Table 3. Some improvement in outcomes is noted, particularly in eyes with short axial length. Larger studies on eyes with a wide diversity of severe refractive conditions are still needed to validate these formulas for clinical practice. The Hill-RBF method was also compared with the traditional Barrett Universal II and SRK/T formulas, and the AI-based formula outperformed both [71].
In terms of refractive prediction, the proportion of postoperative eyes within ±0.5 D of target was 83.62% with the Hill-RBF method, 79.66% with Barrett Universal II, and 74.01% with SRK/T. This shows the efficiency of the Hill-RBF technique in the management of cataract surgery.
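As an illustration of the idea behind RBF-based IOL formulas (the real Hill-RBF model and its training data are proprietary; the biometry–power relationship below is entirely synthetic, an SRK-style linear rule plus noise), one can fit a kernel ridge regressor with an RBF kernel and check how many predictions fall within ±0.5 D:

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)

# Hypothetical biometry: axial length (mm) and mean keratometry (D).
n = 400
al = rng.uniform(21.0, 27.0, n)
k = rng.uniform(40.0, 47.0, n)
X = np.column_stack([al, k])

# Synthetic "ground truth" IOL power: an SRK-style linear rule plus noise
# (NOT the Hill-RBF or Kane training data).
y = 118.4 - 2.5 * al - 0.9 * k + rng.normal(scale=0.2, size=n)

# RBF-kernel regression on standardized biometry, as a stand-in for an
# RBF-network IOL formula.
model = make_pipeline(StandardScaler(),
                      KernelRidge(kernel="rbf", alpha=1e-2, gamma=0.5))
model.fit(X[:300], y[:300])

pred = model.predict(X[300:])
mae = float(np.mean(np.abs(pred - y[300:])))                 # mean abs. error (D)
within_half_d = float(np.mean(np.abs(pred - y[300:]) <= 0.5))  # fraction within ±0.5 D
```

The ±0.5 D fraction mirrors the accuracy metric quoted above; a real formula would be trained and validated on measured postoperative refractions.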

4.1.5. AI and Post-Operative Care: Quality of Life (QoL)

Smartphone-integrated and AI-based Unified Healthcare Systems are currently being tested [72]. For example, the CC-Guardian platform provides management for congenital cataract patients. These individuals are at risk of the two most common types of complications: elevated intraocular pressure (IOP) and visual axis opacification (VAO).
The system has three modules: (1) The prognosis mode to detect the risk; (2) The mode of schedule/dispatch to plan further visits if risk is detected; (3) Telehealth mode for intervention decisions according to those further visits.
The training set comprised clinical records of 594 congenital cataract patients and 4881 images of follow-up (2615 follow-ups, 2266 interventions).
Validation was performed on 142 patients with clinical records (61 VAO, 81 non-VAO; 79 high IOP, 63 normal) and 1220 follow-up images (671 follow-ups, 549 interventions). Diagnostic references were established by expert team assessment. As for performance indices, the authors reported an AUC of 0.991 for VAO and 0.979 for high IOP. For the telehealth application, the AUC was 0.996.
The researchers also carried out a cost-effectiveness study in another self-controlled trial with 198 patients (93 VAO, 105 high IOP). Accuracy in this group was 96.8% for VAO and 96.2% for elevated IOP. A further analysis showed that patients had 1579 telemedicine visits (instead of 987 in-person visits), decreased travel by 928.6 miles/year, and reduced expenditures by 1324 USD/year.
These outcomes are an eloquent example of the significant effect of new healthcare paradigms on existing standards of care. However, this method may not perform as well in certain ethnic groups, and the study design would need adaptation for regional use. Similar analyses will be very important in establishing the long-term safety and reliability of AI-powered methods in clinical practice and public healthcare.

4.2. AI and Glaucoma

4.2.1. Introduction

Glaucoma is a set of optic neuropathies characterized by progressive degeneration of the optic nerve and retinal ganglion cell (RGC) loss. Glaucoma progresses without causing symptoms until the disease is advanced, with substantial neuronal damage. When symptoms occur, up to 30–50% of retinal ganglion cells may be lost, leading to irreversible visual field (VF) loss with an associated decrease in quality of life. Glaucoma is the second leading cause of irreversible blindness worldwide, currently affecting over 80 million people globally, and is estimated to affect 110 million by 2040.
Risk factors for glaucoma include elevated intraocular pressure (IOP), a family history of the disease, advanced age (>50), African or Asian descent, myopia, and the use of systemic or topical corticosteroids. Primary open-angle glaucoma (POAG) is the most common type worldwide, with the highest prevalence (3.05%), followed by primary angle-closure glaucoma (PACG, 0.5%).
There is currently no cure for glaucoma, and it is important to detect the disease as early as possible so that IOP-lowering treatment can be initiated to avoid irreversible visual functional loss.
Recent advances in AI, especially the advent of DL, have demonstrated a transformative impact on the healthcare industry, with outstanding performance in skin cancer classification, early diagnosis of Alzheimer’s disease, glioma prognosis, detection of diabetic retinopathy [73], and, most recently, assessment of the severity of COVID-19, as well as many other applications.

4.2.2. Overview of AI in Glaucoma

Glaucoma AI models are mainly built on data from visual function tests (perimetry), fundus photographs, and optical coherence tomography (OCT), because these modalities offer highly structured data suitable for training AI architectures.
Glaucoma AI models can be divided into two classes, depending on how these networks analyze the data to achieve classification or prediction.

Classical Machine Learning (ML) Models

Systems in the first category use traditional ML methods, which commonly consist of three stages, as shown in Figure 5, to extract characteristic vectors from the input data and then provide a classification.
The initial stage involves preprocessing the input image, together with removing the artifacts in the images, to improve the detection of regions of interest such as the optic cup (OC) and optic disc (OD). Then, a module has to select significant characteristics, as well as clinical parameters such as the cup-to-disc ratio (CDR) and visual features (such as morphological, spectral, and texture features) derived from the image preprocessing stage. The final module is an ML network, e.g., Support Vector Machine (SVM) or Naive Bayes, which can be trained by means of the selected feature vectors for classification or prediction activities.
Data preprocessing, feature extraction, and classification techniques used in these glaucoma systems have been extensively presented in previous studies [74].
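A minimal sketch of this three-stage pipeline (preprocessing, feature selection, classification) is shown below, using synthetic tabular data in place of real per-eye measurements such as CDR and texture features (all names and sizes are illustrative assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for per-eye feature vectors (e.g. CDR, morphological,
# spectral, and texture measures): 20 candidate features, few informative.
X, y = make_classification(n_samples=600, n_features=20, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = Pipeline([
    ("preprocess", StandardScaler()),          # stage 1: preprocessing
    ("select", SelectKBest(f_classif, k=8)),   # stage 2: feature selection
    ("classify", SVC(kernel="rbf")),           # stage 3: classifier (e.g. SVM)
])
clf.fit(X_tr, y_tr)
accuracy = clf.score(X_te, y_te)
```

Wrapping the three stages in a single `Pipeline` keeps the feature-selection step inside cross-validation, avoiding information leakage from the test set.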

Deep Learning (DL) Models

Another class of AI architecture is based on deep learning (DL) methodologies, which are a subset of ML methods. DL techniques have demonstrated that a multilayer neural network with several levels of abstraction can automatically learn an appropriate representation of the data.
  • A characteristic deep CNN uses only pixel magnitudes as inputs and learns a mapping from images to categorical labels. For example, Figure 6 shows a multilayer CNN that uses optic disc (OD) images as input and then forecasts glaucoma risk [75].
  • DL networks can detect complicated patterns in input data by automatically tuning internal model parameters, computing data representations (without hand-crafted feature extraction) and classification in a single end-to-end process.
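The idea that a CNN consumes raw pixel magnitudes, with no hand-crafted features, can be sketched in a few lines of NumPy (toy sizes and random, untrained weights; a real model such as that of [75] would be trained end to end on labeled fundus photographs):

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution (cross-correlation, as in most DL libraries)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    h, w = x.shape
    h, w = h - h % size, w - w % size
    return x[:h, :w].reshape(h // size, size, w // size, size).max(axis=(1, 3))

def tiny_cnn(image, kernels, w_out):
    """Pixel values in, class probabilities out -- no hand-crafted features."""
    maps = [max_pool(np.maximum(conv2d(image, k), 0)) for k in kernels]  # conv+ReLU+pool
    feat = np.concatenate([m.ravel() for m in maps])                     # flatten
    logits = feat @ w_out                                                # dense head
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(1)
image = rng.random((16, 16))                     # stand-in for an optic disc photo
kernels = [rng.normal(size=(3, 3)) for _ in range(4)]
pooled_dim = 4 * (7 * 7)                         # 16-3+1=14, pooled to 7x7, 4 maps
w_out = rng.normal(size=(pooled_dim, 2)) * 0.05  # 2 classes: glaucoma risk yes/no
probs = tiny_cnn(image, kernels, w_out)
```

Training would adjust the kernels and `w_out` by backpropagation; the point here is only the data flow from pixels to a class distribution.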

4.3. AI-Based Systems in Glaucoma

Artificial intelligence shows strong potential in glaucoma care: for detecting glaucoma-related structures and functional changes, and for clinical diagnosis and prognosis. Typical applications of AI in this domain include (1) detection, (2) diagnosis, and (3) prognosis.

4.3.1. Detection of Glaucomatous Characteristics

AI-based glaucoma detection identifies, marks, highlights, or directs attention to portions of the input images that may contain glaucoma-based anatomical or functional abnormalities, and/or detects characteristics and makes measurements that can express those anomalies. Thus, specialists can use visual and quantitative information to detect glaucoma or monitor changes during patient follow-up. AI-based detection methods are usually designed for unimodal data. For example, optic disc/optic cup segmentation techniques are mainly applied in fundus image processing.
Optic Disc/Optic Cup (OD/OC) segmentation
The visualization of the optic nerve head is crucial in the identification of glaucoma-based lesions. Certain AI-driven segmentation methods have been deployed for OD and OC in retinal images to compute CDR (cup-to-disc ratio).
Prior to DL methods, contour-based and superpixel classification methods were the most common means of OD and OC segmentation. Advances in DL networks, high-performance computing (HPC), and large-scale publicly available image databases have led to the increasing use of DL-based models (e.g., U-Net) in OD/OC segmentation [76,77]. In addition to local datasets, several public databases, such as RIM-ONE, Drishti-GS, ORIGA, and HRF, are often used to develop and validate OC/OD segmentation methods.
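Once OD and OC masks are available (from a U-Net or any other segmenter), computing the vertical CDR is simple post-processing. The sketch below uses synthetic circular masks in place of real segmentation outputs:

```python
import numpy as np

def vertical_cdr(disc_mask, cup_mask):
    """Vertical cup-to-disc ratio from binary segmentation masks.

    In a real system the masks would come from a trained segmentation
    network; here we only show the post-processing step that turns masks
    into the CDR biomarker used by clinicians.
    """
    disc_rows = np.flatnonzero(disc_mask.any(axis=1))
    cup_rows = np.flatnonzero(cup_mask.any(axis=1))
    disc_height = disc_rows.max() - disc_rows.min() + 1
    cup_height = cup_rows.max() - cup_rows.min() + 1
    return cup_height / disc_height

# Synthetic circular disc (radius 50 px) and cup (radius 20 px) masks.
yy, xx = np.mgrid[:200, :200]
dist2 = (yy - 100) ** 2 + (xx - 100) ** 2
disc = dist2 <= 50 ** 2
cup = dist2 <= 20 ** 2
cdr = vertical_cdr(disc, cup)   # 41/101 ≈ 0.406 for these synthetic masks
```

A vertical CDR above roughly 0.6, or marked asymmetry between eyes, is a common referral criterion, which is why segmentation quality matters clinically.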
A recent effort in this domain was the international Retinal Fundus Glaucoma Challenge (REFUGE) [78], which established a standardized evaluation framework for comparing methods for OD/OC segmentation and glaucoma classification.
Detection of visual field (VF) loss
Computerized VF testing is essential in identifying glaucoma-induced functional changes. Elze et al. [79] developed an unsupervised ML technique, named archetypal analysis, to leverage the sensitivity data contained in VFs. Mayro et al. emphasized the role of AI in glaucoma analysis with a focus on detecting VF loss [80].
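Archetypal analysis represents each VF as a convex combination of a few prototypical loss patterns. As a loosely related sketch (not the method of [79]; scikit-learn has no archetypal analysis, so non-negative matrix factorization stands in, and the 54-point VFs below are synthetic), one can recover hidden patterns from a matrix of VF sensitivities:

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(4)

# Synthetic 54-point visual fields built from 3 hidden loss patterns
# (54 points roughly matching a 24-2 perimetry grid).
true_patterns = rng.random((3, 54))
weights = rng.dirichlet(np.ones(3), size=200)        # convex mixing weights
vfs = weights @ true_patterns + 0.01 * rng.random((200, 54))

# NMF decomposes the VF matrix into non-negative patterns x weights,
# a rough proxy for the archetypal decomposition.
model = NMF(n_components=3, init="nndsvda", random_state=0, max_iter=500)
W = model.fit_transform(vfs)                         # per-eye pattern weights
recon_err = (np.linalg.norm(vfs - W @ model.components_)
             / np.linalg.norm(vfs))                  # relative reconstruction error
```

Each row of `W` summarizes one eye by its loadings on the learned patterns, which is what makes such decompositions useful for tracking functional loss.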
Other AI-based approaches to glaucoma identification include tissue segmentation, retinal vessel segmentation, visualization, anomaly detection, and risk assessment. An important use of these models is population screening, enabling early detection of glaucoma and referral for specialist examination.

4.3.2. Diagnosis of Glaucoma

Glaucoma diagnostic methods, be they mono- or multi-modal, are usually used to provide a second opinion alongside ophthalmologists. Certain autonomous AI-based diagnostic systems can work without the involvement of clinical physicians, such as the IDx-DR platform for the detection of diabetic retinopathy [81].
Data-driven monomodal diagnosis
Most AI-based glaucoma diagnostic models use retinal imaging data because it provides rich information, including optic nerve color, texture, and morphology. Many traditional methodologies have used image feature extraction to train classification models. DL models continue to benefit from structural details in images, without feature engineering, and are the latest generation in glaucoma diagnosis. For instance, VGG and ResNet models are two of the most popular neural networks for structural data analysis of images.
A number of selected diagnostic models are presented in Table 4.
Data-driven multimodal diagnosis
As demonstrated in an early study by Brigatti et al., when the same back-propagation neural network (BPNN) was applied to monomodal data (functional: VF; structural: OD/RNFL (retinal nerve fiber layer) measurements) and to multimodal data, the BPNN performed better on multimodal data than on functional or structural data alone.
An additional advantage of using multimodal data is that a model may predict one modality from another, which can resolve certain difficulties. For instance, Medeiros et al. [82] trained a DL architecture for the objective detection of structural defects in OD images, using spectral-domain OCT measurements as the reference standard instead of the highly variable assessments of ophthalmologists; this approach can objectively quantify the level of neuronal damage and predict glaucoma with greater accuracy.
Table 4. A review of research on glaucoma diagnosis and prognosis that use models of neural networks (chronological order).
| Authors | Year | Dataset | Modality | Model | Application | AUC | Sensitivity | Specificity |
|---|---|---|---|---|---|---|---|---|
| Goldbaum et al. [83] | 1994 | Local dataset | Visual fields | Two-layer FNN | Diagnosis | – | 0.65 | 0.72 |
| Brigatti et al. [84] | 1996 | Local dataset | Visual fields; OD and RNFL measurements | Four-layer BPNN | Diagnosis: functional | – | 0.84 | 0.86 |
| | | | | | Diagnosis: structural | – | 0.87 | 0.56 |
| | | | | | Diagnosis: multimodal | – | 0.90 | 0.84 |
| Chen et al. [85] | 2015 | ORIGA; SCES* | Fundus photos | Six-layer CNN | Diagnosis | 0.898 | – | – |
| Asaoka et al. [86] | 2016 | Local dataset | Visual fields | Four-layer FNN | Diagnosis | 0.926 | 0.749 | 1.000 |
| Ting et al. [54] | 2017 | SiDRP 14–15# | Fundus photos | VGG-19 | Diagnosis | 0.942 | 0.964 | 0.932 |
| Liu et al. [75] | 2018 | Local dataset; HRF; RIM-ONE | Fundus photos | ResNet50 | Diagnosis: local dataset | 0.970 | 0.893 | 0.971 |
| | | | | | Diagnosis: HRF | 0.890 | 0.867 | 0.867 |
| Li et al. [53] | 2018 | LabelMe | Fundus photos | Inception_v3 | Diagnosis | 0.986 | 0.956 | 0.920 |
| Li et al. [87] | 2018 | Local dataset | Visual fields | VGG-16 | Diagnosis | 0.966 | 0.932 | 0.826 |
| Shibata et al. [88] | 2018 | Local dataset | Fundus photos | ResNet | Diagnosis | 0.965 | – | – |
| Christopher et al. [89] | 2018 | ADAGES$; DIGS~ | Fundus photos | ResNet50 | Diagnosis | 0.910 | 0.840 | 0.830 |
| Medeiros et al. [82] | 2019 | Local dataset | OCT scans; fundus photos | ResNet34 | Diagnosis | 0.944 | 0.900 | 0.800 |
| Liu et al. [90] | 2019 | CGSA^ | Fundus photos | ResNet | Diagnosis | 0.996 | 0.962 | 0.977 |
| Asaoka et al. [91] | 2019 | Local dataset | OCT scans | 12-layer CNN | Diagnosis | 0.937 | 0.825 | 0.939 |
| Fu et al. [92] | 2019 | Local dataset | AS-OCT scans | VGG-16 | Diagnosis | 0.96 | 0.90 | 0.92 |
| Normando et al. [93] | 2020 | Local dataset | OCT scans | MobileNet_v2 | Diagnosis | – | 0.911 | 0.971 |
| | | | | | Prognosis | 0.890 | 0.857 | 0.917 |
| Thakur et al. [94] | 2020 | OHTS+ | Fundus photos | MobileNet_v2 | Diagnosis | 0.94 | – | – |
| | | | | | Prognosis: 1–3 years | 0.88 | – | – |
| | | | | | Prognosis: 4–7 years | 0.77 | – | – |
Several other applications of DL and networks utilizing multimodal data are reviewed in [74].

4.3.3. Progress and Prognosis of Glaucoma

After glaucoma is detected, prediction of the disease course follows, to avoid over- or under-treatment. However, glaucoma progresses non-linearly and can be influenced by multiple parameters, presenting a significant prognostic challenge. There are no widely accepted clinical tests that can predict the development of glaucoma, and evaluation relies largely on the skills and expertise of clinicians, often requiring many hospital visits.
Trajectory-based prediction
Glaucoma prognostic models require serial patient data to model progression trajectories. For example, Kalman filtering, a widely used prognostic technique, has increased efficiency (29% fewer tests) and reduced delays (detecting progression 57% earlier than fixed-interval monitoring) while forecasting mean VF and IOP values for normal-tension glaucoma.
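A minimal one-dimensional constant-velocity Kalman filter over simulated clinic visits (illustrative only; the dynamics, noise levels, and data below are assumptions, not the validated glaucoma model cited above) looks like this:

```python
import numpy as np

def kalman_1d(measurements, dt=0.5, q=0.01, r=1.0):
    """Constant-velocity Kalman filter over a series of clinic visits.

    State = [value, rate of change], e.g. VF mean deviation in dB and
    its progression rate in dB/year; dt is the inter-visit interval in years.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition (value drifts at rate)
    H = np.array([[1.0, 0.0]])              # we only measure the value itself
    Q = q * np.eye(2)                       # process noise
    R = np.array([[r]])                     # measurement (test-retest) noise
    x = np.array([measurements[0], 0.0])    # initial state
    P = np.eye(2)
    estimates = []
    for z in measurements:
        x = F @ x                           # predict to the next visit
        P = F @ P @ F.T + Q
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)  # Kalman gain
        x = x + K @ (np.array([z]) - H @ x)           # update with the new test
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x.copy())
    return np.array(estimates)

rng = np.random.default_rng(2)
true_md = -2.0 - 0.6 * np.arange(12) * 0.5          # slow VF loss: -0.6 dB/year
noisy = true_md + rng.normal(scale=0.8, size=12)    # test-retest variability
est = kalman_1d(noisy)                              # column 0: smoothed MD; column 1: rate
```

Because the filter also estimates the progression rate, follow-up intervals can be adapted to each patient, which is the mechanism behind the efficiency gains noted above.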
Using archetypal analysis, Wang et al. computed the rate of change of the archetype weights and built a predictive model on a large dataset of serial tests. This glaucoma representation reached an accuracy of 0.77 in predicting glaucoma progression.
Mayro et al. [80] yielded similar prediction models by means of trajectories.
Prediction at a single point in time
Several DL models have predicted glaucoma progression from data at a single point in time, which, when feasible, is more efficient than using serial data. However, such architectures are difficult to deploy, because forecasts should ideally be made long before the clinical manifestations of the disease.
Two such models are shown in Table 4. Thakur et al. [94] developed a DL architecture to predict OD or VF anomalies using fundus images. Such a structure reached an AUC of 0.77 for evaluating glaucoma progress 4–7 years before clinical symptoms, and the AUC was greater (0.88) when prognosing progress 1–3 years before disease onset.
Another study [93] proposed a CNN-assisted method to predict glaucoma progression using an assay of apoptotic retinal cells (DARC). The CNN in that work was used not to forecast progression directly but to identify DARC-positive stained cells in retinal fluorescence images. Using RNFL (retinal nerve fiber layer) OCT measurements at 18 months as a reference, the CNN-assisted DARC experiment showed a sensitivity of 0.857 and a specificity of 0.917, with an AUC of 0.89, in distinguishing rapidly progressing eyes from stable ones.

4.4. Challenges of AI in Glaucoma

4.4.1. Dataset Dependency

Training datasets significantly affect AI, especially the DL approach, which needs large amounts of annotated training data. Acquiring, curating, and labeling many instances is labor-intensive and time-consuming. Moreover, an AI model trained on data from one ethnic group, device type, or acquisition protocol often performs poorly on data from others.

4.4.2. Lack of Agreement Among Ophthalmologists

Glaucoma differs from diseases such as diabetic retinopathy, which have well-established grading protocols. Thus, disagreements can exist between clinicians, and even between glaucoma specialists, in assessing glaucoma patients. The diagnosis and prognosis of glaucoma depend heavily on the clinicians’ expertise and the available clinical procedures. Even when the input data are uniform, disagreement among ophthalmologists can introduce variation in the reference labels, which then propagates through the training stage and degrades the diagnostic results.

4.4.3. Early Glaucoma

Approximately 50% of glaucoma cases remain undiagnosed until a relatively late stage, because the disease progresses without symptoms in its initial phases. Early diagnosis is therefore crucial, as with other illnesses. However, AI applications may fail to identify milder cases, such as suspected or preperimetric glaucoma.
There is no universally accepted protocol for confirming an early glaucoma diagnosis, although the World Glaucoma Association has provided a consensus document defining its principal characteristics [95]. The study by Thakur et al. [94] reports encouraging results in forecasting glaucoma several years before disease onset, though it remains unclear which early features of the illness the approach actually detects.

4.5. Possible Evolution

4.5.1. Wearable Equipment and Cloud Computing

Due to the aging of the population, the demand for ophthalmic services will increase because of the high incidence of glaucoma and other age-related eye diseases.
Today, glaucoma monitoring is a costly burden for individuals and the healthcare system. Wearable devices and cloud platforms could be appropriate solutions, as shown by several regional and national diabetic retinopathy screening projects.
An approach that identifies glaucomatous features using wearable fundus cameras and a cloud-based AI architecture has been developed [90], as illustrated in Figure 7. The examinee's fundus photos are uploaded to a cloud platform, where they are analyzed and a screening report is generated. Subjects with a high-probability glaucoma result may be referred to ophthalmologists for additional examinations; for the remaining subjects, a repeat test in 12 months is recommended.
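The referral logic just described can be sketched as a simple rule on the cloud model's output probability. The threshold and report fields below are illustrative assumptions, not details of the cited system [90].

```python
# Sketch of the cloud-side triage rule: a model returns a glaucoma
# probability per fundus photo; subjects above a referral threshold are
# sent to an ophthalmologist, the rest are recalled in 12 months.
# Threshold and report fields are hypothetical.

REFERRAL_THRESHOLD = 0.5  # hypothetical operating point

def triage(subject_id, glaucoma_probability):
    if glaucoma_probability >= REFERRAL_THRESHOLD:
        action = "refer to ophthalmologist for additional examination"
    else:
        action = "recommend repeat screening in 12 months"
    return {"subject": subject_id,
            "probability": glaucoma_probability,
            "action": action}

report = triage("S-001", 0.82)
print(report["action"])  # refer to ophthalmologist for additional examination
```

In practice, the operating threshold would be chosen from the model's ROC curve to balance missed cases against unnecessary referrals.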

4.5.2. Explainable AI

Many AI models, and DL models in particular, have problems with interpretability: the internal decision process is hidden, and one cannot describe how the model reaches a particular verdict. Explainable AI offers numerous advantages. For instance, it can increase our confidence in different models, and insight into how a model uses the data can suggest new diagnostic and prognostic biomarkers to clinicians, leading in turn to new knowledge about the pathological mechanisms of disease.
Continued advances in explainable AI for glaucoma can already be outlined. For example, in an early paper, Goldbaum et al. [83] explained glaucoma perimetry outcomes by associating feature weights with visual field areas. One of the most recent methods is archetype analysis, which represents each visual field as a weighted combination of distinctive archetypal patterns. Many powerful AI applications for fundus imaging and OCT are also capable of visualizing suspicious pathologies or salient regions in images [96].
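One widely used family of such visualization techniques perturbs the input and watches the model's output. The sketch below implements occlusion sensitivity on a toy 6×6 "image" with a stand-in scoring function; it illustrates the principle only and is not any cited system.

```python
# Sketch of occlusion sensitivity: slide a blanking patch over the image
# and record how much the model's score drops; large drops mark regions
# the model relies on. Toy image and scoring function are illustrative.

def toy_model(img):
    # Stand-in for a trained classifier: responds strongly to the
    # bright "lesion" pixels in the lower-right quadrant.
    return sum(img[r][c] for r in range(3, 6) for c in range(3, 6))

def occlusion_map(img, model, patch=2):
    base = model(img)
    n = len(img)
    heat = [[0.0] * n for _ in range(n)]
    for r0 in range(0, n, patch):
        for c0 in range(0, n, patch):
            occluded = [row[:] for row in img]
            for r in range(r0, min(r0 + patch, n)):
                for c in range(c0, min(c0 + patch, n)):
                    occluded[r][c] = 0.0          # blank out the patch
            drop = base - model(occluded)         # importance of the patch
            for r in range(r0, min(r0 + patch, n)):
                for c in range(c0, min(c0 + patch, n)):
                    heat[r][c] = drop
    return heat

image = [[1.0 if r >= 3 and c >= 3 else 0.1 for c in range(6)]
         for r in range(6)]
heat = occlusion_map(image, toy_model)
print(heat[5][5] > heat[0][0])  # True: the lesion quadrant matters most
```

Gradient-based saliency and class activation mapping follow the same idea at finer granularity, attributing the prediction back to image regions.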

4.5.3. Convergent Technologies

Another potential future development is combining distinct methods by means of AI. For instance, imaging genetics is a very promising field: the goal is to find the anatomical and functional genetic foundation of certain anomalies and to explain how these variants are linked to glaucoma.
The authors of [97] detected 112 genomic loci related to IOP and glaucoma progression. A regression model built on these loci achieved an AUC of 0.76 for glaucoma detection. Margeta et al. [98] demonstrated that the APOE-ε4 allele is associated with a decreased risk of primary open-angle glaucoma (POAG), suggesting a protective role of APOE-ε4 in glaucoma.
By converging such methods, AI will more effectively identify diagnostic, prognostic, and treatment biomarkers for glaucoma.

4.6. Final Remarks

AI already serves several functions in glaucoma care, for instance detecting signs of structural and functional deterioration and supporting diagnosis, but it is far from reaching its full potential.
ML and DL models remain dependent on accurate training with varied datasets. Although clinical and technical challenges in applying AI in medicine do exist, future research is expected to accelerate the impact of efficient AI applications in glaucoma care.

4.7. AI in Retinal Diseases: Other Applications

4.7.1. Introduction

AI research in ophthalmology focuses mainly on the retina, as this subspecialty generates large volumes of images covering a wide range of conditions. Diseases of the retina are frequently studied using multimodal imaging, i.e., fundus photography, retinal angiography, and optical coherence tomography (OCT).
Moreover, retinal diseases often have similar and overlapping phenotypes, allowing diagnosis and management using pattern recognition methods.
Retinal vascular diseases such as diabetic retinopathy (DR) and retinal vein occlusions can show features like microaneurysms and complications such as macular edema, whose identification plays a salient role in the early diagnosis and treatment of these diseases.
Basic applications of AI in various ophthalmic image analysis algorithms include image enhancement, region of interest identification in images (segmentation), feature computation, and classification (screening) [99].

4.7.2. Applications of AI in Retinal Diseases

AI in Diabetic Retinopathy

Various applications have been described in previous sections.

AI in Age-Related Macular Degeneration

This subject has been covered in previous sections.

AI in Choroidal Neovascularization and Macular Diseases

OCT images, especially macular ones, are well suited to developing DL algorithms. The large volume of macular OCT images can provide the very large datasets needed to train DL networks. At the same time, the precision of OCT acquisition reduces data variability, allowing DL networks to extract meaningful information even from small datasets. Because macular OCTs provide a wealth of structural information about the retinal layers, DL can also be used to discover new biomarkers for macular diseases.
An early application of DL to macular OCT analysis was the automatic classification of age-related macular degeneration (AMD): Lee et al. trained a DL network on over 100,000 OCT images, achieving an AUC of 0.97 [47].
Most DL studies on macular OCT have used 2D B-scans rather than full 3D volumes, which remains a challenge for OCT-based DL methods.
CNN-based DL has been used successfully to improve the segmentation of retinal layer boundaries, intraretinal cystoid fluid, and subretinal fluid on B-scans [100].
A new application of AI was presented by De Fauw et al. [7]: combined segmentation and classification of OCT images. First, a segmentation stage delineated some 15 retinal morphological features and artifacts in the OCT images. A second network then categorized the segmentation results into ten classes (choroidal neovascularization (CNV), macular edema, drusen, full-thickness macular hole, geographic atrophy, partial-thickness macular hole, vitreomacular traction, epiretinal membrane, central serous retinopathy, and "normal"). These classes were further triaged into urgent, semi-urgent, routine, and observational referrals. The authors concluded that the application classified on par with experts [7].
This DL network could be deployed in rapidly accessible "virtual clinics" to help triage patients with macular diseases, reducing the burden on tertiary health institutions. Such triage clinics could also be run by optometrists in rural or outpatient settings.
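The two-stage design and referral triage described above can be sketched as follows. The stage models are stubbed out, and the urgency mapping is an illustrative assumption rather than the exact mapping used by De Fauw et al. [7].

```python
# Sketch of the two-stage OCT pipeline: segmentation -> classification
# into one of ten categories -> fixed lookup of referral urgency.
# Stage models are stubs; the urgency table is a hypothetical example.

URGENCY = {
    "choroidal neovascularization": "urgent",
    "macular edema": "urgent",
    "full-thickness macular hole": "semi-urgent",
    "partial-thickness macular hole": "semi-urgent",
    "vitreomacular traction": "semi-urgent",
    "central serous retinopathy": "semi-urgent",
    "geographic atrophy": "routine",
    "epiretinal membrane": "routine",
    "drusen": "observational",
    "normal": "observational",
}

def segment(oct_scan):
    # Stage 1 stub: a real system returns ~15 tissue/artifact maps.
    return {"tissue_map": oct_scan}

def classify(segmentation):
    # Stage 2 stub: a real system scores all ten categories.
    return "macular edema"

def triage_scan(oct_scan):
    diagnosis = classify(segment(oct_scan))
    return diagnosis, URGENCY[diagnosis]

print(triage_scan("example-oct-b-scan"))  # ('macular edema', 'urgent')
```

Separating segmentation from classification lets the classifier operate on device-independent tissue maps, which is what made the published system transferable across OCT scanners.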

AI in Retinopathy of Prematurity

This subject has been covered in previous sections.

Retinal Vein Occlusions (RVOs)

RVO is the second most common retinal vascular cause of visual impairment, after DR. It produces retinal hemorrhages, macular edema, and exudation. RVO is more common at older ages, where high blood pressure, atherosclerosis, and heart disease are major risk factors.
To date, AI has not been used extensively for RVO. One study combined a CNN with image voting methods to automatically detect RVO on fundus photographs, reaching a precision of over 97% [101].
Nagasato et al. [44] implemented a DL approach for detecting retinal non-perfusion areas in RVO using OCT angiography, with heat maps highlighting the non-perfused sectors. Discrimination of OCT angiography images with RVO from those of healthy individuals was very good (AUC = 0.986). The sensitivity, specificity, and mean time needed to classify the images were 93.7%, 97.3%, and 176.9 s, respectively, and the DL approach outperformed ophthalmologists on all parameters.
Table 5 shows the power of AI methods in identifying retinal diseases using fundus images.

4.8. Pearls and Pitfalls in Using AI Applications in Ophthalmology

AI is a very complex domain of research with potentially revolutionary impact in medicine, including ophthalmology. A prominent challenge for AI in ophthalmology is that training and validation datasets are not standardized [103]. Many databases accumulate images on the assumption that more is always better. However, oversized databases, or those with heterogeneous elements collected in different acquisition environments, can decrease the precision of the evaluations. Conversely, including broad demographic profiles, trimming dataset size and algorithm complexity, and reducing the number of classes in a classification task can increase precision and prognostic value.
Since most subspecialties in ophthalmology use imaging modalities, AI can be involved in improving patient care. However, the cost of imaging equipment can be high, and investors, including governmental bodies, must ensure that the medical equipment is available to all people, including those in poorer regions.
As AI is not an infallible technology, some patients with very severe disease could go undetected, and a false-negative decision can have severe consequences for the visual system. Technically, methods with high overall accuracy may still exhibit comparatively high false-negative rates in disease identification, lowering diagnostic quality. For example, certain presentations of diabetic eye disease, atypical retinal features, or coexisting glaucoma or macular degeneration may be missed.
A real drawback of current AI applications is that a separate program must be built and tuned for each individual condition, placing us in the paradigm called weak or narrow AI. Until this issue is resolved, clinical examination will remain the gold standard. Even though AI holds great promise, it brings its own limitations and dangers, including the risk of de-skilling specialists to the point where clinicians lose their ability to make quality diagnoses.
In the future, AI will be incorporated into computerized diagnostic, treatment, and management instruments. This vision will be especially practical in rural and poorer regions that have limited access to healthcare, as well as in educational systems. Moreover, AI-related systems are seen as a salient tool to decrease social imbalances in the healthcare domain.

4.9. Conclusions

AI and ML enable improved screening and prognosis in retinal diseases, especially DR and retinopathy of prematurity (ROP). These advances can potentially widen patient access to care and reduce healthcare expenditure. In developing countries in Asia, Africa, and Latin America, this is an encouraging approach that can benefit millions of people without easy access to ophthalmologists.
Further research is needed to bring these applications into the clinic at reasonable cost, and DL promises to have a substantial effect on practical medicine, especially ophthalmology, in the future.

4.10. AI in Neuro-Ophthalmology

Over the last 10 years, AI has provided clinicians with new, rapid, precise, and automated methods to diagnose and treat eye diseases, and has opened new avenues for modern eye care [101].
A few ophthalmologic subspecialties have benefited from AI to a greater extent than others, e.g., through automated DL identification of diabetic retinopathy, glaucoma, and age-related macular degeneration.
In contrast, subspecialties such as neuro-ophthalmology have seen fewer AI-based advances in detection, treatment, and prognosis.

4.10.1. Neuro-Ophthalmology

The anatomy and physiology of the visual pathways extend beyond the eye to the posterior areas of the brain. Individuals with intracranial diseases therefore frequently present with visual disturbances, for which an ophthalmologist is consulted.
Neuro-ophthalmology, a subspecialty at the confluence of ophthalmology and neurology, deals with conditions affecting the afferent (e.g., vision) and/or efferent (e.g., eye movements, pupillary responses) pathways that connect the eyes to the rest of the nervous system.
Modern neuro-ophthalmology is a unifying specialty, connecting ophthalmologists not only with neurologists but also with neurosurgeons, neuro-radiologists, neuro-otologists, geneticists, neuro-immunologists, and neuropathologists. These multi- and interdisciplinary perspectives reflect the demanding diagnostic skills that neuro-ophthalmic conditions require; such conditions are relatively infrequent and complex compared with ophthalmic and neurological diseases taken separately.
In short, neuro-ophthalmic diseases may be classified into
  • diseases impacting the afferent ocular system (the pathway of retina–optic nerve, chiasm, retro-chiasmal pathways, and occipital lobes), which are sources of high-order optical abnormalities;
  • diseases that affect the efferent visual system, which are sources of central ocular motor disorders (at the cortical and brainstem levels), gaze instability, ocular motor cranial neuropathies, pupillary conditions, and several peripheral abnormalities affecting the neuromuscular junction as well as the muscles themselves.
The set of diseases that can impact these anatomical structures is substantial: autoimmune, ischemic, inflammatory, infectious, traumatic, compressive, congenital, and degenerative conditions. Sometimes a rather benign neuro-ophthalmic dysfunction (e.g., inflammatory optic neuropathy) indicates a more severe neurological disease (e.g., multiple sclerosis). Likewise, acute strabismus due to acquired ocular misalignment, or swelling of the optic nerve head, may be the sole expression of a life-threatening condition (cerebral aneurysm, tumor, systemic metabolic disease, etc.) that requires rapid detection and treatment.

4.10.2. AI in Optic Nerve Head (Optic Disc) Anomalies

The optic nerve head is the proximal termination of the optic nerve. Clinically, the integrity of the optic nerve head (optic disc) can be assessed by means of direct ophthalmoscopy or fundus photography. Optic nerve injuries lead to visible disc modifications, like swelling, pallor, “cupping” (e.g., in glaucoma), or infiltration.
A large reserve of optic disc images in glaucomatous eyes allows DL to automatically detect glaucoma on digital images of the fundus [74], based only on optic disc appearance or in combination with imaging (OCT) [101].
Several studies have automatically determined optic disc laterality using DL and transfer learning [104], as well as detected neuro-ophthalmic abnormalities of the optic nerve head [105].
Research on fundus images has shown that feature selection and extraction using ML (e.g., SVM) together with powerful statistical descriptors (e.g., the gray-level co-occurrence matrix) can detect optic discs with papilledema with high precision (93%) [106], and can also grade disease severity in high agreement with a neuro-ophthalmologist (Kappa score = 0.71) (see Appendix B).
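A Kappa score such as the 0.71 reported above is Cohen's kappa, i.e., agreement between two raters corrected for chance. The sketch below computes it from a hypothetical severity-grading confusion matrix, not the study's actual data.

```python
# Sketch: Cohen's kappa between two raters (here, an algorithm and a
# neuro-ophthalmologist grading severity). The table is hypothetical.

def cohens_kappa(matrix):
    """matrix[i][j]: cases rated severity i by rater A and j by rater B."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_tot = [sum(row) for row in matrix]
    col_tot = [sum(matrix[i][j] for i in range(len(matrix)))
               for j in range(len(matrix))]
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical 3-grade severity agreement table
table = [[20, 3, 0],
         [4, 15, 2],
         [1, 2, 10]]
print(round(cohens_kappa(table), 2))  # 0.67
```

Values around 0.6–0.8 are conventionally read as "substantial" agreement, which is why a kappa of 0.71 against an expert grader is considered a strong result.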
Ahn et al. used ML to distinguish normal discs, discs swollen by certain optic neuropathies, and pseudopapilledema. Using data augmentation to prevent overfitting and a conventional CNN built with TensorFlow and transfer learning, the researchers differentiated true optic disc swelling from pseudo-swelling with approximately 95% precision.
Milea and his team trained a DL system on 14,341 images acquired from 6779 patients at 19 neuro-ophthalmology centers worldwide (the BONSAI consortium), comprising 9156 normal optic disc images, 2148 discs with confirmed papilledema, and 3037 discs with other anomalies [105]. Evaluated on an external set of 1505 fundus photographs collected retrospectively from five other centers and covering different ethnic populations, BONSAI-DLS classified healthy discs, discs with papilledema, and discs with other anomalies (e.g., non-arteritic ischemic optic neuropathy, disc degeneration, optic disc drusen) with high accuracy, achieving AUCs of 0.98, 0.96, and 0.96, respectively [105].
DL provides a real opportunity for healthcare staff who are not experienced in performing ophthalmoscopy to automatically, and with good results, analyze vision and detect dangerous situations on fundus images. Physicians and neurologists in intensive care units may also benefit from automated optic disc analysis in their regular assessments of patients, particularly those with limited access to neuro-ophthalmologists.

4.11. AI in Eye Movement Disorders

These disorders stem from ocular motor nerve palsies and other sources of diplopia and ocular misalignment, conjugate gaze anomalies, nystagmus, or other abnormal eye movements.
The literature on AI applications in this area is more limited. Some authors have used ML to assess conjugate gaze limitations as ocular biomarkers for neurodegenerative diseases (Parkinson's [107], Alzheimer's [108], and Huntington's disease [109]) or neuropsychiatric illnesses [110].
Moreover, AI has helped neuroimaging modalities in certain processes such as image segmentation and classification, diagnostics, prognosis, outcome prediction, and risk evaluation [111].
ML techniques have been used to model ocular motor data [112] or to predict characteristics related to congenital nystagmus [113]. More powerful DL methods have been used to identify strabismus in pediatric ophthalmology [59,114,115], which could be applied to cranial nerve palsy and telemedicine.

AI and Ocular Motor Characteristics

Few studies have described ML methods for investigating conjugate gaze abnormalities and nystagmus. Conjugate gaze anomalies comprise vertical or horizontal conjugate gaze restrictions, pursuit or saccadic deficits, and reflexive divergence of conjugate gaze.
D'Addio et al. [113] used electrooculography to record eye movements in 20 patients and computed signal features using standard software. They built predictive models with two ML methods, Random Forest and Logistic Regression Tree, predicting visual acuity and eye positioning variability as a function of nystagmus characteristics (such as initial oscillations and foveal periods) with accuracies above 0.72 and 0.70, respectively.
The application of AI in the case of strabismus diagnosis was described in the review [59]. Certain AI systems were used for the identification of cranial nerve paralysis and other eye imbalances.
Lu et al. [116] compiled a strabismus database of 5685 facial images and used it to train and test a two-stage method: in the first stage, a region-based fully convolutional network (R-FCN) segmented the eye regions, and in the second stage a deep CNN classified the eye regions as normal or strabismic. The results were encouraging: the sensitivity was 93.3%, the specificity reached 96.2%, and the accuracy was 93.9%.
Refs. [114,117] present diagnostic and deviation models for vertical strabismus by means of an expert system (StrabNet) based on ANNs. Four hypertropic and four hypotropic deviations, each linked to a diagnosis, were studied, and 50 sets of measurements per diagnosis were used to train and evaluate the ANN. The overall results of StrabNet were good: 94% accuracy and 100% specificity.
Ref. [59] highlights some limitations or obstacles faced by AI applications in pediatric ophthalmology, as in other medical domains:
(1) lack of agreement among specialists regarding the reference standard;
(2) limited ability to reproduce and compare results when researchers do not use publicly accessible databases;
(3) absence of longitudinal (time-related) assessments;
(4) uninterpretable results of DL applications (black boxes), which make them unreliable for some medical service providers.

4.12. Conclusions

AI applications have proven practical for characterizing optic disc and certain eye movement abnormalities, as well as for the early identification and staging of disease in individuals with various neurological and neuro-ophthalmological conditions. In acute situations, AI can also improve care for patients who require neuro-ophthalmologic expertise when specialists are not accessible. More research is therefore necessary to assess the clinical use of AI applications in neuro-ophthalmic diseases.
The favorable reception of telehealth and AI-powered medical services by providers, patients, and regulatory bodies increased during the COVID-19 pandemic [118].
However, the growth of AI applications requires complementarity with digital innovations that enable remote clinical diagnosis and self-investigations (e.g., digital fundus photography, vision testing applications) [119] to accelerate tele-neuro-ophthalmology as a viable healthcare system.

5. AI and Different Applications in Ophthalmology and Other Fields

Ophthalmology is well suited to the AI domain, as it relies extensively on image acquisition and processing. ML and DL can learn from raw input data without explicit feature encoding, although many AI pipelines still require data preprocessing.
Furthermore, DL approaches such as CNNs have improved substantially. DL models can skip the preprocessing stage and be trained in an unsupervised manner, enabling faster, more efficient, and often more accurate investigations, including in ophthalmology.
This section presents the role of AI in significantly improving work in ocular pathology, oncology, genetics, and pediatric ophthalmology. In addition, the eye has proven to be a window onto many biological systems in the body: the retina, for instance, has predictive value for assessing cardiovascular conditions, anemia, multiple sclerosis, and certain neurodegenerative diseases such as Alzheimer's and Parkinson's disease.
The use of AI in tele-ophthalmology is another frontier in eye care with great transformative power.

5.1. AI in Pathology

5.1.1. Models

Digital pathology images offer rich databases for training and applying DL algorithms. For example, CNNs have diagnosed nodal metastases in breast cancer better than a standard panel of pathologists [120]: a DL algorithm classified test images with an AUC of 0.996, remarkably better than the pathologists' evaluations (AUC = 0.810). The method also detected micrometastases more effectively (AUC = 0.885) than the physicians (AUC = 0.808); the best pathologist in the panel missed 37% of these cases.
AI is a proven framework for providing genotype and phenotype information based on pathology images, including in ophthalmology. For example, a deep CNN precisely classified lung images into normal, adenocarcinoma, and squamous cell carcinoma in much less time than pathologists [121]. The study obtained sensitivity and specificity comparable to pathologists’ evaluations, i.e., 89% and 93%, respectively. About 50% of images that were wrongly classified by the automated procedure were also wrongly classified by pathologists. Conversely, over 80% of images wrongly classified by at least one pathologist were accurately classified by the software. As datasets for ophthalmic histopathology expand, AI will play an important role in ocular pathology.

5.1.2. AI in Ocular Oncology

AI has shown accurate performance in the detection, diagnosis, and prediction of oncological diseases, in certain situations even surpassing the performance of clinicians [120,122].

Choroidal Melanoma

Applications of AI have also been demonstrated successfully in the analysis of choroidal melanoma. For instance, an ANN predicted five-year mortality from choroidal melanoma, based on patient demographics and ultrasound tumor images, with higher accuracy (86%) than an ocular oncologist (70%) [56].
Another ANN-based technique modeled survival prognosis in choroidal melanoma patients using additional significant risk parameters (age, gender, tumor stage, cytogenetic melanoma type, and histological grade of malignancy) [56].

Retinoblastoma and Leukocoria

Retinoblastoma is the most common primary intraocular cancer of childhood and is responsible for 10–15% of cancers occurring in the first year of life. This tumor can grow and metastasize rapidly.
AI can be used to improve screening and identification of retinoblastoma.
CRADLE (Computer-Assisted Leukocoria Detector) is a smartphone-based application for screening children for leukocoria [123]; the patients studied had retinoblastoma, Coats disease, amblyopia, hypermetropia, or cataracts. CRADLE has been redesigned around a built-in CNN originally developed to find leukocoria in nonclinical settings. A total of 52,982 photos, all taken casually by the patients' relatives, were used to train and test the algorithm. Retrospective evaluation showed that CRADLE identified leukocoria in 80% of patients with eye conditions, and it detected leukocoria in images acquired an average of 1.3 years before clinical diagnosis [123].

5.1.3. AI in Ocular Genetics

The clinical applications of AI in ocular genetics are comparatively limited, even though the extensive use of imaging in ophthalmology offers a rich source of data for genetic prediction.

Precedents

In one study, a CNN was trained on brain MRIs of low- and high-grade glioma patients to predict genotypic features [123]. The application independently predicted IDH1/2 gene mutations and MGMT gene methylation status with 94% and 83% accuracy, respectively. For a condition like glioma, determining the genotype is crucial for suitable treatment.
Likewise, six genetic mutations in lung adenocarcinoma (STK11, EGFR, FAT1, SETBP1, KRAS, and TP53) could be predicted, with AUCs ranging from 0.733 to 0.856, from pathological image analysis [124]. Such analysis could be implemented for any type of cancer, including ocular cancers.

Inherited Retinal Disorders (IRD)

IRD is an area of active development for DL systems.
Through modern sequencing, more than 250 genes and 300 loci implicated in retinal dystrophies have already been detected, laying the groundwork for evolving gene therapies and pharmacological agents [125].
Combined genetic and clinical studies associated morphological characteristics with specific genes in different retinal dystrophies, such as ABCA4, RP1L1, and EYS.
Ref. [125] describes the development of deep neural networks to accurately predict the causative gene in macular dystrophies (ABCA4 and RP1L1) versus retinitis pigmentosa (EYS) from spectral-domain optical coherence tomography (SD-OCT) images. The study reported an overall average test accuracy of 90.9%, with per-category accuracies of 100% for ABCA4, 78% for RP1L1, 89.8% for EYS, and 93.4% for normal subjects.
AI systems in IRD and other specialties are proving their value by linking genetic prediction to clinical settings. These applications are very useful for conditions where specific expertise is limited, such as in IRD. Genetics will be incorporated into general screening to increase the value of clinical diagnoses.

5.1.4. AI in Pediatric Ophthalmology

Pediatric Cataract

Cataracts in children are a common and avoidable cause of visual abnormalities and possible permanent vision loss worldwide [126]. Left untreated, particularly during the critical interval from birth to age 5, they can lead to irreversible amblyopia, so early identification and surgical intervention are essential. Cataracts are currently diagnosed with the slit lamp, an examination that can be hard to perform on children.
CC-Cruiser is an AI-based cloud platform that visualizes data and suggests treatment for patients with congenital cataracts. It comprises three CNNs that (1) visualize and identify potential cataract patients, (2) assess disease severity (lens opacity, density, and location) as well as risk stratification, and (3) provide advice—surgery or follow-up—based on the risk analysis.
A multicenter randomized study evaluated CC-Cruiser's accuracy in detecting cataracts and recommending treatment. The platform achieved accuracies of 87.4% and 70.8%, respectively, compared to the experts' significantly better 99.1% and 96.7% [127]. However, CC-Cruiser delivered a diagnosis in an average of 3 min, compared to almost 9 min for the experts.
A remaining limitation is the scarcity of cataract imaging data. As with eye tumors, ML/DL algorithms cannot be trained reliably on small image sets. In addition, pediatric cataracts show distinct characteristics and risk patterns, so available adult cataract models and datasets cannot be reused.

Strabismus

Strabismus is a condition of ocular misalignment and is diagnosed through simple clinical examinations such as the Hirschberg test and coverage test.
In a hospital environment with specialized equipment, CNNs can identify strabismus from eye-tracking fixation divergence (95% accuracy) and from retinal birefringence scanning (100% accuracy) with high sensitivity and specificity [115].
Ref. [128] describes the application of AI techniques to analyze the red pupil reflex (Bruckner test) in video and photorefraction images for amblyogenic factors. The decision tree made the same referral/non-referral decision 77% of the time as the clinicians, who were considered the “gold standard”.
For situations with limited access to ophthalmic clinics, a new model was proposed for the development of a CNN for telemedicine. Supporting both tele-ophthalmology and clinical strabismus diagnosis, an AI-based mobile platform localizes eyes and classifies them into nine gaze positions [129].

5.1.5. Supplementary AI Applications in Eye Healthcare

The retina provides significant data on pathological conditions of the eye and also serves as a window onto the rest of the body. Retinal imaging yields an abundance of data for assessing cardiovascular (CV) health, anemia, and systemic disorders of the central nervous system, such as multiple sclerosis, Alzheimer's disease, and Parkinson's disease. These methods will form the foundation of non-invasive, inexpensive techniques for the screening, diagnosis, and monitoring of certain systemic conditions.

Cardiovascular Risk Factors

Precise, efficient, and scalable procedures for cardiovascular disease (CVD) risk assessment remain an urgent necessity, both for predicting outcomes of heart disease, stroke, chronic kidney disease (CKD), and mortality, and for preventing adverse cardiovascular events such as heart attacks.
Common risk calculators, such as the Framingham coronary risk score and the pooled cohort equations, all require invasive blood tests.
The retina, the only site where the microvasculature can be visualized non-invasively, is a promising source of information on a subject's cardiovascular risk [130]. Ocular manifestations of cardiovascular disease include hypertensive retinopathy, cholesterol emboli and occlusions, flame-shaped hemorrhages, blebs, and ischemic events.
Retinal vascular measurements (diameters of arteries and veins) in fundus photographs have been found to be related to CVD events. For instance, the risk of CVD is greater in individuals with narrower retinal arterioles and wider venules.
In the case of patients with diabetes, predictive capacity increases when retinal imaging information is complemented with additional details such as established risk parameters (e.g., blood pressure) and C-reactive protein (CRP) levels [130].
Moreover, investigations have demonstrated that retinal image processing improves stroke risk prognosis compared to that given by typical risk factors (gender, age, blood pressure, total cholesterol, LDL cholesterol, glycosylated hemoglobin (HbA1c), and antihypertensive medications) [131]. However, interpretation and evaluation of retinal images is a difficult and time-consuming activity for ophthalmologists, even with semi-automated computer programs [101].
In a study by Verily, a Google research company, a DL-based method was developed and validated that predicts several CV risk parameters from retinal fundus images alone: these parameters include age, gender, smoking status, systolic blood pressure, and body mass index [124]. The CNN was trained on retinal fundus images from 48,101 patients in the UK Biobank study and 236,234 patients from the EyePACS database. The research also produced a model that estimated the onset of major adverse cardiovascular events (MACE) within 5 years using only fundus images. MACE comprises unstable angina, myocardial infarction, stroke, or death from CV causes. The model achieved an AUC of 0.70 (95% CI: 0.65, 0.74).
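An AUC with a confidence interval of this kind can be estimated with a percentile bootstrap. The sketch below uses the rank-based (Mann-Whitney) formulation of the AUC on synthetic risk scores; all data here are illustrative, not the study’s.

```python
import numpy as np

def auc(labels, scores):
    """AUC via the Mann-Whitney U formulation: the probability that a
    randomly chosen positive case outranks a randomly chosen negative one."""
    labels, scores = np.asarray(labels), np.asarray(scores)
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]          # all positive/negative pairs
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC."""
    rng = np.random.default_rng(seed)
    labels, scores = np.asarray(labels), np.asarray(scores)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(labels), len(labels))   # resample subjects
        if 0 < labels[idx].sum() < len(idx):              # need both classes
            stats.append(auc(labels[idx], scores[idx]))
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))

# Synthetic risk scores: positives score higher on average (illustrative only).
rng = np.random.default_rng(42)
y = rng.integers(0, 2, 500)
s = 0.5 * y + rng.normal(size=500)
print(f"AUC = {auc(y, s):.3f}, 95% CI = {bootstrap_ci(y, s)}")
```

Resampling whole subjects (rather than positive/negative pairs) keeps the bootstrap faithful to how the cohort was sampled.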
Despite the promising results, the above DL techniques need to be clinically confirmed and trained with larger and more diverse databases. Although fundus retinal imaging can complement CV risk evaluations nowadays, more advanced exploration is needed to determine if it can replace certain risk markers.

Multiple Sclerosis

Multiple sclerosis (MS) is a neurodegenerative disease caused by central nervous system (CNS) demyelination, resulting in progressive clinical disability. Diagnosis is made using the McDonald criteria, which combine clinical findings with objective evidence, including MRI of the brain and spinal cord and cerebrospinal fluid (CSF) examinations.
As MS progresses, patients experience episodes of optic neuritis or vision loss, as well as diplopia caused by internuclear ophthalmoplegia.
Diagnosis. Recently, research has detected a correlation between MS and both optic nerve axon loss and reduced macular thickness. Retinal OCT investigations analyzed with ML can supplement the traditional diagnosis of MS [132].
The retinal structure contains retinal ganglion cells and their axons, which compose the retinal nerve fiber layer (RNFL). By means of OCT images, research has shown that MS patients have thinner RNFL in comparison with healthy controls, both for those with and without optic neuritis [133].
In [18], three ML algorithms (decision trees, multilayer perceptron, and Support Vector Machine) are compared to assess the diagnostic utility of RNFL and ganglion cell layer (GCL) thickness loss, as measured by swept-source OCT (SS-OCT). Decision trees gave the best prediction (97.4%) using RNFL data.
Furthermore, across all three ML methods, RNFL thickness loss data were better suited to distinguishing MS patients from healthy subjects than GCL thickness loss data.
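A comparison of this kind can be prototyped with standard tooling. The sketch below trains the same three classifier families on synthetic RNFL/GCL thickness values; the cohort, thickness distributions, and hyperparameters are illustrative assumptions, not those of [18].

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic cohort: MS patients (label 1) with thinner layers than controls (0).
rng = np.random.default_rng(0)
n = 200
y = rng.integers(0, 2, n)
rnfl = np.where(y == 1, rng.normal(85, 8, n), rng.normal(100, 8, n))   # µm
gcl = np.where(y == 1, rng.normal(70, 9, n), rng.normal(80, 9, n))     # µm
X = np.column_stack([rnfl, gcl])

models = {
    "decision tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(16,),
                                       max_iter=2000, random_state=0)),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}
# 5-fold cross-validated accuracy for each classifier family.
scores = {name: cross_val_score(m, X, y, cv=5).mean()
          for name, m in models.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Cross-validation, rather than a single train/test split, is the appropriate way to rank classifiers on small clinical cohorts like this one.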
Progress monitoring. Existing techniques that automatically segment and measure retinal layers require substantial manual preprocessing for feature and parameter extraction, a tedious and time-consuming activity [134]. The authors created a deep neural network that avoids manual feature extraction and can fully segment 3D retinal layers in just 10 s, with precision similar to that of classical segmentation techniques.
Thus, while the McDonald criteria for diagnosing MS are based mainly on neuroimaging and CSF examinations, retinal degeneration, which is easy to visualize, may serve as a supplementary diagnostic instrument. Further studies are required to refine not only the accuracy of the algorithms but also the specific retinal markers.

Other Neurodegenerative Diseases

In recent decades, the retina has been studied as a measuring tool for different neurodegenerative diseases, such as Alzheimer’s and Parkinson’s diseases, and different retinal biomarkers may serve as means to identify them. RNFL thinning, macular volume and thickness, and changes in the retinal vasculature have all been related to Alzheimer’s disease, although results have been inconsistent [135].
Another study built an ML-based classification system that distinguished between patients with Alzheimer’s, Parkinson’s, and healthy individuals by means of retinal “texture” biomarkers [136]. Texture analysis is an evolving technique that permits early quantification of signal modifications that are not visible on existing images [137]. The study yielded average sensitivities of 88.7%, 79.5%, and 77.8% for healthy controls, Alzheimer’s patients, and Parkinson’s subjects, respectively.
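Texture analysis pipelines of this kind commonly start from grey-level co-occurrence matrices (GLCMs). The sketch below implements two classic GLCM statistics (contrast and homogeneity) in plain NumPy, as a generic illustration of the technique rather than the actual feature set of [136].

```python
import numpy as np

def glcm(img, levels=8):
    """Grey-level co-occurrence matrix for horizontally adjacent pixels,
    normalized into a joint probability matrix. `img` must already hold
    integer grey levels in [0, levels)."""
    m = np.zeros((levels, levels))
    for i, j in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        m[i, j] += 1
    return m / m.sum()

def contrast(p):
    """Mean squared grey-level difference between co-occurring pixels."""
    i, j = np.indices(p.shape)
    return float((p * (i - j) ** 2).sum())

def homogeneity(p):
    """Closeness of the co-occurrence distribution to the diagonal."""
    i, j = np.indices(p.shape)
    return float((p / (1.0 + np.abs(i - j))).sum())

rng = np.random.default_rng(1)
flat = np.full((16, 16), 3)                  # uniform patch: no texture
noisy = rng.integers(0, 8, (16, 16))         # speckled patch: strong texture
print(contrast(glcm(flat)), contrast(glcm(noisy)))
print(homogeneity(glcm(flat)), homogeneity(glcm(noisy)))
```

In practice such statistics are computed over several offsets and angles and fed to a classifier; libraries such as scikit-image provide optimized implementations.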
Regarding Alzheimer’s disease, Optina Diagnostics is exploring the connection between retinal vasculature and β-amyloid plaques. Using an AI-powered hyperspectral retinal imaging system, it was found that retinal venules in patients with cerebral β-amyloid plaques have higher tortuosity than those in patients without such plaques [138]. Optina Diagnostics aims to predict brain β-amyloid status in Alzheimer’s patients and is investigating the connection between disease pathophysiology and clinical symptoms [139]. The platform recently obtained Breakthrough Device Designation from the FDA, allowing an accelerated path for the development and evaluation of specific devices [140].

5.1.6. AI in Tele-Ophthalmology

AI applications in tele-ophthalmology have the power to increase access to healthcare and to enable cost savings for patients, providers, and governmental bodies. Three delivery methods have been described for the tele-ophthalmology domain:
  • Asynchronous: This “store-and-forward” method uses clinical imaging information and sends it to the ophthalmologist for evaluation. This method is currently applied for diabetic retinopathy [141].
  • Synchronous: This approach may provide real-time telemedicine between patients and ophthalmologists through certain communication channels (e.g., video chat, phone calls, and smartphone and internet apps). This method has been implemented with success in many intensive care units (ICUs) where remote eye care is offered to centers that do not have such services.
  • Remote monitoring: This type of service permits providers to monitor patients at home or remotely. Intraocular pressure (IOP)-measuring contact lenses, home IOP monitors, IOP sensors, visual field devices, and age-related macular degeneration (AMD) devices are examples of instruments that automatically send acquired data to providers in order to enhance clinical services.
AI is permeating the entire ophthalmic healthcare sector, i.e., in
  • Administrative work: AI programs reduce the administrative workload through intelligent scheduling, automatic invoicing, patient tracking, and claims management;
  • Robotics and procedures: AI software can be found in many specific devices, providing automatic alignment guidance, focusing, and data acquisition. In addition, AI has been used in minimally invasive surgery, remote surgery, and slit-lamp examinations. Robotic tools such as co-manipulators, tele-manipulators, and highly stable robotic hands are being developed;
  • Diagnosis and screening: Integrating the interpretive and predictive skills of AI for tele-diagnosis significantly increases access to healthcare, decreases unnecessary eye doctor consultations, and saves money and time for sick individuals, providers, and the healthcare system. Tele-ophthalmology and AI have so far shown effectiveness for diabetic retinopathy and are being used, for instance, for glaucoma screening, AMD monitoring, corneal abnormal topography identification, and pediatric cataract screening.
  • Remote Monitoring: Telemonitoring is an important challenge in tele-ophthalmology. AI programs can provide personalized information to a large number of patients and assist providers in making clinical decisions.

5.1.7. Limitations

Despite the tremendous potential that AI holds, its clinical implementation and widespread usage face significant barriers. Clinical applications of AI are still in their infancy, with non-standardized procedures and somewhat limited publications. The majority of emerging studies trained and tested their programs on datasets that were not methodically assessed for quality, bias, or clinical validity.
Accordingly, the standardization of procedures for clinical applications of AI is the next critical step [142]. Moreover, a systematic and accepted standard is needed for constructing and evaluating datasets, as well as for robust algorithm training, testing, and validation.
To date, there are no agreed means of clinically validating AI applications, no accepted thresholds of diagnostic precision or AUC that algorithms must achieve in particular subspecialties, and no consensus on potential risks and responsibilities. Only with standardized processes and regulatory oversight, leading to safe, inclusive platforms and accurate AI, can consistent application in clinical management be achieved.

6. Ethics and AI: Pandora’s Box

6.1. Introduction

AI has made and will make significant contributions to healthcare services, biomedical research, and medical education [143]. AI can learn and integrate information from large clinical data sets and assist in diagnosis and decision-making, including prognostication. AI applications extend to many medical fields, for example, to physical task support systems, robotic prostheses, and assisting mobile manipulators and robots in telemedicine.
In this context, AI has several related ethical aspects that need to be identified and addressed. AI technology risks may compromise patient preferences, safety, and privacy. Ethical guidelines for AI technology are still a gray area [144]. In general, very clear guidelines on what represents “ethical AI” are absent. Moreover, ethical demands and technical regulations are required for this domain.
Another sensitive issue is that AI algorithms and applications are programmed by computer engineers, and this human element, which is subject to bias, can introduce errors that lead to unanticipated results.
A real concern is that AI and big data pose threats to the fundamental human right to privacy, as ML collects and saves data from multiple sources.
Speaking of ethics in relation to AI, we consider that (1) moral behavior of people is necessary when designing, building, and using AI systems, and (2) the machine must behave “ethically” [145].

6.2. Key Ethical Principles

Ethical principles regarding patient healthcare include beneficence, non-maleficence, respect for autonomy, and justice.

6.2.1. Beneficence and Non-Maleficence

The fundamental professional and moral obligation of a physician is Primum non nocere (Latin for “First, do no harm”). Yet, this includes not only medical aspects but also the general well-being of patients. Physicians must maintain, when possible, a high quality of life for their patients by respecting the wishes and values of each patient.

6.2.2. Respect for Autonomy

In terms of AI, the paradigm of autonomy mainly applies to “brain-computer interface” (BCI) or “neural control interface” (NCI) applications, which involve direct communication between the brain and various external devices, where signals from the brain are converted into commands for these external devices.
BCIs are normally used as assistive devices for patients with disabilities because of neuromuscular diseases like stroke, cerebral palsy, or a spinal cord injury. Thus, the main objective is to use practical and effective BCI devices. The reliability of BCI systems should be monitored, and errors must be corrected to obtain a reliability similar to natural muscle-based functions.

6.2.3. Justice

AI development and sharing procedures are linked to justice aspects. In accordance with the main principle of justice, physicians can propose therapeutic treatments and options that benefit their patients without subjecting them to unjustified risks.
For instance, as many BCIs are now available to patients, it is mandatory that patients give informed consent, with knowledge of the device’s design, use, and potential conflicts of interest. The ethical concern in this case is that the literature on BCI treats ‘disability’ as a medical rather than a social problem, and as a result, the perspective of disabled people is likely to be disregarded.
AI applications have a certain probability of replacing humans in some domains and conditions, which may negatively impact human dignity, mainly in jobs where practical ethics are essential, such as those of doctors, nurses, judges, and police officers. Consequently, much care should be taken when designing an AI-powered application. Self-improving AI systems may become so assertive that humans encounter barriers to achieving their intentions, which can lead to unintended outcomes.

6.3. The Black Box Problem: The AI Enigma

The “black box” nature of many AI algorithms, known as opacity, is exemplified by deep neural networks. Given input data, for example a fundus image, a neural network trained on a large dataset can find a pattern in the input and yield an output decision (e.g., a classification of retinopathy), but is unable to conclusively explain how it reached that result.
Moreover, because the neural network continues to learn, when given supplementary data it modifies its decision-making process for a more accurate output, again without explaining how it did so. In the case of possible medical malpractice resulting from AI technology, this opacity can lead to legal problems, as existing tort liability frameworks are insufficient to address medical malpractice resulting from the use of AI as a black box.

6.4. Explainable AI (XAI) in Ophthalmology

The opacity of automatic diagnosis posed by ML or DL systems can be successfully addressed using so-called explainable AI models.
An illustrative example of this concept is given in [127], which presents an explainable AI (XAI) study for the detection of geographic atrophy (GA), an advanced stage of age-related macular degeneration, by means of color retinal photographs. The XAI model provides automated GA screening using easily accessible color retinal images. GA is a major global healthcare concern, as it is a leading cause of blindness; it affects approximately 5 million people, and predictions suggest that this number will reach 10 million cases by 2040. The major difficulty in addressing GA lies in the poor understanding of its etiology and pathogenesis.
Due to its high performance and explainability, the model shown in [127] can aid the clinical validation of AI for the diagnosis of GA (Figure 8). Through the early detection of GA, this AI method can improve patient access to effective treatments, increasing the chance of preserving sight.
The model outperforms previous methods based on retinal imaging. Class activation mapping techniques are used to explain the visual decision-making process of the AI model, thereby increasing transparency.
  • Outcomes: A total of 540 color retinal images were collected: 300 were used to train the AI model, 120 for validation, and 120 for testing. In distinguishing GA from healthy eyes, the model demonstrated 100% sensitivity, 97.5% specificity, and 98.4% overall diagnostic accuracy. Other performance values were AUC-ROC = 0.988 and AUC-PR = 0.952, both very good.
When distinguishing GA from other retinal diseases, the DL network retained a diagnostic accuracy of 96.8%, a precision of 90.9%, and a recall of 100%, resulting in an F1 score of 0.952. The AUC-ROC and AUC-PR were 0.975 and 0.909, respectively. (The F1 score is the harmonic mean of precision and recall.)
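These figures are internally consistent and can be checked from the basic definitions: the stated precision (90.9%) and recall (100%) indeed give the stated F1 of 0.952. A short sketch, where the confusion-matrix counts are hypothetical and chosen only to reproduce the published precision and recall:

```python
def metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                              # = sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return precision, recall, specificity, accuracy, f1

# Hypothetical counts: every GA eye detected (no false negatives),
# with 3 false alarms among the other retinal diseases.
p, r, spec, acc, f1 = metrics(tp=30, fp=3, tn=87, fn=0)
print(f"precision={p:.3f}  recall={r:.3f}  F1={f1:.3f}")
```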
Conclusions
  • The explainable AI model provides automatic GA screening, with very good results, using easily accessible retinal imaging.
  • Because of high performance and explainability, the above architecture can aid clinical validation and propose AI acceptance for GA diagnosis.
  • Due to optimization, early detection of GA can occur, and this method with AI can increase patient access to innovative, possibly vision-preserving treatments.

7. General Conclusions

The incorporation of artificial intelligence (AI) in ophthalmology has accelerated the development of this medical discipline, creating opportunities for improving diagnostic precision, patient healthcare, and treatment results. The goal of this paper, written mainly for ophthalmologists, is to offer a fundamental understanding of AI systems in ophthalmology, with an outline of current research on AI-powered diagnosis. The essence of our endeavor is to present various AI paradigms, comprising deep learning (DL) frameworks for identifying and quantifying ophthalmic features in imaging data, as well as transfer learning for efficient network training with rather small datasets. This paper emphasizes the significance of high-quality, diverse databases for training AI systems and the necessity for transparent description of methodologies to guarantee reproducibility and reliability in AI research. Moreover, we discuss the clinical implications of AI diagnosis, highlighting the balance between decreasing false negatives to avoid missed diagnoses and minimizing false positives to avoid unnecessary medical interventions. The paper also presents ethical issues and potential biases in AI models, as well as the importance of continuous monitoring and improvement of AI systems in clinical environments. As a corollary, this review serves as a guide for ophthalmologists who want to understand the basics of AI in their specialty, leading them through important issues, explaining how AI works, and outlining practical considerations for incorporating AI into clinical procedures.
AI has great potential to improve healthcare by augmenting ophthalmologists’ work, allowing existing physicians to consult more patients, improving patient outcomes, and decreasing health disparities. In ophthalmology, AI systems have demonstrated performance similar to or even better than that of the best ophthalmologists in activities such as diabetic retinopathy identification and grading. Yet, despite these very good results, few AI systems have been implemented in clinical environments, which limits real-world evidence of their efficiency. This analysis surveys the main existing AI systems in ophthalmology, shows the challenges that need to be resolved before their clinical usage, and discusses approaches that may pave the way for translating these systems into the clinical environment.
Yet, AI applications in medicine and ophthalmology suffer from some disadvantages that must be permanently tackled.
For instance, AI bias may occur when training datasets are unbalanced (e.g., insufficient or no data for certain subpopulations), such as a paucity of diabetic retinopathy (DR) training images for individuals of a given ethnicity, such as Africans. One way to study and address this problem is to add clinician-annotated labels and to construct an artificial scenario of data imbalance and domain generalization by excluding from training (but not from testing) images of retinas with DR warranting referral (DR-referable) from darker-skinned individuals.
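The splitting strategy just described can be sketched as follows; the dataset, field names, and group labels are hypothetical, used only to show the mechanics of withholding a subgroup’s positive examples from training while keeping them in the test set:

```python
import numpy as np

def domain_shift_split(records, rng, test_frac=0.3):
    """Random train/test split, then purge a chosen subgroup's positive
    (DR-referable) examples from TRAINING only, so the test set still
    measures performance on the under-represented group."""
    idx = rng.permutation(len(records))
    cut = int(len(records) * (1 - test_frac))
    train = [records[i] for i in idx[:cut]]
    test = [records[i] for i in idx[cut:]]
    train = [r for r in train
             if not (r["group"] == "darker_skin" and r["dr_referable"])]
    return train, test

# Hypothetical dataset of labeled fundus records.
rng = np.random.default_rng(0)
records = [{"group": rng.choice(["darker_skin", "lighter_skin"]),
            "dr_referable": bool(rng.integers(0, 2))} for _ in range(1000)]
train, test = domain_shift_split(records, rng)
print(len(train), len(test))
```

Evaluating a model trained on such a split against the untouched test set quantifies how badly performance degrades for the withheld subgroup.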
Data privacy poses certain dangers because one of the main limitations of machine learning and deep learning approaches is their requirement for large datasets for development and testing. Compared to other medical specialties, ophthalmology has benefited from the widespread availability of large imaging datasets. Although the availability of anonymized datasets has been a benefit for technological advancement, it also represents a significant risk. Breaches of patient privacy can cause major harm and can also have unintended consequences. These could potentially impact one’s employment or insurance coverage and may even allow computer hackers to obtain Social Security numbers and personal financial information.
Removal of all potentially identifiable information from large datasets can be a very difficult task, and this is not an issue unique to ophthalmology. As examples, features from the periocular region have been used to identify the age of patients using machine learning algorithms. Gender, age, and cardiovascular risk factors have been identified from fundus photographs. https://pmc.ncbi.nlm.nih.gov/articles/PMC7424948/ (accessed on 4 November 2024). Even for datasets not involving medical images, it may be possible to identify individuals by linkage with other datasets. This is particularly the case as patient information generally accumulates over time [146].
As for future directions of AI in ophthalmology, the convergence of AI and generative modeling techniques has unlocked new possibilities in ophthalmic research and clinical practice. Generative AI (GAI) and Explainable AI (XAI) will have profound implications for ophthalmology, and their applications, challenges, and prospects merit exploration. Generative AI, encompassing techniques such as generative adversarial networks (GANs) and variational autoencoders (VAEs), has revolutionized the synthesis and manipulation of ophthalmic images.
Moreover, GAI holds promise in personalized and precision medicine, as well as in treatment optimization in ophthalmology. GAI and XAI will enable clinicians to better tailor treatment plans and predict treatment responses with greater accuracy by generating patient-specific simulations and predictive models. Furthermore, GAI facilitates data augmentation and domain adaptation, addressing challenges related to dataset scarcity and distributional shifts in ophthalmic imaging. By using diverse and representative datasets, GAI will enhance the robustness and generalizability of AI algorithms, enabling seamless deployment across different clinical settings and populations.
Also, ethical considerations related to the generation and use of synthetic data, as well as the potential for bias and adversarial attacks, must be carefully addressed. Additionally, ensuring the reliability and interpretability of GAI and XAI outputs is crucial for fostering trust and acceptance among clinicians, decisional bodies, and patients. Through interdisciplinary collaboration and knowledge exchange between physicians and technical staff, the field of ophthalmology will advance in order to improve patient care and visual health outcomes on a global scale.

Author Contributions

All authors contributed to this work. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A

Definitions and Acronyms of AI Terms
Accuracy is a scoring system in binary classification (i.e., determining if an answer or output is correct or not) and is calculated as (True Positives + True Negatives)/(True Positives + True Negatives + False Positives + False Negatives).
AI ethics deals with the problems that AI stakeholders, such as program developers and healthcare officials, have to consider to make sure that the technology is deployed and used in a trustworthy manner. This means implementing applications that contribute to a safe, secure, unbiased, and environmentally friendly approach to AI.
AMD. Age-Related Macular Degeneration
ANN. Artificial Neural Network.
Annotation. The process of tagging language data by identifying and flagging grammatical, semantic, or phonetic elements in language data.
Application programming interface (API). An API is a set of rules that decide how two computer programs will interact with each other. APIs are inclined to be written in programming languages like C++ or JavaScript.
Bias. Assumptions made by a model that simplify the process of learning to perform its assigned task. Most supervised machine learning models perform better with low bias, as these assumptions can negatively affect results.
Big Data. Big Data is a term popularized by John Mashey, used to describe large datasets that are difficult to process using traditional database and software techniques.
Blockchain. An evolving list of records linked by means of cryptography via a peer-to-peer network. It works like an open, distributed register that can record transactions between two entities efficiently in a verifiable and permanent manner.
Chatbot. A chatbot (or simply bot) is a computer program that is intended to emulate human conversation through text or voice tools.
Cognitive computing. Cognitive computing is essentially the same as AI. It is a computerized environment that concentrates on emulating human thought processes such as pattern recognition and learning.
Data analytics. Data analytics is known as a set of methods for analyzing raw data to extract conclusions about that data. These techniques can disclose tendencies and metrics that would otherwise be lost in an ocean of information.
Data mining. The procedure of discovering patterns in large databases using techniques at the union of ML, statistics, and database systems.
Data science. Data science is an interdisciplinary field of technology that uses algorithms and processes to gather and analyze large amounts of data to uncover patterns and insights that inform business decisions.
Deep learning. Deep learning is a branch of AI that imitates the way the human brain structures and processes information to make decisions. Instead of relying on an algorithm that can only perform one specific task, this subset of machine learning can learn from unstructured data without supervision.
Explainable AI/Explainability. An AI approach in which the performance of its methods can be trusted and easily understood by humans. Unlike black-box AI, this approach reaches a decision, and the reasoning can be seen behind its outcomes.
F-score (F-measure, F1 measure). An F-score is the harmonic mean of a system’s precision and recall values. It can be calculated using the following formula: 2 × [(Precision × Recall)/(Precision + Recall)]. A criticism of using the F-score to judge a predictive system is that a moderately high F-score can result from a disparity between precision and recall and therefore may not tell the whole story. Well-designed systems therefore aim to improve precision or recall without degrading the other.
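The criticism above can be made concrete in a few lines: a seemingly respectable F-score can mask a strong precision/recall imbalance (the values below are illustrative):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

balanced = f1(0.80, 0.80)   # precision and recall agree
skewed = f1(0.95, 0.60)     # high precision masks poor recall
print(f"balanced={balanced:.3f}  skewed={skewed:.3f}")
```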
FDA. Food and Drug Administration.
Generative AI. A type of technology that creates content using AI, including text, images, video, and code. A generative AI application is trained using large amounts of data to find patterns for creating new content.
Hyperparameter. Occasionally used interchangeably with parameter, although the terms have some subtle differences. Hyperparameters are values that affect the way your model learns. They are usually set manually outside the model.
Image recognition. Image recognition is the process of identifying an object, person, place, or text in an image or video.
ICU. Intensive care unit(s).
Inference Engine. A module of an [expert] system that uses logical rules for the knowledge base to find new or extra information.
Large Language Models (LLMs). Deep learning models trained on vast amounts of text data to understand and generate human language. LLMs underpin applications such as chatbots, machine translation, and text summarization.
Machine learning. A branch of AI that includes issues of computer science, mathematics, and informatics. Machine learning concentrates on deploying algorithms and models that help computers learn from data and foresee trends and behaviors without human intervention.
Metadata. Data that explains or yields information about other processed data.
Natural language processing (NLP). NLP is a branch of AI that permits computing machines to understand spoken and written human language. NLP allows features like text and speech recognition on certain computing devices.
Neural network. A neural network is a machine learning or a deep learning software tool implemented to emulate the human brain’s basic structure. It needs large datasets to execute computations and create outcomes, which permits actions such as speech or image recognition.
OCT. Optical Coherence Tomography.
Ontology. Similar to a taxonomy, but it enriches the simple tree-like classification by adding properties to each node and links between nodes that can extend to other branches. This structure is neither standard nor limited to a predefined set; consequently, it must be agreed upon by the classifying program and the user.
Overfitting. Overfitting takes place in machine learning training when the algorithm can only run on certain samples within the training data. An archetypal AI system should be capable of generalizing patterns in the input data to address new tasks.
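Overfitting can be demonstrated in a few lines: a model with enough degrees of freedom to pass through every noisy training point fits the training set almost perfectly but generalizes worse than a simpler model. A pure-NumPy sketch with polynomial regression (all data synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
truth = lambda x: np.sin(2 * np.pi * x)
x_train = np.linspace(0, 1, 12)
x_test = np.linspace(0.02, 0.98, 50)
y_train = truth(x_train) + rng.normal(0, 0.2, x_train.size)
y_test = truth(x_test) + rng.normal(0, 0.2, x_test.size)

def errors(degree):
    """Fit a polynomial of the given degree on the training points;
    return (train MSE, test MSE)."""
    coef = np.polyfit(x_train, y_train, degree)
    mse = lambda x, y: float(np.mean((np.polyval(coef, x) - y) ** 2))
    return mse(x_train, y_train), mse(x_test, y_test)

tr3, te3 = errors(3)     # modest capacity: similar train and test error
tr11, te11 = errors(11)  # interpolates all 12 points: tiny train error
print(f"deg 3:  train={tr3:.4f} test={te3:.4f}")
print(f"deg 11: train={tr11:.4f} test={te11:.4f}")
```

The growing gap between training and test error as capacity increases is the signature of overfitting; regularization, more data, or early stopping are the usual remedies.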
Pattern recognition. It is the approach of using algorithms to analyze, detect, and label certain regular structures in data, thus showing how the data is categorized into different classes.
Precision. In pattern recognition, information retrieval, and classification (machine learning), precision (also called positive predictive value) is the fraction of relevant instances among the retrieved instances.
Predictive analytics. Predictive analytics is a type of software that predicts what will occur in a specific time slot based on precedent data and patterns.
Pretrained model. It is a model trained to achieve a relevant task for different organizations or contexts. Such a pretrained model can be used to create a fine-tuned, contextualized version of that model, by means of applying a transfer learning approach.
Quantum computing. The use of quantum-mechanical principles such as entanglement and superposition to perform calculations. Quantum machine learning could accelerate such work if quantum computers execute certain computations much faster than traditional methods.
Random Forest. A supervised ML algorithm that evolves and comprises multiple decision trees to generate a “forest”. It is used for both classification and regression tasks and is usually written in R or Python programming languages.
Recall. Given a set of outcomes from a processed database, recall is the percentage value that shows how many correct results have been retrieved based on the expectations of the application. This measure can be applied to any category of predictive AI systems such as search, categorization, and entity recognition. In pattern recognition, information retrieval, and classification (machine learning), recall (also known as sensitivity) is the fraction of relevant instances that were retrieved.
Recommendation system. A recommender system or a recommendation system (sometimes also called a recommendation platform or engine) is a subclass of information-filtering systems that seeks to predict the “rating” or “preference” a user would give to an item.
Recurrent Neural Networks (RNN). It is a neural network architecture commonly used in natural language processing and speech recognition, enabling previous outputs to be used as inputs for the current calculations.
Reinforcement learning. A type of machine learning in which a software program learns by interaction with its surroundings, being either rewarded or penalized based on its activity.
Responsible AI. A broad term that comprises the business and ethical choices related to how organizations use and develop AI capacities. Generally, Responsible AI seeks to enable the Transparent (to see how an AI system works), Explainable (to explain why a specific decision in an AI system was taken), Fair (to ensure that a specific group is not discriminated against by an AI model’s decisions), and Sustainable (to develop and curate AI models in an environmentally sustainable way) use of AI.
Sentiment analysis. Synonymous with opinion mining, sentiment analysis is the activity of analyzing the mood and opinion of a given text or video by means of AI. Sentiment is commonly measured on a linear scale (negative, neutral, or positive), but advanced implementations can categorize input data in terms of emotions, moods, and feelings.
Structured data. Defined and searchable data. It comprises different data, such as phone numbers, dates, or product stock-keeping units.
Supervised learning. A type of machine learning in which labeled data are used to train the algorithm so that it learns to map inputs to the correct outputs. It is more common than unsupervised learning.
SVM. Support Vector Machine, a supervised ML method that separates classes with a maximum-margin boundary.
Token. A token is a basic unit of text that a Large Language Model uses to understand and generate language. A token may be a whole word or fragments of a word.
Training data. Training data comprise the information or samples offered to an AI application to allow it to learn, discover patterns, or generate new content.
Transfer learning. A machine learning method that uses previously learned data and applies it to new tasks and operations.
Unstructured data. Undefined and difficult-to-search data. This comprises audio, still images, and video content. Most existing data is unstructured.
Unsupervised learning. A type of machine learning in which a network is trained on unlabeled data, discovering patterns or structure in the data without supervision.
Validation data. Structured like training data with input and labels, this data is used to test a recently trained model against new data and to analyze performance, with a particular focus on checking for overfitting.
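As a minimal illustration of holding out validation data, a dataset can be shuffled and split before training; a sketch in plain Python (the function name and 80/20 ratio are illustrative, not prescriptive):

```python
import random

def train_val_split(samples, val_fraction=0.2, seed=42):
    """Shuffle and hold out a validation set for monitoring overfitting."""
    rng = random.Random(seed)                  # fixed seed for reproducibility
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)  # size of the held-out set
    return shuffled[n_val:], shuffled[:n_val]  # (training set, validation set)

train, val = train_val_split(range(100))       # 80 training, 20 validation samples
```

In medical imaging work, such splits are usually made at the patient level rather than the image level, so that images of the same eye cannot leak between the training and validation sets.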
Voice recognition. Also known as speech recognition, it is a procedure of human–computer interaction in which computing programs listen to and interpret human speech and provide written or spoken outcomes. Examples include Apple’s Siri and Amazon’s Alexa, applications that allow hands-free requests and tasks.
Weak AI. Also called narrow AI, this is a model that has a set range of skills and focuses on one particular set of tasks. Most AI currently in use is weak AI, unable to learn or perform tasks outside of its specialized skill set.

Appendix B

  • Area under the ROC curve (AUC)
Figure A1. AUC (area under the ROC curve).
  • AUC is widely used to measure the accuracy of diagnostic tests. The closer the ROC (Receiver Operating Characteristic) curve is to the upper-left corner of the graph, the higher the accuracy of the test, because in the upper left corner, sensitivity = 1, and false positive rate = 0 (specificity = 1).
  • Area under the curve (AUC) is the measure of a binary classifier’s ability to distinguish between classes and is used as a summary of the ROC curve. The higher the AUC, the better the performance of the model in distinguishing between positive and negative classes.
  • In general, an AUC of 0.5 suggests no discrimination (e.g., no ability to distinguish patients with a disease or condition from those without it based on the test), 0.7 to 0.8 is considered acceptable, 0.8 to 0.9 is considered excellent, and greater than 0.9 is considered outstanding.
  • Cohen’s Kappa score (κ) is a statistical measure used to quantify the level of agreement between two raters (or judges, observers, etc.) who each classify items into categories. It is especially useful in situations where decisions are subjective and the categories are nominal (i.e., they do not have a natural order).
If the two raters each classify N items into C mutually exclusive categories, the formula for κ is
κ = (po − pe)/(1 − pe) = 1 − (1 − po)/(1 − pe),
where po is the relative observed agreement among raters, and pe is the hypothetical probability of chance agreement, using the observed data to calculate the probabilities of each observer randomly selecting each category. If the raters are in complete agreement, then κ = 1. If there is no agreement among the raters other than what would be expected by chance (as given by pe), then κ = 0.
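Both metrics above can be computed directly from their definitions. A minimal sketch in plain Python (the toy grader labels and classifier scores are hypothetical), using the rank-based interpretation of AUC as the probability that a random positive is scored above a random negative:

```python
from collections import Counter
from itertools import product

def cohen_kappa(rater1, rater2):
    """Cohen's kappa: agreement between two raters beyond chance."""
    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n  # observed agreement
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[c] * c2[c] for c in c1) / (n * n)         # chance agreement
    return (p_o - p_e) / (1 - p_e)

def auc(labels, scores):
    """AUC = probability a random positive is ranked above a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p, q in product(pos, neg))              # ties count as half
    return wins / (len(pos) * len(neg))

# Two graders labeling four fundus images ("DR" vs. "no DR")
k = cohen_kappa(["DR", "DR", "no", "no"], ["DR", "no", "no", "no"])  # 0.5
# Classifier scores against ground-truth binary labels
a = auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.2])                          # 0.75
```

An AUC of 1.0 corresponds to the upper-left corner of the ROC plot described above (sensitivity = 1 at a false positive rate of 0), while 0.5 corresponds to the diagonal of no discrimination.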

References

  1. Available online: https://pubmed.ncbi.nlm.nih.gov/?term=artificial+intelligence&filter=years.2018-2024&Timeline=expanded (accessed on 4 November 2024).
  2. Hype Cycle Research Methodology. Available online: https://www.gartner.com/en/research/methodologies/gartner-hype-cycle (accessed on 4 November 2024).
  3. Sevakula, R.K.; Au-Yeung, W.M.; Singh, J.P.; Heist, E.K.; Isselbacher, E.M.; Armoundas, A.A. State-of-the-Art Machine Learning Techniques Aiming to Improve Patient Outcomes Pertaining to the Cardiovascular System. J. Am. Heart Assoc. 2020, 9, e013924. [Google Scholar] [CrossRef] [PubMed]
  4. Kang, D.; Wu, H.; Yuan, L.; Shi, Y.; Jin, K.; Grzybowski, A. A beginner’s guide to artificial intelligence for ophthalmologists. Ophthalmol. Ther. 2024, 13, 1841–1855. [Google Scholar] [CrossRef]
  5. Jaeschke, R.; Guyatt, G.H.; Sackett, D.L. Users’ guides to the medical literature: III. How to use an article about a diagnostic test B. what are the results and will they help me in caring for my patients? JAMA 1994, 271, 703–707. [Google Scholar] [CrossRef] [PubMed]
  6. Yu, M.; Tham, Y.-C.; Rim, T.H.; Ting, D.S.W.; Wong, T.Y.; Cheng, C.-Y. Reporting on deep learning algorithms in health care. Lancet Digit. Health 2019, 1, e328–e329. [Google Scholar] [CrossRef] [PubMed]
  7. De Fauw, J.; Ledsam, J.R.; Romera-Paredes, B.; Nikolov, S.; Tomasev, N.; Blackwell, S.; Askham, H.; Glorot, X.; O’Donoghue, B.; Visentin, D.; et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat. Med. 2018, 24, 1342–1350. [Google Scholar] [CrossRef] [PubMed]
  8. Moraes, G.; Fu, D.J.; Wilson, M.; Khalid, H.; Wagner, S.K.; Korot, E.; Ferraz, D.; Faes, L.; Kelly, C.J.; Spitz, T.; et al. Quantitative analysis of OCT for neovascular age-related macular degeneration using deep learning. Ophthalmology 2021, 128, 693–705. [Google Scholar] [CrossRef]
  9. Cabitza, F.; Rasoini, R.; Gensini, G.F. Unintended consequences of machine learning in medicine. JAMA 2017, 318, 517–518. [Google Scholar] [CrossRef]
  10. Raví, D.; Wong, C.; Deligianni, F.; Berthelot, M.; Andreu-Perez, J.; Lo, B.; Yang, G.Z. Deep learning for health informatics. IEEE J. Biomed. Health Inform. 2017, 21, 4–21. [Google Scholar] [CrossRef]
  11. Ichhpujani, P.; Thakur, S. (Eds.) Artificial Intelligence and Ophthalmology. Perks, Perils and Pitfalls; Springer: Berlin/Heidelberg, Germany, 2021. [Google Scholar] [CrossRef]
  12. Gurudath, N.; Celenk, M.; Riley, H.B. Machine Learning Identification of Diabetic Retinopathy from Fundus Images. In Proceedings of the IEEE Signal Processing in Medicine and Biology Symposium (SPMB), Philadelphia, PA, USA, 13 December 2014; pp. 1–7. [Google Scholar]
  13. Decencière, E.; Cazuguel, G.; Zhang, X.; Thibault, G.; Klein, J.-C.; Meyer, F.; Marcotegui, B.; Quellec, G.; Lamard, M.; Danno, R.; et al. TeleOphta: Machine learning and image processing methods for teleophthalmology. IRBM 2013, 34, 196–203. [Google Scholar] [CrossRef]
  14. Vandarkuhali, T.; Ravichandran, D.R.C.S. ELM based detection of abnormality in retinal image of eye due to diabetic retinopathy. J. Theor. Appl. Inf. Technol. 2014, 6, 423–428. [Google Scholar]
  15. Priyadarshini, R.; Dash, N.; Mishra, R. A Novel Approach to Predict Diabetes Mellitus Using Modified Extreme Learning Machine. In Proceedings of the Electronics and Communication Systems (ICECS), Coimbatore, India, 13–14 February 2014; pp. 1–5. [Google Scholar]
  16. Bietti, A.; Mairal, J. Group invariance, stability to deformations, and complexity of deep convolutional representations. J. Mach. Learn. Res. 2019, 20, 1–49. [Google Scholar]
  17. Claro, M.; Santos, L.; Silva, W.; Araújo, F.; Moura, N.; Macedo, A. Automatic glaucoma detection based on optic disc segmentation and texture feature extraction. CLEI Electron. J. 2016, 19, 1–10. [Google Scholar] [CrossRef]
  18. Litjens, G.; Kooi, T.; Ehteshami Bejnordi, B.; Adiyoso Setio, A.A.; Ciompi, F.; Ghafoorian, M.; van der Laak, J.A.W.M.; van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88. [Google Scholar] [CrossRef] [PubMed]
  19. Perdomo, O.; González, F. A systematic review of deep learning methods applied to ocular images. Cienc. Ing. Neogranadina 2019, 30, 9–26. [Google Scholar] [CrossRef]
  20. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; FeiFei, L. ImageNet: A Large-Scale Hierarchical Image Database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009. [Google Scholar]
  21. Toledo-Cortés, S.; de la Pava, M.; Perdomo, O.; González, F.A. Hybrid Deep Learning Gaussian Process for Diabetic Retinopathy Diagnosis and Uncertainty Quantification. In Proceedings of the 7th International Workshop, OMIA 2020, Lima, Peru, 8 October 2020. [Google Scholar]
  22. Müller, H.; Unay, D. Medical decision support using increasingly large multimodal data sets. In Big Data Analytics for Large-Scale Multimedia Search; Wiley: Hoboken, NJ, USA, 2019; pp. 317–336. [Google Scholar]
  23. Zhou, S.K.; Greenspan, H.; Shen, D. Deep Learning for Medical Image Analysis; Springer: Berlin/Heidelberg, Germany, 2017; pp. 1–433. [Google Scholar]
  24. Voets, M.; Møllersen, K.; Bongo, L.A. Reproduction study using public data of: Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. PLoS ONE 2019, 14, e0217541. [Google Scholar] [CrossRef] [PubMed]
  25. Kumar, S.A.; Satheesh Kumar, J. A Review On Recent Developments for the Retinal Vessel Segmentation Methodologies and Exudate Detection in Fundus Images Using Deep Learning Algorithms. In Computational Vision and Bio-Inspired Computing; Smys, S., Tavares, J., Balas, V.E., Iliyasu, A.M., Eds.; Springer: Cham, Switzerland, 2020; pp. 1363–1370. [Google Scholar]
  26. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for Biomedical Image Segmentation. In Proceedings of the MICCAI 2015—18th International Conference, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  27. Hesamian, M.H.; Jia, W.; He, X.; Kennedy, P. Deep learning techniques for medical image segmentation: Achievements and challenges. J. Digit. Imaging 2019, 32, 582–596. [Google Scholar] [CrossRef] [PubMed]
  28. Andrearczyk, V.; Müller, H. Deep Multimodal Classification of Image Types in Biomedical Journal Figures. In Proceedings of the 9th International Conference of the CLEF Association, CLEF 2018, Avignon, France, 10–14 September 2018; pp. 3–14. [Google Scholar]
  29. Yoo, T.K.; Choi, J.Y.; Seo, J.G.; Ramasubramanian, B.; Selvaperumal, S.; Kim, D.W. The possibility of the combination of OCT and fundus images for improving the diagnostic accuracy of deep learning for age-related macular degeneration: A preliminary experiment. Med. Biol. Eng. Comput. 2019, 57, 677–687. [Google Scholar]
  30. Golabbakhsh, M.; Rabbani, H. Vessel-based registration of fundus and optical coherence tomography projection images of retina using a quadratic registration model. IET Image Process 2013, 7, 768–776. [Google Scholar] [CrossRef]
  31. Schlegl, T.; Waldstein, S.; Vogl, W.D.; Schmidt Erfurth, U.; Langs, G. Predicting semantic descriptions from medical images with convolutional neural networks. Inf. Process. Med. Imaging 2015, 9123, 733–745. [Google Scholar]
  32. Perdomo, O.J.; Arevalo, J.; González, F.A. Combining morphometric features and convolutional networks fusion for glaucoma diagnosis. In Proceedings of the 13th International Symposium on Medical Information Processing and Analysis, San Andres Island, Colombia, 5–7 October 2017. [Google Scholar] [CrossRef]
  33. Lopes, B.T.; Ramos, I.C.; Salomão, M.Q.; Guerra, F.P.; Schallhorn, S.C.; Schallhorn, J.M.; Vinciguerra, R.; Vinciguerra, P.; Price, F.W., Jr.; Price, M.O.; et al. Enhanced tomographic assessment to detect corneal ectasia based on artificial intelligence. Am. J. Ophthalmol. 2018, 195, 223–232. [Google Scholar] [CrossRef]
  34. Yoo, T.K.; Ryu, I.H.; Lee, G.; Kim, Y.; Kim, J.K.; Lee, I.S.; Kim, J.S.; Rim, T.H. Adopting machine learning to automatically identify candidate patients for corneal refractive surgery. NPJ Digit. Med. 2019, 2, 59. [Google Scholar] [CrossRef] [PubMed]
  35. Lin, S.R.; Ladas, J.G.; Bahadur, G.G.; Al-Hashimi, S.; Pineda, R. A review of machine learning techniques for keratoconus detection and refractive surgery screening. Semin. Ophthalmol. 2019, 34, 317–326. [Google Scholar] [CrossRef] [PubMed]
  36. Kamiya, K.; Ayatsuka, Y.; Kato, Y.; Fujimura, F.; Takahashi, M.; Shoji, N.; Mori, Y.; Miyata, K. Keratoconus detection using deep learning of colour-coded maps with anterior segment optical coherence tomography: A diagnostic accuracy study. BMJ Open 2019, 9, e031313. [Google Scholar] [CrossRef] [PubMed]
  37. Valdés-Mas, M.A.; Martín-Guerrero, J.D.; Rupérez, M.J.; Pastor, F.; Dualde, C.; Monserrat, C.; Peris-Martínez, C. A new approach based on machine learning for predicting corneal curvature (K1) and astigmatism in patients with keratoconus after intracorneal ring implantation. Comput. Methods Programs Biomed. 2014, 116, 39–47. [Google Scholar] [CrossRef]
  38. Yousefi, S.; Takahashi, H.; Hayashi, T.; Tampo, H.; Inoda, S.; Arai, Y.; Tabuchi, H.; Asbell, P. Predicting the likelihood of need for future keratoplasty intervention using artificial intelligence. Ocul. Surf. 2020, 18, 320–325. [Google Scholar] [CrossRef]
  39. Yau, J.W.; Rogers, S.L.; Kawasaki, R.; Lamoureux, E.L.; Kowalski, J.W.; Bek, T.; Chen, S.J.; Dekker, J.M.; Fletcher, A.; Grauslund, J.; et al. Global prevalence and major risk factors of diabetic retinopathy. Diabetes Care 2012, 35, 556–564. [Google Scholar] [CrossRef] [PubMed]
  40. ElTanboly, A.; Ismail, M.; Shalaby, A.; Switala, A.; El-Baz, A.; Schaal, S.; Gimel’farb, G.; El-Azab, M. A computer-aided diagnostic system for detecting diabetic retinopathy in optical coherence tomography images. Med. Phys. 2017, 44, 914–923. [Google Scholar] [CrossRef] [PubMed]
  41. Faes, L.; Bodmer, N.S.; Locher, S.; Keane, P.A.; Balaskas, K.; Bachmann, L.M.; Schlingemann, R.O.; Schmid, M.K. Test performance of optical coherence tomography angiography in detecting retinal diseases: A systematic review and meta-analysis. Eye 2019, 33, 1327–1338. [Google Scholar] [CrossRef] [PubMed]
  42. Food and Drug Administration. FDA Permits Marketing of Artificial Intelligence-Based Device to Detect Certain Diabetes-Related Eye Problems. 2018. Available online: https://www.fda.gov/news-events/press-announcements/fda-permits-marketing-artificial-intelligence-based-device-detect-certain-diabetes-related-eye (accessed on 6 November 2024).
  43. Zhang, H.; Chen, Z.; Chi, Z.; Fu, H. Hierarchical local binary pattern for branch retinal vein occlusion recognition with fluorescein angiography images. Electron. Lett. 2014, 50, 1902–1904. [Google Scholar] [CrossRef]
  44. Nagasato, D.; Tabuchi, H.; Masumoto, H.; Enno, H.; Ishitobi, N.; Kameoka, M.; Niki, M.; Mitamura, Y. Automated detection of a nonperfusion area caused by retinal vein occlusion in optical coherence tomography angiography images using deep learning. PLoS ONE 2019, 14, e0223965. [Google Scholar] [CrossRef]
  45. Waldstein, S.M.; Montuoro, A.; Podkowinski, D.; Philip, A.M.; Gerendas, B.S.; Bogunovic, H.; Schmidt-Erfurth, U. Evaluating the impact of vitreomacular adhesion on anti-VEGF therapy for retinal vein occlusion using machine learning. Sci. Rep. 2017, 7, 2928. [Google Scholar] [CrossRef] [PubMed]
  46. Brown, J.M.; Campbell, J.P.; Beers, A.; Chang, K.; Ostmo, S.; Chan, R.V.P.; Dy, J.; Erdogmus, D.; Ioannidis, S.; Kalpathy-Cramer, J.; et al. Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks. JAMA Ophthalmol. 2018, 136, 803–810. [Google Scholar] [CrossRef] [PubMed]
  47. Lee, C.S.; Baughman, D.M.; Lee, A.Y. Deep learning is effective for classifying normal versus age-related macular degeneration OCT images. Ophthalmol. Retin. 2017, 1, 322–327. [Google Scholar] [CrossRef] [PubMed]
  48. Burlina, P.M.; Joshi, N.; Pacheco, K.D.; Freund, D.E.; Kong, J.; Bressler, N.M. Use of deep learning for detailed severity characterization and estimation of 5-year risk among patients with age-related macular degeneration. JAMA Ophthalmol. 2018, 136, 1359–1366. [Google Scholar] [CrossRef]
  49. Grassmann, F.; Mengelkamp, J.; Brandl, C.; Harsch, S.; Zimmermann, M.E.; Linkohr, B.; Peters, A.; Heid, I.M.; Palm, C.; Weber, B.H.F. A deep learning algorithm for prediction of age-related eye disease study severity scale for age-related macular degeneration from color fundus photography. Ophthalmology 2018, 125, 1410–1420. [Google Scholar] [CrossRef] [PubMed]
  50. Peng, Y.; Dharssi, S.; Chen, Q.; Keenan, T.D.; Agrón, E.; Wong, W.T.; Chew, E.Y.; Lu, Z. DeepSeeNet: A deep learning model for automated classification of patient-based age-related macular degeneration severity from color fundus photographs. Ophthalmology 2019, 126, 565–575. [Google Scholar] [CrossRef]
  51. Stuart, A. Harnessing AI for Glaucoma. Eyenet Magazine, 1 May 2023; 40–45. [Google Scholar]
  52. Salam, A.A.; Khalil, T.; Akram, M.U.; Jameel, A.; Basit, I. Automated detection of glaucoma using structural and nonstructural features. Springerplus 2016, 5, 1519. [Google Scholar] [CrossRef] [PubMed]
  53. Li, Z.; He, Y.; Keel, S.; Meng, W.; Chang, R.T.; He, M. Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology 2018, 125, 1199–1206. [Google Scholar] [CrossRef] [PubMed]
  54. Ting, D.S.W.; Cheung, C.Y.; Lim, G.; Tan, G.S.W.; Quang, N.D.; Gan, A.; Hamzah, H.; Garcia-Franco, R.; San Yeo, I.Y.; Lee, S.Y.; et al. Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA 2017, 318, 2211–2223. [Google Scholar] [CrossRef] [PubMed]
  55. Ohsugi, H.; Tabuchi, H.; Enno, H.; Ishitobi, N. Accuracy of deep learning, a machine-learning technology, using ultra-wide-field fundus ophthalmoscopy for detecting rhegmatogenous retinal detachment. Sci. Rep. 2017, 7, 9425. [Google Scholar] [CrossRef]
  56. Damato, B.; Eleuteri, A.; Fisher, A.C.; Coupland, S.E.; Taktak, A.F. Artificial neural networks estimating survival probability after treatment of choroidal melanoma. Ophthalmology 2008, 115, 1598–1607. [Google Scholar] [CrossRef] [PubMed]
  57. Nguyen, H.G.; Pica, A.; Hrbacek, J.; Weber, D.C.; La Rosa, F.; Schalenbourg, A.; Sznitman, R.; Bach Cuadra, M. A novel segmentation framework for uveal melanoma based on magnetic resonance imaging and class activation maps. In Proceedings of the 2nd International Conference on Medical Imaging with Deep Learning, London, UK, 8–10 July 2019; Volume 102, pp. 370–379. [Google Scholar]
  58. Sun, M.; Zhou, W.; Qi, X.; Zhang, G.; Girnita, L.; Seregard, S.; Grossniklaus, H.E.; Yao, Z.; Zhou, X.; Stålhammar, G. Prediction of BAP1 expression in uveal melanoma using densely-connected deep classification networks. Cancers 2019, 11, 1579. [Google Scholar] [CrossRef] [PubMed]
  59. Reid, J.E.; Eaton, E. Artificial intelligence for pediatric ophthalmology. Curr. Opin. Ophthalmol. 2019, 30, 337–346. [Google Scholar] [CrossRef]
  60. Ma, M.K.I.; Saha, C.; Poon, S.H.L.; Yiu, R.S.W.; Shih, K.C.; Chan, Y.K. Virtual reality and augmented reality—Emerging screening and diagnostic techniques in ophthalmology: A systematic review. Surv. Ophthalmol. 2022, 67, 1516–1530. [Google Scholar] [CrossRef]
  61. Mohammadi, S.-F.; Sabbaghi, M.; Z-Mehrjardi, H.; Hashemi, H.; Alizadeh, S.; Majdi, M.; Taee, F. Using artificial intelligence to predict the risk for posterior capsule opacification after phacoemulsification. J. Cataract. Refract. Surg. 2012, 38, 403–408. [Google Scholar] [CrossRef] [PubMed]
  62. Li, H.; Lim, J.H.; Liu, J.; Wong, D.W.; Tan, N.M.; Lu, S.; Zhang, Z.; Wong, T.Y. An automatic diagnosis system of nuclear cataract using slit-lamp images. In Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society IEEE Engineering in Medicine and Biology Society Conference, Minneapolis, MN, USA, 3–6 September 2009; pp. 3693–3696. [Google Scholar]
  63. Xu, Y.; Gao, X.; Lin, S.; Wong, D.W.K.; Liu, J.; Xu, D.; Cheng, C.Y.; Cheung, C.Y.; Wong, T.Y. Automatic Grading of Nuclear Cataracts from Slit-Lamp Lens Images Using Group Sparsity Regression; Springer: Berlin/Heidelberg, Germany, 2013. [Google Scholar]
  64. Gao, X.; Lin, S.; Wong, T.Y. Automatic feature learning to grade nuclear cataracts based on deep learning. IEEE Trans. Biomed. Eng. 2015, 62, 2693–2701. [Google Scholar] [CrossRef]
  65. Wu, X.; Huang, Y.; Liu, Z.; Lai, W.; Long, E.; Zhang, K.; Jiang, J.; Lin, D.; Chen, K.; Yu, T.; et al. Universal artificial intelligence platform for collaborative management of cataracts. Br. J. Ophthalmol. 2019, 103, 1553–1560. [Google Scholar] [CrossRef] [PubMed]
  66. Xiao Lian, J.; Gangwani, R.A.; McGhee, S.M.; Chan, C.K.W.; Lam, C.L.K.; Wong, D.S.H. Systematic screening for diabetic retinopathy (DR) in Hong Kong: Prevalence of DR and visual impairment among diabetic population. Br. J. Ophthalmol. 2016, 100, 151–155. [Google Scholar] [CrossRef] [PubMed]
  67. Dong, Y.; Zhang, Q.; Qiao, Z.; Yang, J.-J. Classification of Cataract Fundus Image Based on Deep Learning. In Proceedings of the 2017 IEEE International Conference on Imaging Systems and Techniques (IST), Beijing, China, 18–20 October 2017. [Google Scholar] [CrossRef]
  68. Pratap, T.; Kokil, P. Computer-aided diagnosis of cataract using deep transfer learning. Biomed. Signal Process. Control 2019, 53, 101533. [Google Scholar] [CrossRef]
  69. Hee, M.R. State-of-the-art of intraocular lens power formulas. JAMA Ophthalmol. 2015, 133, 1436–1437. [Google Scholar] [CrossRef] [PubMed]
  70. Melles, R.B.; Kane, J.X.; Olsen, T.; Chang, W.J. Update on intraocular lens calculation formulas. Ophthalmology 2019, 126, 1334–1335. [Google Scholar] [CrossRef] [PubMed]
  71. Nemeth, G.; Modis, L., Jr. Accuracy of the Hill-radial basis function method and the Barrett Universal II formula. Eur. J. Ophthalmol. 2020, 31, 566–571. [Google Scholar] [CrossRef]
  72. Long, E.; Chen, J.; Wu, X.; Liu, Z.; Wang, L.; Jiang, J.; Li, W.; Zhu, Y.; Chen, C.; Lin, Z.; et al. Artificial intelligence manages congenital cataract with individualized prediction and telehealth computing. NPJ Digit. Med. 2020, 3, 112. [Google Scholar] [CrossRef] [PubMed]
  73. Gulshan, V.; Peng, L.; Coram, M.; Stumpe, M.C.; Wu, D.; Narayanaswamy, A.; Venugopalan, S.; Widner, K.; Madams, T.; Cuadros, J.; et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 2016, 316, 2402–2410. [Google Scholar] [CrossRef] [PubMed]
  74. Devalla, S.K.; Liang, Z.; Pham, T.H.; Boote, C.; Strouthidis, N.G.; Thiery, A.H.; Girard, M.J.A. Glaucoma management in the era of artificial intelligence. Br. J. Ophthalmol. 2020, 104, 301–311. [Google Scholar] [CrossRef] [PubMed]
  75. Liu, S.; Graham, S.L.; Schulz, A.; Kalloniatis, M.; Zangerl, B.; Cai, W.; Gao, Y.; Chua, B.; Arvind, H.; Grigg, J.; et al. A deep learning-based algorithm identifies glaucomatous discs using monoscopic fundus photographs. Ophthalmol. Glaucoma 2018, 1, 15–22. [Google Scholar] [CrossRef] [PubMed]
  76. Fu, H.; Cheng, J.; Xu, Y.; Wong, D.W.K.; Liu, J.; Cao, X. Joint optic disc and cup segmentation based on multi-label deep network and polar transformation. IEEE Trans. Med. Imaging 2018, 37, 1597–1605. [Google Scholar] [CrossRef] [PubMed]
  77. Alawad, M.; Aljouie, A.; Alamri, S.; Alghamdi, M.; Alabdulkader, B.; Alkanhal, N.; Almazroa, A. Machine Learning and Deep Learning Techniques for Optic Disc and Cup Segmentation—A Review. Clin. Ophthalmol. 2022, 16, 747–764. [Google Scholar] [CrossRef]
  78. Orlando, J.I.; Fu, H.; Breda, J.B.; van Keer, K.; Bathula, D.R.; Diaz-Pinto, A.; Fang, R.; Heng, P.-A.; Kim, J.; Lee, J.; et al. REFUGE challenge: A unified framework for evaluating automated methods for glaucoma assessment from fundus photographs. Med. Image Anal. 2020, 59, 101570. [Google Scholar] [CrossRef] [PubMed]
  79. Elze, T.; Pasquale, L.R.; Shen, L.Q.; Chen, T.C.; Wiggs, J.L.; Bex, P.J. Patterns of functional vision loss in glaucoma determined with archetypal analysis. J. R. Soc. Interface 2015, 12, 20141118. [Google Scholar] [CrossRef]
  80. Mayro, E.L.; Wang, M.; Elze, T.; Pasquale, L.R. The impact of artificial intelligence in the diagnosis and management of glaucoma. Eye 2020, 34, 1–11. [Google Scholar] [CrossRef]
  81. Abràmoff, M.D.; Lavin, P.T.; Birch, M.; Shah, N.; Folk, J.C. Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digit. Med. 2018, 1, 39. [Google Scholar] [CrossRef] [PubMed]
  82. Medeiros, F.A.; Jammal, A.A.; Thompson, A.C. From machine to machine: An OCT-trained deep learning algorithm for objective quantification of glaucomatous damage in fundus photographs. Ophthalmology 2019, 126, 513–521. [Google Scholar] [CrossRef] [PubMed]
  83. Goldbaum, M.H.; Sample, P.A.; White, H.; Côlt, B.; Raphaelian, P.; Fechtner, R.D.; Weinreb, R.N. Interpretation of automated perimetry for glaucoma by neural network. Investig. Ophthalmol. Vis. Sci. 1994, 35, 3362–3373. [Google Scholar]
  84. Brigatti, L.; Hoffman, D.; Caprioli, J. Neural networks to identify glaucoma with structural and functional measurements. Am. J. Ophthalmol. 1996, 121, 511–521. [Google Scholar] [CrossRef] [PubMed]
  85. Chen, X.; Xu, Y.; Yan, S.; Wong, D.W.K.; Wong, T.Y.; Liu, J. Automatic feature learning for glaucoma detection based on deep learning. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015—18th International Conference, Munich, Germany, 5–9 October 2015; pp. 669–677. [Google Scholar]
  86. Asaoka, R.; Murata, H.; Iwase, A.; Araie, M. Detecting preperimetric glaucoma with standard automated perimetry using a deep learning classifier. Ophthalmology 2016, 123, 1974–1980. [Google Scholar] [CrossRef]
  87. Li, F.; Wang, Z.; Qu, G.; Song, D.; Yuan, Y.; Xu, Y.; Gao, K.; Luo, G.; Xiao, Z.; Lam, D.S.C.; et al. Automatic differentiation of Glaucoma visual field from non-glaucoma visual filed using deep convolutional neural network. BMC Med. Imaging 2018, 18, 35. [Google Scholar] [CrossRef]
  88. Shibata, N.; Tanito, M.; Mitsuhashi, K.; Fujino, Y.; Matsuura, M.; Murata, H.; Asaoka, R. Development of a deep residual learning algorithm to screen for glaucoma from fundus photography. Sci. Rep. 2018, 8, 14665. [Google Scholar] [CrossRef] [PubMed]
  89. Christopher, M.; Belghith, A.; Bowd, C.; Proudfoot, J.A.; Goldbaum, M.H.; Weinreb, R.N.; Girkin, C.A.; Liebmann, J.M.; Zangwill, L.M. Performance of deep learning architectures and transfer learning for detecting glaucomatous optic neuropathy in fundus photographs. Sci. Rep. 2018, 8, 16685. [Google Scholar] [CrossRef] [PubMed]
  90. Liu, H.; Li, L.; Wormstone, I.M.; Qiao, C.; Zhang, C.; Liu, P.; Li, S.; Wang, H.; Mou, D.; Pang, R.; et al. Development and validation of a deep learning system to detect glaucomatous optic neuropathy using fundus photographs. JAMA Ophthalmol. 2019, 137, 1353–1360. [Google Scholar] [CrossRef]
  91. Asaoka, R.; Murata, H.; Hirasawa, K.; Fujino, Y.; Matsuura, M.; Miki, A.; Kanamoto, T.; Ikeda, Y.; Mori, K.; Iwase, A.; et al. Using deep learning and transfer learning to accurately diagnose early-onset glaucoma from macular optical coherence tomography images. Am. J. Ophthalmol. 2019, 198, 136–145. [Google Scholar] [CrossRef]
  92. Fu, H.; Baskaran, M.; Xu, Y.; Lin, S.; Wong, D.W.K.; Liu, J.; Tun, T.A.; Mahesh, M.; Perera, S.A.; Aung, T. A deep learning system for automated angle-closure detection in anterior segment optical coherence tomography images. Am. J. Ophthalmol. 2019, 203, 37–45. [Google Scholar] [CrossRef] [PubMed]
  93. Normando, E.M.; Yap, T.E.; Maddison, J.; Miodragovic, S.; Bonetti, P.; Almonte, M.; Mohammad, N.G.; Ameen, S.; Crawley, L.; Ahmed, F.; et al. A CNN-aided method to predict glaucoma progression using DARC (Detection of Apoptosing Retinal Cells). Expert Rev. Mol. Diagn. 2020, 20, 737–748. [Google Scholar] [CrossRef] [PubMed]
  94. Thakur, A.; Goldbaum, M.; Yousefi, S. Predicting glaucoma before onset using deep learning. Ophthalmol. Glaucoma 2020, 3, 262–268. [Google Scholar] [CrossRef] [PubMed]
  95. Weinreb, R.N.; Garway-Heath, D.F.; Leung, C.; Medeiros, F.A.; Liebmann, J. (Eds.) 10th Consensus Meeting: Diagnosis of Primary Open Angle Glaucoma; Kugler Publications: Amsterdam, The Netherlands, 2016. [Google Scholar]
  96. Keel, S.; Wu, J.; Lee, P.Y.; Scheetz, J.; He, M. Visualizing deep learning models for the detection of referable diabetic retinopathy and glaucoma. JAMA Ophthalmol. 2019, 137, 288–292. [Google Scholar] [CrossRef]
  97. Khawaja, A.P.; Cooke Bailey, J.N.; Wareham, N.J.; Scott, R.A.; Simcoe, M.; Igo, R.P., Jr.; Song, Y.E.; Wojciechowski, R.; Cheng, C.Y.; Khaw, P.T.; et al. Genome-wide analyses identify 68 new loci associated with intraocular pressure and improve risk prediction for primary open-angle glaucoma. Nat. Genet. 2018, 50, 778–782. [Google Scholar] [CrossRef] [PubMed]
  98. Margeta, M.A.; Letcher, S.M.; Igo, R.P.; for the NEIGHBORHOOD Consortium. Association of APOE with primary open-angle glaucoma suggests a protective effect for APOE ε4. Investig. Ophthalmol. Vis. Sci. 2020, 61, 3. [Google Scholar] [CrossRef]
  99. Raman, R.; Srinivasan, S.; Virmani, S.; Sivaprasad, S.; Rao, C.; Rajalakshmi, R. Fundus photograph-based deep learning algorithms in detecting diabetic retinopathy. Eye 2019, 33, 97–109. [Google Scholar] [CrossRef] [PubMed]
  100. Venhuizen, F.G.; van Ginneken, B.; Liefers, B.; van Asten, F.; Schreur, V.; Fauser, S.; Hoyng, C.; Theelen, T.; Sánchez, C.I. Deep learning approach for the detection and quantification of intraretinal cystoid fluid in multivendor optical coherence tomography. Biomed. Opt. Express 2018, 9, 1545–1569. [Google Scholar] [CrossRef]
  101. Ting, D.S.W.; Pasquale, L.R.; Peng, L.; Campbell, J.P.; Lee, A.Y.; Raman, R.; Tan, G.S.W.; Schmetterer, L.; Keane, P.A.; Wong, T.Y. Artificial intelligence and deep learning in ophthalmology. Br. J. Ophthalmol. 2019, 103, 167–175. [Google Scholar] [CrossRef]
  102. Abràmoff, M.D.; Lou, Y.; Erginay, A.; Clarida, W.; Amelon, R.; Folk, J.C.; Niemeijer, M. Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Investig. Ophthalmol. Vis. Sci. 2016, 57, 5200–5206. [Google Scholar] [CrossRef] [PubMed]
  103. Hogarty, D.T.; Mackey, D.A.; Hewitt, A.W. Current state and future prospects of artificial intelligence in ophthalmology: A review. Clin. Exp. Ophthalmol. 2019, 47, 128–139. [Google Scholar] [CrossRef] [PubMed]
  104. Liu, T.A.; Ting, D.S.; Paul, H.Y.; Wei, J.; Zhu, H.; Subramanian, P.S.; Li, T.; Hui, F.K.; Hager, G.D.; Miller, N.R. Deep learning and transfer learning for optic disc laterality detection: Implications for machine learning in neuro-ophthalmology. J. Neuroophthalmol. 2020, 40, 178–184. [Google Scholar] [CrossRef]
  105. Milea, D.; Singhal, S.; Najjar, R.P. Artificial intelligence for detection of optic disc abnormalities. Curr. Opin. Neurol. 2020, 33, 106–110. [Google Scholar] [CrossRef] [PubMed]
  106. Akbar, S.; Akram, M.U.; Sharif, M.; Tariq, A.; Yasin, U.U. Decision support system for detection of papilledema through fundus retinal images. J. Med. Syst. 2017, 41, 66. [Google Scholar] [CrossRef]
  107. Prashanth, R.; Dutta Roy, S.; Mandal, P.K.; Ghosh, S. High-accuracy detection of early Parkinson’s disease through multimodal features and machine learning. Int. J. Med. Inform. 2016, 90, 13–21. [Google Scholar] [CrossRef] [PubMed]
  108. Nam, U.; Lee, K.; Ko, H.; Lee, J.-Y.; Lee, E.C. Analyzing facial and eye movements to screen for Alzheimer’s disease. Sensors 2020, 20, 5349. [Google Scholar] [CrossRef]
  109. Miranda, Â.; Lavrador, R.; Júlio, F.; Januário, C.; Castelo-Branco, M.; Caetano, G. Classification of Huntington’s disease stage with support vector machines: A study on oculomotor performance. Behav. Res. 2016, 48, 1667–1677. [Google Scholar] [CrossRef]
  110. Mao, Y.; He, Y.; Liu, L.; Chen, X. Disease classification based on eye movement features with decision tree and random forest. Front. Neurosci. 2020, 14, 798. [Google Scholar] [CrossRef]
  111. Zhu, G.; Jiang, B.; Tong, L.; Xie, Y.; Zaharchuk, G.; Wintermark, M. Applications of deep learning to neuro-imaging techniques. Front. Neurol. 2019, 10, 869. [Google Scholar] [CrossRef]
  112. Viikki, K.; Isotalo, E.; Juhola, M.; Pyykkö, I. Using decision tree induction to model oculomotor data. Scand. Audiol. 2001, 30, 103–105. [Google Scholar] [CrossRef]
  113. D’Addio, G.; Ricciardi, C.; Improta, G.; Bifulco, P.; Cesarelli, M. Feasibility of Machine Learning in Predicting Features Related to Congenital Nystagmus. In XV Mediterranean Conference on Medical and Biological Engineering and Computing—MEDICON 2019; Henriques, J., Neves, N., de Carvalho, P., Eds.; Springer: Cham, Switzerland, 2020; pp. 907–913. [Google Scholar]
  114. Fisher, A.C.; Chandna, A.; Cunningham, I.P. The differential diagnosis of vertical strabismus from prism cover test data using an artificially intelligent expert system. Med. Biol. Eng. Comput. 2007, 45, 689–693. [Google Scholar] [CrossRef] [PubMed]
  115. Chen, Z.; Fu, H.; Lo, W.-L.; Chi, Z. Strabismus recognition using eye-tracking data and convolutional neural networks. J. Healthc. Eng. 2018, 2018, e7692198. [Google Scholar] [CrossRef] [PubMed]
  116. Lu, J.; Fan, Z.; Zheng, C.; Feng, J.; Huang, L.; Li, W.; Goodman, E.D. Automated Strabismus Detection for Telemedicine Applications. arXiv 2018, arXiv:1809.02940. [Google Scholar]
  117. Chandna, A.; Fisher, A.C.; Cunningham, I.; Stone, D.; Mitchell, M. Pattern recognition of vertical strabismus using an artificial neural network (StrabNet©). Strabismus 2009, 17, 131–138. [Google Scholar] [CrossRef] [PubMed]
  118. Bloem, B.R.; Dorsey, E.R.; Okun, M.S. The coronavirus disease 2019 crisis as catalyst for telemedicine for chronic neurological disorders. JAMA Neurol. 2020, 77, 927–928. [Google Scholar] [CrossRef] [PubMed]
  119. Ko, M.W.; Busis, N.A. Tele-neuro-ophthalmology: Vision for 20/20 and beyond. J. Neuroophthalmol. 2020, 40, 378–384. [Google Scholar] [CrossRef] [PubMed]
  120. Kann, B.H.; Aneja, S.; Loganadane, G.V.; Kelly, J.R.; Smith, S.M.; Decker, R.H.; Yu, J.B.; Park, H.S.; Yarbrough, W.G.; Malhotra, A.; et al. Pretreatment identification of head and neck Cancer nodal metastasis and Extranodal extension using deep learning neural networks. Sci. Rep. 2018, 8, 14036. [Google Scholar] [CrossRef] [PubMed]
  121. Coudray, N.; Ocampo, P.S.; Sakellaropoulos, T.; Narula, N.; Snuderl, M.; Fenyö, D.; Moreira, A.L.; Razavian, N.; Tsirigos, A. Classification and mutation prediction from non–small cell lung cancer histopathology images using deep learning. Nat. Med. 2018, 24, 1559–1567. [Google Scholar] [CrossRef]
  122. Kann, B.H.; Thompson, R.; Thomas, C.R., Jr.; Dicker, A.; Aneja, S. Artificial intelligence in oncology: Current applications and future directions. Oncology 2019, 33, 46–53. [Google Scholar] [PubMed]
  123. Munson, M.C.; Plewman, D.L.; Baumer, K.M.; Henning, R.; Zahler, C.T.; Kietzman, A.T.; Beard, A.A.; Mukai, S.; Diller, L.; Hamerly, G.; et al. Autonomous early detection of eye disease in childhood photographs. Sci. Adv. 2019, 5, eaax6363. [Google Scholar] [CrossRef]
  124. Akkara, J.D.; Kuriakose, A. Role of artificial intelligence and machine learning in ophthalmology. Kerala J. Ophthalmol. 2019, 31, 150–160. [Google Scholar] [CrossRef]
  125. Fujinami-Yokokawa, Y.; Pontikos, N.; Yang, L.; Tsunoda, K.; Yoshitake, K.; Iwata, T.; Miyata, H.; Fujinami, K.; Japan Eye Genetics Consortium OBO. Prediction of causative genes in inherited retinal disorders from spectral-domain optical coherence tomography utilizing deep learning techniques. J. Ophthalmol. 2019, 2019, 1691064. [Google Scholar] [CrossRef] [PubMed]
  126. Medsinge, A.; Nischal, K.K. Pediatric cataract: Challenges and future directions. Clin. Ophthalmol. 2015, 9, 77–90. [Google Scholar] [PubMed]
  127. Sarao, V.; Veritti, D.; De Nardin, A.; Misciagna, M.; Foresti, G.; Lanzetta, P. Explainable artificial intelligence model for the detection of geographic atrophy using colour retinal photographs. BMJ Open Ophthalmol. 2023, 8, e001411. [Google Scholar] [CrossRef]
  128. Van Eenwyk, J.; Agah, A.; Giangiacomo, J.; Cibis, G. Artificial intelligence techniques for automatic screening of amblyogenic factors. Trans. Am. Ophthalmol. Soc. 2008, 106, 64–73. [Google Scholar] [PubMed]
  129. de Figueiredo, L.A.; Debert, I.; Dias, J.V.P.; Polati, M. An artificial intelligence app for strabismus. Investig. Ophthalmol. Vis. Sci. 2020, 61, 2129. [Google Scholar]
  130. Ting, D.S.W.; Wong, T.Y. Eyeing cardiovascular risk factors. Nat. Biomed. Eng. 2018, 2, 140–141. [Google Scholar] [CrossRef] [PubMed]
  131. Cheung, C.Y.; Tay, W.T.; Ikram, M.K.; Ong, Y.T.; De Silva, D.A.; Chow, K.Y.; Wong, T.Y. Retinal microvascular changes and risk of stroke: The Singapore Malay eye study. Stroke 2013, 44, 2402–2408. [Google Scholar] [CrossRef] [PubMed]
  132. Pérez Del Palomar, A.; Cegoñino, J.; Montolío, A.; Orduna, E.; Vilades, E.; Sebastián, B.; Pablo, L.E.; Garcia-Martin, E. Swept source optical coherence tomography to early detect multiple sclerosis disease. The use of machine learning techniques. PLoS ONE 2019, 14, e0216410. [Google Scholar] [CrossRef] [PubMed]
  133. Cavaliere, C.; Vilades, E.; Alonso-Rodríguez, M.C.; Rodrigo, M.J.; Pablo, L.E.; Miguel, J.M.; López-Guillén, E.; Morla, E.M.S.; Boquete, L.; Garcia-Martin, E. Computer-aided diagnosis of multiple sclerosis using a support vector machine and optical coherence tomography features. Sensors 2019, 19, 5323. [Google Scholar] [CrossRef] [PubMed]
  134. He, Y.; Carass, A.; Liu, Y.; Jedynak, B.M.; Solomon, S.D.; Saidha, S.; Calabresi, P.A.; Prince, J.L. Deep learning-based topology guaranteed surface and MME segmentation of multiple sclerosis subjects from retinal OCT. Biomed. Opt. Express. 2019, 10, 5042–5058. [Google Scholar] [CrossRef] [PubMed]
  135. Lee, C.S.; Apte, R.S. Retinal biomarkers of Alzheimer’s disease. Am. J. Ophthalmol. 2020, 218, 337–341. [Google Scholar] [CrossRef] [PubMed]
  136. Nunes, A.; Silva, G.; Duque, C.; Januário, C.; Santana, I.; Ambrósio, A.F.; Castelo-Branco, M.; Bernardes, R. Retinal texture biomarkers may help to discriminate between Alzheimer’s, Parkinson’s, and healthy controls. PLoS ONE 2019, 14, e0218826. [Google Scholar] [CrossRef]
  137. Cai, J.H.; He, Y.; Zhong, X.L.; Lei, H.; Wang, F.; Luo, G.H.; Zhao, H.; Liu, J.C. Magnetic resonance texture analysis in Alzheimer’s disease. Acad. Radiol. 2020, 27, 1774–1783. [Google Scholar] [CrossRef]
  138. Sharafi, S.M.; Sylvestre, J.P.; Chevrefils, C.; Soucy, J.P.; Beaulieu, S.; Pascoal, T.A.; Arbour, J.D.; Rhéaume, M.A.; Robillard, A.; Chayer, C.; et al. Vascular retinal biomarkers improve the detection of the likely cerebral amyloid status from hyperspectral retinal images. Alzheimers Dement. 2019, 5, 610–617. [Google Scholar] [CrossRef] [PubMed]
  139. Available online: https://www.prnewswire.com/news-releases/optina-diagnostics-receives-breakthrough-device-designation-from-us-fda-for-a-retinal-imaging-platform-to-aid-in-the-diagnosis-of-alzheimers-disease-300846450.html (accessed on 7 November 2024).
  140. Available online: https://www.fda.gov/medical-devices/how-study-and-market-your-device/breakthrough-devices-program (accessed on 7 November 2024).
  141. ASCRS. Guide to Teleophthalmology. Available online: https://ascrs.org/advocacy/regulatory/telemedicine (accessed on 7 November 2024).
  142. Kapoor, R.; Walters, S.P.; Al-Aswad, L.A. The current state of artificial intelligence in ophthalmology. Surv. Ophthalmol. 2019, 64, 233–240. [Google Scholar] [CrossRef] [PubMed]
  143. Heinke, A.; Radgoudarzi, N.; Huang, B.B.; Baxter, S.L. A review of ophthalmology education in the era of generative artificial intelligence. Asia-Pac. J. Ophthalmol. 2024, 13, 100089. [Google Scholar] [CrossRef] [PubMed]
  144. Luxton, D.D. Recommendations for the ethical use and design of artificial intelligent care providers. Artif. Intell. Med. 2014, 62, 1–10. [Google Scholar] [CrossRef] [PubMed]
145. Johnson, S.L.J. AI, machine learning, and ethics in health care. J. Leg. Med. 2019, 39, 427–441. [Google Scholar] [CrossRef]
  146. Tom, E.; Keane, P.A.; Blazes, M.; Pasquale, L.R.; Chiang, M.F.; Lee, A.Y.; Lee, C.S. AAO Artificial Intelligence Task Force. Protecting Data Privacy in the Age of AI-Enabled Ophthalmology. Transl. Vis. Sci. Technol. 2020, 9, 36. [Google Scholar] [CrossRef] [PubMed]
Figure 1. The connection among deep learning (DL), machine learning (ML), and AI [3].
Figure 2. (a) Machine Learning (ML) and (b) Deep Learning (DL) for Diabetic Macular Edema (DME) assessment [11].
Figure 3. Types of Machine Learning (ML) [13].
Figure 5. Diagram of a traditional ML procedure for glaucoma classification [74].
Figure 6. Architecture of the DL model used for glaucoma screening [75].
Figure 7. Schematic diagram of AI platform for glaucoma screening [90].
Figure 8. Explainable learning architecture. ReLU = Rectified Linear Unit [127].
Table 1. Automatic detection and classification of nuclear cataracts.
| Year | Author | Data Source | Method | Definition of Gold Standard/Ground Truth | Performance |
| --- | --- | --- | --- | --- | --- |
| 2009 | Li et al. [62] | Singapore Malay Eye Study (SiMES), 10,000 training, 5490 testing | Modified ASM: lens structure detection; HSV model: feature extraction; SVM: automatic grading | Wisconsin cataract grading system | Accuracy: 95%; MAE (grading): 0.36 |
| 2013 | Xu et al. [63] | ACHIKO-NC (subset of SiMES), 10,000 training, 5278 testing | Modified ASM: lens structure detection; BOF model: feature extraction; GSR: automatic grading | Wisconsin cataract grading system | MAE: 0.336 |
| 2015 | Gao et al. [64] | ACHIKO-NC (subset of SiMES), 10,000 training, 5278 testing | CRNN: feature learning; SVM regression: automatic grading | Wisconsin cataract grading system | MAE: 0.304 |
| 2019 | Wu X. et al. [65] | Chinese Medical Alliance for Artificial Intelligence (CMAAI), 30,132 training, 7506 testing | ResNet | LOCS II | Capture mode recognition: AUC = 99.36% (a: 99.28%; b: 99.68%; c: 99.71%). Cataract diagnosis: AUC = 99.93% (a: 99.96%; b: 99.19%; c: 99.38%). Post-operative eye: AUC = 99.93% (a: 99.93%; b: 98.99%; c: 99.74%). Detection of referable cataracts: adult cataract AUC = 94.88%; pediatric cataract with VA involvement AUC = 100%; PCO with VA involvement AUC = 91.90% |
In the above table: ASM: Active Shape Model; HSV: hue, saturation, value; SVM: Support Vector Machine; BOF: Bag of Features; GSR: Group Sparsity Regression; CRNN: Convolutional Residual Neural Network; ResNet: Residual Neural Network; MAE: Mean Absolute Error; PCO: Posterior Capsular Opacification; VA: Visual Axis. Capture modes: (a) dilated diffuse; (b) dilated slit lamp; (c) non-dilated diffuse; (d) non-dilated slit lamp [12].
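The Mean Absolute Error (MAE) values reported in Table 1 measure how far, on average, the automatically predicted cataract grade falls from the clinician-assigned grade. The sketch below illustrates the metric itself; the grade values are hypothetical and are not drawn from the cited studies.

```python
def mean_absolute_error(predicted, reference):
    """Average absolute difference between predicted and reference grades."""
    if len(predicted) != len(reference):
        raise ValueError("grade lists must have equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)

# Hypothetical predicted vs. clinician-assigned grades for five lens images,
# on a decimal grading scale such as the Wisconsin system used in Table 1.
pred = [2.3, 1.8, 3.1, 4.0, 2.6]
truth = [2.0, 2.0, 3.5, 4.2, 2.5]
print(round(mean_absolute_error(pred, truth), 3))  # 0.24
```

An MAE of 0.36 (Li et al. [62]) thus means the automatic grade deviates from the reference grade by about a third of a grading unit on average.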
Table 2. Studies on automatic fundus-based cataract detection and grading.
| Year | Author | Data Source | Method | Definition of Gold Standard/Ground Truth | Performance |
| --- | --- | --- | --- | --- | --- |
| 2017 | Dong et al. [67] | 5495 training, 2355 testing | Caffe: feature extraction; SoftMax: detection and grading | Labelled fundus images by ophthalmologists | Accuracy (a) = 94.07% (b); 90.82% (c) |
| 2017 | Zhang et al. | Beijing Tongren Eye Centre's clinical database, 4004 training, 1606 testing | DCNN: detection and grading | Labelled fundus images by graders | AUC = 0.935 (b); AUC = 0.867 (c) |
| 2018 | Ran et al. | Not described | DCNN: feature extraction; random forest: detection and grading | Labelled fundus images by ophthalmologists, cross-checked by graders | AUC = 0.970 (b); sensitivity = 97.26% (b); specificity = 96.92% (b) |
| 2018 | Li et al. [53] | Beijing Tongren Eye Centre's clinical database, 7030 training, 1000 testing | ResNet-18: detection; ResNet-50: grading | Labelled fundus images by graders | AUC = 0.972 (b); AUC = 0.877 (c) |
| 2019 | Pratap and Kokil [68] | Multiple online databases, 400 training, 400 testing | Pre-trained CNN: feature extraction; SVM: detection and grading | Labelled fundus images by ophthalmologists | Accuracy (a) = 100% (b); 92.91% (c) |
Legend: DCNN: Deep Convolutional Neural Network; CNN: Convolutional Neural Network; SVM: Support Vector Machine; ResNet: Residual Neural Network. (a) Proportion of correctly classified images out of the total images tested; (b) two-class task: non-cataract versus cataract; (c) four-class task: no cataract, mild, moderate, and severe [12].
Table 3. Performance values of the Kane and Hill-RBF formulas for predicting IOL power.
Table 3. Performance values of the Kane and Hill-RBF formula methods of predicting IOL power.
Kane FormulaHill-RBF
Short axial length (≤22.0 mm)0.4410.440
Intermediate axial length (>22.0 mm to <26.0 mm)0.3220.340
Long axial length (≥26.0 mm)0.3260.358
Table 5. Analysis of the performance of AI for the detection of retinal diseases using fundus images.
| Author | Study Type | AI Algorithm/Fundus Camera | Dataset | Sensitivity (%) | Specificity (%) |
| --- | --- | --- | --- | --- | --- |
| Abràmoff et al. [102] (2016) | Retrospective (DR) | Topcon TRC NW6 non-mydriatic fundus camera/IDx-DR X2 | MESSIDOR-2 | 96.8 | 87 |
| Gulshan et al. [73] (2016) | Retrospective (DR) | Topcon TRC NW6 non-mydriatic camera/Inception-V3 | MESSIDOR-2 | 87 | 98.50 |
| Ting et al. [54] (2017) | Retrospective (DR) | FundusVue, Canon, Topcon, and Carl Zeiss/VGG-19 | SiDRP 14–15 | 90.5 | 91.6 |
| | | | Guangdong | 98.7 | 81.6 |
| | | | SIMES | 97.1 | 82.0 |
| | | | SINDI | 99.3 | 73.3 |
| | | | SCES | 100 | 76.3 |
| | | | BES | 94.4 | 88.5 |
| | | | AFEDS | 98.8 | 86.5 |
| | | | RVEEH | 98.9 | 92.2 |
| | | | Mexican | 91.8 | 84.8 |
| | | | CUHK | 99.3 | 83.1 |
| Brown et al. [46] (2018) | Retrospective (ROP) | Inception-V1 and U-Net | AREDS | 100 | 94 |
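The sensitivity and specificity figures in Table 5 are derived from the confusion counts of a screening algorithm against grader-adjudicated ground truth. The sketch below shows the computation; the counts are hypothetical, chosen only so that the result matches the first row of the table (96.8%/87%).

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as percentages from confusion counts."""
    sens = 100.0 * tp / (tp + fn)   # true-positive rate among diseased eyes
    spec = 100.0 * tn / (tn + fp)   # true-negative rate among healthy eyes
    return sens, spec

# Hypothetical screening outcome: 968 of 1000 diseased eyes flagged for
# referral, 870 of 1000 healthy eyes correctly passed.
sens, spec = sensitivity_specificity(tp=968, fn=32, tn=870, fp=130)
print(f"sensitivity = {sens:.1f}%, specificity = {spec:.1f}%")
# sensitivity = 96.8%, specificity = 87.0%
```

Note the trade-off visible in the Ting et al. [54] rows: a fixed operating threshold yields different sensitivity/specificity pairs on different cohorts, which is why multi-dataset validation matters for screening tools.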

Share and Cite

Costin, H.-N.; Fira, M.; Goraș, L. Artificial Intelligence in Ophthalmology: Advantages and Limits. Appl. Sci. 2025, 15, 1913. https://doi.org/10.3390/app15041913
