Figure 2 shows the procedure used to find alterations of ocular movements in cirrhotic patients. The first step is to record ocular movements according to the medical protocol; then, OSCANN’s software provides a wide list of features that should be carefully analyzed. The reader is referred to [35] for a full description of the internal process in OSCANN’s software. To identify the typical alterations of eye movement in cirrhotic patients, we performed two different analyses in the Matlab environment (see Figure 2). Initially, we separated variables into parametric and non-parametric using the Shapiro–Wilk (SW) test. If the feature is parametric, the p-value is computed through the ANOVA test; otherwise, it is computed through the Kruskal–Wallis test.
2.2.1. Data Acquisition and Preprocessing
All included patients performed the full battery of ocular movement tests selected by protocol [35]. This protocol was designed to characterize the ocular movements of voluntary participants with no previous cognitive impairment in order to establish a control group. In this case, the biomedical procedure established the following steps:
1. Revision of the patient’s condition.
   1. PHES evaluation and clinical diagnosis.
   2. Participants are asked to remove make-up and any kind of lenses before the ocular movement tests in order to guarantee the precision and accuracy of the eye-tracking algorithm.
2. Ocular movement tests: Each test must be clearly explained to the patient, and there is a demo version available if needed.
   1. Visually guided saccades tests (VGST).
   2. Antisaccades tests (AST).
   3. Memory-guided saccades tests (MGST).
   4. Smooth pursuit tests (SPT).
   5. Fixation test (FIXT).
Each participant followed this protocol, which was designed by a group of specialists to help maintain the participant’s attention. A simple test, VGST, is used to introduce the experiment. Then, the most difficult tests (AST and MGST) are performed, and, finally, the easiest ones are presented to complete the ocular movement experiment.
In order to obtain an accurate ocular movement measurement, the patient’s head must remain fixed; a conventional chin rest is used. The system offers twenty-three different ocular movement tests, but according to the study design following the previous literature and feasibility criteria, cirrhotic patients carried out horizontal and vertical saccadic paradigms (visually guided saccades, memory-guided saccades, and antisaccades) without a gap or overlap and a horizontal and vertical smooth pursuit test. Furthermore, all patients performed a fixation test.
The visual stimulus is a green dot (diameter = 1 cm) displayed on a black background. The stimulus remains in each position for 1500 ms.
As mentioned previously, the saccadic paradigm includes three tests and each one is performed in the horizontal and vertical directions. Here, eye movements are guided by one stimulus, which appears in the centre of the screen and then moves randomly to the left or right position (horizontal test) or the up or down position (vertical test).
In the visually guided saccades test (VGST), the instruction to the patient is to look at the green dot. The behavior of the visual stimulus is shown in Figure 3.
In the antisaccades test (AST), the patient is asked to look in the direction opposite to the stimulus, as in a mirror. Here, the instruction to the patient is to look to the opposite side of the green dot. Figure 4 shows the concept.
The memory-guided saccades test (MGST) is the longest test in the saccadic paradigms. Here, the stimulus appears in a particular position, remains there for 1500 ms, and then comes back to the screen’s center and disappears. The patient must remember the stimulus position and then perform a saccadic movement toward that position. The patient has an extra 1500 ms to do so; for this reason, this test takes double the time (see Figure 5).
In this test, the instruction to the patient is to remember the position of the last stimulus and move the eyes toward it.
Table 3 describes the recommended parameters. The visually guided saccades and antisaccades tests take 36 s in the horizontal direction and 24 s in the vertical direction, while the memory-guided saccades test takes double the time.
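As a rough illustration of the timing described above, the following Python sketch builds a stimulus schedule for a saccadic test. It assumes that a test consists only of consecutive 1500 ms stimulus positions that alternate between the centre and a randomly chosen side; the function name, the position labels, and that alternation rule are illustrative assumptions and do not describe OSCANN’s actual implementation.

```python
import random

DWELL_S = 1.5  # each stimulus position lasts 1500 ms (value taken from the text)


def build_saccade_schedule(total_s, sides=("left", "right"), seed=None):
    """Return a list of (onset_s, position) pairs for one saccadic test.

    Assumption (not stated in the source): the stimulus alternates
    centre -> random side -> centre -> random side ...
    """
    rng = random.Random(seed)
    n_positions = int(total_s / DWELL_S)
    schedule = []
    for i in range(n_positions):
        position = "centre" if i % 2 == 0 else rng.choice(sides)
        schedule.append((i * DWELL_S, position))
    return schedule


# 36 s horizontal test and 24 s vertical test, as in the text.
horizontal = build_saccade_schedule(36, ("left", "right"), seed=0)
vertical = build_saccade_schedule(24, ("up", "down"), seed=0)
```

Under these assumptions, a 36 s test contains 24 stimulus positions and a 24 s test contains 16.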
The smooth pursuit test (SPT) is also performed in the horizontal and vertical directions, and the stimulus moves following a linear wave. The recommended parameters are summarized in
Table 4. In linear smooth pursuit, the explored visual field is
, while velocity remains constant and each lap takes 8 s (see
Figure 6).
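The constant-velocity stimulus of the smooth pursuit test can be sketched as a triangular wave. The snippet below is only an illustration: it assumes that one lap is a full sweep there and back, that the stimulus starts at one edge of the explored field, and that `half_width` (half the explored visual field, in whatever unit is used) is supplied by the caller.

```python
def pursuit_position(t_s, half_width, lap_s=8.0):
    """Stimulus position at time t_s for a constant-speed back-and-forth sweep.

    The stimulus starts at -half_width, reaches +half_width after half a lap,
    and is back at -half_width when the lap ends. lap_s = 8 s comes from the
    text; half_width is a free parameter (assumption).
    """
    phase = (t_s % lap_s) / lap_s  # fraction of the current lap, in [0, 1)
    if phase < 0.5:
        return -half_width + 4.0 * half_width * phase  # outward sweep
    return 3.0 * half_width - 4.0 * half_width * phase  # return sweep


def pursuit_speed(half_width, lap_s=8.0):
    """Constant stimulus speed: the full field is crossed twice per lap."""
    return 4.0 * half_width / lap_s
```

For example, with a hypothetical `half_width` of 10 units, the stimulus crosses the centre at 2 s and 6 s and moves at 5 units per second.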
Finally, in the fixation test (FIXT), the stimulus remains in the screen’s center for five seconds. The objective of this test is to measure involuntary eye movements such as microsaccades, drifts, or square wave jerks, as well as distractions.
After data acquisition is completed, videos have to be pre-processed in order to compute the position, velocity, and acceleration of the subject’s pupil. From these signals, the patient’s gaze is determined, and test performance can be analyzed properly.
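One common way to obtain velocity and acceleration from a sampled pupil position is numerical differentiation. The sketch below uses central differences on equally spaced samples; it is a generic illustration, not the pre-processing used by OSCANN, and the sampling rate in the example is an assumed value.

```python
def central_difference(samples, rate_hz):
    """Numerical derivative of equally spaced samples.

    Interior points use a central difference; the first and last points
    fall back to a one-sided difference.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are required")
    dt = 1.0 / rate_hz
    derivative = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        derivative.append((samples[hi] - samples[lo]) / ((hi - lo) * dt))
    return derivative


# Hypothetical example: gaze position (degrees) sampled at an assumed 100 Hz.
position = [0.0, 1.0, 2.0, 3.0, 4.0]
velocity = central_difference(position, rate_hz=100)      # degrees per second
acceleration = central_difference(velocity, rate_hz=100)  # degrees per second squared
```

In practice, the position signal is usually smoothed before differentiation, because differences amplify measurement noise.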
Before explaining the next steps in our algorithm, it is necessary to consider how cirrhotic patients perform the eye movement tests, because data cannot always be obtained from all tests. Specifically, this problem arises when the eye movement recording is poor because of eye morphology; for example, a drooping eyelid can make it impossible to detect the patient’s line of sight correctly.
Another common situation is that patients cannot collaborate in performing all tests because of fatigue, stress, or nervousness. However, this was not the case for the cirrhotic patients, since they collaborated and the environment where the tests were performed was quiet and comfortable. Consequently, all patients completed the eye movement tests.
Section 3 gives more details on the number of patients who have missing tests.
2.2.2. Data Analysis
Before the classification task can be carried out, a data analysis process with two distinct phases is needed. First, features are extracted from the ocular movement recordings. Second, the most significant features are selected in order to train the machine learning algorithms. Finding the best combination of features allows the best possible classification results to be obtained.
To this end, two theoretical approaches are used: signal theory for computing the features and statistical theory for selecting the significant ones.
Feature extraction process
By analyzing the signal generated as a response to the visual stimulus, it is possible to define the features of the eye movement [9]. Different features are computed from each visual test; in total, more than 150 features are evaluated.
In VGST, examples of the measured variables are the response time or latency toward the stimulus, the latency back toward the center of the screen, the mean velocity and velocity peaks, the accuracy (dysmetria), and the number of blinks or anticipated saccades (those made less than 80 ms after the stimulus changes) [39].
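To make the latency and anticipated-saccade definitions concrete, the following sketch detects the first sample after a stimulus change whose speed exceeds a threshold and converts it to a latency. The 80 ms limit comes from the text; the velocity-threshold detection rule, the threshold value, and the function names are illustrative assumptions.

```python
ANTICIPATION_MS = 80.0  # saccades starting earlier than this are "anticipated" (from the text)


def saccade_latency_ms(velocity, rate_hz, stimulus_index, threshold=30.0):
    """Latency from the stimulus change to the first sample whose absolute
    velocity exceeds `threshold`.

    The threshold of 30 units/s is an assumed value, not one given in the
    source. Returns None if no saccade is detected.
    """
    for i in range(stimulus_index, len(velocity)):
        if abs(velocity[i]) > threshold:
            return (i - stimulus_index) * 1000.0 / rate_hz
    return None


def is_anticipated(latency_ms):
    """True when a saccade was detected and started before the 80 ms limit."""
    return latency_ms is not None and latency_ms < ANTICIPATION_MS


# Hypothetical velocity traces sampled at an assumed 100 Hz; stimulus changes at sample 10.
normal_trace = [0.0] * 30 + [200.0] * 5 + [0.0] * 10
early_trace = [0.0] * 15 + [200.0] * 5 + [0.0] * 25
normal_latency = saccade_latency_ms(normal_trace, 100, stimulus_index=10)
early_latency = saccade_latency_ms(early_trace, 100, stimulus_index=10)
```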
In MGST, the number of correct memory saccades is evaluated together with the visual saccade features. Features such as latency, accuracy, and velocity are also assessed on the memory saccades.
In AST, when the patient looks directly in the direction opposite to the stimulus, the saccadic movement is considered a correct antisaccade. If the patient first performs a saccade toward the stimulus and then looks in the opposite direction, the movement is considered a “reflexive” saccade; if the patient never looks in the opposite direction, the movement is counted as an incorrect antisaccade [39]. In this test, other features are also measured, as in the two previous cases.
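The three antisaccade outcomes can be expressed as a small decision rule. The sketch below assumes that each trial has already been reduced to the stimulus side and the ordered list of saccade directions; this representation and the function name are hypothetical.

```python
def classify_antisaccade(stimulus_side, saccade_sides):
    """Classify one horizontal antisaccade trial.

    stimulus_side: "left" or "right".
    saccade_sides: ordered directions of the saccades made after the
                   stimulus appears (hypothetical trial representation).
    """
    opposite = "left" if stimulus_side == "right" else "right"
    if opposite not in saccade_sides:
        # The gaze never went to the mirrored side (or no saccade was made).
        return "incorrect"
    if saccade_sides[0] == opposite:
        # The first saccade already went to the mirrored side.
        return "correct"
    # The gaze went toward the stimulus first and was corrected afterwards.
    return "reflexive"
```

Treating a trial without any saccade as incorrect is a design choice of this sketch; a real analysis might discard such trials instead.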
In FIXT, as mentioned before, the numbers of saccades, microsaccades, drifts, square wave jerks, and distractions are counted. In addition, different characteristics of each type of micromovement are computed [7,40].
In SPT, it is important to count catch-up saccades, which the patient performs to reach the stimulus when the gaze lags behind it, and back-up saccades, which are performed in the direction opposite to the stimulus movement when the gaze is ahead of it [6]. In addition, some indices related to the time spent following the stimulus or to performance errors are measured [41].
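The distinction between catch-up and back-up saccades depends only on whether the saccade goes with or against the stimulus movement. A minimal sketch, assuming each detected saccade is described by a signed amplitude and the signed stimulus velocity at that moment (both hypothetical inputs):

```python
def classify_pursuit_saccade(saccade_amplitude, stimulus_velocity):
    """Label a saccade made during smooth pursuit.

    Same sign as the stimulus velocity: the gaze jumps forward to reach the
    stimulus (catch-up). Opposite sign: the gaze jumps back because it was
    ahead of the stimulus (back-up).
    """
    if saccade_amplitude == 0 or stimulus_velocity == 0:
        return "undefined"
    same_direction = (saccade_amplitude > 0) == (stimulus_velocity > 0)
    return "catch-up" if same_direction else "back-up"


def count_pursuit_saccades(events):
    """Count each saccade type in a list of (amplitude, stimulus_velocity) pairs."""
    counts = {"catch-up": 0, "back-up": 0, "undefined": 0}
    for amplitude, velocity in events:
        counts[classify_pursuit_saccade(amplitude, velocity)] += 1
    return counts
```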
Finally, Table 5 summarizes the number of features extracted from each test used in this study.
Significance analysis
After feature extraction, the significance of each ocular movement feature is assessed using classical statistics. First, the normality of each feature must be tested (see Figure 2) to properly generate p-values. Given the number of samples in this study, the Shapiro–Wilk (SW) test is one of the most suitable methods to evaluate the features’ normality [42,43]. In addition, in [44], it is stated that the SW test can also be used with sample sets of hundreds or even thousands of samples. A second validation of the features’ normality was made using the Lilliefors (LF) test in order to guarantee feature significance with both methods.
The second step is to measure the significance of each feature. To accomplish this, the p-value of each variable is computed in order to assess whether the null hypothesis can be rejected. After the features were classified as parametric or non-parametric variables, the p-values were computed, and only those variables with p-values less than or equal to one of the three significance thresholds considered were included in the following steps.
A one-way analysis of variance (ANOVA) test was used for parametric features. This test is widely applied in medicine and related fields, for example in cancer studies [45], mammogram mass classification [46], and biological data analysis [47]. Like other parametric tests, ANOVA starts from the assumption that the data are normally distributed. It can be regarded as a special case of the linear regression model.
For nonparametric variables, the Kruskal–Wallis (KW) test is used [48]. This test is the nonparametric counterpart of one-way ANOVA: it does not assume that the data follow a particular distribution, so normality is not required. Furthermore, in [49], it is shown that KW is suitable for a sample set size like the one used in this article.
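The non-parametric branch of this procedure can be illustrated with a small, self-contained Kruskal–Wallis implementation. The study itself was carried out in the Matlab environment, so the Python code below is only a sketch of the textbook test (rank-based H statistic with tie correction and a chi-square approximation for the p-value); the example values are made up, and the normality tests and ANOVA are not reproduced here.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square distribution with integer df."""
    if x <= 0:
        return 1.0
    half_x = x / 2.0
    # Closed-form starting values: Q(1, h) = exp(-h), Q(1/2, h) = erfc(sqrt(h)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half_x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half_x))
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0 - 1e-9:
        q += half_x ** a * math.exp(-half_x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def kruskal_wallis(*groups):
    """Return (H, p_value) for two or more groups of numeric samples."""
    pooled = sorted((value, g) for g, values in enumerate(groups) for value in values)
    n_total = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0.0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2.0  # mean of the ranks i+1 .. j shared by tied values
        for k in range(i, j):
            rank_sums[pooled[k][1]] += average_rank
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    correction = 1.0 - tie_term / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all values are identical")
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3.0 * (n_total + 1)
    h /= correction
    return h, chi2_sf(h, len(groups) - 1)


# Made-up feature values for two groups of participants.
h_stat, p_value = kruskal_wallis([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

A feature would then be retained when `p_value` does not exceed the chosen significance threshold.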
2.2.3. Classification Algorithm and Validation
The MATLAB™ application Classification Learner provides a useful tool for training multiple classification algorithms, including parameter variations for a specific algorithm. Therefore, it is a powerful tool for easily testing all these algorithms and selecting the most suitable ones.
For instance, in the case of the well-known Support Vector Machine (SVM) algorithm, linear, quadratic, cubic, and Gaussian kernel functions are available for testing. The SVM classification algorithm has demonstrated very useful qualities such as speed, efficiency, and robustness, which are extremely important for classification tasks [50,51,52].
The best classifiers were selected depending on the accuracy, the area under the ROC curve (AUC) [53], and the number of features used. After that, cross-validation was performed using five folds or subsets. One of these folds, which corresponds to 20% of the available samples, was used for testing, while the remaining four were used for training. The samples of each class were evenly divided among the different folds.
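For reference, the two performance measures mentioned here can be computed as follows. This is a generic sketch with made-up scores, not output from the study: the AUC is obtained through its rank interpretation (the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one), and class labels are assumed to be coded as 0 and 1.

```python
def accuracy(predicted, labels):
    """Fraction of samples whose predicted class matches the true class."""
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)


def roc_auc(scores, labels):
    """Area under the ROC curve for binary labels coded as 0 and 1.

    Every (positive, negative) pair is compared; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up classifier scores for four samples.
example_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
example_accuracy = accuracy([0, 0, 0, 1], [0, 0, 1, 1])
```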
In this study, we used the cross-validation method in which the data set is randomly divided into five subsets following a stratified k-fold division strategy. To assess the significance and statistical accuracy of the procedure, the selected classifier algorithm was executed 1000 times in an iterative loop, which corresponds to the last two stages shown in the flowchart of Figure 2. All result metrics are computed as the mean value of the errors over the iterations; Figure 7a,b also shows the distribution over these iterations.
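The repeated stratified five-fold procedure can be sketched as below. The code is a simplified Python illustration rather than the Matlab workflow used in the study: the round-robin assignment of shuffled samples to folds, the toy one-feature classifier, and all names are assumptions chosen to keep the example self-contained.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, rng=None):
    """Randomly split sample indices into k folds, class by class."""
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # deal the class out evenly
    return folds


def repeated_cv_accuracy(features, labels, fit, predict, k=5, repeats=1000, seed=0):
    """Mean accuracy over `repeats` random stratified k-fold partitions."""
    rng = random.Random(seed)
    accuracies = []
    for _ in range(repeats):
        correct = 0
        for test_idx in stratified_folds(labels, k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
            correct += sum(predict(model, features[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(labels))
    return sum(accuracies) / len(accuracies), accuracies


def fit_midpoint(xs, ys):
    """Toy classifier: threshold halfway between the two class means."""
    zeros = [x for x, y in zip(xs, ys) if y == 0]
    ones = [x for x, y in zip(xs, ys) if y == 1]
    mean0, mean1 = sum(zeros) / len(zeros), sum(ones) / len(ones)
    return (mean0 + mean1) / 2.0, mean1 > mean0


def predict_midpoint(model, x):
    threshold, one_is_high = model
    return int((x > threshold) == one_is_high)
```

Returning the per-repetition accuracies alongside their mean makes it possible to inspect the distribution over the iterations as well as the average.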
Therefore, the training data set and test data set are statistically evaluated. Moreover, the probability of belonging to each of the two classes is calculated via predicted class scores [54]. This metric allows us to know whether a sample lies on the borderline between the two sets or, even worse, whether it is classified in the wrong set.
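A simple way to use such class scores is to flag each sample according to how far its score lies from the decision boundary. In the sketch below, the score is assumed to be the predicted probability of class 1, the decision boundary is 0.5, and the `margin` that defines "borderline" is an arbitrary illustrative value.

```python
def flag_samples(scores, labels, margin=0.1):
    """Label each sample as 'confident', 'borderline', or 'misclassified'.

    scores: predicted probability of class 1 for each sample.
    labels: true classes coded as 0 and 1.
    margin: half-width of the band around 0.5 treated as borderline
            (assumed value, not taken from the source).
    """
    flags = []
    for probability, label in zip(scores, labels):
        predicted = 1 if probability >= 0.5 else 0
        if predicted != label:
            flags.append("misclassified")
        elif abs(probability - 0.5) < margin:
            flags.append("borderline")
        else:
            flags.append("confident")
    return flags


# Made-up scores for four samples.
example_flags = flag_samples([0.9, 0.55, 0.2, 0.7], [1, 1, 0, 0])
```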