*Article* **Development of a Machine Learning Model to Discriminate Mild Cognitive Impairment Subjects from Normal Controls in Community Screening**

**Juanjuan Jiang <sup>1</sup>, Jieming Zhang <sup>1</sup>, Chenyang Li <sup>2</sup>, Zhihua Yu <sup>3</sup>, Zhuangzhi Yan <sup>2,</sup>\* and Jiehui Jiang <sup>2</sup>**


**\*** Correspondence: zzyan@shu.edu.cn

**Abstract: Background**: Mild cognitive impairment (MCI) is a transitional stage between normal aging and probable Alzheimer's disease, so screening for MCI in the community is of great value. This study proposes a novel machine learning (ML) model that combines electroencephalography (EEG), eye tracking (ET), and neuropsychological assessments to identify MCI subjects from normal controls (NC). **Methods**: Two cohorts were used in this study. Cohort 1, the training and validation group, included 184 MCI patients and 152 NC subjects. Cohort 2, an independent test group, included 44 MCI and 48 NC individuals. EEG, ET, Neuropsychological Tests Battery (NTB), and clinical variables (age, gender, educational level, MoCA-B, and ACE-R) were collected for all subjects. Receiver operating characteristic (ROC) curves were used to evaluate the ability of the tool to distinguish MCI from NC. The proposed model was compared with the clinical model, the EEG and ET model, and the neuropsychological model. **Results**: The classification accuracy of the proposed model reached 84.5 ± 4.43% in Cohort 1 and 88.8 ± 3.59% in Cohort 2. The area under the curve (AUC) of the proposed tool reached 0.941 (0.893–0.982) in Cohort 1 and 0.966 (0.921–0.988) in Cohort 2. **Conclusions**: The proposed model, which incorporates EEG, ET, and neuropsychological assessments, yielded excellent classification performance, suggesting its potential for future application in cognitive decline prediction.

**Keywords:** mild cognitive impairment; neuropsychological tests battery; machine learning; screening tool

#### **1. Introduction**

Alzheimer's disease (AD) is the most common neurodegenerative brain disease that affects 50–70% of patients with cognitive impairments over the age of 65 [1]. AD pathology leads to an irreversible deterioration in cognitive functions such as loss of memory, executive dysfunction, and attention disorders [2–4]. Mild cognitive impairment (MCI) refers to the intermediate period between the typical cognitive decline of normal aging and the more severe decline associated with dementia (e.g., AD) [5–7]. Because of the irreversibility of AD, it is of great value to screen MCI subjects at the community level [5,8,9].

Currently, biochemical tests (e.g., cerebrospinal fluid and blood tests) and neuroimaging tests (e.g., magnetic resonance imaging) are considered efficient screening tools for MCI [10–12]. However, these techniques are usually invasive and expensive, which restricts large-scale screening in the community [13,14]. Therefore, an effective and low-cost approach to detecting cognitive decline in MCI is urgently required.

Recently, MCI screening has attracted considerable interest. A Neuropsychological Tests Battery (NTB) is well recognized in the diagnostic pipelines of preclinical AD [15]. Multiple preclinical neuropsychological measures significantly predicted progression to

**Citation:** Jiang, J.; Zhang, J.; Li, C.; Yu, Z.; Yan, Z.; Jiang, J. Development of a Machine Learning Model to Discriminate Mild Cognitive Impairment Subjects from Normal Controls in Community Screening. *Brain Sci.* **2022**, *12*, 1149. https://doi.org/10.3390/brainsci12091149

Academic Editor: David Facal

Received: 12 July 2022 Accepted: 26 August 2022 Published: 28 August 2022


**Copyright:** © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

<sup>3</sup> Shanghai Geriatric Institute of Chinese Medicine, Shanghai University of Traditional Chinese Medicine, Shanghai 200031, China

AD from MCI and detected changes in patients in verbal and visual memory, visuospatial processing, error control, and subjective neuropsychological complaints [16]. Paul et al. confirmed that the quick-MCI neuropsychological test can assess cognitive status in 3–5 min and discriminate MCI accurately in primary care [17]. Neuropsychological tests are therefore appropriate for MCI community screening, as are emerging assessments such as electroencephalography (EEG) and eye tracking (ET), which monitor cognitive function. Murty et al. found that stimulus-induced gamma rhythms from EEG were significantly lower in MCI/AD subjects than in their age- and gender-matched controls, suggesting that EEG gamma activity could be used as a potential screening tool for MCI or AD in humans [18]. Oyama et al. developed a brief cognitive assessment based on eye-tracking technology that enables quantitative scoring and sensitive detection of cognitive impairment in patients with mild cognitive impairment and dementia [19]. Nie et al. found that eye movement parameters are stable indicators for distinguishing patients with MCI from cognitively normal subjects and are not affected by different testing versions and numbers [20]. The combination of neuropsychological tests and physiological measurements warrants further study as a practical and cost-effective method for wide-scale screening to identify older adults who may be at risk for pathological cognitive decline. Used alone, neuropsychological tests may be of limited effectiveness in MCI screening, and they are inadequate for making a definitive diagnosis. To increase the precision and sensitivity of MCI screening, several researchers have combined the NTB with objective physiological measures, such as prefrontal EEG [21] and ET [22]. For instance, our previous work validated the feasibility of physiological measures using EEG and ET in distinguishing MCI from healthy controls (HC), with a classification accuracy of 81.5% [23].

In addition, with the development of artificial intelligence techniques, machine learning (ML) methods have been widely used for the differential diagnosis of MCI [15,23–25]. For example, Lin et al. combined non-invasive clinical variables with ML classifiers, including Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF), and achieved over 75% accuracy in identifying subjects who converted from normal cognition to MCI within four years [25]. Yim et al. proposed an ML algorithm to identify cognitive dysfunction based on neuropsychological tests, including the Montreal Cognitive Assessment (MoCA), and reported good classification performance between cognitively impaired and normal subjects [15]. However, few previous studies have combined neuropsychological tests, physiological tests, and ML algorithms in a single model.

This study aims to propose and validate a novel and low-cost screening model consisting of neuropsychological tests, physiological tests, and ML algorithms. Importantly, to evaluate the robustness of the model, two independent cohorts were used in this study.

#### **2. MCI Prediction Algorithms**

Figure 1 shows the flowchart of the proposed model, which is composed of four steps: data collection, data preprocessing, feature extraction and selection, and classification based on ML classifiers. These steps are described in detail below.

**Figure 1.** The flowchart of the proposed model.

#### *2.1. Data Collection*

EEG, ET, neuropsychological test (Table S1), and demographic data (age, gender, and education) were selected as the inputs of the model. Details of the data collection step are described in our previous study [23] and provided in Figure S1 of the Supplementary Material.

#### *2.2. Data Preprocessing*

This model included an automatic data preprocessing step for EEG, ET, and NTB.

#### 2.2.1. EEG Preprocessing

Invalid EEG data were first removed according to whether the EEG electrode was offset. Next, power-frequency noise, electromyogram signals, electrocardiogram signals, and other external noise were removed using a band-stop filter and a band-pass filter. A simple second-order Butterworth filter with a passband of 0.5–30 Hz was applied. Finally, the EEG data were segmented with a 5 s moving window and 60% overlap, providing 15 overlapping segments for each subject. The EEG signal was preprocessed using the EEGLAB toolbox implemented in MATLAB 2018a (MathWorks Inc., Sherborn, MA, USA).
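The segmentation step can be illustrated with a minimal Python sketch. It is not the EEGLAB/MATLAB code used in this study; the function name `sliding_windows`, the sampling rate of 250 Hz, and the 33 s recording length are assumptions chosen only so that a 5 s window with 60% overlap (a 2 s step) yields 15 segments.

```python
def sliding_windows(n_samples, fs, win_s=5.0, overlap=0.6):
    """Return (start, stop) sample indices of overlapping windows."""
    win = int(round(win_s * fs))              # window length in samples
    step = int(round(win * (1.0 - overlap)))  # hop between window starts
    if win <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [(s, s + win) for s in range(0, n_samples - win + 1, step)]


fs = 250                                   # assumed sampling rate (Hz)
segments = sliding_windows(n_samples=33 * fs, fs=fs)  # assumed 33 s recording
```

With these assumed values, consecutive windows start 2 s apart and share 3 s of data, which corresponds to the 60% overlap described above.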

#### 2.2.2. ET Preprocessing

First, excessive noise was eliminated from the ET data. Next, the gaze position signal was normalized to the display coordinates to avoid interpolation bias. Finally, a low-pass Butterworth filter with a cut-off frequency of 5 Hz was applied. These steps were implemented in MATLAB 2018a (MathWorks Inc., Sherborn, MA, USA).

#### 2.2.3. NTB Data Preprocessing

NTB data were first cleaned by eliminating all abnormal values. The neuropsychological test scores were then normalized to the range 0–1.
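A minimal sketch of the 0–1 normalization is given below, assuming that min-max scaling was applied per test score (the exact scaling rule is not specified here). The function name `min_max_normalize` and the example scores are hypothetical.

```python
def min_max_normalize(scores):
    """Scale a list of scores linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # A constant score carries no information; map it to 0 by convention.
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


normalized = min_max_normalize([12, 18, 24, 30])  # hypothetical raw scores
```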

#### *2.3. Feature Extraction*

#### 2.3.1. EEG Data

Linear (spectral) and nonlinear features of the EEG signal were extracted. A Fourier transform of the autocorrelation function was employed to transform the EEG signal from the time domain to the frequency domain and obtain the power spectral density. Four EEG frequency bands (delta 0.5–4 Hz, theta 4–8 Hz, alpha 8–13 Hz, and beta 13–30 Hz) were analyzed in this study. The power spectrum of each frequency band and specific spectral power ratios, such as the alpha/theta power ratio, were computed. The extracted linear features of the EEG were consistent with our preliminary work [23]. Nonlinear features of the EEG, including approximate entropy (ApEn) [26], multiscale entropy (MsEn) [27], and Lempel-Ziv complexity (LZC) [28], were also calculated. The calculation formulas of the EEG features are described in the "Feature extraction and selection" section of the Supplementary Material.
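The band-power features can be sketched as follows. This is an illustration only: it uses a direct periodogram (which equals the Fourier transform of the biased autocorrelation estimate up to scaling) rather than the exact formulas in the Supplementary Material, and the names `periodogram` and `band_powers`, the sampling rate, and the synthetic test signal are assumptions.

```python
import cmath
import math

# Band edges as listed in the text (Hz); each band is treated as [low, high).
BANDS = {"delta": (0.5, 4.0), "theta": (4.0, 8.0),
         "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def periodogram(x, fs):
    """One-sided periodogram via a direct DFT (adequate for short segments)."""
    n = len(x)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)  # scaling cancels in power ratios
    return freqs, power


def band_powers(x, fs):
    """Sum the periodogram over each frequency band."""
    freqs, power = periodogram(x, fs)
    return {name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
            for name, (lo, hi) in BANDS.items()}


# Synthetic signal: a 10 Hz (alpha) component with twice the amplitude
# of a 6 Hz (theta) component, sampled at an assumed 64 Hz for 2 s.
fs, n = 64, 128
signal = [2.0 * math.sin(2 * math.pi * 10 * t / fs)
          + 1.0 * math.sin(2 * math.pi * 6 * t / fs) for t in range(n)]
powers = band_powers(signal, fs)
alpha_theta_ratio = powers["alpha"] / powers["theta"]
```

Because power scales with the square of the amplitude, the synthetic signal gives an alpha/theta ratio of about 4. For real recordings a fast Fourier transform or Welch averaging would normally replace the direct DFT.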

#### 2.3.2. ET Data

ET data were divided into saccade data and gaze data. The association of gazes and saccades with specific regions of the visual stimuli was examined. Then, visual scan parameters such as blink frequency, blink time, fixation time, and sustained attention duration were calculated. The nonlinear features of ET were extracted using LZC.
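As an illustration of the LZC feature, the sketch below counts the phrases in a Lempel-Ziv (1976) parsing of a binarized signal. Binarizing around the median is a common convention and an assumption here, as are the function names; the study's own formulas are given in the Supplementary Material.

```python
from statistics import median


def binarize(signal):
    """Map each sample to '1' if it exceeds the median, else '0'."""
    threshold = median(signal)
    return "".join("1" if v > threshold else "0" for v in signal)


def lempel_ziv_complexity(sequence):
    """Count the phrases in the LZ76 exhaustive parsing of a symbol string."""
    i, phrases, n = 0, 0, len(sequence)
    while i < n:
        length = 1
        # Extend the phrase while it can still be copied from earlier text
        # (the copy may overlap the phrase itself, as in LZ76).
        while i + length <= n and sequence[i:i + length] in sequence[:i + length - 1]:
            length += 1
        phrases += 1
        i += length
    return phrases


bits = binarize([1, 5, 2, 8])                       # hypothetical samples
complexity = lempel_ziv_complexity("0001101001000101")
```

The example string parses as 0 | 001 | 10 | 100 | 1000 | 101, i.e., six phrases. In practice the count is often normalized by the sequence length before it is used as a feature.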

#### 2.3.3. NTB Data

NTB data, which are numerical, included subtest scores, total test scores, and response time. Meaningful numerical features were subsequently converted to z-scores using Z transformation.
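The Z transformation can be sketched as below. Whether the population or the sample standard deviation was used is not stated, so the population form is an assumption, and the function name `z_scores` and the example values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Center values on their mean and scale by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # constant feature: no spread to scale by
    return [(v - mu) / sigma for v in values]


z = z_scores([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])  # mean 5, SD 2
```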

#### *2.4. Feature Selection*

The Minimum Redundancy-Maximum Relevance (MRMR) algorithm was used for feature selection [29]. In the MRMR algorithm, the correlation (relevance) between a feature subset and the target classes is modeled as:

$$\Theta = \frac{1}{|\Omega|} \sum_{m} \sum_{f_l \in \Omega} M(f_l, m) \tag{1}$$

where the feature subset Ω is drawn from the feature set *F* = {*f*<sub>1</sub>, ..., *f*<sub>*D*</sub>}. In this tool, *m* ∈ {+1, −1} represents HC and MCI, respectively, and *M* is the mutual information between a feature and the target classes, which is given by

$$M(X,Y) = \sum_{X} \sum_{Y} p(X,Y) \log_2 \left( \frac{p(X,Y)}{p(X)\,p(Y)} \right) \tag{2}$$

where *p*(*X*) and *p*(*Y*) are the marginal probability distributions and *p*(*X*,*Y*) is the joint probability distribution of the variables *X* and *Y*. Clearly, the mutual information is zero when *p*(*X*,*Y*) = *p*(*X*)*p*(*Y*), which means that the feature is independent of the target classes.
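Equation (2) can be evaluated on discrete data by replacing the probabilities with empirical frequencies, as in the minimal sketch below. It assumes that the features have already been discretized; the function name `mutual_information` and the toy vectors are illustrative only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))      # counts of each (x, y) pair
    px, py = Counter(xs), Counter(ys) # marginal counts
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())


labels = [+1, +1, -1, -1]                                   # +1 = HC, -1 = MCI
mi_independent = mutual_information([0, 1, 0, 1], labels)   # unrelated feature
mi_identical = mutual_information([1, 1, 0, 0], labels)     # fully informative
```

The first feature is independent of the labels and gives zero mutual information, while the second determines the label exactly and gives one bit for two balanced classes.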

The redundancy between the feature *f*<sub>*i*</sub> and the other features can be modeled as:

$$\Delta_{\Omega, f_i} = \frac{1}{|\Omega|^2} \sum_{f_j \in \Omega,\, f_j \neq f_i} M(f_i, f_j) \tag{3}$$

Thus, the feature meeting the minimum redundancy-maximum relevance principle can be obtained via:

$$f_i^* = \underset{f_i \in \Omega}{\operatorname{argmax}} \frac{\Theta}{\Delta_{\Omega, f_i}} \tag{4}$$

In the above equation, the optimal features are obtained by maximizing the correlation between the features and the target classes while minimizing the redundancy among the features. By performing similar operations on different feature subsets, multiple optimal features can be found, which reduces the complexity and improves the decision performance of the algorithm.
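A greedy selection in the spirit of Equation (4) can be sketched as follows. This is a simplified illustration, not the implementation used in this study: it assumes discretized features, averages the redundancy over the features already selected (the usual incremental form of MRMR), and adds a small constant `eps` to avoid division by zero. The names `mrmr_select` and `eps` and the toy data are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in joint.items())


def mrmr_select(features, labels, k, eps=1e-12):
    """Greedily pick k feature indices by the relevance/redundancy quotient."""
    relevance = [mutual_information(col, labels) for col in features]
    # Start with the single most relevant feature.
    selected = [max(range(len(features)), key=lambda i: relevance[i])]

    def score(i):
        # Mean mutual information with the features chosen so far.
        redundancy = sum(mutual_information(features[i], features[j])
                         for j in selected) / len(selected)
        return relevance[i] / (redundancy + eps)

    while len(selected) < min(k, len(features)):
        candidates = [i for i in range(len(features)) if i not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data: feature 1 duplicates feature 0; feature 2 adds new information.
features = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
labels = [-1, -1, -1, +1]
selected = mrmr_select(features, labels, k=2)
```

In the toy data all three features are equally relevant, but the duplicate is fully redundant with the first pick, so the second pick is the feature that is independent of it.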

#### *2.5. Classification*

A support vector machine (SVM) was used as the ML classifier, implemented in Anaconda Spyder 3.7 (Anaconda Inc., Austin, TX, USA). As a classic supervised learning method, SVM has been widely used in statistical classification and regression analysis because it can map input vectors to a higher-dimensional space and construct a maximum-margin hyperplane there, which yields high classification performance. The separating hyperplane is defined as:

$$\mathbf{w}^T \mathbf{x} + b = 0 \tag{5}$$

Support vectors are the training points closest to the hyperplane; they determine its position and orientation, and the SVM selects the hyperplane that maximizes the margin to them. Kernel functions (the "kernel trick") were applied to handle data that are not linearly separable in the original space. The kernel trick implicitly maps the data from the lower-dimensional input space to a higher-dimensional feature space. The amount of information remains the same, but in the higher-dimensional space it becomes possible to construct a linear classifier. The kernel function *K* is evaluated between pairs of points, and these values determine the best-fit hyperplane in the transformed feature space. With a suitable kernel, a precise separation can be obtained.

Dual objective function of the linear SVM classifier:

$$W(\boldsymbol{\alpha}) = -\sum_{i=1}^{l} \alpha_{i} + \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} y_{i} y_{j} \alpha_{i} \alpha_{j} \mathbf{x}_{i}^{T} \mathbf{x}_{j} \tag{6}$$

W is minimized subject to the constraints below, where C is the penalty factor; with the kernel trick, the inner product in Equation (6) is replaced by a kernel function:

$$\sum_{i=1}^{l} y_i \alpha_i = 0, \qquad 0 \le \alpha_i \le C \tag{7}$$
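To make Equations (6) and (7) concrete, the sketch below evaluates the dual objective and checks the constraints for a two-point toy problem. It only evaluates W for given multipliers and does not solve the optimization; the function names, the toy data, and the value C = 10 are assumptions, and the default gamma simply mirrors the value reported later for the RBF kernel.

```python
import math


def linear_kernel(a, b):
    """Plain inner product between two feature vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))


def rbf_kernel(a, b, gamma=0.001):
    """Gaussian (RBF) kernel exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def dual_objective(alphas, xs, ys, kernel=linear_kernel):
    """W(alpha) from Eq. (6), with the inner product replaced by `kernel`."""
    n = len(xs)
    quad = sum(ys[i] * ys[j] * alphas[i] * alphas[j] * kernel(xs[i], xs[j])
               for i in range(n) for j in range(n))
    return -sum(alphas) + 0.5 * quad


def is_feasible(alphas, ys, C, tol=1e-9):
    """Check the equality and box constraints of Eq. (7)."""
    balanced = abs(sum(y * a for y, a in zip(ys, alphas))) <= tol
    boxed = all(-tol <= a <= C + tol for a in alphas)
    return balanced and boxed


# Two one-dimensional points, one per class (toy example).
xs = [(0.0,), (1.0,)]
ys = [+1, -1]
alphas = [2.0, 2.0]
w_linear = dual_objective(alphas, xs, ys)
w_rbf = dual_objective(alphas, xs, ys, kernel=rbf_kernel)
feasible = is_feasible(alphas, ys, C=10.0)
```

For the linear kernel these multipliers correspond to the maximum-margin solution of the toy problem, and the same multipliers become infeasible if C is lowered below 2, which shows how the penalty factor bounds the influence of each training point.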

#### **3. Materials and Methods**

*3.1. Subjects*

We recruited two cohorts for this study. Cohort 1 was composed of 336 subjects from four communities in Jiading district, Shanghai, China, including 152 MCI patients and 184 normal control (NC) subjects. Cohort 2 was composed of 44 MCI patients and 48 NC subjects from one community in Baoshan district, Shanghai, China. All subjects also underwent a battery of cognitive evaluations, including Addenbrooke's Cognitive Examination-Revised (ACE-R) and Montreal Cognitive Assessment-Basic (MoCA-B). Permission to use MoCA-B in the study was obtained via https://www.mocatest.org/permission (accessed on 28 June 2017).

All subjects signed an informed consent before the examinations. This study has been approved by the ethics committee of Long Hua Hospital in Shanghai University of Traditional Chinese Medicine (Ethical number: 2017LCSY345) and conducted in accordance with the principles of the Declaration of Helsinki. In this study, Cohort 1 was used as the training and validation group to train the SVM classifier. Cohort 2 was used as an independent test group to verify the robustness of the classification results.

MCI was defined by the actuarial neuropsychological strategy proposed by Jak and Bondi [30]: subjects were considered to have MCI if they met any of its three criteria and did not meet the criteria for dementia. The inclusion criteria for MCI were as follows [31,32]: (1) right-handed, Mandarin-speaking subjects; (2) a subjective memory complaint; (3) memory impairment relative to age- and education-matched healthy elderly individuals, confirmed by performance on neuropsychological assessments (below 1.5 standard deviations); (4) intact general cognitive function confirmed by MoCA-B scores ≥ 26; (5) intact activities of daily living; and (6) absence of dementia confirmed by a physician.

Exclusion criteria of MCI were as follows: (1) other neurological diseases including cerebrovascular disease, brain trauma, Parkinson's syndrome, brain tumor, and epilepsy; (2) current major psychiatric disease such as severe depression and anxiety; (3) other neurological conditions that could cause cognitive decline (e.g., brain tumors, Parkinson's disease, encephalitis, or epilepsy) rather than AD spectrum disorders; (4) systemic diseases that may lead to cognitive decline (thyroid dysfunction, severe anemia, syphilis, or HIV, etc.); (5) other conditions such as a history of CO poisoning and general anesthesia; (6) severe visual or hearing impairment; (7) contraindication for MRI.

The inclusion criteria for NC included the following: (1) no subjective or informant-reported memory decline; (2) non-clinical depression (Geriatric Depression Scale scores < 6); (3) normal age-adjusted, gender-adjusted, and education-adjusted performance on standardized cognitive tests.

#### *3.2. Data Acquisition*

All data were collected from 1 September 2017 to 31 August 2018 in the communities in Shanghai, China. The data collection protocol is described in the Supplementary Material.

#### *3.3. Validation Experiments for Optimal Parameters of the Classifier*

The hyper-parameters of the SVM classifier, namely the kernel function, the penalty factor C, and the kernel coefficient gamma, were tuned by 5-fold cross-validation to obtain good classification performance. Linear, polynomial, and RBF kernels were compared in this study. Cohort 1 was used to tune these parameters.
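The tuning procedure can be outlined with a small sketch of stratified 5-fold splitting and a hyper-parameter grid. It is an illustration under stated assumptions rather than the code used here: the helper names `stratified_k_fold` and `parameter_grid`, the toy labels, and all grid values other than C = 1.1 and gamma = 0.001 are hypothetical. Libraries such as scikit-learn provide equivalent utilities (`StratifiedKFold`, `GridSearchCV`).

```python
import itertools
import random


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs that keep class proportions per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)      # deal samples of each class round-robin
    for f in range(k):
        val = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, val


def parameter_grid(**options):
    """Expand keyword lists into a list of parameter dictionaries."""
    keys = list(options)
    return [dict(zip(keys, combo))
            for combo in itertools.product(*options.values())]


labels = [1] * 20 + [-1] * 15             # toy labels, not the study cohorts
splits = list(stratified_k_fold(labels, k=5))
grid = parameter_grid(kernel=["linear", "poly", "rbf"],
                      C=[0.1, 1.1, 10.0],
                      gamma=[0.001, 0.01])
```

Each parameter dictionary in `grid` would be scored by training on the `train` indices and evaluating on the `val` indices of every split, and the setting with the best mean validation performance would be retained.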

#### *3.4. Discriminative Analysis*

The classification results of four models were compared using the SVM classifier: (1) the clinical model (clinical variables including age, gender, educational level, MoCA-B, and ACE-R), (2) the single neuropsychological test model (20 subtests of the NTB, shown in the Supplementary Material), (3) the single physiological test model (EEG and ET), and (4) the proposed tool model. We used 5-fold cross-validation to calculate the classification results.

#### *3.5. Statistical Analysis*

Differences in demographic characteristics and cognitive performance between the NC group and the MCI group were evaluated by two-sample *t*-tests, Wilcoxon rank-sum tests, or chi-square (χ<sup>2</sup>) tests using the Statistical Package for the Social Sciences V24 (SPSS Inc., Chicago, IL, USA). The significance level was set at *p* < 0.05. Receiver operating characteristic (ROC) curves were used to evaluate the ability of the tool to distinguish MCI from NC. The areas under the curves (AUCs) with 95% confidence intervals (CIs) were calculated.
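The AUC can be understood as the probability that a randomly chosen MCI subject receives a higher classifier score than a randomly chosen NC subject. The sketch below computes it directly from that definition; it is an illustration, not the SPSS procedure used here, and the function name `roc_auc` and the toy scores are assumptions. Confidence intervals are not covered by this sketch.

```python
def roc_auc(labels, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("both classes are required")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]                 # 1 = MCI, 0 = NC (toy example)
scores = [0.9, 0.4, 0.5, 0.1]         # hypothetical classifier outputs
auc = roc_auc(labels, scores)
```

In the toy example three of the four MCI-NC pairs are ranked correctly, so the AUC is 0.75.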

#### **4. Results**

#### *4.1. Demographic and Clinical Characteristics*

The detailed demographic and clinical characteristics are reported in Table 1. The MoCA-B and ACE-R scores of MCI patients were significantly lower than those of NC subjects (*p* < 0.001, two-sample *t*-test). In Cohort 1, there were no significant differences in age (*p* = 0.875; two-sample *t*-test), gender (*p* = 0.541; chi-square test), or years of education (*p* = 0.071; Wilcoxon rank-sum test). In Cohort 2, there were likewise no significant differences in age (*p* = 0.783; two-sample *t*-test), gender (*p* = 0.492; chi-square test), or years of education (*p* = 0.068; Wilcoxon rank-sum test).

**Table 1.** Demographic and clinical characteristics of subjects.


Note: Data are presented as mean ± standard deviation. \* Indicates a statistical difference between groups, *p* < 0.05; a: the *p* value was obtained by χ2 test, b: the *p* value was obtained by two-sample *t* tests, c: the *p* value was obtained by Wilcoxon rank-sum test. Abbreviations: NC, normal control; MCI, Mild Cognitive Impairment; MoCA-B, Montreal cognitive assessment-basic; ACE-R, Addenbrooke's Cognitive Examination Revised.

#### *4.2. Validation Experiments for Optimal Parameters of Classifier*

The best classification performance was obtained with the RBF kernel and the parameters C = 1.1 and gamma = 0.001. Table 2 shows the detailed performance of different kernel functions and corresponding parameters.

**Table 2.** The optimized hyper-parameters of SVM in test dataset.


C represents the regularization coefficient, gamma represents the kernel function coefficient, and AUC represents the area under the ROC curve. Values in bold are the optimal values of each column, and all values are the mean and standard deviation over 5-fold cross-validation.

#### *4.3. Discriminative Analysis*

Tables 3 and 4 show the comparison results of the four models in Cohort 1 and Cohort 2, respectively. In Cohort 1, the proposed tool performed better than the other models (Accuracy: 84.5 ± 4.43%; Sensitivity: 81.9 ± 7.88%; Specificity: 86.8 ± 6.19%; AUC: 0.942 (0.893–0.982)). In Cohort 2, the proposed tool also performed better than the other models (Accuracy: 88.8 ± 3.59%; Sensitivity: 86.2 ± 6.46%; Specificity: 91.0 ± 5.39%; AUC: 0.966 (0.921–0.988)). Figures 2 and 3 show the ROC results of the four models in both cohorts.
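For reference, accuracy, sensitivity, and specificity can be derived from predicted and true labels as in the sketch below. MCI is treated as the positive class, which is an assumption about the convention used; the function name `screening_metrics` and the toy predictions are hypothetical and unrelated to the reported results. Both classes are assumed to be present.

```python
def screening_metrics(y_true, y_pred, positive="MCI"):
    """Accuracy, sensitivity, and specificity for a binary screening task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # share of MCI subjects detected
        "specificity": tn / (tn + fp),   # share of NC subjects correctly cleared
    }


y_true = ["MCI"] * 4 + ["NC"] * 6
y_pred = ["MCI", "MCI", "MCI", "NC", "NC", "NC", "NC", "NC", "NC", "MCI"]
metrics = screening_metrics(y_true, y_pred)
```

Computing these metrics on each validation fold and then taking the mean and standard deviation gives values in the "mean ± SD" form used in the tables.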

**Table 3.** The classification results of four models in cohort 1.



**Table 4.** The classification results of four models in cohort 2.

**Figure 2.** The receiver operating curves of four models in cohort 1.

**Figure 3.** The receiver operating curves of four models in cohort 2.

#### **5. Discussion**

Cognitive decline remains highly underdiagnosed in the community despite extensive efforts to find novel approaches to detect MCI, and objective screening methods for cognitive decline could improve early MCI diagnosis. MCI screening in the community has therefore become a topic of active research. In light of their excellent performance in detecting cognitive decline in MCI patients, multimodal detection approaches have been commonly used in computer-aided diagnosis for community screening. In this study, we proposed an ML model based on EEG, eye movement, and neuropsychological tests for MCI screening at the community level. Our model outperformed traditional single-modality models, such as the EEG-based, ET-based, and NTB-based models.

To date, many studies have focused on the classification of NC and MCI using machine learning models for screening in primary care. For instance, Siuly et al. applied a Piecewise Aggregate Approximation (PAA) technique to compress massive volumes of EEG data for reliable analysis and developed a model based on the Extreme Learning Machine (ELM) with permutation entropy (PE) and auto-regressive (AR) model features, achieving the highest MCI classification accuracy (98.8%) [33]. Lagun et al. applied an SVM-based machine learning model that reached an accuracy of 87% in detecting MCI by modeling eye movement characteristics such as fixations, saccades, and refixations during the Visual Paired Comparison (VPC) task [34]. Yim et al. developed a screening model based on a gradient boosting (GB) algorithm to identify MCI from neuropsychological test results and reached a classification accuracy of 93.5% [15]. Wang et al. developed a Random Forest (RF)-based model to optimize the content of cognitive evaluation and achieved an accuracy of 68% in the classification of MCI and NC [35].

Notably, our classification results were similar to previous studies, indicating the reliability of our results. As shown in Table 5, although previous studies based on EEG analysis performed powerful discrimination for MCI detection (ACC = 98.8% in Siuly's model), it is worth noting that these studies based on expensive and long-term physiological signal collection devices are seldom used in primary care. By contrast, the wearable EEG device used in our approach was more suitable for large-scale MCI screening. In contrast to earlier studies based on ET and NTB, our method achieved better accuracy. Additionally, the advantages of our method were also summarized as follows:

**Table 5.** The performance of analogous MCI detection methods in the literature.



Although our proposed method achieved good classification of MCI and NC, several limitations remain. First, the whole experiment is time-consuming, which reduces the degree of completion and cooperation of patients. Second, the de-noising algorithm may influence the results of feature extraction and classification. Third, the sample size of NC and MCI individuals was limited, and a larger sample should be considered in future studies. Fourth, longitudinal imaging studies are still absent; in subsequent research, ongoing follow-up observational studies of individuals will facilitate the investigation and validation of our results. Finally, only SVM was used as the classifier in this study. Alternative classifiers, such as extreme learning machines or deep learning models, might yield better classification results.

#### **6. Conclusions**

In this study, an automatic and non-invasive MCI detection model was proposed, which integrated EEG, eye movement techniques, and a neuropsychological test battery. The results indicate its potential for MCI detection and for guiding referral to a more comprehensive evaluation, ultimately facilitating early intervention in primary care.

**Supplementary Materials:** The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/brainsci12091149/s1. Figure S1. Data collection in this study. Table S1. Neuropsychological tests used in this study.

**Author Contributions:** Conceptualization, Z.Y. (Zhuangzhi Yan) and J.J. (Jiehui Jiang); methodology, J.J. (Juanjuan Jiang); software, J.Z.; validation, C.L.; formal analysis, J.J. (Juanjuan Jiang); investigation, C.L.; data curation, J.Z.; writing—original draft preparation, J.J. (Juanjuan Jiang); writing—review and editing, J.J. (Juanjuan Jiang); supervision, Z.Y. (Zhihua Yu); project administration, Z.Y. (Zhihua Yu). All authors have read and agreed to the published version of the manuscript.

**Funding:** This research was funded by Science and Technology Innovation 2030 Major Projects (2022ZD0211600), National Natural Science Foundation of China (Grant 82020108013), and Research project of Shanghai Health Commission (2020YJZX0111).

**Institutional Review Board Statement:** All procedures performed in studies involving human participants were in accordance with the ethical standards of either the institutional or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The study was approved by the Ethics Committee of Long Hua Hospital in Shanghai University of Traditional Chinese Medicine (Ethical number: 2017LCSY345).

**Informed Consent Statement:** Informed consent was obtained from all subjects involved in the study. Written informed consent has been obtained from the subjects to publish this paper.

**Data Availability Statement:** The data that support the findings of this study are available from the corresponding author.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

