1. Introduction
The human brain can be viewed as a complex network with an enormous number of locally segregated structural regions; although each region is dedicated to different functions, together they maintain global functional communication among different cognitive resources. One of the most important non-invasive approaches for measuring brain functional connectivity (FC) is functional magnetic resonance imaging (fMRI), which reflects changes in the blood oxygen level-dependent (BOLD) signal [1]. As one of the major advancements in recent fMRI data analyses, functional connectivity is used to measure the temporal dependency of neuronal activation patterns in different brain regions and the communication between these regions [2]. Traditional FC analysis was based on specific experimental paradigms or the resting state; recent studies have shown that naturalistic stimuli, which form ecologically valid paradigms and approximate real life, could improve participant compliance [3] and hence increase test-retest reliability [4]. Indeed, functional connectivity with high ecological validity assessed through naturalistic stimuli has been found to be more reliable than that assessed in the resting state [5]. Additionally, during exposure to such naturalistic stimuli, the processing of sensory information would depend on the topological structure of the network, especially its hierarchical and modular connections [6].
Many neuroimaging studies have shown that the relationship between biological function and cognitive function can be established using certain statistical measurements (e.g., Pearson correlation). However, statistical methods (e.g., parametric methods) tend to over-fit the data and yield quantitatively inflated certainty in the statistical estimates, while failing to generalize to novel data [7]. Furthermore, they may be impaired in high-dimensional settings (e.g., FC) [7]. On the other hand, machine learning methods with well-established processing standards could extract biological patterns and support individual-level prediction simultaneously from neuroimaging data [8]. By further integrating FC analysis into the machine learning framework, a data-driven approach named connectome-based predictive modeling (CPM) could even predict individual differences in traits and behaviours [9]. Coupled with the alerting score method, Rosenberg et al. found that CPM could predict sustained attention ability using resting-state fMRI data; this finding offers new insight into the relationship between FC and cognitive ability [10]. In predicting fluid intelligence, Abigail et al. found that a specific task-based predictive model outperformed the resting-state-based model; this revealed that identifying the brain patterns in a given group could provide a unique brain-fluid intelligence relationship [11].
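As a rough illustration of the Pearson-correlation FC features that CPM-style models start from, the sketch below builds an FC matrix from region-wise time series and vectorizes its unique edges. The function names, array shapes and synthetic data are assumptions made for illustration; they are not taken from the cited pipelines.

```python
import numpy as np

def pearson_fc(ts):
    """Pearson FC from time series of shape (n_timepoints, n_regions)."""
    # rowvar=False treats each column (region) as one variable.
    return np.corrcoef(ts, rowvar=False)

def upper_triangle(fc):
    """Vectorize the unique off-diagonal edges of a symmetric FC matrix."""
    i, j = np.triu_indices(fc.shape[0], k=1)
    return fc[i, j]

# Synthetic stand-in for BOLD signals: 200 time points, 10 regions (assumed sizes).
rng = np.random.default_rng(0)
ts = rng.standard_normal((200, 10))

fc = pearson_fc(ts)          # (10, 10) symmetric matrix with unit diagonal
edges = upper_triangle(fc)   # 10 * 9 / 2 = 45 edge features per subject
```

Stacking one such edge vector per subject gives the high-dimensional feature matrix on which the prediction models discussed here would operate.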
Using machine learning techniques, the physiologically important representations buried within fMRI data could also be uncovered and captured [12]. For example, using deep learning and fMRI, Plis et al. found that deep nets could sift out and retain the latent relations and biological patterns in neuroimaging data [13]. These studies indicate that deep neural nets could be used not only to infer the presence of brain-behavior relationships (e.g., between FC and human behavior) and provide new representations to explain the neural mechanisms, but also as a fingerprint to translate neuroimaging findings into practical utility [14]. However, traditional machine learning approaches based on a single model were limited in generalization and performance [9]. Previous studies have demonstrated that ensemble learning, proposed by Breiman et al. [15], integrates bootstrap sampling with multiple classifiers to improve generalization. In addition, the overfitting issue could also be reduced by using ensemble learning [16]. Inspired by the fact that brain networks are hierarchical, with information processed in different layers [6], combining a hierarchical structure with ensemble learning could be an effective way to improve model performance and extract biological information from data.
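The bootstrap-based ensemble idea mentioned above can be sketched as follows. This is a minimal bagging example under stated assumptions: the base learner is a closed-form ridge regression, and the helper names, hyperparameters and synthetic data are all hypothetical rather than drawn from the cited work.

```python
import numpy as np

def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept is handled by centering."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w

def bagged_predict(X_train, y_train, X_test, n_models=25, alpha=1.0, seed=0):
    """Average the predictions of ridge models fit on bootstrap replicates."""
    rng = np.random.default_rng(seed)
    n = X_train.shape[0]
    preds = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        w, b = fit_ridge(X_train[idx], y_train[idx], alpha)
        preds.append(X_test @ w + b)
    return np.mean(preds, axis=0)

# Synthetic regression problem (assumed): 120 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.standard_normal(120)
y_hat = bagged_predict(X[:80], y[:80], X[80:])
```

Averaging over bootstrap replicates mainly reduces the variance of the prediction, which is the mechanism usually credited for the improved generalization of bagged models.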
In this study, we propose a new hierarchical machine learning structure to predict fluid intelligence (which reflects basic cognitive ability), using the biological patterns extracted by examining naturalistic functional connectivity. A new regression method based on machine learning and graph theory, namely weighted ensemble model and network analysis (WENA), has been developed for this prediction problem. In contrast to the traditional CPM, we used a self-supervised learning method, the auto-encoder (AE), to extract non-linear and deep information from the functional connectivity measurements and the graph theory indices based on fMRI data. To further boost the prediction performance, we also propose a novel approach, namely weighted stacking (WS), which comprises a multi-layer stacking structure for WENA to improve the effectiveness of model fusion. The comparative analysis showed that the proposed method outperforms the state-of-the-art methods reported. The results also revealed a coherence between biological fluid intelligence and its reflection in neuroimaging using the proposed data-driven approach.
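To make the idea of weighted model fusion concrete, the sketch below blends several base regressors with weights derived from their out-of-fold error. It is only a single-layer illustration under assumptions: the inverse-MAE weighting rule, the choice of base models, the hyperparameters and the synthetic data are all hypothetical, and the multi-layer WS structure of WENA is not reproduced here because this section does not specify it.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVR

def weighted_stack_fit_predict(X_train, y_train, X_test, cv=5):
    """Blend base regressors using inverse-MAE weights (an assumed rule)."""
    base_models = [
        Ridge(alpha=1.0),
        ExtraTreesRegressor(n_estimators=50, random_state=0),
        SVR(C=1.0),
    ]
    # Out-of-fold predictions: each sample is predicted by a model
    # that did not see it during fitting, so the weights are not leaked.
    oof = np.column_stack(
        [cross_val_predict(m, X_train, y_train, cv=cv) for m in base_models]
    )
    maes = np.array(
        [mean_absolute_error(y_train, oof[:, k]) for k in range(oof.shape[1])]
    )
    inv = 1.0 / (maes + 1e-12)      # lower error -> larger weight
    weights = inv / inv.sum()       # normalize so the weights sum to 1
    # Refit every base model on all training data, then blend on the test set.
    test_preds = np.column_stack(
        [m.fit(X_train, y_train).predict(X_test) for m in base_models]
    )
    return test_preds @ weights, weights

# Synthetic data (assumed): 120 samples, 8 features, linear signal plus noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((120, 8))
y = X @ rng.standard_normal(8) + 0.3 * rng.standard_normal(120)
y_hat, w = weighted_stack_fit_predict(X[:90], y[:90], X[90:])
```

A multi-layer variant would feed the out-of-fold predictions of one layer into the next as features; how WENA weights and chains its layers should be taken from the method description rather than from this sketch.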
4. Experiment Results
We compared the performance of the WENA method across different weighted stacking models and FC construction methods. Table 2 illustrated that the proposed WENA achieved the best performance for fluid intelligence prediction across the three functional connectivity construction methods. MI-based features obtained the highest performance, with an MAE of 3.85, an R value of 0.66 and an R² value of 0.42. The best FIS prediction for each network construction method was shown in Figure 3. Furthermore, conventional stacking structures and feature engineering methods were compared with the proposed WENA method based on AE features. Table 3 showed that the conventional stacking model based on SVR achieved the best performance (MAE of 4.25, R value of 0.53 and R² of 0.26), while the PC network construction method with the basic SVR model achieved the best MAE, with a value of 4.20 (R value of 0.53 and R² of 0.28). Using conventional feature engineering methods with the MI network construction method, WENA achieved the following performance: an MAE of 4.12, an R value of 0.58 and an R² value of 0.33 for PCA; and an MAE of 4.77, an R value of 0.32 and an R² value of 0.10 for ICA.
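For readers who want to reproduce the three reported metrics, a minimal sketch is given below. It assumes that MAE is the mean absolute error, R is the Pearson correlation between predicted and observed scores, and R² is the coefficient of determination; this section does not define R², so that last definition is an assumption, as are the function name and the toy values.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (MAE, Pearson R, R^2) for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mae = np.mean(np.abs(y_true - y_pred))
    r = np.corrcoef(y_true, y_pred)[0, 1]
    ss_res = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # assumed definition of R^2
    return mae, r, r2

# Toy values (hypothetical, not from the study).
mae, r, r2 = regression_metrics([1, 2, 3, 4], [1.5, 2.0, 2.5, 4.5])
# mae = 0.375 and r2 = 0.85 for these toy values
```

Under this definition R² can be lower than the squared Pearson correlation, because it also penalizes systematic bias in the predictions.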
Stacking layers and model fusion strategies were varied to test the robustness of the proposed WENA. Figure 4 showed that the number of stacking layers could affect the performance of WENA, and that the three-layer structure was optimal. Additionally, Table 3 showed that the proposed WENA method outperformed both conventional stacking models without the WS structure and single basic regression models without a stacking structure. Furthermore, both Figure 4 and Table 3 revealed that the proposed WENA method was robust to different FC construction methods. Figure 5 showed that WENA including the ETR and RR models outperformed WENA integrated with other regression models, including SVR and ELM models.
A significant correlation was observed between age and FIS (R = 0.65, p < 0.001). There were also significant correlations between the network AE features and age in the FC pattern (R = −0.34, p < 0.001), BC pattern (R = 0.59, p < 0.001) and LE pattern (R = 0.46, p < 0.001), while no significant relationship was found between the other graph theory indices (DC and RS) and age. The most discriminative age-related FC and network-property patterns were visualized via the AE, together with the important ROIs extracted by WENA (shown in Figure 4 and Figure 6, Table 4, Table 5 and Table 6). These results revealed that the main biological patterns extracted by WENA involved the sensorimotor network, cingulo-opercular network, occipital network and cerebellum network.
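Correlations of this kind, between an extracted feature and age, can be tested as in the sketch below. The data are synthetic and the variable names are hypothetical; the example only shows how an R value and its p value would be obtained with SciPy, not how the reported values were computed.

```python
import numpy as np
from scipy.stats import pearsonr

# Synthetic cohort (assumed): 200 subjects aged 18 to 88.
rng = np.random.default_rng(0)
age = rng.uniform(18, 88, size=200)

# Hypothetical AE feature that increases with age, plus unit-variance noise.
ae_feature = 0.05 * age + rng.standard_normal(200)

# pearsonr returns the correlation coefficient and a two-sided p value.
r, p = pearsonr(ae_feature, age)
significant = p < 0.001
```

When several features are tested against age, as with the FC, BC, LE, DC and RS patterns, a correction for multiple comparisons would normally be considered.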
5. Discussion
In this study, we have developed a new regression method based on machine learning and graph theory, namely WENA, to extract biological patterns from functional connectivity and predict fluid intelligence effectively. The results indicate that (a) the proposed method outperforms state-of-the-art reports; (b) the proposed framework is robust to different network construction methods and variables; and (c) the patterns extracted using this method have interesting biological interpretations. These patterns were significantly related to age and may stem from the sensorimotor network, cingulo-opercular network, occipital network and cerebellum network.
The proposed WENA architecture also outperforms other traditional methods in terms of FIS prediction (shown in Table 2, Table 3 and Table 4). In particular, ensemble learning models (including bagging, stacking and boosting), which consist of several single machine learning models [30], outperformed the single machine learning model. A single machine learning algorithm is limited in generalization and performance [9], while the performance of ensemble learning can be improved by using bootstrap replicates, and bagging can be further improved via stacking [31]. Unlike deep learning, which risks overfitting and poor generalization [32], ensemble learning integrates bootstrap samples and multiple classifiers, which can enhance model generalization and reduce overfitting [15, 33].
The proposed WENA based on WS methods and model fusion also outperformed traditional stacking methods (see Table 3). The proposed method was based on a self-supervised learning mechanism (AE), which could extract non-linear features and principal modes from FC data across a population [34]. WENA based on WS was also found to outperform WENA based on principal component analysis (PCA) and independent component analysis (ICA) (see Table 4). As traditional approaches in neuroscience, PCA and ICA both extract linear features, and the performance based on PCA and ICA features was influenced by their unsupervised dimension reduction nature [35]. By contrast, the AE could represent high-level features and abstract low-level features (e.g., cerebrospinal fluid, cortical thickness and gray matter tissue volume) from neuroimaging data, and could also create a general latent feature representation and improve performance [12, 36]. For example, via the AE and fMRI, Suk et al. extracted nonlinear hidden features from neuroimaging data and improved diagnostic accuracy [36].
However, it should be noted that several network construction methods were used and compared in this study (shown in Table 1), and our results showed that the performance of machine learning was affected by the FC construction method (shown in Table 1 and Table S1). For example, while WENA was robust to network construction methods in improving the performance of FIS prediction, the number of stacking layers and the regression methods could affect its performance (see Figure 4 and Figure 7). Overall, our results revealed that the proposed WENA model achieved the best regression accuracy on FC constructed via MI methods (MAE = 3.85, R = 0.66, R² = 0.42). Furthermore, the proposed WENA was better than other conventional methods and the state of the art (shown in Table 4).
The proposed WENA method not only achieved improvements in the prediction of fluid intelligence from neuroimaging data, but was also able to decode the biological age-related patterns from the naturalistic fMRI data (shown in Table 3). Fluid intelligence, as a highly age-related cognitive trait, could offer objective evidence for understanding naturalistic neuroimaging data in the context of ageing. For example, fluid intelligence, the ability to think and solve problems in situations with limited knowledge [37], tends to decline with ageing due to reductions in the executive function of the prefrontal cortex [38]. In our study, FIS was positively correlated with age, and the extracted AE features were negatively related to age (p < 0.05). Furthermore, the functional networks extracted via the AE spatial filter were the sensorimotor network, cingulo-opercular network, occipital network and cerebellum network. Specifically, AE features corresponding to the sensorimotor network and the cerebellum network were significantly positively correlated with age, which suggested compensation for the existing age-related decline in motor function [39]. In line with our study, increased sensorimotor and cerebellum functional connectivity has been found in elders, supporting the previous report of increased interactions across networks with ageing [40]. Similarly, AE features corresponding to the cingulo-opercular network and occipital network were significantly negatively associated with age, also in line with previous studies [41]. Previous studies have also shown that the sensorimotor network is associated with sensory processing and the occipital network is related to visual preprocessing [41]. Additionally, the cingulo-opercular network, also referred to as the salience network, decreased with age, which was a neural factor affecting visual processing speed [42]. These brain functions were closely related to the movie-watching experience and ageing, as well as to fluid intelligence. Therefore, these studies supported the view that our methods could decode biological patterns, and revealed that network patterns consisting of the sensorimotor, cingulo-opercular, occipital and cerebellum networks contributed to the prediction of fluid intelligence as well as to the ageing problem.
However, several limitations should be noted. Firstly, the WENA model was unable to clearly reflect the quantitative relationship between age, functional connectivity and fluid intelligence. Secondly, the robustness of the proposed method should be further tested using samples from other sources. Finally, overfitting to the training dataset should be carefully considered, although ensemble learning could reduce it to some degree.