Transfer Learning from Healthy to Unhealthy Patients for the Automated Classification of Functional Brain Networks in fMRI

Ismaila, Lukman E.; Rasti, Pejman; Bernard, Florian; Labriffe, Mathieu; Menei, Philippe; Minassian, Aram Ter; Rousseau, David; Lemée, Jean-Michel

doi:10.3390/app12146925


by Lukman E. Ismaila ¹, Pejman Rasti ¹,², Florian Bernard ³,⁴, Mathieu Labriffe ⁵, Philippe Menei ³,⁶, Aram Ter Minassian ¹, David Rousseau ¹,* and Jean-Michel Lemée ³,⁶

¹ Laboratoire Angevin de Recherche en Ingénierie des Systèmes (LARIS), UMR INRAe, IRHS, Université d’Angers, 62 Avenue Notre Dame du Lac, 49000 Angers, France

² Centre d’Etudes et de Recherche pour l’Aide à la Décision (CERADE), École D’ingénieur Informatique et Prévention des Risques (ESAIP), 49124 Angers, France

³ Service de Neurochirurgie, CHU d’Angers, 49100 Angers, France

⁴ Laboratoire d’Anatomie, Faculté de Médecine d’Angers, 49045 Angers, France

⁵ Département de Radiologie, CHU d’Angers, 49933 Angers, France

⁶ GLIAD, CRCINA, INSERM, Université de Nantes, Université d’Angers, 49000 Angers, France

* Author to whom correspondence should be addressed.

Appl. Sci. 2022, 12(14), 6925; https://doi.org/10.3390/app12146925

Submission received: 3 June 2022 / Revised: 2 July 2022 / Accepted: 6 July 2022 / Published: 8 July 2022

(This article belongs to the Section Applied Neuroscience and Neural Engineering)


Abstract: Functional Magnetic Resonance Imaging (fMRI) is an essential tool for the pre-surgical planning of brain tumor removal, as it allows the identification of functional brain networks in order to preserve the patient’s neurological functions. One fMRI technique used to identify functional brain networks is resting-state fMRI (rs-fMRI). This technique is not routinely available because it requires an expert reviewer who can manually identify each functional network. The lack of sufficient unhealthy data has so far hindered a data-driven approach based on machine learning tools for full automation of this clinical task. In this article, we investigate the possibility of such an approach via transfer learning from healthy control data to unhealthy patient data to boost the detection of functional brain networks in rs-fMRI data. The end-to-end deep learning model implemented in this article distinguishes seven principal functional brain networks using fMRI images. The best performance, a 75% correct recognition rate, is obtained with the proposed deep learning architecture, which outperforms the other machine learning algorithms tested on this classification task. Based on this best reference model, we demonstrate the possibility of boosting the results of our algorithm with transfer learning from healthy subjects to unhealthy patients. This application of transfer learning opens interesting possibilities because healthy control subjects can be easily enrolled for fMRI data acquisition, since it is non-invasive. Consequently, this process helps to compensate for the usually small cohorts of unhealthy patient data. This transfer learning approach could be extended to other medical imaging modalities and pathologies.

Keywords: deep learning; transfer learning; resting-state fMRI; functional brain networks; image classification

1. Introduction

Medical imaging is one of the most investigated use cases for machine learning in healthcare [1]. While effort remains consistent in developing and improving algorithms, data availability is crucial for deploying efficient machine learning solutions [2]. The recent COVID-19 pandemic has demonstrated, for instance, how the availability of a large annotated dataset could significantly boost the power of machine learning [3]. However, in most clinical practices, such an initiative to share a large dataset is still limited.

The machine learning community has developed several workaround approaches to compensate for the lack of data. This compensation can be obtained using algorithms that learn from fewer examples, such as few-shot learning approaches [4]. The lack of data can also be compensated by automatically generating synthetic data that are realistic enough to boost the training of algorithms. This includes the generation of synthetic data via simulators [5], generative models [6] or data augmentation [7]. The last approach, known as transfer learning, uses models pre-trained on similar datasets [8]. In this article, we focus on this transfer learning approach.

While largely used in computer vision, transfer learning is still actively investigated (especially in the medical domain [9,10]). One of the most common transfer learning approaches in computer vision is using pre-trained models from 2D color outdoor natural images. However, for specific application domains, such as medical imaging, this approach is neither optimal nor possible, due to the difference in data structure between medical images and 2D color natural images (3D images instead of 2D images, size of images, bit depth of images, etc.) [11]. In addition, the efficiency of transfer learning has been shown to be optimal when images share similar content [12]. Lastly, transfer learning helps when the data used for the pre-training cost less by comparison than the images from the target domain. This analysis brings us to the simple and yet innovative idea of investigating the value of a pre-trained model on healthy subjects’ data when transferred to unhealthy patients. We propose testing and illustrating this transfer learning idea in a specific clinical use case. The rationale for the selection of this use case is: (i) we need a use case for which very few public datasets are indeed currently available, (ii) we need a use case for which the impact of the disease on the image is not too large to minimize the shift between healthy and unhealthy data, and (iii) we need a use case for which imaging modality is non-invasive so that the acquisition of images from healthy controls is relatively easy. This brought us to the selection of the clinical use case presented in the next section to test our original idea of transfer learning from healthy subjects’ data to unhealthy patient data.

Clinical Use Case

Functional MRI (fMRI) is a method that eases the understanding of brain activation by analyzing the blood-oxygen-level-dependent (BOLD) signal, allowing the identification and localization of functional brain areas. The development of this technique promotes a better understanding of the functional anatomy of the human brain and a more accurate characterization of the inter-individual topographical variability in functional brain areas, such as language areas [13]. Thus, some fMRI techniques are progressively included as a procedure in several pathologies for surgical planning [14,15,16,17].

The standard fMRI approach is a task-based block paradigm contrasting brain activation at rest and when performing a specific task. However, despite its usefulness, this technique presents several drawbacks and inconsistencies [18]: the patient’s cooperation is needed, and it is unsuitable for young children and patients unable to perform the task. In addition, the study of several functional networks is time-consuming, since each network requires its own acquisition and the development of a specific activation task paradigm [19].

An alternative for the task-based characterization of functional networks is the resting-state fMRI (rs-fMRI), which studies the synchronization of low-frequency oscillation between brain areas at rest [20,21]. It is possible and practical to identify from these signals the so-called Intrinsic Connectivity Networks (ICNs), which reflect the neuro-anatomical substrate that corresponds to the brain’s functional networks [22,23]. However, rs-fMRI for functional network identification is not yet part of the pre-operative routine because of the high level of expertise needed for ICN identification. Indeed, each of the ICNs needs to be visually reviewed by an expert to identify an individual functional network of interest [24]. To broaden the use of this technique in the pre-surgical planning for various surgical procedures, the initial stage consists of the effective automation of fMRI brain network identification in patients’ data.

In the literature, automated machine learning algorithms have been the subject of several studies to identify disease patterns in rs-fMRI data, especially in epilepsy [25,26], as well as traumatic brain injuries [27], addiction [28], cognitive impairment [29], and psychiatric disorders such as depression and schizophrenia [26,28]. There have been relatively few attempts [22,24,30] to automatically identify functional networks on rs-fMRI data using machine learning. Each of these studies has its own defined scope, data variants, and functional networks used for automated identification in rs-fMRI for pre-surgical planning. Lu et al. [30] developed an instance-based automated method for identifying language networks in brain tumor subjects using independent component analysis (ICA)-based mapping on rs-fMRI. By contrast, our approach is data-driven and is not limited to language networks: our study considers seven functional networks. In [24], the authors proposed a task-free paradigm for acquiring fMRI data, which was less demanding for patients and easy to administer. Further investigation was carried out on right-handed healthy control subjects, and a semi-automated language component identification procedure was proposed and tested on healthy subjects [24]. In this article, we consider unhealthy patients in addition to healthy subject data. In the study by [22], a model was trained to identify the main functional networks in a small number of healthy volunteers for different functional networks. The performance of the simple feed-forward network proposed in [22] is ultimately dependent on handcrafted features extracted from fMRI images. The above concerns motivated our proposal to design a specific end-to-end deep learning [31,32] knowledge transfer method to identify and automate the detection of functional networks in the rs-fMRI of unhealthy patients. This approach has the advantage of being applicable to patients in need of brain surgery due to brain tumors or other reasons.

While efforts are being made to provide more public datasets of medical images of broad interest, there are currently still few available public datasets of resting-state fMRI of healthy or unhealthy individuals [33,34]. Moreover, these datasets have been produced with protocols slightly different from ours. The differences include the type of disease, the number of participants, and, in some cases, the MRI sequence. They would prevent the transfer learning approach on our dataset. Other related datasets include the database of [35], which comprises 227 healthy individuals aged 18 to 74 and was built to investigate the impact of adult age on functional brain connectivity, and the database of [36], which includes 993 patients and 1421 healthy individuals and was built to classify psychiatric disorders. We investigate patients with brain tumors, so these datasets would also not allow a direct transfer learning approach from healthy to unhealthy on our data. The situation of clinical interest considered in this study is thus well suited to test the possibility of transferring knowledge from healthy subjects to unhealthy patients.

As innovative elements, we (i) automatically identify functional networks on rs-fMRI data for the first time with an end-to-end deep learning method, as opposed to the handcrafted features previously proposed in the closest literature for this problem [22], and (ii) demonstrate the value of transfer learning from a model of healthy control subjects to unhealthy patients with a brain tumor.

2. Materials and Methods

2.1. Database

We obtained data from 81 healthy subjects and 55 unhealthy patients. While healthy data were acquired from regular volunteers, unhealthy data were obtained from patients with brain tumors with a specific lesion region, as indicated by the provided binary lesion mask. A detailed description of the unhealthy population is provided in [13]. This is a single-center, prospective, open-label trial, in compliance with regulation and ethical guidelines for clinical research, approved by the local ethics committee (Comité de protection des personnes Ouest II, decision reference CPP 2012-25). A total of 81 healthy volunteers (36 females and 45 males) aged from 23 to 38 years old were included and signed written informed consent. Fifty-five adult patients with a brain lesion treated in the Department of Neurosurgery of the university hospital of Angers underwent a preoperative fMRI language mapping with both rs-fMRI and task fMRI, as well as a perioperative cortical mapping of eloquent brain language areas in awake condition. All subjects gave their written, informed consent before enrolling in this study.

For all healthy and unhealthy data, we extracted 55 independent components (ICs) using ICA, with a specific interest in seven brain features. One of the main difficulties with independent component analysis in resting-state fMRI is the determination of the total number of components (TNC) to be used: a low TNC may lead to suboptimal decompositions in which multiple networks are merged, while a high TNC may fragment a functional network into multiple components [37,38]. Our choice to analyze 55 ICs for all patients was based on previous works and appeared to be a good compromise to identify functional brain networks [23,39].

These brain features correspond to seven biological networks of the brain: the Language Network (LANG), Salience Network (SAL), Ventral Attention Network (VAN), Default Mode Network (DMN), Left Fronto-parietal Control Network (lFPCN), Right Fronto-parietal Control Network (rFPCN), and Dorsal Attention Network (DAN). The seven selected brain features represent the main ICNs identified and described in the resting-state fMRI literature. These particular networks were selected, with the DMN serving as a control for the others, because of the inter-individual variability that makes them difficult to identify using detection software or by non-expert reviewers. These connectivity networks correspond to known functional networks that support cognitive functions and have been used for pre-surgical planning [38,40]. The connectivity networks were also found to be consistent between rs-fMRI and various fMRI data acquisition and analysis techniques [41]. Functional networks without anatomical variability, such as the motor, sensory, or visual cortex, were not considered for algorithm training and automated identification because of their consistent anatomical location.

Image labels for each healthy and unhealthy data file marked by domain experts were used to assign each image to its respective network class. In addition to the two variants of network images provided for both healthy and unhealthy, unhealthy data include details of the brain tumor as described in Table 1 and shown in Figure 1.

2.2. Data Acquisitions and Preprocessing

All fMRI acquisitions were performed using a 3 Tesla MRI (Magnetom Skyra, Siemens medical systems, Erlangen, Germany) with slice thickness of 4 mm, which yielded a voxel size of 3 × 3 × 4 mm³ and consequently a 3-dimensional image of 42 px × 51 px × 34 channels. The fMRI sequences were acquired for each patient in the following order: an anatomical 3D T1, one resting-state acquisition, and two task-induced activities. All patients and healthy volunteers enrolled did not have language impairment at the moment of the fMRI acquisition and during the surgical procedure. The first three volumes acquired in each sequence were discarded to allow the stabilization of the magnetic field gradients.

Data preprocessing was performed using MATLAB (The MathWorks, Natick, MA, USA) with the Anatomy, SPM8, and VBM 12 toolboxes. The preprocessing of fMRI data comprised the following steps: slice-timing correction, realignment to the first volume of the first session, and unwarping to correct head movements and magnetic distortions. Images were then segmented and normalized to the Montreal Neurological Institute template [42]. The rs-fMRI data of each patient were decomposed into 55 spatial independent components (ICs) through an intrinsic connectivity network spatial independent component analysis (SICA) approach employing a customized version of the infomax algorithm running under MATLAB [18,43]. ICs correspond to 3D fMRI activation volumes of brain areas with spontaneous synchronous activity.

The identification of reference fMRI brain networks was performed manually for each subject by two independent and experienced reviewers (J.-M.L. and A.T.M.) without any disagreement. Based on fMRI activation peaks and spatial distribution of these activations, we selected seven main networks of DMN, LANG, VAN, lFPCN, rFPCN, SAL, and DAN among the 55 generated ICs for each patient. The annotated images were used in two versions: full images (connectivity map) and corresponding thresholded images. Figure 2 shows an image sample of networks. Individual spatial components were thresholded at z = 2 at the cluster level, corresponding to the 5% most activated voxels in each intrinsic connectivity network. This methodology is consistent with the literature and allowed us to overcome the background activation noise to identify the anatomical location of specifically activated brain areas [44].

2.3. Identification of Functional Networks through Machine Learning Algorithms

Among the 55 ICs identified using the SICA approach, only a few correspond to functional networks. Several ICs were, in fact, background noise, characterized by a low number of activated voxels. Functional networks generally comprise between 1200 and 3000 activated voxels. In order to reduce the number of ICs for each patient and improve the performance of the functional network identification, we added a preliminary step that excludes all ICs with fewer than 850 activated voxels. This threshold fixes a minimal number of activated voxels above which an IC can be considered a network. During the manual review by the two expert reviewers, ICs with fewer activated voxels were found to be only noise and not related to the connectivity networks of interest. This procedure was performed to discard “noise” networks and increase the algorithm’s detection sensitivity. Then, we extracted the coordinates of the maximal activation peak of each cluster in order to minimize the number of variables considered for training before feeding the data into the algorithms.
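The pre-filtering step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary representation of an IC, the function names, and the use of `z >= 2` as the activation test are assumptions made for illustration, and the sketch takes a single global peak per IC rather than one peak per cluster.

```python
from typing import Dict, List, Tuple

Voxel = Tuple[int, int, int]
ICMap = Dict[Voxel, float]  # hypothetical representation: voxel coordinate -> z-score

Z_THRESHOLD = 2.0  # activation threshold quoted in Section 2.2
MIN_VOXELS = 850   # minimal activated-voxel count quoted in the text


def activated_voxels(ic: ICMap, z_threshold: float = Z_THRESHOLD) -> List[Voxel]:
    """Return the voxels whose z-score reaches the activation threshold."""
    return [voxel for voxel, z in ic.items() if z >= z_threshold]


def keep_candidate_ics(ics: List[ICMap], min_voxels: int = MIN_VOXELS) -> List[int]:
    """Return the indices of ICs with at least `min_voxels` activated voxels."""
    return [i for i, ic in enumerate(ics)
            if len(activated_voxels(ic)) >= min_voxels]


def peak_coordinate(ic: ICMap) -> Voxel:
    """Return the coordinate of the maximal activation (simplified: one peak per IC)."""
    return max(ic, key=ic.get)


if __name__ == "__main__":
    noise_ic = {(x, 0, 0): 3.0 for x in range(100)}                    # 100 voxels
    network_ic = {(x, y, 0): 2.5 for x in range(30) for y in range(30)}  # 900 voxels
    network_ic[(5, 5, 0)] = 9.0
    kept = keep_candidate_ics([noise_ic, network_ic])
    print(kept, peak_coordinate(network_ic))
```

Reducing each retained IC to its peak coordinates gives the shallow classifiers a few variables per image instead of tens of thousands of voxels.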

To identify the most suitable family of machine learning algorithms for functional network classification, we implemented six machine learning algorithms: Random Forest, Feed-forward Neural Network, Naïve Bayesian classifier, K-Nearest Neighbors, Support Vector Machine, and Classification Tree. The random forest classifier consists of a combination of tree classifiers, 100 in our experiment. Each classifier is generated using a random vector sampled independently from the input vector. Each tree casts a unit vote for the most popular class to classify an input vector. Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given sample belongs to a particular class. Naive Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. The k-nearest neighbors classifier [45] stores the complete training data. New examples are classified by choosing the majority class among the k closest examples in the training data. We used the Euclidean distance to measure the distance between examples for our particular problem. The Support Vector Machine is a powerful method for building a classifier. It aims to create a decision boundary between two classes that enables the prediction of labels from one or more feature vectors. This decision boundary, known as the hyperplane, is oriented so that it is as far as possible from the closest data points of each class. Decision trees [46] recursively split the feature space based on tests that evaluate one feature variable against a threshold value. We used the information gain criterion for choosing the best test and top-down pruning with a value of 0.95 to reduce over-fitting.
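As an illustration of one of the shallow classifiers described above, the following Python sketch implements a k-nearest-neighbors majority vote with the Euclidean distance. The value of k, the toy peak coordinates, and the function name `knn_predict` are assumptions chosen for the example; the source does not report them.

```python
import math
from collections import Counter
from typing import List, Sequence


def knn_predict(train_points: List[Sequence[float]],
                train_labels: List[str],
                query: Sequence[float],
                k: int = 3) -> str:
    """Classify `query` by majority vote among its k nearest training points."""
    # Rank training examples by Euclidean distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    # Count the labels of the k closest examples; on a tie, the label
    # encountered first (i.e., the closest one) wins.
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical activation-peak coordinates (x, y, z) with network labels.
    points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 10, 10), (11, 10, 10)]
    labels = ["DMN", "DMN", "DMN", "LANG", "LANG"]
    print(knn_predict(points, labels, (0.5, 0.5, 0.0), k=3))
```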

In addition to the six shallow learning methods, we included deep learning methods in our benchmark test. Deep learning aims at jointly learning feature representations with the required prediction models. We chose the predominant approach in computer vision, namely, deep convolutional neural networks [47]. The baseline approach resorts to standard supervised training of the prediction model (the neural network) on the target training data. No additional data sources were used. In particular, given a training set comprised of $K$ pairs of images $f_{i}$ and labels $\hat{y}_{i}$, we train the parameters $\theta$ of the network $r$ using stochastic gradient descent to minimize the empirical risk:

$$\theta^{*} = \arg\min_{\theta} \sum_{i=1}^{K} L\left(\hat{y}_{i}, r(f_{i}, \theta)\right) \qquad (1)$$

where $L$ denotes the loss function, which is cross-entropy in our case. The minimization is carried out using the Adam optimizer [48] with a learning rate of 0.001. The architecture of the network $r(\cdot, \cdot)$, shown in Figure 3, has been optimized on a cross-sample set and is given as follows: three convolutional layers with filters of size 3 × 3 and respective numbers of filters 64, 128, and 256, each followed by a ReLU activation and 2 × 2 max pooling; a fully connected layer with 256 units, ReLU activation, and dropout (0.5); and a fully connected output layer for 7 classes with a softmax activation. The hyperparameters of the optimized CNN were based on a grid search operating on the depth of the neural network. Other dimensions, such as width, could be further investigated, as in EfficientNet [49]. Here, we do not seek an absolute best performance but rather focus on the possible relative gain in performance brought by transfer learning from healthy controls to unhealthy patients. In addition to the optimized CNN of Figure 3, we also included a comparison with standard CNN architectures such as VGG16 [50], ResNet [51] and DenseNet [52].
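To make the size of the described architecture concrete, the following Python sketch walks through the layer shapes and counts the trainable parameters. Several details are assumptions not stated in the text: the convolutions are taken to be 2D with the 34 slices treated as input channels, with stride 1 and "same" padding, and the pooling is taken to use stride 2 with floor rounding. The resulting numbers are therefore indicative only.

```python
from typing import List, Tuple


def conv_params(in_channels: int, out_channels: int, kernel: int = 3) -> int:
    """Weights plus biases of a 2D convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def summarize(dim0: int = 42, dim1: int = 51, channels: int = 34,
              filters: Tuple[int, ...] = (64, 128, 256),
              fc_units: int = 256, n_classes: int = 7):
    """Return per-layer (name, output shape, parameter count) and the total."""
    rows: List[Tuple[str, Tuple[int, ...], int]] = []
    total = 0
    a, b, c = dim0, dim1, channels
    for n_filters in filters:
        params = conv_params(c, n_filters)
        # Assumed 'same' convolution keeps the spatial size; 2x2 pooling halves it.
        a, b, c = a // 2, b // 2, n_filters
        rows.append((f"conv3x3-{n_filters} + ReLU + maxpool2x2", (a, b, c), params))
        total += params
    params = dense_params(a * b * c, fc_units)
    rows.append((f"dense-{fc_units} + ReLU + dropout", (fc_units,), params))
    total += params
    params = dense_params(fc_units, n_classes)
    rows.append((f"dense-{n_classes} + softmax", (n_classes,), params))
    total += params
    return rows, total


if __name__ == "__main__":
    layers, n_params = summarize()
    for name, shape, params in layers:
        print(f"{name:40s} {str(shape):16s} {params:>10,d}")
    print("total trainable parameters:", f"{n_params:,d}")
```

Under these assumptions, most of the parameters sit in the first fully connected layer, which is one reason dropout is applied there.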

The tested shallow and deep supervised learning classification algorithms were implemented based on fMRI data from 81 healthy subjects. The training dataset included 78 individual maps of each of the seven main functional networks, corresponding to the seven networks identified among the 55 ICs generated for each of the 78 healthy control subjects in the training group. To reduce the dimensionality and minimize over-fitting in the shallow learning algorithms, we extracted the coordinates of the network activation peak of each cluster, which minimizes the number of variables fed to the algorithms. Each algorithm was trained ten times with a cross-validation strategy to ensure robustness and confidence. Algorithms were then tested using the fMRI data from the four other healthy subjects. We used each of these algorithms for each patient to identify the seven main functional networks among the 55 generated ICs. The identified networks were further compared to the reference networks by our two expert reviewers for validation. We identified the most suitable algorithms for identifying the seven main functional networks (DMN, lFPCN, LANG, rFPCN, SAL, DAN, and VAN). Finally, we tested the different parameters of the model to optimize the results. The best method was selected based on the highest classification performance.
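The repeated training with cross-validation could be organized as in the following Python sketch. The text does not specify the fold scheme, so the 10-fold split, the splitting at the subject level (so that the seven images of one subject never appear on both sides of a split), and the function name `subject_folds` are all assumptions.

```python
import random
from typing import Iterable, Iterator, List, Tuple


def subject_folds(subject_ids: Iterable[int], n_folds: int = 10,
                  seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (training subjects, validation subjects) for each fold."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    # Deal the shuffled subjects into n_folds groups of near-equal size.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        validation = folds[i]
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        yield training, validation


if __name__ == "__main__":
    # 78 healthy training subjects, as in the text.
    for run, (train, val) in enumerate(subject_folds(range(78))):
        print(f"run {run}: {len(train)} training subjects, {len(val)} validation subjects")
```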

2.4. Transfer Learning Strategies

The best model from the previous section was then investigated for its capability to transfer to unhealthy patients. We explored three main transfer learning techniques: brute transfer, mix transfer, and weight transfer. These techniques allow our unhealthy test data to be classified using knowledge from healthy data and augmented data. In the brute transfer, a model was trained entirely on data from healthy controls, while in the mix transfer, the training database also contained some unhealthy data. For the weight transfer method, the saved model weights from healthy data were loaded for further training and fine-tuning with unhealthy patient data. We tested the models on unseen unhealthy data (patients with tumors). We trained all transfer learning models at a learning rate of 1 × 10⁻⁵ for 500–1000 epochs. To minimize over-fitting, we used an early stopping method based on the increase of the validation error. A grid-search algorithm chose the optimal hyperparameters for the CNN model based on maximized precision on the training data; network training was stopped after ten validation failures, followed by a model checkpoint.
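The early stopping rule can be sketched in Python as follows. This is one possible reading of "ten validation failures": a failure is taken to be an epoch whose validation error does not improve on the best value so far, and the counter is reset on each improvement. Both choices, and the function name, are assumptions.

```python
from typing import Sequence, Tuple


def early_stopping_epoch(val_errors: Sequence[float],
                         patience: int = 10) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch) for a sequence of validation errors.

    `best_epoch` is the epoch whose weights would be checkpointed and
    `stop_epoch` is the last epoch actually run.
    """
    best_error = float("inf")
    best_epoch = -1
    failures = 0
    for epoch, error in enumerate(val_errors):
        if error < best_error:
            # Improvement: this is where a model checkpoint would be saved.
            best_error, best_epoch, failures = error, epoch, 0
        else:
            failures += 1
            if failures >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_errors) - 1


if __name__ == "__main__":
    # Toy curve: the error improves for three epochs, then plateaus.
    curve = [1.0, 0.8, 0.7] + [0.75] * 12
    print(early_stopping_epoch(curve))
```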

2.5. Data Augmentation

Data augmentation was achieved in two ways. First, we applied a spatial stretch to the healthy fMRI network images, similar to the effect of a brain tumor on the area within and about 3–5 px around the region of the lesion mask (see Figure 1). A classical filter known as pinch-explode was used for this purpose (Figure 4). Second, we introduced a randomly generated 3D lesion mask. The lesion masks were chosen with a radius of 0–10 px across the 10th to 32nd channels of our image data with dimensions 42 px × 51 px × 34 channels, comparable to real tumor masks, as shown in Figure 5 and Figure 6. With such a signal void, we set the image voxels of the brain tumor region defined by our masks to zero, i.e., no signal, to mimic the expected drop of fMRI signal inside the tumor. In both data augmentation methods, the input images came from healthy subjects. The transformations (stretch and signal void) were chosen to simulate the expected impact of the tumor on the fMRI signal. In this spirit, data augmentation is another form of transfer learning from healthy subjects to unhealthy patients, to be compared with the other transfer learning approaches of the previous sections.
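The signal-void augmentation can be sketched in pure Python as follows. The spherical shape of the mask, the mapping of "10th to 32nd channels" to zero-based indices 9 to 31, the uniform sampling of the center and radius, and the function names are assumptions made for illustration; the source only gives the radius range, the channel range, and the image dimensions.

```python
import random
from typing import List, Tuple

Volume = List[List[List[float]]]  # indexed as volume[x][y][z]
SHAPE = (42, 51, 34)              # image dimensions quoted in the text
Z_RANGE = (9, 31)                 # assumed zero-based indices of channels 10 to 32


def random_lesion(rng: random.Random, shape: Tuple[int, int, int] = SHAPE,
                  max_radius: int = 10) -> Tuple[Tuple[int, int, int], int]:
    """Draw a random lesion center and radius (in pixels)."""
    center = (rng.randrange(shape[0]), rng.randrange(shape[1]),
              rng.randint(*Z_RANGE))
    return center, rng.randint(0, max_radius)


def apply_signal_void(volume: Volume, center: Tuple[int, int, int],
                      radius: int) -> Volume:
    """Return a copy of `volume` with a spherical region set to zero."""
    cx, cy, cz = center
    out = [[row[:] for row in plane] for plane in volume]
    for x in range(max(0, cx - radius), min(len(out), cx + radius + 1)):
        for y in range(max(0, cy - radius), min(len(out[0]), cy + radius + 1)):
            for z in range(max(Z_RANGE[0], cz - radius),
                           min(Z_RANGE[1], cz + radius) + 1):
                if (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2:
                    out[x][y][z] = 0.0  # no signal inside the simulated tumor
    return out


if __name__ == "__main__":
    healthy = [[[1.0] * SHAPE[2] for _ in range(SHAPE[1])] for _ in range(SHAPE[0])]
    center, radius = random_lesion(random.Random(0))
    augmented = apply_signal_void(healthy, center, radius)
    n_void = sum(v == 0.0 for plane in augmented for row in plane for v in row)
    print(center, radius, n_void)
```

The function returns a copy, so each healthy image can be reused with several random masks.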

3. Experimental Results

In this section, we give experimental results using the acquisition protocol and training strategies described in Section 2. In the first subsection, we compare the performance of several ML techniques to find the best baseline method, which is then used in the second subsection for our transfer learning experiments. Finally, in the last subsection, we compare our results with the closely related literature.

3.1. Performance Comparisons

The comparison of the different algorithms in Table 2 identified the proposed CNN model as the most efficient approach for identifying the functional networks of interest on healthy subjects. In addition to the comparison presented in Table 2, we also implemented other well-known CNN architectures, such as VGG16, ResNet, and DenseNet, on our dataset. However, the performance of these models was in the range of 50% to 55% on healthy data and was therefore considered unreliable. The difficulty lay in the dimensions of the original images and the total number of images in our dataset. The typical input size for well-known CNN architectures for computer vision (such as VGG16, ResNet, and DenseNet) is 224 pixels × 224 pixels, as they are mainly designed to work on the ImageNet database [53]. Our original images are in multi-channel format and have a size of 42 pixels × 51 pixels × 34 (width, height, channel). In order to adjust the image size, bi-cubic interpolation was used to up-sample the images by a factor of 4. This up-sampling reduced the quality of the images and caused a significant drop in the performance of the models. In addition, the number of training images is much lower than the number of parameters in these well-known CNN architectures, which leads the models to over-fit and reduces their performance.

3.2. Transfer Learning

We selected the best method identified in Table 2 for healthy data and applied the transfer learning approaches with this method to data from unhealthy patients. The results in Table 3 show the recorded accuracy values for several experiments on the proposed CNN model. Each row defines the data used for training and testing with their respective data sizes. Note that the trained model never sees the testing data, neither during the training process nor during hyper-parameter tuning.

Several baseline experiments were conducted to assess the added value of the transfer learning approaches. First, we trained on healthy control data and tested on healthy control data. This experiment provided an upper bound of performance, with the highest accuracy of 86%. This high score is possibly also due to the expected higher homogeneity of healthy controls. The same experiment was carried out while training and testing on unhealthy patients. A drop of about 10% in accuracy was observed, which provides a second baseline with fewer patients. The investigated transfer learning approaches were expected to provide performances between these two bounds. We considered four transfer learning strategies for this experiment: (i) brute transfer (training on healthy and testing on unhealthy data), (ii) mix transfer (adding some unhealthy data to healthy data to train the model), (iii) weight transfer (fine-tuning on unhealthy data) and (iv) transfer learning with data augmentation.

For the brute transfer strategy (Table 3, row 3), we trained our model with 81 healthy control subjects and tested it on all 55 unhealthy patients. We recorded an average accuracy of 0.74 ± 0.01 for all test data size ranges. Brute transfer therefore brings no improvement here. For the mix transfer strategy (Table 3, row 4), we trained our model with 81 healthy control subjects and 45 unhealthy patients, and tested it on the ten remaining unhealthy patients. Accuracy on the test data improved to 0.77 ± 0.01 by comparison with the brute transfer. Adding data helps, even as a mixture of healthy and unhealthy subjects, by comparison with the experiment on unhealthy patients only (row 2). However, we do not reach the upper-bound performance of row 1 despite having more data than in that experiment. This gap demonstrates a discrepancy between healthy subjects and unhealthy patients. Figure 7 shows the validation accuracy (from validation data) of the model trained on healthy data for various numbers of added unhealthy patients (10, 20, 30, 45). We recorded an increase of approximately 1% in validation accuracy for every ten unhealthy patients added to the training data (seven functional network images per patient). As the third transfer learning strategy (Table 3, row 5), we transferred the weights and biases of a model fully trained on healthy data (model of row 1) to a model for training on unhealthy data. The model was retrained and fine-tuned on 45 unhealthy patients and tested on the 10 remaining patients. A performance of 0.78 ± 0.01 was obtained on the unhealthy test data. This result is the highest performance among all tested transfer learning strategies. The three transfer learning strategies were repeated in the presence of augmented data (Table 3, rows 6 to 10). Augmented data were produced by data augmentation techniques (Section 2.5) from healthy data to simulate unhealthy data. The recorded performances in these experiments remained in the same range as for the other transfer learning approaches.
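The composition of the training and test sets for the three strategies can be summarized with the following Python sketch. The subject identifiers, the function name `transfer_split`, and the choice of which ten patients are held out are hypothetical; only the subject counts and the seven images per subject come from the text.

```python
from typing import Dict, List

IMAGES_PER_SUBJECT = 7  # one image per functional network


def transfer_split(strategy: str, n_healthy: int = 81, n_unhealthy: int = 55,
                   n_unhealthy_train: int = 45) -> Dict[str, List[str]]:
    """Return the subjects used for pre-training, training and testing."""
    healthy = [f"H{i:02d}" for i in range(n_healthy)]
    unhealthy = [f"U{i:02d}" for i in range(n_unhealthy)]
    seen, held_out = unhealthy[:n_unhealthy_train], unhealthy[n_unhealthy_train:]
    if strategy == "brute":   # train on healthy only, test on all patients
        return {"pretrain": [], "train": healthy, "test": unhealthy}
    if strategy == "mix":     # train on healthy plus some patients
        return {"pretrain": [], "train": healthy + seen, "test": held_out}
    if strategy == "weight":  # pre-train on healthy, fine-tune on patients
        return {"pretrain": healthy, "train": seen, "test": held_out}
    raise ValueError(f"unknown strategy: {strategy}")


if __name__ == "__main__":
    for name in ("brute", "mix", "weight"):
        split = transfer_split(name)
        print(name, {part: len(ids) * IMAGES_PER_SUBJECT
                     for part, ids in split.items()})
```

In all three cases the test patients are disjoint from the subjects used for training or fine-tuning, in line with the statement that the model never sees the test data.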

3.3. Comparison with Prior Works

As a closely related work, Mitchell et al. [22] focus on identifying selected functional networks in 21 healthy volunteers by training a simple feed-forward neural network, a Multilayer Perceptron (MLP), which usually operates on hand-crafted features extracted from the data. MLPs are fully connected neural networks that map inputs to outputs. The literature sometimes uses MLP interchangeably with Deep Neural Network (DNN); the two terms are not equivalent, however, because MLPs are a subset of DNNs. In the approach of [22], there is a pre-selection of the ICs of interest. Our ICs were generated with a bottom-up, data-driven approach based on independent component analysis (ICA). ICA has gained popularity as one of the two most frequently selected analytical methods for rs-fMRI data, and it requires no seed in any predefined region [54,55]. In contrast, the ICs in the study of Mitchell et al. [22] were generated from canonical seed regions of interest scattered across the brain. These two approaches may provide similar features for further analysis. However, hand-crafted feature extraction can limit the flexibility and potential for identifying certain functional brain areas, as demonstrated in our approach. In addition, the location of the seed regions could significantly impact the resulting pattern of a functional system such as the language network. Furthermore, sensitivity to systematic noise such as head movement and physiological nuisance signals causes false identification of non-language areas (false positives) and missed detection of putative language areas (false negatives), which limits the clinical application of seed-based rs-fMRI in language mapping [30]. Table 4 compares our proposed CNN with the method of [22], applied under the same conditions, and demonstrates the interest of our approach.
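To make the dependence on seed location concrete, the sketch below shows the generic idea behind seed-based connectivity: each voxel is scored by the Pearson correlation of its time series with the time series of a chosen seed. This is a textbook illustration under simplifying assumptions, not the pipeline of [22]; the function names and the toy time series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / (norm_x * norm_y)


def seed_connectivity(seed_ts, voxel_ts):
    """Correlate the seed time series with every voxel time series."""
    return [pearson(seed_ts, ts) for ts in voxel_ts]


# Toy data (invented): one seed and three voxels, five time points each.
seed = [1.0, 2.0, 3.0, 4.0, 5.0]
voxels = [
    [2.0, 4.0, 6.0, 8.0, 10.0],   # follows the seed
    [5.0, 4.0, 3.0, 2.0, 1.0],    # opposes the seed
    [1.0, 0.0, 1.0, 0.0, 1.0],    # unrelated to the seed
]
connectivity = seed_connectivity(seed, voxels)
print([round(value, 2) for value in connectivity])
```

Choosing a different seed changes every value of the resulting map, which is the sensitivity to seed placement discussed above; ICA avoids this choice by decomposing the whole data set without a predefined region.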

3.4. Discussion and Error Analysis

The results of this study indicate that healthy control data can help boost functional network identification in unhealthy patient data when they are added during the training process. In this section, we discuss the observed errors and further analyze the origin of the transferability from healthy to unhealthy data.

One may wonder where the classification errors in this experiment come from. We generated the confusion matrix (Figure 8) as well as the sensitivity (true positive rate) and specificity (true negative rate) of the classification of individual functional brain networks to discover the most sensitive cases. Table 5 shows the model evaluation of each individual network for the classification of healthy subjects, unhealthy patients and transfer learning. The primary source of confusion between the different functional networks is the spatial overlap between the activated areas. We segmented the functional network identification into classification steps, identifying in each of them, among the 55 ICs, the best-fitted ICs for all 7 functional networks. We found that the main sources of error were the confusion between LANG and VAN, as well as between DAN and rFPCN, as shown in Figure 8. The difficulty in differentiating between DAN and rFPCN may be explained by the spatial overlap between the two networks [56]. In contrast, the relationship between the VAN and LANG networks is more complex than between other networks. The distinction between the language and ventral attentional networks in rs-fMRI may be difficult, as they present similar activations in the ventrolateral prefrontal cortex, inferior frontal cortex and temporal gyrus in right-handed patients [57]. However, slight differences in activation in the inferior parietal lobule may allow discrimination between these two networks: the activation is more anterior, in the temporoparietal junction and the supramarginal gyrus, for the attentional network, and more posterior, in the angular gyrus, for the language network [13,57,58]. The ventral attentional network is also located in the non-dominant hemisphere, almost symmetrical to the language network in the dominant hemisphere, which may also explain the difficulty of discriminating between these two networks.

Considering the lateralization of these two networks, handedness assessment using the Edinburgh handedness inventory has been considered as a supplement to discriminate between the ventral attentional and language networks [13]. This information may be useful in right-handed patients, in whom left-hemisphere dominance exists in 96% of cases. Left-handed patients, however, should be considered with caution, since only 27% of left-handed patients have a dominant right hemisphere and, therefore, a left-lateralized ventral attentional network [59].
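Per-network sensitivity and specificity of the kind reported in Table 5 can be derived from a multi-class confusion matrix by treating each network in turn as the positive class. The sketch below shows this one-vs-rest computation; the function name, the three-class subset and the counts are invented for illustration and are not the values of Figure 8.

```python
def sensitivity_specificity(confusion, labels):
    """Per-class sensitivity and specificity from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                          # class k predicted as k
        fn = sum(confusion[k]) - tp                   # class k predicted as other
        fp = sum(row[k] for row in confusion) - tp    # other predicted as k
        tn = total - tp - fn - fp                     # everything else
        sens = tp / (tp + fn) if (tp + fn) else float("nan")
        spec = tn / (tn + fp) if (tn + fp) else float("nan")
        metrics[label] = (sens, spec)
    return metrics


# Toy three-class example (counts are invented for illustration only).
labels = ["LANG", "VAN", "DAN"]
confusion = [
    [8, 2, 0],    # true LANG: two samples mistaken for VAN
    [1, 9, 0],    # true VAN: one sample mistaken for LANG
    [0, 0, 10],   # true DAN: all correct
]
metrics = sensitivity_specificity(confusion, labels)
for label, (sens, spec) in metrics.items():
    print(f"{label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy matrix, the LANG/VAN confusion lowers the sensitivity of both classes while DAN stays unaffected, which mirrors how off-diagonal entries of the confusion matrix point to the pairs of networks that are hardest to separate.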

We investigated the overlap between the thresholded functional networks and the lesion mask in unhealthy patients to better understand the possibility of transfer from healthy to unhealthy data. The distribution of the intersection over union (IoU) values of the 3D binary images of all unhealthy patients is shown in Figure 9 for correct and wrong classifications. Most of the thresholded functional networks have little or no overlap with the lesion mask. The normalized versions of these histograms are provided in Figure 10. The two distributions are highly skewed, with skewness values of 2.11 and 1.94 for the IoU of correctly and wrongly classified images, respectively, indicating non-Gaussian distributions. These histograms show that the two categories (correctly classified or wrongly classified) are similarly distributed across the different IoU values, as also confirmed by the p-value of 0.75 in the t-test carried out on the IoU distributions, which indicates no significant difference (p > 0.05) between the two categories. To qualitatively illustrate this statistical fact, Figure 11 provides examples where images with or without overlap are correctly or wrongly classified. No direct effect of the tumor on the targeted thresholded functional networks is observed in our dataset. This observation can explain the possibility of transfer learning from healthy to unhealthy data. Nonetheless, we found a useful but not perfect transferability, and therefore a discrepancy should exist. It could lie in the intrinsic shape of the functional networks of unhealthy patients, which may be distorted when located in the vicinity of the tumor.
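The two quantities used in this analysis, the IoU between a thresholded network and the lesion mask and the skewness of the resulting IoU distribution, can be sketched as follows. The masks are represented here as flattened 0/1 sequences rather than 3D volumes, the skewness uses the simple moment estimator (the estimator actually used in the study is not stated), and all names and toy values are assumptions.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks given as flat 0/1 sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    union = sum(1 for a, b in zip(mask_a, mask_b) if a or b)
    return intersection / union if union else 0.0


def skewness(values):
    """Sample skewness m3 / m2**1.5 computed from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5 if m2 else 0.0


# Toy flattened masks (invented): one voxel shared out of four active voxels.
network_mask = [1, 1, 1, 0, 0, 0, 0, 0]
lesion_mask = [0, 0, 1, 1, 0, 0, 0, 0]
overlap = iou(network_mask, lesion_mask)

# Toy IoU sample (invented): mostly zero overlap with one larger value.
iou_sample = [0.0, 0.0, 0.0, 0.0, 0.5]
sample_skew = skewness(iou_sample)

print(f"IoU = {overlap:.2f}, skewness = {sample_skew:.2f}")
```

A sample dominated by zero overlap with a few larger values yields a positive skewness, which is the shape described for the histograms of Figures 9 and 10.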

4. Conclusions

This work demonstrated the interesting possibility of transfer learning from healthy controls to unhealthy patients. This was illustrated for the automatic identification of functional brain networks in rs-fMRI for patients with brain tumors. This result is important as it opens up an easy way to overcome the lack of data in machine learning for biomedical imaging. We demonstrated that healthy control data could boost the classification of functional brain networks in rs-fMRI for patients with brain tumors. This was obtained with an optimized classical CNN, which was shown to outperform standard CNN architectures and shallow learning methods, including the one previously tested in the literature on healthy subjects. The overall best performance obtained with unhealthy patients after transfer learning was an accuracy of 0.78. The remaining errors were found to correspond to difficult cases. The gain brought by the transfer from healthy subjects was about 4%, which is a classical order of magnitude in transfer learning. These performances remain lower than the best performance obtained on healthy control subjects only (0.86). Brain tumors make the classification harder than in healthy subjects; nonetheless, the knowledge gained from healthy control subjects can help classify functional brain networks in rs-fMRI of unhealthy patients. This is, therefore, an interesting result, since healthy control subjects can be enrolled relatively quickly in hospitals for non-invasive rs-fMRI studies.

The limiting factor in transferring knowledge from healthy to unhealthy patients may be the discrepancy between healthy controls and unhealthy patients, which occurs due to the influence of the tumor on a region of the functional brain network. Several paths to compensate for this discrepancy could be investigated. Style transfer from healthy to unhealthy could be investigated to perform this compensation in the image domain. In addition, one could consider domain adaptation in the neural network to operate this shift in the latent space rather than in the image. Lastly, one could also consider an image pre-processing approach to compensate in the image domain for the distortion (spatial deformation, BOLD signal attenuation, etc.) brought by the tumors in the images. In this article, we demonstrated the possibility of a transfer of knowledge from healthy to unhealthy patients. The same methodology could be extended to other biomedical imaging applications for which the production of large cohorts is critical to benefit from machine learning.

Author Contributions

Conceptualization, L.E.I., P.R., D.R. and J.-M.L.; Data curation, F.B., M.L., P.M., A.T.M. and J.-M.L.; Formal analysis, P.R. and D.R.; Methodology, P.R., D.R. and J.-M.L.; Software, L.E.I. and P.R.; Supervision, P.R., D.R. and J.-M.L.; Validation, P.R., D.R. and J.-M.L.; Visualization, L.E.I.; Writing—original draft, L.E.I., P.R., D.R. and J.-M.L.; Writing—review & editing, L.E.I., P.R., F.B., M.L., P.M., A.T.M., D.R. and J.-M.L. All authors have read and agreed to the published version of the manuscript.

Funding

The PhD grant of L.E.I. is funded by the Petroleum Technology Development Fund, Nigeria.

Informed Consent Statement

This is a single-center, prospective, open-label trial, in compliance with regulation and ethical guidelines for clinical research, approved by the local ethics committee (Comité de protection des personnes Ouest II, decision reference CPP 2012-25).

Data Availability Statement

Data available upon reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Ranschaert, E.R.; Morozov, S.; Algra, P.R. Artificial Intelligence in Medical Imaging: Opportunities, Applications and Risks; Springer: Berlin/Heidelberg, Germany, 2019.
2. Aiello, M.; Cavaliere, C.; D’Albore, A.; Salvatore, M. The challenges of diagnostic imaging in the era of big data. J. Clin. Med. 2019, 8, 316.
3. Subramanian, N.; Elharrouss, O.; Al-Maadeed, S.; Chowdhury, M. A review of deep learning-based detection methods for COVID-19. Comput. Biol. Med. 2022, 143, 105233.
4. Kotia, J.; Kotwal, A.; Bharti, R.; Mangrulkar, R. Few shot learning for medical imaging. In Machine Learning Algorithms for Industrial Applications; Springer: Berlin/Heidelberg, Germany, 2021; pp. 107–132.
5. Glatard, T.; Lartizien, C.; Gibaud, B.; Da Silva, R.F.; Forestier, G.; Cervenansky, F.; Alessandrini, M.; Benoit-Cattin, H.; Bernard, O.; Camarasu-Pop, S.; et al. A virtual imaging platform for multi-modality medical image simulation. IEEE Trans. Med. Imaging 2012, 32, 110–118.
6. Yi, X.; Walia, E.; Babyn, P. Generative adversarial network in medical imaging: A review. Med. Image Anal. 2019, 58, 101552.
7. Chlap, P.; Min, H.; Vandenberg, N.; Dowling, J.; Holloway, L.; Haworth, A. A review of medical image data augmentation techniques for deep learning applications. J. Med Imaging Radiat. Oncol. 2021, 65, 545–563.
8. Malik, H.; Farooq, M.S.; Khelifi, A.; Abid, A.; Qureshi, J.N.; Hussain, M. A comparison of transfer learning performance versus health experts in disease diagnosis from medical imaging. IEEE Access 2020, 8, 139367–139386.
9. Alzubaidi, L.; Fadhel, M.A.; Al-Shamma, O.; Zhang, J.; Santamaría, J.; Duan, Y.; Oleiwi, S. Towards a better understanding of transfer learning for medical imaging: A case study. Appl. Sci. 2020, 10, 4523.
10. Matsoukas, C.; Haslum, J.F.; Sorkhei, M.; Söderberg, M.; Smith, K. What Makes Transfer Learning Work For Medical Images: Feature Reuse & Other Factors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Waikoloa, HI, USA, 3–8 January 2022; pp. 9225–9234.
11. Valverde, J.M.; Imani, V.; Abdollahzadeh, A.; De Feo, R.; Prakash, M.; Ciszek, R.; Tohka, J. Transfer learning in magnetic resonance brain imaging: A systematic review. J. Imaging 2021, 7, 66.
12. Yosinski, J.; Clune, J.; Bengio, Y.; Lipson, H. How transferable are features in deep neural networks? Adv. Neural Inf. Process. Syst. 2014, 27, 1–9.
13. Lemée, J.M.; Berro, D.H.; Bernard, F.; Chinier, E.; Leiber, L.M.; Menei, P.; Ter Minassian, A. Resting-state functional magnetic resonance imaging versus task-based activity for language mapping and correlation with perioperative cortical mapping. Brain Behav. 2019, 9, e01362.
14. Ille, S.; Krieg, S.M. Functional mapping for glioma surgery, part 1: Preoperative mapping tools. Neurosurg. Clin. 2021, 32, 65–74.
15. Zijlmans, M.; Zweiphenning, W.; van Klink, N. Changing concepts in presurgical assessment for epilepsy surgery. Nat. Rev. Neurol. 2019, 15, 594–606.
16. Nyatega, C.O.; Qiang, L.; Adamu, M.J.; Younis, A.; Kawuwa, H.B. Altered Dynamic Functional Connectivity of Cuneus in Schizophrenia Patients: A Resting-State fMRI Study. Appl. Sci. 2021, 11, 11392.
17. Subah, F.Z.; Deb, K.; Dhar, P.K.; Koshiba, T. A deep learning approach to predict autism spectrum disorder using multisite resting-state fMRI. Appl. Sci. 2021, 11, 3636.
18. Zhang, D.; Johnston, J.M.; Fox, M.D.; Leuthardt, E.C.; Grubb, R.L.; Chicoine, M.R.; Smyth, M.D.; Snyder, A.Z.; Raichle, M.E.; Shimony, J.S. Preoperative sensorimotor mapping in brain tumor patients using spontaneous fluctuations in neuronal activity imaged with functional magnetic resonance imaging: Initial experience. Oper. Neurosurg. 2009, 65, 226–236.
19. Mahdavi, A.; Azar, R.; Shoar, M.H.; Hooshmand, S.; Mahdavi, A.; Kharrazi, H.H. Functional MRI in clinical practice: Assessment of language and motor for pre-surgical planning. Neuroradiol. J. 2015, 28, 468–473.
20. Shimony, J.S.; Zhang, D.; Johnston, J.M.; Fox, M.D.; Roy, A.; Leuthardt, E.C. Resting-state spontaneous fluctuations in brain activity: A new paradigm for presurgical planning using fMRI. Acad. Radiol. 2009, 16, 578–583.
21. Hart, M.G.; Price, S.J.; Suckling, J. Functional connectivity networks for preoperative brain mapping in neurosurgery. J. Neurosurg. 2016, 126, 1941–1950.
22. Mitchell, T.J.; Hacker, C.D.; Breshears, J.D.; Szrama, N.P.; Sharma, M.; Bundy, D.T.; Pahwa, M.; Corbetta, M.; Snyder, A.Z.; Shimony, J.S.; et al. A novel data-driven approach to preoperative mapping of functional cortex using resting-state functional magnetic resonance imaging. Neurosurgery 2013, 73, 969–983.
23. Ter Minassian, A.; Ricalens, E.; Nguyen The Tich, S.; Dinomais, M.; Aubé, C.; Beydon, L. The presupplementary area within the language network: A resting state functional magnetic resonance imaging functional connectivity analysis. Brain Connect. 2014, 4, 440–453.
24. Tie, Y.; Rigolo, L.; Norton, I.H.; Huang, R.Y.; Wu, W.; Orringer, D.; Mukundan, S., Jr.; Golby, A.J. Defining language networks from resting-state fMRI for surgical planning—A feasibility study. Hum. Brain Mapp. 2014, 35, 1018–1030.
25. Chiang, S.; Levin, H.S.; Haneef, Z. Computer-automated focus lateralization of temporal lobe epilepsy using fMRI. J. Magn. Reson. Imaging 2015, 41, 1689–1694.
26. Zeng, L.L.; Shen, H.; Liu, L.; Hu, D. Unsupervised classification of major depression using functional connectivity MRI. Hum. Brain Mapp. 2014, 35, 1630–1641.
27. Raj, R.; Luostarinen, T.; Pursiainen, E.; Posti, J.P.; Takala, R.S.; Bendel, S.; Konttila, T.; Korja, M. Machine learning-based dynamic mortality prediction after traumatic brain injury. Sci. Rep. 2019, 9, 1–13.
28. Chyzhyk, D.; Savio, A.; Graña, M. Computer aided diagnosis of schizophrenia on resting state fMRI data by ensembles of ELM. Neural Netw. 2015, 68, 23–33.
29. Zhu, D.; Li, K.; Terry, D.P.; Puente, A.N.; Wang, L.; Shen, D.; Miller, L.S.; Liu, T. Connectome-scale assessments of structural and functional connectivity in MCI. Hum. Brain Mapp. 2014, 35, 2911–2923.
30. Lu, J.; Zhang, H.; Hameed, N.; Zhang, J.; Yuan, S.; Qiu, T.; Shen, D.; Wu, J. An automated method for identifying an independent component analysis-based language-related resting-state network in brain tumor subjects for surgical planning. Sci. Rep. 2017, 7, 1–16.
31. Litjens, G.; Kooi, T.; Bejnordi, B.E.; Setio, A.A.A.; Ciompi, F.; Ghafoorian, M.; Van Der Laak, J.A.; Van Ginneken, B.; Sánchez, C.I. A survey on deep learning in medical image analysis. Med. Image Anal. 2017, 42, 60–88.
32. Zhou, Z.; He, Z.; Shi, M.; Du, J.; Chen, D. 3D dense connectivity network with atrous convolutional feature pyramid for brain tumor segmentation in magnetic resonance imaging of human heads. Comput. Biol. Med. 2020, 121, 103766.
33. Aerts, H.; Schirner, M.; Dhollander, T.; Jeurissen, B.; Achten, E.; Van Roost, D.; Ritter, P.; Marinazzo, D. Modeling brain dynamics after tumor resection using The Virtual Brain. Neuroimage 2020, 213, 116738.
34. Tao, Y.; Rapp, B. Investigating the network consequences of focal brain lesions through comparisons of real and simulated lesions. Sci. Rep. 2021, 11, 1–17.
35. Li, X.; Fischer, H.; Manzouri, A.; Månsson, K.N.; Li, T.Q. Dataset of whole-brain resting-state fMRI of 227 young and elderly adults acquired at 3T. Data Brief 2021, 38, 107333.
36. Tanaka, S.C.; Yamashita, A.; Yahata, N.; Itahashi, T.; Lisi, G.; Yamada, T.; Ichikawa, N.; Takamura, M.; Yoshihara, Y.; Kunimatsu, A.; et al. A multi-site, multi-disorder resting-state magnetic resonance image database. Sci. Data 2021, 8, 1–15.
37. Li, Y.O.; Adalı, T.; Calhoun, V.D. Estimating the number of independent components for functional magnetic resonance imaging data. Hum. Brain Mapp. 2007, 28, 1251–1266.
38. Sair, H.I.; Yahyavi-Firouz-Abadi, N.; Calhoun, V.D.; Airan, R.D.; Agarwal, S.; Intrapiromkul, J.; Choe, A.S.; Gujar, S.K.; Caffo, B.; Lindquist, M.A.; et al. Presurgical brain mapping of the language network in patients with brain tumors using resting-state fMRI: Comparison with task fMRI. Hum. Brain Mapp. 2016, 37, 913–923.
39. Geranmayeh, F.; Wise, R.J.; Mehta, A.; Leech, R. Overlapping networks engaged during spoken language production and its cognitive control. J. Neurosci. 2014, 34, 8728–8740.
40. Rosazza, C.; Minati, L. Resting-state brain networks: Literature review and clinical applications. Neurol. Sci. 2011, 32, 773–785.
41. Lee, M.H.; Hacker, C.D.; Snyder, A.Z.; Corbetta, M.; Zhang, D.; Leuthardt, E.C.; Shimony, J.S. Clustering of resting state networks. PLoS ONE 2012, 7, e40370.
42. Mazziotta, J.C.; Toga, A.W.; Evans, A.; Fox, P.; Lancaster, J. A probabilistic atlas of the human brain: Theory and rationale for its development: The international consortium for brain mapping (icbm). Neuroimage 1995, 2, 89–101.
43. Beckmann, C.F.; DeLuca, M.; Devlin, J.T.; Smith, S.M. Investigations into resting-state connectivity using independent component analysis. Philos. Trans. R. Soc. B Biol. Sci. 2005, 360, 1001–1013.
44. Logan, B.R.; Geliazkova, M.P.; Rowe, D.B. An evaluation of spatial thresholding techniques in fMRI analysis. Hum. Brain Mapp. 2008, 29, 1379–1389.
45. Duda, R.O.; Hart, P.E.; Stork, D.G. Pattern Classification and Scene Analysis; Wiley: New York, NY, USA, 1973; Volume 3.
46. Quinlan, J.R. Induction of decision trees. Mach. Learn. 1986, 1, 81–106.
47. Goodfellow, I.; Bengio, Y.; Courville, A.; Bengio, Y. Deep Learning; MIT Press: Cambridge, MA, USA, 2016; Volume 1.
48. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
49. Altan, A.; Karasu, S. Recognition of COVID-19 disease from X-ray images by hybrid model consisting of 2D curvelet transform, chaotic salp swarm algorithm and deep learning technique. Chaos Solitons Fractals 2020, 140, 110071.
50. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
51. He, K.; Zhang, X.; Ren, S.; Sun, J. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1026–1034.
52. Huang, G.; Liu, Z.; Van Der Maaten, L.; Weinberger, K.Q. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4700–4708.
53. Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. ImageNet Large Scale Visual Recognition Challenge. IJCV 2015, 115, 211–252.
54. Zhang, D.; Raichle, M.E. Disease and the brain’s dark energy. Nat. Rev. Neurol. 2010, 6, 15–28.
55. Fox, M.D.; Raichle, M.E. Spontaneous fluctuations in brain activity observed with functional magnetic resonance imaging. Nat. Rev. Neurosci. 2007, 8, 700–711.
56. Spreng, R.N.; Sepulcre, J.; Turner, G.R.; Stevens, W.D.; Schacter, D.L. Intrinsic architecture underlying the relations among the default, dorsal attention, and frontoparietal control networks of the human brain. J. Cogn. Neurosci. 2013, 25, 74–86.
57. Corbetta, M.; Patel, G.; Shulman, G.L. The reorienting system of the human brain: From environment to theory of mind. Neuron 2008, 58, 306–324.
58. Vigneau, M.; Beaucousin, V.; Herve, P.Y.; Duffau, H.; Crivello, F.; Houde, O.; Mazoyer, B.; Tzourio-Mazoyer, N. Meta-analyzing left hemisphere language areas: Phonology, semantics, and sentence processing. Neuroimage 2006, 30, 1414–1432.
59. Knecht, S.; Dräger, B.; Deppe, M.; Bobe, L.; Lohmann, H.; Flöel, A.; Ringelstein, E.B.; Henningsen, H. Handedness and hemispheric language dominance in healthy humans. Brain 2000, 123, 2512–2518.

Figure 1. Visual representation of the described data of unhealthy patients in Table 1: (a) is the lesion mask, (b) is the grey matter mask, (c) is white matter mask, (d) is cerebrospinal fluid mask, (e) is whole brain, cerebrospinal fluid (skull and skin included), (f) is whole brain (white and grey matter).

Figure 2. Visualization of two forms of image data for Language network (LANG): (a) connectivity map image, (b) Thresholded image.

Figure 3. Three-layer convolutional neural network (CNN) model architecture.

Figure 4. LANG Network (a) with Pinch-Explode augmentation (b).

Figure 5. LANG Network (a), Lesion mask (b) and Mask overlay on Network activation (c).

Figure 6. Comparison between the synthetic and an original mask across several channels.

Figure 7. Validation accuracy as unhealthy patient data are added to the training of the model on healthy data.

Figure 8. Confusion matrix of functional network prediction by proposed CNN model on all individual functional brain networks as classes LANG, SAL, VAN, DMN, lFPCN, rFPCN, and DAN. (a) Confusion matrix of brute transfer learning from healthy to unhealthy data. (b) Confusion matrix of brute training on healthy and validation with all unhealthy data. (c) Confusion matrix of weight transfer learning from healthy to unhealthy data.

Figure 9. Histogram of IoU of network and lesion mask.

Figure 10. Normalized Histogram of IoU of network and lesion mask.

Figure 11. Visualization of correctly and wrongly classified Images (with and without Overlap)—(a) is correctly classified with overlap; (b) is wrongly classified with overlap; (c) is correctly classified without overlap; (d) is wrongly classified without overlap.

Table 1. Unhealthy patient data description.

	Unhealthy Patient Image Data Description
	Files Provided	Description
1	Lesion.nii	Binary mask of the brain tumor, one per patient
2	Grey Matter mask (mrwp1)	Mask for the gray matter (useful since the activations are all in the gray matter)
3	White Matter mask (mrwp2)	Mask for the white matter (no activation inside the white matter, but may be a good way to estimate the brain deformations linked to the tumor and the peritumoral edema)
4	Cerebrospinal fluid mask (mrwp3)	Mask for the cerebrospinal fluid (as for the white matter, no activation inside, but may be useful to estimate brain deformations)
5	Whole brain–white gray matter (wms)	The whole brain (white and gray matter) in T1 anatomical MRI sequence, with the skin and skull clipped
6	Whole brain (wmrs)	View of the whole brain, with cerebrospinal fluid, skull and skin included

Table 2. Comparison of classification techniques with Healthy subjects’ data.

	Classification Techniques	Healthy Data
1	Proposed CNN	0.86 ± 0.01
2	Random Forest	0.82 ± 0.01
3	Feed forward NN	0.84 ± 0.02
4	Naïve Bayesian classifier	0.45 ± 0.02
5	K-Nearest neighbors	0.83 ± 0.02
6	Support vector machine	0.83 ± 0.01
7	Classification tree	0.64 ± 0.06

Table 3. fMRI network classification for healthy and unhealthy data (each subject counted contributes seven fMRI network images).

	Training		Testing
SN	Data Description	Patient Data	Data Description	Patient Data	Accuracy
1	Healthy	71	Healthy	10	0.86 ± 0.02
2	Unhealthy	45	Unhealthy	10	0.75 ± 0.01
3	Healthy	81	Unhealthy	55	0.74 ± 0.01
4	Healthy + Unhealthy	81 + 45	Unhealthy	10	0.77 ± 0.01
5	Fine-tuning on Unhealthy from Healthy	45	Unhealthy	10	0.78 ± 0.01
6	Augmentation (unhealthy simulation)	81	Unhealthy	10	0.75 ± 0.01
7	Healthy + Augmentation (unhealthy simulation)	81 + 81	Unhealthy	10	0.73 ± 0.01
8	Fine-Tuning on Unhealthy from healthy + Signal void	45	Unhealthy	10	0.76 ± 0.00
9	Healthy + Signal Void	81	Unhealthy	10	0.73 ± 0.01
10	Signal Void + Unhealthy	45	Unhealthy	10	0.74 ± 0.02

Table 4. Comparison of the proposed model with the alternative method.

	Proposed CNN	T.J. Mitchell, et al. [22]
Training: Healthy—Test: Healthy	0.86 ± 0.02	0.84 ± 0.01
Training: Unhealthy—Test: Unhealthy	0.75 ± 0.01	0.72 ± 0.01
Transfer Learning from Healthy to Unhealthy	0.74 ± 0.01	0.71 ± 0.01

Table 5. Model Evaluation for individual networks of healthy subjects, unhealthy patients and the transfer learning (Healthy to unhealthy).

Networks	Healthy Subjects		Unhealthy Patients		Healthy-to-Unhealthy
	Sensitivity	Specificity	Sensitivity	Specificity	Sensitivity	Specificity
DMN	1.00	1.00	0.97	1.00	0.98	0.90
LANG	0.98	0.70	0.97	0.80	0.97	0.60
lFPCN	1.00	1.00	0.95	0.70	0.98	0.90
rFPCN	0.98	0.90	0.95	0.80	0.98	0.70
VAN	0.95	0.90	0.95	0.80	0.88	0.80
DAN	1.00	0.90	1.00	0.60	0.92	0.80
SAL	0.98	1.00	0.98	0.97	0.95	0.80

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

