This section discusses single-view and multiple-view brain MRI tumor models, together with their reported experimental results.
4.1. Single-View Brain MRI Tumor Models
Brain tumor detection and classification methods primarily use CNNs as their backbone. Although they share architectural similarities, these methods differ in depth, complexity, and specific components. Hybrid approaches that combine vision transformers with recurrent units have emerged, offering improved feature extraction and relationship identification. Transfer learning with pre-trained models such as EfficientNets, ResNets, MobileNets, or VGG variants is commonly employed to leverage prior knowledge. Newer architectures integrate object detection and segmentation, while multi-task models aim to detect, classify, and localize tumors simultaneously. Attention mechanisms are incorporated to focus on relevant features, and neural architecture search automates the design of network structures. Data augmentation, class-imbalance handling, and various optimization techniques are commonly used to improve accuracy and efficiency in these tasks. Well-known architectural models are briefly discussed below to illustrate these design choices:
BCM-CNN: The research [43] introduces a 3D CNN model that combines the benefits of the sine-cosine and grey wolf optimization algorithms. Using the pre-trained Inception-ResNetV2 model, the BCM-CNN extracts relevant features from brain MRI images. Evaluated on the BraTS 2021 Task 1 dataset, the model achieved a reported accuracy of 99.98%.
AlexNet-based CNN: The study [44] introduces a hybrid DL architecture that integrates CNNs and recurrent neural networks (RNNs). Specifically, the model combines AlexNet and Gated Recurrent Units (GRU) to extract both spatial and temporal features from brain tumor images. The proposed model achieved 97% accuracy, 97.63% precision, 96.78% recall, and a 97.25% F1-score.
GoogLeNet-based model: The study [45] proposes an approach for brain tumor classification using a modified GoogLeNet architecture. The researchers fine-tuned the last three fully connected layers of the pre-trained GoogLeNet model on a dataset of 3064 T1w (Figshare) MRI images. By combining GoogLeNet with an SVM, the model reached approximately 98% classification accuracy in distinguishing between glioma, meningioma, and pituitary tumors.
VGG19 with SVM: The research [46] introduces an approach for brain tumor classification that integrates CNNs and support vector machines (SVMs). The model uses the pre-trained VGG19 architecture to extract high-level features from MRI images. SVM classifiers are then employed to classify different types of brain tumors. The model achieved an accuracy of 95.68% on the BraTS and Sartaj datasets.
VGG16 and VGG19 with ELM: The authors in [47] present a multimodal DL framework for brain tumor classification. The model uses the pre-trained VGG16 and VGG19 CNNs to extract relevant features from MRI images. An Extreme Learning Machine (ELM) classifier, enhanced with a correntropy-based feature selection strategy, is then employed for the final classification task. The model achieved accuracy rates of 97.8%, 96.9%, and 92.5% on the BraTS2015, BraTS2017, and BraTS2018 datasets, respectively.
EfficientNets: The study [48] proposes a DL approach that uses transfer learning to detect brain tumors. The model incorporates six pre-trained architectures: VGG16, ResNet50, MobileNetV2, DenseNet201, EfficientNetB3, and InceptionV3. To enhance performance, the models are fine-tuned using the Adam and AdaMax optimizers. The proposed approach achieved accuracy ranging from 96.34% to 98.20% while requiring minimal computational resources. Another study [49] introduces a hybrid DL approach for brain tumor classification. The model uses EfficientNets for feature extraction, and Grad-CAM visualization is employed to interpret the model's decision-making process. On the CE-MRI Figshare dataset, the model achieved an accuracy of 99.06%, precision of 98.73%, recall of 99.13%, and F1-score of 98.79%.
YOLO NAS: The research [50] investigates the application of the YOLO NAS (Neural Architecture Search) DL model for brain tumor detection and classification in MRI images. The model's performance is enhanced by a segmentation process utilizing a deep neural network with a pre-trained EfficientNet decoder and a U-Net encoder. The model was trained and evaluated on a dataset consisting of 2570 training images and 630 validation/testing images. The reported results include 99.7% accuracy and a 99.2% F1-score, with other key metrics exceeding 98%.
Hybrid ViT-GRU: The authors [51] introduce a hybrid DL model that combines Vision Transformers (ViT) and GRU for brain tumor detection and classification. To improve model transparency, Explainable AI techniques, such as attention maps, SHAP, and LIME, are integrated. The model was evaluated on the BrTMHD-2023 and brain tumor Kaggle datasets, achieving a 98.97% F1-score and 96.08% accuracy, respectively.
MobileNetv3: The study [52] investigates the potential of the MobileNetv3 architecture for improving brain tumor diagnosis accuracy. The model was trained and validated on the Kaggle dataset, with image enhancement techniques applied to balance the dataset. A five-fold cross-validation strategy was employed to ensure robust performance. The proposed approach, which integrates the DenseNet201 architecture with Principal Component Analysis (PCA) and Support Vector Machines (SVM), achieved 100% accuracy, recall, and precision on dataset 1 and 98% accuracy on dataset 2.
U-Net: The architecture comprises two key pathways: a contracting path (encoder) and an expansive path (decoder), creating a distinctive U-shape [53,54]. This design enables the network to capture both local and global information within the image, making it well-suited for the accurate segmentation of tumors. Several variations of the U-Net have been developed, including the 3D U-Net [55], which is designed to capture spatial context across multiple image slices, hybrid approaches such as VGG16-U-Net [56] that leverage the feature extraction capabilities of pre-trained networks like VGG16, and YOLO-U-Net [57], among others.
Hybrid models: The study [58] introduces an approach to brain tumor classification that leverages DL and advanced optimization techniques. The framework involves modifying pre-trained neural networks, using a quantum theory-based Marine Predator Optimization algorithm for feature selection, and employing Bayesian optimization for hyperparameter tuning. By fusing features through a serial-based approach, the proposed framework achieved, on an augmented Figshare dataset, an accuracy of 99.80%, sensitivity of 99.83%, precision of 99.83%, and a low false negative rate of 0.17%.
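The metrics quoted for the models above (accuracy, precision, recall or sensitivity, F1-score, and false negative rate) all derive from the same confusion counts. The following is a minimal sketch for the binary case; the function name and the counts are hypothetical and are not taken from any of the cited studies. It also makes explicit that the false negative rate is the complement of sensitivity.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from binary confusion counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Ratios with an empty denominator are set to 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # also called sensitivity
    false_negative_rate = fn / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "false_negative_rate": false_negative_rate,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
m = classification_metrics(tp=90, fp=5, fn=10, tn=95)
```

For multiclass tumor classification, the same quantities are usually computed per class (one class against the rest) and then averaged.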
Two aspects of the architectural models discussed in this section are worth outlining. The first is the clinical application, shown in Table 6, where each row indicates which application (classification, detection, or segmentation) a model targets. The second is the process diagram presented in Figure 6. The process begins with an MRI brain tumor dataset (shown in Table 3) that enters image pre-processing, including noise reduction, resizing, contrast enhancement, and normalization.
Accurate segmentation of brain tumors in MRI scans is essential for effective diagnosis, treatment planning, and survival prediction. A variety of techniques are used to distinguish tumor regions from healthy brain tissue. Automatic segmentation methods strive to fully automate this process, utilizing approaches such as thresholding (which separates tissues based on intensity), region-based techniques (grouping pixels with similar properties), edge-based methods (detecting boundaries), and clustering algorithms (grouping voxels by feature similarity). Modern methods are designed to capture unique pathological features. Recent DL models have advanced the field by focusing computational attention on tumor regions, achieving high Dice scores for both whole tumor and tumor core segmentation. However, challenges persist due to variations in tumor size, shape, location, and appearance, along with image artifacts and ambiguous tumor boundaries. The ongoing improvements in segmentation highlight the field’s progress toward reliable, automated clinical tools while also emphasizing the need for larger datasets and better generalization across diverse imaging conditions and tumor types.
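As a minimal illustration of the simplest technique listed above, the sketch below segments a toy 2D image by intensity thresholding and scores the result with the Dice coefficient, 2|A∩B| / (|A| + |B|). The image, the reference mask, and the threshold of 50 are hypothetical values chosen for illustration; real pipelines operate on full MRI volumes and typically rely on learned models rather than a fixed threshold.

```python
def threshold_mask(image, threshold):
    """Label a pixel as tumor (1) if its intensity is at or above the threshold."""
    return [[1 if pixel >= threshold else 0 for pixel in row] for row in image]


def dice_score(pred, truth):
    """Dice coefficient between two binary masks of identical shape."""
    intersection = sum(
        p & t
        for pred_row, truth_row in zip(pred, truth)
        for p, t in zip(pred_row, truth_row)
    )
    size_sum = sum(map(sum, pred)) + sum(map(sum, truth))
    # Two empty masks agree perfectly by convention.
    return 2 * intersection / size_sum if size_sum else 1.0


# Toy 4x4 "scan": a bright 2x2 lesion with one dimmer corner pixel.
image = [
    [10, 12, 11, 9],
    [11, 80, 85, 10],
    [12, 90, 40, 11],
    [10, 11, 12, 9],
]
truth = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

pred = threshold_mask(image, threshold=50)
score = dice_score(pred, truth)  # 3 of 4 lesion pixels found: 2*3 / (3+4)
```

The dimmer lesion pixel (intensity 40) is missed by the fixed threshold, which is a small-scale version of the ambiguous-boundary problem described above.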
Data augmentation is employed to expand the dataset and ensure a balanced representation of different tumor types. Some models additionally extract features and reduce their dimensionality. The augmented dataset is partitioned into training and testing sets (typically 80% and 20%, respectively). DL models are then selected, modified if needed, and trained on the training set, with hyperparameters tuned to maximize performance. When multiple models are trained, an additional optimization step may refine the feature set through a feature fusion mechanism. Next, a classifier is employed, either binary to detect the presence of a brain tumor or multiclass to identify specific tumor types. The trained model is then evaluated on the held-out test set using a set of performance metrics.
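The 80%/20% partition described above can be sketched as a stratified split, which keeps the proportion of each tumor type the same in both partitions. This is a minimal sketch under stated assumptions: the function name, the seed, and the class counts are hypothetical, and stratification is one reasonable choice rather than a step confirmed by the reviewed studies.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split sample indices into train/test sets, preserving class proportions."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical label list: 50 glioma, 30 meningioma, 20 pituitary samples.
labels = ["glioma"] * 50 + ["meningioma"] * 30 + ["pituitary"] * 20
train_idx, test_idx = stratified_split(labels)
```

With these counts, the training set holds 40, 24, and 16 samples of the three classes and the test set holds the remaining 10, 6, and 4.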
4.2. Multiple-View Brain MRI Tumor Models
MRI scans are typically obtained in three anatomical planes (axial, coronal, and sagittal), each offering distinct views of brain structures and abnormalities, as illustrated in Figure 2. Using plane-specific data in brain tumor detection and classification models allows for a comprehensive understanding of tumor characteristics. These view-specific models capitalize on the strengths inherent to each anatomical plane, enhancing diagnostic precision and accuracy. Figure 3 shows these multiple-view MRI slices and brain tumor classifications.
Meningiomas, typically benign but potentially problematic, originate near the protective membranes of the brain and spinal cord [59]. Gliomas, arising from glial cells, are the deadliest brain tumors and constitute about one-third of all cases [60]. Pituitary tumors are generally benign growths within the pituitary gland [61]. While accurate diagnosis is vital for determining appropriate treatment, traditional biopsy methods face challenges due to their invasive nature, time consumption, and potential for non-representative sampling [62,63]. Moreover, histopathological grading based on biopsies is limited by intratumor variability and subjective interpretation among pathologists [64], complicating the diagnostic process and restricting treatment options.
The axial (transverse) view offers a horizontal cross-section of the brain, dividing it into upper (superior) and lower (inferior) sections. This view is extensively used in clinical practice because it provides a detailed visualization of key brain regions, such as the ventricles, corpus callosum, and basal ganglia. Axial images are predominant in clinical datasets because of their widespread use in diagnostic procedures. DL models such as GoogLeNet, InceptionV3, DenseNet201, AlexNet, and ResNet50 [43,65] are trained specifically on axial slices to analyze spatial relationships and to capture critical features and spatial dependencies across slices.
The coronal view offers a frontal cross-section of the brain, dividing it into anterior and posterior regions. This orientation is especially advantageous for examining the brainstem, thalamus, and temporal lobes. It is particularly effective in identifying tumors located in midline structures, such as pituitary adenomas, and gliomas in the temporal lobe. Additionally, the coronal perspective is critical for assessing brain symmetry, which helps detect mass effects and midline shifts caused by tumors. Coronal datasets require meticulous preparation since they are less commonly used on their own than axial views. CNN architectures such as ResNet-50, AlexNet, VGGNet, and MobileNet-v2 [43,66,67] can be fine-tuned to process coronal slices effectively. Incorporating attention mechanisms can enhance the model's ability to focus on midline structures. Moreover, pre-trained models developed for axial views can be further trained on coronal images to take advantage of shared features between the two perspectives.
The sagittal view presents a side-oriented cross-section of the brain, dividing it into the left and right hemispheres. This perspective is particularly important for examining midline structures and understanding overall brain morphology. Sagittal images are crucial for evaluating tumors in areas like the corpus callosum and brainstem. Additionally, they help detect structural abnormalities such as ventricular compression or displacement of the cerebellum caused by tumors. Although sagittal datasets are often smaller, they contain distinctive features essential for diagnosing specific tumor types. To analyze spatial patterns effectively across sagittal slices, models [43,67,68,69] often employ sequence-based architectures, such as RNNs or transformer models, which can capture the sequential dependencies within the data.
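The three views described above correspond to fixing one axis of a 3D volume and keeping the other two. The sketch below extracts one slice per plane from a toy volume stored as nested lists. The axis convention (`volume[z][y][x]`, with z running inferior to superior, y posterior to anterior, and x left to right) is an assumption made for illustration; real scans store their orientation in the image header, and it should be checked before slicing.

```python
def axial_slice(volume, z):
    """Horizontal cross-section: fix z, keep the (y, x) plane."""
    return [row[:] for row in volume[z]]


def coronal_slice(volume, y):
    """Frontal cross-section: fix y, keep the (z, x) plane."""
    return [plane[y][:] for plane in volume]


def sagittal_slice(volume, x):
    """Side cross-section: fix x, keep the (z, y) plane."""
    return [[row[x] for row in plane] for plane in volume]


# Toy volume of shape (z=2, y=3, x=4); each voxel value encodes its own
# coordinates as 100*z + 10*y + x so the slices are easy to verify by eye.
volume = [
    [[100 * z + 10 * y + x for x in range(4)] for y in range(3)]
    for z in range(2)
]

axial = axial_slice(volume, z=1)        # shape 3 x 4
coronal = coronal_slice(volume, y=2)    # shape 2 x 4
sagittal = sagittal_slice(volume, x=3)  # shape 2 x 3
```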
While single-view models are effective for specific tasks, integrating information from axial, coronal, and sagittal MRI views offers a more comprehensive understanding of tumor characteristics. Approaches include fusing features from separate view-specific DL models for enhanced classification, employing attention layers to weight view contributions dynamically, using 3D input volumes to capture spatial relationships across planes, and combining predictions from view-specific models through voting or averaging. Together, these strategies can lead to more robust and accurate brain tumor detection and classification.
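The voting and averaging mechanisms mentioned above can be sketched as follows. The per-view probabilities are hypothetical outputs of three view-specific classifiers, and the optional `weights` argument is a fixed stand-in for the dynamic weighting that an attention layer would learn; none of these names or numbers come from the reviewed studies.

```python
from collections import Counter

CLASSES = ["glioma", "meningioma", "pituitary"]


def average_fusion(view_probs, weights=None):
    """Weighted average of per-view class probabilities (equal weights by default)."""
    views = list(view_probs)
    if weights is None:
        weights = {view: 1.0 for view in views}
    total = sum(weights[view] for view in views)
    n_classes = len(view_probs[views[0]])
    return [
        sum(weights[view] * view_probs[view][k] for view in views) / total
        for k in range(n_classes)
    ]


def majority_vote(view_probs):
    """Each view votes for its top class; the most frequent class index wins."""
    votes = [
        max(range(len(probs)), key=probs.__getitem__)
        for probs in view_probs.values()
    ]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical softmax outputs of three view-specific models for one patient.
view_probs = {
    "axial": [0.6, 0.3, 0.1],
    "coronal": [0.2, 0.7, 0.1],
    "sagittal": [0.5, 0.4, 0.1],
}

fused = average_fusion(view_probs)
averaged_class = CLASSES[max(range(len(fused)), key=fused.__getitem__)]
voted_class = CLASSES[majority_vote(view_probs)]
```

In this example the two rules disagree: two of three views vote for glioma, while averaging favors meningioma because the coronal model is much more confident. Averaging uses the confidence of each view, whereas voting uses only its top choice.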
Multi-view modeling in brain MRI analysis faces several challenges. The predominance of axial views in datasets causes data imbalance, which necessitates augmentation techniques for balanced training, and the approach demands significant computational resources, particularly for 3D CNNs. The architecture primarily remains the same as shown in Figure 6. Accurate model training also demands consistent alignment of anatomical structures across all views. Despite these challenges, integrating axial, coronal, and sagittal views provides comprehensive diagnostic insights. The axial view offers broad information, while the coronal and sagittal views contribute critical details about midline structures and tumor morphology. When effectively implemented, multi-view modeling can enhance diagnostic accuracy. However, future research must address issues like data imbalance, computational efficiency, generalization, and the impact of subtle variations in slice thickness and orientation. Such research is crucial to fully realize the potential of view-specific and multi-view models in clinical AI applications.
4.3. Brain MRI Tumor Progression Models
AI-powered predictive analytics enables proactive and personalized interventions by forecasting disease progression. In brain MRI tumor progression modeling, AI methods excel because they can identify complex patterns and relationships within high-dimensional data. These approaches effectively process large datasets, capture non-linear relationships, and learn features directly from raw data, which is crucial for medical imaging. View-specific brain MRI models exemplify this, demonstrating automated feature extraction with reliable accuracy and a scalable architecture. This facilitates personalized medicine by identifying critical regions driving disease progression. Figure 7A–C illustrates this by showing glioblastoma progression through MRI scans (T2, ADC map, T1-enhanced, and CBV) taken at the 2nd, 4th, and 6th months, highlighting relapses and guiding the development of tailored treatment plans [70]. The color in the rightmost image signifies the volume of blood in a given amount of brain tissue. Below, a few specific models are discussed that have modeled brain MRI tumor progression with reasonable accuracy.
The study [71] introduces a new DL method for analyzing glioblastoma multiforme (GBM) tumors. By developing a model that estimates how quickly tumor cells spread (diffusivity) and multiply (proliferation rate), the researchers can predict how the tumor will grow over time. The model was tested on both simulated and real patient data, generating complete growth trajectories for all five GBM patients in the study. Importantly, the model not only predicts tumor growth but also provides an assessment of the reliability of its predictions. This research demonstrates a significant step forward in using DL to understand brain tumors through diffusion-weighted imaging (DWI). These findings could lead to more accurate and individualized treatment plans for GBM patients.
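Diffusivity and proliferation rate are the two parameters of the reaction-diffusion family of tumor growth models. A common formulation is the Fisher-Kolmogorov equation, shown below as an illustration; whether [71] uses exactly this form is an assumption, and the symbols are introduced here rather than taken from that study.

```latex
\frac{\partial u}{\partial t}
  = \nabla \cdot \left( D \, \nabla u \right) + \rho \, u \, (1 - u)
```

Here $u(\mathbf{x}, t) \in [0, 1]$ is the normalized tumor cell density, $D$ is the diffusivity, and $\rho$ is the proliferation rate. For constant $D$, the tumor front of this model advances asymptotically at speed $2\sqrt{D\rho}$, which is why estimating the two parameters for a given patient is enough to extrapolate a growth trajectory.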
The research [72] introduces a method for brain tumor segmentation that combines tumor growth modeling with DL. The approach leverages the Lattice Boltzmann Method (LBM) to extract intensity features from initial MRI scans, enhancing segmentation accuracy, and a Modified Sunflower Optimization (MSFO) algorithm for optimization. Furthermore, the method incorporates texture features such as fractal and multi-fractal Brownian motion (mBm). The extracted features are then fed into a full-resolution convolutional network (FrCN) for final segmentation. Evaluated on three benchmark datasets (BraTS 2020, 2019, and 2018), the method achieved accuracies of 97%, 95.56%, and 95.23%, respectively.
The research [73] introduces an Enhanced Fuzzy Segmentation Framework (EFSF) designed to extract white matter from MRI scans. Recognizing the critical role of white matter in diagnosing neurological disorders, EFSF builds upon the traditional Fuzzy C-means (FCM) clustering technique and refines the derivation of fuzzy membership functions and prototype values, leading to enhanced segmentation accuracy. The white matter region is identified as the area with the highest prototype value. When evaluated on a dataset of 100 MR images, EFSF achieved a Dice Similarity Index of 0.8051 ± 0.0577, indicating strong agreement with reference segmentations. These results suggest that EFSF offers a promising solution for white matter segmentation in MR images, potentially improving the assessment of white matter atrophy across various neurological disorders.
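For context, the sketch below implements the traditional FCM baseline on a handful of scalar intensities and then selects the cluster with the highest prototype (center) value, mirroring the selection rule described above. It does not reproduce the EFSF refinements of [73]; the intensities, initial centers, and fuzziness value of 2 are hypothetical choices for illustration.

```python
def fuzzy_c_means(values, centers, fuzziness=2.0, iterations=50, eps=1e-9):
    """Standard fuzzy C-means on scalar values.

    Returns the final cluster centers and the membership rows computed in
    the last iteration (one row per value, one entry per cluster).
    """
    centers = list(centers)
    exponent = 2.0 / (fuzziness - 1.0)
    memberships = []
    for _ in range(iterations):
        # Membership update: closer centers receive higher membership.
        memberships = []
        for x in values:
            distances = [abs(x - c) + eps for c in centers]  # eps avoids 0/0
            memberships.append([
                1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
                for d_i in distances
            ])
        # Center update: membership-weighted mean of the values.
        centers = [
            sum((row[i] ** fuzziness) * x for row, x in zip(memberships, values))
            / sum(row[i] ** fuzziness for row in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships


# Hypothetical intensities forming three tissue-like groups.
values = [20, 22, 25, 60, 62, 65, 110, 112, 115]
centers, memberships = fuzzy_c_means(values, centers=[10, 60, 120])

# Selection rule from the text: the cluster with the highest prototype value.
brightest = max(range(len(centers)), key=centers.__getitem__)
hard_labels = [max(range(len(row)), key=row.__getitem__) for row in memberships]
selected = [v for v, label in zip(values, hard_labels) if label == brightest]
```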
The research [74] investigates both linear and nonlinear models for simulating brain tumor growth through numerical methods. Utilizing the Crank-Nicolson scheme, a finite difference approach, the study performed simulations to examine tumor characteristics such as peak concentration and the total count of cancerous cells. Although specific outcomes are not detailed, the work assesses the effectiveness of linear versus nonlinear models in forecasting tumor development patterns. This contributes to computational oncology by enhancing the understanding of how different mathematical frameworks can represent the intricate progression of brain tumors, potentially influencing clinical decision-making in treatment and prognosis.
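To make the numerical scheme concrete, the following is a minimal sketch of a Crank-Nicolson step for a linear 1D growth model, u_t = D u_xx + ρ u, with zero-flux boundaries. The equation form, boundary condition, grid, and parameter values are assumptions chosen for illustration and are not taken from [74]. The sketch tracks the two quantities mentioned above: the peak concentration and the total cell count (approximated with the trapezoidal rule).

```python
import math


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower[0] and upper[-1] are ignored."""
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    solution = [0.0] * n
    solution[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        solution[i] = d_prime[i] - c_prime[i] * solution[i + 1]
    return solution


def crank_nicolson_step(u, diffusivity, growth_rate, dx, dt):
    """Advance u by one time step, averaging the operator at old and new times."""
    n = len(u)
    r = diffusivity * dt / (2 * dx * dx)
    g = growth_rate * dt / 2
    lower = [-r] * n
    upper = [-r] * n
    diag = [1 + 2 * r - g] * n
    # Zero-flux boundaries via mirrored ghost nodes.
    upper[0] = -2 * r
    lower[-1] = -2 * r
    rhs = []
    for i in range(n):
        left = u[i - 1] if i > 0 else u[1]
        right = u[i + 1] if i < n - 1 else u[n - 2]
        rhs.append(r * left + (1 - 2 * r + g) * u[i] + r * right)
    return solve_tridiagonal(lower, diag, upper, rhs)


def trapezoid_total(u, dx):
    """Approximate the integral of u over the grid (total cell count)."""
    return dx * (sum(u) - 0.5 * (u[0] + u[-1]))


# Hypothetical parameters and a Gaussian initial tumor profile.
dx, dt = 0.1, 0.1
diffusivity, growth_rate = 0.01, 0.05
grid = [i * dx for i in range(51)]
u = [math.exp(-((x - 2.5) ** 2) / 0.1) for x in grid]

initial_total = trapezoid_total(u, dx)
for _ in range(100):
    u = crank_nicolson_step(u, diffusivity, growth_rate, dx, dt)
final_total = trapezoid_total(u, dx)
peak = max(u)
```

Because diffusion with zero-flux boundaries only redistributes cells, the total in this linear sketch grows by the factor (1 + ρΔt/2) / (1 − ρΔt/2) per step, which approximates exp(ρΔt). A nonlinear variant, for example with logistic growth ρu(1 − u), would require a nonlinear solve or a linearization at each step.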
The study [75] develops a mathematical model to explore the interplay between key components of the immune system (dendritic cells and cytotoxic T-cells) and distinct cancer cell populations (cancer stem cells and non-stem cancer cells). The researchers employed a system of ordinary differential equations to simulate the impact of immunotherapy, specifically dendritic cell vaccines and T-cell adoptive therapy, on tumor growth, both in the presence and absence of chemotherapy. The model replicated several experimental observations from the scientific literature, including the temporal dynamics of tumor size in in vivo studies. Notably, the model indicated that chemotherapy can inadvertently increase tumor growth, while immunotherapy targeting cancer stem cells can effectively reduce tumorigenicity.
The inherent complexity of GBM, characterized by its heterogeneous enhancement, irregular and infiltrative growth patterns, and considerable variability among patients, complicates accurate progression detection. Although imaging technologies have advanced significantly, reliably differentiating true tumor growth from treatment-related effects, such as pseudo-progression or radiation necrosis, remains a clinical obstacle. The review [76] focuses on existing criteria for assessing tumor progression and highlights the difficulties these methods face in achieving precise detection and consistent application in clinical practice.