1. Introduction
Artificial intelligence is a field of technology in which programs and machines attempt to mimic human intelligence and offer solutions based on the information they collect. Its main purpose is to adapt the intelligent behavior observed in humans to computers and machines. In recent years, artificial intelligence technologies have been used widely in the defense industry, the economy, social life and the health sector.
Machine learning, a subset of artificial intelligence, builds the architectures of intelligent algorithms that can make predictions through self-learning models. There are also many studies in the literature on deep learning and image processing, an area of machine learning. These studies have focused especially on feature extraction, classification and object detection, and the success rates of different algorithms have varied across different and similar datasets [
1].
The interpretation of radiologic images always requires precision and care. In diseases where X-ray and computed tomography images are used frequently, such as during the recent coronavirus (COVID-19) epidemic, the responsibilities of specialists have increased and the examination of radiological images has become more time-consuming. Radiologists therefore work very carefully and meticulously: a report is created by interpreting MR images of many different parts of the body (brain, chest, leg, etc.), and based on these results, specialist physicians in different branches make decisions about diagnosis and treatment [
2].
Image processing is a very important field of study and its applications in the medical field make it even more important. A mistake made here can have critical consequences for patients. For this reason, experts working in this field have to be very careful [
3]. Brain tumors are known to be highly sensitive cases due to their location. Interpreting MRI images from hundreds of thousands of different patients can be complex, and it is sometimes difficult for specialists to determine whether tumors of different sizes and shapes are benign or malignant. Computer-based systems are needed to ease this workload and minimize the margin of error. For this purpose, deep learning methods are of interest for detecting tumor cells more accurately and quickly [
4].
The vast majority of biomedical image classification studies in the literature have relied on a single CNN model for feature extraction. However, this approach has limitations: methods that focus on a single model may fail to capture the complexity and variability of brain MRI images, so classification accuracy remains low. In contrast, our study proposes a hybrid method for brain tumor diagnosis. The proposed method uses several trained CNNs to extract features from brain MRI images and multiple ML classifiers to assign the images to four categories: normal brain tissue and three tumor types. The features extracted by four different trained CNN models are evaluated with five different ML classifiers to select the strongest representations, and the features from the various trained CNN models are combined in a feature-ensemble approach to tackle the brain image classification problem. The resulting hybrid model is then classified and tested for accuracy. This approach combines the complementary information gathered by multiple CNN models rather than relying on features extracted from a single model. In addition, to improve accuracy, the proposed approach is tuned with grid search optimization, and the dataset was classified with the most effective ML classifier after optimization. The experimental results show that our proposed hybrid method significantly improves performance.
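The feature-ensemble idea can be sketched in a few lines: features produced by several trained CNN backbones for the same images are concatenated per image before classification. The backbone names and feature dimensions below are illustrative stand-ins, not the paper's actual configuration.

```python
import numpy as np

# Hypothetical feature matrices for a small batch of MR images, as four
# pre-trained CNN backbones might produce them (dimensions are illustrative).
rng = np.random.default_rng(0)
n_images = 8
feats_vgg      = rng.normal(size=(n_images, 512))   # VGG-style features
feats_resnet   = rng.normal(size=(n_images, 2048))  # ResNet-style features
feats_densenet = rng.normal(size=(n_images, 1024))  # DenseNet-style features
feats_squeeze  = rng.normal(size=(n_images, 512))   # SqueezeNet-style features

# Feature ensemble: concatenate each image's features from all backbones so
# the downstream ML classifier sees one combined representation per image.
combined = np.concatenate(
    [feats_vgg, feats_resnet, feats_densenet, feats_squeeze], axis=1
)
print(combined.shape)  # (8, 4096)
```

A classifier such as an SVM would then be trained on `combined` rather than on any single backbone's output.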
2. Related Studies
Computer vision is a software discipline that focuses on images, clips from videos, and the identification and understanding of objects. The main purpose of image processing, as a product of artificial intelligence, is to create systems that can replicate human visual abilities and learn on their own. Image processing applications therefore use machine and deep learning architectures to imitate the visual processing that occurs in living things. Many image processing techniques are in use, and medical image processing in particular is widely employed for rapid diagnosis. When the literature was examined, it was seen that studies have been carried out to diagnose tumors of different types and sizes using magnetic resonance images.
In a sample research study, the aim was to classify different brain tumors (pituitary, glioma and meningioma) using CNN algorithms on MR images and to determine the importance of brain sections such as coronal, axial and sagittal in classification. On the same topic, researchers proposed a new model derived from the DenseNet algorithm; the results were classified with machine learning algorithms and high success rates were achieved [
5]. In a study, VGG architecture is preferred because it is easy to understand. The results obtained from 253 brain MRI images, 155 of which had tumors, showed that VGG achieved a 98% success rate [
4]. In a similar study, images of three different brain tumors were used to extract sub-layer images that are different from medical images using pre-trained models. Here, feature extraction and merging methods are used while solving the problem. Inception-v3 and DenseNet architectures were used in this problem, and success rates of 99.34% and 99.51% were achieved, respectively, from these two models [
6]. In another study, magnetic resonance images were used for brain tumor detection. In this research, a rule-based detection system was introduced and morphological features were utilized. Respectively, for 497 brain MRI images, preprocessing, segmentation, tumor region detection and tumor detection stages were followed. Here, a success rate of 84.26% was achieved [
7]. In addition, the Swati study group proposed transfer learning for multi-class brain tumor classification. For this purpose, the AlexNet, VGG-16 and VGG-19 CNN models were used, achieving accuracy rates of 89.95%, 94.65% and 94.82%, respectively. A statistical study comparing textural characteristics to classify benign and malignant brain tumors used the nearest-neighbor algorithm and achieved a classification accuracy of 80% [
8]. In a segmentation study developed in addition to CNN structures, a diagnosis study was conducted using MR images. In this study, a thresholding technique was applied using a search algorithm. Morphological operations and connected component analysis were used to reduce the noise in the images and to identify brain tumors at a higher rate. The results obtained were compared with CNN algorithms and high success was achieved [
9].
In their research with two different datasets, ref. [
10] aimed to see the success rates of classification by applying different labels. A CNN-based deep learning algorithm was tested to classify 3580 open-access brain MRI images, and an accuracy rate of 96.13% was achieved using first- and fourth-stage tumors. In a study using a neuro-fuzzy inference system, brain MR images were divided into their component tissues, and regions such as cerebrospinal fluid, edema and tumor were separated. Unlike similar studies in the literature, the skull was stripped and only brain tissue was evaluated. The statistical features obtained from the system were compared with the segmented tissue areas and evaluated with the membrane index. As a result of the research, it was shown that the neuro-fuzzy system gave very successful results in the segmentation of MR images [
11]. The VGG algorithm has also been used to classify brain tumor images: in that research, results were obtained with the VGG-19 model both before and after data augmentation, several optimization techniques were applied, and high success rates were achieved [
12]. In the literature, it has been seen that the VGG algorithm is frequently used in brain tumor detection and classification studies. The biggest reason for this is that the algorithm gives successful results in similar datasets.
In another study, MR images of three different tumor types such as glioma, meningioma and pituitary were classified using ResNet architecture. In order to obtain a better result in the research, changes were made in the layers, and the number of layers was increased. During the training, the Figshare MRI dataset consisting of 3064 T1-weighted MR images of 233 patients with three different tumor types containing, respectively, 1426, 708 and 930 images was used. The accuracy rate obtained as a result of the research was 98.67% [
13]. Another study aimed to improve the detection and precise localization of brain cancer to improve the prognosis and treatment outcomes of patients by leveraging the information provided by brain medical images. Here, 300 brain images were analyzed using the YOLO model and a success percentage of 0.94 was achieved [
14]. In research conducted with the transfer learning method, contrast stretching and histogram equalization methods were applied to the input images using the pre-trained ResNet50 architecture, and the success rates were compared in terms of precision and sensitivity. Here, the ResNet50 method achieved a very high success rate of 99.15%, with contrast stretching for the classification process [
15]. In another study using three different convolutional neural networks, brain tumor types (pituitary, glioma and meningioma) were classified via VGGNet, GoogleNet and AlexNet; the VGG16 architecture achieved a success rate of 98.69% in classification and detection [
16].
A review of the literature showed that some studies applied no optimization to the CNN algorithms and used only machine learning methods for classification. In one such study, brain tumor segmentation was performed using the BRaTS 2020 dataset, yielding an 86% similarity rate and an 80% sensitivity [
17]. In addition, using the BraTS 2018 dataset, a U-Net-based model was developed to classify the tumor region using colored pixel-label segmentation. As a result of this classification, a 98% success rate was achieved [
18]. In research proposing a random forest classifier-based system that divides brain images into two classes, an adaptive median filter was first applied to the MR images in the preprocessing stage to preserve the pixels at the image edges. Feature extraction was then applied to determine the tumor region, and a weighted voting technique was used to distinguish between tumor and non-tumor regions [
19]. Another study, employing contrast-enhanced MRI, aimed to predict the 1p/19q co-deletion status in 159 lower-grade gliomas (LGG) by analyzing post-contrast MRI images with convolutional neural network (CNN) algorithms on a dataset prepared specifically for LGG 1p/19q. It achieved 93% sensitivity, 82% specificity and 87% accuracy using classical machine learning techniques [
20].
Table 1 shows the data sets used by similar studies and the accuracy rates found.
3. Material and Method
3.1. Datasets
The brain is one of the most complex and important structures in the human body. It consists of more than 50 billion nerve cells working together through millions of connections. As the control center of the whole body, the brain organizes the coordinated work of the heart, lungs, blood vessels and all other organs, and all of our senses are connected to it [
21].
Brain tumors are a deadly disease that develops, especially in adults, with the formation and proliferation of abnormal cells, usually caused by abnormal growth of central nervous system or brain cells. Brain tumors are divided into two groups, primary and secondary, and the tumor types in these two groups are categorized separately. Knowing and classifying brain tumors by group is of vital importance for patients. Primary brain tumors originate from a cell or tissue in the brain and can themselves be benign or malignant. Benign tumors occur in a single region and grow relatively slowly; if the surgical operation is performed correctly, these tumors most likely do not recur, so it is very important to completely remove the tumor and clean the area. Malignant tumors in the brain and spinal cord spread and multiply rapidly. Secondary brain tumors, in contrast, start elsewhere in the body and spread to the brain. Tumors are graded between 1 and 4 according to their growth rate: grades 1 and 2 are considered benign, while grades 3 and 4 are considered malignant [
22].
In this research, the authors used the “Brain Tumor MRI dataset”, whose MR images were published as open access by Masoud Nickparvar on the Kaggle platform. This dataset is a combination of three different datasets (Figshare, SARTAJ, Br35H) and contains 7022 brain MRI images in total, divided into 4 classes. The figures below show sample images of the four classes together with the class counts [
23].
As seen in Figure 1 and Figure 2, tumorous tissues are marked with green dots. However, this is not always possible, because not all tumors are large enough to be seen, and missed cases can become life-threatening in a very short time. The underlying principle of this research is to minimize human error by utilizing artificial intelligence. The dataset consists of 4 classes: glioma, meningioma, pituitary and no tumor. The distribution of the classes is as follows: glioma (1321 images), meningioma (1339 images), pituitary tumor (1456 images) and no tumor (1595 images). The original dataset is divided into 5711 training images and 1311 test images; for evaluation, however, 40% of the test set is allocated to the evaluation set and the remaining 60% is returned to the training set.
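The re-partitioning of the original 1311-image test set can be sketched as follows; the random seed and the rounding of the 40% share are our own illustrative choices.

```python
import numpy as np

# Sketch of the split described above: 40% of the 1311 original test images
# go to the evaluation set, and the remaining 60% are returned to training.
rng = np.random.default_rng(42)
n_test = 1311
idx = rng.permutation(n_test)          # shuffle image indices
n_eval = int(round(n_test * 0.40))     # 40% for evaluation
eval_idx = idx[:n_eval]
back_to_train_idx = idx[n_eval:]
print(len(eval_idx), len(back_to_train_idx))  # 524 787
```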
Figure 3 shows the flow chart of the model. Here, all classification and CNN algorithms are presented.
3.2. Convolutional Neural Network (CNN) Model
Convolutional neural networks (CNNs) are a type of artificial neural network that has been successfully used in computer vision, voice recognition, natural language processing and various other tasks. CNNs are typically designed to work with two- or three-dimensional input data, such as visual data analysis. CNNs contain convolutional layers that are specifically designed for use in visual recognition tasks. These layers use filters or feature maps to learn and recognize features in the input data. They can be highly effective in visual tasks, for example, recognizing edges, patterns or objects in an image. In
Figure 4, the general structure of CNN architecture is expressed.
The general components of CNNs are the convolution layers, which extract feature maps by performing convolution on the input data. In this way, features are learned hierarchically and the learned information is transferred between layers. After the convolution layers, an activation function is usually applied. Activation functions are used to introduce nonlinearity: when applying deep learning methods, the values obtained after the matrix multiplications in the convolution layer are linear [
24]. Activation functions are chosen depending on the structure of the estimation problem. Sigmoid, Softmax, Hyperbolic Tangent and ReLU are commonly preferred activation functions. After the convolution layers, pooling layers are usually used to provide scaling and position invariance. These layers can reduce the size of feature maps and images by reducing the number of parameters and highlighting important features. Finally, for the classification or regression process, the features extracted by the convolution layers are used in the fully connected layers to achieve the desired result. With the rapid development of the CPUs and GPUs of workstation computers, computational techniques are used to train CNNs more efficiently [
25]. When the studies in the literature are examined, convolutional neural networks are the most popular and powerful tool for image processing, classification and segmentation. CNNs have achieved great success, especially in visual tasks such as image classification, object recognition and face recognition. Various architectural variations can be found, but the basic principles are generally similar [
26].
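The convolution–activation–pooling sequence described above can be shown on a toy example. The image and filter below are illustrative; the hand-rolled convolution is a sketch of what CNN frameworks compute far more efficiently.

```python
import numpy as np

# Toy 6x6 grayscale "image" (a horizontal intensity ramp) and a 3x3
# vertical-edge filter, followed by ReLU and 2x2 max pooling.
img = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[-1., 0., 1.],
                   [-1., 0., 1.],
                   [-1., 0., 1.]])

def conv2d_valid(x, k):
    # "valid" cross-correlation, as used by most deep learning frameworks
    h = x.shape[0] - k.shape[0] + 1
    w = x.shape[1] - k.shape[1] + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(x[i:i+3, j:j+3] * k)
    return out

feat = conv2d_valid(img, kernel)                    # feature map, shape (4, 4)
feat = np.maximum(feat, 0.0)                        # ReLU activation
pooled = feat.reshape(2, 2, 2, 2).max(axis=(1, 3))  # 2x2 max pooling -> (2, 2)
print(pooled)  # every entry is 6.0 for this ramp image
```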
While algorithms infer information from images, different extraction methods can be developed according to the characteristics of the image. Generally, the pixel values of images are used in classification: algorithms read images as combinations of pixel values, so changing the pixel values also changes the image. The pixel values become the inputs/features of the neural network; the model reads them for an image and performs feature extraction and classification. A loss function is then defined that measures how far the model's predictions are from the true labels, providing a metric to evaluate the model's performance. If the accuracy is not at the desired level, back-propagation and optimization techniques are applied to update the weights so as to minimize the loss; stochastic gradient descent or its variants can be used here [
27]. These steps are used in the training phase of the deep learning model. Once the model is trained, the trained network can be used to make inferences from new images. That is, new, unseen data can be predicted using the features learned by the model. This is usually performed on test data to evaluate the applicability and generalization capabilities of the model. Apart from the models we use, there are many studies in the literature, especially on GoogleNet and MobileNet.
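The loss-then-update cycle described above can be sketched with a one-weight toy model; the data and learning rate are illustrative, not taken from the paper's experiments.

```python
import numpy as np

# Minimal training-loop sketch: a loss function measures the error and
# gradient descent updates the weight to reduce it.
# Model: y = w * x (one weight), squared-error loss, true weight w* = 3.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 3.0 * x

w, lr = 0.0, 0.1
losses = []
for epoch in range(50):
    pred = w * x
    loss = np.mean((pred - y) ** 2)        # loss function
    grad = np.mean(2 * (pred - y) * x)     # dLoss/dw (back-propagation)
    w -= lr * grad                         # gradient-descent weight update
    losses.append(loss)
print(round(w, 3))  # converges toward 3.0
```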
GoogLeNet is a complex architecture due to the Inception modules in its structure. With 22 layers, it won the ImageNet 2014 (ILSVRC) classification competition with a top-5 error rate of 6.67%. It was one of the first CNN architectures to move away from simply stacking convolution and pooling layers on top of each other in a sequential structure. The design also has a significant impact on memory and power utilization: parallel, interconnected Inception modules were used to avoid excessive power consumption [
28].
MobileNet, like the other models, is an efficient convolutional neural network for image recognition applications. It uses depthwise separable convolutions and has 28 layers when the depthwise and pointwise convolutions are counted separately. This significantly reduces the number of parameters compared to regular convolutional networks of the same depth: a depthwise separable convolution splits a filter's spatial and depth (channel) dimensions. In addition, MobileNet provides two simple global hyperparameters that efficiently trade off between latency and accuracy. The network structure is another factor that improves performance, and the model requires relatively little computational power to run or to use for transfer learning [
29].
3.3. VGG Architecture
VGG (Visual Geometry Group) is a deep learning algorithm and one of the many network models that emerged after the success of AlexNet. It is a network of 13 convolutional and 3 fully connected layers used by the University of Oxford Visual Geometry Group to achieve higher success rates in the ILSVRC-2014 competition. There are 41 layers in total, including max-pooling, ReLU, fully connected, dropout and softmax layers in the network structure. In this architecture, the image fed to the input layer is 224 × 224 × 3 in size, and the last layer is the classification layer [4]. The VGG architecture uses a simple, uniform structure instead of many hyperparameters, which also simplifies the neural network design. The names VGG16 and VGG19 in the literature are distinguished by the number of layers.
Figure 5 shows the VGG algorithm adapted to our model.
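A defining design choice of VGG is stacking small 3 × 3 filters: two stacked 3 × 3 convolutions cover the same receptive field as one 5 × 5 convolution but with fewer parameters. The sketch below verifies this arithmetic; the channel count is an illustrative assumption, and biases are ignored.

```python
# Why VGG stacks small 3x3 filters: the receptive field grows with depth
# while the parameter count stays below that of one large filter
# (illustrative count for C input and C output channels, no biases).
def receptive_field(num_3x3_layers):
    # each additional 3x3 conv with stride 1 adds 2 to the receptive field
    return 1 + 2 * num_3x3_layers

C = 64
params_two_3x3 = 2 * (3 * 3 * C * C)   # two stacked 3x3 conv layers: 18*C^2
params_one_5x5 = 5 * 5 * C * C         # one 5x5 conv layer: 25*C^2

print(receptive_field(2))               # 5 -> same field of view as one 5x5
print(params_two_3x3 < params_one_5x5)  # True
```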
3.4. ResNet Architecture
The ResNet architecture has a different structure from architectures such as VGG and AlexNet. Its micro-architecture is built from modules in which some transitions between layers can be skipped, passing activations directly to a deeper layer. With these features, ResNet succeeded in pushing success rates to higher levels, and it won the ILSVRC competition in 2015. By introducing the concept of residual learning to CNNs, it made a 152-layer convolutional model trainable and provided an effective method for training very deep networks; with this, ResNet was the first architecture to exceed human-level performance on this task. The most important feature distinguishing ResNet from classical models is the residual connection: blocks add their input to the output that has passed through the linear and ReLU layers and feed the sum to the next layers, producing a model that trains faster. Each residual block contains two 3 × 3 convolution filters, and a stride of 2 is used for downsampling. Since the model becomes harder to optimize as it gets deeper, ResNet's solution is the skip connection, which takes the activation from one layer and feeds it directly to a later layer.
Figure 6 shows the transition between layers of the ResNet architecture.
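The skip connection can be sketched with plain matrices standing in for the block's two convolutions; the weights and sizes below are illustrative, not a real ResNet block.

```python
import numpy as np

# Residual (skip) connection sketch: the block output is ReLU(F(x) + x),
# so the layers only have to learn the residual F. Random matrices stand
# in for the two 3x3 convolutions of a real residual block.
rng = np.random.default_rng(0)
x = rng.normal(size=16)

W1 = rng.normal(size=(16, 16)) * 0.1
W2 = rng.normal(size=(16, 16)) * 0.1

def relu(v):
    return np.maximum(v, 0.0)

f_x = W2 @ relu(W1 @ x)   # the "residual branch" F(x)
out = relu(f_x + x)       # skip connection adds the input back

# If the residual branch learned nothing (all-zero output), the block still
# passes its input through -- this is what eases the training of deep nets.
identity_out = relu(np.zeros(16) + x)
print(np.allclose(identity_out, relu(x)))  # True
```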
3.5. DenseNet Architecture
During neural network training, feature maps shrink due to convolution and subsampling operations, and some image properties are lost in the transitions between layers. The DenseNet architecture was developed to use image features more effectively. Owing to its connectivity structure, DenseNet connects each layer forward to all subsequent layers: each layer uses the feature maps of all previous layers as input and passes its own feature maps to all subsequent layers. Another important feature of DenseNet is that it reduces the number of parameters while enabling feature propagation. DenseNet is thus one of the architectures that makes the best use of feature reuse, and the propagation rate of features within the network is quite high [
30].
Figure 7 shows the layers of the DenseNet architecture.
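The dense connectivity pattern implies a predictable growth of channel counts, which the short sketch below traces; the growth rate and block depth are illustrative values, not DenseNet's published configuration.

```python
# Dense connectivity sketch: each layer receives the feature maps of ALL
# previous layers as input, so channel counts grow by a fixed "growth rate".
growth_rate = 32        # channels each layer adds (hypothetical value)
channels_in = 64        # channels entering the dense block

input_channels = []
c = channels_in
for layer in range(4):          # a 4-layer dense block
    input_channels.append(c)    # this layer sees all features produced so far
    c += growth_rate            # and contributes growth_rate new channels

print(input_channels)  # [64, 96, 128, 160]
print(c)               # 192 channels leave the block
```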
3.6. SqueezeNet Architecture
Compared to the AlexNet architecture, the SqueezeNet model uses far fewer parameters while providing similar accuracy, and it fits in a small amount of memory by using a feature compression method. SqueezeNet is one of the leading models developed for classification with convolutional neural networks, which are popular in image processing. It was first introduced in 2016 in the paper “SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size”. Its main goal is to achieve the same level of accuracy with far fewer parameters than typical large CNN models. The SqueezeNet architecture also runs faster than comparable algorithms because its efficiently designed layers reduce the workload in the neural network [
31].
Figure 8 shows the layers and connections of the SqueezeNet architecture.
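The parameter saving of SqueezeNet's squeeze/expand ("fire") design can be illustrated by counting weights; the channel sizes below are illustrative choices, not the exact published configuration, and biases are ignored.

```python
# Parameter-count sketch of a SqueezeNet-style "fire" module versus a plain
# 3x3 convolution (no biases; the concrete channel numbers are illustrative).
c_in, c_out = 128, 128
squeeze = 16                       # 1x1 "squeeze" channels
e1, e3 = c_out // 2, c_out // 2    # 1x1 and 3x3 "expand" channels

params_fire = (1 * 1 * c_in * squeeze     # squeeze 1x1 layer
               + 1 * 1 * squeeze * e1     # expand 1x1 branch
               + 3 * 3 * squeeze * e3)    # expand 3x3 branch
params_plain = 3 * 3 * c_in * c_out       # ordinary 3x3 convolution

print(params_fire, params_plain)          # 12288 147456
print(params_plain // params_fire)        # 12x fewer parameters here
```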
3.7. Machine Learning Classifiers
Machine learning is an application of artificial intelligence that allows computers to learn and improve on their own by accessing the data provided to them. Machine learning can also be defined as the process of teaching a model to make accurate predictions with the correct parameter values after filtering the data with different feature extraction techniques. Machine learning problems fall into three main categories: supervised, unsupervised and reinforcement learning; in addition, semi-supervised learning models are also used. This categorization reflects how the data corresponding to each learning method are processed and analyzed. The literature contains many machine learning algorithms; Support Vector Machines (SVM), Linear Discriminant Analysis (LDA), K-Nearest Neighbor (k-NN), Naive Bayes (NB) and Decision Tree (DT) are among the most used [
22].
Support Vector Machines: The main purpose of the SVM method is to map samples that are not linearly separable into a higher-dimensional space, using different kernel functions, where they become separable. The key issue here is the role of kernel functions in the transition from linearity to nonlinearity. The best-known kernel functions are the polynomial, linear, sigmoid and radial basis functions [
32]. Although linear kernel functions are often used for large feature sets, quadratic kernels are a common type of polynomial kernel.
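The kernel functions named above can be written out for a pair of sample vectors; the vectors and the hyperparameter values (degree, gamma, coef0) are illustrative choices.

```python
import numpy as np

# The four kernels named above, evaluated on two illustrative vectors.
x = np.array([1.0, 2.0])
y = np.array([0.5, -1.0])

k_linear  = x @ y                                  # linear kernel
k_poly    = (x @ y + 1.0) ** 2                     # quadratic polynomial kernel
k_rbf     = np.exp(-0.5 * np.sum((x - y) ** 2))    # radial basis, gamma = 0.5
k_sigmoid = np.tanh(0.1 * (x @ y))                 # sigmoid kernel

print(k_linear, k_poly)  # -1.5 0.25
```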
K-Nearest Neighbor: This algorithm is one of the most easily understood classification methods, working with a heuristic approach: unlabeled objects are assigned to the class of the most similar labeled examples. The class label of an input feature vector is determined by its closest neighbors, and the most basic rule is to measure the distance between the input vector and the training samples using the Euclidean distance [
33].
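The Euclidean-distance rule can be sketched directly; the training points and labels below are illustrative toy data.

```python
import numpy as np

# Minimal k-NN sketch: a query point receives the majority label among
# its k nearest training samples under the Euclidean distance.
train_X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
train_y = np.array([0, 0, 0, 1, 1, 1])   # two illustrative classes

def knn_predict(query, k=3):
    dists = np.sqrt(((train_X - query) ** 2).sum(axis=1))  # Euclidean distance
    nearest = np.argsort(dists)[:k]                        # k closest samples
    votes = np.bincount(train_y[nearest])
    return int(np.argmax(votes))

print(knn_predict(np.array([0.5, 0.5])))  # 0 (near the first cluster)
print(knn_predict(np.array([5.5, 5.5])))  # 1 (near the second cluster)
```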
Decision Tree: Decision Tree (DT), which is an inductive learning method, consists of a root node for a dataset, several connected internal nodes and leaf nodes for the remaining parts. Here, each leaf node corresponds to a decision, while all other nodes correspond to feature matching. Each non-leaf node in the created algorithm contains a subset. The data samples are divided into sub-nodes according to the feature matching results. Here, the part known as the root node covers the entire dataset. The easiest way to construct a Decision Tree is to split feature fields.
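The feature-splitting step at the heart of a Decision Tree can be sketched for a single feature; the data and the use of Gini impurity as the split score are illustrative assumptions (other criteria, such as information gain, are also common).

```python
import numpy as np

# Sketch of the core Decision Tree operation: choose the feature threshold
# that best splits the data, scored here with Gini impurity.
X = np.array([1.0, 2.0, 3.0, 8.0, 9.0, 10.0])   # one illustrative feature
y = np.array([0,   0,   0,   1,   1,   1])

def gini(labels):
    if len(labels) == 0:
        return 0.0
    p = np.bincount(labels, minlength=2) / len(labels)
    return 1.0 - np.sum(p ** 2)

best_thr, best_score = None, float("inf")
for thr in (X[:-1] + X[1:]) / 2:                 # candidate split points
    left, right = y[X <= thr], y[X > thr]
    score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
    if score < best_score:
        best_thr, best_score = thr, score

print(best_thr, best_score)  # 5.5 0.0 -- a perfect split of the two classes
```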
Logistic Regression: Logistic regression is a statistical method used to understand and classify complex and fuzzy events. The logistic function used here and applied as a machine learning technique is actually an analysis method used for classification. Although it is called regression, it is frequently used especially in linear classification problems [
34].
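The logistic function underlying this classifier can be sketched in a few lines; the weights, bias and input below are illustrative stand-ins for values a trained model would have learned.

```python
import numpy as np

# Logistic regression sketch: a linear score passed through the logistic
# (sigmoid) function yields a class probability; 0.5 is the usual boundary.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([1.5, -2.0])   # illustrative "learned" weights
b = 0.25                    # illustrative bias

x = np.array([2.0, 1.0])
p = sigmoid(w @ x + b)      # probability of the positive class
label = int(p >= 0.5)
print(round(float(p), 3), label)  # 0.777 1
```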
Naïve Bayes: Bayes' theorem, put forward by Thomas Bayes, relates the conditional and marginal probabilities of two random events and is generally used to calculate unknown probabilities from known ones. Naive Bayes classifiers, which build on this theorem, can be trained very efficiently in a supervised learning setting [
35].
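Bayes' rule can be worked through with illustrative numbers in the diagnostic setting of this paper; the prior and test rates below are made-up values, not statistics from the dataset.

```python
# Bayes' theorem sketch with illustrative numbers:
# P(disease | positive test) from the prior, sensitivity and false-positive rate.
p_disease = 0.01            # prior P(D)
p_pos_given_d = 0.95        # P(+ | D), the test's sensitivity
p_pos_given_not_d = 0.05    # P(+ | not D), the false-positive rate

# Total probability of a positive result, then Bayes' rule.
p_pos = p_pos_given_d * p_disease + p_pos_given_not_d * (1 - p_disease)
p_d_given_pos = p_pos_given_d * p_disease / p_pos

print(round(p_d_given_pos, 3))  # about 0.161
```

Even a sensitive test yields a modest posterior when the disease is rare, which is why base rates matter in medical classification.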
3.8. Ensemble Learning
Ensemble learning builds a model by training multiple learners as members of a committee instead of training a single learner. The aim is for the combined predictions of the models to yield a more accurate decision than any individual prediction [
36]. The success of these methods is evaluated according to the learning performance of the base learners and their diversity. Ensemble learning increases model performance by exploiting the strengths and discounting the weaknesses of the individual learners, while at the same time eliminating the risk of a single bad choice [
37]. In this research, the results obtained by using the voting method, one of the ensemble learning methods, were compared with other methods.
Voting Method: One or more classification algorithms can be trained with the same training set, or a single model can be trained on the same dataset with different parameter values. Different classification models are created in this way, and the final output is produced by a vote over all of their outputs.
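Hard (majority) voting can be sketched as follows; the per-classifier predictions below are illustrative, and the classifier names in the comments are only examples.

```python
import numpy as np

# Majority (hard) voting sketch: each trained classifier predicts a class
# for each sample, and the most frequent vote becomes the ensemble output.
predictions = np.array([
    [0, 1, 2, 2],   # classifier 1 (e.g. SVM)
    [0, 1, 1, 2],   # classifier 2 (e.g. k-NN)
    [0, 2, 2, 2],   # classifier 3 (e.g. Decision Tree)
])

# Vote column by column (one column per sample).
ensemble = np.array([np.bincount(col).argmax() for col in predictions.T])
print(ensemble)  # [0 1 2 2]
```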
3.9. Parameter Optimization
In deep learning algorithms, hyperparameters are special parameters that control the learning process of the model and need to be tuned. These hyperparameters are the values that drive the architecture of the network and the training process and can influence the success of a particular deep learning model. Correctly tuning the hyperparameters can help the model achieve better performance [
38].
Parameter optimization has become increasingly necessary in the development of deep learning models in recent years, as networks have grown while the goal remains the best accuracy with as few weights and parameters as possible. Because choosing hyperparameters is difficult, adapting them to experimental settings is also difficult, and tuning them is a complex, carefully designed process. For widely used models, hyperparameters can be set manually because researchers can draw on previous studies, and for small-scale models manual adjustment is feasible. For larger or newly published models, however, finding good hyperparameters requires a great deal of experimentation [
39].
Hyperparameters can be divided into two groups: those used for model training and those used for model design. Choosing appropriate training hyperparameters enables neural networks to learn faster and achieve better performance. The most widely adopted optimization algorithms for training a deep neural network are stochastic gradient descent with momentum, as well as AdaGrad, RMSprop and Adam. Batch size and learning rate are the most important factors, as they determine the convergence rate of the neural network during training. Hyperparameters used for model design relate more to the structure of the network; the most typical examples are the number of hidden layers and the width of the layers [
40]. To explain the most important parameter values:
Learning Rate: This hyperparameter controls the amount by which the weights of the network are updated. A high learning rate updates the weights quickly but risks overshooting the minimum, while a low learning rate can slow down the learning process. In most cases, the learning rate must be tuned manually during model training, and this tuning is often necessary to achieve high accuracy [
41].
Epoch Number: An epoch means that the entire training set has been presented to the model once, so the number of epochs determines how many times the model sees all the training data. Too many epochs can lead to overfitting, while too few may not allow the model to finish learning.
Mini-Batch Size: The batch size is the number of samples used in each training iteration. Small batch sizes generally speed up each training step but can affect overall model performance. The randomly sampled training subsets used by stochastic gradient descent are called mini-batches, and the gradient is computed over the samples in each mini-batch.
The process of hyperparameter tuning usually involves trial and error. By trying different hyperparameter values, one tries to find the combination that provides the best performance. This process is important to increase the generalizability of the model and avoid overfitting to the training data.
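Grid search, the tuning strategy used in this study, exhaustively tries every combination on a grid and keeps the best-scoring one. In the sketch below, the grid values are illustrative and the toy scoring function stands in for actually training and validating a model with each setting.

```python
import numpy as np

# Grid search sketch: try every hyperparameter combination and keep the one
# with the best validation score.
learning_rates = [0.001, 0.01, 0.1]
batch_sizes = [16, 32, 64]

def validation_score(lr, batch):
    # Hypothetical stand-in for "train the model with (lr, batch) and
    # return its validation accuracy"; peaks at lr=0.01, batch=32.
    return -((np.log10(lr) + 2) ** 2) - ((batch - 32) / 32) ** 2

best = max(
    ((lr, b) for lr in learning_rates for b in batch_sizes),
    key=lambda cfg: validation_score(*cfg),
)
print(best)  # (0.01, 32) maximizes the toy objective
```

In practice the grid grows multiplicatively with each added hyperparameter, which is why grid search is usually restricted to a small set of candidate values.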
3.10. Evaluation Criteria
Artificial intelligence applications work on the principle of trial, feedback, correction and result. Before the research, a model is created and feedback on the validity of the model is checked. Afterwards, necessary improvements are made and the model is expected to reach the expected accuracy. Test results are measured with different metric values. The performance of the model is determined according to the results obtained from here. Evaluation criteria play a very important role in comparing different models and distinguishing results.
Various performance measures are used to estimate success rates in classification. The best-known criterion for classification problems is the accuracy (ACC) metric. However, accuracy alone is not always conclusive; other metrics are needed for a more precise and reliable analysis [
42].
When the studies in the literature are examined, the precision (Prec), sensitivity (Recall) and F1 score (F1) metrics are observed alongside the accuracy metric. These values can be calculated from the confusion matrix, whose entries are the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) counts of the classification results [
43].
Table 2 shows the components of the confusion matrix.
A confusion matrix is a table that is often used to numerically determine the performance of a classification method on a test dataset where the actual values are known. The results of the confusion matrix on the data sets we used in the study are as follows:
TP (True Positive): a sick person correctly identified as sick.
FP (False Positive): a healthy person incorrectly identified as sick.
TN (True Negative): a healthy person correctly identified as not sick.
FN (False Negative): a sick person incorrectly identified as not sick.
Here, the sensitivity (recall) metric gives the percentage of sick people who are correctly identified, while the precision metric shows what percentage of those labeled sick are actually sick. These two metrics pull against each other, and the F1 score is used to resolve the resulting ambiguity; it uses the harmonic mean of precision and recall rather than the arithmetic mean. In some cases one side of this trade-off matters more than the other. It is more acceptable to mislabel a healthy person as having cancer and call them to the hospital than to miss a real cancer patient and endanger their life; however, calling everyone to the hospital would find all cancer cases at the cost of many false positives. Precision works the other way: it rewards being absolutely sure before labeling someone as sick, so a single confirmed diagnosis yields 100% precision even if all the remaining patients go undetected, which is a serious error.
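The four metrics discussed above follow directly from the confusion-matrix counts. A small sketch (the counts below are made up for illustration, not results from the study):

```python
# Compute accuracy, precision, recall (sensitivity) and F1 score
# from confusion-matrix counts.
def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                              # sensitivity
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return accuracy, precision, recall, f1

# Hypothetical counts for a two-class (sick / not sick) test set.
acc, prec, rec, f1 = metrics(tp=90, fp=10, tn=85, fn=15)
print(f"ACC={acc:.2f} Prec={prec:.2f} Recall={rec:.2f} F1={f1:.2f}")
```

Note how precision and recall use different denominators: precision divides by everyone *labeled* sick (TP + FP), while recall divides by everyone who *is* sick (TP + FN), which is exactly the trade-off described above.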
The applications were developed in Python 3.12.1 using open-source libraries on the Google Colab platform, which provides free GPU support; this language was used to train the models and classify the results. The study was conducted on a laptop with an Apple M1 Pro processor (8-core CPU, 14-core GPU, 16-core Neural Engine), 200 GB/s memory bandwidth, a 3024 × 1964 display at 254 pixels per inch, 32 GB unified memory and a 1 TB SSD.
4. Experimental Results
In this part of the research, the classification results for the benign and malignant tumors in the dataset described in the method section are presented. First, precision, sensitivity, F1 score and accuracy were calculated from the results obtained with classical machine learning methods, and confusion matrices were created. In addition, graphs showing the accuracy rates of the CNN algorithms across epochs are presented.
In order to increase the performance of the individual classifiers, the performance criteria were first calculated with the default values of the CNN algorithms; the results were then re-examined after parameter optimization. In addition, the ensemble learning method was applied, and a confusion matrix was created using the voting method. All parameter values, model accuracy percentages and model comparisons made during training are shown, and the optimum parameters for each dataset are given together with the classification algorithms. These values were found through many experiments, with calculations based on the optimum settings. In deep learning methods, the number of epochs is usually determined by the problem to be solved; in our study, for example, it could have been set to 50 or 100, but increasing the epoch count was observed to lower the success rate, while also incurring higher computational cost and consuming more processing power. Therefore, all values were set to their optima.
4.1. Machine Learning Method and Results
Here, results for brain MR image classification were first obtained using our dataset. Our purpose was to compare results obtained using pre-trained CNNs as feature extractors with different ML algorithms for classification. Feature extraction was performed with the four architectures, each pre-trained on ImageNet, which allowed us to leverage the knowledge captured by these models in the form of learned features. The effectiveness of each pre-trained model was evaluated by measuring key performance metrics such as F1 score, recall, precision and accuracy. The machine learning algorithms LDA, SVM, K-NN, DT and NB use a model learned from the split dataset to make predictions or classifications for new data points. The models were trained using the Scikit-Learn library with default parameters. In addition, a vector of 30,056 features was created by combining the features obtained separately from the four CNN models (VGG, DenseNet, ResNet, SqueezeNet); this is the total number of features extracted from the MR images across the models. These newly combined features were then trained with the same classification algorithms.
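The hybrid pipeline described above can be sketched as follows. This is not the study's code: in the real pipeline the per-model blocks come from pre-trained CNNs, whereas here random, label-shifted vectors stand in for them so the sketch is self-contained; the SVM uses Scikit-Learn defaults as in the study.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 300
labels = rng.integers(0, 2, size=n)  # toy labels: 0 = benign, 1 = malignant

# Per-model feature blocks (synthetic stand-ins for CNN-extracted features,
# shifted by the label so the classes are separable).
feats = {
    "VGG": rng.normal(loc=labels[:, None], size=(n, 64)),
    "DenseNet": rng.normal(loc=labels[:, None], size=(n, 64)),
    "ResNet": rng.normal(loc=labels[:, None], size=(n, 64)),
    "SqueezeNet": rng.normal(loc=labels[:, None], size=(n, 64)),
}

# Concatenate the four blocks into one hybrid feature vector per image,
# mirroring the 30,056-dimensional combined vector described in the text.
X = np.hstack(list(feats.values()))

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.4,
                                          random_state=0)
clf = SVC()            # default Scikit-Learn parameters, as in the study
clf.fit(X_tr, y_tr)
print(f"test accuracy: {clf.score(X_te, y_te):.2f}")
```

The same `X` can be handed to any of the other classifiers (LDA, K-NN, DT, NB) unchanged, which is what makes the per-classifier comparison in Table 3 straightforward.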
Table 3 shows the accuracy rates of the models and classification algorithms used in feature extraction.
Looking at the model and classification accuracy rates, the SVM classifier gave the highest accuracy for every model, and the features provided by the DenseNet algorithm showed the highest success rate when classified with SVM. In contrast, all classifications with DT showed very low accuracy. This may be because a single decision tree fits the training features too closely and fails to generalize to the test data; in other words, overfitting may occur.
Furthermore, the ML classifiers were applied to the hybrid method we developed. This method, whose results were confirmed through numerous experiments, classifies the trained CNN algorithms by combining their features into binary, ternary and quadruple combinations.
Figure 9 shows the DenseNet + SVM confusion matrix and
Figure 10 shows the SqueezeNet + DenseNet + ResNet + VGG confusion matrix.
Figure 11,
Figure 12,
Figure 13 and
Figure 14 show performance metrics for all classification methods. The figures show that all classification accuracy percentages vary. There may be many reasons for this. For example, the DenseNet + DT classification shows a low accuracy rate in general, but the SVM classification of the same model shows the highest accuracy rate. Therefore, only the confusion matrix for this classification is shown. In addition, the accuracy rate of the hybrid model, whose confusion matrix is shown in
Figure 10, was found to be 83%. As a result of our initial training and classification studies, we can say that the results we found are quite successful considering similar studies in the literature. Based on these promising results, we further investigated the effectiveness of combining multiple features extracted by different pre-trained CNN models. In the second phase of our study, we optimized our trained data with the most appropriate parameter values to obtain more successful results.
4.2. Results of CNN Models before–after Parameter Optimization
Parameter optimization is a process used to improve the performance of a machine learning model. In this process, selecting or tuning specific parameter values is critical to achieving the best performance. In this section, the accuracy rates of the CNN algorithms after optimization with the most appropriate parameter values are calculated. Many trials were performed for each CNN algorithm; during these experiments, the number of epochs, learning rate, number of layers and activation functions were varied repeatedly. In addition, while performing parameter optimization, model overfitting and underfitting, cross-validation and training times were all meticulously tested.
Figure 15,
Figure 16,
Figure 17 and
Figure 18 show the confusion matrices of the CNN algorithms before parameter optimization. These four models, pre-trained on the ImageNet dataset, were trained after adding a classification layer to each. Each model was trained for 30 epochs. For model selection, the accuracy on the evaluation set was tracked during training, and the checkpoint from the epoch with the highest evaluation accuracy was selected.
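The checkpoint-selection rule just described reduces to a one-line maximum over the per-epoch evaluation accuracies. A sketch (the accuracy values are synthetic stand-ins, not numbers from the study):

```python
# Keep the checkpoint from the epoch with the highest evaluation accuracy.
# Hypothetical per-epoch evaluation accuracies from a training run:
eval_accuracy = [0.71, 0.80, 0.86, 0.91, 0.89, 0.93, 0.92, 0.90]

# Epochs are 1-indexed; pick the one whose accuracy is largest.
best_epoch = max(range(len(eval_accuracy)), key=lambda e: eval_accuracy[e]) + 1
print(best_epoch)  # → 6: the checkpoint saved after epoch 6 is restored
```

Note this selects by *evaluation* accuracy, not training accuracy, which is what protects the chosen checkpoint from the overfitting seen in the later epochs of the synthetic curve above.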
Section 3.9 provides some information on parameter optimization. Here,
Table 4 and
Table 5 show the parameter values and some optimization functions common to all calculations. Our experiments show that the adaptive moment estimation (Adam) and stochastic gradient descent (SGD) functions give the best results. Adam is a widely used optimization algorithm, especially in deep learning models; it is a gradient-based method designed to speed up the learning process and make it more efficient. The parameter update rule of the Adam function is as follows:
- (1) First, the first and second moments of the gradient are calculated: the moving average of the gradient (momentum) and the moving average of the squared gradient (RMSProp): m_t = β₁m_{t−1} + (1 − β₁)g_t, v_t = β₂v_{t−1} + (1 − β₂)g_t².
- (2) The calculated moments are corrected with bias-correction terms: m̂_t = m_t/(1 − β₁^t), v̂_t = v_t/(1 − β₂^t).
- (3) The parameters are updated: θ_t = θ_{t−1} − α·m̂_t/(√v̂_t + ε).
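A minimal sketch of the Adam update steps (1)-(3) above, using the commonly cited default hyperparameters (α = 0.001, β₁ = 0.9, β₂ = 0.999, ε = 1e-8); the quadratic test function is an illustrative assumption, not part of the study:

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad        # (1) first moment (momentum)
    v = beta2 * v + (1 - beta2) * grad**2     # (1) second moment (RMSProp)
    m_hat = m / (1 - beta1**t)                # (2) bias correction
    v_hat = v / (1 - beta2**t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)  # (3) update
    return theta, m, v

# Usage: minimize f(theta) = theta^2 starting from theta = 1.
theta, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 2001):                      # t is 1-indexed for step (2)
    grad = 2 * theta                          # gradient of theta^2
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)  # approaches 0
```

The bias correction in step (2) matters most early in training: at t = 1, dividing by (1 − β₁) rescales the tiny first moving average up to the full gradient magnitude.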
Here,
Table 4 shows the parameter values that are used.
Here,
Table 5 shows which CNN algorithm uses which optimization function and their accuracy rates.
Another optimization function used in the experiments, SGD, serves to improve model performance just like the Adam function. The most important feature of grid search optimization is that it systematically tries all combinations within a given set of hyperparameters to find the best-performing one. Grid search forms a grid by specifying a set of parameters and their candidate values; it then trains the model with each combination on this grid, evaluates each combination with a chosen performance metric, and keeps the combination with the best performance. The parameters for each CNN architecture used in the experiments were searched with grid search. As a result of these experiments, learning rate values of 0.0001, 0.001, 0.01 and 0.1 for the Adam and SGD functions seen in
Table 5 were evaluated. In addition, a patience (tolerance) value of five was used for the evaluation loss obtained in each training epoch: if the evaluation loss did not decrease for five consecutive epochs, the model was considered to have memorized the training data and training was stopped.
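The two mechanisms just described, a grid search over the learning-rate candidates and early stopping with a patience of five, can be sketched together. This is not the study's code: `train_and_evaluate` returns a synthetic loss curve standing in for a real CNN training run.

```python
import math

def train_and_evaluate(lr, max_epochs=30, patience=5):
    """Toy stand-in for model training: returns the best validation loss.
    The loss curve is synthetic; a real run would train the CNN."""
    best_loss, epochs_without_improvement = math.inf, 0
    for epoch in range(1, max_epochs + 1):
        # Synthetic validation loss: lowest when lr is near 0.001,
        # and decaying as epochs accumulate.
        val_loss = abs(math.log10(lr) + 3) + 1.0 / epoch
        if val_loss < best_loss - 1e-9:
            best_loss, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:   # early stopping
            break
    return best_loss

# Grid search: evaluate every candidate and keep the best-performing one.
grid = [0.0001, 0.001, 0.01, 0.1]
best_lr = min(grid, key=train_and_evaluate)
print(best_lr)  # → 0.001 for this synthetic curve
```

In a full grid search, the learning-rate axis would be crossed with the other hyperparameter axes (optimizer, batch size), and each combination would run with the same early-stopping rule.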
As a result of optimization, the ResNet architecture was observed to reach an accuracy rate of 100%. The factors affecting the model, such as the learning rate, number of epochs and optimization function, were found through many trials.
Figure 19 shows the accuracy rates after parameter optimization.
Figure 20 shows the learning curve of the ResNet architecture across epochs after optimization. By the end of the 30th epoch, the model reaches 100% learning success; further training would not change the model accuracy.
Figure 21 also shows the confusion matrix of the ResNet architecture, since it gives the best results. Because the appropriate activation function and number of layers were chosen, the learning ability of the model is maximized. In addition, adjusting the learning rate and the other hyperparameters (number of epochs, mini-batch size) allows the model to learn faster and more effectively. Thus, when the performance metrics in
Table 6 are analyzed, it is seen that all the results are accurate.
4.3. Ensemble Learning Results
Here, using the ensemble learning method, the decisions of the four models were combined into a single decision by voting. The ensemble model exploits the strengths of the individual features and yields a more robust and comprehensive representation of the image models. The features extracted from the four CNN networks used in our model were combined into a hybrid model, and the accuracy rates were calculated for this single model. The result is the confusion matrix shown in
Figure 22 and the results in
Table 7. When this hybrid method was tested on the test set, it achieved 99% accuracy. Because the model presented here was developed as a hybrid that combines features from several CNN algorithms, it brings a different perspective and an element of innovation to studies in this field. The result obtained is quite successful.
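The voting step above can be sketched as hard majority voting over the class predictions of the four models (not the study's code; the predictions below are made-up toy values):

```python
import numpy as np

def majority_vote(predictions):
    """predictions: (n_models, n_samples) integer class labels.
    Returns, per sample, the class that received the most votes."""
    preds = np.asarray(predictions)
    n_classes = preds.max() + 1
    # Count, for each sample (column), how many models voted for each class.
    votes = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
    return votes.argmax(axis=0)

# Toy example: four models, five test samples, 0 = benign, 1 = malignant.
preds = [
    [0, 1, 1, 0, 1],  # e.g. VGG
    [0, 1, 0, 0, 1],  # e.g. DenseNet
    [1, 1, 1, 0, 1],  # e.g. ResNet
    [0, 0, 1, 0, 1],  # e.g. SqueezeNet
]
print(majority_vote(preds))  # → [0 1 1 0 1]
```

Note that with an even number of voters a 2-2 tie is possible; `argmax` then resolves it in favor of the lower class index, so a tie-breaking rule (e.g. deferring to the strongest individual model) may be worth making explicit.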
5. Conclusions
The most important issue in medicine is the early diagnosis and treatment of diseases. Therefore, early diagnosis is vital, especially in cancer. It is very important for patients to start the treatment process early and manage the entire process correctly. This critical situation constitutes the basic principle of this study. The main purpose of this study is to assist healthcare professionals by utilizing artificial intelligence technologies applied in the field of healthcare. In conclusion, the research presents a comprehensive and innovative approach to the classification and diagnosis of brain tumors using artificial intelligence, specifically employing convolutional neural networks (CNNs) such as VGG, ResNet, DenseNet and SqueezeNet.
The study utilized a sizable dataset of 7022 brain MR images obtained from the Kaggle library, which was split into 60% for training and 40% for testing to ensure an unbiased evaluation. When the features of the four CNN architectures were classified with machine learning methods, the highest accuracy, 85%, was obtained with the SVM classifier on the DenseNet features. In addition, a success rate of 83% was achieved by classifying the hybrid feature set created from the four CNN architectures with LDA.
In the second part of the experiments, the ResNet architecture reached 99% accuracy with its default parameter values, before parameter optimization. A 100% success rate was then achieved by optimizing the ResNet parameter functions and re-applying the model to the test set. When the studies in the literature on this dataset and the ResNet model are examined, such a high success rate appears to have been achieved for the first time. Finally, ensemble learning was applied to the classification, and a 99% success rate was achieved with the voting method. The utilization of ensemble learning methods added another layer of sophistication to the classification process, ultimately contributing to the identification of the most effective validation method. The entire study was conducted using the Python programming language, emphasizing the adaptability and versatility of AI applications in the biomedical field.
The findings of this study not only contribute to the growing body of knowledge in the domain of biomedical image processing but also highlight the potential of artificial intelligence, particularly ResNet architecture, in achieving highly accurate classifications in the context of brain tumor diagnosis. The 100% accuracy rate attained by ResNet underscores its robustness and effectiveness in handling complex medical imaging tasks.
In comparison with the existing literature, the research results were benchmarked, showcasing competitive or superior performance. The systematic evaluation and comparison of various architectures and machine learning methods contributes to a deeper understanding of their applicability in real-world scenarios. Overall, this research underscores the promising role of artificial intelligence in advancing diagnostic capabilities in the field of medical imaging, offering new possibilities for accurate and efficient brain tumor classification.