1. Introduction
Blood cells are one of the basic components of the human body and are divided into white blood cells (WBCs), red blood cells and platelets [
1]. Bone marrow produces WBCs, which are also called immune cells [
2]. WBCs are an essential component of the immune system and protect the body against viruses and other infectious agents. There are several types of WBCs, each with distinct functions and characteristics. Abnormalities in WBC counts or alterations in their morphology can indicate the presence of various diseases, so accurate identification and classification of WBC types are essential for diagnosing specific blood disorders and determining appropriate treatment strategies. WBCs are classified into granular and nongranular cells [
3]. Granulocytes (polymorphonuclear cells) are neutrophils, basophils and eosinophils [
4]. Nongranular (mononuclear) cells are lymphocytes and monocytes. Whether the WBC count rises or falls in response to a disease depends on the type of WBC [
5]. WBC diseases are diagnosed by examining a blood sample under a microscope. Neutrophils are the most common type of WBC and are involved in fighting bacterial infections. Lymphocytes are responsible for immune responses and can be further categorized as B cells, T cells and natural killer cells. Monocytes play a role in phagocytosis and the immune response to pathogens. Eosinophils are associated with allergic reactions and parasitic infections, while basophils are involved in inflammatory responses and allergies. Abnormal WBC counts or morphology can indicate blood disorders such as leukemia and lymphoma, as well as infectious diseases, autoimmune disorders and allergic reactions. The causes of these diseases vary and include genetic factors, environmental factors, infections, immune system dysregulation and exposure to toxins. Treatment options for WBC-related diseases depend on the specific diagnosis and may include chemotherapy, radiation therapy, immunotherapy or bone marrow transplantation. Neutrophils increase in the blood due to metabolic disturbances and hormonal conditions, as well as bacteria, fungi, and endogenous and exogenous toxins [
6]. Hepatitis and whooping cough increase lymphocytes in the blood [
7]. Also, HIV, polio, tuberculosis and rubeola decrease lymphocytes in the blood [
8]. Monocytes increase in the blood due to diseases such as malaria, listeriosis and viral and bacterial infections [
9]. The eosinophil count in the blood increases due to atopic diseases, allergies and parasites [
10]. Conditions associated with basophils include hypothyroidism, malignant myeloproliferative disease, renal disease syndrome and hemolytic anaemia [
11].
Lymphocytes, monocytes and neutrophils play essential roles in the immune system and can be affected in various hematological disorders, including some myeloproliferative diseases. However, their involvement in myeloproliferative diseases is not as prominent or defining as in other conditions, such as lymphoproliferative disorders or certain types of leukemia, because the classification of myeloproliferative disorders primarily focuses on the abnormal clonal proliferation of myeloid cells in the bone marrow. Myeloproliferative disorders are a group of hematological conditions characterized by the excessive production of mature myeloid cells in the bone marrow. There are four main types of myeloproliferative disorders recognized by the WHO. Chronic myeloid leukemia (CML) is associated with the presence of the Philadelphia chromosome, resulting in the formation of the BCR-ABL fusion gene. This gene drives the uncontrolled growth and accumulation of mature myeloid cells, particularly granulocytes. Polycythemia vera (PV) is characterized by the overproduction of red blood cells (erythrocytes) in the bone marrow. While neutrophils and other myeloid cells can also be increased, they are not the primary focus of this disorder. Essential thrombocythemia (ET) involves the overproduction of platelets (thrombocytes) in the bone marrow, resulting in increased platelet counts. Myeloid cell abnormalities, including elevated neutrophils and monocytes, can also be observed in some cases, but they are not a defining feature. Primary myelofibrosis (PMF) is characterized by the excessive production of fibrous tissue in the bone marrow, which disrupts normal blood cell production. Neutrophils, monocytes and lymphocytes can be increased, but they are not the primary focus of the disease.
Thus, the diagnosis of many diseases takes place by examining a blood sample for a blood cell count test (hemogram). The hemogram test is based on spreading blood on a glass slide, staining it and then placing it under a microscope to evaluate blood cells [
12]. The normal WBC count ranges from 4500 to 11,000 cells per microliter [
13]. To identify WBC types, the hematologist must perform several steps, such as locating the cells in the microscopic image and classifying their type, which is tedious and time consuming. Hematological diagnosis also depends on the segmentation and classification of WBCs, which is difficult for hematologists because of the cells' irregular sizes and shapes. Artificial intelligence techniques can reduce both potential manual errors and the time needed to analyze samples and reach effective results [
14]. The main goal of AI systems is to analyze WBCs and classify them by type. Based on the analysis of blood samples, automated systems determine the elements of interest for each type of WBC and show the morphology of each cell, such as shape, size, colour, texture and nucleus. Segmentation of WBCs is a complex task; the challenge is to select cell boundaries, separate them and remove artifacts. Many researchers have used machine learning techniques to classify WBC types and have developed more effective CNN models.
The use of artificial intelligence (AI) techniques in white blood cell (WBC) classification has significant benefits in the field of pathology. Accurate identification and classification of WBC types are crucial for diagnosing and treating various diseases, including infections, autoimmune disorders, and certain types of cancer. AI enhances WBC classification in the following ways:
AI algorithms analyze large datasets of WBC images and learn to identify and classify different WBC types with high accuracy. This helps pathologists by reducing the chances of human error and providing more consistent and reliable results;
AI-powered systems process and classify WBC images much faster than manual methods. This significantly reduces the time required for analysis, allowing pathologists to diagnose and treat patients more efficiently;
AI algorithms provide standardized classification criteria, minimizing the subjectivity and variability that may arise when different pathologists interpret WBC images. This leads to more consistent diagnoses and treatment decisions;
AI techniques enable the automated screening of large volumes of WBC images, facilitating the identification of rare or abnormal WBCs that may be missed during manual examination. This helps in the early detection and diagnosis of diseases;
AI systems are used as training tools for pathologists and medical students. By analyzing and comparing large datasets, AI algorithms assist in teaching the subtle differences between WBC types and improve the skills of practitioners in WBC classification;
AI techniques aid in the analysis of vast amounts of data from WBC images, enabling researchers to discover patterns, correlations and new insights into various diseases. This contributes to advancements in pathology and facilitates the development of more targeted treatment strategies.
However, it is important to note that AI systems should be used as a supportive tool rather than a replacement for human pathologists. The expertise and clinical judgment of pathologists remain crucial for accurate diagnosis and treatment decisions. AI augments their capabilities and improves the overall efficiency and accuracy of WBC classification in pathology.
The morphological features of WBC types are similar in the early stage, so this study aims to extract hidden features using hybrid techniques that combine machine learning and deep learning, based on features fused from more than one model.
The main contributions of this study are as follows:
Improving blood slide images using overlapping average filters and contrast-limited adaptive histogram equalization (CLAHE);
Classification of WBC types by SVM based on hybrid features of VGG19-ResNet101, ResNet101-MobileNet and VGG19-ResNet101-MobileNet;
Classification of WBC types by FFNN based on hybrid features of CNN (VGG19, ResNet101 and MobileNet) and handcrafted features.
The rest of the paper is organized as follows:
Section 2 discusses the techniques and findings of previous studies.
Section 3 presents the methods and tools for analyzing blood slide images to classify WBC types.
Section 4 summarizes the results of WBC type classification systems.
Section 5 discusses and compares all strategies for classifying WBC types.
Section 6 concludes the study.
2. Related Work
Patil et al. [
15] employed canonical correlation analysis to classify multiple blood cells. The technique involved the extraction of overlapping nuclei, followed by training and classification using recurrent neural network (RNN) models based on canonical correlation analysis. The authors reported their findings on the successful classification of blood cells using this approach. Yusuf et al. [
16] utilized capsule networks for the classification of different types of WBCs. To improve the network’s performance, they fine-tuned it and addressed challenges related to overfitting and dataset balancing. The capsule network achieved an accuracy of 96.86% in classifying WBCs. Hüseyin et al. [
17] proposed a regional CNN methodology to locate white blood cells and classify them into different types. This approach aimed to facilitate the identification of various white blood cell diseases using the same image. The authors reported the successful implementation of their methodology for WBC classification. Mesut et al. [
18] developed a hybrid methodology combining quadratic discriminant analysis and CNN models for WBC classification. Relevant features were extracted using the Ridge test and information coefficient in conjunction with CNN models. The proposed methodology achieved an impressive classification accuracy of 97.95% for WBC types. To address the challenge of limited data, Khaled et al. [
19] proposed the use of generative adversarial networks (GANs). This approach aimed to overcome the lack of data, which often hinders system generalization. The weights of the CNN models were initialized either with weights pretrained on the CIFAR-100 dataset or randomly. Maxim et al. [
20] presented two approaches for classifying WBC types. The first approach classified handcrafted features, while the second approach utilized deep learning techniques. The evaluation was performed on spot images using fivefold cross-validation. The reported accuracies for the machine learning and deep learning methods were 77.8% and 70.3%, respectively. Ahmed et al. [
21] proposed the usage of VGGNet for diagnosing leukemia. The authors highlighted the production of high-level features by VGGNet, which were then filtered using the statistically enhanced Negative Swarm method. The selection of 1000 out of 25,000 features was carried out using the SESS method, resulting in performance improvement. Channabasava et al. [
22] introduced the BCNet architecture and fine-tuned it using three different optimizers. The performance of BCNet was compared with pretrained CNN models. The results indicated that BCNet with the RMSP optimizer outperformed the other optimizers and the pretrained models, demonstrating the superiority of the proposed architecture. Yan et al. [
23] developed the WBC-Net network based on the UNet++ architecture and CNN models for WBC classification. WBC-Net incorporated a feature encoder with multiple layers to extract and combine various features using different metrics. The training process involved WBC segmentation improvement through decoding, Tversky index determination, and cross-entropy loss. Partha et al. [
24] proposed a method for segmenting the WBC nucleus by employing a colour space transformation and k algorithm. This method facilitated the separation of the nucleus from the rest of the image. The characteristics of the first and last layers of CNN models were collected for WBC type classification. The segmentation method achieved an accuracy of 98.61%, while the CNN model attained 96% accuracy. Asim et al. [
25] proposed the utilization of a CNN for the classification of WBCs following image optimization using the CLAHE method. To identify the most discriminative characteristics, the authors employed an optimization technique based on an ant colony, merging the optimized features and subsequently inputting them into an SVM for classification. Cecilia et al. [
26] put forth a framework for differential counting of WBCs, aiming to reduce image analysis time and enhance diagnostic efficiency. They employed a colour-shifting technique to highlight WBC cells in the images. The authors performed WBC cell segmentation using the watershed method and extracted chromaticity and texture features. These features were then classified using a random forest classifier. Yusuf et al. [
27] introduced a pretrained adaptive model for the classification of WBC types, which leverages knowledge gained from preexisting models. Notably, this adaptive model can adapt to target domains while disregarding domain differences. Xin et al. [
28] proposed a supervised self-learning approach for WBC classification. Their method involved extracting the frontal area of cell images using the K-means clustering method. Subsequently, the WBC area was extracted through concavity analysis. Colour and border features were then extracted using a noncontrast edge optimization operator, and these features were fed into an SVM for classification.
Thus, the researchers devoted their time and effort to achieving superior results for classifying WBC types. This study is distinguished from previous studies by extracting features from several CNN models and integrating them, in addition to incorporating CNN and handcrafted features.
4. Results of Techniques Performance
4.1. Split of WBC Dataset
All systems were implemented on the WBC dataset. The dataset consists of 12,507 images distributed among four classes of WBC types in a balanced manner.
Table 1 shows the WBC type dataset distributed for all systems into 80% for training and validating and 20% for testing the systems.
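The 80/20 partition described above can be sketched as a per-class (stratified) shuffle-and-cut, so that each WBC type keeps the same proportion in both subsets. This is only an illustration under stated assumptions: the text does not say how the split was drawn, the function name `stratified_split` and the fixed seed are hypothetical, and the class counts in the example are placeholders rather than the counts in Table 1.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), splitting every class by test_fraction."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Placeholder labels: 100 images per class, not the counts in Table 1.
labels = [c for c in ("eosinophil", "lymphocyte", "monocyte", "neutrophil")
          for _ in range(100)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

The training portion would then be divided again into training and validation subsets; that ratio is not given in this section, so it is left out of the sketch.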
4.2. System Performance Metrics
The predictive performance of the proposed techniques for classifying the WBC type dataset was measured by the evaluation metrics described by Equations (8)–(12). In these equations, correctly classified samples are denoted TP and TN, and incorrectly classified samples are denoted FP and FN [
42]. Each system produces a confusion matrix from which all of these variables can be read.
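As a rough illustration of how TP, TN, FP and FN are read from a multi-class confusion matrix, the sketch below computes one-vs-rest accuracy, precision, sensitivity and specificity for each class. It assumes the standard textbook definitions, which may differ in detail from Equations (8)–(12); it also assumes rows hold the true class and columns the predicted class. AUC is omitted because it requires classifier scores rather than counts. The matrix values are invented for the example.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of images of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                         # class k, predicted as other
        fp = sum(cm[i][k] for i in range(n)) - tp    # other, predicted as class k
        tn = total - tp - fn - fp
        # Assumes every class occurs and is predicted at least once,
        # so none of the denominators below is zero.
        results.append({
            "accuracy": (tp + tn) / total,
            "precision": tp / (tp + fp),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        })
    return results


# Invented 4-class matrix (eosinophil, lymphocyte, monocyte, neutrophil).
cm = [
    [48, 1, 0, 1],
    [0, 50, 0, 0],
    [1, 0, 49, 0],
    [2, 0, 1, 47],
]
metrics = per_class_metrics(cm)
overall_accuracy = sum(cm[k][k] for k in range(len(cm))) / sum(map(sum, cm))
print(metrics[0], overall_accuracy)
```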
4.3. Results of CNN-SVM Technique
This section discusses the results of the CNN-SVM hybrid techniques for analysing blood slide images to classify the WBC type dataset. The technique has two parts. First, the blood slide images are analysed to obtain the morphological characteristics of each WBC type, the hidden features of each image are extracted through the convolutional layers of the CNN models (VGG19, ResNet101 and MobileNet), and the essential features are selected using PCA. Second, the essential features are fed to the SVM across three phases, with 80% of the data used for training and validation and 20% used to test the performance of the SVM.
The hybrid technique obtained good results for analyzing blood slide images for classifying the WBC type dataset, as shown in
Table 2 and
Figure 7. SVM yielded the best performance with the features of MobileNet. The VGG19-SVM technique achieved an AUC of 95.23%, accuracy of 96.20%, precision of 96.20%, specificity of 98.73% and sensitivity of 96.15%. ResNet101-SVM achieved an AUC of 96.93%, accuracy of 96.10%, precision of 96.18%, specificity of 98.40% and sensitivity of 96.08%. MobileNet-SVM achieved an AUC of 97.63%, accuracy of 97%, precision of 96.90%, specificity of 98.93% and sensitivity of 96.78%.
Figure 8 shows the confusion matrices generated by the VGG19-SVM, ResNet101-SVM and MobileNet-SVM techniques.
In the confusion matrix, the numbers in the last row represent the accuracy of each class, while the last column represents the precision of each class. The green cells located on the main diagonal represent correctly classified images (TP), whereas the red cells represent the misclassified images (FP and FN).
First, the VGG19-SVM technique achieved the following accuracy for each WBC type: 96.2% for Eosinophil, 95.7% for Lymphocyte, 95.8% for Monocyte and 97% for Neutrophil. Second, ResNet101-SVM achieved 96.2% for Eosinophil, 95.8% for Lymphocyte, 96.3% for Monocyte and 96.2% for Neutrophil. Third, the MobileNet-SVM technique achieved 97.4% for Eosinophil, 96.5% for Lymphocyte, 96.8% for Monocyte and 97.2% for Neutrophil.
The features of the CNN models were combined and fed into the SVM for classification, achieving a promising accuracy for classifying WBC types.
Table 3 and
Figure 9 summarize the results for the VGG19-ResNet101-SVM, ResNet101-MobileNet-SVM and VGG19-ResNet101-MobileNet-SVM techniques. First, the VGG19-ResNet101-SVM technique obtained an AUC of 98.60%, accuracy of 97.60%, precision of 97.58%, specificity of 99.08% and sensitivity of 97.60%. Second, the ResNet101-MobileNet-SVM technique obtained an AUC of 98.73%, accuracy of 98.10%, precision of 98.15%, specificity of 99.25% and sensitivity of 98.08%. Third, the VGG19-ResNet101-MobileNet-SVM technique obtained an AUC of 98.88%, accuracy of 98.40%, precision of 98.33%, specificity of 99.53% and sensitivity of 98.58%.
Figure 10 shows the confusion matrices generated by the VGG19-ResNet101-SVM, ResNet101-MobileNet-SVM and VGG19-ResNet101-MobileNet-SVM techniques. First, the VGG19-ResNet101-SVM technique achieved the following accuracy for each WBC type: 97.4% for Eosinophil, 97.7% for Lymphocyte, 97.7% for Monocyte and 97.3% for Neutrophil. Second, ResNet101-MobileNet-SVM achieved 97.8% for Eosinophil, 98.7% for Lymphocyte, 98.1% for Monocyte and 98.9% for Neutrophil. Third, the VGG19-ResNet101-MobileNet-SVM technique achieved 97.8% for Eosinophil, 98.7% for Lymphocyte, 98.1% for Monocyte and 98.9% for Neutrophil.
4.4. Results of FFNN with Fused Features of CNN and Handcrafted
This section discusses the results obtained by the FFNN system with CNN and handcrafted features for analyzing blood slide images to classify the WBC type dataset. This technique is a contribution of the study. First, the features of the WBC types are obtained from the CNN models (VGG19, ResNet101 and MobileNet) and integrated with the handcrafted features. Second, the fused features are fed into the FFNN across three phases, with 80% of the data used for training and validation and 20% used to test the performance of the FFNN.
FFNN, based on the handcrafted features, obtained good results for the analysis of blood slide images to classify the WBC type dataset, as shown in
Table 4. FFNN yielded an AUC of 93.25%, accuracy of 94.4%, precision of 94.45%, specificity of 98.05% and sensitivity of 94.65%.
Figure 11 shows the confusion matrix generated by FFNN with handcrafted features. The FFNN technique with handcrafted features achieved accuracy for each white blood cell type: accuracy for eosinophil type of 89.8%, lymphocyte type of 98.4%, monocyte type of 99% and neutrophil type of 90.7%.
The hybrid FFNN-CNN-handcrafted technique obtained good results for blood slide image analysis to classify the WBC type dataset, as shown in
Table 5 and
Figure 12. FFNN yielded the best performance with MobileNet and handcrafted features. The FFNN-VGG19-handcrafted technique achieved an AUC of 99.40%, accuracy of 99.40%, precision of 99.45%, specificity of 99.65% and sensitivity of 99.33%. The FFNN-ResNet101-handcrafted technique achieved an AUC of 99.48%, accuracy of 98.90%, precision of 98.95%, specificity of 99.50% and sensitivity of 99.05%. The FFNN-MobileNet-handcrafted technique achieved an AUC of 99.43%, accuracy of 99.80%, precision of 99.75%, specificity of 99.75% and sensitivity of 99.68%.
Figure 13 shows the confusion matrices generated by the FFNN-VGG19-handcrafted, FFNN-ResNet101-handcrafted and FFNN-MobileNet-handcrafted techniques. First, the FFNN-VGG19-handcrafted technique achieved the following accuracy for each WBC type: 99.7% for eosinophil, 98.9% for lymphocyte, 99.8% for monocyte and 99.4% for neutrophil. Second, the FFNN-ResNet101-handcrafted technique achieved 99.7% for eosinophil, 98.9% for lymphocyte, 98.4% for monocyte and 98.7% for neutrophil. Third, the FFNN-MobileNet-handcrafted technique achieved 100% for eosinophil, 99.2% for lymphocyte, 99.8% for monocyte and 100% for neutrophil.
There are also some tools that evaluate FFNN’s performance on the WBC dataset as follows.
4.4.1. Error Histogram
The error histogram is one measure that shows the performance of FFNN for analysis of blood slide images for the WBC type dataset. The network records the error between the target and output values in each epoch during the training, testing and validation phases. Each stage appears in a colour distinguished from the other stage, as in
Figure 14. The red colour represents the least error during data training, the green colour represents the least error during the data validation stage, and the blue colour represents the least error during data testing [
43]. With handcrafted features, the FFNN reached the best performance among 20 bins within the values −1.465 and 2.139. With VGG19 and handcrafted features, FFNN achieved the best performance among the 20 bins with values of −0.95 and 0.95. With ResNet101 and handcrafted features, FFNN achieved the best performance among the 20 bins with the values −0.9499 and 0.95. With MobileNet and handcrafted features, FFNN achieved the best performance among the 20 bins with values of −0.9495 and 0.95.
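A minimal sketch of how such an error histogram can be built is given here, assuming each error is the target value minus the network output and that the error range is divided into 20 equal-width bins. The function name `error_histogram` and the sample values are hypothetical and are not taken from the study.

```python
def error_histogram(targets, outputs, n_bins=20):
    """Return (bin_centers, counts) for the errors target - output."""
    errors = [t - o for t, o in zip(targets, outputs)]
    lo, hi = min(errors), max(errors)
    width = (hi - lo) / n_bins or 1.0  # fall back if all errors are equal
    counts = [0] * n_bins
    for e in errors:
        # Clamp so the largest error lands in the last bin, not one past it.
        b = min(int((e - lo) / width), n_bins - 1)
        counts[b] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    return centers, counts


# Invented targets and network outputs for six samples.
targets = [1, 0, 0, 1, 0, 1]
outputs = [0.9, 0.1, 0.2, 0.7, 0.05, 0.95]
centers, counts = error_histogram(targets, outputs)
print(counts)
```

In practice the histogram would be computed separately for the training, validation and test subsets so that each can be drawn in its own colour.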
4.4.2. Cross-Entropy
Cross-entropy is a measure of the performance of FFNN for analysis of blood slide images for the WBC type dataset. The network records the error between the target and output values during each phase. Each stage has a special colour, as in
Figure 15. Red represents the least error during data training, green represents the least error during the validation stage, and blue represents the least error during the testing stage [
44]. With the handcrafted features, FFNN reached the lowest error at epoch 32 with a value of 0.068149. With VGG19 and handcrafted features, FFNN achieved the lowest error at epoch 13 with a value of 0.076133. With ResNet101 and handcrafted features, FFNN achieved the lowest error at epoch 113 with a value of 0.14846. With MobileNet and handcrafted features, FFNN achieved the lowest error at epoch 101 with a value of 0.12752.
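The sketch below illustrates the two quantities involved: a mean cross-entropy between one-hot targets and predicted class probabilities, and the epoch at which a recorded loss curve is lowest. It assumes the plain textbook form of cross-entropy (a given toolbox may normalize differently) and that the reported epoch is the one with the lowest validation error. All values are invented.

```python
import math


def cross_entropy(targets, probs, eps=1e-12):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    total = 0.0
    for t_row, p_row in zip(targets, probs):
        # eps guards against log(0) when a probability is exactly zero.
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(targets)


def best_epoch(val_losses):
    """Return (1-based epoch, loss) at which the loss is lowest."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]


# Two invented samples over four WBC classes.
targets = [[1, 0, 0, 0], [0, 1, 0, 0]]
probs = [[0.9, 0.05, 0.03, 0.02], [0.2, 0.7, 0.05, 0.05]]
loss = cross_entropy(targets, probs)

# Invented validation losses for four epochs.
epoch, lowest = best_epoch([0.5, 0.3, 0.2, 0.25])
print(loss, epoch, lowest)
```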
4.4.3. Gradient and Validation Checks
The validation checks and gradient are metrics that demonstrate the performance of FFNN for analysis of blood slide images for the WBC dataset. In each epoch, the network records the gradient and the validation failures of the FFNN, as in
Figure 16. With the handcrafted features, the FFNN achieved a gradient of 0.042213 in epoch 121. With the VGG19 and handcrafted features, the FFNN reached a gradient of 0.04985 in epoch 95. With the ResNet101 and handcrafted features, the FFNN achieved a gradient of 0.02963 in epoch 119. With MobileNet and handcrafted features, the FFNN reached a gradient of 0.049243 in epoch 107.
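Validation checks are commonly implemented as a counter of consecutive epochs in which the validation error fails to improve, with training stopped once the counter reaches a limit. The sketch below shows that logic under stated assumptions: the limit `max_fail=6` is a hypothetical value (the stopping threshold is not stated in this section), and the loss values are invented.

```python
def run_validation_checks(val_losses, max_fail=6):
    """Return (stop_epoch, best_epoch), both 1-based.

    Training stops once the validation loss has failed to improve on its
    best value for max_fail consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, fails = loss, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Invented validation losses; the best value occurs at epoch 3.
val_losses = [0.9, 0.5, 0.3, 0.31, 0.32, 0.30, 0.33, 0.34, 0.35, 0.4]
stop_epoch, best = run_validation_checks(val_losses)
print(stop_epoch, best)
```

With this rule, the weights kept are those from the best epoch rather than the stopping epoch, which is why the epoch of lowest error and the final epoch can differ.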
5. Discussion of the Systems Performance for Classifying WBC Types
Recognizing the type of WBC is essential to help pathologists identify the type of disease through blood analysis. Manual analysis requires time, effort and expertise to identify the type of WBC. Therefore, AI techniques play a vital role in effective WBC type recognition. In this study, effective hybrid systems were developed to classify blood slide images for the WBC type dataset. For all systems, the images were passed through two successive filters to enhance them.
The first technique for analyzing blood slide images distinguishes between WBC types using CNN-SVM hybrid techniques. WBC images are fed into the VGG19, ResNet101 and MobileNet models to extract deep features, which are then passed to PCA to select the most important ones. SVM receives the features and classifies them with high accuracy into the four types of WBC. The VGG19-SVM, ResNet101-SVM and MobileNet-SVM techniques reached accuracies of 96.20%, 96.10% and 97%, respectively, for the classification of the WBC type dataset.
The second technique analyses blood slide images to discriminate between WBC types using CNN-SVM hybrid methods based on fused features. WBC images are fed into the VGG19, ResNet101 and MobileNet models to extract deep features, which are then passed to PCA to select the most important ones. Fused features were obtained by combining the features of the VGG19-ResNet101, ResNet101-MobileNet and VGG19-ResNet101-MobileNet models. SVM receives the fused features and classifies them with high accuracy into the four types of WBC. The VGG19-ResNet101-SVM, ResNet101-MobileNet-SVM and VGG19-ResNet101-MobileNet-SVM techniques reached accuracies of 97.60%, 98.10% and 98.40%, respectively, for the classification of the WBC type dataset.
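The fusion step can be sketched as reducing each model's feature matrix with PCA and then concatenating the reduced features image by image. This is a sketch under stated assumptions, not the authors' implementation: random arrays stand in for the CNN features, the feature widths and the number of retained components are placeholders, PCA is computed with a plain SVD, and the SVM that would consume `fused` is not shown.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project the rows of `features` onto their top principal components."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def fuse(feature_sets, n_components):
    """Reduce each model's features with PCA, then concatenate per image."""
    reduced = [pca_reduce(f, n_components) for f in feature_sets]
    return np.hstack(reduced)


rng = np.random.default_rng(0)
# Stand-ins for deep features of 50 images from three CNNs; the widths
# below are placeholders, not the dimensions used in the study.
vgg19_feats = rng.normal(size=(50, 64))
resnet101_feats = rng.normal(size=(50, 48))
mobilenet_feats = rng.normal(size=(50, 32))

fused = fuse([vgg19_feats, resnet101_feats, mobilenet_feats], n_components=10)
print(fused.shape)
```

Concatenation keeps every retained component from every model, so the fused vector grows with the number of models combined; the same pattern would apply when handcrafted features are appended to the CNN features.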
The third technique analyzes blood slide images to discriminate between WBC types by hybrid methods based on handcrafted features alone, as well as on handcrafted features combined with the features of CNN models. The WBC images are input into the GLCM, LBP, DWT and FCH methods to extract features, which are merged and then input into FFNN for classification, achieving an accuracy of 94.4%. The images were also analyzed by the VGG19, ResNet101 and MobileNet models to extract deep features, which were passed to PCA to select the most important ones. The VGG19, ResNet101 and MobileNet features are separately integrated with the handcrafted features. FFNN receives the fused features and classifies them with high accuracy into the four types of WBC. The VGG19-handcrafted-FFNN, ResNet101-handcrafted-FFNN and MobileNet-handcrafted-FFNN techniques reached accuracies of 99.40%, 98.90% and 99.80%, respectively, for the classification of the WBC type dataset.
Table 6 and
Figure 17 summarize the systems’ performance metrics for classifying the WBC dataset. The systems achieved impressive results in categorizing each type of WBC. FFNN, when fed with the hybrid MobileNet and handcrafted features, achieved the best performance of all the systems, reaching an accuracy of 100%, 99.2%, 99.8% and 100% for the classification of Eosinophil, Lymphocyte, Monocyte and Neutrophil, respectively.