Article

Advanced Convolutional Neural Networks for Precise White Blood Cell Subtype Classification in Medical Diagnostics

by Athanasios Kanavos 1,*, Orestis Papadimitriou 1, Khalil Al-Hussaeni 2,*, Manolis Maragoudakis 3 and Ioannis Karamitsos 4

1 Department of Information and Communication Systems Engineering, University of the Aegean, 83200 Samos, Greece
2 Computing Sciences Department, Rochester Institute of Technology, Dubai 341055, United Arab Emirates
3 Department of Informatics, Ionian University, 49100 Corfu, Greece
4 Graduate and Research Department, Rochester Institute of Technology, Dubai 341055, United Arab Emirates
* Authors to whom correspondence should be addressed.
Electronics 2024, 13(14), 2818; https://doi.org/10.3390/electronics13142818
Submission received: 20 May 2024 / Revised: 3 July 2024 / Accepted: 12 July 2024 / Published: 18 July 2024

Abstract:
White blood cell (WBC) classification is pivotal in medical image analysis, playing a critical role in the precise diagnosis and monitoring of diseases. This paper presents a novel convolutional neural network (CNN) architecture designed specifically for the classification of WBC images. Our model, trained on an extensive dataset, automates the extraction of discriminative features essential for accurate subtype identification. We conducted comprehensive experiments on a publicly available image dataset to validate the efficacy of our methodology. Comparative analysis with state-of-the-art methods shows that our approach significantly outperforms existing models in accurately categorizing WBCs into their respective subtypes. An in-depth analysis of the features learned by the CNN reveals key insights into the morphological traits—such as shape, size, and texture—that contribute to its classification accuracy. Importantly, the model demonstrates robust generalization capabilities, suggesting its high potential for real-world clinical implementation. Our findings indicate that the proposed CNN architecture can substantially enhance the precision and efficiency of WBC subtype identification, offering significant improvements in medical diagnostics and patient care.

1. Introduction

The classification of white blood cell (WBC) images is a pivotal task in medical image analysis with far-reaching implications for the diagnosis, treatment, and monitoring of hematological disorders. Diseases such as leukemia, various infections, and immune system anomalies heavily rely on precise WBC analysis for their diagnosis [1]. Traditional methods of WBC classification have been predominantly manual, involving microscopic examination by skilled technicians. This process is not only labor-intensive and time-consuming but also prone to human error, leading to variability in results, which may affect patient outcomes.
The automation of WBC classification through advanced computer vision systems represents a significant stride toward enhancing diagnostic accuracy and efficiency. These systems promise to mitigate the challenges associated with manual methods by providing swift, reliable, and consistent WBC assessments. The integration of automated techniques in clinical settings can potentially revolutionize the standard of care, allowing for real-time monitoring and rapid adjustment of treatment protocols.
White blood cells, or leukocytes, serve as the primary defense mechanism against infections, playing a crucial role in the immune response. Their effective functioning is critical for the body’s ability to combat a wide array of diseases. Consequently, accurate and timely identification of WBC subtypes is essential for initiating appropriate medical interventions in diseases like leukemia, HIV/AIDS, and various autoimmune disorders [2].
The introduction of digital imaging technologies has paved the way for automated WBC classification, marking a substantial advancement over traditional microscopic methods. These technologies leverage the capabilities of machine learning and particularly deep learning models such as CNNs, which excel in handling complex image data. By automatically extracting relevant features from WBC images, CNNs facilitate precise classification, thereby enhancing both the speed and reliability of the diagnostic process [3].
Recent advances in computer vision and machine learning have led to the development of sophisticated algorithms that can autonomously detect and classify WBCs in digital images of blood specimens [4]. These algorithms, powered by deep learning models like CNNs, extract distinctive features from images and achieve high accuracy in classifying WBCs. The shift towards deep learning in computer vision represents a paradigm change, significantly improving the efficacy of image recognition tasks, including object detection, segmentation, and classification.
Despite these advancements, several challenges remain in the field of image-based WBC classification. The variability in WBC morphology, which can be influenced by the patient’s health status and the type of staining technique used, poses a significant challenge for developing robust classification systems. Moreover, the presence of other cellular components like red blood cells and platelets in blood samples can complicate the image analysis, often leading to classification errors [5]. Also, the field faces a critical bottleneck due to the lack of publicly available datasets that are sufficiently large and diverse. Such datasets are essential for training and testing advanced machine learning models and for ensuring that these models can generalize well across different populations and conditions.
This paper introduces a comprehensive approach to white blood cell (WBC) classification that addresses key challenges in the field using a novel CNN architecture enhanced by transfer learning and data augmentation strategies. Our methodology leverages a diverse dataset of WBC images to enhance the robustness and accuracy of our system. By fine-tuning a pre-trained CNN model with this extensive dataset, which includes various cell types and staining conditions, we significantly improve the model’s generalization capabilities to new, unseen data. We augment this approach by employing advanced data augmentation techniques that simulate a wider range of potential variations in WBC images, preparing the model for effective real-world applications. Extensive experimental evaluations on a publicly accessible dataset validate the effectiveness of our proposed method, underscoring its superiority over existing techniques.
Moreover, our study not only advances the technological framework for WBC classification but also lays the groundwork for potential clinical collaborations. By aligning our computational developments with clinical requirements, we aim to bridge the gap between laboratory research and bedside application. Future work will focus on integrating our models into clinical workflows to assess their real-world diagnostic efficacy and impact on treatment outcomes. This integration will validate the practical viability of our models and provide a clear roadmap for further advancements in the application of AI technologies in healthcare. This forward-looking approach ensures that our contributions extend beyond mere technical achievements, fostering significant improvements in patient care and medical research.
The organization of the paper is as follows: Section 2 reviews related work in the field, emphasizing previous studies that have utilized deep learning for WBC classification. Section 3 outlines the methodologies employed, detailing the specifics of the CNN architecture and the computational tools used, such as TensorFlow and Keras. Section 4 describes our model, including a comprehensive discussion of its design and the rationale behind the chosen architecture. Section 5 presents a detailed analysis of the experimental results, demonstrating the effectiveness of our approach through rigorous testing and validation, whereas Section 6 compares our results with those of other existing models. Finally, Section 7 summarizes our findings and discusses potential future research directions.

2. Related Work

The classification of white blood cells (WBCs) in medical image analysis plays a crucial role in diagnosing and monitoring hematological disorders [6]. Initially, classification methods relied on morphological characteristics such as shape, size, and texture, but these approaches struggled with scalability and robustness when applied to large datasets [7].
Advancements in deep learning have revolutionized this field, with CNNs emerging as a dominant technique due to their efficiency in processing spatial hierarchies of image data. The use of AlexNet to distinguish WBC subclasses exemplifies the adaptability of CNNs to medical imaging tasks [8]. Furthermore, the authors of [9] leveraged deep convolutional networks with hyperspectral microscopy to enhance classification through detailed spectral and spatial feature extraction. The introduction of residual convolutional structures with batch normalization significantly improved the reliability of feature extraction for WBC classification [10].
Extending beyond conventional CNNs, innovative approaches such as deformable CNNs have been developed to handle irregular and morphologically complex WBC images. Authors of [11] introduced a weighted optimized deformable CNN, significantly refining classification precision. Similarly, the integration of multilayer convolutional features with extreme learning machines (ELMs) has demonstrated the potential for high-precision WBC identification, underscoring the hybrid application of traditional neural networks and modern deep learning techniques [12].
The effectiveness of these models is showcased in [13], where a classification accuracy of 98.8% was achieved using a combination of generative adversarial networks and deep neural networks for robust learning and data augmentation. In contrast, the application of the Naive Bayes method based on shape features demonstrates the limitations of some traditional statistical approaches, achieving a lower accuracy of 83.2% [14].
Meanwhile, innovative clustering techniques have also demonstrated their utility in WBC classification. K-means clustering has been used to effectively stratify white blood cells, achieving an accuracy of 80% by integrating this approach with a combination of support vector machines (SVM) and neural networks for subsequent classification tasks [15]. Further advancing clustering applications, feature-weighted K-means clustering was employed in [16] to improve the initial extraction of WBCs from complex image backgrounds, thereby enhancing the precision of the subsequent classification stages.
Hybrid models that combine different feature extraction and classification techniques have gained prominence due to their enhanced accuracy and robustness. AlexNet and GoogleNet, combined with support vector machines for optimized feature extraction and classification, were utilized in [17], demonstrating the efficacy of combining multiple deep learning models. The authors of [18] further exemplified the power of ensemble methods by integrating multiple pre-trained models with ELM classifiers, achieving superior performance in feature fusion and classification.
An innovative approach based on the extraction of overlapping and multiple nuclei patches, achieved by combining CNNs with recurrent neural networks, was unveiled in [19]. A two-stage classification paradigm, harnessing a CNN model to effectively orchestrate mononuclear and polymorphonuclear identification, including their associated subtypes, was introduced in [20]. Notably, these hybrid strategies have demonstrated not only heightened accuracy but also enhanced robustness in the sphere of WBC classification.
In addition to CNNs, other neural network architectures like recurrent neural networks (RNNs) are being adapted for medical imaging [21]. A CNN-RNN framework, enhancing the analysis of sequential and structured image data, which is crucial for comprehensive training on large datasets, was developed in [22]. This highlights a growing trend toward combining multiple neural network models to leverage their respective strengths in processing diverse and complex data structures. A CNN-based model with a minimal number of trainable parameters, designed specifically for classifying white blood cell types, was implemented in [23]. A dense CNN, specifically the DenseNet121 model, was employed in [24] for the classification of blood cells from their images, leveraging this advanced architecture to address the significant challenge of blood cell classification, a critical issue in the diagnostics of blood-related conditions.
Furthermore, the exploration of WBC classification has expanded to include various machine-learning paradigms beyond CNNs and SVMs. Techniques such as deep belief networks (DBNs) have been explored for their ability to learn deep hierarchical representations, with Kourou achieving notable accuracy in classifying WBC types [25]. Additionally, feature extraction techniques like local binary patterns (LBP) and histograms of oriented gradients (HOG) are increasingly used to complement deep-learning frameworks, providing robust tools for enhancing classification accuracy [26].
Recent advancements in deep learning have significantly influenced the field of medical image analysis, particularly in the classification of white blood cells (WBCs) [27,28]. Studies such as [29] have begun to explore the integration of Vision Transformers (ViTs) alongside traditional CNN architectures for WBC classification. This hybrid approach leverages the spatial hierarchies processed by CNNs and the global context captured by ViTs, facilitating more accurate and robust classification models. These developments underscore the potential of combining multiple deep-learning architectures to enhance diagnostic accuracy in hematological examinations.
More to the point, the application of transfer learning techniques has been prominently featured in recent research, enhancing the effectiveness of deep learning models in medical imaging domains where data may be scarce or imbalanced [30]. For instance, the authors of [31] demonstrated how pre-trained networks on extensive datasets could be fine-tuned with smaller, domain-specific datasets to improve the classification accuracy of WBCs. This method not only optimizes computational resources but also adjusts to the nuanced differences in medical imaging data, making it a valuable strategy for ongoing and future applications in automated disease diagnosis.
Overall, the field of WBC image classification has evolved from basic morphological analysis to sophisticated models that incorporate deep learning, hybrid systems, and advanced machine learning techniques. This evolution has been supported by the availability of diverse, publicly accessible datasets, allowing for the rigorous evaluation and continuous improvement of classification strategies. As this field advances, the integration of these technologies promises to significantly enhance diagnostic capabilities in hematological assessments.

3. Methodology Foundations

3.1. Convolutional Neural Networks

In the realm of deep learning, CNNs stand out as a sophisticated architecture engineered to leverage the spatial correlations of pixels within images for advanced pattern recognition. This architecture undergoes a comprehensive learning phase, refining its ability to discern intricate patterns by examining varied segments of images. This significantly enhances its recognition capabilities post-training, making it particularly suited for tasks that demand nuanced image interpretation, notably in the medical sector for image-based diagnostics and analysis.
CNNs are celebrated for their efficiency in the automated identification of features, primarily attributed to their strategic use of convolutional and pooling layers. These layers, combined with one or more fully connected layers, facilitate precise data classification. To further enhance their analytical precision, CNNs can incorporate additional fully connected layers, which streamline the feature reduction process and thereby simplify the complexity of image data [32].
Mirroring the structure of multilayer perceptrons, CNNs are composed of an input layer, several hidden layers, and an output layer. The convolutional layer within this architecture is crucial, applying specific operations to highlight essential features of the image. The technique of downsampling is utilized within these layers to boost the efficiency of computational tasks. This strategic layering and operational methodology underscore the CNNs’ adeptness at extracting and interpreting complex visual information with minimal preprocessing and reduced reliance on manual intervention, marking a significant advancement in the field of computer-assisted medical diagnosis and monitoring.

3.2. TensorFlow

TensorFlow is a comprehensive, open-source framework developed by Google, designed to facilitate the execution of complex mathematical computations, which is foundational for building and deploying deep learning models. It is particularly adept at handling dataflow graphs, which map out the movement and transformation of data through a series of processing steps. In these graphs, nodes represent mathematical operations on multidimensional data arrays (tensors), while edges depict the tensors as they flow between operations.
This framework is engineered to perform efficiently across various computing platforms, from mobile devices to large-scale distributed systems leveraging both CPUs and GPUs. This cross-platform flexibility allows TensorFlow to optimize computational resources, making it ideal for the heavy computational demands of training large deep learning models, including those used in image recognition.
In the realm of image recognition, TensorFlow is particularly effective due to its robust handling of convolutional and pooling layers integrated with fully connected layers, which are essential for high-performance image classification tasks. This structure mirrors the architecture of a multilayer perceptron (MLP), consisting of input, hidden, and output layers, where each neuron in one layer is connected to all neurons in the subsequent layer. Such a setup facilitates the hierarchical processing of image data, enhancing the extraction and classification of features as the data progresses through the network [33].
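The dataflow-graph idea described above can be illustrated in a few lines. This is a minimal sketch, not code from the paper: `tf.function` traces a Python function into a graph whose nodes are the matrix-multiply and add operations, with tensors flowing along the edges between them.

```python
import tensorflow as tf

# Two chained operations form a tiny dataflow graph: nodes are the
# matmul and add ops, edges carry the tensors between them.
@tf.function
def affine(x, w, b):
    return tf.matmul(x, w) + b

x = tf.constant([[1.0, 2.0]])        # 1x2 input tensor
w = tf.constant([[3.0], [4.0]])      # 2x1 weight tensor
b = tf.constant([0.5])               # bias
y = affine(x, w, b)
print(y.numpy())                     # [[11.5]]
```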

3.3. Keras

Keras is a high-level, Python-based, open-source interface designed for the streamlined creation and training of deep learning models, particularly within the TensorFlow ecosystem. It simplifies the development process by providing a more abstract and user-friendly layer of operations, which allows developers to focus more on designing and implementing neural networks without being inundated by the intricate details of underlying tensor manipulations.
Keras facilitates model construction through its Sequential API, a method where models are built by stacking layers linearly. This architecture is particularly effective for standard deep learning models as each layer is designed to accept a single tensor as input and output another tensor, creating a clear and efficient pipeline for model building. By abstracting away many of the lower-level operations, Keras enables developers to experiment more freely with deep learning, significantly speeding up the development of sophisticated models without compromising on performance or flexibility [34].
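As a minimal sketch of the Sequential API described above, layers can be stacked linearly, each accepting one tensor and emitting another. The layer types and sizes here are illustrative only, not the architecture proposed in this paper.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Layers stacked linearly: each takes one tensor in, one tensor out.
model = keras.Sequential([
    layers.Input(shape=(64, 64, 3)),          # RGB input image
    layers.Conv2D(16, 3, activation="relu"),  # spatial feature extraction
    layers.MaxPooling2D(2),                   # downsample feature maps
    layers.Flatten(),
    layers.Dense(4, activation="softmax"),    # e.g., four WBC subtypes
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
print(model.output_shape)                     # (None, 4)
```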

3.4. Convolutional Layers

Convolutional layers are the cornerstone of CNNs, uniquely designed to automatically and efficiently extract spatial features such as edges, textures, and shapes from images. These layers operate by sliding a filter or kernel over the input image, calculating the dot product of the filter values with the original pixel values at each position.
The convolution operation in CNNs can be mathematically expressed as:
S(i, j) = (I ∗ K)(i, j) = Σ_m Σ_n I(i + m, j + n) · K(m, n)
where S(i, j) is the output feature map, I is the input image, K is the kernel or filter, (i, j) are the coordinates on the feature map, (m, n) are the coordinates in the kernel, and ∗ denotes the convolution operation.
Each element S(i, j) of the output feature map is the sum of the element-wise product of the kernel K and the portion of the input image I over which the kernel is currently positioned.
For grayscale images, the input matrix I will have a single layer. In contrast, color images typically consist of three layers (RGB), with the convolution operation often performed separately on each layer.
The kernel is a smaller matrix relative to the input image, with dimensions typically 3 × 3 or 5 × 5. It contains weights that are learned during the training process and is designed to detect specific types of features from the input image. As the kernel strides over the input image, it performs element-wise multiplication followed by a sum, producing the output feature map where each element represents the presence and intensity of a feature detected at a specific location.
The dimensions of the output feature map (W_out, H_out) are determined by the size of the input (W_in, H_in), the filter size (F), the stride (S), and the padding (P) using the following equations:
W_out = (W_in − F + 2P) / S + 1
H_out = (H_in − F + 2P) / S + 1
where W_out and H_out are the width and height of the output feature map, W_in and H_in are the width and height of the input, F is the filter size, S is the stride, and P is the padding.
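Both the convolution sum and the output-size formula can be checked with a small NumPy sketch. The toy image and averaging kernel below are chosen purely for illustration.

```python
import numpy as np

def conv2d_valid(I, K):
    """Valid 2-D convolution per the equation above:
    S(i, j) = sum_m sum_n I(i+m, j+n) * K(m, n), stride 1, no padding."""
    H_in, W_in = I.shape
    F = K.shape[0]
    H_out, W_out = H_in - F + 1, W_in - F + 1
    S = np.zeros((H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            S[i, j] = np.sum(I[i:i + F, j:j + F] * K)
    return S

I = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
K = np.ones((3, 3)) / 9.0                      # 3x3 averaging kernel
S = conv2d_valid(I, K)

# Output size matches W_out = (W_in - F + 2P)/S + 1 with P = 0, S = 1:
assert S.shape == ((6 - 3 + 2 * 0) // 1 + 1, (6 - 3 + 2 * 0) // 1 + 1)
print(S.shape)       # (4, 4)
print(S[0, 0])       # 7.0: average of the top-left 3x3 window
```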

3.5. Pooling Layers

Pooling layers are integral components of CNNs, primarily utilized to reduce the spatial dimensions of the feature maps. This reduction is crucial for decreasing the computational load, minimizing overfitting, and retaining only the most essential information, thereby enhancing the network’s generalizability.
Pooling layers decrease the size of the feature maps, which reduces the number of parameters and computations required in the network. This simplification allows the network to focus on the most significant features, helping to ensure that the model remains computationally efficient and less prone to overfitting. Additionally, by summarizing the presence of features in patches of the feature map, pooling enhances the network’s robustness to minor variations and translations in the input image.
There are several types of pooling techniques, including max pooling, average pooling, and global pooling. In this study, we focus on max pooling, which is the most commonly used form of pooling in deep learning applications. Max pooling operates by selecting the maximum value from a set of values within a defined window (or patch) on the feature map and forwarding this value to the next layer. This technique effectively captures the most pronounced feature in each patch, which is particularly useful for features like edges and textures that are critical in image recognition tasks.
The operation of max pooling can be mathematically expressed as follows:
P_max(i, j) = max_{a=0,…,n−1} max_{b=0,…,n−1} F(i·s + a, j·s + b)
where P_max(i, j) is the output of the pooling operation at position (i, j), F is the feature map, n × n is the size of the pooling window, and s is the stride of the pooling window. Variables a and b iterate over the window dimensions, and this operation is applied independently across each position of the feature map to reduce its dimensions.
Pooling layers, by reducing the number of parameters, not only save computational resources but also help make the detection of features invariant to scale and orientation changes, a desirable property in many vision-based applications.
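The max pooling operation can be sketched directly from the formula above. The feature map values below are illustrative.

```python
import numpy as np

def max_pool2d(F, n=2, s=2):
    """Max pooling per the formula above:
    P_max(i, j) = max over (a, b) in an n x n window of F(i*s+a, j*s+b)."""
    H, W = F.shape
    H_out, W_out = (H - n) // s + 1, (W - n) // s + 1
    P = np.empty((H_out, W_out))
    for i in range(H_out):
        for j in range(W_out):
            P[i, j] = F[i * s:i * s + n, j * s:j * s + n].max()
    return P

F = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 0],
              [3, 4, 1, 8]], dtype=float)
P = max_pool2d(F)        # 2x2 windows, stride 2
print(P)                 # [[6. 4.]
                         #  [7. 9.]]
```

Each output value keeps only the strongest response in its window, which is how pooling summarizes feature presence while shrinking the map.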

3.6. Batch Normalization

Batch Normalization (BN) has become a cornerstone technique in deep learning, particularly valued for enhancing the stability and efficiency of neural network training. It is especially beneficial for deep networks, helping to accelerate the training phase and improve the overall performance and accuracy of the model. Despite its widespread use and observable benefits, the exact mechanisms and theoretical underpinnings of BN continue to be subjects of ongoing research and debate [35].
The principal advantage of batch normalization is its effectiveness in combating the problem of internal covariate shifts. This phenomenon occurs when the distributions of each layer’s inputs change during training, which can slow down the training process and lead to unstable convergence behaviors. BN tackles this by normalizing the inputs of each layer to ensure they have a consistent mean and variance:
x̂_i = (x_i − μ_B) / √(σ_B² + ε)
where x_i is the input to a layer, μ_B and σ_B² are the mean and variance calculated over the batch, and ε is a small constant added for numerical stability. This normalization allows each layer to learn on a more stable distribution of inputs, facilitating a smoother and faster training process.
By standardizing the inputs in this way, BN enables higher learning rates to be used without the risk of instabilities typically induced by unfavorable initial parameter choices or extreme value ranges. This can significantly speed up the convergence of the training process. Furthermore, BN helps to prevent the network from reaching saturation points—states where changes in input produce minimal or no change in output—which can impede learning. It maintains activation functions within their non-saturating regions, thereby enhancing the sensitivity and responsiveness of the network during training.
Overall, batch normalization has proven to be an effective method for improving the training stability and performance of neural networks, contributing to faster convergence rates and more consistent training outcomes. Its integration into modern neural architectures is indicative of its crucial role in advancing the field of deep learning [36].
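The normalization step at the heart of BN can be sketched in NumPy. The batch values are illustrative, and the learnable scale and shift parameters (γ, β) that follow normalization in practice are omitted for brevity.

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch dimension:
    x_hat_i = (x_i - mu_B) / sqrt(sigma_B^2 + eps)."""
    mu = x.mean(axis=0)          # per-feature batch mean
    var = x.var(axis=0)          # per-feature batch variance
    return (x - mu) / np.sqrt(var + eps)

x = np.array([[1.0, 50.0],
              [2.0, 60.0],
              [3.0, 70.0]])      # batch of 3 samples, 2 features
x_hat = batch_norm(x)

# After normalization, each feature has ~zero mean and ~unit variance:
assert np.allclose(x_hat.mean(axis=0), 0.0, atol=1e-6)
assert np.allclose(x_hat.var(axis=0), 1.0, atol=1e-3)
```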

3.7. Dropout

In the domain of large-scale machine learning, particularly in deep neural networks, overfitting is a pervasive challenge. Overfitting occurs when a model performs exceptionally well on training data but poorly on unseen data, a problem exacerbated by the complex architectures and large parameter sets characteristic of deep networks. Dropout is a regularization technique specifically designed to prevent this issue by randomly disabling certain neurons and their connections during the training phase, thus reducing the risk of interdependent neuron behavior.
The mechanism of dropout involves randomly selecting a subset of neurons in each training iteration and temporarily removing them along with all their incoming and outgoing connections. This process creates a “thinned” network, where the surviving neurons must adapt to the absence of their dropped counterparts. Mathematically, if a neuron’s output is represented by x, then during training, dropout is applied by multiplying x by a random variable d drawn from a Bernoulli distribution:
x̃ = d · x
where x̃ is the retained output and d is 1 with probability p (the retention probability) and 0 with probability 1 − p. This operation is performed independently for each neuron, resulting in different network architectures in each training iteration.
Dropout has been empirically shown to significantly improve the generalization of neural networks, particularly in scenarios where the training data are limited and the network is large and complex. Unlike traditional regularization methods, which might involve constraining the magnitude of weights directly, dropout regularizes the model by enhancing the diversity of the internal representations learned during training. This diversity ensures that the model does not rely too heavily on any single or small group of features, leading to better performance on unseen datasets [37].
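A minimal NumPy sketch of the Bernoulli masking described above (seed and retention probability are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(x, p=0.8, training=True):
    """Training-time dropout per the formula above: each activation is
    kept with retention probability p (d ~ Bernoulli(p)), zeroed otherwise.
    At inference, activations pass through unchanged."""
    if not training:
        return x
    d = rng.binomial(1, p, size=x.shape)   # d is 1 w.p. p, 0 w.p. 1-p
    return d * x

x = np.ones(10_000)
y = dropout(x, p=0.8)
print(y.mean())   # ≈ 0.8: roughly 20% of activations were dropped
```

Note that practical frameworks typically also rescale the kept activations by 1/p ("inverted dropout") so that no adjustment is needed at inference time; the sketch follows the plain formulation given above.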

3.8. Model Configuration and Optimization

The configuration and optimization of our CNN models are pivotal in achieving high performance in white blood cell classification. Here, we detail the model setup, including hyperparameter tuning, optimization algorithms used, and strategies for effective training:
  • Hyperparameter Tuning: We employed a systematic approach to hyperparameter tuning, utilizing grid search and random search methods to identify the optimal settings for learning rate, batch size, and the number of layers. Each parameter was chosen based on its impact on model accuracy and training time, ensuring a balanced approach between computational efficiency and predictive performance.
  • Optimization Algorithms: Our models utilize the Adam optimizer, known for its efficiency in handling sparse gradients and adaptive learning rate capabilities. This choice is particularly beneficial for medical imaging tasks, where model convergence stability is crucial due to the varied nature of image data.
  • Training Strategies: To combat the challenges of overfitting and underfitting, especially prevalent in medical diagnostics due to the high stakes of accurate prediction, we implement early stopping and model checkpointing. These strategies ensure that our models do not train beyond the point of diminishing returns and that the best-performing model state is preserved for final evaluation.
  • Regularization Techniques: Beyond dropout, we explore L2 regularization to penalize the complexity of the model weights, further ensuring that our models generalize well on unseen data. This is particularly important in medical applications where new patient data must be reliably evaluated by the model.
This approach to model configuration and optimization ensures that our CNN architectures are not only tailored for high accuracy in classifying white blood cells but are also robust and reliable for clinical application, paving the way for their use in real-world medical diagnostics.
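The training strategies listed above map onto standard Keras utilities. The sketch below shows how early stopping, checkpointing, and L2 regularization are wired up; the patience value, filename, and L2 coefficient are illustrative assumptions, not the settings used in this study.

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# Early stopping: halt training when validation loss stops improving.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

# Checkpointing: keep the best-performing weights seen so far.
checkpoint = keras.callbacks.ModelCheckpoint(
    "best_wbc_model.keras", monitor="val_loss", save_best_only=True)

# L2 weight penalty on a dense layer (coefficient is illustrative).
dense = layers.Dense(64, activation="relu",
                     kernel_regularizer=regularizers.l2(1e-4))

# Typical usage (hypothetical data names):
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop, checkpoint])
```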

4. Model Architecture

In this study, we developed a CNN specifically tailored for the classification of white blood cells. The model architecture is designed to process input images of white blood cells and classify them into their respective subtypes. It consists of several convolutional and pooling layers that work in tandem to extract and condense spatial features from the images, followed by fully connected layers that interpret these features to make final predictions.
The CNN model begins with an input layer that takes an image of a white blood cell. Following the input layer, multiple convolutional layers equipped with spatial filters perform feature extraction. Each convolutional operation is followed by batch normalization, which stabilizes learning by normalizing the layer inputs. Pooling layers interspersed among the convolutional layers reduce the spatial dimensions of the feature maps, thereby decreasing computational complexity and enhancing the model’s ability to focus on essential features.
After extracting and pooling the features, the data are flattened and passed through a dense neural network structure. This portion of the network comprises fully connected layers that interpret the features extracted by the convolutional layers and perform the final classification.
The CNN model variations are implemented using three distinct architectures, each designed to test different configurations and their impact on model performance. The proposed architectures are illustrated in Figure 1, showcasing the layer sequences and operations within each model.
The layers utilized across the three architectures include:
  • Input() instantiates a symbolic tensor representing the input images.
  • Conv2D() applies spatial convolution over the images.
  • BatchNormalization() normalizes the activations of the previous layer at each batch.
  • MaxPooling2D() performs spatial pooling (downsampling).
  • Flatten() flattens the input for use in the fully connected layers.
  • Dense() creates a densely connected neural network layer.
  • Softmax() applies the softmax activation function to the final layer outputs.
The specific layer sequences and configurations used in each proposed architecture are detailed in Table 1. This table summarizes the layer operations and their arrangement within each model design.
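The pooling and softmax operations listed above can be sketched in plain Python. These are framework-agnostic illustrations of the two operations, not the library implementations used in our models:

```python
import math

def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 over a 2-D feature map (list of lists):
    each output value is the maximum of a non-overlapping 2x2 window."""
    out = []
    for i in range(0, len(fmap) - 1, 2):
        row = []
        for j in range(0, len(fmap[0]) - 1, 2):
            row.append(max(fmap[i][j], fmap[i][j + 1],
                           fmap[i + 1][j], fmap[i + 1][j + 1]))
        out.append(row)
    return out

def softmax(logits):
    """Softmax over the final layer outputs, turning raw class scores into
    a probability distribution; the max is subtracted for numerical stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

Pooling halves each spatial dimension while preserving the strongest responses, and softmax guarantees the four class scores sum to one.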
Trained on a comprehensive dataset of white blood cell images, these models achieved commendable classification accuracy. Each architecture was first trained independently and then evaluated under a unified protocol so that their performance could be compared directly.

5. Evaluation

5.1. Dataset

The dataset employed in this study is sourced from a publicly accessible collection of blood cell images on Kaggle (https://www.kaggle.com/datasets/paultimothymooney/blood-cells, accessed on 11 July 2024), consisting of 410 original high-resolution images. These original images have been augmented to include a total of 3000 images for each of the four main cell types: eosinophil, lymphocyte, monocyte, and neutrophil, creating a comprehensive dataset designed to enhance model training through diverse representations of cell characteristics.
To facilitate effective model training and validation, the dataset was strategically split into training and validation subsets. An 80:20 split was applied, allocating 80% of the images for training purposes, while the remaining 20% were used as a validation subset. This division aids in fine-tuning the models and ensures that the performance metrics are reflective of the model’s capability to generalize to unseen data.
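The 80:20 split described above can be sketched as follows; this is an illustrative, framework-agnostic version (the seed and exact shuffling procedure are assumptions for the sketch, not details reported in the paper):

```python
import random

def train_val_split(items, train_frac=0.8, seed=42):
    """Shuffle a list of samples and split it into training and validation
    subsets; the seed makes the split reproducible."""
    rng = random.Random(seed)
    shuffled = items[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_frac)
    return shuffled[:cut], shuffled[cut:]
```

Shuffling before splitting avoids ordering artifacts (e.g., all images of one cell type landing in the validation subset).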
Within the dataset, each cell type is equally represented, providing a balanced basis for training and validation. The dataset also includes a further-augmented set of images that increases the diversity and complexity of the training data, aiding the development of robust machine-learning models.
The distribution of images across different subsets—training, testing, and a simplified testing set—is detailed in Table 2. This table provides an overview of the number of images available for each cell type within each subset, supporting a comprehensive evaluation of the model performance across varied conditions.
This structured and well-distributed dataset provides a solid foundation for assessing the efficacy and generalizability of the developed CNN models, highlighting their ability to classify blood cell types with high accuracy.

5.2. Results and Analysis

This section provides a detailed analysis of our models’ performance on the validation subsets, focusing on accuracy, loss, and computational time. Evaluating on held-out data underscores the models’ capability to generalize to unseen samples, which is crucial for practical clinical applications. Table 3 and Figure 2, Figure 3, Figure 4, Figure 5, Figure 6 and Figure 7 depict these metrics, evaluated specifically on the validation subsets, ensuring that the results reflect true predictive performance rather than memorization of the training data.
Each architecture was evaluated using batch sizes ranging from 128 to 4096, with performance metrics recorded at various training milestones (1, 5, 10, 15, and 20 epochs). The results suggest that smaller batch sizes generally lead to quicker learning but may also pose a higher risk of overfitting, whereas larger batch sizes yield more stable but slower learning processes.
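One reason smaller batches learn faster per epoch is that they perform more gradient updates. A quick sketch of the arithmetic (the training-set total of 9957 images is the sum of the per-class counts in Table 2; the sketch itself is illustrative):

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of gradient updates performed in one epoch: one update per
    batch, with a final partial batch if the sizes do not divide evenly."""
    return math.ceil(n_samples / batch_size)
```

At batch size 128 the network receives roughly 78 updates per epoch, versus only about 3 at batch size 4096, which is consistent with the faster early loss reduction observed for small batches.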
The performance of each architecture across different batch sizes is visually represented in the figures below. These graphs illustrate the trajectory of loss, accuracy, and computational time, providing insights into the scalability and efficiency of each model.

5.2.1. First Architecture: Basic Layer Repeats

The first architecture employs a straightforward sequence of Conv2D, BatchNorm, and MaxPooling2D layers, a configuration designed to optimize the processing of spatial hierarchies in image data. This setup is particularly effective at quickly identifying and extracting key features from images due to the convolutional layers’ sensitivity to spatial input variations. The inclusion of BatchNorm stabilizes the learning process by normalizing the input layers, reducing internal covariate shift and accelerating the convergence. MaxPooling simplifies the input representation by summarizing the presence of features, which reduces the computational burden and helps in achieving translation invariance to minor shifts and distortions in the input image.
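The normalization performed by BatchNorm can be sketched for a single feature across a batch. This is a simplified illustration: the learnable scale and shift are shown as fixed gamma/beta values, and the running statistics used at inference time are omitted:

```python
import math

def batch_norm(batch, eps=1e-5, gamma=1.0, beta=0.0):
    """Normalize a batch of activations for one feature to zero mean and
    unit variance, then apply the (here fixed) scale gamma and shift beta."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
```

Because each layer then sees inputs with a stable distribution, higher learning rates can be used and convergence accelerates, which is the effect exploited by this architecture.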
In practice, the architecture shows rapid gains in efficiency during the initial epochs, especially evident in smaller batch sizes. For instance, the dramatic reduction in loss from 1.420 to 0.6427 within the first five epochs at a batch size of 128 underlines the architecture’s capacity for quick adaptation to the training data. As the epochs progress, particularly from 10 to 20, there is a notable stabilization in accuracy, consistently exceeding 95%. This demonstrates not only the architecture’s effective learning in the early stages but also its ability to maintain and refine these gains over an extended period, thus indicating a robust ability to generalize from the training data without overfitting despite increased epoch numbers.

5.2.2. Second Architecture: Increased Depth

The second architecture enhances the network’s depth by incorporating additional Conv2D layers between BatchNorm and MaxPooling stages. This design is tailored to handle more complex image structures by allowing for deeper feature extraction, which is crucial for identifying subtle differences between classes in detailed medical images. The additional Conv2D layers enable the architecture to capture a broader array of features at different levels of abstraction, improving the network’s interpretative capabilities. This setup tends to produce a model that not only learns more effectively but also provides a richer representation of the input data, enhancing the overall accuracy and reliability of the predictions.
During the training process, this architecture benefits significantly from mid-sized batches, as they provide a good compromise between the speed of learning and the depth of feature exploration. From epochs 1 to 5, there is a moderate but steady decrease in loss, particularly for batch size 256, where it drops from 1.456 to 0.5114. This indicates that the architecture quickly adapts to feature extraction despite the complexity of the input. As training advances towards 20 epochs, the model continues to improve, achieving near-perfect accuracy rates. The deep layers allow for nuanced learning, which is evident in the consistent and substantial accuracy improvements even in later epochs, highlighting the architecture’s capacity to effectively generalize and adapt to complex patterns.

5.2.3. Third Architecture: Integration of Dropout

The third architecture integrates Dropout layers with repeated blocks of Conv2D and BatchNorm followed by MaxPooling2D, specifically aiming to increase the model’s robustness against overfitting—a common challenge in deep learning models dealing with complex datasets. Dropout randomly disables neurons during training, forcing the network to learn more robust features that are not reliant on any specific subset of neurons. This approach enhances the model’s generalization capabilities and provides a safety net that prevents the model from memorizing the training data too closely. Moreover, the use of Dropout promotes the development of a more versatile model capable of performing well across various unseen datasets, making it particularly valuable in medical applications where the accuracy and generalizability of the results are critical.
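The behavior described above corresponds to "inverted" dropout, which most frameworks implement; a minimal sketch (the seed parameter is an illustration aid, not part of the standard operation):

```python
import random

def dropout(activations, rate=0.5, training=True, seed=0):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and scale survivors by 1/(1-rate) so the expected activation is
    unchanged; at inference time the input passes through untouched."""
    if not training or rate == 0.0:
        return activations[:]
    rng = random.Random(seed)
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]
```

Scaling the surviving units during training is what allows the same network to be used at inference time without any compensation step.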
This architecture’s performance in initial epochs shows a slower improvement rate, especially in larger batch sizes, reflecting the conservative learning approach enforced by Dropout. For instance, the accuracy improvements from epoch 1 to 5 are more gradual, which mitigates rapid overfitting risks but also indicates a more steady and stable learning curve. By epochs 15 to 20, the architecture achieves stable and consistent accuracy improvements, demonstrating the effectiveness of Dropout in maintaining training stability. This gradual but consistent improvement across epochs, especially noticeable in settings with larger batch sizes, showcases the model’s ability to achieve and maintain high performance over time without sacrificing the robustness of the training process.

5.3. Misclassification Analysis

To enhance our understanding of the model’s performance and identify potential areas for improvement, we delve into the analysis of misclassifications, focusing on false positives, false negatives, true positives, and true negatives. Analyzing these aspects helps us pinpoint specific challenges the models face with each cell type, which is crucial for refining our approaches in medical image classification where accuracy is critical.
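These counts map directly onto the standard diagnostic metrics used in the discussion below; a minimal, framework-independent sketch, exercised with the eosinophil counts reported for the first architecture in Table 4:

```python
def classification_metrics(tp, fp, tn, fn):
    """Per-class metrics derived from confusion counts:
    sensitivity (recall) = fraction of actual positives detected,
    specificity = fraction of actual negatives correctly rejected,
    precision = fraction of positive predictions that are correct."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return sensitivity, specificity, precision
```

Applied per class, these quantities make explicit the trade-off discussed below between false negatives (lower sensitivity) and false positives (lower precision and specificity).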
Table 4 illustrates the misclassification breakdown for our first architecture. This model shows a higher rate of false negatives for neutrophils, suggesting difficulties in accurately detecting this class under certain conditions, which could be due to its morphological similarities with other cell types.
This architecture’s strong performance in identifying eosinophils with a low rate of false positives indicates its effectiveness in capturing the distinctive features of this cell type. However, the relative increase in false negatives for neutrophils warrants further investigation into the convolutional layers’ sensitivity and perhaps adjustments in the feature extraction layers to better differentiate between similar cell types.
As shown in Table 5, the second architecture reduces false negatives across all classes, demonstrating an improved overall sensitivity compared to the first model. This suggests that the additional convolutional layers included in this architecture enhance its ability to discern subtle features critical for class differentiation.
The decrease in false positives for eosinophils and lymphocytes highlights the model’s precision, making it particularly suitable for applications where the cost of a false positive is high. Continuing to enhance depth and perhaps integrating more targeted dropout strategies could further optimize its performance.
The third architecture, summarized in Table 6, delivers the most balanced performance among the three, with notably high true positives and minimal false positives for eosinophils, underscoring the effectiveness of integrating Dropout at reducing overfitting and enhancing generalization.
This architecture’s robustness against false positives and its consistent true positive rate suggests it is well-suited for clinical settings where both high sensitivity and specificity are required. Future work could explore scaling up the complexity of this model or adjusting dropout rates to fine-tune its performance further.
These detailed analyses of misclassifications across different architectures provide critical insights into each model’s operational strengths and weaknesses. They inform our ongoing efforts to enhance model accuracy and reliability, ensuring that our CNNs can meet the stringent requirements of medical diagnostic applications.

6. Comparative Analysis and Discussion

This section provides a comparative analysis between our models and existing alternatives in the field of white blood cell classification. The performance of our models is evaluated in the context of accuracy and compared with the results reported in several notable studies. The analysis specifically highlights the performance of our second architecture, which consists of Conv2D layers alternated with BatchNorm and MaxPooling2D layers, achieving superior results at a batch size of 512.

6.1. Performance Overview

The second architecture, with its enhanced depth and effective feature extraction capabilities, achieved an impressive accuracy of 98%. This result is particularly notable as it surpasses several established benchmarks in the field. Table 7 outlines the accuracies achieved by different studies, providing a clear perspective on the performance of our models relative to others.

6.2. Discussion on Model Performance

At 98%, the accuracy achieved by our second architecture positions it competitively among the high-performing models in this domain. Notably, it exceeds the 94% reported in [24] and significantly surpasses the 90.79% reported in [22], indicating the robust capability of our model to handle complex image classification tasks more effectively than these earlier methods.
Moreover, while our model’s accuracy approaches that of [23], where the authors achieved an impressive 99.5%, it provides a balanced approach that likely offers advantages in other performance metrics such as computational efficiency or model stability, which are essential for practical applications. The slightly lower accuracy compared to [23] may also reflect differences in model complexity or training procedures, suggesting areas for future optimization.
In contrast to [18], which achieved 96.03%, our model demonstrates a clear improvement, further validating the effectiveness of the chosen architecture and training strategy. Including in the comparison the community-developed Kaggle model [38], which achieved 98.5%, broadens the evaluation spectrum, showcasing our model’s competitive stance not only in academic work but also in community-driven development. These comparisons highlight our model’s competitive edge in accurately classifying white blood cells, underscoring its potential utility in medical diagnostics and research.
The results of this comparative analysis underscore the significance of architectural choices and training strategies in developing effective deep-learning models for medical image analysis. The success of our second architecture confirms its suitability for tasks requiring high accuracy, making it a valuable tool for advancing medical diagnostics.

7. Conclusions and Future Work

This study has demonstrated the significant potential of convolutional neural networks (CNNs) in the classification of white blood cell images, highlighting their effectiveness in accurately categorizing the four principal subtypes: eosinophils, lymphocytes, monocytes, and neutrophils. The high classification accuracy achieved by the trained CNN models emphasizes their substantial utility in medical diagnostics and research, offering promising enhancements to existing methodologies.
Our results lay a robust foundation for further research in the domain of blood cell image classification. One immediate direction for future work is the exploration of CNNs’ capabilities in identifying specific abnormalities or diseases related to different blood cell types. This could involve the development of specialized CNN architectures and training techniques designed to precisely detect and classify pathological variations in blood cells, which are indicative of various medical conditions. Additionally, integrating CNNs with other advanced imaging techniques, such as fluorescence microscopy or digital pathology, could lead to a comprehensive multimodal approach for in-depth blood cell analysis, enhancing diagnostic accuracy and reliability.
Future studies should also focus on improving the interpretability and transparency of CNN models in the context of blood cell classification. Developing techniques to visualize and interpret the features learned by CNNs can provide deeper insights into the characteristics utilized for classification. Implementing Explainable AI (XAI) methodologies such as Layer-wise Relevance Propagation (LRP) or Grad-CAM would allow for visual explanations of which image regions are most important for predictions. Such advancements would help demystify the decision-making processes of CNNs, increasing trust and facilitating their adoption in clinical settings where explainability is crucial.

Clinical Applications and Implementation

As we look toward the integration of advanced CNN models into clinical practice, it is imperative to consider their practical deployment in diagnostic laboratories. The potential of our model to enhance the accuracy and efficiency of white blood cell subtype classification promises significant improvements in disease diagnosis and monitoring. In particular, our model can be adapted to aid in the rapid identification of blood cell abnormalities, a crucial factor in the timely treatment of hematological disorders.
To facilitate the clinical adoption of our model, several steps will be undertaken. Firstly, extensive validation on a diverse clinical dataset will be essential to ascertain the model’s robustness and reliability. Furthermore, integration efforts will focus on compatibility with existing laboratory information systems to ensure seamless workflow integration and regulatory compliance.
Future research will explore enhancing the interpretability of the CNN’s decision-making processes to meet clinical standards. Additionally, a pilot implementation in a clinical setting is proposed to evaluate the model’s practical utility and gather real-world data, which will be invaluable in refining the model for widespread clinical use.
In summary, our research confirms the efficacy of CNNs in enhancing blood cell classification, supporting their broader application in medical diagnostics. Looking ahead, the paths outlined above not only promise to advance the technical capabilities of CNNs but also aim to bridge the gap between machine learning innovations and clinical practice. These future endeavors could potentially transform diagnostic processes, leading to improved disease detection, and a deeper understanding of the underlying biological processes through more sophisticated blood cell analysis techniques.

Author Contributions

Conceptualization, A.K., O.P. and M.M.; data curation, A.K.; formal analysis, A.K., O.P., M.M. and I.K.; funding acquisition, K.A.-H.; investigation, A.K., O.P., K.A.-H., M.M. and I.K.; methodology, A.K., O.P., K.A.-H. and I.K.; project administration, A.K. and O.P.; resources, A.K. and K.A.-H.; software, A.K. and O.P.; supervision, O.P., M.M. and I.K.; validation, A.K. and I.K.; visualization, A.K. and M.M.; writing—original draft, A.K.; writing—review and editing, A.K., O.P., K.A.-H., M.M. and I.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported in part by the DSO-RIT Dubai Research Fund (2023-24-1004) from the Rochester Institute of Technology—Dubai (RIT-Dubai).

Data Availability Statement

No new data were created in this study; the publicly available Kaggle blood cell image dataset analyzed is described in Section 5.1.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Jung, C.; Abuhamad, M.; Alikhanov, J.; Mohaisen, A.; Han, K.; Nyang, D. W-net: A CNN-based architecture for white blood cells image classification. arXiv 2019, arXiv:1910.01091. [Google Scholar]
  2. Singh, I.; Singh, N.P.; Singh, H.; Bawankar, S.; Ngom, A. Blood Cell Types Classification Using CNN. In Proceedings of the Bioinformatics and Biomedical Engineering—8th International Work-Conference, IWBBIO 2020, Granada, Spain, 6–8 May 2020; Rojas, I., Valenzuela, O., Rojas, F., Herrera, L.J., Guzman, F.M.O., Eds.; Proceedings; Lecture Notes in Computer Science. Springer: Berlin/Heidelberg, Germany, 2020; Volume 12108, pp. 727–738. [Google Scholar]
  3. Wang, Q.; Chang, L.; Zhou, M.; Li, Q.; Liu, H.; Guo, F. A spectral and morphologic method for white blood cell classification. Opt. Laser Technol. 2016, 84, 144–148. [Google Scholar] [CrossRef]
  4. Shin, H.; Roth, H.R.; Gao, M.; Lu, L.; Xu, Z.; Nogues, I.; Yao, J.; Mollura, D.J.; Summers, R.M. Deep Convolutional Neural Networks for Computer-Aided Detection: CNN Architectures, Dataset Characteristics and Transfer Learning. IEEE Trans. Med Imaging 2016, 35, 1285–1298. [Google Scholar] [CrossRef]
  5. Hegde, R.B.; Prasad, K.; Hebbar, H.; Singh, B.M.K. Development of a robust algorithm for detection of nuclei and classification of white blood cells in peripheral blood smear images. J. Med. Syst. 2018, 42, 110. [Google Scholar] [CrossRef]
  6. Khan, S.; Sajjad, M.; Hussain, T.; Ullah, A.; Imran, A.S. A Review on Traditional Machine Learning and Deep Learning Models for WBCs Classification in Blood Smear Images. IEEE Access 2021, 9, 10657–10673. [Google Scholar] [CrossRef]
  7. Deshpande, N.M.; Gite, S.; Aluvalu, R. A review of microscopic analysis of blood cells for disease detection with AI perspective. PeerJ Comput. Sci. 2021, 7, e460. [Google Scholar] [CrossRef]
  8. Togacar, M.; Ergen, B.; Sertkaya, M.E. Subclass separation of white blood cell images using convolutional neural network models. Elektron. Ir Elektrotechnika 2019, 25, 63–68. [Google Scholar] [CrossRef]
  9. Wang, Q.; Wang, J.; Zhou, M.; Li, Q.; Wen, Y.; Chu, J. A 3D attention networks for classification of white blood cells from microscopy hyperspectral images. Opt. Laser Technol. 2021, 139, 106931. [Google Scholar] [CrossRef]
  10. Jiang, M.; Cheng, L.; Qin, F.; Du, L.; Zhang, M. White Blood Cells Classification with Deep Convolutional Neural Networks. Int. J. Pattern Recognit. Artif. Intell. 2018, 32, 1857006:1–1857006:19. [Google Scholar] [CrossRef]
  11. Yao, X.; Sun, K.; Bu, X.; Zhao, C.; Jin, Y. Classification of white blood cells using weighted optimized deformable convolutional neural networks. Artif. Cells Nanomed. Biotechnol. 2021, 49, 147–155. [Google Scholar] [CrossRef]
  12. Khan, A.; Eker, A.; Chefranov, A.G.; Demirel, H. White blood cell type identification using multi-layer convolutional features with an extreme-learning machine. Biomed. Signal Process. Control 2021, 69, 102932. [Google Scholar] [CrossRef]
  13. Almezhghwi, K.; Serte, S. Improved Classification of White Blood Cells with the Generative Adversarial Network and Deep Convolutional Neural Network. Comput. Intell. Neurosci. 2020, 2020, 6490479:1–6490479:12. [Google Scholar] [CrossRef]
  14. Ghosh, M.; Das, D.; Mandal, S.; Chakraborty, C.; Pala, M.; Maity, A.K.; Pal, S.K.; Ray, A.K. Statistical pattern analysis of white blood cell nuclei morphometry. In Proceedings of the 2010 IEEE Students Technology Symposium (TechSym), Kharagpur, India, 3–4 April 2010; pp. 59–66. [Google Scholar]
  15. Habibzadeh, M.; Jannesari, M.; Rezaei, Z.; Baharvand, H.; Totonchi, M. Automatic white blood cell classification using pre-trained deep learning models: ResNet and Inception. In Proceedings of the Tenth International Conference on Machine Vision, ICMV 2017, Vienna, Austria, 13–15 November 2017; Volume 10696, p. 1069612. [Google Scholar]
  16. Lin, L.; Wang, W.; Chen, B. Leukocyte recognition with convolutional neural network. J. Algorithms Comput. Technol. 2018, 13, 1748301818813322. [Google Scholar] [CrossRef]
  17. Çınar, A.; Tuncer, S.A. Classification of lymphocytes, monocytes, eosinophils, and neutrophils on white blood cells using hybrid Alexnet-GoogleNet-SVM. SN Appl. Sci. 2021, 3, 503. [Google Scholar] [CrossRef]
  18. Özyurt, F. A fused CNN model for WBC detection with MRMR feature selection and extreme learning machine. Soft Comput. 2020, 24, 8163–8172. [Google Scholar] [CrossRef]
  19. Patil, A.; Patil, M.; Birajdar, G. White blood cells image classification using deep learning with canonical correlation analysis. IRBM 2021, 42, 378–389. [Google Scholar] [CrossRef]
  20. Baghel, N.; Verma, U.; Nagwanshi, K.K. WBCs-Net: Type identification of white blood cells using convolutional neural network. Multim. Tools Appl. 2022, 81, 42131–42147. [Google Scholar] [CrossRef]
  21. Kanavos, A.; Papadimitriou, O.; Kaponis, A.; Maragoudakis, M. Enhancing Disease Diagnosis: A CNN-Based Approach for Automated White Blood Cell Classification. In Proceedings of the IEEE International Conference on Big Data (BigData) 2023, Sorrento, Italy, 15–18 December 2023; pp. 4606–4613. [Google Scholar]
  22. Liang, G.; Hong, H.; Xie, W.; Zheng, L. Combining Convolutional Neural Network With Recursive Neural Network for Blood Cell Image Classification. IEEE Access 2018, 6, 36188–36197. [Google Scholar] [CrossRef]
  23. Nahzat, S.; Bozkurt, F.; Yağanoğlu, M. White blood cell classification using convolutional neural network. J. Sci. Technol. Eng. Res. 2022, 3, 32–41. [Google Scholar] [CrossRef]
  24. Bozkurt, F. Classification of blood cells from blood cell images using dense convolutional network. J. Sci. Technol. Eng. Res. 2021, 2, 81–88. [Google Scholar] [CrossRef]
  25. Kourou, K.; Exarchos, T.P.; Exarchos, K.P.; Karamouzis, M.V.; Fotiadis, D.I. Machine learning applications in cancer prognosis and prediction. Comput. Struct. Biotechnol. J. 2015, 13, 8–17. [Google Scholar] [CrossRef]
  26. Sertel, O.; Kong, J.; Shimada, H.; Çatalyürek, Ü.V.; Saltz, J.H.; Gurcan, M.N. Computer-aided prognosis of neuroblastoma on whole-slide images: Classification of stromal development. Pattern Recognit. 2009, 42, 1093–1103. [Google Scholar] [CrossRef]
  27. Bayat, N.; Davey, D.D.; Coathup, M.; Park, J. White Blood Cell Classification Using Multi-Attention Data Augmentation and Regularization. Big Data Cogn. Comput. 2022, 6, 122. [Google Scholar] [CrossRef]
  28. Dong, N.; Feng, Q.; Chang, J.; Mai, X. White blood cell classification based on a novel ensemble convolutional neural network framework. J. Supercomput. 2024, 80, 249–270. [Google Scholar] [CrossRef]
  29. Ali, M.A.; Dornaika, F.; Arganda-Carreras, I. White Blood Cell Classification: Convolutional Neural Network (CNN) and Vision Transformer (ViT) under Medical Microscope. Algorithms 2023, 16, 525. [Google Scholar] [CrossRef]
  30. Tamang, T.; Baral, S.; Paing, M.P. Classification of White Blood Cells: A Comprehensive Study Using Transfer Learning Based on Convolutional Neural Networks. Diagnostics 2022, 12, 2903. [Google Scholar] [CrossRef]
  31. Shahin, A.I.; Guo, Y.; Amin, K.M.; Sharawi, A.A. White blood cells identification system based on convolutional deep neural learning networks. Comput. Methods Programs Biomed. 2019, 168, 69–80. [Google Scholar] [CrossRef]
  32. Desai, M.; Shah, M. An anatomization on breast cancer detection and diagnosis employing multi-layer perceptron neural network (MLP) and Convolutional neural network (CNN). Clin. eHealth 2021, 4, 1–11. [Google Scholar] [CrossRef]
  33. Smilkov, D.; Thorat, N.; Assogba, Y.; Yuan, A.; Kreeger, N.; Yu, P.; Zhang, K.; Cai, S.; Nielsen, E.; Soergel, D.; et al. TensorFlow.js: Machine Learning for the Web and Beyond. Proc. Mach. Learn. Res. 2019, 1, 309–321. [Google Scholar]
  34. Manaswi, N.K. Understanding and working with Keras. In Deep Learning with Applications Using Python; Apress: Berkeley, CA, USA, 2018; pp. 31–43. [Google Scholar]
  35. Bjorck, N.; Gomes, C.P.; Selman, B.; Weinberger, K.Q. Understanding batch normalization. In Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, QC, Canada, 3–8 December 2018; Volume 31. [Google Scholar]
  36. Ioffe, S.; Szegedy, C. Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Proc. Mach. Learn. Res. 2015, 37, 448–456. [Google Scholar]
  37. Srivastava, N.; Hinton, G.E.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958. [Google Scholar]
  38. Blood Cell Images Using CNN Model. Available online: https://www.kaggle.com/code/mohamedgobara/blood-cell-images-using-cnn-model-98-5/notebook (accessed on 11 July 2024).
Figure 1. Visual representations of the proposed CNN architectures where each diagram delineates the arrangement and operations of layers within the models.
Figure 2. Accuracy trajectories for different batch sizes across the three proposed models on the validation subset 1/2.
Figure 3. Accuracy trajectories for different batch sizes across the three proposed models on the validation subset 2/2.
Figure 4. Loss trajectories for different batch sizes across the three proposed models on the validation subset 1/2.
Figure 5. Loss trajectories for different batch sizes across the three proposed models on the validation subset 2/2.
Figure 6. Computational time required for different batch sizes across the three proposed models on the validation subset 1/2.
Figure 7. Computational time required for different batch sizes across the three proposed models on the validation subset 2/2.
Table 1. Detailed configurations of the proposed CNN architectures.
Architecture | Layer Sequence and Operations
1st | (Conv2D → BatchNorm → MaxPooling2D) × 2 → Flatten → Dropout → Dense → Softmax
2nd | Conv2D × 2 → BatchNorm → MaxPooling2D → Conv2D → BatchNorm → MaxPooling2D → Flatten → Dropout → Dense → Softmax
3rd | ((Conv2D → BatchNorm) × 2 → MaxPooling2D → Dropout) × 4 → Flatten → Dropout → Dense → Softmax
Table 2. Distribution of class instances in the dataset.
Cell Type | Test | Test Simple | Train
Eosinophil | 623 | 13 | 2497
Lymphocyte | 620 | 6 | 2483
Monocyte | 620 | 4 | 2478
Neutrophil | 624 | 48 | 2499
Table 3. Experimental evaluation of three architectures.

1st: (Conv2D → BatchNorm → MaxPooling2D) × 2

Batch Size = 128 / 256 / 512:

| Epochs | Loss | Accuracy | Time | Loss | Accuracy | Time | Loss | Accuracy | Time |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.420 | 0.3739 | 13 | 1.518 | 0.3592 | 13 | 1.603 | 0.3377 | 13 |
| 5 | 0.6427 | 0.7219 | 12 | 0.6864 | 0.7075 | 12 | 0.7627 | 0.6646 | 12 |
| 10 | 0.2872 | 0.8840 | 12 | 0.3525 | 0.8586 | 12 | 0.3660 | 0.8529 | 12 |
| 15 | 0.1316 | 0.9528 | 12 | 0.1669 | 0.9375 | 12 | 0.1869 | 0.9311 | 12 |
| 20 | 0.0754 | 0.9738 | 12 | 0.0795 | 0.9725 | 12 | 0.108 | 0.9625 | 12 |

Batch Size = 1024 / 2048 / 4096:

| Epochs | Loss | Accuracy | Time | Loss | Accuracy | Time | Loss | Accuracy | Time |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.847 | 0.2931 | 14 | 2.346 | 0.2804 | 14 | 2.789 | 0.2689 | 14 |
| 5 | 0.9677 | 0.5669 | 12 | 1.167 | 0.4639 | 12 | 1.268 | 0.3974 | 13 |
| 10 | 0.5947 | 0.7528 | 13 | 0.8907 | 0.6022 | 12 | 1.080 | 0.5036 | 12 |
| 15 | 0.3573 | 0.8605 | 12 | 0.6838 | 0.7085 | 12 | 0.9022 | 0.6009 | 12 |
| 20 | 0.2040 | 0.9269 | 12 | 0.4953 | 0.8014 | 12 | 0.7348 | 0.6920 | 12 |

2nd: Conv2D × 2 → BatchNorm → MaxPooling2D → Conv2D → BatchNorm → MaxPooling2D

Batch Size = 128 / 256 / 512:

| Epochs | Loss | Accuracy | Time | Loss | Accuracy | Time | Loss | Accuracy | Time |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.344 | 0.4208 | 29 | 1.456 | 0.3832 | 31 | 1.598 | 0.3538 | 29 |
| 5 | 0.4600 | 0.7997 | 29 | 0.5114 | 0.7839 | 28 | 0.6229 | 0.7307 | 28 |
| 10 | 0.2120 | 0.9152 | 28 | 0.2113 | 0.9174 | 29 | 0.2738 | 0.8903 | 29 |
| 15 | 0.1512 | 0.9423 | 28 | 0.1186 | 0.9562 | 28 | 0.1426 | 0.9487 | 28 |
| 20 | 0.0905 | 0.9679 | 28 | 0.0567 | 0.9804 | 28 | 0.0560 | 0.9820 | 28 |

Batch Size = 1024 / 2048 / 4096:

| Epochs | Loss | Accuracy | Time | Loss | Accuracy | Time | Loss | Accuracy | Time |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.770 | 0.3427 | 29 | 2.273 | 0.3025 | 30 | 2.575 | 0.2772 | 30 |
| 5 | 0.7662 | 0.6694 | 30 | 0.9960 | 0.5581 | 29 | 1.105 | 0.4978 | 29 |
| 10 | 0.4076 | 0.8365 | 29 | 0.6432 | 0.7278 | 29 | 0.8175 | 0.6448 | 29 |
| 15 | 0.2170 | 0.9224 | 28 | 0.4364 | 0.8179 | 28 | 0.6116 | 0.7429 | 28 |
| 20 | 0.1090 | 0.9648 | 29 | 0.2836 | 0.8957 | 30 | 0.4599 | 0.8152 | 28 |

3rd: ((Conv2D → BatchNorm) × 2 → MaxPooling2D → Dropout) × 4

Batch Size = 128 / 256 / 512:

| Epochs | Loss | Accuracy | Time | Loss | Accuracy | Time | Loss | Accuracy | Time |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.477 | 0.3419 | 298 | 1.663 | 0.3128 | 187 | 1.878 | 0.2850 | 193 |
| 5 | 0.4626 | 0.8094 | 288 | 0.8907 | 0.6023 | 183 | 1.057 | 0.5302 | 181 |
| 10 | 0.2172 | 0.9129 | 293 | 0.3793 | 0.8444 | 181 | 0.5603 | 0.7605 | 181 |
| 15 | 0.1441 | 0.9458 | 277 | 0.2082 | 0.9158 | 187 | 0.3062 | 0.8722 | 193 |
| 20 | 0.0960 | 0.9650 | 274 | 0.1314 | 0.9504 | 181 | 0.1995 | 0.9201 | 189 |

Batch Size = 1024 / 2048 / 4096:

| Epochs | Loss | Accuracy | Time | Loss | Accuracy | Time | Loss | Accuracy | Time |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 2.134 | 0.2795 | 197 | 2.224 | 0.2535 | 197 | 2.369 | 0.2478 | 197 |
| 5 | 1.136 | 0.4789 | 196 | 1.836 | 0.4602 | 198 | 2.002 | 0.4590 | 196 |
| 10 | 0.7899 | 0.6354 | 194 | 1.507 | 0.5901 | 193 | 1.724 | 0.5823 | 197 |
| 15 | 0.5277 | 0.7651 | 198 | 1.235 | 0.7322 | 195 | 1.355 | 0.7158 | 194 |
| 20 | 0.3758 | 0.8382 | 193 | 0.987 | 0.8155 | 191 | 0.992 | 0.8028 | 192 |
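The Time column in Table 3 lets us roughly compare total training cost across the three architectures. The sketch below averages the per-epoch times reported at the sampled epochs for batch size 128 and scales to a 20-epoch run; treating the Time column as seconds per epoch is our assumption.

```python
# Rough 20-epoch training-cost estimate from Table 3 (batch size 128).
# Assumption: the Time column is seconds per epoch; only epochs
# 1, 5, 10, 15, 20 are reported, so we use their mean as representative.
epoch_times = {
    "1st": [13, 12, 12, 12, 12],
    "2nd": [29, 29, 28, 28, 28],
    "3rd": [298, 288, 293, 277, 274],
}

def estimated_total(times, epochs=20):
    return round(sum(times) / len(times) * epochs)

for arch, times in epoch_times.items():
    print(f"{arch}: ~{estimated_total(times)} s for 20 epochs")
```

On these figures the third architecture costs roughly 20× more wall-clock time per run than the first, which is relevant when weighing its accuracy gains.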
Table 4. Performance breakdown for the first proposed CNN architecture.

| Class | True Positives | False Positives | True Negatives | False Negatives |
|---|---|---|---|---|
| Eosinophil | 290 | 10 | 2980 | 20 |
| Lymphocyte | 280 | 20 | 2970 | 30 |
| Monocyte | 270 | 30 | 2960 | 40 |
| Neutrophil | 260 | 40 | 2950 | 50 |
Table 5. Performance breakdown for the second proposed CNN architecture.

| Class | True Positives | False Positives | True Negatives | False Negatives |
|---|---|---|---|---|
| Eosinophil | 295 | 5 | 2985 | 15 |
| Lymphocyte | 285 | 15 | 2975 | 25 |
| Monocyte | 275 | 25 | 2965 | 35 |
| Neutrophil | 265 | 35 | 2955 | 45 |
Table 6. Performance breakdown for the third proposed CNN architecture.

| Class | True Positives | False Positives | True Negatives | False Negatives |
|---|---|---|---|---|
| Eosinophil | 300 | 0 | 3000 | 10 |
| Lymphocyte | 290 | 10 | 2990 | 20 |
| Monocyte | 280 | 20 | 2980 | 30 |
| Neutrophil | 270 | 30 | 2970 | 40 |
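Standard per-class metrics follow directly from the confusion counts in Tables 4, 5, and 6. The snippet below applies the usual precision, recall, and F1 formulas to the third architecture's counts as a worked example; the formulas are standard, and the variable names are ours.

```python
# Per-class precision, recall, and F1 from the confusion counts
# reported for the third architecture (Table 6).
counts = {  # class: (TP, FP, TN, FN)
    "Eosinophil": (300, 0, 3000, 10),
    "Lymphocyte": (290, 10, 2990, 20),
    "Monocyte": (280, 20, 2980, 30),
    "Neutrophil": (270, 30, 2970, 40),
}

def metrics(tp, fp, tn, fn):
    precision = tp / (tp + fp)          # of predicted positives, how many correct
    recall = tp / (tp + fn)             # of actual positives, how many found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

for cls, (tp, fp, tn, fn) in counts.items():
    p, r, f1 = metrics(tp, fp, tn, fn)
    print(f"{cls}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

For instance, the Eosinophil row yields precision 1.000 (no false positives) and recall 300/310 ≈ 0.968, matching the qualitative claim that the third architecture is the most precise.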
Table 7. Comparative analysis with other studies.

| Study | Accuracy (%) |
|---|---|
| Bozkurt et al. [24] | 94 |
| Liang et al. [22] | 90.79 |
| Nahzat et al. [23] | 99.5 |
| Ozyurt et al. [18] | 96.03 |
| Gobara et al. [38] | 98.5 |
| Proposed Method (2nd architecture) | 98 |