1. Introduction
Blood analysis is a routine medical diagnostic method that plays a crucial role in the early detection and treatment of numerous diseases. Generally, variations in the content of blood cells and of various plasma components, as well as morphological changes in blood cells, provide valuable information when the internal environment of the body is altered. This aids healthcare professionals in assessing the patient’s condition [1]. Traditional blood cell analysis involves manually counting and recording, under a microscope, the number of red blood cells, white blood cells, and platelets in blood smears prepared through dilution. However, the human eye cannot perform in-depth analyses or differentiate the chemical properties of cells. Prolonged microscopic observation causes observer fatigue, increasing the likelihood of misdiagnosis. The advent of blood analysis instruments has partially replaced manual work, allowing for accurate and rapid cell counts. Nevertheless, current instruments are not flawless, and errors occur relatively frequently. Therefore, to ensure the accuracy and reliability of diagnostic results, re-examination is necessary. Moreover, certain pathological cells, such as abnormal lymphocytes, remain undetectable to blood analyzers. The detection of such pathological information is critical for the early diagnosis of some diseases [2].
The emergence of hyperspectral imaging technology addresses this technological gap in the medical field. Derived from multispectral imaging technology, hyperspectral imaging integrates imaging and spectral techniques to capture two-dimensional geometric spatial and one-dimensional spectral information of the target, characterized by a “unified spectrum” feature [3]. Microscopic hyperspectral imaging applies hyperspectral imaging technology to the microscopic domain, combining the characteristics of optical imaging and spectral analysis [4]. It continuously captures images of tissue or cell samples using narrow and contiguous spectral channels covering the visible, infrared, ultraviolet, and other spectral ranges. The rich spectrum contains diagnostic spectral features. The fine differences in the spectral curves can be utilized to discern subtle characteristics of tissue or cell physiology, morphology, and biochemical composition. This technique has advantages in the precise discrimination and early diagnosis of diseases. Furthermore, the computer processing of hyperspectral microscopic image information ensures rapid and high-precision analysis, eliminating the constraints of manual conditions. This guarantees that experimental results are not influenced by subjective factors, maintaining a uniform and reliable discrimination standard.
Microscopic hyperspectral imaging technology has been applied in various aspects of the medical field [5,6], including detecting tumors in the human body [7,8,9], distinguishing normal tissues from cancerous tissues [10,11,12], and facilitating the cytological diagnosis of cancer cells [13,14,15], as well as assisting in the diagnosis and characterization of the central nervous system [16,17,18]. These studies have demonstrated that tissues with similar biochemical compositions are more likely to exhibit similar spectral features. Specific spectral profiles inherent in different cells and tissues, akin to fingerprints, contain subtle differences. Therefore, they can be utilized for the characterization and differentiation of objects, particularly those objects that are challenging to distinguish using traditional microscopic imaging methods.
In current research on hyperspectral microscopic image classification, the predominant approach is to use deep learning techniques. Jonathan Long and colleagues [19] introduced the Fully Convolutional Network (FCN). This network imposes no restrictions on the input size of images and generates results identical in size to the input images, which elevated image segmentation to the pixel level. Building upon the FCN, the team led by Olaf Ronneberger [20] proposed U-Net, which incorporates up-sampling and down-sampling techniques. Joshua Samuel Raj and colleagues [21] introduced the Opposition-based Crow Search (OCS) algorithm, which extracts optimal features from preprocessed images and significantly enhances the accuracy, specificity, and sensitivity of medical image diagnostics. Niranjan Balachandar and team [22] improved the detection accuracy of diabetic retinopathy and the classification of chest X-ray images by optimizing parameters such as local training iteration counts, cyclic learning rates, and cyclic weighted losses in the Cyclical Weight Transfer (CWT) framework. In research applying deep learning and image processing techniques to the classification of human blood cells, promising results have also been achieved with methods based on sparse representation [23] and convolutional neural networks [24].
Although significant progress has been made in the classification of blood cell microscopic hyperspectral images, certain technical challenges persist. First, the samples in the blood cell dataset are distributed very unevenly across categories, which existing CNN architectures struggle to handle. Second, hyperspectral images carry high-dimensional spectral information, leading to high data processing and computational costs, so lightweight network architectures are crucial. In response to these issues, this paper proposes a lightweight GhostMRNet tailored for the classification of blood cell microscopic hyperspectral images. The primary contributions of this study are as follows:
- (1)
A lightweight GhostMRNet model is proposed in this study. The model was tested on a real dataset of blood cell microscopic hyperspectral images, achieving an overall classification accuracy, average classification accuracy, and Kappa coefficient of 99.965%, 99.565%, and 0.9925, respectively. These results substantiate the effectiveness of our proposed approach, underscoring its research value and practical significance in assisting medical professionals in disease diagnosis.
- (2)
To simultaneously achieve a lightweight design and enhance network feature extraction capability, the GhoMR block was introduced. A Ghost Module replaces conventional convolutional layers, effectively reducing the network’s computational complexity. Multiscale feature extraction is realized through the cascading of convolutional kernels, enhancing network feature extraction capabilities. Additionally, to further augment the feature representation capability and reduce redundant information, the SE module was incorporated to allocate weights to features in each channel, facilitating the fusion of inter-channel features.
- (3)
To address the issue of imbalanced sample classes, focal loss is employed. By adjusting the weights of the loss function, focal loss focuses on challenging-to-classify samples, contributing to a balanced emphasis on different categories. This enhances the classification accuracy, particularly for rare categories, and mitigates the impact of class imbalance on model training.
3. Proposed Method
3.1. GhostMRNet
Because the microscopic hyperspectral blood cell data used in this study contain a limited number of samples, and the annotated samples fall into only two classes (red blood cells and white blood cells), an excessively deep or complex network structure could lead to overfitting. Therefore, following the principle of simplicity, this paper proposes a lightweight GhostMRNet model for the classification of microscopic hyperspectral blood cell images.
The overall architecture of the network is illustrated in
Figure 3. Initially, noise reduction and dimensionality reduction in the hyperspectral images are achieved through median filtering and principal component analysis (PCA). The preprocessed hyperspectral images are then partitioned into image blocks as inputs to the network. Feature extraction is performed initially using a standard 3 × 3 convolutional layer. Subsequently, two GhoMR blocks are employed to extract features at different scales. The GhoMR block employs the Ghost Module instead of conventional convolutions, reducing the number of parameters and computational complexity while maintaining network performance. Multiscale feature extraction is achieved through the cascading of convolutional kernels. Furthermore, an SE block is utilized to recalibrate the weights of features in each channel, learning the correlation between features in different channels and facilitating feature fusion. The introduction of residual connections helps prevent overfitting during deep feature extraction, thereby enhancing classification efficiency and accuracy. The final step involves outputting image prediction results through fully connected layers. To address the issue of imbalanced sample classes, focal loss is employed to balance the model’s focus on different categories and improve the classification accuracy.
3.2. GhoMR
The spectral information in hyperspectral images is rich, and effectively extracting image features while maintaining a lightweight structure poses a challenge. The GhoMR block proposed in this paper employs the Ghost Module instead of conventional convolutional layers, effectively reducing network computational complexity. To achieve multiscale feature extraction, a method involving the cascading of small convolutional kernels is employed instead of large ones, enhancing network nonlinearity while reducing computational load. The Ghost Module utilizes “inexpensive” linear operations to generate similar feature maps, yet the contributions of different feature maps vary. Hence, the SE block was introduced to allocate weights to the features in each channel, selectively emphasizing informative features and suppressing irrelevant channel features.
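The channel reweighting performed by an SE block can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the reduction ratio `r = 4`, the random weights `w1` and `w2`, and the function name `se_recalibrate` are assumptions introduced here, and biases are omitted.

```python
import numpy as np

def se_recalibrate(features, w1, w2):
    """Squeeze-and-excitation on a (C, H, W) feature block.

    w1 has shape (C // r, C) and w2 has shape (C, C // r); they stand in for
    the two fully connected layers of the SE block (biases omitted).
    """
    squeezed = features.mean(axis=(1, 2))            # squeeze: global average per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)          # first FC layer + ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # second FC layer + sigmoid
    return features * weights[:, None, None], weights

# Hypothetical sizes: 8 channels, 9 x 9 patch, reduction ratio 4 (assumed).
rng = np.random.default_rng(0)
C, r = 8, 4
x = rng.normal(size=(C, 9, 9))
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))
out, weights = se_recalibrate(x, w1, w2)
```

Each channel is multiplied by a single learned weight between 0 and 1, so informative channels are kept close to their original magnitude while less useful ones are attenuated.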
The overall architecture of the GhoMR block is illustrated in Figure 4. Assuming that the model’s input is $X$, the feature extraction of the GhoMR block is divided into the following three steps:
- (1)
The feature $F$ is extracted using a Ghost Module with a 1 × 1 kernel:
$$F = \mathrm{Ghost}_{1\times 1}(X)$$
- (2)
The Split operation is employed to partition the N feature maps of $F$ into four subsets, denoted by $x_i$, where $i \in \{1, 2, 3, 4\}$. Each subset, excluding $x_1$, is required to undergo processing through a 3 × 3 Ghost Module. The output, $y_{i-1}$, of the preceding Ghost Module undergoes hierarchical fusion through summation with the elements of the current subset, $x_i$, resulting in the generation of the feature set $y_i$:
$$y_i = \begin{cases} x_i, & i = 1 \\ \mathrm{Ghost}_{3\times 3}(x_i), & i = 2 \\ \mathrm{Ghost}_{3\times 3}(x_i \oplus y_{i-1}), & i = 3, 4 \end{cases}$$
where $\oplus$ represents summing by elements.
- (3)
Ultimately, the output mappings $y_1$, $y_2$, $y_3$, and $y_4$ are concatenated along their depth, creating a unified feature block encompassing all information. This consolidated feature block undergoes feature recalibration through a 1 × 1 Ghost Module and an SE block. Subsequently, it is fused with the input $X$ through a residual link to generate the final output $Y$. This operation is denoted as follows:
$$Y = X \oplus \mathrm{SE}\big(\mathrm{Ghost}_{1\times 1}(\mathrm{Concat}(y_1, y_2, y_3, y_4))\big)$$
where $\mathrm{Concat}$ represents concatenation, and $\oplus$ denotes element-wise summation.
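The three steps above can be traced with a small NumPy sketch of the data flow. It is only a sketch under stated assumptions: the Ghost Modules and the SE block are passed in as plain callables (identity functions in the demo), the function name `ghomr_forward` is hypothetical, and the channel count is assumed to be divisible by four and unchanged by the final 1 × 1 module so that the residual sum is valid.

```python
import numpy as np

def ghomr_forward(x, ghost_in, ghost_3x3, ghost_out, se):
    """Data flow of a GhoMR-style block on a (C, H, W) array.

    ghost_in, ghost_out, se: callables standing in for the 1 x 1 Ghost
    Modules and the SE block. ghost_3x3: list of three callables standing in
    for the 3 x 3 Ghost Modules. All are placeholders, not trained layers.
    """
    f = ghost_in(x)                                   # step 1: 1 x 1 Ghost Module
    x1, x2, x3, x4 = np.split(f, 4, axis=0)           # step 2: split into four subsets
    y1 = x1                                           # first subset is passed through
    y2 = ghost_3x3[0](x2)
    y3 = ghost_3x3[1](x3 + y2)                        # fuse with the previous output
    y4 = ghost_3x3[2](x4 + y3)
    merged = np.concatenate([y1, y2, y3, y4], axis=0) # step 3: concatenate along depth
    return x + se(ghost_out(merged))                  # recalibrate, then residual sum

def identity(t):
    return t

# Demo with identity stand-ins so the hierarchical fusion is easy to follow.
x = np.arange(8 * 3 * 3, dtype=float).reshape(8, 3, 3)
y = ghomr_forward(x, identity, [identity] * 3, identity, identity)
```

Because later subsets receive the outputs of earlier 3 × 3 modules, the fourth branch sees the input through three stacked small kernels, which is how cascading yields progressively larger receptive fields.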
3.3. Focal Loss
Table 1 presents the proportions of the two types of blood cell samples. Considering the imbalance between the positive and negative classes of red and white blood cells in hyperspectral images, the loss function employed is an improvement upon the Cross-Entropy Loss, known as the focal loss [27]. The formula for the focal loss is given by
$$L_{fl} = -\alpha\, y\, (1 - \hat{y})^{\gamma} \log \hat{y} - (1 - \alpha)(1 - y)\, \hat{y}^{\gamma} \log (1 - \hat{y})$$
Here, $y$ denotes the sample labels of the true samples, while $\hat{y}$ represents the predicted sample labels; $\alpha$ is used to address the imbalance between positive and negative samples, and $\gamma$ is employed to handle the imbalance of easy and difficult samples. Based on debugging experience, $\alpha$ was set to 0.6, and $\gamma$ was set to 1.
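A minimal pure-Python sketch of the binary focal loss follows, using the weighting factor 0.6 and focusing parameter 1 reported above as defaults. Which cell type is treated as the positive class (label 1) is not stated in the text, so that mapping is an assumption; the function name `binary_focal_loss` and the clipping constant `eps` are also introduced here for illustration.

```python
import math

def binary_focal_loss(y_true, p_pred, alpha=0.6, gamma=1.0, eps=1e-7):
    """Mean binary focal loss.

    y_true: iterable of 0/1 labels (1 = positive class, assumed).
    p_pred: predicted probability of the positive class for each sample.
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)      # clip to keep log() finite
        if y == 1:
            # well-classified positives (p near 1) are down-weighted by (1 - p)^gamma
            total += -alpha * (1.0 - p) ** gamma * math.log(p)
        else:
            # well-classified negatives (p near 0) are down-weighted by p^gamma
            total += -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return total / len(y_true)

easy = binary_focal_loss([1], [0.9])   # confident, correct prediction
hard = binary_focal_loss([1], [0.1])   # confident, wrong prediction
```

With the focusing parameter set to 0 and the weighting factor set to 0.5, the expression reduces to half the ordinary cross-entropy, which is a convenient sanity check.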
4. Experiments and Discussion
4.1. Dataset Preprocessing
The human blood cell dataset utilized in this study was collected using a microscopic hyperspectral imaging system consisting of a microscope and a silicon charge-coupled device. Blood smears used for dataset collection were provided by the Department of Hematology, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, and stained with Giemsa biological dye. Datasets 1-3 and 2-2 originated from distinct patients, and cellular categorization was performed under the supervision of expert medical professionals. As illustrated in
Figure 5, the original size of the 1-3 human blood cell dataset is 974 × 799 × 33, and that of the 2-2 human blood cell dataset is 462 × 451 × 33. The single-band dataset images are shown on the left, and the data labels are shown on the right. In the label images, black represents the background, red denotes red blood cells, and white signifies white blood cells. Since background data contribute insignificantly to the classification of red and white blood cells, this study excluded the background during classification, focusing solely on the classification of red and white blood cells. Due to the blurriness of the original images, preprocessing involved the initial application of median filtering and normalization. Subsequently, PCA was employed to reduce the dimensionality of the hyperspectral images.
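The PCA reduction and the subsequent partition into image blocks can be sketched with NumPy as below. This is an illustrative sketch only: the median filtering and normalization steps are omitted, the cube is random data of a hypothetical small size, label 0 is assumed to mark the background, reflect padding at the image border is an assumption, and the function names are introduced here.

```python
import numpy as np

def pca_reduce(cube, n_components=10):
    """Project an (H, W, B) hyperspectral cube onto its first principal components."""
    h, w, b = cube.shape
    flat = cube.reshape(-1, b).astype(np.float64)   # one row per pixel spectrum (copy)
    flat -= flat.mean(axis=0)                       # centre each band
    # Right singular vectors of the centred data are the principal axes.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return (flat @ vt[:n_components].T).reshape(h, w, n_components)

def extract_patches(cube, labels, window=9):
    """Return one window x window patch per labelled (non-background) pixel."""
    pad = window // 2
    padded = np.pad(cube, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    rows, cols = np.nonzero(labels)                 # label 0 = background (assumed)
    patches = np.stack([padded[r:r + window, c:c + window]
                        for r, c in zip(rows, cols)])
    return patches, labels[rows, cols]

# Hypothetical toy cube with 33 bands, as in the datasets described above.
rng = np.random.default_rng(0)
cube = rng.random((20, 24, 33))
labels = np.zeros((20, 24), dtype=int)
labels[5:8, 5:8] = 1        # a small "red blood cell" region
labels[12, 15] = 2          # a single "white blood cell" pixel
reduced = pca_reduce(cube, n_components=10)
patches, targets = extract_patches(reduced, labels, window=9)
```

Each patch is centred on its labelled pixel, so the network classifies a pixel from its spatial neighbourhood as well as its reduced spectrum.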
4.2. Experiment Setting
The GhostMRNet proposed in this paper was implemented using the Python language and the PyTorch deep learning framework. All experiments were conducted on a computer with a 64-bit Windows 10 operating system, 16 GB of RAM, and an NVIDIA GeForce GTX 1660 Ti 6 GB GPU (Nvidia Corporation, Santa Clara, CA, USA). To mitigate biases introduced by diverse training samples, this paper provides the average results and standard deviation of 10 experiments conducted under identical conditions in self-experimentation, ablation experiments, parameter configuration experiments, and comparative experiments with other models.
Stochastic gradient descent was employed as the optimization algorithm, with a learning rate of 0.001, a batch size of 256, and 50 epochs. For sample selection, 10% of the labeled samples were randomly chosen as the training set, and the remaining 90% were designated as the test set. To assess the classification performance comprehensively, the overall accuracy (OA), average accuracy (AA), precision, recall, F1-score, and Kappa coefficient were used for quantitative evaluation.
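For reference, OA, AA, and the Kappa coefficient can all be computed from a confusion matrix, as in the pure-Python sketch below. The confusion matrix in the example is hypothetical and is not taken from the paper's results; the function name `classification_metrics` is introduced here.

```python
def classification_metrics(confusion):
    """OA, AA, and Cohen's kappa from a square confusion matrix
    (rows = true class, columns = predicted class)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    diag = [confusion[i][i] for i in range(n_classes)]
    row_sums = [sum(confusion[i]) for i in range(n_classes)]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    oa = sum(diag) / total                                    # fraction of correct samples
    aa = sum(d / r for d, r in zip(diag, row_sums)) / n_classes  # mean per-class recall
    pe = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2  # chance agreement
    kappa = (oa - pe) / (1.0 - pe)
    return oa, aa, kappa

# Hypothetical imbalanced example: 1000 majority-class and 50 minority-class samples.
oa, aa, kappa = classification_metrics([[980, 20],
                                        [5, 45]])
```

On imbalanced data the OA is dominated by the majority class, whereas AA and Kappa drop noticeably when the minority class is misclassified, which is why all three are reported together.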
4.3. Classification Results of GhostMRNet
The GhostMRNet was applied for classification on two hyperspectral images, and the classification results are illustrated in
Figure 6 and presented in
Table 2. In
Figure 6, (a) represents the annotated image generated based on medical experts’ annotations, where black signifies excluded background, red denotes red blood cells, and white signifies white blood cells. Subsequently, (b) displays the classification predictions from GhostMRNet. It is evident that, overall, the predictions from this method exhibit a high degree of alignment with the ground truth labels.
From
Table 2, it can be observed that there is minimal difference in the comprehensive evaluation metrics across different images, with an average OA of 99.965% and an average AA of 99.565%. This indicates that the GhostMRNet model exhibits a consistent and excellent classification performance for both cell types, demonstrating its robustness in addressing sample imbalance issues. The average Kappa coefficient of 0.9925 further signifies a high level of consistency between the predicted and true categories, emphasizing the model’s accuracy in classification.
4.4. Experimental Parameters
In the classification of human blood cells based on microscopic hyperspectral imaging, the classification performance is primarily influenced by three factors: window size, the number of principal components, and the proportion of training samples. This subsection will evaluate the impact of different parameter settings on the blood cell classification results through a comparative analysis.
4.4.1. Effect of Window Size
To investigate the impact of window size on the classification results, while keeping other parameters constant, window sizes of 5 × 5, 7 × 7, 9 × 9, and 11 × 11 were employed. The classification results are depicted in
Figure 7.
From the experimental results, it is evident that an increase in window size has a positive impact on the classification of human blood cells in two distinct microscopic hyperspectral images. However, when the window size exceeds 9 × 9, the improvement in classification results becomes marginal, accompanied by a substantial increase in computational overhead. Therefore, in this study, the window size was determined to be 9 × 9.
4.4.2. Effect of the Number of Principal Components
To investigate the impact of the dimensionality of reduced data on the classification results, while keeping other parameters constant, the dimensionality of the reduced data was set to 5, 10, 15, and 20, employing PCA. The classification results are illustrated in
Figure 8.
From the experimental results, it is observed that the classification performance of the 2-2 blood cell dataset improves with an increase in the PCA dimensionality. For the 1-3 human blood cell dataset, the PCA dimensionality and classification performance exhibit a positive correlation from dimensions 5 to 10. However, when the PCA dimensionality exceeds 10, the classification performance begins to decline. Extremely low dimensions may lead to reductions in the spectral characteristics reflected in the images, while excessively high dimensions may introduce more redundant information, incurring higher computational costs. Moreover, beyond a data dimensionality of 10, there is a limited improvement in the accuracy and Kappa coefficient values. Therefore, reducing the dimensionality to around 10 is a more cost-effective choice.
4.4.3. Effect of Training Ratio
The small-sample problem is a prevalent issue in existing hyperspectral image (HSI) classification methods. To assess the classification performance of GhostMRNet under small training sets, we varied the training set proportions to 5%, 10%, 15%, and 20%. The classification results are depicted in
Figure 9.
From the experimental results, it is evident that the proposed classification method maintains good accuracy even in scenarios with small training sets. A substantial improvement in classification performance is observed when the training set proportion increases from 5% to 10%. However, beyond a training set proportion of 10%, the enhancement in classification performance becomes less noticeable. Therefore, considering GPU computing power and experimental time, a training set proportion of around 10% was deemed a suitable choice.
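A random split at a given training proportion can be sketched as follows. The sketch assumes a plain (non-stratified) random selection over sample indices and uses a different seed per repetition to mirror the 10 repeated runs described in the experiment settings; the sample count of 1000 and the function name `split_indices` are hypothetical.

```python
import random

def split_indices(n_samples, train_ratio=0.10, seed=0):
    """Randomly split sample indices into a training and a test set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # reproducible shuffle for this seed
    n_train = round(n_samples * train_ratio)
    return indices[:n_train], indices[n_train:]

# Ten repetitions at a 10% training proportion (hypothetical sample count).
splits = [split_indices(1000, train_ratio=0.10, seed=s) for s in range(10)]
train_idx, test_idx = splits[0]
```

With strongly imbalanced classes, a stratified split that samples each class separately may be preferable so that the minority class is always represented in the training set; the text does not state whether this was done.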
4.5. Ablation Experiments
To validate the effectiveness of the proposed GhoMR module, as well as the Ghost Module and SE blocks within the GhoMR module, this study conducted ablation experiments. Experiment 1 replaces the Ghost Module in GhostMRNet with a regular convolutional layer, Experiment 2 removes the SE block from the GhoMR module, and Experiment 3 substitutes the multiscale feature extraction part of the GhoMR module with regular convolutional layers. The experiments utilized the same dataset partitioning method and hyperparameter settings, with the network input being preprocessed microscopic hyperspectral image blocks. The results are presented in Table 3 and Table 4.
From
Table 3 and
Table 4, it can be observed that after incorporating the Ghost Module, the network parameters were reduced by 24%, yet the classification performance did not decrease significantly. This suggests that the Ghost Module can significantly reduce the network parameters without compromising classification performance. After the introduction of multiscale feature extraction in GhoMR, the AAs for 1-3 and 2-2 increased by 0.64% and 0.54%, respectively, and the Kappa coefficients increased by 0.007 and 0.004, indicating that the multiscale structure in the GhoMR effectively enhances feature extraction capabilities, thereby improving the classification performance for minority samples. However, upon the removal of the SE block from the GhoMR module, particularly evident in dataset 2-2, a significant decline in the model performance occurred. This observation indicates that the SE block effectively enhances crucial feature channels extracted through multiscale feature extraction, suppressing redundant features, and thereby enhancing the network’s feature extraction capability.
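The source of the parameter saving can be illustrated by counting weights for a single layer. The sketch below follows the Ghost Module design from GhostNet [25], where a primary convolution produces a fraction of the output channels and depthwise "cheap" operations generate the rest. The ratio of 2, the 3 × 3 cheap kernel, and the 64-channel layer are assumptions rather than values reported in this paper, and biases and normalization parameters are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weights in a Ghost Module; assumes c_out is divisible by ratio."""
    intrinsic = c_out // ratio                        # channels from the primary convolution
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * cheap_k * cheap_k   # depthwise cheap operations
    return primary + cheap

# Hypothetical 3 x 3 layer with 64 input and 64 output channels.
standard = conv_params(64, 64, 3)
ghost = ghost_params(64, 64, 3)
saving = 1.0 - ghost / standard
```

For this single layer the Ghost Module needs roughly half the weights of the standard convolution. The network-level reduction is smaller than the per-layer figure because layers that are not replaced, such as the initial convolution and the fully connected layers, keep their full parameter count.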
4.6. Comparison Results of Different Methods
To further evaluate the classification performance of the proposed model on microscopic hyperspectral images, this subsection compares it with GhostNet [25], ResNet34 [28], a CNN-based classification model [24], and an SVM.
In the CNN-based classification model [
24], there were eight weighted layers, including the input layer, two convolutional layers, two max-pooling layers, two fully connected layers, and a final output layer. The parameters for each layer were set as follows: in the second layer, the number of convolutional filters was 16, and the filter size was 3 × 3; in the third layer, the pooling layer filter size was 3 × 3; in the fourth layer, the number of convolutional filters was 32, and the filter size was 3 × 3; in the fifth layer, the pooling layer filter size was 3 × 3; and the numbers of neurons in the sixth and seventh layers were 256 and 2, respectively. The same preprocessing and experimental parameter settings as in the baseline experiment were applied to each model. The experimental results are shown in
Table 5 and
Table 6, and
Figure 10 and
Figure 11 depict the classification results of different algorithms on microscopic hyperspectral images, where green in (f) represents the prediction of white blood cells.
From the figures, it can be observed that GhostMRNet and ResNet34 exhibit superior classification performances, while GhostNet and the CNN-based classification model [
24] struggled to identify white blood cells. The SVM model shows a poorer classification performance, especially in the case of dataset 1-3, where it failed to effectively classify red and white blood cells.
Table 5 and
Table 6 indicate that, under the same focal loss framework, when faced with highly imbalanced class distributions, GhostNet and the CNN-based classification model [
24] both struggle to perform well, particularly in recognizing white blood cells. On the other hand, the SVM suffered from severe overfitting and could not correctly distinguish between red and white blood cells. In contrast, GhostMRNet, which accurately classified red and white blood cells, outperformed ResNet34 in all evaluation metrics. This demonstrates that the proposed GhostMRNet, utilizing multiscale feature extraction, effectively captures the features of a small amount of data, achieving excellent performances in the classification of imbalanced human blood cell datasets.
5. Discussion
Blood detection is crucial for the early detection and treatment of many diseases. Although existing blood analyzers can accurately and rapidly perform cell counts, they still fall short of perfecting blood analysis. Microscopic hyperspectral imaging provides spatial and spectral information to assist in blood detection. In this study, we propose a lightweight end-to-end network that utilizes multiscale feature extraction and SE modules to extract the spatial and channel features of microscopic hyperspectral images, while employing Ghost Modules to reduce network parameters, effectively achieving blood cell classification. We conducted extensive experiments to evaluate the effectiveness of our model. Compared to other well-known networks, the proposed network achieves better performances. The experimental results demonstrate the helpfulness of GhoMR’s design and the introduction of SE modules.
Despite our model outperforming others, there are still limitations. During data preprocessing, we employed PCA for dimensionality reduction to reduce spectral redundancy and computational load. PCA is an effective dimensionality reduction method, but its linear transformation disrupts the original spatial geometry of medical hyperspectral images, which may have undesirable effects on the dimensionality reduction. Therefore, incorporating blood cell biochemical properties into band selection for medical hyperspectral images would be a better dimensionality reduction approach.
6. Conclusions
This study primarily addresses the human blood cell classification issue based on microscopic hyperspectral imaging. A lightweight GhostMRNet model was proposed for the classification of microscopic hyperspectral images of blood cells. The GhoMR block uses Ghost Modules instead of conventional convolutional layers and employs a cascading approach with small convolutional kernels to achieve multiscale feature extraction, aiming to enhance network feature extraction capabilities while reducing computational complexity. Additionally, the SE block was introduced to allocate weights to features in each channel, selectively emphasizing informative features while suppressing less crucial channel features. The GhoMR block effectively extracts spatial and spectral features from microscopic hyperspectral images, yielding an improved classification performance with lower computational costs.
Several comparative experiments were conducted in this study. Firstly, by comparing models with and without the GhoMR block, as well as GhoMR blocks without multiscale feature extraction, the effectiveness of the GhoMR block in reducing network parameters while enhancing the classification performance was demonstrated. Secondly, through comparisons with other deep learning and machine learning methods, the results indicate that GhostMRNet exhibits a superior classification performance in blood cell hyperspectral image tasks, with lower computational costs, demonstrating robustness.
It is worth noting that the experiments in this study were conducted solely on two microscopic hyperspectral images of blood cells from the same time period. Future research could further develop experiments using blood cell hyperspectral datasets from different time periods to comprehensively assess the model’s performance and enhance its practical application value in medical diagnosis.