1. Introduction
Lung diseases, such as fibrosis, opacity, tuberculosis, and pneumonia (viral and COVID-19), pose a significant global health burden, impacting the lives of countless individuals worldwide. These diseases are characterized by their detrimental effect on lung function, notably a loss of lung elasticity. This decrease in elasticity reduces the total volume of air the lungs can hold, consequently impairing respiratory function. The ability of some lung diseases to spread rapidly, especially infectious conditions like tuberculosis and pneumonia, underscores the critical need for prompt and accurate diagnosis. Early identification of these diseases is paramount, as it enables the timely initiation of appropriate treatment, which is essential in mitigating the spread of disease and improving patient outcomes. Rapid and accurate diagnosis of lung diseases benefits individual patients by ensuring they receive the necessary treatment, and it also plays a crucial role in public health by controlling the spread of infectious respiratory conditions [1,2,3,4].
In the dynamic and evolving field of medical diagnostics, the integration of cutting-edge technologies has become essential, especially in the area of pulmonary health. Within this context, chest X-ray imaging emerges as a fundamental tool, offering a non-invasive and efficient approach to detecting and analyzing lung abnormalities. This imaging technique is of paramount importance for healthcare professionals: it facilitates the rapid and accurate diagnosis of a wide range of lung diseases, making it a key component in managing and treating pulmonary conditions. The strength of chest X-ray imaging lies in its ability to deliver clear and comprehensive images of the chest cavity. These detailed visualizations are crucial for clinicians in accurately detecting, monitoring, and addressing different lung pathologies. As medical science continues to advance, the role of chest X-ray imaging in diagnosing and managing lung diseases grows increasingly significant, underscoring the need for ongoing research and development in this vital area of healthcare [5,6,7,8].
In the last few years, the domain of medical image analysis has been revolutionized by the introduction of deep learning (DL) methodologies. Their success is attributable to the inherent capacity of DL to automate and enhance complex analytical processes, thus introducing novel prospects in medical diagnostics. The impact of DL is particularly pronounced in chest radiology, where these advanced computational techniques have shown exceptional proficiency. DL algorithms, characterized by their sophisticated pattern recognition capabilities, have substantially improved the way chest images are interpreted, offering a more nuanced and accurate approach to detecting and diagnosing lung-related diseases. These techniques leverage large datasets of medical images, learning intricate patterns and anomalies that might elude traditional methods, thereby providing a more comprehensive and detailed understanding of pulmonary conditions. As such, DL in chest radiology not only represents a technological advancement but also marks a significant leap in the ability of medical professionals to diagnose and treat lung diseases with greater precision and effectiveness. This integration of DL into chest radiology not only augments diagnostic accuracy but also fosters the development of personalized treatment strategies. Such advancements are pivotal in improving patient outcomes, marking a significant stride in the management and treatment of pulmonary health [9,10].
This paper presents an innovative DL framework engineered for the multi-class diagnosis of lung diseases by analyzing chest X-ray images. Our objective is to address the increasing demand for efficient diagnostic tools and to harness the advanced capabilities offered by DL technologies. The proposed framework is a state-of-the-art convolutional neural network (CNN) architecture, designed to extract and assimilate discriminative features from chest X-ray images, a process crucial for accurate lung disease identification. The CNN’s ability to process and analyze complex visual data from X-rays enables it to identify subtle patterns and anomalies associated with various lung conditions, which might be challenging to discern through conventional diagnostic methods. The proposed framework provides more accurate, efficient, and reliable diagnostic solutions; the reliability of the diagnostic outcomes is enhanced by the robustness of the model, which is trained and validated on an extensive dataset, ensuring consistent performance across a wide range of cases.
The most important contributions of this work are as follows:
We propose an innovative adaptation of the VGG19 architecture to enhance the diagnostic capabilities of this established CNN model, introducing custom blocks into the architecture. These blocks augment the network’s capacity to encapsulate crucial image features, which is paramount in accurately classifying chest X-ray images. The custom blocks enhance the feature maps generated by the preceding layers through normalization, regularization, and spatial-resolution enhancement. This process yields a more comprehensive and nuanced representation of the image data, enabling the model to detect and differentiate between the subtle and complex patterns indicative of different lung diseases.
We address the challenge of dataset imbalance, a common issue in medical imaging studies. Our research focuses on a dataset of chest X-ray images categorized into six types: opacity, COVID-19, fibrosis, tuberculosis, viral pneumonia, and normal. The original dataset exhibited a significant imbalance among these categories, a factor that can adversely impact the performance of DL models. To mitigate this issue and enhance the robustness of our model, we employed a data augmentation strategy, artificially expanding the dataset by generating new, modified versions of existing images through transformations such as rotation, scaling, and flipping. By applying these augmentations, we transformed the imbalanced dataset into a balanced one, ensuring that each class was equally represented.
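To make the rebalancing step concrete, here is a minimal NumPy sketch (not our actual pipeline) that pads each minority class with augmented copies of its own images until every class matches the size of the largest one; the transformations and the toy data are purely illustrative:

```python
import numpy as np

def balance_with_augmentation(images_by_class, rng=None):
    """Illustrative rebalancing: pad every minority class with augmented
    copies (random flips / 90-degree rotations) until all classes match
    the size of the largest class."""
    rng = rng or np.random.default_rng(0)
    target = max(len(v) for v in images_by_class.values())
    balanced = {}
    for label, imgs in images_by_class.items():
        imgs = list(imgs)
        while len(imgs) < target:
            src = imgs[rng.integers(len(imgs))]          # pick a source image
            aug = np.fliplr(src) if rng.random() < 0.5 else np.rot90(src)
            imgs.append(aug)                             # add augmented copy
        balanced[label] = imgs
    return balanced

# Toy dataset: 'normal' has 4 images, 'fibrosis' only 1.
data = {"normal": [np.ones((8, 8))] * 4, "fibrosis": [np.eye(8)]}
balanced = balance_with_augmentation(data)
print({k: len(v) for k, v in balanced.items()})   # {'normal': 4, 'fibrosis': 4}
```

In practice, richer transformations (rotation ranges, scaling, brightness) are applied, but the equalization logic is the same.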
The structure of this work is organized as follows: Section 2 is devoted to integrating generative AI with chest X-ray imaging. Section 3 reviews relevant studies in the field, providing context for this topic. Section 4 details the proposed methodology, encompassing the dataset used, the algorithm implemented, the data augmentation strategy, and the metrics for evaluating our approach. Section 5 presents the experimental analysis and the outcomes obtained. Section 6 discusses the optimization strategies employed in our research. Finally, Section 7 concludes the article and discusses future research directions.
2. Integrating Generative AI with Chest X-ray Imaging
Integrating Generative AI with chest X-ray imaging for multi-class diagnosis stands at the forefront of medical innovation, creating a critical nexus between state-of-the-art computational methods and medical expertise. At the core of this integration is the strategic use of data augmentation techniques, which encompass a suite of manipulations such as horizontal flipping, brightness adjustments, shifts, rotations, zooms, shears, and changes in fill mode. These classical strategies are not merely tools for image manipulation but are pivotal in crafting a versatile and comprehensive dataset that challenges and refines the learning processes of DL models. These techniques enrich the training data with variations that replicate the diverse array of scenarios a DL model would encounter in the real world. By systematically altering images, the model is exposed to a broad spectrum of variations akin to the range it would need to interpret in a clinical environment. This exposure is critical, as chest X-rays inherently exhibit variability owing to numerous factors, such as differences in patient anatomy, positioning during the scan, the calibration of imaging equipment, and the intricacies of exposure settings. Each X-ray is a unique confluence of these factors, and a robust DL model must be capable of handling this variability to provide accurate diagnoses. In a clinical context, the ability of a model to generalize across various conditions and imaging nuances translates directly to its diagnostic utility. The performance of DL models on chest X-rays can significantly affect patient outcomes, as these models assist in early detection, accurate diagnosis, and timely treatment of pulmonary conditions. The multi-class diagnosis capability that Generative AI integration brings is especially valuable in settings where a swift differential diagnosis is critical.
Moreover, the variability introduced through data augmentation aids in mitigating overfitting, in which a model performs exceptionally well on training data but fails to generalize to new, unseen data. By learning from augmented images that reflect a wider range of clinical scenarios, DL models develop a more robust understanding of the features that truly indicate specific diseases, rather than artefacts of the dataset they were trained on [11,12].
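The classical transformations listed above can be sketched in a few lines of NumPy. This is an illustrative stand-in for a production pipeline (such as Keras' ImageDataGenerator), with all ranges and probabilities chosen arbitrarily for demonstration:

```python
import numpy as np

def augment(img, rng):
    """Apply a few of the classical augmentations discussed above to one
    normalized image: horizontal flip, brightness adjustment, pixel shift,
    and rotation (multiples of 90 degrees for simplicity). Real pipelines
    also add zooms, shears, and configurable fill modes."""
    if rng.random() < 0.5:                                   # horizontal flip
        img = np.fliplr(img)
    img = np.clip(img * rng.uniform(0.8, 1.2), 0.0, 1.0)     # brightness
    img = np.roll(img, rng.integers(-2, 3), axis=1)          # small shift
    img = np.rot90(img, k=rng.integers(0, 4))                # rotation
    return img

rng = np.random.default_rng(42)
xray = rng.random((64, 64))        # stand-in for a normalized chest X-ray
batch = [augment(xray, rng) for _ in range(8)]   # 8 augmented variants
print(len(batch), batch[0].shape)                # → 8 (64, 64)
```

Each pass through `augment` yields a different plausible variant of the same scan, which is exactly the exposure to variability described above.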
Generative AI has emerged as a groundbreaking solution to one of the most persistent challenges in medical imaging: class imbalance in datasets. In diagnostic modelling, especially with chest X-ray imaging, datasets often exhibit significant inequality, with a preponderance of common illnesses overshadowing rarer pathologies. This imbalance risks producing biased or underperforming diagnostic models that excel at recognizing frequently occurring conditions but falter with less common ones. Such a skew in the data can lead to diagnostic inaccuracies, potentially impacting patient care, especially for those with less common diseases that are underrepresented in the training data. Generative models address this by synthesizing realistic additional examples of the underrepresented classes, rebalancing the training distribution. As a result, diagnostic models trained on these augmented datasets gain a more comprehensive understanding of a wider array of pathologies, leading to improved identification and classification capabilities across the spectrum of disease [13,14].
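A simple way to quantify the rebalancing task a generative model faces is to compute, per class, how many synthetic images are needed to reach the size of the largest class. The helper and the class counts below are purely illustrative, not the actual dataset statistics:

```python
def synthesis_quota(class_counts):
    """How many synthetic images a generative model would need to produce
    per class to equalize an imbalanced dataset (illustrative helper)."""
    target = max(class_counts.values())
    return {label: target - n for label, n in class_counts.items()}

# Hypothetical counts echoing the kind of imbalance described above.
counts = {"opacity": 6012, "covid": 3616, "fibrosis": 1686,
          "tuberculosis": 700, "viral_pneumonia": 1345, "normal": 10192}
quota = synthesis_quota(counts)
print(quota)
```

The majority class ends up with a quota of zero, while the rarest classes receive the largest quotas, which is the skew a generative augmentation stage is meant to correct.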
AI-integrated Computer-Aided Diagnosis (AI-CAD) systems are designed to enhance the accuracy, speed, and efficiency of diagnosing diseases, thereby revolutionizing patient care and treatment outcomes. Integrating AI into CAD systems has opened up new frontiers in medical imaging analysis, offering powerful tools for detecting, characterizing, and monitoring various health conditions. At the core of AI-CAD systems is the application of advanced DL algorithms, which enable these systems to analyze complex medical images with a level of detail and accuracy previously unattainable. These AI models are trained on vast datasets of medical images, learning to recognize patterns and anomalies indicative of specific diseases. By doing so, AI-CAD systems assist radiologists and clinicians in making more informed diagnostic decisions, reducing the likelihood of human error and the variability that can occur in image interpretation. Furthermore, AI-CAD systems help to alleviate the workload on medical professionals by rapidly processing and analyzing images and highlighting areas of concern for further review by a radiologist. This speeds up the diagnostic process and allows radiologists to focus their expertise on more complex cases, improving overall healthcare delivery. Moreover, AI-CAD systems are continuously evolving: they learn and improve as they are exposed to more data, increasing their diagnostic accuracy. This continuous learning process ensures that AI-CAD systems remain at the forefront of medical technology, adapting to new challenges and advancements in healthcare [15,16].
3. Related Work
The utilization of DL techniques in identifying abnormalities in chest X-ray images has gained significant traction in recent years. This surge in popularity is attributed to the remarkable capabilities of these algorithms in discerning intricate patterns and irregularities that might elude traditional analysis methods. In medical research, the application of artificial intelligence (AI) has become increasingly prominent, particularly in facilitating the diagnosis of various health conditions. Numerous studies leveraging AI in medical diagnostics have reported positive results, demonstrating both the accuracy and efficacy of these technologies [17,18,19,20,21,22,23]. This section delves into the strategies employed by previous researchers in this domain.
Sarkar et al. [24] proposed a multi-scale CNN model designed for a six-class categorization task, focusing on identifying tuberculosis, bacterial pneumonia, fibrosis, viral pneumonia, normal lung conditions, and COVID-19 using 5700 chest X-ray images. This study examines the efficacy of the VGG19 and VGG16 models in their standard forms and of the VGG16 with multi-scale feature mapping. The standard VGG19 model achieved an accuracy of 95.61% and the VGG16 of 95.79%; when the VGG16 model was enhanced, the accuracy improved to 97.47%. In [25], the authors proposed a 2D-CNN model designed for a six-class categorization task, focusing on determining fibrosis, viral pneumonia, tuberculosis, bacterial pneumonia, normal lung conditions, and COVID-19 employing chest X-ray images. This work analyses the effectiveness of the standard VGG19 and VGG16 models and the 2D-CNN model. The standard VGG19 model reached an accuracy of 89.51%, the VGG16 of 90.43%, and the 2D-CNN model of 96.75%. Also, in [26], the authors suggested a ResNet50 model with deep features designed for a five-class categorization task, focusing on determining viral pneumonia, tuberculosis, bacterial pneumonia, normal lung conditions, and COVID-19 using 2186 chest X-ray images. The model achieved an accuracy of 91.60%.
The study [27] proposes a DL model for multi-class categorization aimed at identifying pneumonia, COVID-19, normal, and lung cancer utilizing CT and chest X-ray images. This study examines the efficacy of four distinct architectural combinations, which integrate VGG19 and ResNet152V2 with various neural network models such as CNN, GRU (Gated Recurrent Unit), and Bi-GRU (Bidirectional Gated Recurrent Unit). The accuracy achieved by the VGG19 combined with a CNN model is 98.05%. In [28], the authors suggested a DL multi-class categorization model to determine COVID-19, viral pneumonia, normal, and lung opacity, employing chest X-ray images. This study analyses the effectiveness of the MobileNetV2 model in its standard form and of a modified MobileNetV2 model. The standard MobileNetV2 model reached an accuracy of 90.47%, and the modified MobileNetV2 model of 95.80%. In [29], the authors propose a DL-based diagnostic system specifically designed to rapidly detect pneumonia utilizing X-ray images. This study compares two prominent DL methods, VGG19 and ResNet50, evaluated for their efficacy in diagnosing three distinct conditions: pneumonia, COVID-19, and normal lung health. With the proposed diagnostic system, the VGG19 method achieved an accuracy of 96.60%, while the ResNet50 method recorded 95.80%. Furthermore, in [30], the authors suggested an altered VGG16 model designed for a three-class categorization task, focusing on determining pneumonia, normal lung conditions, and COVID-19 utilizing chest X-ray images. The altered VGG16 model achieved an accuracy of 91.69%.
Sanida et al. [31] proposed a DL model designed for a three-class categorization task, focusing on identifying pneumonia, normal lung conditions, and COVID-19 using chest X-ray images. This study examines the efficacy of the VGG19 model in its standard form and in modified forms that integrate inception blocks. The standard VGG19 model attained an accuracy of 98.17%; with two inception blocks, the accuracy increased to 99.25%, while a single inception block resulted in an accuracy of 98.59%. These results demonstrate the substantial impact that architectural modifications, such as the addition of inception blocks, can have on the performance of a DL model in medical image analysis. In [32], the authors explore the efficacy of a novel deep CNN method called Decompose, Transfer, and Compose (DeTraC). This technique is specifically developed to address the challenges of identifying anomalies in image datasets pertaining to pneumonia, SARS, and COVID-19. The study uses various established CNN models, including VGG19, GoogleNet, ResNet, AlexNet, and SqueezeNet, each assessed for accurately categorizing anomalies within the dataset. DeTraC with the VGG19 model attained an accuracy of 97.35%.
Hemdan et al. [33] focused on binary categorization to differentiate between COVID-19 and healthy cases using chest X-ray scans. The study utilized a small dataset of 50 scans, divided into 25 from COVID-19 cases and 25 from healthy individuals. Central to their research was the development of COVIDX-Net, a diagnostic system that leverages seven different pre-trained models: VGG19, Xception, ResNetV2, InceptionV3, DenseNet201, InceptionResNetV2, and MobileNetV2. VGG19 emerged as the most effective classifier among the seven models, reaching an accuracy of 90.00% and an F1-score of 0.91; conversely, InceptionV3 had the lowest accuracy in this study, at 50.00%. In [34], the authors propose an imaging-based fusion technique to differentiate between COVID-19 and healthy cases employing chest X-ray images. This method combines features extracted from chest X-ray images using two distinct processes: the histogram of oriented gradients (HOG) and the VGG19 model. HOG combined with the VGG19 model achieved an accuracy of 99.49%.
In [35], the authors utilized four different DL models (ResNet50, DenseNet121, VGG16, and VGG19), applying transfer learning to diagnose X-ray images. The study aimed to differentiate between COVID-19 and normal lung conditions. Transfer learning, in which a model developed for one task is reused as the starting point for a model on a second task, is particularly effective in settings where the available data is limited, as is often the case in medical imaging. The VGG16 and VGG19 models outperformed the other two DL strategies, ResNet50 and DenseNet121. The work reported an overall categorization accuracy of 97.00% for ResNet50, 96.66% for DenseNet121, and 99.33% for both VGG16 and VGG19.
Numerous studies in the field of medical imaging have demonstrated impressive accuracy rates in scenarios involving binary or limited-class categorization. However, a recurrent issue observed in these studies is a notable decline in performance when the number of categories to be classified increases. This decline in accuracy is primarily attributed to the heightened complexity involved in distinguishing between multiple conditions, particularly when these conditions exhibit only subtle differences in their features. Such a challenge becomes increasingly pronounced in multi-class categorization contexts, where the distinction between various lung diseases can be nuanced and complex. This inherent limitation significantly impacts the practical utility of these models in real-world clinical settings, where patients often present with a range of diverse lung conditions. In such scenarios, the ability to accurately categorize multiple lung diseases becomes not just beneficial but essential. Consequently, there is a pressing need for a specially designed and robust DL framework capable of performing multi-class categorization of lung diseases with high accuracy and reliability. Such a framework would be invaluable in real-life clinical applications, enabling healthcare professionals to provide more accurate diagnoses and, therefore, more effective treatments for patients with complex lung conditions. This need underscores the importance of ongoing research and development in the field of DL to create more advanced and capable diagnostic tools that can meet the demands of modern healthcare.
Table 1 summarises works for lung disease identification, the number of categories, the model employed, and the accuracy rate attained.
6. Discussion
Integrating Generative AI with DL in medical imaging, particularly in diagnosing lung diseases, marks a significant stride in healthcare technology. Despite the advancements achieved using DL techniques in accurately diagnosing lung conditions such as opacity, COVID-19, fibrosis, tuberculosis, viral pneumonia, and normal lung conditions, substantial research gaps and problems remain that present challenges and motivations for further study.
A primary research gap in this field is the lack of highly accurate and robust automated systems for diagnosing various lung diseases from chest X-ray images. While traditional models have shown promise, their effectiveness in distinguishing between different lung conditions, especially those with subtle radiographic differences, is limited. This gap becomes more pronounced with emerging diseases like COVID-19, where rapid and accurate diagnosis is crucial. There is also a need for models that can generalize well across diverse patient populations and varying image qualities, a challenge not fully addressed by existing models.
Current DL models face issues such as overfitting, limited interpretability, and the need for large annotated datasets. Overfitting leads to models performing well on training data but failing to generalize to new data. The black-box nature of DL models also poses a challenge, as medical professionals require models that provide interpretable diagnostic insights. Furthermore, the performance of these models is highly dependent on the quantity and quality of the training data, which can be a limitation in medical imaging due to privacy concerns, data availability, and the labour-intensive process of medical annotation.
The motivation for research into integrating Generative AI with medical imaging stems from the potential impact of advanced diagnostic tools on patient care and healthcare systems. Enhancing the accuracy and efficiency of diagnosing lung diseases carries far-reaching implications for patients and healthcare providers. By automating the diagnostic process, an AI-integrated CAD system can alleviate the burden on clinicians, enabling faster medical image analysis. The modified VGG19 model, with its deeper architecture, offers a promising avenue for improvement over traditional models.
Our approach utilizes the outputs from the last three max-pooling layers of the VGG19 model, redirecting them through custom layers that include batch normalization, dropout, and up-sampling. This not only refines the feature maps generated by the preceding layers but also enhances the model’s capability to identify the subtle and complex patterns characteristic of different lung diseases. Moreover, improving the generalizability and interpretability of the modified model significantly increases its clinical applicability, leading to broader adoption in healthcare settings.
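As a hedged sketch of this rerouting, the NumPy fragment below imitates passing each of the three pooled feature maps through batch normalization, dropout, and 2x nearest-neighbour up-sampling; the dropout rate and the plain (non-learned) normalization are simplifying assumptions, not the exact implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def custom_path(fmap, drop_rate=0.3):
    """Illustrative custom path: batch-normalize, apply dropout, then
    up-sample a (H, W, C) feature map by 2x (nearest neighbour)."""
    mean = fmap.mean(axis=(0, 1), keepdims=True)
    std = fmap.std(axis=(0, 1), keepdims=True) + 1e-5
    normed = (fmap - mean) / std                        # batch normalization
    mask = rng.random(normed.shape) >= drop_rate        # dropout mask
    dropped = normed * mask / (1.0 - drop_rate)         # inverted dropout
    return dropped.repeat(2, axis=0).repeat(2, axis=1)  # 2x up-sampling

# Shapes of VGG19's last three max-pooling outputs for a 224x224 input.
pools = [rng.random(s) for s in [(28, 28, 256), (14, 14, 512), (7, 7, 512)]]
outputs = [custom_path(p) for p in pools]
print([o.shape for o in outputs])   # [(56, 56, 256), (28, 28, 512), (14, 14, 512)]
```

In the trained model, the normalization parameters are learned and the up-sampled maps feed into subsequent layers; this sketch only shows how the three operations reshape and regularize the pooled features.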
The comparison between the basic VGG19 model and the modified one reveals significant enhancements in the latter’s diagnostic capabilities. The modified VGG19 model’s accuracy is markedly superior, registering at 98.88%, an increase of nearly 2.5 percentage points over the basic model. The confusion matrix of the modified VGG19 shows a commendable increase in true positive rates for most categories. The basic VGG19 model showcases AUC values exceeding 0.96 across all categories, affirming its robustness as a reliable diagnostic tool; the modified VGG19 model achieved even higher AUC values, exceeding 0.98 for all categories. These AUC values indicate an exceptionally high true positive rate and a low false positive rate, essential for accurate medical diagnosis. These results suggest that the modifications made to the VGG19 model have substantial practical implications, indicating the potential for significantly more accurate diagnoses and better patient management strategies. The modified VGG19 model stands out as an improved tool in the medical imaging field, with its enhanced performance likely to contribute to more effective healthcare.
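For readers who want to reproduce such summary figures from a confusion matrix, the following sketch computes accuracy and macro-averaged precision, recall, and F1; the matrix shown is a toy three-class example, not the paper's actual results:

```python
import numpy as np

def summarize(cm):
    """Per-class precision/recall/F1 and overall accuracy from a confusion
    matrix (rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                       # true positives per class
    precision = tp / cm.sum(axis=0)        # column sums = predicted counts
    recall = tp / cm.sum(axis=1)           # row sums = true counts
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()

# Toy 3-class confusion matrix (illustrative numbers only).
cm = [[96, 2, 2],
      [1, 97, 2],
      [2, 1, 97]]
acc, p, r, f1 = summarize(cm)
print(f"acc={acc:.4f} macro-P={p:.4f} macro-R={r:.4f} macro-F1={f1:.4f}")
```

Macro averaging weights every class equally, which is the appropriate choice for a dataset balanced by augmentation as described in Section 4.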
7. Conclusions and Future Work
Our work introduced a novel DL framework, which harnesses the power of the VGG19 architecture by integrating custom blocks into the model. We developed a system capable of multi-class diagnosis of a spectrum of lung conditions, including fibrosis, opacity, tuberculosis, normal lung states, viral pneumonia, and COVID-19 pneumonia. This DL framework is particularly noteworthy given chest X-rays’ critical role in promptly and accurately identifying pulmonary abnormalities. Our evaluation, conducted on an extensive dataset, highlights the framework’s remarkable capabilities, surpassing existing state-of-the-art methods in lung disease diagnosis. The achieved accuracy of 98.88% is a testament to the framework’s precision and reliability. The framework demonstrated superior accuracy and exhibited exceptional performance across critical metrics: precision, recall, F1-score, and AUC averaged 0.9870, 0.9904, 0.9887, and 0.9939, respectively. In future work, we intend to harness the power of GANs to create a more diverse and representative range of synthetic medical images and to extend this study in several directions, including enhancing the model’s generalization to different patient demographics, integrating it with other diagnostic tools and electronic health records for more comprehensive patient assessments, and expanding its applicability to other thoracic and respiratory conditions.