Classification of High-Resolution Chest CT Scan Images Using Adaptive Fourier Neural Operators for COVID-19 Diagnosis

Gurrala, Anusha; Arora, Krishan; Sharma, Himanshu; Qamar, Shamimul; Roy, Ajay; Chakraborty, Somenath

doi:10.3390/covid4080088


by Anusha Gurrala ¹, Krishan Arora ¹, Himanshu Sharma ¹, Shamimul Qamar ², Ajay Roy ¹ and Somenath Chakraborty ³,*

¹ Department of Electronics and Communication Engineering, Lovely Professional University, Punjab 144411, India

² College of Computer Science, King Khalid University, Abha 62521, Saudi Arabia

³ Department of Computer Science and Information Systems, Leonard C. Nelson College of Engineering and Sciences, West Virginia University Institute of Technology, Beckley, WV 25801, USA

* Author to whom correspondence should be addressed.

COVID 2024, 4(8), 1236-1244; https://doi.org/10.3390/covid4080088

Submission received: 15 April 2024 / Revised: 10 July 2024 / Accepted: 5 August 2024 / Published: 7 August 2024

(This article belongs to the Special Issue Artificial Intelligence and Machine Learning Applications for Developing the Diagnosis of COVID-19)


Abstract:

In the pursuit of advancing COVID-19 diagnosis through imaging, this paper introduces a novel approach utilizing adaptive Fourier neural operators (AFNO) for the analysis of high-resolution computed tomography (HRCT) chest images. The study population comprised 395 patients with 181,106 labeled high-resolution COVID-19 CT images from the HRCTCov19 dataset, categorized into four classes: ground glass opacity (GGO), crazy paving, air space consolidation, and negative for COVID-19. The methods included image preprocessing, involving resizing and normalization, followed by the application of the AFNO model, which enables efficient token mixing in the Fourier domain independent of input resolution. The model was trained using the Adam optimizer with a learning rate of 1 × 10⁻⁴ and evaluated using metrics such as accuracy, precision, recall, and F1 score. The results demonstrate AFNO’s superior performance in few-shot segmentation tasks over traditional self-attention mechanisms, achieving an overall accuracy of 94%. Specifically, the model showed high precision and recall for the GGO and negative classes, indicating its robustness and effectiveness. This research has significant implications for the development of AI-powered diagnostic tools, particularly in environments with limited access to high-quality imaging data and those where computational efficiency is critical. Our findings suggest that AFNO could serve as a powerful model for analyzing HRCT images, potentially leading to improved diagnosis and understanding of COVID-19, representing a critical step in combating the pandemic.

Keywords:

COVID-19; CT scan; computed tomography; chest image; token mixers; transformer model; Fourier transform

1. Introduction

As the COVID-19 pandemic continues to impact global health, rapid and accurate diagnosis has become a cornerstone in controlling outbreaks. The disease’s rapid spread since its emergence in China has resulted in over 378 million confirmed cases and more than 5.67 million fatalities worldwide [1]. The diagnostic strategy for detecting SARS-CoV-2 primarily involves reverse-transcription polymerase chain reaction (RT-PCR) tests [2] and radiological imaging, particularly thin-section chest computed tomography (CT) [3,4]. While RT-PCR remains the gold standard, chest CT scans play a pivotal role in patient management and morbidity assessment, with HRCT showing a sensitivity range of 56–98% for COVID-19 pneumonia diagnosis [5,6]. CT imaging hallmarks, such as ground glass opacities (GGO) and crazy paving patterns, are integral to COVID-19 detection [7]. However, their non-specificity necessitates a more discerning diagnostic toolset [8]. The vast number of images generated during the pandemic has fueled research into deep learning techniques for swift COVID-19 identification from CT scans [9,10]. Despite positive results, these endeavors face limitations due to the small sample sizes and lack of comparative respiratory disease data within public datasets [11,12,13,14].

The implementation of artificial intelligence (AI) in medical diagnostics is hindered by the scarcity of large, accessible datasets required for algorithm training [15]. This gap is partially bridged by transfer learning, where AI is first trained on extensive databases before application to smaller, specialized datasets [16]. Recognizing the need for comprehensive image collections, our study aims to establish a sizable CT scan image dataset of COVID-19 patients for future AI utilization [17].

In parallel, vision transformers (ViTs) have emerged as powerful tools in image recognition, owing to their efficient token mixing capabilities [18]. Nonetheless, the quadratic complexity of token mixing with self-attention becomes prohibitive for high-resolution images and long sequences [19]. Innovative token mixers like global filter networks (GFN) have been developed, leveraging Fourier transforms for efficient token mixing [20]. Despite their strengths, these methods face challenges in adaptivity and expressiveness at higher resolutions [21]. To overcome these limitations, this paper introduces the adaptive Fourier neural operator (AFNO), which treats tokens as continuous elements and models token mixing as a continuous global convolution. This approach is rooted in operator learning, traditionally used to tackle partial differential equations (PDEs) [22]. Adapting Fourier neural operators (FNOs) for vision, we impose a block-diagonal structure on channel mixing weights and utilize soft-thresholding to sparsify frequencies, enhancing generalization and efficiency [23].

The literature reflects a growing body of work dedicated to improving transformer efficiency [24]. Our proposed AFNO model, with its efficient token mixing and expressive capability [25], is positioned to advance the field of medical image analysis, particularly for high-resolution COVID-19 CT scans. This work aims to set a new benchmark in AI-powered diagnostics, offering a potential leap forward in managing the COVID-19 health crisis.

The key contributions of this work are as follows:

Innovative technique implementation integrates AFNO into the image analysis of HRCT, enhancing the handling of high-resolution data, which traditional models struggle with due to computational limitations.

Enhanced image processing efficiency is achieved with Fourier transforms for efficient token mixing, overcoming the scalability issues associated with high-resolution images.

Improved diagnostic accuracy is demonstrated through superior performance in identifying COVID-19 related abnormalities in chest CT scans compared to traditional methods, enhancing diagnostic accuracy and reliability.

The architecture of AFNO allows for adaptation to different resolutions and details, suggesting wider clinical application beyond COVID-19 to other medical imaging tasks. This is a significant step forward in the use of AI for medical diagnostics, particularly in environments where access to high-quality imaging data may be limited.

2. Materials and Methods

2.1. Dataset

The HRCTCov19 dataset [26] was employed for this study, consisting of 181,106 high-resolution chest CT images from 395 patients, categorized into classes representing ground glass opacity (GGO), crazy paving, air space consolidation, and negative for COVID-19. Images were preprocessed to a uniform size of 512 × 512 pixels, normalized to a [0, 1] scale, and randomly split into training (60%), validation (20%), and testing (20%) sets. Table 1 summarizes the cases by age and infection type, and Figure 1 shows sample images from each class.
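The random 60/20/20 split can be sketched as follows. This is a minimal illustration, assuming an image-level shuffle with NumPy; the function name `split_indices` and the seed are hypothetical, and the text does not state whether the split was made per image or per patient.

```python
import numpy as np

def split_indices(n_items, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle item indices and cut them into train/validation/test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_items)
    n_train = int(train_frac * n_items)
    n_val = int(val_frac * n_items)
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# 181,106 is the number of images reported for the dataset.
train_idx, val_idx, test_idx = split_indices(181_106)
```

When a dataset contains many slices per patient, splitting by patient instead of by image is a common way to keep near-identical slices from landing on both sides of the split.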

2.2. Image Preprocessing

Prior to training, all images underwent a series of preprocessing steps. Images were resized using bilinear interpolation to a consistent resolution of 512 × 512 pixels to ensure compatibility with the AFNO model. Pixel values were then normalized to a [0, 1] scale via the following equation:

I_{\mathrm{norm}} = \frac{I - \min(I)}{\max(I) - \min(I)}

where I is the original image intensity and min(I) and max(I) are the minimum and maximum intensities in the image, respectively.
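The normalization equation maps directly onto a few lines of NumPy. The sketch below is illustrative only: the function name, the `eps` guard for constant images, and the sample intensities are assumptions not taken from the study.

```python
import numpy as np

def min_max_normalize(image, eps=1e-8):
    """Rescale intensities to [0, 1] using the image's own min and max."""
    image = np.asarray(image, dtype=np.float32)
    lo, hi = image.min(), image.max()
    # eps guards against division by zero for constant images (an added assumption).
    return (image - lo) / max(hi - lo, eps)

ct_slice = np.array([[-1000.0, 0.0], [400.0, 1000.0]])  # made-up intensity values
normalized = min_max_normalize(ct_slice)
```

Because the minimum and maximum are taken per image, two slices with different intensity ranges are both stretched to the full [0, 1] interval.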

To increase the robustness of the model, data augmentation techniques such as random rotation, flipping, and zoom were applied.
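A dependency-free sketch of such augmentations is given below. It is a simplification under stated assumptions: rotation is restricted to multiples of 90 degrees, zoom is implemented as a center crop followed by nearest-neighbor resampling, and the probability and `max_zoom` range are hypothetical because the text does not report them.

```python
import numpy as np

def augment(image, rng, max_zoom=1.2):
    """Apply a random flip, a random 90-degree rotation, and a random center zoom."""
    if rng.random() < 0.5:
        image = np.fliplr(image)
    image = np.rot90(image, k=int(rng.integers(0, 4)))
    # Zoom: crop a centered window, then resize back by nearest-neighbor sampling.
    h, w = image.shape
    zoom = rng.uniform(1.0, max_zoom)
    ch, cw = int(round(h / zoom)), int(round(w / zoom))
    top, left = (h - ch) // 2, (w - cw) // 2
    crop = image[top:top + ch, left:left + cw]
    rows = np.arange(h) * ch // h  # source row for each output row
    cols = np.arange(w) * cw // w  # source column for each output column
    return crop[np.ix_(rows, cols)]

rng = np.random.default_rng(0)
augmented = augment(np.arange(64 * 64, dtype=np.float32).reshape(64, 64), rng)
```

The output keeps the input shape, so augmented images can be fed to the same model input as the originals.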

2.3. Adaptive Fourier Neural Operator (AFNO) Architecture

The adaptive Fourier neural operator method is a sophisticated approach designed for image analysis that divides each image into small and large patches, some of which overlap. This method is used to address the challenge of efficiently processing high-dimensional data by transforming the way the image information is mixed and processed. The core of AFNO lies in its innovative technique for mixing the image’s features. Unlike conventional methods that mix features at a fixed scale, AFNO adapts to different resolutions, which is crucial for handling images of various sizes and details. The AFNO method rethinks the self-attention mechanism often used in transformers, which usually requires a hefty computational effort, especially as image size grows. Instead, AFNO integrates kernels, a mathematical concept used to generalize functions, and operates on these kernels in a continuous fashion. This continuous operation allows the model to consider the global context of the image, which is essential for understanding the spatial relationships within the image.
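To make the scaling argument concrete, the sketch below compares leading-order operation counts for the two mixers. It assumes the scaling commonly quoted for self-attention (N²d) and for AFNO (Nd²/k + Nd log N), with N tokens, d channels, and k blocks. The patch size of 16, d = 768, k = 8, and non-overlapping patches are hypothetical choices for illustration, not settings reported in this study.

```python
import math

def token_count(height, width, patch):
    """Number of non-overlapping patches (tokens) in an image."""
    return (height // patch) * (width // patch)

def attention_cost(n, d):
    """Leading-order cost of self-attention token mixing: N^2 * d."""
    return n * n * d

def afno_cost(n, d, k):
    """Leading-order cost of AFNO mixing: N * d^2 / k + N * d * log2(N)."""
    return n * d * d / k + n * d * math.log2(n)

for side in (224, 512, 1024):
    n = token_count(side, side, patch=16)
    ratio = attention_cost(n, 768) / afno_cost(n, 768, k=8)
    print(f"{side}x{side}: {n} tokens, attention/AFNO cost ratio ~ {ratio:.1f}")
```

Under these assumptions the ratio grows with image side length, which is the sense in which self-attention becomes costly at high resolution.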

One of the critical improvements AFNO brings to the table is its efficient handling of the Fourier transform, a mathematical tool used to decompose functions into frequencies. AFNO uses this technique to work more effectively with the spectral components of the image, which represent the image’s details and textures. The method ensures that the important high-frequency information is preserved during the image processing stages. Moreover, AFNO introduces a new structure for mixing channels, which are the components of the image data related to color and intensity. The model structures the data in such a way that it divides the image into blocks and processes these blocks separately, making the computation more manageable and highly parallelizable. This block structure enables AFNO to handle larger images more efficiently by simplifying the complex interdependencies of the image’s features.

An innovative aspect of AFNO is its weight sharing mechanism, where it reuses certain parameters across different parts of the image, which reduces the overall computational load without sacrificing performance. This weight sharing is coupled with an adaptive strategy, where the importance of different features is dynamically adjusted. By focusing more on the significant features and less on the less relevant ones, AFNO can operate more efficiently. It also addresses the sparsity inherent in images—where only a few pixels might hold the most critical information—by selectively focusing on those areas of the image that carry more information. This selective focus is achieved through a technique that penalizes less important features, thus promoting a more compact and relevant feature set for the model to work with.
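The paragraph above does not name the penalizing technique; since the Introduction mentions soft-thresholding, the sketch below assumes that is what is meant. It shows the standard soft-threshold (shrinkage) operator on real values; the threshold `lam` and the sample coefficients are made up.

```python
import numpy as np

def soft_threshold(x, lam):
    """Shrink values toward zero by lam and zero out anything smaller than lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

coeffs = np.array([-2.0, -0.05, 0.0, 0.03, 1.5])
sparse = soft_threshold(coeffs, lam=0.1)
```

Small coefficients are set exactly to zero while large ones are only shrunk, which is how this operator yields a sparser feature set.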

Thus, the AFNO architecture was employed for its ability to handle high-resolution input images without the computational infeasibility associated with self-attention mechanisms. The token mixing in AFNO is achieved through a series of operations in the Fourier domain, described as follows:

Each image X is transformed into the frequency domain using the fast Fourier transform (FFT):

\hat{X} = \mathrm{FFT}(X)

In the frequency domain, a gating mechanism is applied, parameterized by a learnable matrix W:

\hat{Y} = \sigma(\hat{X} \cdot W)

where σ represents a non-linear activation function.

The processed frequency representation is then converted back to the spatial domain using the inverse fast Fourier transform (IFFT):

Y = \mathrm{IFFT}(\hat{Y})

The model also performs channel mixing, which is adaptively parameterized by block-diagonal matrices to control the complexity and enhance parameter efficiency.
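The three steps and the block-diagonal channel mixing can be sketched in NumPy as follows. This is a simplified single-layer illustration, not the authors' implementation: the function name, the tensor shapes, the random weights, and the choice of applying ReLU separately to real and imaginary parts are all assumptions.

```python
import numpy as np

def afno_mix(x, weights):
    """Mix tokens in the Fourier domain with block-diagonal channel weights.

    x:       real array of shape (H, W, d), one d-channel token per grid cell
    weights: complex array of shape (k, d // k, d // k), one matrix per block
    """
    h, w, d = x.shape
    k, block, _ = weights.shape
    assert k * block == d, "channels must split evenly into k blocks"

    # Step 1: FFT over the two spatial (token) axes.
    x_hat = np.fft.rfft2(x, axes=(0, 1))
    # Step 2: block-diagonal channel mixing, applied at every frequency.
    x_hat = x_hat.reshape(x_hat.shape[0], x_hat.shape[1], k, block)
    y_hat = np.einsum("hwkb,kbc->hwkc", x_hat, weights)
    # Non-linearity: ReLU on real and imaginary parts separately (an assumption).
    y_hat = np.maximum(y_hat.real, 0.0) + 1j * np.maximum(y_hat.imag, 0.0)
    y_hat = y_hat.reshape(y_hat.shape[0], y_hat.shape[1], d)
    # Step 3: inverse FFT back to the spatial domain.
    return np.fft.irfft2(y_hat, s=(h, w), axes=(0, 1))

rng = np.random.default_rng(0)
tokens = rng.standard_normal((32, 32, 16))
w_blocks = rng.standard_normal((4, 4, 4)) + 1j * rng.standard_normal((4, 4, 4))
mixed = afno_mix(tokens, w_blocks)
```

With k blocks, the channel-mixing weights hold d²/k entries instead of d², which is the parameter saving that the block-diagonal structure is meant to provide.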

The model was trained using the Adam optimizer with a learning rate of 1 × 10⁻⁴, decayed by a factor of 0.1 every 20 epochs. Cross-entropy loss function was used to compute the difference between the predicted and true labels:

L = -\sum_{c=1}^{M} y_{o,c} \log(p_{o,c})

where M is the number of classes, y_{o,c} is a binary indicator of whether class label c is the correct classification for observation o, and p_{o,c} is the predicted probability that observation o is of class c.
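The learning-rate schedule and the loss can be written out as a short sketch. It assumes epochs are counted from 0 and that the loss is averaged over observations; the function names and the two example predictions are made up for illustration.

```python
import numpy as np

def step_decay_lr(epoch, base_lr=1e-4, factor=0.1, step=20):
    """Learning rate after multiplying by `factor` once every `step` epochs."""
    return base_lr * factor ** (epoch // step)

def cross_entropy(y_true, probs, eps=1e-12):
    """Mean over observations of -sum_c y_{o,c} * log(p_{o,c})."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(probs), axis=1)))

# Two made-up observations over M = 4 classes, with one-hot labels.
y_true = np.array([[1, 0, 0, 0], [0, 0, 1, 0]])
probs = np.array([[0.7, 0.1, 0.1, 0.1], [0.2, 0.2, 0.5, 0.1]])
loss = cross_entropy(y_true, probs)
```

With one-hot labels only the probability assigned to the correct class contributes to the loss, so confident correct predictions drive it toward zero.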

Model performance was evaluated on the test set using metrics such as accuracy, precision, recall, and F1 score. These metrics were chosen to provide a comprehensive understanding of the model’s classification abilities, taking into account the balance between sensitivity and specificity, which is crucial for medical diagnostic tasks.
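For reference, all four metrics can be derived from a confusion matrix. The sketch below uses a made-up 3-class matrix and the convention that rows are true classes and columns are predicted classes; neither is taken from the study's results.

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall and F1 per class from a confusion matrix.

    cm[i, j] counts observations of true class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)  # column sums: everything predicted as the class
    recall = tp / cm.sum(axis=1)     # row sums: everything truly in the class
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return precision, recall, f1, accuracy

# Made-up 3-class confusion matrix, for illustration only.
cm = [[8, 1, 1],
      [2, 6, 2],
      [0, 1, 9]]
precision, recall, f1, accuracy = per_class_metrics(cm)
```

Recall corresponds to per-class sensitivity, which is why it is emphasized for diagnostic tasks where missed positives are costly.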

3. Results

The AFNO model’s application to the HRCTCov19 dataset, which consists of 181,106 high-resolution chest CT images from 395 patients processed at a resolution of 512 × 512 pixels, has yielded a comprehensive insight into its classification capabilities across four diagnostic categories: ground glass opacity (GGO), crazy paving, air space consolidation, and negative for COVID-19. The distribution of classifications, as depicted in the confusion matrix in Figure 2, indicates a high degree of accuracy, especially in the identification of GGO and negative cases.

The model has shown remarkable aptitude for detecting GGO, which is evidenced by a large majority of cases being correctly classified, as indicated by the prominent darker shading in the confusion matrix. This suggests that the model’s features are well-tuned to identify the characteristics unique to GGO within the data. Similarly, the negative category has demonstrated a significant number of correct classifications, affirming the model’s ability to discern non-COVID-19 cases with a high degree of certainty.

Table 2 shows that precision is highest for the GGO class (0.99), which points to the model’s strength in accurately identifying true positive cases within this class. Although precision dips for the air space consolidation (0.77) and negative (0.74) classes, it remains within the acceptable range, illustrating the model’s reliable performance even in classes with fewer samples. The recall metric, which measures the model’s sensitivity, or its ability to identify all actual positives, remains high across all classes (0.93–0.94). The model’s sensitivity is particularly noteworthy in the air space consolidation and negative classes, where, despite a lower precision, it misses very few actual positive cases, as reflected by the lighter shading off the diagonal in the confusion matrix.

The F1 score, a harmonic mean of precision and recall, also attests to the model’s balanced performance. It acknowledges the trade-off between precision and recall, with both metrics converging to deliver a consistently high F1 score across the dataset. The overall accuracy of the model stands at 94%, emphasizing the AFNO model’s robustness and effectiveness. This high accuracy level, combined with the individual class metrics, reveals a model that is not only adept at learning from high-resolution data but also at generalizing across diverse manifestations of chest CT findings related to COVID-19.

The balance in the average metrics, with a macro average precision of 0.85 and recall of 0.94, indicates that the model performs consistently across different classes. It manages to maintain a high recall rate even with slightly varying precision rates, an important aspect in medical diagnostics where the cost of false negatives can be particularly high. Thus, the result analysis of the confusion matrix and calculated metrics demonstrate that the AFNO model is particularly suited for the nuanced task of medical image classification. It exhibits a powerful ability to differentiate between complex patterns inherent in high-resolution CT images related to COVID-19, which is crucial for supporting radiologists and potentially reducing time-to-diagnosis.
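The averages quoted here can be reproduced from the per-class rows of Table 2. The sketch below recomputes them with NumPy; small differences in the third decimal are expected because the inputs are the rounded values printed in the table.

```python
import numpy as np

# Per-class values copied from Table 2 (GGO, crazy paving, consolidation, negative).
precision = np.array([0.99, 0.89, 0.77, 0.74])
recall = np.array([0.94, 0.93, 0.93, 0.94])
support = np.array([26208, 5301, 2484, 2162])

macro_precision = precision.mean()
macro_recall = recall.mean()
weighted_precision = np.average(precision, weights=support)
weighted_recall = np.average(recall, weights=support)

print(f"macro:    precision {macro_precision:.3f}, recall {macro_recall:.3f}")
print(f"weighted: precision {weighted_precision:.3f}, recall {weighted_recall:.3f}")
```

Because GGO accounts for roughly 72% of the test images, the weighted averages sit close to the GGO row, while the macro averages are pulled down by the lower precision of the two smallest classes.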

Figure 3 displays two graphs: one for the model’s accuracy and one for the model’s loss over 100 epochs of training. The left graph presents both training and validation accuracy. Both accuracies start from a high baseline and continue to improve slightly over the epochs. There is a notable variance in the validation accuracy, which might be attributed to the complexity of the validation data or a small degree of overfitting as the model learns. Despite the fluctuations, the validation accuracy remains close to the training accuracy, which indicates that the model generalizes well to new, unseen data. There is no significant divergence, which is positive as it implies the model is not overfitting the training data significantly.

The right graph shows the model’s loss during training. Loss is a measure of how well the model’s predictions match the actual labels. Lower loss values indicate better performance. Both training and validation losses decrease over time, which suggests that the model is learning and improving its predictive ability with each epoch. As with the accuracy, the training and validation loss lines are close together, which further suggests that the model is generalizing well. There is a consistent decline without any abrupt changes or increases in loss, indicating that the learning rate and model architecture are appropriate for this task. The smoothness of the curve suggests that the learning rate decay implemented every 20 epochs is working well to refine the model’s learning as it converges.

The model shows a high level of initial performance that continues to improve slightly and stabilize as training progresses, indicative of effective learning dynamics. The proximity of the training and validation lines in both graphs suggests that the model is not suffering from high variance (overfitting) or high bias (underfitting). The performance as seen in these plots indicates that the model training process is stable and that the model is learning effectively from the data. The absence of any significant gap between training and validation lines towards the later epochs suggests that the model’s architecture and hyperparameters are well-tuned to this particular dataset and task. These results are considered favorable in machine learning and are particularly beneficial in a medical context where high accuracy and reliability are crucial.

All analyses were performed using Python 3.7.6, TensorFlow 2.7.0, and Keras 2.7.0. The computations were facilitated by a standard lab PC configured with a dedicated NVIDIA GeForce RTX 3070 GPU, renowned for its CUDA core architecture optimized for deep learning tasks. The system was powered by an Intel® Core™ i7-10750H CPU @ 2.60 GHz (12 CPUs), complemented by 32 GB of DDR4 RAM to ensure smooth data processing and model training. For the operating system, we utilized Windows 10 Enterprise 64-bit, providing a stable and secure environment for our research needs.

4. Discussion

In the present study, the AFNO method was applied to the HRCTCov19 dataset, which is a compilation of 181,106 high-resolution chest CT images from 395 patients. The images were processed at a resolution of 512 × 512 pixels and categorized into four classes representing various COVID-19-related pathologies. This study highlights several key findings and implications for future research and clinical practice. The AFNO model’s ability to handle high-resolution images efficiently is rooted in its innovative use of Fourier transforms for token mixing. This approach circumvents the quadratic complexity of traditional self-attention mechanisms, allowing for scalable and effective analysis of large CT image datasets. The model’s superior performance in detecting COVID-19 related abnormalities, particularly GGO and negative cases, demonstrates its potential for clinical application in rapid diagnostic scenarios. The high precision and recall rates achieved by the AFNO model across various COVID-19 manifestations underscore its diagnostic reliability. This is crucial in medical diagnostics, where accurate and timely detection can significantly impact patient management and outcomes. The AFNO model’s adaptability to different image resolutions further suggests its potential utility beyond COVID-19, in various other medical imaging tasks.

Despite the promising results, the current study is limited by the HRCTCov19 dataset’s lack of diversity and structure to support a double-blind study. The dataset’s demographic and geographic homogeneity may limit the generalizability of the findings. Additionally, the existing dataset does not facilitate the anonymization required for a double-blind evaluation, where both the evaluators and clinical practitioners would be unaware of the classification outcomes. Future work will involve acquiring additional datasets from diverse populations and multiple medical institutions. This will enable a double-blind evaluation to objectively assess the model’s performance across varied populations. Such efforts are essential to validate the AFNO model’s robustness and generalizability, ensuring its efficacy in different clinical settings.

5. Conclusions

The AFNO method proposed for the strategic processing of the HRCTCov19 dataset has demonstrated its ability to efficiently handle large-scale high-resolution medical imaging data and also demonstrates potential as a viable tool in clinical settings. Its notable performance in classifying CT images into distinct COVID-19-related conditions opens up avenues for its application in rapid diagnostic scenarios, where accurate and timely classification can significantly impact patient management and outcomes. The AFNO model has proven to be a formidable tool in the automated classification of COVID-19 from high-resolution CT images. Its ability to maintain high accuracy across diverse pathologies holds promise for supporting radiologists in diagnostic workflows, potentially reducing the time-to-diagnosis and helping to manage the large influx of imaging data in pandemic situations. Future work could involve validating the model’s performance across independent datasets and may explore the integration of the AFNO model into clinical decision support systems and its performance in multi-center studies to validate its efficacy across different populations and imaging protocols.

Author Contributions

A.G. and S.C. writing—original draft preparation, S.C., A.G. writing—review and editing, S.C., K.A. and H.S.; supervision, S.C., K.A., S.Q., A.R. and H.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

For this research, ethical approval is not required, as we do not deal with any human subjects and the dataset used here is an open source dataset [26].

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset is available in the Zenodo repository (https://doi.org/10.5281/zenodo.10252424). The latest version of the data can also be accessed at https://databiox.com, accessed on 5 September 2023.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. World Health Organization. Coronavirus Disease 2019 (COVID-19) Situation Report-94, 2020. Available online: https://www.who.int/docs/default-source/coronaviruse/situation-reports/20200423-sitrep-94-covid-19.pdf?sfvrsn=b8304bf0_4 (accessed on 5 September 2023).
2. Zhao, D.; Yao, F.; Wang, L.; Zheng, L.; Gao, Y.; Ye, J.; Guo, F.; Zhao, H.; Gao, R. A comparative study on the clinical features of Coronavirus 2019 (COVID-19) Pneumonia With Other Pneumonias. Clin. Infect. Dis. 2020, 71, 756–761.
3. Fang, Y.; Zhang, H.; Xie, J.; Lin, M.; Ying, L.; Pang, P.; Ji, W. Sensitivity of chest CT for COVID-19: Comparison to RT-PCR. Radiology 2020, 296, E115–E117.
4. Huang, C.; Wang, Y.; Li, X.; Ren, L.; Zhao, J.; Hu, Y.; Zhang, L.; Fan, G.; Xu, J.; Gu, X.; et al. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China. Lancet 2020, 395, 497–506.
5. Yang, W.; Sirajuddin, A.; Zhang, X.; Liu, G.; Teng, Z.; Zhao, S.; Lu, M. The role of imaging in 2019 novel coronavirus pneumonia (COVID-19). Eur. Radiol. 2020, 30, 4874–4882.
6. Chung, M.; Bernheim, A.; Mei, X.; Zhang, N.; Huang, M.; Zeng, X.; Cui, J.; Xu, W.; Yang, Y.; Fayad, Z.A.; et al. CT imaging features of 2019 Novel Coronavirus (2019-nCoV). Radiology 2020, 295, 202–207.
7. Zhou, S.; Wang, Y.; Zhu, T.; Xia, L. CT features of Coronavirus Disease 2019 (COVID-19) pneumonia in 62 patients in Wuhan, China. AJR Am. J. Roentgenol. 2020, 214, 1287–1294.
8. Shirani, F.; Shayganfar, A.; Hajiahmadi, S. COVID-19 pneumonia: A pictorial review of CT findings and differential diagnosis. Egypt. J. Radiol. Nucl. Med. 2021, 52, 38.
9. Simpson, S.; Kay, F.U.; Abbara, S.; Bhalla, S.; Chung, J.H.; Chung, M.; Henry, T.S.; Kanne, J.P.; Kligerman, S.; Ko, J.P.; et al. Radiological Society of North America expert consensus statement on reporting chest CT findings related to COVID-19: Endorsed by the Society of Thoracic Radiology, the American College of Radiology, and RSNA. Radiol. Cardiothorac. Imaging 2020, 2, e200152.
10. Huang, L.; Han, R.; Ai, T.; Yu, P.; Kang, H.; Tao, Q.; Xia, L. Serial quantitative chest CT Assessment of COVID-19: A Deep Learning Approach. Radiol. Cardiothorac. Imaging 2020, 2, e200075.
11. Li, L.; Qin, L.; Xu, Z.; Yin, Y.; Wang, X.; Kong, B.; Bai, J.; Lu, Y.; Fang, Z.; Song, Q.; et al. Artificial intelligence distinguishes COVID-19 from community acquired pneumonia on chest CT. Radiology 2020, 296, E65–E71.
12. Afshar, P.; Heidarian, S.; Enshaei, N.; Naderkhani, F.; Rafiee, M.J.; Oikonomou, A.; Fard, F.B.; Samimi, K.; Plataniotis, K.N.; Mohammadi, A. COVID-CT-MD, COVID-19 computed tomography scan dataset applicable in machine learning and deep learning. Sci. Data 2021, 8, 121.
13. Yang, X.; He, X.; Zhao, J.; Zhang, Y.; Zhang, S.; Xie, P. COVID-CT-dataset: A CT scan dataset about COVID-19. arXiv 2020, arXiv:2003.13865.
14. Silva, P.; Luz, E.; Silva, G.; Moreira, G.; Silva, R.; Lucio, D.; Menotti, D. COVID-19 detection in CT images with deep learning: A voting-based scheme and cross-datasets analysis. Inform. Med. Unlocked 2020, 20, 100427.
15. Bolhasani, H.; Amjadi, E.; Tabatabaeian, M.; Jassbi, S.J. A histopathological image dataset for grading breast invasive ductal carcinomas. Inform. Med. Unlocked 2020, 19, 100341.
16. Jacob, J.; Alexander, D.; Baillie, J.K.; Berka, R.; Bertolli, O.; Blackwood, J.; Buchan, I.; Bloomfield, C.; Cushnan, D.; Docherty, A.; et al. Using imaging to combat a pandemic: Rationale for developing the UK National COVID-19 chest imaging database. Eur. Respir. J. 2020, 56, 2001809.
17. Morozov, S.P.; Andreychenko, A.; Pavlov, N.; Vladzymyrskyy, A.; Ledikhova, N.; Gombolevskiy, V.; Blokhin, I.A.; Gelezhe, P.B.; Gonchar, A.V.; Chernina, V.Y. MosMedData: Chest CT Scans with COVID-19 Related Findings Dataset. arXiv 2020, arXiv:2005.06465.
18. Tay, Y.; Dehghani, M.; Bahri, D.; Metzler, D. Efficient transformers: A survey. arXiv 2020, arXiv:2009.06732.
19. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 5998–6008.
20. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
21. Zhu, C.; Ping, W.; Xiao, C.; Shoeybi, M.; Goldstein, T.; Anandkumar, A.; Catanzaro, B. Long-short transformer: Efficient transformers for language and vision. arXiv 2021, arXiv:2107.02192.
22. Rao, Y.; Zhao, W.; Zhu, Z.; Lu, J.; Zhou, J. Global filter networks for image classification. arXiv 2021, arXiv:2107.00645.
23. Lee-Thorp, J.; Ainslie, J.; Eckstein, I.; Ontanon, S. FNet: Mixing tokens with Fourier transforms. arXiv 2021, arXiv:2105.03824.
24. Li, Z.; Kovachki, N.; Azizzadenesheli, K.; Liu, B.; Bhattacharya, K.; Stuart, A.; Anandkumar, A. Fourier neural operator for parametric partial differential equations. arXiv 2020, arXiv:2010.08895.
25. Guibas, J.; Mardani, M.; Li, Z.; Tao, A.; Anandkumar, A.; Catanzaro, B. Adaptive Fourier Neural Operators: Efficient Token Mixers For Transformers. arXiv 2022, arXiv:2111.13587.
26. Abedi, I.; Vali, M.; Otroshi, B.; Zamanian, M.; Bolhasani, H. HRCTCov19-a high-resolution chest CT scan image dataset for COVID-19 diagnosis and differentiation. BMC Res. Notes 2024, 17, 32.

Figure 1. HRCTCov19 image dataset samples from each of the four labels: (a) GGO; (b) crazy paving; (c) air space consolidation; (d) negative.

Figure 2. Confusion matrix.

Figure 3. Accuracy and loss curves.

Table 1. Cases based on their age and infection type in the data.

Age	GGO	Crazy Paving	Air Space Consolidation	Negative
10 to 20	1	0	0	0
20 to 30	10	2	1	1
30 to 40	47	7	4	4
40 to 50	47	8	2	3
50 to 60	58	10	5	4
60 to 70	67	17	10	2
70 to 80	33	6	2	7
80 to 90	23	7	3	1
90 to 99	2	0	0	1
Total	288	57	27	23

Sum total: 395 patients—197 female, 193 male, 5 not specified.

Table 2. Class-wise performance evaluation.

	Precision	Recall	F1 Score	Support
GGO	0.99	0.94	0.96	26,208
Crazy Paving	0.89	0.93	0.91	5301
Air Space Consolidation	0.77	0.93	0.84	2484
Negative	0.74	0.94	0.82	2162
Accuracy			0.94	36,155
Macro Avg.	0.85	0.94	0.89	36,155
Weighted Avg.	0.95	0.94	0.94	36,155


© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
