Article

Enhancing Medical Image Quality Using Fractional Order Denoising Integrated with Transfer Learning

by Abirami Annadurai 1,*, Vidhushavarshini Sureshkumar 2, Dhayanithi Jaganathan 3 and Seshathiri Dhanasekaran 4,*
1 Department of Mathematics, Sona College of Technology, Salem 636005, India
2 Department of Computer Science and Engineering, SRM Institute of Science and Technology, Vadapalani, Chennai 600026, India
3 Department of Computer Science and Engineering, Sona College of Technology, Salem 636005, India
4 Department of Computer Science, UiT The Arctic University of Norway, 9019 Tromsø, Norway
* Authors to whom correspondence should be addressed.
Fractal Fract. 2024, 8(9), 511; https://doi.org/10.3390/fractalfract8090511
Submission received: 15 July 2024 / Revised: 14 August 2024 / Accepted: 22 August 2024 / Published: 29 August 2024
(This article belongs to the Section Optimization, Big Data, and AI/ML)

Abstract: In medical imaging, noise can significantly obscure critical details, complicating diagnosis and treatment. Traditional denoising techniques often struggle to balance noise reduction with detail preservation. To address this challenge, we propose the Efficient Transfer-Learning-Based Fractional Order Image Denoising (ETLFOD) method for medical image analysis. Our approach uniquely integrates transfer learning with fractional order techniques, leveraging pre-trained models such as DenseNet121 to adapt to the specific needs of medical image denoising. This method enhances denoising performance while preserving essential image details. The ETLFOD model has demonstrated superior performance compared to state-of-the-art (SOTA) techniques. For instance, our DenseNet121 model achieved an accuracy of 98.01%, precision of 98%, and recall of 98%, significantly outperforming traditional denoising methods. Specific results include a 95% accuracy, 98% precision, 99% recall, and 96% F1-score for the MRI brain dataset, and an 88% accuracy, 91% precision, 95% recall, and 88% F1-score for the COVID-19 lung CT data, while the pneumonia X-ray dataset showed a 92% accuracy, 97% precision, 98% recall, and 93% F1-score. It is important to note that while we report classification performance metrics in this paper, the primary evaluation of our approach is based on comparing the original noisy images with the denoised outputs, ensuring a focus on image quality enhancement rather than classification performance.

1. Introduction

1.1. Image Denoising Perspective

Image denoising is essential for preprocessing in medical image applications. The main challenge is retrieving a clean image from its corrupted version, defined as g(x,y) = f(x,y) + n(x,y), where f(x,y), g(x,y), and n(x,y) represent the clean image, the noisy image, and Gaussian noise, respectively. Researchers have developed various denoisers based on wavelet techniques [1,2,3,4], total variation models [2,5,6,7], and fractional techniques [8,9,10,11,12] to address this issue. However, these methods often fall short in many image processing tasks.
Medical imaging techniques such as ultrasonography, Computed Tomography (CT) scans, and MRI scans are commonly used in diagnosis due to their non-invasive nature and safety [13]. However, ultrasound images in particular suffer from significant speckle noise, which reduces image contrast and obscures features, complicating diagnosis [1,2]. Ultrasonography is widely used because it is affordable, relatively safe, portable, and versatile. It can also monitor factors such as fetal growth and abdominal conditions in pregnant women in real time.
Research on speckle patterns is extensive. Speckle is commonly characterized as multiplicative noise with a Rayleigh distribution, a model that balances precision and flexibility. Speckle noise reduces the spatial and contrast resolution and the signal-to-noise ratio (SNR) of ultrasound images, impairing the ability to resolve fine details [3,6,7]. Hence, speckle noise filtering is crucial for medical ultrasound imaging. Various methods, including spatial domain techniques, frequency domain techniques, adaptive techniques, model-based methods, and hybrid approaches, have been proposed to minimize speckles in ultrasound images.
A nonlinear coherent diffusion (NCD) filter relies on the observation that the multiplicative speckle in an ultrasound image can be transformed into additive Gaussian noise. The Speckle Reducing Anisotropic Diffusion (SRAD) method extends the Perona–Malik (PM) diffusion model by incorporating spatially adaptive filtering into the diffusion process, adapting the diffusion coefficient to the noise level at each step. The Oriented SRAD (OSRAD) filter, an extension of SRAD, uses a semi-explicit scheme to analyze the numerical characteristics of SRAD [8,9,10].
The OSRAD approach, based on matrix anisotropic diffusion, can apply different amounts of diffusion along the principal curvature directions [14]. The total variation (TV) norm is used as a smoothing term in techniques employing TV minimization for ultrasound images, with the noise distribution modeled using the Rayleigh distribution. However, using the TV norm for edge preservation often conflicts with the goal of removing speckle noise, leading to blurred edges or insufficient noise removal.
Ma et al. [11] propose a framework integrating multiple techniques for visualizing and analyzing transfer learning processes. The framework consists of three main components: model, data, and task abstraction; transfer learning diagnosis; and transfer learning explanation. The authors use data preprocessing, feature extraction, and model selection to create an abstract representation of the transfer learning problem. Visualization techniques such as heatmaps, parallel coordinates, and scatterplots are used to diagnose issues like overfitting or underfitting. Feature importance analysis and counterfactual analysis explain the behavior of transfer learning models, with interactive visualizations aiding non-expert stakeholders.
Medical ultrasonography can visualize blood flow in arteries and internal organs. However, the main drawback of ultrasound imaging is low image quality due to speckle noise, which obscures subtle grayscale variations. Speckles are created when waves from scatterers interact constructively or destructively, influenced by the transducer frequency and probe characteristics. High acoustic frequencies produce finer speckles than low ones, and the speckle size increases with distance from the probe.

1.2. Transfer Learning Perspective

Taekeun Yoon [15] (2023) proposed a deep-learning-based denoising method to rapidly resolve flame emission spectra in high-pressure environments. Denoising convolutional neural networks (CNNs) improved the signal-to-noise ratio of short-gated spectra acquired with a portable spectrometer (0.8–1.5 nm; spectral range: 250–850 nm, resolution: 0.5 nm) under pressures of 1–10 bar and exposure times of 0.05, 0.2, 0.4, and 2 s. Long-exposure (2 s) spectra were used as the ground truth when training the CNN for noise removal. A Kriging model with proper orthogonal decomposition (POD) was then applied to the denoised short-gated spectra to predict flame properties. The study found that prediction accuracy decreased and prediction errors increased as the SNR dropped and the pressure rose, while POD effectively separated the sensitive features in the emission spectra.
Zhou (2022) focused on denoising Coronary Computed Tomography Angiography (CCTA) images [16]. The study linked coronary artery inflammation with the risk of plaque bleeding, suggesting that the proposed criteria could enhance diagnostic power. In a retrospective analysis of 43 patients, the efficacy of the fat attenuation index (FAI) in MRI-based coronary plaque assessment was evaluated using noise-free, high-quality CCTA images. The analysis calculated the average CT value of all voxels, with the FAI ranging from −190 to −30 HU. The results showed that the noise-free CCTA images achieved a sensitivity, specificity, and accuracy of 0.85, 0.79, and 0.80, respectively.
Zsolt Adam Balogh (2022) compared the CT noise removal capabilities of deep-learning-based, traditional, and combined algorithms. The study examined 2D DL-based denoising algorithms, standard 3D binary filters, and their combinations. The noise power spectrum and related image quality measures were evaluated on raw CT images using Catphan 600 phantoms and data from 26 clinical sites comprising over 100,000 images [17]. The findings indicated that traditional 3D spatial denoising algorithms could enhance the performance of 2D deep learning algorithms.
In 2023, Hoichan Jung [18] compared various deep learning algorithms for noise removal in atomic force microscope (AFM) images. The study analyzed models such as MPRNet, HINet, Uformer, and Restormer using the Organic Electron Morphology dataset, which contains 4006 single-channel grayscale AFM images of various materials. AFM image datasets with expert-selected ground truth images were used to train each model.
Yinling Guo [19] (2023) proposed DAS VSP noise removal and wavefield separation based on deep learning. The study utilized a wavelet-based separation method combined with secondary data to reduce signal loss during denoising. Five hundred geological models were created using stochastic processes, and RRCAN 1 was tested with synthetic and noisy data. A plane wave decomposition method was developed to achieve good separation between up-going and down-going waves for RRCAN 2 training. The experimental results showed that the proposed network could reduce DAS VSP noise while effectively separating the up-going and down-going wavefields.
Takuma Kobayashi [20] (2023) focused on post hoc CT noise reduction using deep learning, evaluated via the peri-coronary fat attenuation index (FAI). The study assessed FAI-based diagnosis on raw and denoised images using the receiver operating characteristic curve. The noise-free CCTA showed an improved area under the curve for the FAI (0.89 [95% CI, 0.78–0.99]) compared to standard images (0.77 [95% CI, 0.62–0.91], p = 0.008). In the noise-free CCTA, −69 HU was the best cut-off point for predicting HIP, with a sensitivity of 0.85, a specificity of 0.79, and an accuracy of 0.80.
Lang Zhou [21] (2022) reviewed point cloud noise removal techniques, categorizing them into filter-based, optimization, and deep-learning-based methods. The study used benchmarks to compare multiple noise removal models, addressing various challenges. Two quality tests were conducted to evaluate the noise removal quality. The research provided an in-depth comparison of several well-known technologies, offering a comprehensive analysis.
Dhayanithi et al. (2024) employed a novel transfer-learning-based concatenated model, combining pre-trained models (VGG16, MobileNetV2, ResNet50, and DenseNet121) to classify breast cancer histopathology images [22]. Fine-tuning and hyperparameter optimization enhanced the performance, achieving a high training accuracy of 98%. This approach significantly improves breast cancer diagnosis accuracy and efficiency, with potential benefits for personalized treatment planning and better patient outcomes.
Sivashankari (2024) presented a deep learning framework for tumor segmentation in colorectal ultrasound images using transfer learning and ensemble learning [23]. The methodology involved acquiring a dataset of 179 images, employing data augmentation, and fine-tuning pre-trained models. A custom loss function (GWDice) improved the top tumor margin prediction, and ensemble learning enhanced the overall segmentation, achieving a Dice coefficient of 0.84 and a tumor margin prediction accuracy of 0.67 mm.
Freija (2023) investigated transfer learning with CNN models (including Inception V3 and Xception) for classifying CLL, FL, and MCL types of malignant lymphoma [24]. The non-ensemble models achieved up to 97% accuracy, with Xception leading among them. The ensemble of Inception V3 and Xception achieved 99% accuracy, demonstrating superior performance in distinguishing lymphoma subtypes, highlighting the efficacy of transfer learning in medical classification tasks.
Medical imaging is crucial in diagnosing and treating various health conditions, but the presence of noise in medical images can obscure important details, complicating accurate diagnosis. Traditional denoising techniques often fall short in balancing noise reduction with preserving critical image features [14,17,25,26,27]. To address these challenges, we propose a novel hybrid approach that integrates fractional order denoising with transfer learning. Fractional order techniques offer advanced mathematical methods for precise noise reduction while maintaining image integrity. Unlike traditional integer order calculus, fractional order calculus involves derivatives and integrals of an arbitrary (non-integer) order, allowing for more flexible and accurate modeling of complex systems, including image denoising. Simultaneously, transfer learning utilizes pre-trained convolutional neural networks (CNNs) to leverage existing knowledge from large-scale datasets, enhancing model efficiency and performance in specific applications. By reusing these pre-trained models, our approach can quickly adapt to the task of medical image denoising with improved accuracy and reduced training time. Our Efficient Fractional Order Denoising (EFOD) model is evaluated using several state-of-the-art pre-trained CNN architectures, including DenseNet121, VGG16, ResNet50, and Inception V3. Each model’s performance is meticulously assessed through comprehensive experiments, ensuring robustness and reliability. We address key research gaps such as incomplete dataset utilization, inadequate validation methods, and biased classification by employing diverse datasets, extensive 10-fold cross-validation, and thorough performance analysis.

1.3. Identified Research Gaps

In this section, we identify several critical research gaps that our study aims to address:
Incomplete Utilization of Benchmark Datasets: Many studies do not fully utilize benchmark datasets, resulting in limited applicability and reliability of their models across different medical images and noise patterns.
Inadequate Validation Methods: Current research often uses simplistic or insufficient validation protocols, leading to overfitting and biased results that do not generalize well to new data.
Biased Classification: Existing models often fail to achieve consistent performance across diverse datasets, leading to unreliable diagnostics and potential misdiagnoses.
Limited Integration of Advanced Techniques: Most existing methods rely on traditional denoising techniques, lacking the powerful capabilities of modern machine learning and mathematical frameworks.
Lack of Comprehensive Performance Metrics: Many studies focus on a few indicators like accuracy, ignoring a broader evaluation using metrics such as precision, recall, and F1-score.
Scalability and Efficiency Concerns: Models that perform well on small-scale datasets may struggle with larger datasets due to computational constraints, limiting their practical applicability.

How Our Work Addresses These Gaps

Our proposed ETLFOD model directly addresses these research gaps through the following approaches:
Novel Integration of Techniques: We introduce a unique combination of fractional order calculus and transfer learning for medical image denoising. Using pre-trained models such as DenseNet121, VGG16, ResNet50, and Inception V3, we enhance the denoising performance while maintaining critical image features.
Higher Performance Metrics: The ETLFOD model demonstrates substantial improvements over state-of-the-art (SOTA) techniques. For example, the DenseNet121 model achieved an accuracy of 98.01%, precision of 98%, and recall of 98%.
Comprehensive Evaluation: Our method is rigorously evaluated using diverse medical imaging datasets, including MRI, CT scans, and X-ray images. We employ extensive 10-fold cross-validation and thorough performance analysis, including accuracy, precision, recall, and F1-score.
Comprehensive Dataset Utilization: We utilize a wide range of benchmark datasets to ensure robustness and generalizability across different types of medical images and noise patterns.
Robust Validation Techniques: We employ rigorous validation methods, including 10-fold cross-validation, to provide a realistic and reliable measure of our model’s performance, mitigating overfitting.
Unbiased Classification: Our model achieves consistent performance across diverse datasets, reducing the risk of biased classification through careful dataset selection and thorough performance evaluation.
By addressing these gaps, our work aims to provide a robust, efficient, and reliable solution for medical image denoising, significantly improving the quality of medical images and supporting accurate diagnosis and treatment planning.

1.4. Objective of the Proposed Work

Our research proposes the Efficient Transfer-Learning-Based Fractional Order Image Denoising (ETLFOD) model for medical image analysis, which achieves the following:
  • Utilize Fractional Order Techniques: Implement fractional order techniques in the ETLFOD model to overcome the limitations of traditional denoising methods, enhancing the robustness and accuracy of noise removal in medical images.
  • Leverage Transfer Learning: Use transfer learning to incorporate pre-trained models such as CNN, VGG16, DenseNet121, and Inception V3, which are trained on large-scale datasets, to significantly improve the denoising performance while preserving important image features.
  • Mitigate Overfitting: Improve both training and validation accuracy to mitigate overfitting, ensuring the ETLFOD model generalizes well to new, unseen medical image data.
  • Validate with Diverse Datasets: Utilize a variety of medical imaging datasets, including MRI, CT scans, and X-ray images, to validate the ETLFOD model’s ability to accurately identify infected areas and confirm its robustness across different medical conditions.
  • Comprehensive Performance Comparison: Compare the ETLFOD model’s performance with existing models using multiple evaluation metrics such as sensitivity, specificity, and accuracy, aiming for an improvement of at least 6% over current state-of-the-art techniques. For a more comprehensive analysis, we evaluated the proposed model on all measured data and performed 10-fold cross-validation to reduce bias. We further provide a detailed discussion of the results to enhance our understanding of the model’s performance.
The paper is structured as follows: Section 1 reviews related work in the field of medical image denoising. Section 2 details the methodology of the proposed ETLFOD model, including the custom CNN architecture and transfer learning process. Section 3 presents the experimental results and evaluation. Finally, Section 4 discusses the findings and concludes the paper.
By addressing these key points, our study aims to advance the field of medical image denoising, providing a robust and efficient solution to improve the quality of medical images and support accurate diagnosis and treatment planning.

2. Proposed System

2.1. Benchmark Dataset Description

The data were collected from various sources, including The Cancer Imaging Archive (TCIA) and the Clinical Proteomic Tumor Analysis Consortium (CPTAC). These data were curated and published by the Radiological Society of North America (RSNA) as part of the RSNA 2018 Brain Tumor Segmentation Challenge. The data contain 3762 MRI images divided into infected and non-infected sets of 1683 and 2079 images, respectively. Images are provided in DICOM format with a resolution of 256 × 256 pixels.
The COVID-19 Lung CT Scans dataset available on Kaggle is a collection of CT scans of the lungs of COVID-19 patients. The dataset contains 2074 lung CT scans related to COVID-19 (SARS-CoV-2) infection, which includes 1130 positive cases and 944 negative cases. The images are in PNG format and have a resolution of 512 × 512 pixels. The data were collected from actual patients in teaching hospitals in Tehran, Iran, and the purpose of this dataset is to encourage the development of innovative and effective methods, such as deep convolutional neural networks (DCNNs), to identify if a person is infected with COVID-19 by analyzing their CT scans. We ensured the use of publicly available datasets for our study, including the ADNI MRI brain images, TCIA CT scans, NIH ChestX-ray8 images, and Kaggle Ultrasound Nerve Segmentation data. Detailed access information is provided in our manuscript to facilitate transparency and reproducibility. These datasets are accessible through their respective official websites and portals.
The diagnosis of COVID-19 is primarily carried out through Reverse Transcription Polymerase Chain Reaction (RT-PCR) testing. However, chest X-ray images can also be helpful in early diagnosis, as X-ray machines are widely available and provide quick imaging results. The dataset is organized into two main folders, “train” and “test”, both of which contain two subfolders, “PNEUMONIA” and “NORMAL”. The dataset includes 5856 X-ray images, comprising 4273 infected images and 1583 non-infected images. The summary of the dataset is presented in Table 1.
Figure 1 shows the workflow of the proposed model. The input is typically a grayscale image that captures the internal structure of an object or body part using CT, MRI, or X-ray. The raw image may contain noise or artifacts due to equipment limitations, the radiation dose, or the image acquisition protocol. A noised image is generated by adding artificial noise to the raw image to simulate such scenarios. Denoising is then applied to remove the noise from the noisy image while preserving the underlying structure. The denoised images are split into training and testing datasets, usually at a ratio of around 70:30. The training set is used to train the model, and the test set is used to evaluate its performance. We use deep learning models such as VGG16, ResNet50, and Inception as feature extractors on top of a base CNN model. Once trained, the pipeline is validated on the test set by comparing the denoised images with the ground truth (i.e., the original images), which allows us to measure the proposed approach’s accuracy, precision, recall, and F1-score.
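A minimal sketch of this noising step, assuming 8-bit grayscale inputs and the Gaussian noise level (σ = 10) used later in the experiments; the function name is illustrative:

```python
import numpy as np

def add_gaussian_noise(image, sigma=10.0, seed=0):
    """Simulate g(x, y) = f(x, y) + n(x, y) with zero-mean Gaussian noise n."""
    rng = np.random.default_rng(seed)
    noisy = image.astype(float) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255)  # keep intensities in the valid 8-bit range
```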

2.2. System Description

In our experiments, Matlab R2021b was used for image denoising. For model development, we used Google Colab, a cloud-based Jupyter Notebook environment (Python 3.10.11), integrated with Google Drive for accessing the image sources. The availability of GPUs (type T4) in Google Colab played a pivotal role in accelerating model training. We employed the Keras 2.6.0 library, a high-level neural network API running on top of TensorFlow 2.7.0, a robust open-source machine learning framework, to develop the deep learning models. Integrating Keras and TensorFlow with Google Colab provided an efficient collaborative environment for importing the necessary libraries and executing code, and it allowed us to harness the computational power of GPUs effectively. Overall, this configuration provided an ideal environment for our experiments.

2.3. An Efficient Fractional-Order-Based Image Denoising

The ETLFOD model effectively utilizes pre-trained models such as DenseNet121, VGG16, ResNet50, and Inception V3 to improve denoising performance by leveraging fractional calculus. Fractional calculus offers a flexible framework for managing diverse noise patterns while preserving crucial image details, making it an excellent choice in image denoising tasks [12,28].
Fractional derivatives such as the Riemann–Liouville and Caputo derivatives, while powerful, are not widely familiar to readers outside specialized fields. To bridge this gap, we briefly review the underlying principles of fractional calculus below, relating each definition to more familiar mathematical operations; readers seeking a gentler treatment are referred to introductory texts on fractional calculus.
Image denoising methods fall into two categories: spatial domain denoising and frequency domain denoising. Spatial domain denoising operates directly on the pixels of the image, whereas frequency domain denoising applies smoothing and sharpening filters in the frequency domain. Here, we define the spatial domain operator with the help of fractional derivatives.
The most common fractional derivatives are the Riemann–Liouville, Caputo, Jumarie, Hadamard, Mittag–Leffler, and Weyl derivatives; each definition has its own advantages and disadvantages. The most widely used definitions are as follows:
  • Riemann–Liouville Derivative:
For $m \le \alpha < m + 1$, the α-derivative of f(t) is:
$$ {}_{a}^{R}D_{t}^{\alpha} f(t) = \frac{1}{\Gamma(m + 1 - \alpha)} \left( \frac{d}{dt} \right)^{m+1} \int_{a}^{t} (t - \tau)^{m - \alpha} f(\tau) \, d\tau $$
This involves first integrating the function against the kernel $(t - \tau)^{m - \alpha}$ and then differentiating $m + 1$ times. The Riemann–Liouville derivative is useful in theoretical studies but can be challenging to apply directly to real-world problems due to its non-local nature.
  • Caputo Derivative:
The Caputo derivative is the most suitable fractional operator for modeling real-world problems. It is defined as follows:
For $n = \lceil \alpha \rceil$, $n - 1 < \alpha \le n$, $t > a$, the α-derivative of f(t) is:
$$ {}_{a}^{C}D_{t}^{\alpha} f(t) = \frac{1}{\Gamma(n - \alpha)} \int_{a}^{t} \frac{f^{(n)}(\tau)}{(t - \tau)^{\alpha - n + 1}} \, d\tau $$
This derivative takes the n-th derivative of f and integrates it against the kernel $(t - \tau)^{n - \alpha - 1}$. It is often preferred because it requires the function f to have only n derivatives, making it more practical for applications.
  • Grünwald–Letnikov Derivative:
The Grünwald–Letnikov derivative generalizes the finite difference quotient to non-integer order and is widely used in numerical treatments of fractional derivatives. It is defined as:
$$ f^{(\alpha)}(t) = \lim_{h \to 0} \frac{1}{h^{\alpha}} \sum_{r=0}^{n} (-1)^{r} \binom{\alpha}{r} f(t - rh) $$
The Grünwald–Letnikov discretization has often been used for numerical simulations of fractional order problems and is defined as follows:
$$ D_{x}^{\alpha} f = \frac{1}{(\Delta x)^{\alpha}} \sum_{k=0}^{i+1} C_{k}^{\alpha} \, g_{i-k+1,\,j} \quad \text{and} \quad D_{y}^{\alpha} f = \frac{1}{(\Delta y)^{\alpha}} \sum_{k=0}^{j+1} C_{k}^{\alpha} \, g_{i,\,j-k+1} $$
$$ \text{where} \quad C_{k}^{\alpha} = (-1)^{k} \, \frac{\alpha (\alpha - 1)(\alpha - 2) \cdots (\alpha - k + 1)}{k!} $$
This derivative is particularly useful for numerical simulations because it is based on a finite difference approximation: it sums the values of the function at discrete points, as sketched in the code below.
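To make the discretization concrete, the following NumPy sketch computes the coefficients $C_k^{\alpha}$ through the standard recurrence $C_0 = 1$, $C_k = C_{k-1}\,(1 - (\alpha + 1)/k)$ and applies the truncated Grünwald–Letnikov sum to a 1-D signal; the function names and the sample signal are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def gl_coefficients(alpha, n):
    """Grunwald-Letnikov coefficients C_k^alpha = (-1)^k * binom(alpha, k)."""
    c = np.empty(n + 1)
    c[0] = 1.0
    for k in range(1, n + 1):
        c[k] = c[k - 1] * (1.0 - (alpha + 1.0) / k)  # standard recurrence
    return c

def gl_derivative_1d(f, alpha, h):
    """Truncated GL approximation of the alpha-order derivative of samples f."""
    f = np.asarray(f, dtype=float)
    c = gl_coefficients(alpha, len(f) - 1)
    d = np.empty_like(f)
    for i in range(len(f)):
        # sum_{k=0}^{i} C_k * f[i - k], i.e., a dot with the reversed history
        d[i] = np.dot(c[: i + 1], f[i::-1]) / h ** alpha
    return d

# Example with alpha = 1.2, the fractional order used in the experiments:
t = np.linspace(0.0, 1.0, 101)
print(gl_derivative_1d(t ** 2, 1.2, t[1] - t[0])[-3:])
```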

Optimization Problem for Image Processing

The following optimization problem recovers the clean image f(x,y) using the fractional order finite difference scheme:
$$ \min_{f} \int_{\Omega} \left( \left| D^{\alpha} f \right| + \mu \left| f - g \right| \right) dx \, dy \qquad (1) $$
where $\Omega \subset \mathbb{R}^{2}$ is the image domain, μ > 0 is a parameter controlling the degree of smoothing, $|f - g|$ is the data fidelity part, and $\int_{\Omega} |D^{\alpha} f|$ is the regularization part, with the fractional order space derivative of order 1 < α < 2. By the gradient projection method, problem (1) can be written in terms of partial derivatives as:
$$ \frac{\partial f}{\partial t} = \frac{f_{x}^{\alpha} + f_{y}^{\alpha}}{\sqrt{(f_{x}^{\alpha})^{2} + (f_{y}^{\alpha})^{2}}} + \mu (f - g) $$
Otherwise, using the finite difference scheme, Equation (8) can be discretized into:
$$ \frac{\partial f}{\partial t} = \frac{A_{ij}^{n}}{B_{ij}^{n}} + \mu (f - g) $$
where
$$ A_{ij}^{n} = f_{xx}^{\alpha} (f_{y}^{\alpha})^{2} - 2 f_{xy}^{\alpha} f_{x}^{\alpha} f_{y}^{\alpha} + f_{yy}^{\alpha} (f_{x}^{\alpha})^{2} \quad \text{and} \quad B_{ij}^{n} = \left( (f_{x}^{\alpha})^{2} + (f_{y}^{\alpha})^{2} \right)^{3/2} $$
Now, the regularization term $D^{\alpha} f$ can be discretized based on the space-fractional heat equation:
$$ D_{t} f(x, y, t) = D_{x}^{\alpha} f(x, y, t) + D_{y}^{\alpha} f(x, y, t) $$
With Neumann boundary conditions, the continuity of the image boundary can be maintained:
$$ f_{i,N}^{n} = f_{i,N-1}^{n}, \qquad f_{N,j}^{n} = f_{N-1,j}^{n} $$
where $X_{L} < x < X_{R}$, $Y_{L} < y < Y_{R}$, and $0 \le t \le T$.
The main properties of the explicit technique, namely its stability, convergence, and error behavior, are studied for the proposed problem. The explicit scheme discretization of Equation (9) is as follows:
$$ \frac{f_{ij}^{n+1} - f_{ij}^{n}}{\Delta t} = \frac{1}{2} \left( D_{x}^{\alpha} + D_{y}^{\alpha} \right) \left( f_{ij}^{n+1} + f_{ij}^{n} \right) $$
We adopt this boundary condition to protect the continuity of the image boundary, reflecting the closest pixels inside the image region. Let $f_{ij}^{n}$ be a numerical approximation of $f(x_{i}, y_{j}, t_{n})$, and let $\Delta x = h > 0$ and $\Delta y = h > 0$ be the grid sizes in the x- and y-directions; $\Delta x$, $\Delta y$, and $\Delta t$ denote the spatial step sizes and the time step size, respectively.
Discretizing (12) using the finite difference technique yields:
$$ \left( 1 + \frac{\mu \Delta t}{2} D_{x}^{\alpha} + \frac{\mu \Delta t}{2} D_{y}^{\alpha} \right) f_{ij}^{n+1} = \left( 1 - \frac{\mu \Delta t}{2} D_{x}^{\alpha} - \frac{\mu \Delta t}{2} D_{y}^{\alpha} \right) f_{ij}^{n} + \frac{\Delta t}{2} \left( D_{x}^{\alpha} + D_{y}^{\alpha} \right) \left( \frac{A_{ij}^{n+1}}{B_{ij}^{n+1}} + \frac{A_{ij}^{n}}{B_{ij}^{n}} \right) + \Delta t \, \mu \left( D_{x}^{\alpha} + D_{y}^{\alpha} \right) f_{ij} $$
Using an iterative method to calculate $A_{ij}^{n+1} / B_{ij}^{n+1}$, we obtain:
$$ \frac{A_{ij}^{n+1}}{B_{ij}^{n+1}} = f_{xx}^{\alpha,\,n+1} \frac{\left( f_{y}^{\alpha} \right)^{2,\,n}}{B_{ij}^{n}} - 2 f_{xy}^{\alpha,\,n+1} \frac{\left( f_{x}^{\alpha} f_{y}^{\alpha} \right)^{n}}{B_{ij}^{n}} + f_{yy}^{\alpha,\,n+1} \frac{\left( f_{x}^{\alpha} \right)^{2,\,n}}{B_{ij}^{n}} $$
The above equation may be written in matrix form:
$$ P Q \, f^{n+1} = R S \, f^{n} + T^{n+1} $$
The numerical solutions for $f_{ij}^{n+1}$ and $f_{ij}^{n}$ follow from the given initial and Neumann boundary conditions. Numerical methods, particularly the Grünwald–Letnikov derivative, are used to approximate the solution to the optimization problem, which balances data fidelity and smoothness through regularization. The resulting discretized equations are solved iteratively to obtain the clean image, as sketched below.
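To illustrate how the scheme can be turned into an iterative solver, the sketch below time-steps a simplified fractional diffusion plus fidelity flow using the truncated Grünwald–Letnikov differences in each direction. It is a rough sketch under stated assumptions: it omits the curvature normalization A/B, uses an explicit update with illustrative values of Δt, μ, and the iteration count (not the paper's settings), and writes the fidelity term as μ(g − f) so that the flow is pulled toward the observed image:

```python
import numpy as np

def gl_coeffs(alpha, n):
    # C_0 = 1, C_k = C_{k-1} * (1 - (alpha + 1) / k)
    c = np.empty(n + 1)
    c[0] = 1.0
    for k in range(1, n + 1):
        c[k] = c[k - 1] * (1.0 - (alpha + 1.0) / k)
    return c

def fractional_denoise(g, alpha=1.2, mu=0.05, dt=0.1, n_iter=50):
    """Explicit update sketch: f <- f + dt * (D_x^a f + D_y^a f + mu * (g - f))."""
    f = g.astype(float).copy()
    rows, cols = f.shape
    cx, cy = gl_coeffs(alpha, rows - 1), gl_coeffs(alpha, cols - 1)
    for _ in range(n_iter):
        dx = np.zeros_like(f)
        dy = np.zeros_like(f)
        for i in range(rows):              # truncated GL difference along x
            dx[i, :] = cx[: i + 1] @ f[i::-1, :]
        for j in range(cols):              # truncated GL difference along y
            dy[:, j] = f[:, j::-1] @ cy[: j + 1]
        f += dt * (dx + dy + mu * (g - f))
        f[-1, :], f[:, -1] = f[-2, :], f[:, -2]   # Neumann boundary reflection
    return f

# Usage on a synthetic noisy image (noise level sigma = 10, as in Table 2):
rng = np.random.default_rng(0)
clean = np.zeros((64, 64)); clean[16:48, 16:48] = 100.0
noisy = clean + rng.normal(0.0, 10.0, clean.shape)
denoised = fractional_denoise(noisy)
```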
Table 2 presents the results of image denoising using two different models, an integer order model and the EFOD (Efficient Fractional Order Image Denoising) model, on three data types: brain (MRI), lung (CT), and pneumonia (X-ray). The experiments were conducted with a noise level (σ) of 10 and a fractional order parameter (α) of 1.2. The EFOD model consistently outperformed the integer order model in terms of PSNR and MSE, indicating better denoising performance. Notably, the EFOD model also required less processing time than the integer order model. These results suggest that the proposed EFOD model is more effective at reducing noise and improving image quality.

2.4. Distribution of Pixel Intensity to Calculate Mean and Standard Deviation

To calculate the pixel mean and standard deviation of a noise-free CT image, a noise removal algorithm must be applied to remove noise and improve image quality. Various denoising methods can be used for CT images, such as wavelet denoising, non-local means denoising, and total variation denoising. Once the image has been denoised, the pixel mean and standard deviation can be calculated using the following formulas:
$$ \text{Pixel mean:} \quad \mu = \frac{1}{N} \sum_{i=1}^{N} x_{i} $$
where N is the total number of pixels in the image, and $x_{i}$ is the intensity value of the i-th pixel.
$$ \text{Standard deviation:} \quad \sigma = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( x_{i} - \mu \right)^{2} } $$
where μ is the pixel mean calculated using the above formula.
Note that the pixel mean and standard deviation can be helpful measures for evaluating the quality of the denoising algorithm and comparing it with other methods. However, they should be interpreted with caution, as they do not necessarily reflect the image’s clinical relevance or diagnostic accuracy. Other metrics, such as the contrast-to-noise ratio (CNR) and signal-to-noise ratio (SNR), may also help assess the quality of CT images.
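For completeness, the two formulas above reduce to a few lines of NumPy (the built-in mean() and std() calls are equivalent):

```python
import numpy as np

def pixel_stats(image):
    """Pixel mean and standard deviation, as defined above."""
    x = np.asarray(image, dtype=float).ravel()
    mu = x.sum() / x.size                            # (1/N) * sum_i x_i
    sigma = np.sqrt(((x - mu) ** 2).sum() / x.size)  # population std
    return mu, sigma
```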

2.4.1. Discussion on MRI Brain Images

Figure 2 shows sample input, noised, and denoised brain MRI images. Noising and denoising are standard techniques used in MRI to study and improve image quality, as denoising reduces the impact of artifacts introduced during acquisition. MRI acquires signals using strong magnetic fields and radiofrequency pulses, which are reconstructed into cross-sectional 3D images using complex algorithms. However, many factors, such as thermal noise in the receiver coils, patient motion, and magnetic field inhomogeneities, can introduce noise and artifacts into the reconstructed images.
Adding artificial noise to MRI images can help simulate the impact of noise on image quality and support the testing of image processing algorithms. Conversely, denoising techniques can be applied to remove unwanted noise and improve image clarity. There are various methods for noising and denoising in MRI imaging, including statistical filtering, wavelet techniques, and CNN-based models. The choice of method depends on the specific requirements of the application and the available equipment.
Figure 3 shows the pixel distribution of noised and denoised brain MRI-DICOM images. Figure 4 shows the distribution of pixel intensity in the denoised images. The images are 342 pixels wide by 421 pixels high with a single color channel. The mean pixel value is 53.4402, and the standard deviation is 87.7547.

2.4.2. Discussion on Lung CT Images

Figure 5 shows the sample images of noised and denoised COVID-19 lung CT images, which provide detailed information about the structure and condition of the lungs. These images are used for diagnosing and monitoring various lung diseases, including pneumonia, lung cancer, and chronic obstructive pulmonary disease. Denoising techniques are crucial in improving the quality and usefulness of lung CT images. CT scans are susceptible to noise, which can degrade the image quality and make it challenging to interpret the findings accurately. Noise in CT images can arise from various sources, including photon statistics, electronic noise, and patient motion artifacts.
Denoising algorithms aim to reduce the noise in CT images while preserving the essential anatomical details and features. These algorithms employ various mathematical and statistical methods to identify and remove noise without significantly affecting the diagnostic information. One standard way of adding noise uses Gaussian noise, random noise added to the image pixel values. Figure 6 shows the noised and denoised versions of a CT image.
Figure 7 shows the pixel distribution of the noised and denoised lung CT images. The images are 421 pixels wide by 3 pixels high with a single color channel. The maximum pixel value is 0.3139. The pixels have an average value of 186.5053 and a standard deviation of 0.4246.

2.4.3. Discussion on Pneumonia X-ray Images

Figure 8 shows the sample images of pneumonia X-ray images. X-ray images are often subject to noise, which can be caused by various factors, including the imaging equipment, the patient’s movement, and the nature of the tissue being imaged. Noise can reduce the quality of the image and make it difficult for medical professionals to make an accurate diagnosis. Therefore, noising and denoising techniques are crucial in improving the quality of X-ray images. Noising techniques are used to simulate the noise in X-ray images to test the effectiveness of denoising algorithms. Denoising techniques aim to reduce the noise in X-ray images while preserving critical diagnostic features. Deep-learning-based denoising involves training deep neural networks to learn the mapping between noised and denoised images.
Figure 9 shows the noised and denoised images. Figure 10 shows the pixel distribution of the denoised pneumonia X-ray images. The images are 342 pixels wide by 421 pixels high with a single color channel. The maximum pixel mean is 94.1467. The average pixel value is 184.8223, with a standard deviation of 67.8781.

2.5. Transfer Learning

Transfer learning uses pre-trained models as a starting point for new tasks. The idea is to take the knowledge and features learned by a model trained on a large dataset and apply them to a smaller or related dataset. The pre-trained model is used as the backbone, and several of its final layers are replaced with a new set specific to the new task. The new layers are then trained on the smaller dataset to fine-tune the model for the new task. The transfer learning formula can be expressed as:
f(x) = g(h(x))
where x is the input data, h is the pre-trained model, g represents the new task-specific layers, and f is the final output of the transfer learning model. The model.fit() function is used to fit the training data, with hyperparameter settings of 10 epochs, a validation dataset (validation_data), class weights (class_weight) to balance class frequencies, 5 steps per epoch, and 20 validation steps.
In transfer learning, the weights of the pre-trained layers are frozen, and only the new layers’ weights are updated during training. This allows the model to learn new task-specific features while preserving the general knowledge captured by the pre-trained model. The benefits of transfer learning include faster training times, improved performance on smaller datasets, and the ability to leverage the knowledge of models trained on larger datasets.
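A minimal Keras sketch of this recipe follows, using DenseNet121 (one of the backbones employed in this work) as the frozen h and a small task-specific head as g, with the fit() settings quoted above. The dummy tensors stand in for the denoised image datasets, and the class weights are placeholders, not the values used in our experiments:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# h(x): frozen pre-trained backbone; g(.): new trainable task head.
base = tf.keras.applications.DenseNet121(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                        # freeze pre-trained weights

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),    # binary: infected vs. normal
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])

# Dummy tensors standing in for the denoised image datasets.
x = np.random.rand(8, 224, 224, 3).astype("float32")
y = np.random.randint(0, 2, size=8)
train_ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(4).repeat()
val_ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(4).repeat()

model.fit(train_ds,
          epochs=10,                          # settings quoted in the text
          steps_per_epoch=5,
          validation_data=val_ds,
          validation_steps=20,
          class_weight={0: 1.0, 1: 1.0})      # placeholder class weights
```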

2.5.1. Convolutional Neural Networks

Computer vision can be achieved using convolutional neural networks (CNNs), which are neural networks that extract features from images, as shown in Figure 11. A CNN forms the base of the model used in this study. Feature extraction consists of three simple operations: filtering the image for a particular feature (convolution), detecting the feature in the filtered image (ReLU activation), and condensing the image to emphasize the feature (max pooling) [13].
Using convolution filters with different dimensions or values results in different features being extracted, and ReLU activation then highlights the pixels where each feature is detected. In transfer learning, pre-trained models are often used as feature extractors to obtain useful features from the input data, which can then be fed into new models to solve different tasks. The ReLU function is applied throughout the deep layers of the pre-trained model.
f(x) = max(0, x), where x is the input to the function.
The role of the ReLU function in transfer learning is to introduce nonlinearity into the model, allowing it to learn complex relationships between inputs and outputs. ReLU makes the model more expressive, so richer patterns can be learned from the input data. ReLU also mitigates the vanishing gradient problem by preventing gradients from shrinking during backpropagation; this problem can occur with other activation functions, such as the sigmoid or tanh functions, whose gradients are small for large input values. The ReLU activation function is shown in Figure 12.
Features are enhanced with max pool layers. In max pooling, the input is divided into rectangular regions, and the maximum value within each region is selected as the output. The max pooling operation can be defined as:
max_pool(x) = max(x_{i,j})
where x_{i,j} is the input value at position (i,j) within each pooling region. In transfer learning, max pooling layers are often used in pre-trained models to reduce the spatial size of the feature maps while preserving the essential features. This reduces the computational cost and memory footprint of the model while maintaining good performance. The enhanced max pool layers are shown in Figure 13.
The stride parameter determines the distance the filter moves at each step. The padding setting determines whether boundary pixels are ignored or zeros are added so the network can retain boundary information. The purpose of padding is to preserve the spatial dimensions of the input data after convolution or pooling operations. In transfer learning, padding can be used to ensure that the input to the pre-trained model has the same spatial dimensions as the original input used to train the model. Zero padding adds rows and columns of zeros around the border of the input image or feature map, with the padding size typically chosen so that the output of the convolution or pooling has the same spatial dimensions as the input. The formula for zero padding is:
P = (F − 1)/2
where P is the number of rows or columns of padding to add, and F is the size of the filter or kernel used in the convolution operation. For example, a 3 × 3 filter requires P = (3 − 1)/2 = 1 row or column of zeros on each side to preserve the input size; a quick shape check follows below. A pictorial representation of padding is given in Figure 14.
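A quick shape check of these conventions, assuming the TensorFlow/Keras stack used elsewhere in this work ("same" padding with a 3 × 3 filter corresponds to P = 1; the tensor sizes are illustrative):

```python
import tensorflow as tf

x = tf.random.normal([1, 224, 224, 3])
conv = tf.keras.layers.Conv2D(32, 3, strides=1, padding="same")  # P = (3-1)/2 = 1
pool = tf.keras.layers.MaxPooling2D(pool_size=2, strides=2)
print(conv(x).shape)         # (1, 224, 224, 32): "same" padding preserves size
print(pool(conv(x)).shape)   # (1, 112, 112, 32): pooling halves each dimension
```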
The outputs are then concatenated in dense layers. The dense layers are shown in Figure 15.
The neural network selects the image’s class using sigmoid activation, a nonlinear function that maps input values to the range (0, 1). In transfer learning, the sigmoid activation function can generate an output score for each class in the last layer of the pre-trained model. The sigmoid function is defined as:
f(x) = 1/(1 + e−x), where x is the input to the function.
The output of the sigmoid function can be interpreted as the probability that the input belongs to a particular class. In binary classification problems, the last layer of the neural network usually uses a sigmoid function to produce a probability score for the positive class. The sigmoid activation function is given in Figure 16.

Detailed Description of the Custom CNN Architecture in ETLFOD Model

The custom convolutional neural network (CNN) architecture used in our Efficient Transfer-Learning-Based Fractional Order Image Denoising (ETLFOD) model is specifically designed to address the unique challenges of medical image denoising. Below, we provide a comprehensive description of the architecture, including its layers, parameters, and pretraining datasets:
  • Custom CNN Architecture Overview
    Input Layer:
    Description: The input layer accepts medical images with dimensions H × W × C, where H is the height, W is the width, and C is the number of channels (e.g., grayscale or RGB).
    Input Shape: Typically 224 × 224 × 3 for RGB images.
  • Convolutional Layers: First Convolutional Layer:
    Filter Size: 32 filters
    Kernel Size: 3 × 3
    Activation Function: ReLU (Rectified Linear Unit)
    Stride: 1
    Padding: Same
  • Second Convolutional Layer:
    Filter Size: 64 filters
    Kernel Size: 3 × 3
    Activation Function: ReLU
    Stride: 1
    Padding: Same
  • Third Convolutional Layer:
    Filter Size: 128 filters
    Kernel Size: 3 × 3
    Activation Function: ReLU
    Stride: 1
    Padding: Same
  • Pooling Layers: First Pooling Layer:
    • Type: Max Pooling
    • Pool Size: 2 × 2
    • Stride: 2
  • Second Pooling Layer:
    • Type: Max Pooling
    • Pool Size: 2 × 2
    • Stride: 2
  • Batch Normalization:
    • Batch normalization layers are added after each convolutional layer to normalize the inputs and improve training stability and performance.
  • Dropout Layers: First Dropout Layer:
    • Dropout Rate: 0.25 (applied after the first pooling layer)
  • Second Dropout Layer:
    • Dropout Rate: 0.5 (applied after the second pooling layer)
  • Fully Connected (Dense) Layers: First Dense Layer:
    • Units: 512
    • Activation Function: ReLU
  • Second Dense Layer:
    • Units: 256
    • Activation Function: ReLU
  • Output Layer:
    • Units: Number of classes (for classification tasks) or 1 (for regression tasks)
    • Activation Function: Softmax (for classification) or Linear (for regression)
  • Custom CNN Architecture
    The custom CNN architecture for the ETLFOD model is based on the principles of transfer learning, employing several well-known pre-trained models. The detailed architecture for each of these models is as follows:
    DenseNet121:
    Architecture: Each layer in DenseNet121 is connected to every other layer in a feed-forward fashion, allowing for maximum information flow between layers. This connectivity pattern is designed to improve gradient flow and make the network more efficient.
    Parameters: Uses multiple dense blocks, each containing several convolutional layers.
    Pretraining Dataset: ImageNet
    Equation: y = F(x,W)
    Fine-Tuning: The pre-trained model is fine-tuned on the target dataset.
    VGG16:
    Architecture: Composed of 2–3 convolutional layers followed by a pooling layer. Includes two hidden layers with 4096 nodes each and an output layer with 1000 nodes. Uses 3 × 3 filters for convolutional layers.
    Parameters: A total of 13 convolutional layers, 5 pooling layers, 3 fully connected layers.
    Pretraining Dataset: ImageNet.
    ResNet50:
    Architecture: Fifty layers deep with residual blocks. The architecture includes convolutional layers, batch normalization, ReLU activation, and max pooling.
    Parameters: A total of 23 convolutional layers and 1 fully connected layer. The filters increase from 64 to 256 in each group of residual blocks.
    Pretraining Dataset: ImageNet
    Equation: The network uses residual learning with shortcut connections to jump over some layers.
    Inception V3:
    Architecture: Designed with inception modules that perform convolutions in parallel, including 1 × 1, 3 × 3, and 5 × 5 convolutions followed by max pooling.
    Parameters: Multiple inception modules with batch normalization and auxiliary classifiers.
    Pretraining Dataset: ImageNet.
    Pretraining Dataset
  • For transfer learning, our custom CNN architecture was initialized using pre-trained weights from the ImageNet dataset, a widely recognized benchmark. The overall process is as follows:
  • Algorithm Steps:
    • Apply Gaussian noise to the images in the preprocessing stage.
    • Denoise the images using fractional order techniques.
    • Train and test the model using the pre-trained architectures (DenseNet121, VGG16, ResNet50, Inception V3).
    • Perform data augmentation to remove artifacts and enhance model robustness.
  • Performance Metrics:
    • Evaluated using precision, recall, F1-score, training accuracy, and testing accuracy.
    • Models like DenseNet121 showed significant improvement in performance metrics after applying denoising techniques.
  • Training Details
    • Optimizer: The Adam optimizer was used with a learning rate of 1 × 10⁻⁴.
    • Loss Function: Cross-entropy loss for classification tasks or mean squared error (MSE) for regression tasks.
    • Batch Size: 32
    • Epochs: 50, with early stopping based on validation loss to prevent overfitting.
    • Data Augmentation: Techniques such as rotation, flipping, and scaling were applied to the training images to increase the dataset’s diversity and improve the model’s generalization ability.
The custom CNN architecture for the ETLFOD model leverages transfer learning by employing pre-trained models such as DenseNet121, VGG16, ResNet50, and Inception V3, all initialized with weights from the ImageNet dataset. DenseNet121 enhances gradient flow through dense connectivity, VGG16 uses a series of convolutional layers followed by fully connected layers, ResNet50 incorporates residual blocks to allow deep network training, and Inception V3 employs inception modules for parallel convolutions. The architecture was fine-tuned on a target dataset, with preprocessed images denoised using fractional order techniques and augmented through methods like rotation and scaling. The models were evaluated using precision, recall, F1-score, and accuracy metrics, with DenseNet121 showing notable performance improvements after denoising. Training was conducted using the Adam optimizer with a learning rate of 1 × 10⁻⁴, cross-entropy or mean squared error loss, a batch size of 32, and 50 epochs, with early stopping based on validation loss to prevent overfitting.
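Under our reading of the specification above, a Keras sketch of the custom CNN could look as follows; the exact placement of batch normalization, pooling, and dropout relative to each convolution is our assumption where the text leaves it open:

```python
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import Adam

def build_custom_cnn(num_classes=2, input_shape=(224, 224, 3)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        # Convolutional blocks: 32 -> 64 -> 128 filters, 3x3 kernels,
        # stride 1, "same" padding, batch normalization after each convolution.
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2, strides=2),
        layers.Dropout(0.25),                 # after the first pooling layer
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2, strides=2),
        layers.Dropout(0.5),                  # after the second pooling layer
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer=Adam(learning_rate=1e-4),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model

build_custom_cnn().summary()
```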

2.5.2. DenseNet121

DenseNet121 is a CNN architecture that can be used as a pre-trained model for transfer learning. In transfer learning, the pre-trained model is fine-tuned on a smaller target dataset to adapt to a specific task. The main equation for DenseNet121 is:
y = F(x,W)
where y is the output, x is the input, W represents the model’s weights, and F is the function that maps the input to the output. In DenseNet, F consists of multiple dense blocks, and each dense block consists of several layers; each layer receives the feature maps of all preceding layers in the same block, enabling feature reuse and refinement. When using DenseNet121 for transfer learning, the pre-trained network is typically used for feature extraction. The features extracted by the pre-trained model are fed to a classifier trained on the target dataset to predict the target labels. The overall transfer learning equation for DenseNet121 can be written as:
y_target = Classifier(F_pretrained(x_target,W_pretrained),W_target)
where y_target is the predicted output for the target dataset, x_target is the input from the target dataset, F_pretrained is the function that maps the input to features using the pre-trained weights W_pretrained, and Classifier is the function that maps the extracted features to the target labels using the fine-tuned weights W_target. In this way, the pre-trained DenseNet121 model can be leveraged for transfer learning, allowing the model to learn from a smaller target dataset and achieve better performance than training a new model from scratch. DenseNet121 is a densely connected convolutional network, where each layer is connected to all subsequent layers, i.e., the first layer is connected to the second, third, fourth, and so on, and the second layer is connected to the third, fourth, fifth, and so on. DenseNet121 is shown in Figure 17.

2.5.3. VGG16

The VGG16 architecture is designed with a pattern of two or three convolutional layers followed by a pooling layer, and it has a final dense network that includes two hidden layers consisting of 4096 nodes each and an output layer with 1000 nodes. The VGG16 model uses only 3 × 3 filters for the convolutional layers. The VGG16 architecture is shown in Figure 18.

2.5.4. ResNet50

Figure 19 shows the architecture of ResNet50. ResNet50 is a deep convolutional neural network architecture that consists of 50 layers. The formula for ResNet50 can be represented as follows:
Input → Convolutional layer → Batch normalization → ReLU activation → Max pooling → Residual blocks → Global average pooling → Fully connected layer → Output
where the input is the raw CT or lung image, and the convolutional layers apply filters to the input images to extract features. Batch normalization normalizes the output of the convolutional layers to improve network stability and convergence. ReLU activation introduces nonlinearity into the network. Max pooling downsamples the feature maps to reduce their spatial size.
Residual Blocks: Residual blocks introduce shortcut (skip) connections that allow the network to learn residual mappings instead of direct mappings. Global average pooling performs a pooling operation over the entire feature map to obtain a fixed-length vector representing the image. A fully connected layer performs a matrix multiplication between the output of the previous layer and a set of weights to generate the final output, which is the predicted class label or a probability distribution over classes. ResNet50 has a total of 23 convolutional layers and 1 fully connected layer in this configuration, with the residual blocks stacked in groups of three and the number of filters increasing from 64 to 256 across the groups; a sketch of a residual block is given below.
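The following functional-API sketch illustrates the residual idea: the block outputs F(x) plus a shortcut, with a 1 × 1 projection when the shapes differ. It is a basic two-convolution block for illustration, not the exact bottleneck block used inside ResNet50:

```python
from tensorflow.keras import layers, Input, Model

def residual_block(x, filters, stride=1):
    shortcut = x
    y = layers.Conv2D(filters, 3, strides=stride, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if stride != 1 or shortcut.shape[-1] != filters:
        # 1x1 projection so the shortcut matches the residual branch shape
        shortcut = layers.Conv2D(filters, 1, strides=stride)(shortcut)
    y = layers.Add()([y, shortcut])           # learn the residual, then add x
    return layers.Activation("relu")(y)

inp = Input((56, 56, 64))
out = residual_block(inp, 128, stride=2)
print(Model(inp, out).output_shape)           # (None, 28, 28, 128)
```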

2.5.5. Inception V3

Inception V3, shown in Figure 20, is used in transfer learning, where pre-trained models are fine-tuned for new tasks. The procedure is as follows: (1) load the pre-trained Inception V3 model, which has been trained on a large dataset such as ImageNet; (2) remove the model’s final layer(s) designed for ImageNet classification; (3) add new layers suited to the task at hand; (4) freeze the weights of the pre-trained layers to prevent them from being updated during training, ensuring the pre-trained features are retained and only the new layers are trained; (5) train the model on the new task using a suitable optimizer and loss function, with labelled images representative of the new task; (6) evaluate the model on a separate validation set to measure its performance; and (7) if necessary, adjust the hyperparameters or architecture of the model and repeat steps 5 and 6 until satisfactory performance is achieved. Once the model is trained and validated, it can predict labels for new target images.
Algorithm 1 (ETLFOD) outlines the procedure for identifying and predicting infected regions in medical images. The algorithm employs transfer learning in combination with fractional order image denoising. The first step is to take the images from the dataset and apply Gaussian noise to them in the preprocessing stage. Next, the algorithm solves the denoising equation to obtain denoised images. The model is then trained and tested on the denoised images using a loop with m and n as iteration variables.
Algorithm 1: ETLFOD
Input: Image acquisition from benchmark datasets
Output: Prediction of the infected region in medical images.
1. Set the images f(x,y) from the dataset
2. In preprocessing:
   • Add Gaussian noise to f(x,y)
   • Compute $D_{x}^{\alpha} f$ and $D_{y}^{\alpha} f$
   • Calculate $A_{ij}^{n} / B_{ij}^{n}$ and $A_{ij}^{n+1} / B_{ij}^{n+1}$
   • Solve the equation
   • Set $f^{n+1} = f$
3. Obtain the denoised images
4. Calculate the mean and standard deviation to find the pixel intensity
5. In the transfer learning model, carry out the following:
6.   Analyze the denoised images using the CNN model
7.   for m = 1: train the model using DenseNet121, VGG16, ResNet50, Inception V3
8.   for n = 1: test the model using DenseNet121, VGG16, ResNet50, Inception V3
9.     Carry out data augmentation to remove artifacts
10.   end
11.  end
12. end of the transfer learning
13. Obtain the images of the infected regions
14. Compute and analyze the performance of the model using accuracy, precision, recall, and F1-score
15. End

3. Experimental Results and Discussion

  • Precision
Precision is a statistical measure of how accurately a model makes positive predictions. It is calculated as the ratio of true positives (TP) to the sum of true positives and false positives (FP):
Precision = TP/(TP + FP)
  • Recall
Recall, also known as sensitivity or the true positive rate, measures the proportion of actual positive cases that are correctly identified. It is calculated as the ratio of true positives (TP) to the sum of true positives and false negatives (FN):
Recall = TP/(TP + FN)
  • F1-score
The F1-score is a statistical measure that combines precision and recall into a single measure of the model’s performance on a binary task. It is the harmonic mean of precision and recall and is calculated as:
F1 Score = 2 × ((Precision × Recall)/(Precision + Recall))
  • Accuracy
Accuracy measures the overall correctness of the model’s predictions. It is calculated as the ratio of the number of correct predictions (true positives and true negatives) to the total number of predictions:
Accuracy = (TP + TN)/(TP + TN + FP + FN)
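The four formulas above reduce to a few lines of Python; the confusion-matrix counts in the usage line are hypothetical, for illustration only:

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F1-score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy

p, r, f1, acc = classification_metrics(tp=98, tn=90, fp=2, fn=2)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f} accuracy={acc:.3f}")
```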

3.1. Results and Discussion of ETLFOD_model for Brain Dataset

Table 3 presents the performance metrics (precision, recall, F1-score, training accuracy, and testing accuracy) of the different models under two conditions, i.e., without and with denoising. The models evaluated were CNN, DenseNet121, VGG16, ResNet50, and Inception V3. Based on the testing accuracy values, denoising positively impacted most models, except for VGG16. The CNN model improved on all metrics after denoising, while DenseNet121, ResNet50, and Inception V3 also showed higher precision, recall, and F1-score after denoising. However, it is essential to note that the choice of model also played a crucial role in the system’s performance: DenseNet121 performed the best among the models under both conditions.
In DenseNet121 without denoising, for class 0 the precision was 0.8469, the recall was 0.9380, and the F1-score was 0.9153; the training accuracy was 0.9021 and the testing accuracy was 0.9012. For class 1, the precision was 0.9438, the recall was 0.8722, and the F1-score was 0.8936, so the model also performed well in identifying class 1 instances; the training accuracy was 0.8963 and the testing accuracy was 0.8935. With denoising, for class 0 the precision was 0.9346, the recall was 0.9880, and the F1-score was 0.9606. The model accurately identified class 0 instances, an improvement over the condition without denoising; the training accuracy was 0.9509 and the testing accuracy was 0.9546. For class 1, the precision was 0.9836, the recall was 0.9121, and the F1-score was 0.9465, demonstrating excellent performance in identifying class 1 instances; the training accuracy was 0.9535 and the testing accuracy was 0.9544. ResNet50 and Inception V3 also performed well, while CNN and VGG16 did not perform well under the without-denoising condition. The running times for the CNN, DenseNet121, VGG16, ResNet50, and Inception V3 models were 74 s, 35 s, 36 s, 34 s, and 34 s, respectively. The models were trained for 10 epochs with a total of 375 iterations.
VGG16 had relatively low values across all metrics, both with and without denoising. CNN and ResNet50 had similar performances across most metrics, both with and without denoising. Inception V3 performed similarly to CNN and ResNet50 on most metrics, but with slightly better precision and F1-score with denoising. Figure 21 shows the loss evolution and accuracy evolution for the brain MRI image dataset. Figure 22 shows the ROC curve for the brain dataset with a threshold value of 0.05. It can therefore be concluded that combining denoising with a powerful pre-trained deep learning model such as DenseNet121 results in better performance for brain tumor detection in MRI images.

3.2. Experimental Analysis of ETLFOD Model for Lung CT Dataset

Based on the results presented in Table 4, denoising the images before feeding them into the models generally led to higher precision, recall, and F1-scores, as well as higher training and testing accuracy. Among the models tested, DenseNet121 with denoising achieved the highest testing accuracy, 86.41%, followed by Inception V3 with denoising at 83.37%. The worst-performing model was VGG16 without denoising, with a testing accuracy of only 72.03%. All the models benefited from denoising: the precision, recall, and F1-score improved in the denoised models for each class. Regarding the specific models, DenseNet121 performed the best overall, with the highest precision, recall, and F1-score for both classes in the denoised and non-denoised versions. Inception V3 also performed well, with high precision, recall, and F1-score for both classes in the denoised version. The performance of the models varied depending on the preprocessing steps and whether denoising was applied. For example, the best-performing model without denoising was DenseNet121, with an F1-score of 0.8935 for class 0 and 0.9385 for class 1; however, with denoising, the best-performing model was Inception V3, with an F1-score of 0.8098 for class 0 and 0.8108 for class 1. On the other hand, VGG16 performed poorly, with very low precision, recall, and F1-score for class 0 in both the denoised and non-denoised versions, and ResNet50 had a relatively low recall for class 0 in both versions. Denoising therefore benefited the image classification models, but the DenseNet121 and Inception V3 model architectures also had a significant impact on performance. The running times for the CNN, DenseNet121, VGG16, ResNet50, and Inception V3 models were 133 s, 214 s, 743 s, 169 s, and 109 s, respectively. The models were trained for 10 epochs with a total of 220 iterations.
Figure 23 shows the lung CT dataset’s loss and accuracy evolution. Figure 24 shows the ROC curve with a threshold value of 0.05.

3.3. Experimental Analysis of ETLFOD Model for Pneumonia Dataset

Table 5 shows that denoising improved the precision, recall, and F1-score of all the models. The testing accuracy of all the models increased after denoising, except for VGG16 on class 1. The training accuracy also increased, indicating that the models might have overfitted the training data. Comparing the different models, DenseNet121 and Inception V3 performed best after denoising: both showed high precision, recall, and F1-score values for both classes, with high testing accuracy and relatively low overfitting, as shown by the small difference between the training and testing accuracy. ResNet50 performed the worst among all the models, with low precision, recall, and F1-score values and high overfitting. It is also important to note that the performance of the models without denoising varied significantly. Inception V3 performed moderately well in identifying class 1 instances, although with a lower recall than for class 0; the training accuracy was 0.7953 and the testing accuracy was 0.8123. The DenseNet121 model with denoising performed better across all metrics than without denoising; notably, the precision, recall, and F1-score were much higher for both classes. VGG16 performed poorly compared to the other models, with the lowest F1-score values for both classes. In contrast, DenseNet121 and Inception V3 performed relatively well, with high F1-score values and testing accuracy. Overall, denoising significantly improved the performance of the models, and DenseNet121 and Inception V3 were the best-performing models after denoising. The running times for the CNN, DenseNet121, VGG16, ResNet50, and Inception V3 models were 712 s, 181 s, 172 s, 167 s, and 567 s, respectively. The models were trained for 10 epochs with a total of 770 iterations.
Based on the provided data, the DenseNet121 and Inception V3 models appear to have the best performance, with high precision and recall for both classes. Figure 25 shows the loss evolution and accuracy evolution for the pneumonia X-ray image dataset. Figure 26 shows the ROC curve with a threshold value of 0.05.
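For reference, ROC curves such as those in Figures 22, 24 and 26 can be generated from the test labels and predicted probabilities; the sketch below uses illustrative placeholder arrays, not the study’s outputs.

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

y_true  = [0, 0, 1, 1, 0, 1, 1, 0]                   # ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.7, 0.3]  # predicted probabilities

# Compute the false/true positive rates over all decision thresholds.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

plt.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.3f}")
plt.plot([0, 1], [0, 1], linestyle="--")  # chance line
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.legend()
plt.show()
```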
Table 6 gives the obtained results for the paired t-test and McNemar’s test. The analysis was conducted as follows: The paired t-test compared the F1-scores of the models with and without denoising. For the brain dataset, DenseNet121 showed a significant difference (p-value = 0.0492), suggesting that denoising had a statistically meaningful impact on this model’s performance. The other models, like CNN, VGG16, ResNet50, and Inception V3, did not exhibit significant differences, as their p-values were well above 0.05. The lung CT and pneumonia datasets similarly showed no significant differences in F1-scores for most models, except for Inception V3 on the pneumonia dataset, whose p-value (0.011) fell below the 0.05 threshold. This indicates that for some models and datasets denoising can have a considerable effect on the F1-score, but the effect was not consistent across all the models and datasets.
McNemar’s test was used to evaluate the difference in the performance of the models with and without denoising based on the confusion matrices. The test results show that VGG16 had significant differences (p-value ≈ 0) across all the datasets, indicating a strong effect of denoising on this model’s performance. DenseNet121 also showed significant differences in the lung CT and pneumonia datasets (p-value = 0.0442), further emphasizing the impact of denoising. However, for the other models, such as CNN, ResNet50, and Inception V3, the results of McNemar’s test did not indicate significant differences, as their p-values were above 0.05. This suggests that while denoising had a marked impact on certain models like VGG16 and DenseNet121, its influence was less pronounced for others.
The paired t-test and McNemar’s test reveal that denoising has a significant impact on certain models, particularly DenseNet121 and VGG16, but the effect is not uniform across all models and datasets. This highlights the importance of considering model-specific and dataset-specific effects when implementing denoising techniques in medical image analysis.
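For reproducibility, both tests can be run with standard scientific Python libraries. The sketch below, with illustrative placeholder values rather than the study’s data, pairs scipy’s paired t-test on F1-scores with statsmodels’ McNemar test on a 2 × 2 disagreement table.

```python
import numpy as np
from scipy.stats import ttest_rel
from statsmodels.stats.contingency_tables import mcnemar

# Paired t-test: per-class F1-scores without vs. with denoising.
f1_without = np.array([0.9153, 0.8936])   # illustrative values
f1_with    = np.array([0.9606, 0.9465])   # illustrative values
t_stat, p_val = ttest_rel(f1_with, f1_without)
print(f"paired t-test: t = {t_stat:.4f}, p = {p_val:.4f}")

# McNemar's test: 2x2 table of paired prediction outcomes,
# [[both correct, only model A correct], [only model B correct, both wrong]].
table = np.array([[520, 12],
                  [25, 43]])              # illustrative counts
result = mcnemar(table, exact=False, correction=True)
print(f"McNemar: chi2 = {result.statistic:.4f}, p = {result.pvalue:.4f}")
```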

4. Conclusions and Future Enhancement

In conclusion, the ETLFOD model is a powerful technique that can significantly improve the performance of denoising algorithms by leveraging the knowledge of pre-trained deep learning models. By fine-tuning these models on specific denoising tasks, transfer learning can help overcome the limitations of traditional denoising methods and achieve state-of-the-art performance on a wide range of image types and noise levels. The denoising of the brain MRI dataset resulted in high precision, recall, and F1-score values of 0.9836, 0.9885, and 0.9606, respectively. The training and testing accuracy of 0.9554 suggests that the model generalizes well to new data beyond the training set. In the pneumonia dataset, DenseNet121 performed better, with a precision of 0.9746, a recall of 0.9844, and an F1-score of 0.9385, and it had a high training and testing accuracy of 0.9524. The denoising of the lung CT dataset using the DenseNet121 architecture resulted in moderate precision, recall, and F1-score values, as well as high training and testing accuracy. The precision of 0.9047 indicates a moderately high proportion of true positive predictions among all positive predictions, while the recall of 0.9333 indicates a relatively high proportion of true positive predictions among all actual positive cases. The F1-score of 0.8750, the harmonic mean of precision and recall, shows the overall moderate performance of the model. These promising results suggest that denoising transfer learning using the DenseNet121 architecture is a suitable option. Because the training and testing accuracies remain close, overfitting is avoided. In the future, multi-domain transfer learning can be implemented. While our proposed model demonstrates significant potential in enhancing the accuracy of medical image analysis, it is essential to acknowledge the challenges and limitations that may arise in real-world clinical applications. Implementing this model in clinical settings could encounter hurdles such as varying image quality across different medical facilities, the need for large, high-quality labeled datasets for effective training, and the computational resources required for real-time processing. Additionally, the model’s performance may vary depending on the specific medical context, such as different imaging modalities or diseases. Future work will focus on addressing these challenges and further validating the model in diverse clinical environments to ensure its robustness and practicality in real-world applications.

Author Contributions

Conceptualization, A.A.; methodology, A.A., V.S., and D.J.; software, D.J.; validation, A.A., V.S., and D.J.; formal analysis, A.A., V.S., and D.J.; investigation, V.S.; resources, V.S.; data curation, A.A. and D.J.; writing—original draft preparation, A.A. and S.D.; writing—review and editing, A.A. and S.D.; visualization, A.A.; supervision, S.D.; project administration, S.D.; funding acquisition, S.D. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The datasets used in this study are publicly available on Kaggle.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Deng, Y.; Ding, K.; Ouyang, C.; Luo, Y.; Tu, Y.; Fu, J.; Wang, W.; Du, Y. Wavelets and curvelets transform for image denoising to damage identification of thin plate. Results Eng. 2023, 17, 100837.
  2. Tian, C.; Zheng, M.; Zuo, W.; Zhang, B.; Zhang, Y.; Zhang, D. Multi-stage image denoising with the wavelet transform. Pattern Recognit. 2023, 134, 109050.
  3. Jiang, L.; Huang, J.; Lv, X.G.; Liu, J. Alternating direction method for the high-order total variation-based Poisson noise removal problem. Numer. Algorithms 2015, 69, 495–516.
  4. Liu, K.; Tian, Y. Research and analysis of deep learning image enhancement algorithm based on fractional differential. Chaos Solitons Fractals 2020, 131, 109507.
  5. Jin, Y.; Jiang, X.; Jiang, W. An image denoising approach based on adaptive non-local total variation. J. Vis. Commun. Image Represent. 2019, 65, 102661.
  6. Huang, X.; Li, S.; Gao, S. Applying a modified wavelet shrinkage filter to improve cryo-electron microscopy imaging. J. Comput. Biol. 2018, 25, 1050–1058.
  7. Xu, J.; Feng, A.; Hao, Y.; Zhang, X.; Han, Y. Image deblurring and denoising by an improved variational model. AEU-Int. J. Electron. Commun. 2016, 70, 1128–1133.
  8. Gupta, A.; Kumar, S. Generalized framework for the design of adaptive fractional-order masks for image denoising. Digit. Signal Process. 2022, 121, 103305.
  9. Caputo, M.; Fabrizio, M. A new definition of fractional derivative without singular kernel. Prog. Fract. Differ. Appl. 2015, 1, 73–85.
  10. Atangana, A.; Baleanu, D. New fractional derivatives with non-local and non-singular kernel: Theory and application to heat transfer model. arXiv 2016, arXiv:1602.03408.
  11. Ma, Y.; Fan, A.; He, J.; Nelakurthi, A.R.; Maciejewski, R. A visual analytics framework for explaining and diagnosing transfer learning processes. IEEE Trans. Vis. Comput. Graph. 2020, 27, 1385–1395.
  12. Abirami, A.; Prakash, P.; Thangavel, K. Fractional diffusion equation-based image denoising model using CN-GL scheme. Int. J. Comput. Math. 2018, 95, 1222–1239.
  13. Vidhushavarshini, S.; Sathiyabhama, B. A Comparison of Classification Techniques on Thyroid Detection Using J48 and Naive Bayes Classification Techniques. In Proceedings of the International Conference on Intelligent Computing Systems (ICICS), Sona College of Technology, Salem, Tamil Nadu, India, 15–16 December 2017.
  14. Shroff, A.D.; Patidar, K.; Kushwah, R. A survey and analysis based on image denoising method. Int. J. Adv. Technol. Eng. Explor. 2018, 5, 182–186.
  15. Yoon, T.; Kim, S.W.; Byun, H.; Kim, Y.; Carter, C.D.; Do, H. Deep learning-based denoising for fast time-resolved flame emission spectroscopy in high-pressure combustion environment. Combust. Flame 2023, 248, 112583.
  16. Zhou, L.; Sun, G.; Li, Y.; Li, W.; Su, Z. Point cloud denoising review: From classical to deep learning-based approaches. Graph. Models 2022, 121, 101140.
  17. Solovyeva, E.; Abdullah, A. Dual Autoencoder Network with Separable Convolutional Layers for Denoising and Deblurring Images. J. Imaging 2022, 8, 250.
  18. Jung, H.; Han, G.; Jung, S.J.; Han, S.W. Comparative study of deep learning algorithms for atomic force microscopy image denoising. Micron 2022, 161, 103332.
  19. Guo, Y.; Peng, S.; Du, W.; Li, D. Denoising and wavefield separation method for DAS VSP via deep learning. J. Appl. Geophys. 2023, 210, 104946.
  20. Nishii, T.; Kobayashi, T.; Saito, T.; Kotoku, A.; Ohta, Y.; Kitahara, S.; Umehara, K.; Ota, J.; Horinouchi, H.; Morita, Y.; et al. Deep Learning-based Post Hoc CT Denoising for the Coronary Perivascular Fat Attenuation Index. Acad. Radiol. 2023, 30, 2505–2513.
  21. Luo, M.; Xu, Z.; Ye, Z.; Liang, Z.; Xiao, H.; Li, Y.; Li, Z.; Zhu, Y.; He, Y.; Zhuo, Y. Deep learning for anterior segment OCT angiography automated denoising and vascular quantitative measurement. Biomed. Signal Process. Control 2023, 83, 104660.
  22. Jaganathan, D.; Balasubramaniam, S.; Sureshkumar, V.; Dhanasekaran, S. Revolutionizing Breast Cancer Diagnosis: A Concatenated Precision through Transfer Learning in Histopathological Data Analysis. Diagnostics 2024, 14, 422.
  23. Rajadurai, S.; Perumal, K.; Ijaz, M.F.; Chowdhary, C.L. PrecisionLymphoNet: Advancing Malignant Lymphoma Diagnosis via Ensemble Transfer Learning with CNNs. Diagnostics 2024, 14, 469.
  24. Geldof, F.; Pruijssers, C.W.; Jong, L.J.S.; Veluponnar, D.; Ruers, T.J.; Dashtbozorg, B. Tumor Segmentation in Colorectal Ultrasound Images Using an Ensemble Transfer Learning Model: Towards Intra-Operative Margin Assessment. Diagnostics 2023, 13, 3595.
  25. Tian, C.; Fei, L.; Zheng, W.; Xu, Y.; Zuo, W.; Lin, C.W. Deep learning on image denoising: An overview. Neural Netw. 2020, 131, 251–275.
  26. Pan, S.J.; Yang, Q. A survey on transfer learning. IEEE Trans. Knowl. Data Eng. 2010, 22, 1345–1359.
  27. Lu, J.; Behbood, V.; Hao, P.; Zuo, H.; Xue, S.; Zhang, G. Transfer learning using computational intelligence: A survey. Knowl. Based Syst. 2015, 80, 14–23.
  28. Abirami, A.; Prakash, P.; Ma, Y.K. Variable-Order Fractional Diffusion Model-Based Medical Image Denoising. Math. Probl. Eng. 2021, 2021, 8050017.
Figure 1. Workflow of the proposed model.
Figure 2. Denoised images of brain MRI images.
Figure 3. Noised and denoised images of brain MRI.
Figure 4. Pixel distribution of denoised images of brain MRI.
Figure 5. Noised and denoised images of lung CT.
Figure 6. Noised and denoised images of lung CT.
Figure 7. Pixel distribution of the denoised images of lung CT.
Figure 8. Sample noised and denoised images of pneumonia X-ray.
Figure 9. Noised and denoised pneumonia X-ray images.
Figure 10. Pixel distribution of the denoised images of pneumonia X-ray images.
Figure 11. Convolution neural network.
Figure 12. ReLU activation function.
Figure 13. Enhanced max pool layers.
Figure 14. Padding.
Figure 15. Dense layer.
Figure 16. Sigmoid activation function.
Figure 17. Architecture of DenseNet121. Source: https://paperswithcode.com/lib/torchvision/densenet, accessed on 14 August 2024.
Figure 18. Architecture of VGG16.
Figure 19. Architecture of ResNet50.
Figure 20. Architecture of Inception V3.
Figure 21. Loss evolution and accuracy evolution for brain dataset.
Figure 22. ROC for brain dataset.
Figure 23. Loss evolution and accuracy evolution for lung CT dataset.
Figure 24. ROC for lung CT dataset.
Figure 25. Loss evolution and accuracy evolution for the pneumonia dataset.
Figure 26. ROC for pneumonia dataset.
Table 1. Summary of the dataset.

| Details | Type of Dataset | Total Images | Malignant/Infected | Benign/Non-Infected | Resolution |
|---|---|---|---|---|---|
| Brain | MRI | 3762 | 1683 | 2079 | 256 × 256 |
| COVID-19 Lung (SARS-CoV-2) | CT | 2074 | 1130 | 944 | 512 × 512 |
| Pneumonia (RT-PCR) | X-ray | 5856 | 4273 | 1583 | 512 × 512 |
Table 2. Result of image denoising for brain (MRI), lung (CT) and pneumonia (X-ray) datasets (with noise level σ = 10, α = 1.2).

| Images | PSNR (Integer Order Model) | MSE (Integer Order Model) | Avg. Time (s)/10 Iterations | PSNR (EFOD Model) | MSE (EFOD Model) | Avg. Time (s)/10 Iterations |
|---|---|---|---|---|---|---|
| Brain (MRI) | 26.4821 | 113.3910 | 32 | 41.1533 | 110.8588 | 5 |
| Lung (CT) | 28.5263 | 61.0496 | 8 | 48.8827 | 53.7972 | 6 |
| Pneumonia (X-ray) | 27.4821 | 68.4190 | 4 | 46.3242 | 72.1476 | 6 |
Table 3. Result of transfer learning for brain dataset without and with denoising.

| Preprocess | Models | Class | Precision | Recall | F1-Score | Training Accuracy | Testing Accuracy |
|---|---|---|---|---|---|---|---|
| Without Denoising | CNN | 0 | 0.7351 | 0.7000 | 0.7224 | 0.6986 | 0.7214 |
| | CNN | 1 | 0.6344 | 0.6840 | 0.6423 | 0.6235 | 0.6923 |
| | DenseNet121 | 0 | 0.8469 | 0.9380 | 0.9153 | 0.9021 | 0.9012 |
| | DenseNet121 | 1 | 0.9438 | 0.8722 | 0.8936 | 0.8963 | 0.8935 |
| | VGG16 | 0 | 0.3562 | 0.0032 | 0.0086 | 0.4153 | 0.4123 |
| | VGG16 | 1 | 0.3125 | 0.8536 | 0.4025 | 0.4235 | 0.4263 |
| | ResNet50 | 0 | 0.8126 | 0.8123 | 0.8365 | 0.8102 | 0.7962 |
| | ResNet50 | 1 | 0.8032 | 0.7965 | 0.7825 | 0.7693 | 0.7922 |
| | Inception V3 | 0 | 0.8235 | 0.8425 | 0.8254 | 0.8125 | 0.7953 |
| | Inception V3 | 1 | 0.8123 | 0.7236 | 0.7856 | 0.7953 | 0.8123 |
| With Denoising | CNN | 0 | 0.7758 | 0.7500 | 0.7627 | 0.7386 | 0.7719 |
| | CNN | 1 | 0.6947 | 0.7242 | 0.7091 | 0.7353 | 0.7500 |
| | DenseNet121 | 0 | 0.9346 | 0.9880 | 0.9606 | 0.9509 | 0.9546 |
| | DenseNet121 | 1 | 0.9836 | 0.9121 | 0.9465 | 0.9535 | 0.9544 |
| | VGG16 | 0 | 0.4000 | 0.0047 | 0.0094 | 0.4566 | 0.4486 |
| | VGG16 | 1 | 0.4389 | 0.9909 | 0.6083 | 0.4978 | 0.4386 |
| | ResNet50 | 0 | 0.8506 | 0.8952 | 0.8723 | 0.8684 | 0.8533 |
| | ResNet50 | 1 | 0.8571 | 0.8000 | 0.8275 | 0.8476 | 0.8533 |
| | Inception V3 | 0 | 0.8397 | 0.8857 | 0.8621 | 0.8669 | 0.8413 |
| | Inception V3 | 1 | 0.8436 | 0.7848 | 0.8131 | 0.8352 | 0.8413 |
Table 4. Result of transfer learning for lung CT dataset without and with denoising.

| Preprocess | Models | Class | Precision | Recall | F1-Score | Training Accuracy | Testing Accuracy |
|---|---|---|---|---|---|---|---|
| Without Denoising | CNN | 0 | 0.7952 | 0.6000 | 0.6915 | 0.8123 | 0.7123 |
| | CNN | 1 | 0.6935 | 0.7136 | 0.7153 | 0.6941 | 0.7164 |
| | DenseNet121 | 0 | 0.8950 | 0.7241 | 0.7952 | 0.9382 | 0.8461 |
| | DenseNet121 | 1 | 0.7953 | 0.8951 | 0.8246 | 0.8235 | 0.8125 |
| | VGG16 | 0 | 0.0596 | 0.1203 | 0.2238 | 0.7953 | 0.5947 |
| | VGG16 | 1 | 0.5523 | 1.5230 | 0.7140 | 0.7247 | 0.5932 |
| | ResNet50 | 0 | 0.8906 | 0.4610 | 0.6124 | 0.8423 | 0.7123 |
| | ResNet50 | 1 | 0.6513 | 0.9123 | 0.7459 | 0.7956 | 0.7214 |
| | Inception V3 | 0 | 0.8935 | 0.5960 | 0.7231 | 0.8961 | 0.8542 |
| | Inception V3 | 1 | 0.7126 | 0.9120 | 0.7956 | 0.7952 | 0.7912 |
| With Denoising | CNN | 0 | 0.8225 | 0.6500 | 0.7262 | 0.8806 | 0.7545 |
| | CNN | 1 | 0.7513 | 0.8833 | 0.8122 | 0.7840 | 0.7666 |
| | DenseNet121 | 0 | 0.9042 | 0.7600 | 0.8267 | 0.9384 | 0.8545 |
| | DenseNet121 | 1 | 0.8229 | 0.9333 | 0.8750 | 0.8645 | 0.8641 |
| | VGG16 | 0 | 1.0000 | 0.1700 | 0.2908 | 0.8081 | 0.6227 |
| | VGG16 | 1 | 0.5913 | 1.0000 | 0.7434 | 0.7967 | 0.6227 |
| | ResNet50 | 0 | 0.9104 | 0.5100 | 0.6536 | 0.9013 | 0.7545 |
| | ResNet50 | 1 | 0.7010 | 0.9583 | 0.8099 | 0.8096 | 0.7964 |
| | Inception V3 | 0 | 0.9535 | 0.6100 | 0.7439 | 0.9121 | 0.8090 |
| | Inception V3 | 1 | 0.7500 | 0.9750 | 0.8478 | 0.8433 | 0.8090 |
Table 5. Result of transfer learning for pneumonia dataset without and with denoising.

| Preprocess | Models | Class | Precision | Recall | F1-Score | Training Accuracy | Testing Accuracy |
|---|---|---|---|---|---|---|---|
| Without Denoising | CNN | 0 | 0.8923 | 0.8213 | 0.7903 | 0.8962 | 0.8823 |
| | CNN | 1 | 0.8532 | 0.8952 | 0.8862 | 0.8752 | 0.8825 |
| | DenseNet121 | 0 | 0.9123 | 0.7852 | 0.8563 | 0.8932 | 0.8752 |
| | DenseNet121 | 1 | 0.8742 | 0.9236 | 0.8932 | 0.8652 | 0.8825 |
| | VGG16 | 0 | 0.6236 | 0.8236 | 0.7412 | 0.7236 | 0.7923 |
| | VGG16 | 1 | 0.8752 | 0.6236 | 0.7236 | 0.7921 | 0.7203 |
| | ResNet50 | 0 | 0.7132 | 0.1102 | 0.2234 | 0.7214 | 0.5936 |
| | ResNet50 | 1 | 0.5963 | 0.9536 | 0.6931 | 0.5936 | 0.5532 |
| | Inception V3 | 0 | 0.6325 | 0.9125 | 0.7536 | 0.7512 | 0.7452 |
| | Inception V3 | 1 | 0.9125 | 0.6362 | 0.7963 | 0.7852 | 0.7953 |
| With Denoising | CNN | 0 | 0.9233 | 0.8656 | 0.8935 | 0.9299 | 0.9130 |
| | CNN | 1 | 0.9085 | 0.9488 | 0.9282 | 0.9109 | 0.9142 |
| | DenseNet121 | 0 | 0.9746 | 0.8406 | 0.9026 | 0.9524 | 0.9195 |
| | DenseNet121 | 1 | 0.8967 | 0.9844 | 0.9385 | 0.9206 | 0.9246 |
| | VGG16 | 0 | 0.6741 | 0.9518 | 0.7898 | 0.7944 | 0.8287 |
| | VGG16 | 1 | 0.9520 | 0.6755 | 0.7908 | 0.8137 | 0.7993 |
| | ResNet50 | 0 | 0.7727 | 0.15937 | 0.2642 | 0.7720 | 0.6311 |
| | ResNet50 | 1 | 0.6178 | 0.96666 | 0.7538 | 0.6311 | 0.6311 |
| | Inception V3 | 0 | 0.6941 | 0.97187 | 0.8098 | 0.8103 | 0.8103 |
| | Inception V3 | 1 | 0.9720 | 0.69555 | 0.8108 | 0.8103 | 0.8337 |
Table 6. Paired t-test and McNemar’s test analysis.

Paired t-test results:

| Model | Brain Dataset (t-statistic / p-value) | Lung CT Dataset (t-statistic / p-value) | Pneumonia Dataset (t-statistic / p-value) |
|---|---|---|---|
| CNN | 4.0415 / 0.1544 | 2.7902 / 0.2191 | 3.5514 / 0.1747 |
| DenseNet121 | 12.9211 / 0.0492 | 2 / 0.2952 | 2.1307 / 0.2794 |
| VGG16 | 1.0078 / 0.4975 | 1.0865 / 0.4736 | 4.8403 / 0.1297 |
| ResNet50 | 8.7826 / 0.0722 | 2.3244 / 0.2586 | 2.1316 / 0.2793 |
| Inception V3 | 6.9783 / 0.0906 | 4.3097 / 0.1451 | 57.6667 / 0.011 |

McNemar’s test results:

| Model | Brain Dataset (chi2-statistic / p-value) | Lung CT Dataset (chi2-statistic / p-value) | Pneumonia Dataset (chi2-statistic / p-value) |
|---|---|---|---|
| CNN | 0.64 / 0.4237 | 1.0667 / 0.3017 | 1.0667 / 0.3017 |
| DenseNet121 | 1.0667 / 0.3017 | 4.05 / 0.0442 | 4.05 / 0.0442 |
| VGG16 | 56.7364 / 0 | 52.1524 / 0 | 52.1524 / 0 |
| ResNet50 | 0.4571 / 0.499 | 0.3556 / 0.551 | 0.3556 / 0.551 |
| Inception V3 | 0.3556 / 0.551 | 2.7 / 0.1003 | 2.7 / 0.1003 |