Article

Revolutionizing Prostate Whole-Slide Image Super-Resolution: A Comparative Journey from Regression to Generative Adversarial Networks

1 Department of E&C, KLS Gogte Institute of Technology, Belagavi 590008, India
2 Department of Computer Science and Engineering, KLE Society’s Dr. M. S. Sheshgiri College of Engineering and Technology, Belagavi 590008, India
3 Department of Urology, JN Medical College, KLE Academy of Higher Education and Research, Belagavi 590010, India
4 Department of Electrical Engineering and Computer Science, University of Stavanger, 4021 Stavanger, Norway
* Authors to whom correspondence should be addressed.
Uro 2024, 4(3), 89-103; https://doi.org/10.3390/uro4030007
Submission received: 30 April 2024 / Revised: 18 June 2024 / Accepted: 25 June 2024 / Published: 27 June 2024

Abstract

Microscopic and digital whole-slide images (WSIs) often suffer from limited spatial resolution, hindering accurate pathological analysis and cancer diagnosis. Improving the spatial resolution of these pathology images is crucial, as it can enhance the visualization of fine cellular and tissue structures, leading to more reliable and precise cancer detection and diagnosis. This paper presents a comprehensive comparative study on super-resolution (SR) reconstruction techniques for prostate WSIs, exploring a range of machine learning, deep learning, and generative adversarial network (GAN) algorithms. The algorithms investigated include regression, sparse learning, principal component analysis, bicubic interpolation, multi-support vector neural networks, an SR convolutional neural network, and an autoencoder, along with advanced SRGAN-based methods. The performance of these algorithms was evaluated using a suite of metrics: the peak signal-to-noise ratio (PSNR), structural similarity index metric (SSIM), root-mean-squared error, mean absolute error, and mean structural similarity index metric (MSSIM). The study was conducted on the SICAPv2 prostate WSI dataset. The results demonstrate that the SRGAN algorithm outperformed the other algorithms, achieving the highest PSNR of 26.47, an SSIM of 0.85, and an MSSIM of 0.92 at 4× magnification of the input LR image while preserving image quality and fine details. The application of the SRGAN therefore offers a budget-friendly answer to the high cost of acquiring high-resolution pathology images, enhancing cancer diagnosis accuracy.

1. Introduction

The prostate gland, crucial for male reproductive health, is susceptible to cancer, necessitating meticulous diagnostic approaches [1]. Histopathology plays a pivotal role in prostate cancer diagnosis, examining tissue samples obtained through biopsies to identify cancerous cells [1,2]. This test, including transrectal ultrasound-guided biopsies and molecular pathology techniques, provides crucial insights into the cancer’s nature and guides personalized treatment strategies [3,4]. Moreover, the significance of resolution in imaging histopathology slides is well known, as higher resolution (HR) enables more precise visualization of tumor characteristics and facilitates accurate diagnosis [5]. Clinical tissue tests, complemented by HR imaging, are indispensable for monitoring treatment response and disease progression in prostate cancer management.
A whole-slide image (WSI) is a gigapixel image consisting of hundreds of thousands of digital values called pixels, which serve as a crucial component in highlighting cellular structures [6]. Pixels in WSIs may represent gray levels, colors, or intensities. Spatial resolution, determined by the pixel count in rows and columns, ensures sharpness and detail in these images. However, WSI acquisition sometimes yields lower-resolution (LR) images due to hardware limitations, such as sensor size, scanning profile, or processing capabilities; in particular, older scanner hardware and its constraints make it hard to acquire high-resolution (HR) WSIs that would allow a complete examination of cellular components and structures, posing challenges for a seamless digital pathology workflow. To address this, super-resolution (SR) and image fusion techniques are employed. SR algorithms enhance resolution computationally, improving detail without hardware upgrades, while image fusion combines multiple images to yield composite outputs with superior detail [7]. SR algorithms can intelligently infer and fill in missing details, producing an image with finer clarity and enhanced visual quality [7]. They are usually preferred for their cost-effectiveness and feasibility [5], leveraging computational power to refine existing images and offering a scalable solution for enhancing image quality across platforms. This accessible approach bypasses costly hardware upgrades, providing clearer and more accurate visual data for a seamless digital pathology (DP) workflow.
The histological details, like cellular structures and glandular shapes, sometimes need to be magnified to be visible; higher spatial resolution therefore becomes important as display screens grow larger and viewing distances shrink. Enhancing the spatial resolution of microscopic images and WSIs through SR methods is thus attractive, as it can make large resources of LR data available and vital for developing artificial intelligence (AI)-based automated diagnosis systems. This paper investigates various SR techniques aimed at enhancing the resolution of prostate WSIs, with the goal of achieving better cellular visualization. Existing machine learning (ML) models and SR convolutional neural networks (CNNs) have been shown to fall short of producing highly detailed SR WSIs, as they lack a generative architecture in which a discriminator guides image synthesis [7]. The effective spatial resolution will also depend on how the compression system interacts with the displayed content. The primary objective of image SR is to enhance image quality by creating high-quality outputs from LR inputs. This process significantly improves visual perception, yielding finer details, softer textures, and enhanced clarity, by learning patterns and relationships from a dataset of paired LR and HR images. In DP in particular, SR techniques can play an important role in facilitating histological image analysis and more precise diagnoses. Figure 1 demonstrates how SR is applied to obtain a magnified image and how it fills in the pixels in between, so that we perceive enhanced WSIs. This paper emphasizes the technical aspects of SR algorithms for WSIs, using the dataset primarily to support our research objectives.

2. Preliminaries

SR can be performed with a variety of methods, including basic algorithms like bilinear interpolation, sparse coding, optimization, nearest neighbors, and frequency domain transformation, as well as advanced algorithms like the SR-CNN, autoencoder (AE), and SR generative adversarial network (GAN) [7,8]. Some of these basic algorithms are discussed below; the GAN architecture is introduced later:
Bilinear interpolation: $I_{HR}(x, y) = a \cdot I_{LR}(x_1, y_1) + b \cdot I_{LR}(x_2, y_1) + c \cdot I_{LR}(x_1, y_2) + d \cdot I_{LR}(x_2, y_2)$
where
  • $I_{HR}(x, y)$ is the pixel value of the HR image at coordinates $(x, y)$,
  • $I_{LR}(x_i, y_j)$ is the pixel value of the LR image at coordinates $(x_i, y_j)$,
  • $a, b, c, d$ are the interpolation coefficients, determined by the relative position of $(x, y)$.
In the bilinear interpolation algorithm, the four neighboring image points are weighted by the coefficients (a, b, c, d), and their interpolated value is computed to increase the resolution [9]; a minimal sketch is given below.
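For illustration, here is a minimal NumPy sketch of bilinear upscaling consistent with the equation above; the function name, the grayscale assumption, and the integer scale factor are our own illustrative choices rather than details from [9].

import numpy as np

def bilinear_upscale(lr: np.ndarray, factor: int = 2) -> np.ndarray:
    # Upscale a 2D grayscale image by bilinear interpolation.
    h, w = lr.shape
    H, W = h * factor, w * factor
    hr = np.empty((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            fy, fx = y / factor, x / factor          # map back to the LR grid
            y1, x1 = int(fy), int(fx)
            y2, x2 = min(y1 + 1, h - 1), min(x1 + 1, w - 1)
            dy, dx = fy - y1, fx - x1
            # The four coefficients a, b, c, d from the equation above.
            a, b = (1 - dx) * (1 - dy), dx * (1 - dy)
            c, d = (1 - dx) * dy, dx * dy
            hr[y, x] = (a * lr[y1, x1] + b * lr[y1, x2]
                        + c * lr[y2, x1] + d * lr[y2, x2])
    return hr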
Sparse coding: $\min_{D, X} \; \frac{1}{2} \left\| Y - DX \right\|_F^2 + \lambda \left\| X \right\|_1$
where Y is the input data, D is the dictionary, X is the sparse representation, and λ controls the sparsity [10].
Optimization: $\min_{x} f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \dots, m$
where f ( x ) is the objective function and g i ( x ) are the constraint functions.
Here, the goal is to find the input value x so that we obtain the least f(x). Using inequality constraints, the algorithm iteratively adjusts x to minimize the objective function by adhering to the constraint function. The trade-off in this algorithm is that it balances between minimizing f(x) and satisfying the constraints.
Nearest neighbors: $\arg\min_{i} \left\| q - x_i \right\|$
where q is the query point and x i is the data point.
This algorithm finds the data point $x_i$ closest to the query $q$, i.e., the index $i$ minimizing the distance $\| q - x_i \|$. The trade-off of nearest neighbors is that it can be sensitive to outliers and noisy data.
Frequency domain transformations: $\mathcal{F}\{f(t)\} = F(\omega) = \int_{-\infty}^{\infty} f(t) \, e^{-j\omega t} \, dt$
where f ( t ) is the signal in the time domain, F ( ω ) is its Fourier transform, and ω is the angular frequency.

General Super-Resolution Block Diagram

Figure 2 below shows the generalized block diagram of the SRGAN. The image produced by the generator block is normally obtained by sampling from a noise distribution and processing the sample with the function G(z).
The real input data, i.e., the WSI, are fed into the discriminator block, D, which treats them as the benchmark ground truth for comparison against the generated image. Generally, the generator and discriminator blocks are multilayered neural networks, with parameters $\theta_G$ and $\theta_D$, respectively.
GANs [11] can be used to generate SR images from a training dataset and noise. From Figure 2, we can see that the GAN is trained with histopathological images, while the generator tries to generate fake images as its output. The discriminator tries to distinguish the generated images by comparing them with the real ground truth images. Finally, there is the classifier, which helps decide whether the image, after passing through the discriminator, is super-resolved or not. In the case of SR, it can be a binary classifier: if the image obtained after the discriminator stage is not satisfactory, it is sent back to the generator as noise, so that the generator can improve the quality of its image generation. The final output from the classifier is up-scaled 4× relative to the input image. The algorithm can be explained mathematically as follows (a minimal training-loop sketch follows the list):
  • The generator aims to minimize the probability that the discriminator correctly classifies the generated data as fake:
    $\min_{G} \; \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]$;
  • The generator’s update involves taking the gradient with respect to its parameters θ G :
    $\nabla_{\theta_G} \, \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]$;
  • The discriminator aims to maximize the probability of correctly classifying the real data and the probability of correctly classifying the generated data as fake:
    $\max_{D} \; \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right] + \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]$;
  • The discriminator’s update involves taking the gradient with respect to its parameters θ D :
    $\nabla_{\theta_D} \left( \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right] + \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right] \right)$
    These gradients are then used in an optimization algorithm to update the parameters of the discriminator and generator in an alternating fashion until convergence;
  • The overall objective of the GAN is a min–max game:
    $\min_{G} \max_{D} \; \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right] + \mathbb{E}_{z \sim p_z(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]$
    where
    $G$ is the generator,
    $D$ is the discriminator,
    $p_{\mathrm{data}}(x)$ is the distribution of the real data, and
    $p_z(z)$ is the distribution of the noise.
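To make the alternating updates concrete, the following is a minimal PyTorch sketch of one training step under this min–max objective. The two small fully connected networks are placeholders rather than the SRGAN architecture of Section 4, the generator update uses the common non-saturating form of the loss, and all names and sizes here are illustrative assumptions.

import torch
import torch.nn as nn

# Placeholder networks; any generator/discriminator pair fits this loop.
G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784))        # G(z)
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))  # D(x), raw logit

opt_G = torch.optim.Adam(G.parameters(), lr=1e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()  # implements the log D(x) and log(1 - D(G(z))) terms

def train_step(real: torch.Tensor):
    batch = real.size(0)
    z = torch.randn(batch, 100)  # z ~ p_z(z)

    # Discriminator update: ascend E[log D(x)] + E[log(1 - D(G(z)))].
    opt_D.zero_grad()
    loss_D = bce(D(real), torch.ones(batch, 1)) + \
             bce(D(G(z).detach()), torch.zeros(batch, 1))  # detach: freeze G
    loss_D.backward()
    opt_D.step()

    # Generator update: non-saturating surrogate for min E[log(1 - D(G(z)))].
    opt_G.zero_grad()
    loss_G = bce(D(G(z)), torch.ones(batch, 1))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()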

3. Related Work

The evolution of SR techniques from basic methods like bicubic interpolation to advanced mathematical models such as the SR-CNN, AE, and GANs underscores the continuous research to improve image detail and quality. This is particularly relevant in fields that demand high precision, like histopathology. It is clear that the intersection of domain knowledge and mathematical innovation is pivotal in obtaining superior SR outcomes.
The journey to enhance image resolution began with traditional methods such as bicubic interpolation. However, this technique produced larger but overly interpolated and blurry images, lacking the sharpness needed for histopathological images. Akhtar et al. [12] addressed this issue by combining bicubic interpolation with a 2D interpolation filter: bicubic interpolation creates an HR image, and the 2D filter refines it by considering local statistics and geometry, significantly improving high-frequency details compared to basic bicubic interpolation. To overcome bicubic interpolation’s limitations, Liu et al. [13] introduced a PCA-based approach, emphasizing pattern analysis and improving image quality for human-focused applications. Their work explored multiple PCA extensions, such as PPCA, KPCA, MDPCA, and RPCA, and analyzed their use for pattern analysis in SR [14,15].
To counter blurring effects from bicubic interpolation filters, Tai et al. [16] introduced spatial sharpening filters. These filters strategically enhance image sharpness and clarity by exploiting self-similarity, employing linear regression to adaptively reconstruct models that improve the visual quality of scaled-up images. Building on these advancements, Yang et al. [17] proposed an image SR technique rooted in sparse signal representation. Inspired by the idea of representing image patches as sparse linear combinations of dictionary elements, Zhang et al. [18] obtained a sparse representation for each patch within the LR input images. Their algorithm trains dictionaries over LR images, computes the mean pixel values, and solves an optimization problem that minimizes the difference between the generated HR image and the desired result; the output is an SR image. This line of work is a testament to the evolution of image SR, in which advanced mathematical approaches, like sparse signal representation, are harnessed to achieve ever higher-resolution images.
In the medical image SR domain, various innovative techniques have emerged. Gavade et al. [19] presented a hybrid model combining support vector regression and a multi-variate statistical neural network (MVSNN), optimized using the DolLion algorithm, which improves convergence and solution quality. Zhu et al. [20] proposed a fast single-image SR model based on self-example learning and sparse representation, which restores features while suppressing artifacts. El-Shafai et al. [21] introduced a CNN framework tailored to WSIs [22], particularly tumor-related ones, using patch extraction, non-linear mapping, and reconstruction to upscale images, highlighting the transformative potential of deep learning (DL). Chen et al. [23] presented the 3D densely connected super-resolution network (DCSRN) for restoring high-resolution features in structural brain MRI images, surpassing traditional methods. Bychkov et al. [24] proposed a fusion of convolutional and recurrent architectures for predicting colorectal cancer from tumor tissue samples, showing the potential of DL in extracting prognostic information directly from medical images.
Gao et al. [25] introduced a novel deep network model tailored to medical image SR reconstruction, capitalizing on the repetitive structure and black borders prevalent in medical images. Departing from the SR-CNN architecture, the model incorporates a secondary convolution layer for enhanced feature representation and overlapping pooling layers to emphasize critical features and introduces a link layer connecting the second pooling layer to the reconstruction layer for combined local and global feature utilization. The experimental results demonstrated a notable average PSNR improvement of 2.1 dB, 0.6 dB, and 1 dB compared to the original SR-CNN, showcasing superior performance over other CNN algorithms.
Gu et al. [26] presented MedSRGAN, a specialized approach for medical images. Their generator network combines a CNN and a region-weighted multi-attention network (RWMAN) to emphasize meaningful regions in medical images. Multi-task loss during training enhances realistic patterns in super-resolved medical images. Mahapatra et al. [27] proposed progressive GANs for multi-stage SR, employing a triplet loss to generate high-resolution images. Oyelade et al. [28] introduced ROImammoGAN, tailored to ROI-based digital mammograms, addressing aspects like distortion and abnormalities. Iqbal et al. [29] advocated GANs over traditional DL algorithms to overcome limited medical data, showcasing the MI-GAN for retinal images and segmented masks, yielding promising Dice coefficients. The application of a multi-scale GAN-based model and a mixed-attention GAN in the restoration of image details for diagnosis showed promising results. PathSRGAN, a multi-supervised SR model utilizing GANs, has demonstrated success in enhancing the resolution of cytopathological images, further highlighting the significance of GANs in the field.
In summary, these advancements in SR techniques, from mathematical innovations to deep learning models like specialized GANs, exemplify the commitment to delivering high-quality super-resolved medical images. The journey from traditional methods to advanced machine learning reflects the increasing role of mathematics and AI in enhancing image quality for diagnosis and analysis. Table 1 summarizes the literature on GAN-based and related SR models and the results achieved by the adopted models.

4. Methods and Materials

4.1. Super-Resolution Using Machine Learning and Deep Learning

ML algorithms contribute by analyzing data patterns to generate HR WSI patches, while DL techniques like the SR-CNN excel at extracting intricate features for HR WSI patch generation [8]. Notably, SRGANs, a type of generative artificial intelligence model, are prominently used to create HR WSI patches through adversarial training. DL is integral to advancing pathological SR, excelling at pattern recognition by deciphering intricate details within LR WSIs and identifying essential features. Among the ML algorithms compared in this paper, bicubic interpolation [33,34] is considered efficient when speed is not a concern. Linear regression with a four-orientation Laplace mask (a second-order derivative) is used to sharpen blurred images by enhancing their quality and fine details; however, such masks add noise in gradient areas, so removing noise elsewhere becomes an additional problem. Hence, we moved towards principal component analysis (PCA) [16,35], sparse learning, and the MVSNN, optimizing the weights through their associated algorithms to enhance image resolution over the training epochs. On the DL side, we assessed the performance of the SR-CNN [25] and GAN [26] for extracting hierarchical features; enabling end-to-end learning simplifies development and ensures adaptability to diverse WSI SR scenarios.

4.2. Architecture of Super-Resolution Generative Adversarial Networks

Our SRGAN model, as shown in Figure 3, is similar to common GAN architectures, consisting of a generator and a discriminator. The generator takes the LR histopathological image as the input and employs residual network [31] architectures, which include convolutional functions and skip connections, to generate fake (noisy) images. Simultaneously, the discriminator is equipped with binary classifiers, which evaluate the generated images against the ground truth images. Images with high similarity undergo up-scaling (up to 4×), while LR images prompt feedback loops between the generator and discriminator. The ultimate output is an SR pathological image. This SR technique extends to microscopic images and WSIs in the pathology domain, emphasizing the role of evaluation metrics. Microscopic images are vital to pathologists in diagnosis and demand the preservation of fine details at the cellular level, including contrast and luminance. SR techniques are integral to DP diagnosis and must be assessed quantitatively, ensuring accurate representation of critical details at the cellular level and in textual content, addressing the specific demands of these specialized domains.
The SRGAN model is trained on LR images, which form the training dataset, to learn transformations such as interpolation. It aims to approximate the missing high-frequency information in HR images by mapping LR to HR images, capturing the patterns required for HR reconstruction:
$I_{HR} = SR(I_{LR})$, where $I_{LR}$ is the LR image and $I_{HR}$ is the HR image.

SRGAN’s Architecture Mathematical Modeling

Convolution: $I_{out} = I_{in} * K$
where $K$ is the convolutional kernel (filter).
Activation: $I_{out} = \text{ReLU}(I_{in} * K)$
Residual Learning: $I_{out} = I_{in} + F(I_{in})$
where F ( I in ) represents the residual block.
Loss Function: $L = \frac{1}{N} \sum_{i=1}^{N} \left\| I_{HR}^{i} - SR(I_{LR}^{i}) \right\|^2$
where $N$ is the number of training examples,
$I_{HR}^{i}$ is the ground truth HR image, and
$SR(I_{LR}^{i})$ is the predicted HR image for the $i$th example.
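As a sketch of how these operations compose into a generator, the following PyTorch module stacks the convolution, ReLU activation, and residual-learning blocks above and adds two sub-pixel (PixelShuffle) stages for the 4× magnification. The channel widths, kernel sizes, and block count are illustrative assumptions, not the exact configuration of the model trained in this study.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Implements I_out = I_in + F(I_in), with F a small convolutional stack.
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # skip connection

class Generator(nn.Module):
    # Maps an LR patch to a 4x super-resolved patch.
    def __init__(self, n_blocks: int = 8):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(3, 64, 9, padding=4), nn.ReLU(inplace=True))
        self.blocks = nn.Sequential(*[ResidualBlock(64) for _ in range(n_blocks)])
        # Two 2x sub-pixel stages give the overall 4x magnification.
        self.upsample = nn.Sequential(
            nn.Conv2d(64, 256, 3, padding=1), nn.PixelShuffle(2), nn.ReLU(inplace=True),
            nn.Conv2d(64, 256, 3, padding=1), nn.PixelShuffle(2), nn.ReLU(inplace=True),
        )
        self.tail = nn.Conv2d(64, 3, 9, padding=4)

    def forward(self, lr):
        x = self.blocks(self.head(lr))
        return self.tail(self.upsample(x))

# A 128 x 128 LR patch becomes a 512 x 512 SR patch:
# sr = Generator()(torch.randn(1, 3, 128, 128))  # shape (1, 3, 512, 512)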

4.3. Dataset Details

In this paper, we evaluate the SR methods on prostate cancer histopathological images from the SICAPv2 dataset, https://data.mendeley.com/datasets/9xxm58dvs3/1 (accessed on 30 April 2024), hosted on Mendeley Data [36], comprising 18,783 annotated images across six classes, with 4417, 925, 2471, 3082, 2234, and 5654 patches in Gleason grade (GG) 0, GG 1, GG 2, GG 3, GG 4, and GG 5, respectively. The categorization of prostate cancer into Gleason grade groups relies on the Gleason scores, which assess the patterns observed in prostate WSIs; Chen et al. [37] comprehensively demonstrated the calculation and classification of these groups. Patches extracted at 10× magnification with a 512-pixel size and 50% overlap underwent preprocessing for noise reduction, color correction, and contrast enhancement, ensuring readiness for analysis. The feasibility of the image quality for training and testing was assessed using baseline models. Notably, the pathological images in the dataset adhere to a resolution of 512 × 512 pixels. Figure 4 shows examples of some patches, and a minimal loading sketch is given below.
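Since the dataset supplies 512 × 512 HR patches, one common way to form training pairs, stated here as our own assumption rather than a documented pipeline detail, is to synthesize each LR input by 4× bicubic downsampling of its HR patch. A minimal PyTorch Dataset sketch follows; the class name and directory layout are hypothetical.

from pathlib import Path
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class ProstatePatchPairs(Dataset):
    # Yields (LR, HR) tensors: HR is a 512x512 patch, LR its 4x-downsampled copy.
    def __init__(self, root: str, scale: int = 4):
        self.paths = sorted(Path(root).glob("*.jpg"))  # hypothetical file layout
        self.scale = scale
        self.to_tensor = transforms.ToTensor()

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        hr = Image.open(self.paths[idx]).convert("RGB")
        w, h = hr.size
        lr = hr.resize((w // self.scale, h // self.scale), Image.BICUBIC)
        return self.to_tensor(lr), self.to_tensor(hr)

# pairs = ProstatePatchPairs("SICAPv2/patches")  # hypothetical path
# lr, hr = pairs[0]  # lr: 3 x 128 x 128, hr: 3 x 512 x 512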

4.4. Experimental Setup

The implementation used the PyTorch framework with a batch size of 24, and the training and testing splits were 85% and 15%, respectively. The algorithms were run on a Dell Precision Tower 5810 workstation with a Xeon CPU, a 512 GB SSD, 32 GB of RAM, and an 8 GB Nvidia Quadro P4000 GPU. A minimal sketch of this configuration follows.
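This sketch reuses the hypothetical ProstatePatchPairs dataset from the sketch in Section 4.3; the random split and worker count are our own assumptions.

import torch
from torch.utils.data import DataLoader, random_split

dataset = ProstatePatchPairs("SICAPv2/patches")  # hypothetical path

# 85% / 15% train/test split, as in the experimental setup.
n_train = int(0.85 * len(dataset))
train_set, test_set = random_split(dataset, [n_train, len(dataset) - n_train])

train_loader = DataLoader(train_set, batch_size=24, shuffle=True, num_workers=4)
test_loader = DataLoader(test_set, batch_size=24, shuffle=False)

device = "cuda" if torch.cuda.is_available() else "cpu"  # Quadro P4000 in our setup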

5. Evaluation Metrics

Evaluating the quality of HR images is essential to ensure the resolution required for WSI classification. Neary-Zajiczek et al. [38] provided techniques for assessing the outcomes of SR algorithms. Zhou et al. [39] also focused on SR image quality assessment, using deterministic and statistical fidelity to evaluate SR image quality. They found that SRGANs excel at achieving high statistical fidelity but may struggle with deterministic fidelity; they therefore introduced the SR Image Fidelity index, based on content-dependent sharpness and texture assessment, as a novel parameter for image SR assessment. The peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), root-mean-squared error, mean absolute error, and multi-scale structural similarity index are common metrics for measuring an SR method's outcomes. The following is a brief explanation of how these metrics are estimated from the LR and HR images:
  • Peak signal-to-noise ratio (dB): The PSNR measures image quality by comparing the original image to the reconstructed image, with higher values indicating better quality and higher fidelity.
    $\text{PSNR}(I, K) = 10 \cdot \log_{10} \left( \dfrac{MAX^2}{\text{MSE}} \right)$
    where
    I is the original image,
    K is the reconstructed (or compressed) image, and
    MAX is the maximum possible pixel value of the images.
  • Structural similarity index: The SSIM goes further by considering not just pixel-level differences but also structural aspects of image similarity: luminance, contrast, and structure. A higher SSIM means better similarity to the original.
    The SSIM is a product of three components, luminance (l), contrast (c), and structure (s), raised to the power of an exponent α as shown below:
    $\text{SSIM}(x, y) = \left[ l(x, y) \cdot c(x, y) \cdot s(x, y) \right]^{\alpha}$
    Typically, $\alpha$ is set to 1.
    $l(x, y) = \dfrac{2 \mu_x \mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}$
    $c(x, y) = \dfrac{2 \sigma_x \sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}$
    $s(x, y) = \dfrac{\sigma_{xy} + C_3}{\sigma_x \sigma_y + C_3}$
    where
    $x, y$ are the (local windows of) pixel values of the images,
    $\mu_x, \mu_y$ are the local means of $x$ and $y$, respectively,
    $\sigma_x, \sigma_y$ are the local standard deviations of $x$ and $y$, respectively,
    $\sigma_{xy}$ is the local covariance of $x$ and $y$, and
    $C_1$, $C_2$, and $C_3$ are small constants for stability.
  • Root-mean-squared error: The RMSE is the square root of the MSE and measures the average magnitude of the errors between the corresponding pixel values of the original ($I$) and reconstructed ($K$) images.
    $\text{RMSE}(I, K) = \sqrt{\text{MSE}(I, K)} = \sqrt{ \dfrac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( I(i, j) - K(i, j) \right)^2 }$
    where
    $I(i, j)$ is the pixel value at position $(i, j)$ in the original image,
    $K(i, j)$ is the pixel value at position $(i, j)$ in the reconstructed image, and
    $M, N$ are the dimensions of the images.
  • Mean absolute error: The MAE evaluates the error by calculating the average absolute differences between the original and reconstructed images.
    $\text{MAE}(I, K) = \dfrac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left| I(i, j) - K(i, j) \right|$
    where
    $I(i, j)$ is the pixel value at position $(i, j)$ in the original image,
    $K(i, j)$ is the pixel value at position $(i, j)$ in the reconstructed image, and
    $M, N$ are the dimensions of the images.
  • Multi-scale structural similarity index: An extension of the SSIM, it assesses the structural similarity at multiple scales.
    $\text{MSSIM}(I, K) = \dfrac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \text{SSIM}\left( I(i, j), K(i, j) \right)$
    where
    $\text{SSIM}(x, y) = \dfrac{(2 \mu_x \mu_y + C_1)(2 \sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$,
    $\mu_x, \mu_y$ are the average pixel values of $x$ and $y$,
    $\sigma_x^2, \sigma_y^2$ are the variances of $x$ and $y$,
    $\sigma_{xy}$ is the covariance between $x$ and $y$, and
    $C_1$ and $C_2$ are constants to stabilize the division with a weak denominator.
The MSSIM measures image quality based on human perception, considering factors like color accuracy and sharpness. It is important for assessing visual quality in biomedical imagery.
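For reference, the pixel-wise metrics above take only a few lines of NumPy; this is an illustrative implementation, not the evaluation code used in the study. The SSIM and MSSIM require windowed local statistics and are typically taken from a library such as scikit-image.

import numpy as np

def mse(I: np.ndarray, K: np.ndarray) -> float:
    d = I.astype(np.float64) - K.astype(np.float64)
    return float(np.mean(d ** 2))

def psnr(I: np.ndarray, K: np.ndarray, max_val: float = 255.0) -> float:
    # PSNR(I, K) = 10 * log10(MAX^2 / MSE)
    return float(10.0 * np.log10(max_val ** 2 / mse(I, K)))

def rmse(I: np.ndarray, K: np.ndarray) -> float:
    return float(np.sqrt(mse(I, K)))

def mae(I: np.ndarray, K: np.ndarray) -> float:
    return float(np.mean(np.abs(I.astype(np.float64) - K.astype(np.float64))))

# SSIM, e.g., via scikit-image (windowed statistics handled internally):
# from skimage.metrics import structural_similarity as ssim
# score = ssim(I, K, channel_axis=-1)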

6. Results and Discussion

Table 2 presents the results of the traditional ML models and the SRGAN. We used several metrics to comprehensively compare the super-resolution outcomes, reporting the SR metric values for the PSNR, SSIM, RMSE, MAE, and MSSIM on pathological images for each model. The results show that the SRGAN clearly outperforms the SR-CNN and the traditional models.
We can clearly see that the SRGAN outperformed the others, showing the best performance on the PSNR, SSIM, and MSSIM; since these metrics quantify quality and similarity, higher values indicate a better model. There is only a small performance difference between the AE and the SRGAN, possibly because the convolution layers in the SRGAN better capture spatial detail. The SRGAN likewise has among the lowest RMSE and MAE values, where lower values indicate better SR performance. Taken together, the standardized evaluation metrics in Table 2 reaffirm the strength of SRGANs for this task. Figure 5 shows some super-resolved examples from the SRGAN and subjectively illustrates that the HR images carry significantly more information than the LR ones.

7. Conclusions

The results of this comprehensive comparative study on super-resolution reconstruction techniques for prostate whole-slide images demonstrate the superior performance of the SRGAN algorithm compared to other machine learning and deep learning methods. The SRGAN-based super-resolution algorithm achieved outstanding results, outperforming the other techniques across a range of evaluation metrics. Specifically, the SRGAN-based method achieved a PSNR of 26.47, an SSIM of 0.85, an RMSE of 0.035, an MAE of 0.026, and an MSSIM of 0.92. These results indicate that the SRGAN algorithm preserves high-frequency details and image quality while achieving up to 4× magnification of the input images. The application of the SRGAN provides a cost-effective solution to bridge the gap between the high cost of high-resolution imaging equipment and the need for high-quality pathology images. The proposed method can be used in the field of digital pathology, offering a powerful tool to enhance the quality and utility of prostate WSIs for improved cancer diagnosis and patient care.

8. Future Work

Future work should focus on advancing and optimizing algorithms to maximize the impact and adoption of SRGANs in the digital pathology workflow. The progression of image super-resolution (SR) towards black-box AI systems poses challenges, as the complex models can be difficult for clinicians to interpret, leading to trust issues in computational pathology. To address this, integrating explainable artificial intelligence (XAI), as highlighted in [40,41], could make the AI outputs more understandable and reliable for clinical decision-making.
AI, particularly ML and DL techniques, holds great promise in medical imaging. AI systems trained to identify abnormal areas in histopathology slides and to enhance slide resolution can significantly improve the diagnostic workflow. Through intensive initial data training, AI pre-screens slides, prioritizing them for detailed examination, which enhances diagnostic efficiency and accuracy while minimizing observer variability. A significant challenge in histopathology is the inter- and intra-observer variability, especially with the Gleason score, where differences in interpreting histological patterns can lead to inconsistent grading. AI addresses this issue by augmenting pathologists, standardizing evaluations, and reducing subjective workload. Objective evaluation metrics ensure consistent grading and better prognostic assessments. Furthermore, AI’s ability to continuously learn from new data enhances its diagnostic capabilities over time [42,43]. Thus, future work in SR should incorporate XAI with GANs to generate super-resolved images that are both highly accurate and interpretable. Additionally, by focusing on the deployment of smaller, more efficient models that can be accessed remotely, we can overcome the challenges associated with transmitting large gigapixel images. This approach will facilitate remote pathological evaluations by enabling the enhancement of low-resolution images after transmission.

Author Contributions

Conceptualization, A.B.G.; Methodology, A.B.G. and K.A.G.; Software, K.A.G. and P.A.G.; Investigation, A.B.G., N.K. and R.B.N.; Data Curation, K.A.G. and R.B.N.; Validation, P.A.G.; Formal Analysis, A.B.G. and N.K.; Writing—Original Draft, A.B.G. and K.A.G.; Writing—Review and Editing, N.K. and R.B.N.; Mentoring for presentability, N.K.; Supervision, N.K. and R.B.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Informed Consent Statement

Informed consent is not applicable for this study.

Data Availability Statement

The dataset is publicly available at https://data.mendeley.com/datasets/9xxm58dvs3/1 [36].

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Abdelrazek, A.; Mahmoud, A.M.; Joshi, V.B.; Habeeb, M.; Ahmed, M.E.; Ghoniem, K.; Delgado, A.; Khater, N.; Kwon, E.; Kendi, A.T. Recent Advances in Prostate Cancer (PCa) Diagnostics. Uro 2022, 2, 109–121.
  2. Sekhoacha, M.; Riet, K.; Motloung, P.; Gumenku, L.; Adegoke, A.; Mashele, S. Prostate cancer review: Genetics, diagnosis, treatment options, and alternative approaches. Molecules 2022, 27, 5730.
  3. Denmeade, S.R.; Isaacs, J.T. A history of prostate cancer treatment. Nat. Rev. Cancer 2002, 2, 389–396.
  4. Abdelrazek, A.S.; Ghoniem, K.; Ahmed, M.E.; Joshi, V.; Mahmoud, A.M.; Saeed, N.; Khater, N.; Elsharkawy, M.S.; Gamal, A.; Kwon, E.; et al. Prostate Cancer: Advances in Genetic Testing and Clinical Implications. Uro 2023, 3, 91–103.
  5. Lang, F.; Contreras-Gerenas, M.F.; Gelléri, M.; Neumann, J.; Kröger, O.; Sadlo, F.; Berniak, K.; Marx, A.; Cremer, C.; Wagenknecht, H.A.; et al. Tackling tumour cell heterogeneity at the super-resolution level in human colorectal cancer tissue. Cancers 2021, 13, 3692.
  6. Tabatabaei, Z.; Wang, Y.; Colomer, A.; Oliver Moll, J.; Zhao, Z.; Naranjo, V. Wwfedcbmir: World-wide federated content-based medical image retrieval. Bioengineering 2023, 10, 1144.
  7. Tian, C.; Zhang, X.; Lin, J.C.W.; Zuo, W.; Zhang, Y.; Lin, C.W. Generative adversarial networks for image super-resolution: A survey. arXiv 2022, arXiv:2204.13620.
  8. Wang, Z.; Chen, J.; Hoi, S.C. Deep learning for image super-resolution: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 3365–3387.
  9. Cui, X.; Chang, J. Hyperspectral super-resolution via low rank tensor triple decomposition. arXiv 2023, arXiv:2306.10489.
  10. Li, Y.; Dong, W.; Xie, X.; Shi, G.; Wu, J.; Li, X. Image super-resolution with parametric sparse model learning. IEEE Trans. Image Process. 2018, 27, 4638–4650.
  11. Ahmad, W.; Ali, H.; Shah, Z.; Azmat, S. A new generative adversarial network for medical images super resolution. Sci. Rep. 2022, 12, 9533.
  12. Akhtar, P.; Azhar, F. A single image interpolation scheme for enhanced super resolution in bio-medical imaging. In Proceedings of the 2010 4th International Conference on Bioinformatics and Biomedical Engineering, Chengdu, China, 18–20 June 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 1–5.
  13. Liu, C.Z.; Kavakli, M. Extensions of principle component analysis with applications on vision based computing. Multimed. Tools Appl. 2016, 75, 10113–10151.
  14. Wang, S.; Wang, B. Super-resolution restoration of multispectral images based on principal component analysis. In Proceedings of the 2014 12th International Conference on Signal Processing (ICSP), Hangzhou, China, 19–23 October 2014; pp. 841–846.
  15. Jiji, C.; Chaudhuri, S. PCA Based Generalized Interpolation for Image Super-Resolution. In Proceedings of the ICVGIP, Kolkata, India, 16–18 December 2004; pp. 139–144.
  16. Tai, S.C.; Huang, J.J.; Chen, P.Y. A Super-Resolution Algorithm Using Linear Regression Based on Image Self-Similarity. In Proceedings of the 2016 International Symposium on Computer, Consumer and Control (IS3C), Xi’an, China, 4–6 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 275–278.
  17. Yang, J.; Wright, J.; Huang, T.S.; Ma, Y. Image super-resolution via sparse representation. IEEE Trans. Image Process. 2010, 19, 2861–2873.
  18. Zhang, J.; Shao, M.; Yu, L.; Li, Y. Image super-resolution reconstruction based on sparse representation and deep learning. Signal Process. Image Commun. 2020, 87, 115925.
  19. Gavade, A.B.; Rajpurohit, V.S. S-DolLion-MSVNN: A Hybrid Model for Developing the Super-Resolution Image From the Multispectral Satellite Image. Comput. J. 2022, 65, 757–772.
  20. Zhu, Z.; Guo, F.; Yu, H.; Chen, C. Fast single image super-resolution via self-example learning and sparse representation. IEEE Trans. Multimed. 2014, 16, 2178–2190.
  21. El-Shafai, W.; Aly, R.; Taha, T.E.; Abd El-Samie, F.E. CNN framework for optical image super-resolution and fusion. J. Opt. 2023, 1–20.
  22. Wang, Z.; Liu, D.; Yang, J.; Han, W.; Huang, T. Deep networks for image super-resolution with sparse prior. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 370–378.
  23. Chen, Y.; Xie, Y.; Zhou, Z.; Shi, F.; Christodoulou, A.G.; Li, D. Brain MRI super resolution using 3D deep densely connected neural networks. In Proceedings of the 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Washington, DC, USA, 4–7 April 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 739–742.
  24. Bychkov, D.; Turkki, R.; Haglund, C.; Linder, N.; Lundin, J. Deep learning for tissue microarray image-based outcome prediction in patients with colorectal cancer. In Proceedings of the Medical Imaging 2016: Digital Pathology, San Diego, CA, USA, 27 February–3 March 2016; SPIE: Bellingham, WA, USA, 2016; Volume 9791, pp. 298–303.
  25. Gao, Y.; Li, H.; Dong, J.; Feng, G. A deep convolutional network for medical image super-resolution. In Proceedings of the 2017 Chinese Automation Congress, Jinan, China, 20–22 October 2017; pp. 5310–5315.
  26. Gu, Y.; Zeng, Z.; Chen, H.; Wei, J.; Zhang, Y.; Chen, B.; Li, Y.; Qin, Y.; Xie, Q.; Jiang, Z.; et al. MedSRGAN: Medical images super-resolution using generative adversarial networks. Multimed. Tools Appl. 2020, 79, 21815–21840.
  27. Mahapatra, D.; Bozorgtabar, B.; Garnavi, R. Image super-resolution using progressive generative adversarial networks for medical image analysis. Comput. Med. Imaging Graph. 2019, 71, 30–39.
  28. Oyelade, O.N.; Ezugwu, A.E.; Almutairi, M.S.; Saha, A.K.; Abualigah, L.; Chiroma, H. A generative adversarial network for synthetization of regions of interest based on digital mammograms. Sci. Rep. 2022, 12, 6166.
  29. Iqbal, T.; Ali, H. Generative adversarial network for medical images (MI-GAN). J. Med. Syst. 2018, 42, 1–11.
  30. Gavade, A.; Sane, P. Super resolution image reconstruction by using bicubic interpolation. In Proceedings of the National Conference on Advanced Technologies in Electrical and Electronic Systems, London, UK, 2–4 July 2014; Volume 10.
  31. Mukherjee, L.; Bui, H.D.; Keikhosravi, A.; Loeffler, A.; Eliceiri, K.W. Super-resolution recurrent convolutional neural networks for learning with multi-resolution whole slide images. J. Biomed. Opt. 2019, 24, 126003.
  32. Kim, J.; Lee, J.K.; Lee, K.M. Deeply-recursive convolutional network for image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1637–1645.
  33. Khaledyan, D.; Amirany, A.; Jafari, K.; Moaiyeri, M.H.; Khuzani, A.Z.; Mashhadi, N. Low-cost implementation of bilinear and bicubic image interpolation for real-time image super-resolution. In Proceedings of the 2020 IEEE Global Humanitarian Technology Conference (GHTC), Seattle, WA, USA, 29 October–1 November 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 1–5.
  34. Bustomi, M.A. Testing of Image Resolution Enhancement Techniques Using Bi-cubic Spatial Domain Interpolation. In Proceedings of the Journal of Physics: Conference Series; IOP Publishing: Bristol, UK, 2019; Volume 1417, p. 012028.
  35. Cui, J.; Wang, Y.; Huang, J.; Tan, T.; Sun, Z. An iris image synthesis method based on PCA and super-resolution. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR), Cambridge, UK, 23–26 August 2004; IEEE: Piscataway, NJ, USA, 2004; Volume 4, pp. 471–474.
  36. Silva-Rodríguez, J. SICAPv2—Prostate Whole Slide Images with Gleason Grades Annotations. Mendeley Data 2020.
  37. Chen, N.; Zhou, Q. The evolving Gleason grading system. Chin. J. Cancer Res. 2016, 28, 58–64.
  38. Neary-Zajiczek, L.; Beresna, L.; Razavi, B.; Pawar, V.; Shaw, M.; Stoyanov, D. Minimum resolution requirements of digital pathology images for accurate classification. Med. Image Anal. 2023, 89, 102891.
  39. Zhou, W.; Wang, Z. Quality assessment of image super-resolution: Balancing deterministic and statistical fidelity. In Proceedings of the 30th ACM International Conference on Multimedia, Lisbon, Portugal, 10–14 October 2022; pp. 934–942.
  40. Van der Velden, B.H.; Kuijf, H.J.; Gilhuijs, K.G.; Viergever, M.A. Explainable artificial intelligence (XAI) in deep learning-based medical image analysis. Med. Image Anal. 2022, 79, 102470.
  41. Wang, L.; Yoon, K.J. Semi-supervised student-teacher learning for single image super-resolution. Pattern Recognit. 2022, 121, 108206.
  42. Arvaniti, E.; Fricker, K.S.; Moret, M.; Rupp, N.; Hermanns, T.; Fankhauser, C.; Wey, N.; Wild, P.J.; Rueschoff, J.H.; Claassen, M. Automated Gleason grading of prostate cancer tissue microarrays via deep learning. Sci. Rep. 2018, 8, 12054.
  43. Bulten, W.; Pinckaers, H.; van Boven, H.; Vink, R.; de Bel, T.; van Ginneken, B.; van der Laak, J.; Hulsbergen-van de Kaa, C.; Litjens, G. Automated deep-learning system for Gleason grading of prostate cancer using biopsies: A diagnostic study. Lancet Oncol. 2020, 21, 233–241.
Figure 1. A demonstration of how a region of a WSI is super-resolved. The input low-resolution image (left) is magnified four times (4×) and resolved using the super-resolution technique to obtain a high-resolution image (right).
Figure 2. Generalized block diagram of the super-resolution generative adversarial network (SRGAN). G represents the generator, and D represents the discriminator.
Figure 3. The architecture of the generator (G) and discriminator (D) modules in the SRGAN.
Figure 4. Prostate histopathological images of all six Gleason grade group classes, belonging, respectively, to (a) GG0, (b) GG1, (c) GG2, (d) GG3, (e) GG4, and (f) GG5.
Figure 5. Mapping of input LR pathological WSIs to the output super-resolution pathological images. (a) Example of WSI; (b) super-resolution images.
Table 1. Comparative performance study of super-resolution using SR models.

Models & Authors | PSNR | RMSE | SSIM | MSSIM | SNR | Dataset or Images
Bicubic Interpolation [30] | 27.32 | - | - | - | - | Lenna image
Deep Convolutional Network [25] | 43.3 | - | - | - | - | Esophagus CT image
Deep Convolutional Network [25] | 42.1 | - | - | - | - | Nasal CT image
Deep Convolutional Network [25] | 38.9 | - | - | - | - | Pelvic CT image
Linear Regression [16] | 27.67 | - | - | - | - | Multi-spectral image dataset
Residual CNN [14] | 42.76 | - | 0.9953 | - | - | T2w MRI brain
SR-RCNN 1 [31] | 26.59 | 15.64 | 0.98 | 0.95 | 24.36 | Breast WSI
SR-RCNN 1 [31] | 19.75 | 11.60 | 0.98 | 0.97 | 28.31 | Kidney WSI
SR-RCNN 1 [31] | 24.79 | 20.32 | 0.96 | 0.93 | 22.07 | Pancreas WSI
Deep Recursive Convolutional Neural Network [32] | 32.17 | - | 0.9350 | - | - | Urban100 “img082”
Deep Recursive Convolutional Neural Network [32] | 24.36 | - | 0.7399 | - | - | B100 “134035”
Deep Recursive Convolutional Neural Network [32] | 27.66 | - | 0.9608 | - | - | Set14 “ppt3”
1 SR-RCNN: super-resolution recurrent convolutional neural network.
Table 2. Performance comparison of the models on various evaluation metrics.

SR Models | PSNR (↑) | SSIM (↑) | RMSE (↓) | MAE (↓) | MSSIM (↑)
Regression | 23.56 | 0.78 | 0.048 | 0.034 | 0.86
Sparse Learning | 24.81 | 0.80 | 0.044 | 0.32 | 0.88
PCA | 22.73 | 0.75 | 0.054 | 0.038 | 0.82
Bicubic Interpolation | 21.92 | 0.70 | 0.060 | 0.045 | 0.78
MVSNN | 25.36 | 0.82 | 0.039 | 0.029 | 0.90
SR-CNN | 26.03 | 0.83 | 0.036 | 0.027 | 0.89
AE | 26.18 | 0.83 | 0.034 | 0.027 | 0.90
SRGAN | 26.47 | 0.85 | 0.035 | 0.026 | 0.92
