Article

Deep Learning-Based Super-Resolution Reconstruction and Segmentation of Photoacoustic Images

School of Physics and Information Technology, Shaanxi Normal University, Xi’an 710119, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(12), 5331; https://doi.org/10.3390/app14125331
Submission received: 17 May 2024 / Revised: 13 June 2024 / Accepted: 17 June 2024 / Published: 20 June 2024
(This article belongs to the Special Issue Application of Machine Vision and Deep Learning Technology)

Abstract

Photoacoustic imaging (PAI) is an emerging imaging technique that offers real-time, non-invasive, and radiation-free measurements of optical tissue properties. However, image quality degradation caused by factors such as non-ideal signal detection hampers its clinical applicability. To address this challenge, this paper proposes a deep learning-based algorithm for super-resolution reconstruction and segmentation. The proposed enhanced deep super-resolution minimalistic network (EDSR-M) not only mitigates the shortcomings of the original algorithm regarding computational complexity and parameter count but also employs residual learning and attention mechanisms to extract image features and enhance image details, thereby achieving high-quality reconstruction of PAI. DeepLabV3+ is used to segment the images before and after reconstruction to verify the reconstruction performance of the network. The experimental results demonstrate average improvements of 19.76% in peak signal-to-noise ratio (PSNR) and 4.80% in structural similarity index (SSIM) for the reconstructed images compared to their pre-reconstruction counterparts. Additionally, the mean accuracy, mean intersection over union (IoU), and mean boundary F1 score (BFScore) of the segmentation improved by 8.27%, 6.20%, and 6.28%, respectively. The proposed algorithm enhances the detail and texture features of PAI and restores the overall image structure more completely.

1. Introduction

With the continuous development of medical imaging technology, photoacoustic imaging (PAI) has attracted increasing attention as a rapidly developing hybrid biomedical imaging technique [1,2,3]. PAI combines the advantages of optical imaging and ultrasound imaging and offers unique advantages in imaging depth, spatial resolution, and tissue imaging. It provides abundant functional and structural tissue information and is widely used in biomedical fields [4,5,6]. PAI has shown significant potential and effectiveness in a variety of clinical applications [7]. In breast cancer screening and diagnosis [8,9], early lesions and malignant tumors have been identified by detecting blood oxygen saturation and angiogenesis in breast tissue; in the diagnosis of skin lesions [10,11], skin cancer and melanoma have been identified by measuring melanin concentration and vascular structure in the skin; in cardiovascular disease monitoring [12,13], the severity of atherosclerosis has been assessed by real-time imaging of intravascular lipid cores and calcified plaques; and in tumor detection [14,15], changes in blood flow and blood oxygen levels in tissue provide important information for early detection and surgical navigation. However, PAI still has limitations in practical applications. Non-ideal signal detection can significantly reduce image quality [16]. In addition, PAI relies on acoustic waves generated within biological tissue, which can only be sampled in the spatial dimension; each discrete spatial measurement requires its own detector, and practical and physical constraints may make it infeasible to build an imaging system with a sufficiently large number of detectors [17]. Reconstructing such sampled data with standard methods yields low-quality images with serious loss of detail, which has become a major obstacle to the clinical promotion of PAI.
Therefore, despite the great progress in PAI, significant challenges remain in improving image quality and accurately segmenting tissue structures [18,19]. Traditional photoacoustic image reconstruction and segmentation methods often rely on hand-designed feature extractors and mathematical models, which have many limitations when dealing with complex backgrounds, noise interference, and blurred tissue boundaries [20,21]. For example, filter- or edge-detector-based methods are of limited effectiveness for low-contrast structures and subtle features in PAI, while threshold-based segmentation methods are sensitive to noise and artifacts and prone to producing erroneous segmentation results. This restricts their use in real-world applications.
To overcome the limitations of traditional methods, deep learning-based image reconstruction and segmentation methods have attracted increasing attention in recent years [22,23,24,25]. Deep-learning techniques are now widely used in photoacoustic imaging to obtain high-quality images through large-scale learning and to use the reconstructed images for disease segmentation, classification, and detection. With strong feature-learning capability and end-to-end, data-driven training, deep-learning models can learn complex representations and perform reconstruction and segmentation accurately and efficiently from photoacoustic image data. Deep learning-based techniques also generalize better and are more robust than traditional techniques when handling noise, artifacts, and structural complexity in PAI [26,27,28]. However, despite this progress, challenges and limitations remain. First, because PAIs have complex structures and noise interference, conventional convolutional neural networks often perform poorly on them: fine structures are difficult to reconstruct and segment accurately, and the models may overfit or underfit, resulting in insufficient generalization ability [29]. Second, deep-learning models require a large amount of labelled data for training, and labelled PAI data are hard to collect, which limits the breadth of application and the achievable performance of deep-learning methods.
To address these problems, this paper proposes a deep-learning PAI reconstruction and segmentation method based on an improved enhanced deep super-resolution minimalistic network (EDSR-M) and DeepLabV3+ [30]. As an advanced model for super-resolution reconstruction, the EDSR network [31] can effectively improve the spatial resolution and detail of an image. The improved EDSR-M network replaces the original upsampling layer with a convolutional layer, which reduces the computational burden, simplifies the network structure, lowers the risk of overfitting, and helps the network learn image features better. DeepLabV3+ is an advanced model for semantic segmentation that partitions the objects and tissues in an image into semantic regions. Together, these two networks make full use of the spatial and semantic information of the image in the PAI reconstruction and segmentation tasks, achieving more accurate and robust results. Experimental results show that the proposed method achieves significant improvements in both PAI reconstruction and segmentation, demonstrating the potential and effectiveness of deep learning in this field. These results provide a useful reference for further promoting PAI technology in clinical applications.

2. Materials and Methods

2.1. Overview of the Framework

Figure 1 shows the overall structure proposed in this study. The framework comprises four steps: photoacoustic image (PAI) generation, deep-learning image reconstruction, deep-learning image segmentation, and model validation and evaluation. First, low-resolution photoacoustic images (LR-PAIs) were generated by k-Wave [32] simulation. Second, the generated images were reconstructed with the improved EDSR-M network to obtain high-resolution photoacoustic images (HR-PAIs). Subsequently, a deep-learning model (DeepLabV3+ [30]) was used to segment the images before and after reconstruction to help verify the reconstruction effect and assess the accuracy of the reconstructed images. Finally, the combination of deep-learning models was evaluated and validated.

2.2. Photoacoustic Signal Generation and Reconstruction

Photoacoustic signals are generated by illuminating tissue with a nanosecond laser pulse δ(t). The light-absorbing molecules in the tissue undergo thermoelastic expansion and generate a photoacoustic pressure wave [33]. Assuming that thermal diffusion and volume expansion during illumination are negligible, the initial photoacoustic pressure x can be defined as
$$x(\mathbf{r}) = \Gamma(\mathbf{r})\,A(\mathbf{r}) \quad (1)$$
where A(r) is the spatial absorption function and Γ(r) is the Grüneisen coefficient describing the conversion efficiency from heat to pressure [34]. The photoacoustic pressure wave p(r, t) at position r and time t can be modelled as an initial-value problem for the wave equation, where c is the speed of sound [35].
$$\left(\nabla^{2} - \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}}\right) p(\mathbf{r}, t) = -x(\mathbf{r})\,\frac{d\delta(t)}{dt} \quad (2)$$
The transient signal is measured by sensors located on the measurement surface S₀. The linear operator M acts on p(r, t) confined to the boundary of the computational domain Ω for a finite time T and provides a linear mapping from the initial pressure x to the measured transient signal y.
$$y = \mathrm{M}\, p\,\big|_{\partial\Omega \times (0, T)} = A x \quad (3)$$
Time-reversal reconstruction [36] is a robust photoacoustic image reconstruction method for homogeneous and heterogeneous media and any arbitrary detection geometry. The method utilizes a time-reversal algorithm to achieve image reconstruction by running the numerical model of the forward problem in reverse, i.e., inverting the signal propagation process in the time domain while keeping the spatial coordinates constant. Compared to other imaging algorithms, the time-reversal reconstruction method is less susceptible to image artefacts and can achieve more desirable reconstruction results [37].

2.3. Deep-Learning Algorithms for PAI Reconstruction

The convolutional neural network EDSR [31] for super-resolution reconstruction was modified to enhance the efficiency and scalability of PAI reconstruction. Deep learning-based super-resolution algorithms have been used to obtain high-resolution (HR) images from their low-resolution (LR) counterparts in various fields [38,39]. The EDSR network, a deep-learning model for image super-resolution, employs residual learning and dense connectivity and increases the depth and number of parameters of the network, which efficiently improves the quality and accuracy of super-resolution reconstruction. However, since PAI reconstruction does not need to change the size of the output image, the upsampling layer only increases the computational complexity and parameter count of the network, which may lead to overfitting or unstable training. In addition, the upsampling layer may introduce blurring or distortion into the reconstructed image, especially at edges and in fine details, degrading the reconstruction.
To address this, the improved EDSR-M network was proposed by replacing the original upsampling layer with a convolutional layer; its architecture is shown in Figure 2. The improved network still consists of a series of residual blocks, each containing multiple convolutional layers and activation functions, and sums the input features with the output features through residual connections to efficiently learn the residual information of the image; appropriate convolutional operations in the final part of the network then produce the reconstructed image. Overall, the improved structure increases computational efficiency while maintaining the performance of the original network and avoids problems that the traditional upsampling layer may introduce, such as artifacts and distortion. It also reduces the overfitting risk of the model, making it more generalizable and robust and better able to adapt to different datasets and scenarios. Therefore, the improved network has clear advantages in model efficiency, overfitting risk, and interpretability.
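As a rough illustration of the structure described above (not the authors' implementation, which was built with the MATLAB Deep Learning Toolbox), the following PyTorch sketch stacks residual blocks and replaces EDSR's upsampling stage with a plain convolution so that the output keeps the input size. The block count, feature-map width, kernel size, and activation follow Table 1; all other details are assumptions.

```python
# Minimal sketch of an EDSR-M-style network (assumption: PyTorch; the paper's
# implementation used the MATLAB Deep Learning Toolbox). Residual-block count,
# feature-map width, and kernel size follow Table 1; other details are illustrative.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)  # residual connection: input plus learned residual

class EDSRM(nn.Module):
    def __init__(self, in_channels=1, channels=256, num_blocks=32):
        super().__init__()
        self.head = nn.Conv2d(in_channels, channels, kernel_size=3, padding=1)
        self.blocks = nn.Sequential(*[ResidualBlock(channels) for _ in range(num_blocks)])
        # EDSR's sub-pixel upsampling stage is replaced by a plain convolution,
        # so the output keeps the 256 x 256 input size.
        self.tail = nn.Conv2d(channels, in_channels, kernel_size=3, padding=1)

    def forward(self, x):
        features = self.head(x)
        features = features + self.blocks(features)  # long skip connection
        return self.tail(features)

if __name__ == "__main__":
    net = EDSRM()
    lr_pai = torch.randn(1, 1, 256, 256)   # one simulated LR-PAI
    print(net(lr_pai).shape)               # -> torch.Size([1, 1, 256, 256])
```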

2.4. Deep-Learning Algorithm for PAI Segmentation

Deep learning plays an important role in medical image segmentation: by combining the powerful feature-learning capability of deep neural networks with the rich information in medical images, it enables automatic and accurate segmentation of lesions, tissues, and organs, providing important assistance and support for medical diagnosis and treatment [40,41]. DeepLabV3+ [30], a deep-learning algorithm for semantic segmentation, can accurately identify blood vessels, tissues, and other structures in PAI and provide accurate segmentation results for medical diagnosis; its architecture is shown in Figure 3. Its core features are the combination of depthwise separable convolution, atrous (dilated) convolution, and multi-scale feature fusion, which effectively capture the subtle structures and complex information in medical images for accurate segmentation of organs and lesions. The model employs an advanced backbone network (e.g., ResNet, Xception, or MobileNetV2) to extract image features and captures contextual information at different scales through the atrous spatial pyramid pooling (ASPP) module. DeepLabV3+ also introduces a decoder module that fuses low-level and high-level features through upsampling and skip connections, which improves the accuracy and detail of the segmentation results and provides powerful support for medical imaging and clinical practice. A minimal sketch of the atrous-convolution idea behind ASPP is given below.
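The following PyTorch sketch shows a simplified ASPP-style module built from parallel dilated convolutions. The dilation rates follow Table 1; the channel widths and the omission of the image-pooling branch and batch normalization are simplifications assumed here, and this is not torchvision's DeepLabV3+ implementation.

```python
# Minimal sketch of an ASPP-style module (assumption: PyTorch; illustrative only).
import torch
import torch.nn as nn

class SimpleASPP(nn.Module):
    def __init__(self, in_channels=2048, out_channels=256, rates=(6, 12, 18, 24)):
        super().__init__()
        # Parallel atrous (dilated) 3x3 convolutions sample context at several scales
        # without reducing spatial resolution; padding=rate keeps the map size fixed.
        self.branches = nn.ModuleList([
            nn.Conv2d(in_channels, out_channels, kernel_size=3,
                      padding=rate, dilation=rate)
            for rate in rates
        ])
        self.project = nn.Conv2d(out_channels * len(rates), out_channels, kernel_size=1)

    def forward(self, x):
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.project(multi_scale)

if __name__ == "__main__":
    backbone_features = torch.randn(1, 2048, 16, 16)  # e.g., ResNet-50 stage-5 output
    print(SimpleASPP()(backbone_features).shape)      # -> torch.Size([1, 256, 16, 16])
```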

2.5. Photoacoustic Data for Training and Testing

Photoacoustic computed tomography is a novel medical imaging technique that utilizes the light absorption properties of tissues to generate images. The method works by identifying photoacoustic-induced initial pressure distributions in tissues. In these situations, light absorption raises local temperatures, which produces ultrasound waves and an initial pressure distribution [42]. Different tissues absorb light to different extents, and thus the initial pressure distributions vary, which affects the contrast of the PAI [43]. PAI of biological tissues mostly employs near-infrared (NIR) light, which can produce deeper penetration depth, a higher signal-to-noise ratio, and more contrast compared to those of typical visible-light biological imaging [44]. This helps to increase spatial resolution. The absorption coefficients of different tissues such as fat, muscle, and blood at different wavelengths have been extensively studied [45]. Therefore, the tissue structure and density information obtained using CT and MR scans can be used to define the initial pressure distribution in the PAI and then obtain the contrast of the photoacoustic image.
Synthetic training and test data were created using k-Wave, a MATLAB toolbox for simulating photoacoustic wave fields [32]. The photoacoustic simulation in the k-Wave toolbox is implemented with a pseudo-spectral approach. The pseudo-spectral method is a numerical method for solving partial differential equations that gains efficiency in the spatial domain by fitting Fourier series to all the data globally, and it is suitable for time-domain modeling of broadband or high-frequency waves in acoustics [46]. For each image in the dataset, an initial photoacoustic source on a 256 × 256 pixel grid was defined. The medium was assumed to be homogeneous, with a sound speed of 1500 m/s and an attenuation coefficient of 0.75 dB/(MHz·cm), similar to that of soft tissue in vivo [47,48,49]. The transducer arrays had 32, 64, or 128 equidistant sensors on a circle with a radius of 100 pixels to receive the photoacoustic waves. k-Wave's built-in functions were used to simulate the sampling of photoacoustic pressure, and the image was then reconstructed from the simulated time-series data using the time-reversal method. A small sketch of the sensor geometry follows.
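This NumPy sketch only illustrates the acquisition geometry (32, 64, or 128 equidistant sensors on a circle of radius 100 pixels centred on the 256 × 256 grid); the actual pressure sampling and time-reversal reconstruction were performed with k-Wave in MATLAB.

```python
# Sketch of the circular sensor geometry used for the simulations (assumption:
# NumPy illustration; the pressure sampling itself was done with k-Wave in MATLAB).
import numpy as np

def circular_sensor_positions(num_sensors=128, radius_px=100.0, grid_size=256):
    """Return (x, y) pixel coordinates of equidistant sensors on a circle
    centred on the simulation grid."""
    center = (grid_size - 1) / 2.0
    angles = 2.0 * np.pi * np.arange(num_sensors) / num_sensors
    xs = center + radius_px * np.cos(angles)
    ys = center + radius_px * np.sin(angles)
    return np.stack([xs, ys], axis=1)

for n in (32, 64, 128):  # the three array sizes compared in Section 3.3
    positions = circular_sensor_positions(num_sensors=n)
    print(n, positions.shape)  # e.g., 128 (128, 2)
```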
The CHAOS dataset [50] is a comprehensive multimodal medical imaging dataset of the abdomen. It contains CT and MR scans from multiple healthy volunteers with segmentation annotations of abdominal organs such as the liver, kidneys, and spleen. The abdominal MRI images acquired with the T2-SPIR (Spectral Pre-Saturation Inversion Recovery) sequence were selected to define the initial photoacoustic pressure source in k-Wave and create the simulated PAI. This subset contains 632 T2-weighted abdominal images acquired with fat-suppressed pulse sequences. Post-processing of the reconstructed acoustic field images, including filtering, denoising, and interpolation, was performed to create the simulated image dataset for deep-learning reconstruction of PAI.

3. Results

3.1. Experimental Setup

Photoacoustic images (PAIs) were generated from the CHAOS dataset by k-Wave simulation. During the experiments, the dataset was divided into a training set (75%), a validation set (5%), and a test set (20%), and the training set was augmented with random 90-degree rotations and flips along the x-axis (a sketch of this split and augmentation is given below). Super-resolution reconstruction was then performed with the EDSR-M network using 32 residual blocks. Finally, the images before and after reconstruction were segmented with the DeepLabV3+ network using a ResNet-50 backbone to help assess the reconstruction performance. The main hyperparameters of the networks are listed in Table 1; the models were implemented with the MATLAB Deep Learning Toolbox (R2023b).
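This NumPy sketch is an assumption-laden stand-in for the MATLAB implementation: only the 75/5/20 split and the 90-degree-rotation/flip augmentation are taken from the text, while the random seed and the exact flip axis are illustrative choices.

```python
# Minimal sketch of the 75/5/20 split and the training-set augmentation described
# above (assumption: NumPy illustration; the paper used the MATLAB Deep Learning Toolbox).
import numpy as np

rng = np.random.default_rng(0)

def split_indices(n_images, fractions=(0.75, 0.05, 0.20)):
    """Shuffle image indices and split them into train/validation/test subsets."""
    idx = rng.permutation(n_images)
    n_train = int(fractions[0] * n_images)
    n_val = int(fractions[1] * n_images)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def augment(image):
    """Randomly rotate by a multiple of 90 degrees and randomly flip along one axis."""
    image = np.rot90(image, k=rng.integers(0, 4))
    if rng.random() < 0.5:
        image = np.flip(image, axis=1)
    return image

train_idx, val_idx, test_idx = split_indices(632)   # 632 images in the simulated dataset
print(len(train_idx), len(val_idx), len(test_idx))  # -> 474 31 127
print(augment(np.zeros((256, 256))).shape)          # -> (256, 256)
```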

3.2. Model Evaluation Measures

In this study, a comprehensive assessment of deep-learning image reconstruction and segmentation models was conducted. The assessment measures mainly include quantitative assessment metrics and statistical difference analysis.

3.2.1. Quantitative Assessment Metrics for Image Reconstruction Models

In the image reconstruction experiments, the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) are used as image quality metrics to compare the reconstructed images with the ground truth. PSNR is a global measure of image quality, while SSIM is a local measure of similarity in contrast, luminance, and structure; together they objectively quantify the difference between the reconstructed image and the original image. These evaluation measures are calculated according to Equations (4)–(6).
$$\mathrm{PSNR} = 10 \times \log_{10}\frac{MAX^{2}}{MSE} \quad (4)$$
$$MSE = \frac{1}{N}\sum_{i=1}^{N}\left(I_i - \hat{I}_i\right)^{2} \quad (5)$$
where MAX is the maximum possible pixel value, N is the number of pixels, I_i is the i-th pixel value of the original image, and Î_i is the i-th pixel value of the reconstructed image.
$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^{2} + \mu_y^{2} + c_1)(\sigma_x^{2} + \sigma_y^{2} + c_2)} \quad (6)$$
where μ_x and μ_y are the mean values of the original image x and the reconstructed image y, respectively; σ_x² and σ_y² are their variances; σ_xy is their covariance; and c_1 and c_2 are two constants used to stabilize the calculation.
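A minimal NumPy sketch of Equations (4)–(6) follows. The SSIM here is computed from global image statistics, whereas the standard SSIM averages Equation (6) over local windows, and the constants c1 and c2 use the common K1 = 0.01, K2 = 0.03 convention; both simplifications are assumptions, not details stated in the paper.

```python
# Minimal NumPy sketch of Equations (4)-(6). SSIM uses global image statistics here;
# the standard definition averages Eq. (6) over local windows.
import numpy as np

def psnr(reference, reconstructed, max_value=1.0):
    mse = np.mean((reference - reconstructed) ** 2)
    return 10.0 * np.log10(max_value ** 2 / mse)

def global_ssim(x, y, max_value=1.0):
    c1 = (0.01 * max_value) ** 2   # assumed K1 = 0.01
    c2 = (0.03 * max_value) ** 2   # assumed K2 = 0.03
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = np.mean((x - mu_x) * (y - mu_y))
    return ((2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

rng = np.random.default_rng(0)
ground_truth = rng.random((256, 256))
reconstruction = np.clip(ground_truth + 0.01 * rng.standard_normal((256, 256)), 0.0, 1.0)
print(f"PSNR = {psnr(ground_truth, reconstruction):.2f} dB, "
      f"SSIM = {global_ssim(ground_truth, reconstruction):.3f}")
```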

3.2.2. Quantitative Assessment Metrics for Image Segmentation Models

In the image segmentation experiments, the segmentation results are compared with the ground truth using accuracy, intersection over union (IoU), the boundary F1 score (BFScore), and the Dice coefficient. For each category in the segmentation results, accuracy is the ratio of correctly classified pixels to the total number of pixels in that category. IoU is the ratio of correctly classified pixels in that category to the total number of ground-truth and predicted pixels in that category. BFScore measures how well the predicted boundary of each category matches the true boundary. The Dice coefficient measures the degree of overlap between the predicted results and the ground-truth labels; it takes values between 0 and 1, where 1 indicates perfect overlap and 0 indicates no overlap. For each image, mean accuracy is the average of the accuracies of all categories in that image. Global accuracy is the ratio of correctly classified pixels (regardless of category) to the total number of pixels. Mean IoU is the average IoU over all classes, and weighted IoU is the average IoU of the classes weighted by the number of pixels in each class. Mean BFScore is the average BFScore over all classes. These evaluation measures are calculated according to Equations (7)–(17).
$$\mathrm{Accuracy} = \frac{TP}{TP + FN} \quad (7)$$
$$\mathrm{Mean\ Accuracy} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{Accuracy}_i \quad (8)$$
$$\mathrm{Global\ Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \quad (9)$$
$$\mathrm{IoU} = \frac{TP}{TP + FP + FN} \quad (10)$$
$$\mathrm{Mean\ IoU} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{IoU}_i \quad (11)$$
$$\mathrm{Weighted\ IoU} = \sum_{i=1}^{N} w_i \times \mathrm{IoU}_i \quad (12)$$
$$\mathrm{BFScore} = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{recall} + \mathrm{precision}} \quad (13)$$
$$\mathrm{precision} = \frac{TP}{TP + FP} \quad (14)$$
$$\mathrm{recall} = \frac{TP}{TP + FN} \quad (15)$$
$$\mathrm{Mean\ BFScore} = \frac{1}{N}\sum_{i=1}^{N}\mathrm{BFScore}_i \quad (16)$$
$$\mathrm{Dice} = \frac{2 \times TP}{2 \times TP + FP + FN} \quad (17)$$
where TP denotes the number of pixels correctly classified as the positive category, TN the number of pixels correctly classified as the negative category, FP the number of pixels incorrectly classified as the positive category, FN the number of pixels incorrectly classified as the negative category, N the number of categories, and w_i the pixel weight of each category.
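As an illustration of how these per-class counts enter Equations (7)–(17), the following NumPy sketch computes per-class accuracy, IoU, and the Dice coefficient from predicted and ground-truth label maps; it is not the MATLAB evaluation code used in the experiments.

```python
# Minimal NumPy sketch of the per-class metrics in Equations (7), (10), and (17),
# computed from predicted and ground-truth label maps (illustrative only).
import numpy as np

def per_class_metrics(pred, gt, class_id):
    tp = np.sum((pred == class_id) & (gt == class_id))
    fp = np.sum((pred == class_id) & (gt != class_id))
    fn = np.sum((pred != class_id) & (gt == class_id))
    accuracy = tp / (tp + fn)           # Eq. (7): per-class pixel accuracy
    iou = tp / (tp + fp + fn)           # Eq. (10)
    dice = 2 * tp / (2 * tp + fp + fn)  # Eq. (17)
    return accuracy, iou, dice

rng = np.random.default_rng(0)
gt = rng.integers(0, 5, size=(256, 256))  # 0 = background, 1-4 = organ labels
pred = gt.copy()
pred[:10] = 0                             # perturb the prediction slightly
for organ in range(1, 5):
    acc, iou, dice = per_class_metrics(pred, gt, organ)
    print(f"class {organ}: accuracy={acc:.3f}, IoU={iou:.3f}, Dice={dice:.3f}")
```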

3.2.3. Methods of Statistical Difference Analysis

In the statistical difference analysis, the t-test and analysis of variance were used to evaluate whether the differences between the proposed method and the comparison methods were significant. The t-test compares whether the means of two groups of samples differ in a statistically significant way. Analysis of variance tests whether there is a significant difference between the means of two or more groups of samples; depending on the number of indicators analyzed, it is classified into one-way analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). It determines whether the effects of different factors on the overall mean are statistically significant by decomposing the variance of the data into the variance due to each factor and the variance due to random error. These statistical methods enable a systematic assessment of the significance of the performance differences between the proposed method and existing methods, ensuring the reliability and scientific validity of the experimental results. A small sketch of these tests is given below.
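The following SciPy sketch runs a paired t-test and a one-way ANOVA on synthetic per-image PSNR values; whether the paper used paired or independent tests is not stated, so the paired variant is an assumption, and the numbers are hypothetical.

```python
# Minimal sketch of the significance tests described above (assumption: SciPy
# illustration with synthetic numbers; the paper's tests were run on the actual
# per-image PSNR/SSIM values).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
psnr_edsr_m = rng.normal(36.8, 0.8, size=127)  # hypothetical per-image PSNR values
psnr_vdsr = rng.normal(35.4, 0.8, size=127)

# Paired t-test: do the two methods' means differ on the same test images?
t_stat, p_t = stats.ttest_rel(psnr_edsr_m, psnr_vdsr)

# One-way ANOVA: do the group means differ?
f_stat, p_anova = stats.f_oneway(psnr_edsr_m, psnr_vdsr)

print(f"paired t-test p = {p_t:.3g}, one-way ANOVA p = {p_anova:.3g}")
# A MANOVA treating PSNR and SSIM jointly can be run in the same spirit with
# statsmodels' multivariate MANOVA class.
```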

3.3. Comparison of Experimental Results for Reconstruction

In the super-resolution reconstruction experiments, conventional photoacoustic image reconstruction techniques (e.g., time-reversal (TR) method) and EDSR-M were compared under different numbers of sensors. Meanwhile, the reconstruction effects of the TR method, bicubic interpolation, and deep learning-based reconstruction methods (including SRCNN, FSRCNN, VDSR, and EDSR-M) were evaluated. The reconstructed images were compared with ground truth images using PSNR and SSIM as quantitative indicators of image reconstruction quality. Meanwhile, the performance of the proposed method and other methods on PSNR and SSIM was analyzed for significant difference using various statistical analysis methods, including t-test, ANOVA, and MANOVA.
Table 2 and Figure 4 show the reconstruction results for photoacoustic images with different numbers of sensors. As the number of sensors increased, the artifacts in the reconstructed images were markedly reduced and the image quality improved significantly. The images reconstructed directly by the TR method generally suffered from blurring, lack of detail, and artifacts, and their average PSNR and SSIM were the lowest, indicating relatively poor reconstruction. In contrast, the EDSR-M method produced better reconstructions at all sensor counts, and its average PSNR and SSIM were always the highest. EDSR-M reduced artifacts, improved the overall image quality, and recovered detail information in the images, yielding the best reconstruction results.
Figure 5 shows the ground-truth image and the PAI reconstructed by each method. The PAI reconstructed directly by the TR method shows obvious blurring and information loss, and its overall visual quality is poor. Bicubic interpolation, as a basic interpolation method, made the reconstruction worse rather than better, indicating that it is not suitable for super-resolution reconstruction of PAIs. Among the deep-learning methods, EDSR-M recovered the fine structure and texture of the image best, making the image look clearer and more natural, which demonstrates its superiority in improving image quality and preserving detail. Numerically, the EDSR-M method also performed best: the PSNR and SSIM of its reconstructed images reached the optimal values, higher than those of the other reconstruction methods.
Table 3 lists the average reconstruction results of each method on the test set. The TR method, as the most basic direct reconstruction, achieved a PSNR of only 30.76 dB and an SSIM of 0.916. Bicubic interpolation performed poorly, probably because simple interpolation cannot effectively improve image quality. The image reconstructed by SRCNN had a lower SSIM, probably because SRCNN fails to recover image details effectively. Among FSRCNN, VDSR, and EDSR-M, EDSR-M achieved the best results, with a PSNR of 36.84 dB and an SSIM of 0.960, indicating that it is an effective super-resolution method that improves reconstruction quality while preserving image detail.
In Figure 6, a comparison of the results of multiple methods of super-resolution reconstruction on the test set is shown, and Figure 6a demonstrates the distribution of PSNR values of different reconstruction methods on the test set images. It can be seen that the box of the EDSR-M method is located at the top of the overall distribution, indicating a higher median PSNR value and higher quality reconstruction result. Figure 6b shows the distribution of SSIM values for different reconstruction methods on the test set images. Similar to PSNR, the box of the EDSR-M method is also located at the top of the overall distribution, indicating that the median of its SSIM values was higher and the distribution was more concentrated, which suggests that its reconstruction results were better in terms of structural similarity. Combining the two images, it can be seen that the EDSR-M method performed well in the task of PAI super-resolution reconstruction, and its reconstruction results were better than other methods in terms of PSNR and SSIM, which provides a valuable reference and development direction for the field of photoacoustic image super-resolution reconstruction.
Table 4 demonstrates the results of statistical difference analysis between the proposed method EDSR-M and other reconstruction methods in PSNR and SSIM. The p-values calculated by all the difference analysis methods were much smaller than the significance level α = 0.05 , indicating significant differences between the EDSR-M method and other reconstruction methods in PSNR and SSIM. Combined with the mean values in Table 3, the superior performance of the EDSR-M method relative to other methods is further verified, which is of great significance for further research and practical applications in the field of photoacoustic image reconstruction.

3.4. Comparison of Experimental Results for Segmentation

In the image segmentation experiments, the segmentation results of HR-PAI were compared with those of LR-PAI using DeepLabv3+ on the same training set (128 sensors). DeepLabv3+ was also compared with existing deep-learning image segmentation methods (FCN, SegNet, and U-Net). The predicted segmentation labels were compared with the ground-truth images using accuracy, IoU, BFScore, and the Dice coefficient as quantitative metrics of segmentation quality. Meanwhile, the Dice coefficients of the segmentation results for each organ were analyzed for significant differences using several statistical methods, including the t-test, ANOVA, and MANOVA.
Figure 7 shows the segmentation results of HR-PAI and LR-PAI on a single image. The segmentation of the HR image was clearly better than that of the LR image: its segmented labels were highly consistent with the ground-truth labels, with clear morphological edges, showing excellent precision and accuracy. In contrast, the segmented labels of the LR image showed obvious confusion, with mixed pixels between labels and blurred edges, and failed to capture the subtle features in the image.
Table 5 shows the segmentation effect of HR-PAI and LR-PAI in the test set. The overall segmentation effect of the HR image was significantly better than that of the LR image. The overall pixel accuracy of the HR image reached 0.992, the average pixel accuracy was 0.903, the mean IoU was 0.856, the weighted IoU was 0.984, and the BFScore was 0.880, whereas the corresponding metrics for the LR image were 0.986, 0.834, 0.806, 0.954, and 0.828, respectively. As seen in Table 6, the HR images had better pixel accuracy, IoU, and BFScore on all organ types than the LR images, and the average Dice coefficient was correspondingly higher. Table 7 shows the significant difference analysis results of the Dice coefficients of each organ segmentation of HR-PAI and LR-PAI. The p-values calculated by all difference analysis methods were smaller than the significance level α = 0.05 , which indicates that there were significant differences between HR images and LR images in the segmentation of different organs. In Figure 8, the Dice scores for each organ category on the test set are shown, where the Dice scores for all HR images were overall higher than those for LR images. This indicates that the segmentation results of the HR images were more accurate and finer at the organ level, and more closely matched the ground truth labels, while the segmentation results of the LR images were lower and had obvious deficiencies. In summary, HR images showed obvious advantages in the abdominal organ segmentation task with more accurate and clearer segmentation results, which provides important support and guidance for medical diagnosis and research based on PAI.
Figure 9 shows the segmentation results of HR-PAI for multiple networks. The segmentation performance varied considerably between networks: FCN showed obvious deficiencies in segmenting edges and morphology, while SegNet could not distinguish the different organ types well. U-Net segmented better, but exhibited pixel confusion and some regions were not segmented accurately. DeepLabv3+, in contrast, produced excellent segmentations that were highly similar to the ground-truth labels, with clear morphological edges, showing the best results.
Table 8 lists the segmentation evaluation metrics of HR-PAI on the test set. DeepLabv3+ achieved the best results in all five metrics, namely global accuracy, mean accuracy, mean IoU, weighted IoU, and mean BFScore, reaching 0.992, 0.903, 0.856, 0.984, and 0.880, respectively. Table 9 and Figure 10 show the Dice coefficients of the segmentation results for each network. The SegNet boxes touch zero because SegNet could not distinguish the different organ types well, so its Dice coefficient for the organs other than the liver was 0. The Dice coefficients of DeepLabv3+ were significantly higher than those of the other networks, indicating that it achieved the best results in segmenting the different organs. Table 10 shows the significance analysis of the Dice coefficients of each organ in HR-PAI between DeepLabv3+ and the other segmentation methods. The t-test results for DeepLabv3+ versus U-Net on the liver and right kidney showed larger p-values, possibly because U-Net segmented these particular organs relatively well, leading to smaller differences. Overall, however, the p-values calculated by the other difference analyses were much smaller than the significance level α = 0.05, indicating significant differences between DeepLabv3+ and the other methods. In summary, DeepLabv3+ not only achieved high accuracy and reliability in HR-PAI segmentation but also better captured and preserved image details, providing a better segmentation tool for clinical applications.

4. Discussion

In this paper, a new network architecture, EDSR-M, was proposed to address the unclear imaging of PAI caused by its imaging principle, to achieve high-quality super-resolution reconstruction of PAI, and to combine it with DeepLabv3+ for image segmentation. The effectiveness of the method was verified experimentally and compared with other reconstruction and segmentation methods. The reconstruction experiments show that EDSR-M outperformed the other reconstruction methods both numerically and visually, with average improvements of 11.02% in PSNR and 4.13% in SSIM and superior recovery of fine structure and texture. The segmentation experiments show that the segmentation of HR-PAI reconstructed by this model was much better than that of LR-PAI, with improvements of 8.27%, 6.20%, and 6.28% in mean accuracy, mean IoU, and mean BFScore, respectively. The LR-PAI segmentation results exhibited fuzzy edges and confused pixels, whereas the HR-PAI segmentation results had distinct edges and excellent accuracy. Furthermore, DeepLabv3+ performed well in PAI segmentation, with significantly higher accuracy, IoU, and BFScore than the other approaches, demonstrating the relevance of increasing PAI resolution for better segmentation outcomes.
In summary, deep learning-based reconstruction methods can effectively improve the reconstruction quality of PAI, which helps to enhance the accuracy and reliability of medical diagnosis. HR images provide more accurate labels, which can better help doctors understand the PAI and provide more valuable information for clinical diagnosis. In future research, the performance of the method with real PAI data and other medical images will be further validated to improve the generalization ability and robustness of the algorithm and to promote its wide application in the field of medical imaging.

Author Contributions

Conceptualization, Y.J.; methodology, Y.J.; software, Y.J.; validation, Y.C., R.H., Y.L., S.Y. and J.Z.; data curation, Y.J.; writing—original draft preparation, Y.J.; writing—review and editing, Y.J., Y.C., R.H. and J.Z.; supervision, H.C.; funding acquisition, H.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China (grant number 12374440).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Acknowledgments

Our heartfelt thanks go to all those who have provided sincere and selfless support in the writing of this paper, especially our cohort Yi Chen and Ruonan He.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Zou, Y.; Lin, Y.X.; Zhu, Q. PA-NeRF, a neural radiance field model for 3D photoacoustic tomography reconstruction from limited Bscan data. Biomed. Opt. Express 2024, 15, 1651–1667. [Google Scholar] [CrossRef]
  2. Liang, Z.Y.; Zhang, S.Y.; Liang, Z.C.; Mo, Z.X.; Zhang, X.M.; Zhong, Y.T.; Chen, W.F.; Qi, L. Deep learning acceleration of iterative model-based light fluence correction for photoacoustic tomography. Photoacoustics 2024, 37, 100601. [Google Scholar] [CrossRef]
  3. Zhang, Y.; Glorieux, C.; Yang, S.F.; Gu, K.; Xia, Z.Y.; Hou, R.J.; Hou, L.P.; Liu, X.F.; Xiong, J.C. Adaptive polarization photoacoustic computed tomography for biological anisotropic tissue imaging. Photoacoustics 2023, 32, 100543. [Google Scholar] [CrossRef] [PubMed]
  4. Paltauf, G. Photoacoustic tomography reveals structural and functional cardiac images of animal models. Light-Sci. Appl. 2023, 12, 42. [Google Scholar] [CrossRef]
  5. Wang, R.F.; Zhu, J.; Xia, J.; Yao, J.J.; Shi, J.H.; Li, C.Y. Photoacoustic imaging with limited sampling: A review of machine learning approaches. Biomed. Opt. Express 2023, 14, 1777–1799. [Google Scholar] [CrossRef]
  6. Sun, Z.; Du, J.J.; Yao, Y.; Meng, Q.; Sun, H.F. A Deep Learning Method for Motion Artifact Correction in Intravascular Photoacoustic Image Sequence. IEEE Trans. Med. Imaging 2023, 42, 66–78. [Google Scholar] [CrossRef]
  7. Yu, Y.; Feng, T.; Qiu, H.; Gu, Y.; Chen, Q.; Zuo, C.; Ma, H. Simultaneous photoacoustic and ultrasound imaging: A review. Ultrasonics 2024, 139, 107277. [Google Scholar] [CrossRef] [PubMed]
  8. Zhang, J.; Chen, B.; Zhou, M.; Lan, H.; Gao, F. Photoacoustic Image Classification and Segmentation of Breast Cancer: A Feasibility Study. IEEE Access 2019, 7, 5457–5466. [Google Scholar] [CrossRef]
  9. Nyayapathi, N.; Lim, R.; Zhang, H.; Zheng, W.; Wang, Y.; Tiao, M.; Oh, K.W.; Fan, X.C.; Bonaccio, E.; Takabe, K.; et al. Dual Scan Mammoscope (DSM)—A New Portable Photoacoustic Breast Imaging System with Scanning in Craniocaudal Plane. IEEE Trans. Biomed. Eng. 2020, 67, 1321–1327. [Google Scholar] [CrossRef]
  10. Li, D.; Humayun, L.; Vienneau, E.; Vu, T.; Yao, J. Seeing through the Skin: Photoacoustic Tomography of Skin Vasculature and Beyond. JID Innov. Ski. Sci. Mol. Popul. Health 2021, 1, 100039. [Google Scholar] [CrossRef]
  11. Von Knorring, T.; Israelsen, N.M.; Ung, V.; Formann, J.L.; Jensen, M.; Haedersdal, M.; Bang, O.; Fredman, G.; Mogensen, M. Differentiation Between Benign and Malignant Pigmented Skin Tumours Using Bedside Diagnostic Imaging Technologies: A Pilot Study. Acta Derm.-Venereol. 2022, 102, adv00634. [Google Scholar] [CrossRef] [PubMed]
  12. Schoenhagen, P.; Vince, D.G. Intravascular Photoacoustic Tomography of Coronary Atherosclerosis Riding the Waves of Light and Sound. J. Am. Coll. Cardiol. 2014, 64, 391–393. [Google Scholar] [CrossRef]
  13. Iskander-Rizk, S.; Wu, M.; Springeling, G.; van Beusekom, H.M.M.; Mastik, F.; Hekkert, M.T.L.; Beurskens, R.H.S.H.; Hoogendoorn, A.; Hartman, E.M.J.; van der Steen, A.F.W.; et al. In vivo intravascular photoacoustic imaging of plaque lipid in coronary atherosclerosis. Eurointervention 2019, 15, 452–456. [Google Scholar] [CrossRef] [PubMed]
  14. Zheng, Y.; Liu, M.; Jiang, L. Progress of photoacoustic imaging combined with targeted photoacoustic contrast agents in tumor molecular imaging. Front. Chem. 2022, 10, 1077937. [Google Scholar] [CrossRef] [PubMed]
  15. Nasri, D.; Manwar, R.; Kaushik, A.; Er, E.E.; Avanaki, K. Photoacoustic imaging for investigating tumor hypoxia: A assessment. Theranostics 2023, 13, 3346–3367. [Google Scholar] [CrossRef]
  16. Deng, H.D.; Qiao, H.; Dai, Q.H.; Ma, C. Deep learning in photoacoustic imaging: A review. J. Biomed. Opt. 2021, 26, 040901. [Google Scholar] [CrossRef]
  17. Arridge, S.; Beard, P.; Betcke, M.; Cox, B.; Huynh, N.; Lucka, F.; Ogunlade, O.; Zhang, E. Accelerated high-resolution photoacoustic tomography via compressed sensing. Phys. Med. Biol. 2016, 61, 8908–8940. [Google Scholar] [CrossRef]
  18. Kim, M.; Pelivanov, I.; O’Donnell, M. Review of Deep Learning Approaches for Interleaved Photoacoustic and Ultrasound (PAUS) Imaging. IEEE Trans. Ultrason. Ferroelectr. Freq. Control 2023, 70, 1591–1606. [Google Scholar] [CrossRef]
  19. Gao, Y.; Xu, W.Y.; Chen, Y.M.; Xie, W.Y.; Cheng, Q. Deep Learning-Based Photoacoustic Imaging of Vascular Network Through Thick Porous Media. IEEE Trans. Med. Imaging 2022, 41, 2191–2204. [Google Scholar] [CrossRef]
  20. Schellenberg, M.; Gröhl, J.; Dreher, K.K.; Nölke, J.H.; Holzwarth, N.; Tizabi, M.D.; Seitel, A.; Maier-Hein, L. Photoacoustic image synthesis with generative adversarial networks. Photoacoustics 2022, 28, 100402. [Google Scholar] [CrossRef]
  21. Choi, S.; Yang, J.; Lee, S.Y.; Kim, J.; Lee, J.; Kim, W.J.; Lee, S.; Kim, C. Deep Learning Enhances Multiparametric Dynamic Volumetric Photoacoustic Computed Tomography In Vivo (DL-PACT). Adv. Sci. 2023, 10, 2202089. [Google Scholar] [CrossRef]
  22. Wang, J.Y.; Awad, M.; Zhou, R.X.; Wang, Z.X.; Wang, X.T.; Feng, X.; Yang, Y.; Meyer, C.; Kramer, C.M.; Salerno, M. High-resolution spiral real-time cardiac cine imaging with deep learning-based rapid image reconstruction and quantification. NMR Biomed. 2024, 37, e5051. [Google Scholar] [CrossRef] [PubMed]
  23. Apivanichkul, K.; Phasukkit, P.; Dankulchai, P.; Sittiwong, W.; Jitwatcharakomol, T. Enhanced Deep-Learning-Based Automatic Left-Femur Segmentation Scheme with Attribute Augmentation. Sensors 2023, 23, 5720. [Google Scholar] [CrossRef] [PubMed]
  24. Lee, S.B.; Hong, Y.; Cho, Y.J.; Jeong, D.; Lee, J.; Yoon, S.H.; Lee, S.; Choi, Y.H.; Cheon, J.E. Deep Learning-Based Computed Tomography Image Standardization to Improve Generalizability of Deep Learning-Based Hepatic Segmentation. Korean J. Radiol. 2023, 24, 294–304. [Google Scholar] [CrossRef] [PubMed]
  25. Chen, R.Z.; Liu, M.; Chen, W.X.; Wang, Y.A.; Meijering, E. Deep learning in mesoscale brain image analysis: A review. Comput. Biol. Med. 2023, 167, 107617. [Google Scholar] [CrossRef] [PubMed]
  26. Yang, C.C.; Lan, H.R.; Gao, F.; Gao, F. Review of deep learning for photoacoustic imaging. Photoacoustics 2021, 21, 100215. [Google Scholar] [CrossRef]
  27. Huo, H.M.; Deng, H.D.; Gao, J.P.; Duan, H.Q.; Ma, C. Mitigating Under-Sampling Artifacts in 3D Photoacoustic Imaging Using Res-UNet Based on Digital Breast Phantom. Sensors 2023, 23, 5970. [Google Scholar] [CrossRef] [PubMed]
  28. Zheng, W.H.; Zhang, H.J.; Huang, C.Q.; Shijo, V.; Xu, C.H.; Xu, W.Y.; Xia, J. Deep Learning Enhanced Volumetric Photoacoustic Imaging of Vasculature in Human. Adv. Sci. 2023, 10, e2301277. [Google Scholar] [CrossRef] [PubMed]
  29. Rix, T.; Dreher, K.K.; Noelke, J.H.; Schellenberg, M.; Tizabi, M.D.; Seitel, A.; Maier-Hein, L. Efficient Photoacoustic Image Synthesis with Deep Learning. Sensors 2023, 23, 7085. [Google Scholar] [CrossRef]
  30. Chen, L.-C.; Zhu, Y.; Papandreou, G.; Schroff, F.; Adam, H. Encoder-decoder with atrous separable convolution for semantic image segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 801–818. [Google Scholar]
  31. Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144. [Google Scholar]
  32. Treeby, B.E.; Cox, B.T. k-Wave: MATLAB toolbox for the simulation and reconstruction of photoacoustic wave fields. J. Biomed. Opt. 2010, 15, 021314-12. [Google Scholar] [CrossRef]
  33. Xia, J.; Yao, J.; Wang, L.V. Photoacoustic Tomography: Principles and Advances. Prog. Electromagn. Res.-Pier 2014, 147, 1–22. [Google Scholar] [CrossRef] [PubMed]
  34. Beard, P. Biomedical photoacoustic imaging. Interface Focus 2011, 1, 602–631. [Google Scholar] [CrossRef] [PubMed]
  35. Xu, M.H.; Wang, L.H.V. Universal back-projection algorithm for photoacoustic computed tomography. Phys. Rev. E 2005, 71, 016706. [Google Scholar] [CrossRef] [PubMed]
  36. Hristova, Y.; Kuchment, P.; Nguyen, L. Reconstruction and time reversal in thermoacoustic tomography in acoustically homogeneous and inhomogeneous media. Inverse Probl. 2008, 24, 055006. [Google Scholar] [CrossRef]
  37. Han, D.; Sun, Z.; Yuan, Y. Time-reversal based reconstruction of intravascular photoacoustic images. J. Image Graph. 2016, 21, 442–450. [Google Scholar]
  38. Zhong, X.Y.; Liang, N.N.; Cai, A.L.; Yu, X.H.; Li, L.; Yan, B. Super-resolution image reconstruction from sparsity regularization and deep residual-learned priors. J. X-Ray Sci. Technol. 2023, 31, 319–336. [Google Scholar] [CrossRef] [PubMed]
  39. Huan, H.; Zou, N.; Zhang, Y.; Xie, Y.Q.; Wang, C. Remote sensing image reconstruction using an asymmetric multi-scale super-resolution network. J. Supercomput. 2022, 78, 18524–18550. [Google Scholar] [CrossRef]
  40. Schellenberg, M.; Dreher, K.K.; Holzwarth, N.; Isensee, F.; Reinke, A.; Schreck, N.; Seitel, A.; Tizabi, M.D.; Maier-Hein, L.; Groehl, J. Semantic segmentation of multispectral photoacoustic images using deep learning. Photoacoustics 2022, 26, 100341. [Google Scholar] [CrossRef] [PubMed]
  41. Liang, Z.C.; Zhang, S.Y.; Wu, J.; Li, X.P.; Zhuang, Z.J.; Feng, Q.J.; Chen, W.F.; Qi, L. Automatic 3-D segmentation and volumetric light fluence correction for photoacoustic tomography based on optimal 3-D graph search. Med. Image Anal. 2022, 75, 102275. [Google Scholar] [CrossRef]
  42. Sahlstrom, T.; Pulkkinen, A.; Tick, J.; Leskinen, J.; Tarvainen, T. Modeling of Errors Due to Uncertainties in Ultrasound Sensor Locations in Photoacoustic Tomography. IEEE Trans. Med. Imaging 2020, 39, 2140–2150. [Google Scholar] [CrossRef]
  43. Wang, L.V. Prospects of photoacoustic tomography. Med. Phys. 2008, 35, 5758–5767. [Google Scholar] [CrossRef] [PubMed]
  44. Du, J.; Yang, S.; Qiao, Y.; Lu, H.; Dong, H. Recent progress in near-infrared photoacoustic imaging. Biosens. Bioelectron. 2021, 191, 113478. [Google Scholar] [CrossRef] [PubMed]
  45. Singh, M.K.A.; Xia, W. Portable and Affordable Light Source-Based Photoacoustic Tomography. Sensors 2020, 20, 6173. [Google Scholar] [CrossRef] [PubMed]
  46. Sheu, Y.-L.; Li, P.-C. Simulations of photoacoustic wave propagation using a finite-difference time-domain method with Berenger’s perfectly matched layers. J. Acoust. Soc. Am. 2008, 124, 3471–3480. [Google Scholar] [CrossRef]
  47. Thanh Dat, L.; Kwon, S.-Y.; Lee, C. Segmentation and Quantitative Analysis of Photoacoustic Imaging: A Review. Photonics 2022, 9, 176. [Google Scholar] [CrossRef]
  48. Gulenko, O.; Yang, H.; Kim, K.; Youm, J.Y.; Kim, M.; Kim, Y.; Jung, W.; Yang, J.-M. Deep-Learning-Based Algorithm for the Removal of Electromagnetic Interference Noise in Photoacoustic Endoscopic Image Processing. Sensors 2022, 22, 3961. [Google Scholar] [CrossRef] [PubMed]
  49. Ul Haq, I.; Ali, H.; Wang, H.Y.; Cui, L.; Feng, J. BTS-GAN: Computer-aided segmentation system for breast tumor using MRI and conditional adversarial networks. Eng. Sci. Technol.-Int. J.-Jestech 2022, 36, 101154. [Google Scholar] [CrossRef]
  50. Kavur, A.E.; Gezer, N.S.; Barış, M.; Aslan, S.; Conze, P.-H.; Groza, V.; Pham, D.D.; Chatterjee, S.; Ernst, P.; Özkan, S. CHAOS challenge-combined (CT-MR) healthy abdominal organ segmentation. Med. Image Anal. 2021, 69, 101950. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of the deep-learning structure for PAI.
Figure 2. Proposed EDSR-M network architecture.
Figure 3. DeepLabV3+ network architecture.
Figure 4. Photoacoustic image reconstruction results for different numbers of sensors: (a) 32 sensors, (b) 64 sensors, and (c) 128 sensors.
Figure 5. Results of PAI reconstruction using different methods (128 sensors).
Figure 6. Comparison of super-resolution reconstruction performance of PAI by different methods (128 sensors). (a) PSNR of different methods’ reconstruction results; (b) SSIM of different methods’ reconstruction results.
Figure 7. Segmentation results of HR-PAIs and LR-PAIs (Orange—Liver, Blue—Right kidney, Cyan—Left kidney, Yellow—Spleen).
Figure 8. Segmentation Dice coefficients for each organ in HR-PAI and LR-PAI.
Figure 9. The segmentation results of different segmentation methods in HR-PAI (Orange—Liver, Blue—Right kidney, Cyan—Left kidney, Yellow—Spleen).
Figure 10. The Dice coefficients of each organ in HR-PAI by different segmentation methods.
Table 1. The main hyperparameters of the network.
Network       Main Hyperparameters       Specific Values
EDSR-M        Residual blocks            32
              Feature maps               256
              Kernel size                3 × 3
              Activation function        ReLU
              Optimizer                  SGDM
              Initial learning rate      0.1
              Loss function              MSE
DeepLabV3+    Backbone network           ResNet-50
              Dilation rate in ASPP      [6, 12, 18, 24]
              Upsampling factor          4
              Activation function        ReLU
              Optimizer                  SGDM
              Initial learning rate      0.001
              Loss function              Cross-entropy
Table 2. Average PSNR and SSIM of reconstruction results with different number of sensors.
Methods                 32 Sensors    64 Sensors    128 Sensors
TR        PSNR (dB)     29.32         29.52         30.76
          SSIM          0.889         0.899         0.916
EDSR-M    PSNR (dB)     31.46         33.98         36.84
          SSIM          0.903         0.930         0.960
Table 3. Mean PSNR and SSIM for different reconstruction methods (128 sensors).
             TR       Bicubic    SRCNN    FSRCNN    VDSR     EDSR-M
PSNR (dB)    30.76    30.66      34.82    35.01     35.37    36.84
SSIM         0.916    0.915      0.899    0.936     0.945    0.960
Table 4. Analysis of significant differences between EDSR-M and other reconstruction methods in PSNR and SSIM (128 sensors).
                  TR              Bicubic         SRCNN           FSRCNN          VDSR
PSNR   t-test     6.91 × 10−43    5.73 × 10−44    3.34 × 10−40    5.00 × 10−20    4.14 × 10−21
       ANOVA      3.50 × 10−42    1.23 × 10−43    1.27 × 10−18    3.10 × 10−10    8.12 × 10−09
SSIM   t-test     7.01 × 10−63    5.15 × 10−63    6.07 × 10−76    1.24 × 10−62    4.04 × 10−46
       ANOVA      1.83 × 10−29    9.01 × 10−31    1.14 × 10−57    6.31 × 10−13    1.70 × 10−08
MANOVA *          1.90 × 10−267   1.62 × 10−267   1.17 × 10−266   5.82 × 10−267   2.56 × 10−268
* MANOVA analyzed PSNR and SSIM as a whole; significance level α = 0.05.
Table 5. Overall segmentation evaluation indexes for HR-PAI and LR-PAI.
          Global Accuracy    Mean Accuracy    Mean IoU    Weighted IoU    Mean BFScore
HR-PAI    0.992              0.903            0.856       0.984           0.880
LR-PAI    0.986              0.834            0.806       0.954           0.828
Table 6. Evaluation indexes for segmentation of each organ in HR-PAI and LR-PAI.
                         Accuracy    IoU      BFScore    Dice Score
HR-PAI    Liver          0.890       0.829    0.703      0.801
          Right kidney   0.908       0.847    0.901      0.811
          Left kidney    0.887       0.831    0.893      0.773
          Spleen         0.846       0.800    0.845      0.772
LR-PAI    Liver          0.820       0.772    0.611      0.736
          Right kidney   0.821       0.785    0.819      0.757
          Left kidney    0.776       0.726    0.773      0.724
          Spleen         0.729       0.738    0.742      0.688
Table 7. Analysis of significant differences in Dice coefficients of organ segmentation between HR-PAI and LR-PAI.
            Liver          Right Kidney    Left Kidney    Spleen
t-test      3.46 × 10−3    1.97 × 10−2     2.67 × 10−3    3.16 × 10−3
ANOVA       2.01 × 10−2    3.26 × 10−2     1.86 × 10−2    1.94 × 10−2
MANOVA *    1.33 × 10−4
* MANOVA analyzed the Dice coefficients of each organ as a whole; significance level α = 0.05.
Table 8. Segmentation evaluation indexes of different segmentation methods in HR-PAI.
Methods       Global Accuracy    Mean Accuracy    Mean IoU    Weighted IoU    Mean BFScore
FCN           0.981              0.646            0.610       0.963           0.653
SegNet        0.975              0.345            0.307       0.955           0.708
U-Net         0.990              0.864            0.785       0.982           0.861
DeepLabv3+    0.992              0.903            0.856       0.984           0.880
Table 9. The average Dice coefficients of each organ in HR-PAI by different segmentation methods.
Methods       Liver    Right Kidney    Left Kidney    Spleen
FCN           0.526    0.499           0.361          0.359
SegNet        0.451    0               0              0
U-Net         0.633    0.626           0.496          0.371
DeepLabv3+    0.792    0.811           0.774          0.772
Table 10. Analysis of significant differences in Dice coefficients of various organs in HR-PAI between DeepLabv3+ and other segmentation methods.
                       Liver           Right Kidney    Left Kidney     Spleen
FCN       t-test       2.30 × 10−12    1.93 × 10−9     1.73 × 10−9     6.87 × 10−10
          ANOVA        1.56 × 10−7     8.26 × 10−7     2.09 × 10−10    1.40 × 10−8
          MANOVA *     3.15 × 10−9
SegNet    t-test       9.51 × 10−15    3.10 × 10−29    7.22 × 10−29    4.60 × 10−22
          ANOVA        5.13 × 10−11    1.23 × 10−36    3.81 × 10−32    9.74 × 10−27
          MANOVA *     7.35 × 10−29
U-Net     t-test       0.572           0.115           8.20 × 10−3     1.157 × 10−3
          ANOVA        2.63 × 10−3     8.20 × 10−3     4.81 × 10−5     3.28 × 10−7
          MANOVA *     1.43 × 10−7
* MANOVA analyzed the Dice coefficients of each organ as a whole; significance level α = 0.05.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
