Article

Hybrid Multimodal Medical Image Fusion Method Based on LatLRR and ED-D2GAN

1
School of Computer Science and Engineering, North Minzu University, Yinchuan 750021, China
2
Key Laboratory of Image and Graphics Intelligent Processing of State Ethnic Affairs Commission, North Minzu University, Yinchuan 750021, China
3
School of Science, Ningxia Medical University, Yinchuan 750004, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2022, 12(24), 12758; https://doi.org/10.3390/app122412758
Submission received: 9 November 2022 / Revised: 9 December 2022 / Accepted: 9 December 2022 / Published: 12 December 2022
(This article belongs to the Special Issue Advanced Technologies in Medical Image Processing and Analysis)

Abstract

In order to better preserve the anatomical structure information of Computed Tomography (CT) source images and highlight the metabolic information of lesion regions in Positron Emission Tomography (PET) source images, a hybrid multimodal medical image fusion method (LatLRR-GAN) based on Latent low-rank representation (LatLRR) and a dual-discriminator Generative Adversarial Network (ED-D2GAN) is proposed. Firstly, considering the denoising capability of LatLRR, the source images were decomposed by LatLRR. Secondly, the ED-D2GAN model was put forward as the low-rank region fusion method, which can fully extract the information contained in the low-rank region images. In this model, encoder and decoder networks are used in the generator, and convolutional neural networks are used in the dual discriminators. Thirdly, a threshold adaptive weighting algorithm based on the region energy ratio is proposed as the salient region fusion rule, which can improve the overall sharpness of the fused image. The experimental results show that, compared with the best of the other six methods, the proposed method is effective on multiple objective evaluation metrics, including the average gradient, edge intensity, information entropy, spatial frequency and standard deviation. Across the two experiments, these metrics are improved by 35.03%, 42.42%, 4.66%, 8.59% and 11.49% on average, respectively.

1. Introduction

Common medical images, such as Computed Tomography (CT) images, have high spatial resolution and can provide accurate anatomical information about lesions for the clinical diagnosis of patients [1]. However, due to their low soft-tissue resolution, CT images have certain limitations in qualitative diagnosis [2]. Positron Emission Tomography (PET) images are highly sensitive for the early diagnosis of tumors, but because of their low spatial resolution, they cannot provide accurate anatomical structure information about the lesion [3]. In the face of complex diseases, medical images of a single modality cannot provide sufficient auxiliary information for clinicians, whereas fused images can simultaneously present the effective information of images of different modalities and thus improve the ability to identify the lesion area. Image fusion therefore has important clinical application value in early diagnosis, clinical staging, localization of lesion areas, formulation of diagnosis and treatment plans, and evaluation of the curative effect of tumors.
So far, a large number of image fusion methods have been proposed, mainly including methods based on multi-scale decomposition, sparse representation, deep learning, and hybrid models. Among them, image fusion methods based on multi-scale decomposition [4] first decompose the source images into different low-frequency and high-frequency sub-bands. Then, specific fusion rules are used to synthesize each sub-band of the fused image. Finally, the fusion result is reconstructed by the corresponding inverse transform [5]. Diwakar et al. [6] proposed a multi-modal medical image fusion method in the non-subsampled shearlet transform (NSST) domain for the Internet of Medical Things. The source images were decomposed into low-frequency and high-frequency components by an NSST-based decomposition method. In the low-frequency component, weighted fusion based on significance features is performed using multi-local extrema (MLE) and a co-occurrence filter, while a fuzzy-logic-based fusion rule is used in the high-frequency component. Such methods can effectively capture the detail information of the source images, but they require manually designed, complex fusion rules, and the number of decomposition layers and the design of the fusion rules directly affect the quality of the final fused images. Compared with methods based on multi-scale decomposition, sparse representation methods [7] divide the source images into multiple overlapping blocks with a sliding window and share the same set of sparse coefficients across the high-frequency and low-frequency images, which can reduce visual artifacts in the fused images and improve robustness to registration errors. Li et al. [8] proposed a method for multi-modal medical image denoising and fusion based on sparse representation, in which group sparse representation provides satisfactory fusion results with fewer artifacts thanks to its strong robustness. However, these methods are very time-consuming, and dictionary learning is complex.
In recent years, deep learning models have been widely used in image segmentation [9], image analysis [10], image detection [11], image fusion [12], and image classification [13] owing to their good feature extraction and representation capabilities. In the field of image fusion, Liu et al. [14] first introduced the Convolutional Neural Network (CNN) into multi-focus image fusion. Learning a CNN model effectively avoids the need of traditional methods to design complex fusion rules, but this method only uses the results of the last layer of the network. FusionGAN [15], proposed by Ma et al., introduced the Generative Adversarial Network (GAN) [16] into the image fusion field, establishing an adversarial game between the generator and discriminator that yields fused images with outstanding target information. Fu et al. [17] realized end-to-end anatomical and functional medical image fusion based on GAN and obtained fused images with clear edges and rich details. However, these methods use only one discriminator, which easily causes the loss of effective information in the fused images, and fusion methods based on deep learning often pay little attention to noise processing of the source images. In addition, compared with methods based on multi-scale decomposition, it remains a challenge to achieve effective fusion by designing network architectures and loss functions.
Considering the advantages and limitations of any single fusion method, researchers have proposed methods based on hybrid models that apply deep learning within the framework of traditional image fusion to further improve the quality of fused images. Latent low-rank representation (LatLRR) [18] is usually used in clustering analysis tasks; it can remove the noise region contained in the source images and extract the global and local structure of the data. Gao et al. [19] combined LatLRR with CNN, used LatLRR and Rolling Guided Image Filtering (RGIF) to decompose the source images at two levels, and used CNN-based fusion rules to fuse the detail layers, thus improving the contrast and sharpness of the fused images. Xia et al. [20] proposed a multi-modal medical image fusion method based on multi-scale transformation and a deep stacked convolutional neural network (DSCNN). In this method, the trained DSCNN model was used to decompose the source images into low-frequency and high-frequency images, which were then fused separately. The proposed method can adaptively decompose and reconstruct the image in the fusion process. However, image fusion methods based on CNN depend strongly on the quality of the source images, and, owing to the particularity of medical images, only a small number of registered medical images are available in current public datasets. Therefore, the application of CNN-based image fusion methods in multimodal medical image fusion has certain limitations. Wang et al. [21] proposed a medical image fusion method based on GAN and the shift-invariant shearlet transform (SIST). In this method, SIST was used to decompose the source images, the trained GAN model served as the fusion rule for the high-pass sub-bands, and the low-pass sub-bands were fused by local energy-weighted summation and a bilateral filter. This can effectively suppress artifacts and distortion in the fused images, but some details of the source images are easily lost because only one discriminator is used.
To address the above problems, in order to better preserve the anatomical structure and contour intensity information of the CT source images, highlight the functional and metabolic information of the lesion area in the PET source images, and enhance the visibility of the source images, this paper proposes a hybrid multimodal medical image fusion method (LatLRR-GAN) that combines LatLRR with a dual-discriminator GAN (ED-D2GAN). Considering the denoising ability of LatLRR, and in order to reduce the impact of noise in the fusion process, this paper first uses LatLRR to decompose the source images. Secondly, GAN has the following advantages: (1) strong feature extraction ability; (2) an end-to-end image fusion process; and (3) the quality of the fused images can be continuously adjusted through the adversarial game between the generator and discriminator. This paper therefore introduces GAN into the framework of multimodal medical image fusion based on decomposition transformation, alleviating the need to manually design complex fusion rules and improving the quality of the fused images. Additionally, considering the shortcomings of using a single discriminator, this paper uses dual discriminators to fully capture the valid information contained in the source images and highlight the lesion information in the fused images. Finally, because a fusion rule based on region energy can capture local region features and improve the sharpness of the fused images, this paper uses a threshold adaptive weighting algorithm based on the region energy ratio as the salient region fusion rule.
The main contributions can be summarized as follows:
  • A hybrid multimodal medical image fusion method based on LatLRR and ED-D2GAN is proposed, which can effectively realize the fusion of CT and PET images.
  • An image fusion strategy based on a dual discriminator GAN is proposed. Encoder and decoder networks are used in the generator; CNNs are used in dual discriminators, which can effectively preserve the anatomical structure information in CT source images and the functional information of the lesion region in PET source images.
  • A threshold adaptive weighting algorithm based on a region energy ratio is used as the fusion rule of salient region images, which improves the quality of fused images.
The remainder of the paper is presented in four sections. The fusion methods of low-rank region images and salient region images are described in detail in Section 2. Section 3 provides and analyses the experimental details and results. The experimental results and future work directions are discussed in Section 4. Finally, this paper is concluded in Section 5.

2. Proposed Method

The overall network architecture of this paper is shown in Figure 1. Among them, the overall framework process of the proposed method is described in Figure 1a, the overall network architecture of ED-D2GAN is shown in Figure 1b, the network structure of generator is shown in Figure 1c, and the network structure of the CT discriminator and PET discriminator of ED-D2GAN is shown in Figure 1d and Figure 1e, respectively.
This section describes the proposed fusion method in detail. Medical images often involve various tissues and organs of the human body and are therefore characterized by a large amount of data, complex structure, and significant noise. Therefore, in order to discard the noise in the source images and improve the quality of the fused images, LatLRR is first used to decompose the source images into a low-rank region, a salient region, and a noise region. The decomposition process of LatLRR is shown in Figure 2. Secondly, because the lesions occupy only a small area of the whole image, the background of the medical images must be handled properly to highlight the lesion information; accordingly, a fusion model based on ED-D2GAN is proposed to extract deeper features of the low-rank region images. The fusion process is shown in Figure 1b. Thirdly, the salient region images mainly reflect the detailed characteristics and edge information of the source images, so the choice of the salient region fusion rule has a great influence on the sharpness and edge distortion of the fused images. Because a pixel-based fusion rule cannot accurately reflect the strong correlation among multiple pixels in a local region, a threshold adaptive weighting algorithm based on the region energy ratio is proposed as the salient region fusion rule. The specific fusion process is given in Equations (8) to (13). Finally, the noise region images are discarded, and the final fused image is reconstructed by linear addition of the low-rank region fused image and the salient region fused image.
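For reference, the LatLRR decomposition used here can be stated as the following optimization problem; this is the standard formulation from [18], written out for completeness with X denoting a source image arranged as a data matrix:
$\min_{Z, L, E} \; \|Z\|_* + \|L\|_* + \lambda \|E\|_1 \quad \text{s.t.} \quad X = XZ + LX + E$,
where XZ is the low-rank component, LX is the salient component, E is the sparse noise component, ‖·‖_* denotes the nuclear norm, ‖·‖_1 denotes the L1 norm, and λ > 0 balances the noise term.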

2.1. Low-Rank Region Images Fusion Method

Due to the difference between PET and CT imaging mechanisms, the gray values of the two kinds of images are different. Malignant lesions with high metabolism appear as dark areas in PET images, while CT images can clearly show the distribution of bones and organs. In addition, the low-rank region images obtained by LatLRR decomposition concentrate most of the effective information of the source images. Considering that low-rank region images mainly present background information, an image fusion method based on ED-D2GAN is proposed in this paper to better highlight the lesions in the fused images. The low-rank region image of CT obtained by LatLRR decomposition is first enhanced to improve the detail and contrast information in the low-rank region fused image. In addition, considering that the single discriminator used in the basic GAN model cannot retain all the valid information contained in the two source images at the same time, this paper uses two discriminators to discriminate the fused image against each input source image, respectively, to improve the quality of the fused image. The generator and discriminators play adversarial roles in the whole architecture. The overall network architecture of ED-D2GAN is shown in Figure 1b.

2.1.1. Generator

In order to obtain more detailed information from the source images, a generator network architecture based on an encoder and decoder is designed in this paper, which is used to fuse the enhanced image CT_L1_E of the CT low-rank region decomposed by LatLRR and the PET low-rank region image PET_L2 decomposed by LatLRR. The generator network architecture is shown in Figure 1c. Feature extraction and fusion are performed in the encoder. Firstly, CT_L1_E and PET_L2 are concatenated in the channel dimension as the input of the encoder network, which outputs the fused feature maps. Finally, the fused feature maps are reconstructed in the decoder to obtain a low-rank region fused image with the same resolution as the source images. The encoder contains five convolutional layers, and the stride of each convolutional layer is set to 1. The decoder also contains five convolutional layers, and its network architecture is shown in Figure 1c. In order to better preserve the contrast information of the source images, a Batch Normalization (BN) layer [22] is introduced to overcome the sensitivity to data initialization, avoid gradient explosion or vanishing, and accelerate network training. In addition, a LeakyReLU activation function is used to improve the network performance.
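As a concrete illustration, the following tf.keras sketch builds a generator with the layout described above (two single-channel inputs concatenated along the channel axis, five stride-1 convolutional layers with BN and LeakyReLU in the encoder, five convolutional layers in the decoder, and a single-channel output at the source resolution). The filter counts, kernel size, and output activation are not specified in the paper and are assumptions made only for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_bn_lrelu(x, filters, kernel=3):
    # Conv (stride 1) -> BatchNorm -> LeakyReLU, matching the layer pattern in the text.
    x = layers.Conv2D(filters, kernel, strides=1, padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.LeakyReLU()(x)

def build_generator(height=256, width=256):
    ct_l1_e = layers.Input((height, width, 1), name="CT_L1_E")  # enhanced CT low-rank image
    pet_l2 = layers.Input((height, width, 1), name="PET_L2")    # PET low-rank image
    x = layers.Concatenate(axis=-1)([ct_l1_e, pet_l2])          # channel-wise concatenation

    # Encoder: five stride-1 convolutional layers for feature extraction and fusion.
    for filters in (16, 32, 64, 64, 64):                        # assumed channel widths
        x = conv_bn_lrelu(x, filters)

    # Decoder: five convolutional layers reconstructing a one-channel fused image
    # at the same resolution as the inputs (output activation is an assumption).
    for filters in (64, 64, 32, 16):
        x = conv_bn_lrelu(x, filters)
    fused = layers.Conv2D(1, 3, padding="same", name="fused_low_rank")(x)
    return tf.keras.Model([ct_l1_e, pet_l2], fused, name="ED_D2GAN_generator")

generator = build_generator()
```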

2.1.2. Discriminator

The discriminators are designed to act against the generator: D_CT aims to correctly distinguish the generated image I from CT_L1_E, and D_PET aims to correctly distinguish the generated image I from PET_L2. In the proposed method, D_CT and D_PET are two independent discriminators with the same architecture. Compared with the generator, the discriminator architecture is relatively simple, as shown in Figure 1d and Figure 1e, respectively. The discriminator is made up of three convolutional layers. In order to avoid introducing noise, convolutional layers with a stride of 2 were used instead of pooling layers so that the discriminator has a better classification effect. The LeakyReLU activation function is used in the first three layers, and the tanh activation function is used in the last layer to generate a scalar that estimates the probability that the input image comes from the low-rank region source images rather than from the generated image.
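The sketch below builds one such discriminator with tf.keras following this description (three stride-2 convolutional layers with LeakyReLU and a final tanh layer producing a scalar score). The filter counts and the use of a dense output layer are assumptions; two independent copies are instantiated for D_CT and D_PET.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_discriminator(name, height=256, width=256):
    img = layers.Input((height, width, 1))
    x = img
    for filters in (32, 64, 128):                                    # assumed channel widths
        x = layers.Conv2D(filters, 3, strides=2, padding="same")(x)  # stride 2 replaces pooling
        x = layers.LeakyReLU()(x)
    x = layers.Flatten()(x)
    score = layers.Dense(1, activation="tanh")(x)                    # scalar real-vs-generated score
    return tf.keras.Model(img, score, name=name)

d_ct = build_discriminator("D_CT")
d_pet = build_discriminator("D_PET")
```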

2.1.3. Loss Function

The model architecture of ED-D2GAN consists of three parts: the generator G, the discriminator D_CT, and the discriminator D_PET. Therefore, the loss function also includes three parts: the generator loss function L_G, the discriminator loss function L_DCT, and the discriminator loss function L_DPET.
  1. Loss function of generator
Since the training process of the basic GAN model is unstable, this paper uses a content loss to impose additional constraints on the generator. The loss function of the generator consists of the content loss between the generated image and the low-rank region images of the source images, and the adversarial loss between the generator and the discriminators, defined as follows:
$L_G = L_{G_{adv}} + \alpha L_c$,  (1)
where L_Gadv is the adversarial loss, L_c is the content loss, and α is a parameter that controls the trade-off.
L_Gadv directs the generator to generate a realistic fused image through the adversarial game between the generator and the discriminators, in order to fool both discriminators. It is defined as follows:
$L_{G_{adv}} = \mathbb{E}\left[\log\left(1 - D_{CT}\left(G(CT\_L1\_E, PET\_L2)\right)\right)\right] + \mathbb{E}\left[\log\left(1 - D_{PET}\left(G(CT\_L1\_E, PET\_L2)\right)\right)\right]$,  (2)
where D_CT and D_PET represent the two discriminators, G(CT_L1_E, PET_L2) represents the fused image, and E denotes the mathematical expectation; that is, the generator expects the generated fused image to deceive the discriminators.
L_c constrains the content similarity between the generated image and the low-rank region images of the source images, so that the fused image retains more of the effective information of the source images. Since the texture details of CT images are mainly characterized by gradient changes, while the functional information of PET images can be characterized by pixel intensity, L_c includes a gradient loss and an intensity loss, defined as follows:
$L_c = L_{grad} + \beta L_{in}$,  (3)
where L_grad represents the gradient loss, L_in represents the intensity loss, and β is a parameter controlling the trade-off.
L_grad measures the degree to which texture details are retained in the fused image. In order to retain finer texture information in the fused image, this paper constructs the gradient loss according to the maximum-selection principle, which not only enhances the preservation of texture detail information but also effectively prevents the diffusion of edge detail in high-contrast areas. L_grad is defined as follows:
$L_{grad} = \left\| \max\left( \left| \nabla^2 CT\_L1\_E \right|, \left| \nabla^2 PET\_L2 \right| \right) - \left| \nabla^2 G(CT\_L1\_E, PET\_L2) \right| \right\|_1$,  (4)
where |·| represents the absolute value function, ‖·‖_1 is the L1 norm, max(·) is the element-wise maximum function, and ∇² is the Laplacian gradient operator.
L_in constrains the fused image to maintain an intensity distribution similar to that of the source images, thus maintaining significant contrast information. L_in is defined as follows:
$L_{in} = \gamma \left\| G(CT\_L1\_E, PET\_L2) - PET\_L2 \right\|_F^2 + (1 - \gamma)\left\| G(CT\_L1\_E, PET\_L2) - CT\_L1\_E \right\|_F^2$,  (5)
where ‖·‖_F is the Frobenius norm, and γ is the parameter that controls the trade-off.
  2. Loss function of discriminator
In this paper, the two independent discriminators D_CT and D_PET are used to constrain the generator to capture more contrast information and texture information, respectively. The corresponding loss functions L_DCT and L_DPET are defined as follows:
$L_{D_{CT}} = \mathbb{E}\left[\log D_{CT}(CT\_L1\_E)\right] + \mathbb{E}\left[\log\left(1 - D_{CT}\left(G(CT\_L1\_E, PET\_L2)\right)\right)\right]$,  (6)
$L_{D_{PET}} = \mathbb{E}\left[\log D_{PET}(PET\_L2)\right] + \mathbb{E}\left[\log\left(1 - D_{PET}\left(G(CT\_L1\_E, PET\_L2)\right)\right)\right]$,  (7)
where L_DCT and L_DPET are both cross-entropy loss functions. The discriminator D_CT is used to accurately distinguish the fused image from the CT low-rank image CT_L1_E, and the discriminator D_PET is used to accurately distinguish the fused image from the PET low-rank image PET_L2. A code sketch of these loss terms is given after this list.
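Below is a minimal TensorFlow sketch of the loss terms in Equations (1) to (7). It assumes batched single-channel image tensors, approximates the expectation by a batch mean, uses a standard 3 × 3 Laplacian kernel for the ∇² operator, and assumes the discriminator scores have been mapped to probability-like values in (0, 1); the default α, β, and γ values are taken from Section 2.1.4.

```python
import tensorflow as tf

# 3x3 Laplacian kernel used to approximate the nabla^2 operator in Eq. (4).
_LAP = tf.reshape(tf.constant([[0., 1., 0.], [1., -4., 1.], [0., 1., 0.]]), [3, 3, 1, 1])

def laplacian(img):
    # img: [batch, H, W, 1] float tensor.
    return tf.nn.conv2d(img, _LAP, strides=1, padding="SAME")

def content_loss(fused, ct_l1_e, pet_l2, beta=5.0, gamma=10.0):
    # Eq. (4): keep the stronger Laplacian response of the two source images.
    l_grad = tf.reduce_mean(tf.abs(
        tf.maximum(tf.abs(laplacian(ct_l1_e)), tf.abs(laplacian(pet_l2)))
        - tf.abs(laplacian(fused))))
    # Eq. (5): Frobenius-norm (mean-squared) intensity similarity to both sources.
    l_in = gamma * tf.reduce_mean(tf.square(fused - pet_l2)) \
         + (1.0 - gamma) * tf.reduce_mean(tf.square(fused - ct_l1_e))
    return l_grad + beta * l_in                                   # Eq. (3)

def generator_adv_loss(p_ct_fake, p_pet_fake, eps=1e-8):
    # Eq. (2): the generator tries to fool both discriminators.
    return tf.reduce_mean(tf.math.log(1.0 - p_ct_fake + eps)) \
         + tf.reduce_mean(tf.math.log(1.0 - p_pet_fake + eps))

def generator_loss(p_ct_fake, p_pet_fake, fused, ct_l1_e, pet_l2, alpha=0.4):
    # Eq. (1): adversarial loss plus weighted content loss.
    return generator_adv_loss(p_ct_fake, p_pet_fake) \
         + alpha * content_loss(fused, ct_l1_e, pet_l2)

def discriminator_objective(p_real, p_fake, eps=1e-8):
    # Eqs. (6)-(7): score real low-rank source images high and generated images low.
    return tf.reduce_mean(tf.math.log(p_real + eps)) \
         + tf.reduce_mean(tf.math.log(1.0 - p_fake + eps))
```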

2.1.4. Training Details

In this paper, 400 images (200 CT images and 200 PET images) were selected as the ED-D2GAN training images from the training dataset of lung tumor patients provided by a tertiary Grade-A hospital in Ningxia. This dataset is described in detail in Section 3.1.1. In order to meet the input size of the model, the images of the training dataset were resized to 256 × 256 pixels, and the original RGB three-channel images were converted to grayscale images. During the training of ED-D2GAN, the network was trained for 12 epochs with a batch size of 16; the learning rate of the whole network was 2 × 10⁻⁴ and decayed exponentially to 0.9 of its previous value after each epoch. The generator and discriminators used the RMSProp optimizer and the Adam optimizer, respectively, with α = 0.4, β = 5, and γ = 10.
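The following sketch wires the pieces above into the training configuration just described (12 epochs, batch size 16, initial learning rate 2 × 10⁻⁴ decayed by 0.9 per epoch, RMSProp for the generator and Adam for the discriminators). Dataset loading, the mapping of the tanh discriminator score to a probability, and the use of the same learning-rate schedule for all three networks are assumptions.

```python
import tensorflow as tf

EPOCHS, BATCH_SIZE = 12, 16
STEPS_PER_EPOCH = 200 // BATCH_SIZE            # 200 CT/PET training pairs

lr = tf.keras.optimizers.schedules.ExponentialDecay(
    2e-4, decay_steps=STEPS_PER_EPOCH, decay_rate=0.9, staircase=True)
gen_opt = tf.keras.optimizers.RMSprop(lr)
d_ct_opt, d_pet_opt = tf.keras.optimizers.Adam(lr), tf.keras.optimizers.Adam(lr)

to_prob = lambda t: (t + 1.0) / 2.0            # map tanh scores to (0, 1) (assumption)

@tf.function
def train_step(ct_l1_e, pet_l2):
    with tf.GradientTape(persistent=True) as tape:
        fused = generator([ct_l1_e, pet_l2], training=True)
        p_ct_fake, p_pet_fake = to_prob(d_ct(fused)), to_prob(d_pet(fused))
        g_loss = generator_loss(p_ct_fake, p_pet_fake, fused, ct_l1_e, pet_l2)
        # The discriminator objectives (Eqs. 6-7) are maximized, so their negatives are minimized.
        d_ct_loss = -discriminator_objective(to_prob(d_ct(ct_l1_e)), p_ct_fake)
        d_pet_loss = -discriminator_objective(to_prob(d_pet(pet_l2)), p_pet_fake)
    gen_opt.apply_gradients(zip(tape.gradient(g_loss, generator.trainable_variables),
                                generator.trainable_variables))
    d_ct_opt.apply_gradients(zip(tape.gradient(d_ct_loss, d_ct.trainable_variables),
                                 d_ct.trainable_variables))
    d_pet_opt.apply_gradients(zip(tape.gradient(d_pet_loss, d_pet.trainable_variables),
                                  d_pet.trainable_variables))
    del tape
    return g_loss
```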

2.2. Salient Region Images Fusion Method

Salient region images reflect important information, such as the edge contours and texture details of the source images. Therefore, the choice of salient region fusion rule is directly related to the sharpness and edge intensity of the fused image. The local features of an image are expressed not by a single pixel but by multiple, strongly correlated pixels in a local region, so a simple weighted fusion rule based on single pixels cannot reflect the feature information of the region well. A fusion rule based on regional energy can overcome this one-sidedness and obtain more local feature information of the images. Therefore, the fusion method based on regional energy [23] was selected as the basis of the salient region fusion rule in this paper. However, when the energies of two local regions are similar, directly selecting the pixel value from the region with the higher energy easily causes a relative loss of information. Therefore, a threshold adaptive weighted fusion algorithm based on the regional energy ratio is proposed in this paper. As the regional center pixel and its corresponding regional energy change, a weight matrix is used to adjust the weighting coefficients adaptively, so that the details of the fused image are fully preserved. The fusion process is as follows.
Firstly, the salient region images CT_S1 and PET_S2 obtained by LatLRR decomposition are scanned by a sliding window, and the regional energies E_CT(m, n) and E_PET(m, n) of the pixel centered at (m, n) are obtained. The calculation formulas are as follows:
$E_{CT}(m, n) = \sum_{i}^{S} \sum_{j}^{T} W(i, j) \times \left[ CT\_S1(i + m, j + n) \right]^2$,  (8)
$E_{PET}(m, n) = \sum_{i}^{S} \sum_{j}^{T} W(i, j) \times \left[ PET\_S2(i + m, j + n) \right]^2$,  (9)
where (i, j) is the relative offset of a pixel in the region window from the center pixel, S and T are the maximum row and column coordinates of the region window, and W is a weight matrix of size 3 × 3. The normalized W is defined as:
$W = \frac{1}{16} \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix}$,  (10)
Then, the regional energy ratio E_ratio(m, n) is obtained from the regional energies as follows:
$E_{ratio}(m, n) = \frac{E_{PET}(m, n)}{E_{CT}(m, n)}$,  (11)
Finally, the salient region fused image I_sf is calculated by the weighting method:
$I_{sf}(m, n) = w_1 \times PET\_S2(m, n) + w_2 \times CT\_S1(m, n)$,  (12)
where w_1 and w_2 are the weighting coefficients, calculated as follows:
$\begin{bmatrix} w_1 \\ w_2 \end{bmatrix} = \begin{cases} [0,\ 1]^T, & E_{ratio} < th_1 \\ \left[ \dfrac{E_{PET}}{E_{PET} + E_{CT}},\ \dfrac{E_{CT}}{E_{PET} + E_{CT}} \right]^T, & th_1 < E_{ratio} < th_2 \\ [1,\ 0]^T, & E_{ratio} > th_2 \end{cases}$  (13)
where [·]^T is the matrix transpose operator, and th_1 and th_2 are threshold coefficients determined according to the overall energy distribution of the images. When the energy ratio of the region is too small or too large, the weight of the region with higher energy is set to 1 and the weight of the region with lower energy is set to 0. When the energy ratio is within the threshold range, the adaptive weights are calculated from the energy ratio: the larger the region energy, the larger the corresponding weighting coefficient and the higher its proportion in the fused image; conversely, the smaller the region energy, the lower its proportion in the fused image.
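A NumPy sketch of this salient-region fusion rule (Equations (8) to (13)) is shown below. The threshold values th1 and th2 and the zero-padded border handling are illustrative assumptions; the paper states only that the thresholds are chosen from the overall energy distribution of the images.

```python
import numpy as np
from scipy.ndimage import correlate

# Eq. (10): normalized 3x3 weight matrix.
W = np.array([[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]], dtype=float) / 16.0

def region_energy(img):
    # Eqs. (8)-(9): weighted sum of squared intensities over the 3x3 window.
    return correlate(img.astype(float) ** 2, W, mode="constant")

def fuse_salient(ct_s1, pet_s2, th1=0.5, th2=2.0):   # th1/th2 are illustrative values
    e_ct, e_pet = region_energy(ct_s1), region_energy(pet_s2)
    ratio = e_pet / (e_ct + 1e-12)                    # Eq. (11)
    total = e_ct + e_pet + 1e-12

    # Eq. (13): adaptive weights; w1 weights PET_S2 and w2 weights CT_S1.
    w1 = np.where(ratio < th1, 0.0, np.where(ratio > th2, 1.0, e_pet / total))
    w2 = np.where(ratio < th1, 1.0, np.where(ratio > th2, 0.0, e_ct / total))

    return w1 * pet_s2 + w2 * ct_s1                   # Eq. (12)
```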
After obtaining the low-rank region fused image and the salient region fused image, the final fused image is reconstructed by linear addition, as follows:
$I_{lf} + I_{sf} = I_f$,  (14)
where I_lf and I_sf represent the low-rank region fused image and the salient region fused image, respectively, and I_f represents the final fused image. The proposed fusion algorithm is summarized in Algorithm 1.
Algorithm 1. The proposed multimodal medical image fusion algorithm.
Input: CT image and PET image.
Output: Fused image I_f.
Stage 1: Image decomposition
  CT → CT_L1, CT_S1, CT_N1;
  PET → PET_L2, PET_S2, PET_N2;
Stage 2: Image fusion
  1. Fuse the low-rank region images according to Equations (1) to (7) to obtain the low-rank region fused image I_lf;
  2. Fuse the salient region images according to Equations (8) to (13) to obtain the salient region fused image I_sf;
Stage 3: Image reconstruction according to Equation (14);
Output the fused image I_f.
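For completeness, the sketch below strings Algorithm 1 together end to end. It assumes a LatLRR solver latlrr() returning the low-rank, salient, and noise components of a grayscale image, an enhancement step enhance() for the CT low-rank image (the paper does not specify the enhancement operator), the trained ED-D2GAN generator from Section 2.1, and the fuse_salient() routine sketched above; these names are placeholders, not part of the paper.

```python
import numpy as np

def fuse_ct_pet(ct, pet, generator):
    # Stage 1: LatLRR decomposition; the noise components are discarded.
    ct_l1, ct_s1, _ct_n1 = latlrr(ct)
    pet_l2, pet_s2, _pet_n2 = latlrr(pet)

    # Stage 2a: low-rank region fusion with the trained ED-D2GAN generator (Eqs. (1)-(7)).
    ct_l1_e = enhance(ct_l1)                               # enhanced CT low-rank image
    inputs = [ct_l1_e[None, :, :, None].astype("float32"),
              pet_l2[None, :, :, None].astype("float32")]
    i_lf = np.squeeze(generator.predict(inputs))

    # Stage 2b: salient region fusion with the region-energy-ratio rule (Eqs. (8)-(13)).
    i_sf = fuse_salient(ct_s1, pet_s2)

    # Stage 3: reconstruction by linear addition (Eq. (14)).
    return i_lf + i_sf
```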

3. Experimental Section

In this section, the dataset employed in this work and the experimental environments are introduced in Section 3.1. Then, in Section 3.2, the comparison methods and the objective evaluation metrics are presented. To demonstrate the effectiveness of the proposed LatLRR-GAN, qualitative and quantitative comparisons with representative and state-of-the-art methods and the corresponding analysis are given in Section 3.3. In Section 3.4, ablation experiments are described.

3.1. Dataset and Experimental Environments

3.1.1. Dataset

The dataset used in this paper was collected from 95 clinical patients with lung tumors who underwent PET and CT general examination at a tertiary Grade-A hospital in Ningxia from January 2018 to June 2020. There were 46 female patients and 49 male patients, ranging in age from 39 to 76 years, with an average age of 50.635. Among the 95 patients, 40 were smokers, whose smoking history ranged from 2 to 25 years, with an average of 12.112 years. Before imaging, all patients fasted for 6 h, had their blood glucose controlled below 10, urinated, and removed any metal jewelry. The patients were examined after an intravenous injection of fluorodeoxyglucose. One hour after the imaging agent was injected, PET and CT images of the lungs and trunk were taken while the patients lay supine in a quiet, dark room for 45 to 60 min. After scanning, transverse, sagittal and coronal plane images were selected. To ensure correct labeling of the lesion area, the collected data were evaluated by two imaging clinicians. When the two clinicians disagreed, three clinicians with more than 10 years of experience in tumor imaging diagnosis were invited to make a joint diagnosis, and the result was decided by the majority opinion. Patients with special conditions were diagnosed in combination with clinical practice. After rotation, mirroring and other data augmentation processing, image datasets of the PET/CT, PET and CT modalities were constructed. The final number of samples in each of the three image datasets was 2430, with each modality comprising 2025 training images and 405 test images. The image labels were manually drawn by the clinicians.

3.1.2. Experimental Environments

Hardware environment: The computer had 256 GB of RAM, an NVIDIA TITAN V graphics card and an Intel (R) Xeon (R) Gold 6154 CPU @ 3.00 GHz processor. Software environment: Windows Server 2019 Datacenter 64-bit OS, Matlab 2020b, TensorFlow 2.0 Deep Learning Framework, CUDA 11.3.58.

3.2. Comparison Methods and Evaluation Metrics

3.2.1. Comparison Methods

In order to qualitatively prove the effectiveness of LatLRR-GAN, four fusion methods based on decomposition transformation and two deep learning fusion methods based on GAN were used to compare the fused results of CT images and PET images. Method 1: Nonsubsampled Contourlet Transform (NSCT) was used in the decomposition method, and the average gradient adaptive weighted fusion rule was used for the low-rank region images, and the fusion rule based on the region energy maximum was used for the salient region images. Method 2: LatLRR was used in the decomposition method, the fusion rule of average value was used in the low-rank region images, and the direct additive fusion rule was used in the salient region images. Method 3: Wavelet transform (WT) was used in the decomposition method, the fusion rule of average value was used for low-rank region images, and the fusion rule of maximum value was used for salient region images. Method 4: The decomposition method uses nested decomposition of LatLRR and NSCT. The low-frequency images use the fusion rule of average gradient adaptive weighting, and the high-frequency images use the fusion rule based on the regional energy maximum. Method 5: GANMcC [24]. Method 6: FusionGAN [15]. The parameters of both methods were set to the default values specified by their authors.

3.2.2. Evaluation Metrics

Six common evaluation metrics in the field of image fusion were used in the experiments to quantitatively evaluate the performance of LatLRR-GAN and the other comparison methods: the average gradient (AG) [25], edge intensity (EI) [26], information entropy (IE) [27], spatial frequency (SF) [28], QAB/F [29] and standard deviation (SD) [30]. Among them, AG measures the sharpness and texture details of the fused images. EI is a computational measure of image edge intensity, which is essentially the gradient magnitude at edge points. IE reflects the information content of the images and measures the expected value of the appearance of pixels at each position in the images. SF reflects the overall image sharpness and the rate of change of the gray image; it is obtained from the row frequency and column frequency, and no reference source image is used in its calculation. QAB/F evaluates the amount of edge information transferred from the input images to the fused image. SD measures the information richness of the fused images. Each evaluation metric is positively correlated with the quality of the fused images: the higher the metric value, the more detail the fused image retains and the higher its clarity.
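As a reference for how three of these reference-free metrics are typically computed, the NumPy sketch below implements common formulations of AG, SF, and SD; the exact definitions in the cited references may differ slightly in normalization.

```python
import numpy as np

def average_gradient(img):
    # AG: mean magnitude of the local gradients, a proxy for sharpness and texture detail.
    gx, gy = np.gradient(img.astype(float))
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))

def spatial_frequency(img):
    # SF: combines row frequency and column frequency; no reference image is needed.
    img = img.astype(float)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))   # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))   # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))

def standard_deviation(img):
    # SD: dispersion of gray levels around the mean, reflecting information richness.
    return float(np.std(img.astype(float)))
```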

3.3. Comparison Experiments

In order to verify the effectiveness of LatLRR-GAN, two experiments were conducted. In Section 3.3.1, five cases of CT lung window images and PET images are compared on seven methods and six evaluation metrics. In Section 3.3.2, five cases of CT mediastinal window images and PET images are compared on seven methods and six evaluation metrics. Among them, the CT lung window images contain clear, detailed information of the trachea in the lung, and CT mediastinal window images contain clear mediastinal information.

3.3.1. CT Lung Window Images and PET Images

The fusion results of the CT lung window images and PET images are shown in Figure 3 and the evaluation metric results of the fused images are shown in Table 1. The best evaluation metric values are represented in red; the second-best evaluation metric values are represented in blue. Bar charts of the evaluation metric values of the fused images are shown in Figure 4.
It can be seen from Figure 3 that both Methods 4 and 7 clearly display the detailed information of the lung bronchus, but the detailed information of the lesion area and edges is more obvious in Method 7. Methods 1, 2, 3, 5 and 6 could not preserve the detailed information of the lung bronchus as well. Although Methods 2, 5 and 6 can highlight the lesion area information, the contrast of their fused images is lower and the edge information is blurred. Therefore, Method 7 proposed in this paper can better fuse the lung bronchus information in the CT images and the lesion area information in the PET images.
As can be seen from the IE and SD metrics in Table 1 and Figure 4, there is little difference between the values of Methods 4 and 7, which indicates that both methods produce fused images with good sharpness. From the AG, EI and SF metrics in Table 1 and Figure 4, it can be seen that Method 7 shows a substantial improvement over the other methods, which reflects the strong ability of the proposed method to preserve detailed information, such as the lung bronchus, and edge intensity. Therefore, fused images with high definition and rich details can be obtained by the LatLRR-GAN method proposed in this paper.

3.3.2. CT Mediastinal Window Images and PET Images

The fusion results of CT mediastinal window images and PET images are shown in Figure 5 and the evaluation metric results of the fused images are shown in Table 2. The best evaluation metric values are represented in red; the second-best evaluation metric values are represented in blue. Bar charts of the evaluation metric values of the fused images are shown in Figure 6.
It can be seen from Figure 5 that Methods 1, 2, 4 and 7 can better retain the contrast information of CT mediastinal window images, such as tissue and bone. However, the performance abilities of Methods 1 and 2 on the lesion area are not as good as that of Method 7, and the edge intensity information of the fused images in Method 4 is not as good as that of Method 7. Methods 3, 5 and 6 are not as clear as Method 7 in their ability to express details such as tissue contour and lesion area. Although Method 3 can retain more lesion information, the contrast of detailed parts, such as organs and bones, is low, and detailed information on the edge is blurred. Therefore, Method 7 proposed in this paper can better fuse soft tissue information, such as the mediastinum in CT images and the lesion area information in PET images.
It can be seen from Table 2 and Figure 6 that, compared with the other methods, Method 7 proposed in this paper provides a great improvement in the AG and EI metrics. For SF, Method 7 shows little difference from Methods 1 and 4 but has obvious advantages over Methods 5 and 6. For IE and SD, Method 7 also has obvious advantages. For QAB/F, the value of Method 4 is higher than that of Method 7, because the background contrast of the CT images is adjusted in Method 4 in order to highlight the lesion area of the PET images. On the whole, the fused images obtained by Method 7 proposed in this paper retain rich detailed information, such as soft tissue, and clearly contrasted information.

3.4. Ablation Experiments

Three cases of CT lung window images and PET images and three cases of CT mediastinal window images and PET images were selected for ablation experiments to verify the effectiveness of the proposed method. Method 1 (LatLRR_AE): LatLRR was used to decompose the source images, the average value fusion rule was used for the low-rank region images, and the threshold adaptive weighted fusion rule based on the region energy ratio was used for the salient region images; this verifies the effectiveness of the ED-D2GAN low-rank region fusion rule proposed in this paper. Method 2 (ED-D2GAN): the proposed ED-D2GAN was used directly to fuse the source images, which verifies the effectiveness of the LatLRR decomposition of the source images. In addition, the advantage of ED-D2GAN using two discriminators was verified by changing the number of discriminators. Method 3 (Single_DCT): only the D_CT discriminator was used, and the remaining parts were consistent with the proposed method. Method 4 (Single_DPET): only the D_PET discriminator was used, and the remaining parts were consistent with the proposed method. Method 5 (LatLRR-GAN): LatLRR was used to decompose the source images, the ED-D2GAN-based fusion rule was used for the decomposed low-rank region images, and the threshold adaptive weighted fusion rule based on the region energy ratio was used for the decomposed salient region images.

3.4.1. CT Lung Window Images and PET Images

The fusion results of ablation experiments about CT lung window images and PET images are shown in Figure 7, and the evaluation metric results of the fused images are shown in Table 3. The best metrics are represented by red; the second-best metrics are represented by blue. In Figure 8, the evaluation metric values of the ablation experiment are visualized.
Figure 7 shows the fusion results about ablation studies of five methods on CT lung window images and PET images. It can be seen from the Figure that the visual effect of fused images of Method 1 is poor, and the contrast information of the lung bronchus is weak, which can reflect the obvious advantages of the low-rank region fusion rule based on ED-D2GAN in the LatLRR-GAN proposed in this paper. Although the fusion results of Method 2 improved compared with Method 1, it can be seen from Table 3 that the AG and EI of Method 2 are relatively low, which reflects the advantages of using LatLRR to decompose the source images. Compared with Methods 1 and 2, Methods 3 and 4 can better highlight the lesion area and better retain detailed information of the bronchial lung, but it can be seen from Table 3 and Figure 8 that using dual discriminators to distinguish the fused image from the two source images can make the fused images retain more detail in the CT source images and PET source images simultaneously. Moreover, Method 5 has advantages over the single discriminator in terms of AG, EI, SF and SD. Therefore, the proposed method can effectively improve the quality of the fused images.

3.4.2. CT Mediastinal Window Images and PET Images

The fusion results of the ablation experiment about CT mediastinal window images and PET images are shown in Figure 9. The evaluation metric results of the fused images are shown in Table 4. The best metrics are represented in red and the second-best metrics are represented in blue. In addition, the evaluation metric values of the ablation experiment are visualized in Figure 10.
Figure 9 shows the fusion results of the five methods of CT mediastinal window images and PET images about the ablation experiment. It can be seen from Figure 9 that the fused images of Method 1 are weak in contrast information and edge detail information, which reflects the effectiveness of the low-rank region images’ fusion rule based on ED-D2GAN proposed in this paper. Method 2 shows better performance in QAB/F, but it can be seen from Table 4 that the performance of this method is weak in the evaluation metrics of AG, EI and SD, which reflects the advantages of using LatLRR to decompose the source images in this paper. Methods 3 and 4 show little difference from Method 5 proposed in this paper in terms of visual effects, but it can be seen from Table 4 and Figure 10 that the dual discriminators can retain more detailed information in the two source images at the same time, and the overall effect of the fused images is better. Therefore, the LatLRR-GAN proposed in this paper has certain advantages in retaining information, such as detailed information and edge intensity.

4. Discussion

The method based on the hybrid model is an important research direction in the field of multimodal medical image fusion. The main work of this paper is an attempt in this research direction. By combining the multi-scale decomposition method with the method based on GAN, a significant lesion area of the fused images can be obtained. It has an important clinical application value in the early diagnosis and evaluation of the curative effect of tumors.
The fusion method based on a hybrid model can make up for the shortcomings of a single fusion method and effectively improve the quality of the fused images. This paper proposed a hybrid multimodal medical image fusion method based on LatLRR and ED-D2GAN, as shown in Figure 1. Because LatLRR can separate the noise component contained in the source images, this paper adopted LatLRR to decompose the source images and discard the noise component during the fusion process. Then, in order to improve the details of the lesion areas in the fused images, the CT low-rank images decomposed by LatLRR were enhanced, and the dual-discriminator GAN model was proposed as the fusion rule for the low-rank region images. The GAN model is appropriate as the low-rank region fusion rule because it has a strong feature extraction ability and can continuously adjust the quality of the fused images through the adversarial game between the generator and discriminator. In addition, the use of two discriminators enables the final fused images to retain more details from both source images at the same time, thus yielding a better fused image. Finally, a threshold adaptive weighted algorithm based on the regional energy ratio was proposed as the fusion rule for salient region images. On the one hand, this rule fully considers the correlation between multiple pixels in a local area; on the other hand, when the energies of two local regions are very similar, it can, to a certain extent, avoid the information loss caused by directly selecting the pixel value from the region with the largest energy.
In summary, it can be seen from Figure 3, Figure 4, Figure 5, Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10 that the visual effect of the fused images obtained by the method LatLRR-GAN proposed in this paper is superior to other fusion methods, which proves the superiority of LatLRR-GAN. Similarly, it can be seen from the objective evaluation metric values in Table 1, Table 2, Table 3 and Table 4, which can be quantitatively compared, that the LatLRR-GAN has certain advantages in AG, EI, IE, SF and SD.
Although LatLRR-GAN has advantages in both qualitative and quantitative comparisons, this work still has limitations. As can be seen from Table 1, Table 2, Table 3 and Table 4, because this paper does not investigate edge-information retention further, the performance of LatLRR-GAN on the QAB/F value is relatively weak. Researchers have proposed a variety of methods for edge information extraction, and in future work the authors will pay more attention to this problem.

5. Conclusions

This paper proposed a hybrid multimodal medical image fusion method based on LatLRR and ED-D2GAN. Firstly, the CT and PET source images were decomposed into low-rank region images, salient region images and noise region images by LatLRR. Secondly, the decomposed CT low-rank region image was enhanced, and a dual-discriminator GAN (ED-D2GAN) was used to fuse the low-rank region images. Thirdly, the threshold adaptive weighting algorithm based on the region energy ratio was used as the salient region fusion rule. Finally, the final fused image was obtained by linear addition. Subjective and objective experiments demonstrate the effectiveness of the proposed fusion rules. The proposed method can not only highlight the lesion information in the PET source images, but also obtain fused images with obvious contrast and edge intensity. It is of great significance in effectively alleviating the shortcomings of single fusion methods.

Author Contributions

Writing—review and editing, T.Z.; project administration, T.Z.; funding acquisition, T.Z.; Conceptualization, Q.L.; writing—original draft preparation, Q.L.; methodology, Q.L.; software, Q.L.; validation, Q.L., X.Z. and Q.C.; investigation, H.L.; data curation, H.L.; supervision, H.L.; visualization, X.Z.; formal analysis, Q.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 62062003; Natural Science Foundation of Ningxia, grant number 2022AAC03149; and North Minzu University Research Project of Talent Introduction, grant number 2020KYQD08.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, Y.; Zhao, J.; Lv, Z.; Li, J. Medical Image Fusion Method by Deep Learning. Int. J. Cogn. Comput. Eng. 2021, 2, 21–29.
  2. Zhang, Y.D.; Dong, Z.; Wang, S.H.; Yu, X.; Yao, X.; Zhou, Q.; Hu, H.; Li, M.; Jiménez-Mesa, C.; Ramirez, J.; et al. Advances in multimodal data fusion in neuroimaging: Overview, challenges, and novel orientation. Inf. Fusion 2020, 64, 149–187.
  3. Polinati, S.; Dhuli, R. Multimodal medical image fusion using empirical wavelet decomposition and local energy maxima. Optik 2020, 205, 163947.
  4. Al-Marzouqi, H.; AlRegib, G. Curvelet transform with learning-based tiling. Signal Process. Image Commun. 2017, 53, 24–39.
  5. Liu, Z.; Song, Y.; Sheng, V.S.; Xu, C.; Maere, C.; Xue, K.; Yang, K. MRI and PET image fusion using the nonparametric density model and the theory of variable-weight. Comput. Methods Programs Biomed. 2019, 175, 73–82.
  6. Diwakar, M.; Shankar, A.; Chakraborty, C.; Singh, P.; Arunkumar, G. Multi-modal medical image fusion in NSST domain for internet of medical things. Multimed. Tools Appl. 2022, 81, 37477–37497.
  7. Zong, J.; Qiu, T. Medical image fusion based on sparse representation of classified image patches. Biomed. Signal Process. Control. 2017, 34, 195–205.
  8. Li, S.; Yin, H.; Fang, L. Group-Sparse Representation With Dictionary Learning for Medical Image Denoising and Fusion. IEEE Trans. Biomed. Eng. 2012, 59, 3450–3459.
  9. Zhang, J.; Li, C.; Kosov, S.; Grzegorzek, M.; Shirahama, K.; Jiang, T.; Sun, C.; Li, Z.; Li, H. LCU-Net: A novel low-cost U-Net for environmental microorganism image segmentation. Pattern Recognit. 2021, 115, 107885.
  10. Zhou, T.; Ye, X.; Lu, H.; Zheng, X.; Qiu, S.; Liu, Y. Dense convolutional network and its application in medical image analysis. Biomed Res. Int. 2022, 2022, 1–22.
  11. Chen, H.; Li, C.; Wang, G.; Li, X.; Rahaman, M.; Sun, H.; Hu, W.; Li, Y.; Liu, W.; Sun, C.; et al. GasHis-Transformer: A multi-scale visual transformer approach for gastric histopathological image detection. Pattern Recognit. 2022, 130, 108827.
  12. Zhou, T.; Li, Q.; Lu, H.; Cheng, Q.; Zhang, X. GAN review: Models and medical image fusion applications. Inf. Fusion 2023, 91, 134–148.
  13. Chen, H.; Li, C.; Li, X.; Rahaman, M.; Hu, W.; Li, Y.; Liu, W.; Sun, C.; Sun, H.; Huang, X.; et al. IL-MCAM: An interactive learning and multi-channel attention mechanism-based weakly supervised colorectal histopathology image classification approach. Comput. Biol. Med. 2022, 143, 105265.
  14. Liu, Y.; Chen, X.; Peng, H.; Wang, Z. Multi-focus image fusion with a deep convolutional neural network. Inf. Fusion 2017, 36, 191–207.
  15. Ma, J.; Wei, Y.; Liang, P.; Chang, L.; Jiang, J. FusionGAN: A generative adversarial network for infrared and visible image fusion. Inf. Fusion 2019, 48, 11–26.
  16. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS'14), Montreal, QC, Canada, 8–13 December 2014.
  17. Fu, J.; Li, W.; Du, J.; Xu, L. DSAGAN: A generative adversarial network based on dual-stream attention mechanism for anatomical and functional image fusion. Inf. Sci. 2021, 576, 484–506.
  18. Liu, G.; Yan, S. Latent low-rank representation for subspace segmentation and feature extraction. In Proceedings of the 2011 International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011.
  19. Gao, C.; Song, C.; Zhang, Y.; Qi, D.; Yu, Y. Improving the Performance of Infrared and Visible Image Fusion Based on Latent Low-Rank Representation Nested With Rolling Guided Image Filtering. IEEE Access 2021, 9, 91462–91475.
  20. Xia, K.; Yin, H.; Wang, J. A novel improved deep convolutional neural network model for medical image fusion. Clust. Comput. 2019, 22, 1515–1527.
  21. Wang, L.; Chang, C.; Hao, B.; Liu, C. Multi-modal Medical Image Fusion Based on GAN and the Shift-Invariant Shearlet Transform. In Proceedings of the 2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Online Event, 16–19 December 2020.
  22. Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In Proceedings of the International Conference on Learning Representations 2016, Caribe Hilton, San Juan, Puerto Rico, 12–19 November 2015.
  23. Srivastava, R.; Khare, A.; Prakash, O. Local energy-based multimodal medical image fusion in curvelet domain. IET Comput. Vis. 2016, 10, 513–527.
  24. Ma, J.; Zhang, H.; Shao, Z.; Liang, P.; Xu, H. GANMcC: A Generative Adversarial Network With Multiclassification Constraints for Infrared and Visible Image Fusion. IEEE Trans. Instrum. Meas. 2021, 70, 1–14.
  25. Shen, Y.; Wu, Z.; Wang, X.; Dong, Y.; Jiang, N. Tetrolet transform images fusion algorithm based on fuzzy operator. J. Front. Comput. Sci. Technol. 2015, 9, 1132–1138.
  26. Petrovic, V.; Cootes, T. Information representation for image fusion evaluation. In Proceedings of the Fusion 2006, Florence, Italy, 10–13 July 2006.
  27. Roberts, J.W.; Van Aardt, J.; Ahmed, F. Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J. Appl. Remote Sens. 2008, 2, 023522.
  28. Eskicioglu, A.M.; Fisher, P.S. Image quality measures and their performance. IEEE Trans. Commun. 1995, 43, 2959–2965.
  29. Xydeas, C.S.; Petrovic, V. Objective image fusion performance measure. Electron. Lett. 2000, 36, 308–309.
  30. Liu, Y.; Liu, S.; Wang, Z. A general framework for image fusion based on multi-scale transform and sparse representation. Inf. Fusion 2015, 24, 147–164.
Figure 1. The overall network architecture. (a) The overall framework process; (b) the overall network architecture of ED-D2GAN; (c) the network structure of the generator; (d) the network structure of the CT discriminator of ED-D2GAN; and (e) the network structure of the PET discriminator of ED-D2GAN.
Figure 2. LatLRR decomposition process. CT_L1 and PET_L2 represent the low-rank region images of the decomposed source images; CT_S1 and PET_S2 represent the salient region images of the decomposed source images; and CT_N1 and PET_N2 represent the noise region images of the decomposed source images.
Figure 3. The fusion results of CT lung window images and PET images. Method 1: NSCT; Method 2: LatLRR; Method 3: WT; Method 4: LatLRR+NSCT; Method 5: GANMcC; Method 6: FusionGAN; Method 7: LatLRR-GAN.
Figure 4. Bar charts of fused images’ evaluation metric values of CT lung window images and PET images. Method 1: NSCT; Method 2: LatLRR; Method 3: WT; Method 4: LatLRR+NSCT; Method 5: GANMcC; Method 6: FusionGAN; Method 7: LatLRR-GAN.
Figure 5. The fusion results of CT mediastinal window images and PET images. Method 1: NSCT; Method 2: LatLRR; Method 3: WT; Method 4: LatLRR+NSCT; Method 5: GANMcC; Method 6: FusionGAN; Method 7: LatLRR-GAN.
Figure 6. Bar charts of fused images’ evaluation metric values of CT mediastinal window images and PET images. Method 1: NSCT; Method 2: LatLRR; Method 3: WT; Method 4: LatLRR+NSCT; Method 5: GANMcC; Method 6: FusionGAN; Method 7: LatLRR-GAN.
Figure 7. The fusion results of CT lung window images and PET images about the ablation experiments. Method 1: LatLRR_AE; Method 2: ED-D2GAN; Method 3: Single_DCT; Method 4: Single_DPET; Method 5: LatLRR-GAN.
Figure 8. Radar maps of the evaluation metric coefficients of the fused CT lung window and PET images for the ablation experiments. Method 1: LatLRR_AE; Method 2: ED-D2GAN; Method 3: Single_DCT; Method 4: Single_DPET; Method 5: LatLRR-GAN.
Figure 9. The fusion results of CT mediastinal window images and PET images for the ablation experiments. Method 1: LatLRR_AE; Method 2: ED-D2GAN; Method 3: Single_DCT; Method 4: Single_DPET; Method 5: LatLRR-GAN.
Figure 10. Radar maps of the evaluation metric coefficients of the fused CT mediastinal window and PET images for the ablation experiments. Method 1: LatLRR_AE; Method 2: ED-D2GAN; Method 3: Single_DCT; Method 4: Single_DPET; Method 5: LatLRR-GAN.
Table 1. Evaluation metric values of the fused CT lung window and PET images. Method 1: NSCT; Method 2: LatLRR; Method 3: WT; Method 4: LatLRR+NSCT; Method 5: GANMcC; Method 6: FusionGAN; Method 7: LatLRR-GAN. The best value of each metric is shown in red and the second-best in blue.
Images  Methods        AG       EI        IE      SF       QAB/F   SD
1       NSCT           6.7902   61.8019   6.6232  27.6588  0.4876  6.6319
        LatLRR         6.1668   58.5574   6.8845  21.4049  0.4725  6.2861
        WT             5.4231   48.7742   6.3391  24.2894  0.3532  5.7617
        LatLRR+NSCT    7.1699   66.8098   6.9679  27.0213  0.5089  6.6074
        GANMcC         5.2624   42.1391   5.5375  20.3215  0.3070  5.6115
        FusionGAN      5.3399   44.9403   6.2913  20.9859  0.3368  5.0670
        LatLRR-GAN     9.9507   98.0598   7.1664  32.6015  0.4936  7.1942
2       NSCT           7.1907   66.5461   6.4508  28.3334  0.5453  6.5942
        LatLRR         6.4104   61.2882   6.6764  22.0096  0.5071  6.1107
        WT             5.3815   50.4165   6.2273  24.1727  0.3158  5.5468
        LatLRR+NSCT    7.4851   70.4718   6.7766  27.5646  0.5452  6.5234
        GANMcC         5.2782   46.3028   5.7938  21.5764  0.3126  5.6716
        FusionGAN      5.7324   48.5718   6.1302  21.6855  0.3339  5.7612
        LatLRR-GAN     10.3892  102.5992  7.0284  33.2705  0.5077  7.3602
3       NSCT           6.3354   58.0445   6.6663  27.0911  0.4883  6.6298
        LatLRR         5.8110   55.4798   6.6929  20.8422  0.4980  6.3504
        WT             4.9259   46.0231   6.3396  24.2832  0.3150  5.8398
        LatLRR+NSCT    6.7313   62.9149   6.7644  26.8761  0.5161  6.6197
        GANMcC         4.4921   40.7514   5.5336  19.5217  0.3080  5.5639
        FusionGAN      4.4775   41.5534   5.9374  20.2999  0.3072  5.4516
        LatLRR-GAN     9.5939   94.6642   7.1047  30.5888  0.4958  7.3836
4       NSCT           5.8972   52.3510   5.9100  27.4761  0.4629  6.4959
        LatLRR         5.2983   49.6661   6.7621  20.9930  0.4707  6.3137
        WT             4.5008   41.3844   5.7323  23.7695  0.3014  5.7827
        LatLRR+NSCT    6.1958   56.5728   6.8205  26.8018  0.5078  6.4695
        GANMcC         4.4075   39.0327   5.5711  20.1564  0.3010  5.3314
        FusionGAN      4.6772   40.8885   5.5567  21.5881  0.3956  5.5948
        LatLRR-GAN     8.3274   80.7329   7.0516  30.0796  0.4869  7.2110
5       NSCT           6.6641   60.0758   6.4024  27.6600  0.5070  6.6517
        LatLRR         5.8707   54.9209   6.6354  21.2508  0.4786  6.2156
        WT             4.9883   46.0404   6.1079  24.1076  0.3069  5.7196
        LatLRR+NSCT    6.8949   63.4267   6.7009  27.0235  0.5206  6.5769
        GANMcC         4.9554   40.1703   5.0607  19.3215  0.3004  5.0556
        FusionGAN      4.7924   42.7647   5.9446  20.7787  0.3385  5.0781
        LatLRR-GAN     9.6424   93.7401   6.9441  32.6549  0.5093  7.3473
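As a reading aid for the metric columns in Tables 1–4, the following NumPy sketch gives commonly used definitions of the average gradient (AG), information entropy (IE), spatial frequency (SF), and standard deviation (SD) of a fused image. These are standard formulations and may differ in detail from the authors' implementation; edge intensity (EI, usually a Sobel-based measure) and the edge-preservation index QAB/F are omitted because they involve additional parameters.

```python
# Common reference definitions of AG, IE, SF and SD for a grayscale fused image;
# these are standard formulations, not necessarily the authors' exact code.
import numpy as np

def average_gradient(img):
    img = img.astype(float)
    gx = np.diff(img, axis=1)[:-1, :]   # horizontal differences, cropped to a common shape
    gy = np.diff(img, axis=0)[:, :-1]   # vertical differences, cropped to a common shape
    return np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0))

def information_entropy(img, levels=256):
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    p = p[p > 0]                         # ignore empty gray levels
    return -np.sum(p * np.log2(p))       # entropy in bits

def spatial_frequency(img):
    img = img.astype(float)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return np.sqrt(rf ** 2 + cf ** 2)

def standard_deviation(img):
    return np.std(img.astype(float))

# Example on a random placeholder image (a real fused image would be loaded instead).
fused = np.random.randint(0, 256, (256, 256))
print(average_gradient(fused), information_entropy(fused),
      spatial_frequency(fused), standard_deviation(fused))
```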
Table 2. Evaluation metric values of the fused CT mediastinal window and PET images. Method 1: NSCT; Method 2: LatLRR; Method 3: WT; Method 4: LatLRR+NSCT; Method 5: GANMcC; Method 6: FusionGAN; Method 7: LatLRR-GAN. The best value of each metric is shown in red and the second-best in blue.
Images  Methods        AG       EI        IE      SF       QAB/F   SD
1       NSCT           6.8779   61.7357   5.0360  32.0976  0.5116  5.7207
        LatLRR         6.0419   56.2118   5.9776  24.9882  0.4956  5.5529
        WT             5.1805   47.4596   4.8254  28.0206  0.3148  5.0303
        LatLRR+NSCT    6.9261   63.7558   6.0681  30.5653  0.5468  5.7738
        GANMcC         4.5108   30.4581   4.7010  22.6290  0.3114  4.6901
        FusionGAN      5.2713   48.2093   4.5326  26.2313  0.3069  4.5911
        LatLRR-GAN     11.3958  104.6724  6.7085  39.3140  0.5220  6.7189
2       NSCT           7.9136   72.0494   5.9264  31.4682  0.5190  5.8663
        LatLRR         7.1379   67.3753   6.7216  24.5394  0.4857  5.9897
        WT             5.7516   53.0297   5.7662  26.6668  0.3893  5.3399
        LatLRR+NSCT    8.2567   77.0077   6.6908  30.3958  0.5355  6.1098
        GANMcC         5.2162   40.8938   5.8925  21.4760  0.3835  5.4296
        FusionGAN      5.6553   46.4851   5.9131  22.4006  0.4027  5.7214
        LatLRR-GAN     9.0267   87.3771   6.3032  34.9138  0.5201  6.5196
3       NSCT           7.3579   66.9876   5.8889  29.2474  0.4906  5.7461
        LatLRR         6.7382   63.8525   6.6608  23.0331  0.4651  5.9029
        WT             5.5173   51.2336   5.7161  25.9284  0.3933  5.2259
        LatLRR+NSCT    7.7724   72.5752   6.6468  28.7566  0.5077  6.0418
        GANMcC         5.2840   40.6193   5.8015  22.4883  0.3154  5.5587
        FusionGAN      5.4292   42.8494   5.9772  20.3388  0.3390  5.5425
        LatLRR-GAN     10.6800  104.2155  6.9095  34.3389  0.4821  6.6913
4       NSCT           9.1151   81.7507   6.2800  31.8276  0.5219  5.9192
        LatLRR         8.3223   77.1386   6.7648  25.3768  0.4811  5.8338
        WT             6.6501   60.7376   6.0900  28.0719  0.3975  5.1960
        LatLRR+NSCT    9.5643   88.0301   6.8263  30.9593  0.5320  6.0858
        GANMcC         6.5427   46.0002   6.0664  21.3119  0.3902  5.2568
        FusionGAN      6.6687   46.0735   6.0146  22.4728  0.3934  5.2173
        LatLRR-GAN     10.5485  103.3578  6.9186  32.7830  0.4926  6.7281
5       NSCT           9.0230   74.8148   5.4163  36.7065  0.5333  5.7432
        LatLRR         7.9501   67.9433   6.5126  29.8936  0.4953  5.6478
        WT             6.5029   55.1489   5.2648  31.4250  0.3038  5.0835
        LatLRR+NSCT    9.1967   78.3693   6.5572  35.6572  0.5477  5.8475
        GANMcC         6.2239   42.7345   5.6118  22.0594  0.3506  5.6745
        FusionGAN      6.3612   44.9311   5.9115  24.4120  0.3017  5.0167
        LatLRR-GAN     12.3573  119.3307  7.0015  36.1969  0.5078  6.8767
Table 3. Evaluation metric values of the fused CT lung window and PET images for the ablation experiments. Method 1: LatLRR_AE; Method 2: ED-D2GAN; Method 3: Single_DCT; Method 4: Single_DPET; Method 5: LatLRR-GAN. The best value of each metric is shown in red and the second-best in blue.
Images  Methods        AG       EI        IE      SF       QAB/F   SD
1       LatLRR_AE      7.1936   70.4249   6.7235  22.6081  0.5044  6.6990
        ED-D2GAN       8.5586   82.8161   6.7909  30.9804  0.5293  7.1277
        Single_DCT     10.5909  105.5618  7.2029  32.2329  0.4568  7.8626
        Single_DPET    10.5980  105.1268  7.1799  32.5337  0.4580  7.7477
        LatLRR-GAN     11.0359  109.2525  7.2611  33.2306  0.4649  7.9309
2       LatLRR_AE      7.4516   72.8877   6.7520  23.2457  0.5010  6.6615
        ED-D2GAN       8.7314   84.2797   6.9045  31.6064  0.5160  7.1360
        Single_DCT     11.1608  110.8545  7.2271  33.5400  0.4567  7.9171
        Single_DPET    10.9083  108.0114  7.2447  33.1174  0.4668  7.7543
        LatLRR-GAN     11.4452  113.0691  7.2678  33.8591  0.4583  7.9639
3       LatLRR_AE      6.7730   66.6580   6.7490  22.2463  0.4994  6.3988
        ED-D2GAN       8.2582   80.0652   6.7024  32.2703  0.4976  7.1346
        Single_DCT     10.3707  103.2422  7.2092  32.6115  0.4502  7.7318
        Single_DPET    10.1196  100.4910  7.1579  31.9520  0.4533  7.5195
        LatLRR-GAN     10.6438  105.3437  7.2439  33.2622  0.4583  7.7910
Table 4. Evaluation metric values of the fused CT mediastinal window and PET images for the ablation experiments. Method 1: LatLRR_AE; Method 2: ED-D2GAN; Method 3: Single_DCT; Method 4: Single_DPET; Method 5: LatLRR-GAN. The best value of each metric is shown in red and the second-best in blue.
Images  Methods        AG       EI        IE      SF       QAB/F   SD
1       LatLRR_AE      7.1179   67.7787   6.1955  26.2079  0.4988  6.0567
        ED-D2GAN       8.3479   77.0280   5.9225  33.8304  0.5423  6.2247
        Single_DCT     8.7781   83.6390   6.4083  32.7873  0.4786  6.8221
        Single_DPET    8.5449   82.4642   6.3324  31.5362  0.4843  6.7714
        LatLRR-GAN     9.3172   88.9128   6.4332  34.1039  0.4983  6.8266
2       LatLRR_AE      6.7746   64.8246   6.2563  25.4204  0.4888  6.0667
        ED-D2GAN       7.8481   72.5484   5.8768  32.5410  0.5359  6.1651
        Single_DCT     8.3265   79.4966   6.4489  32.0401  0.4840  6.7476
        Single_DPET    8.0521   77.6914   6.3404  30.8232  0.4764  6.7126
        LatLRR-GAN     8.6675   83.0892   6.4701  33.4706  0.4956  6.8007
3       LatLRR_AE      6.7428   64.5195   6.1546  25.2950  0.4893  5.7954
        ED-D2GAN       7.9406   73.7453   5.8995  33.7126  0.5313  6.1157
        Single_DCT     8.3980   80.6120   6.4209  31.7215  0.4724  6.5390
        Single_DPET    8.2376   80.0195   6.3216  30.6989  0.4747  6.5238
        LatLRR-GAN     8.7718   84.3128   6.4379  33.4766  0.4904  6.5717
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
