Article

Multi-Source Image Fusion Based on BEMD and Region Sharpness Guidance Region Overlapping Algorithm

1 Shanxi Key Laboratory of Signal Capturing & Processing, North University of China, Taiyuan 030051, China
2 School of Instrument and Electronics, North University of China, Taiyuan 030051, China
3 School of Mathematics, North University of China, Taiyuan 030051, China
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(17), 7764; https://doi.org/10.3390/app14177764
Submission received: 24 July 2024 / Revised: 20 August 2024 / Accepted: 27 August 2024 / Published: 3 September 2024

Abstract

Multi-focus image and multi-modal image fusion technology can take full advantage of images acquired by different sensors or at different times, retaining the image feature information and improving image quality. A multi-source image fusion algorithm based on bidimensional empirical mode decomposition (BEMD) and a region sharpness-guided region overlapping algorithm is studied in this article. Firstly, the source images are decomposed by BEMD into multi-layer bidimensional intrinsic mode functions (BIMFs) and residuals, ordered from the high-frequency layer to the low-frequency layer. Gaussian bidimensional intrinsic mode functions (GBIMFs) are obtained by applying Gaussian filtering to the BIMFs, and the sharpness values of segmented regions are calculated using an improved weighted operator based on the Tenengrad function, which is the key to comparison, selection, and fusion. Then, the GBIMFs and residuals selected by the sharpness comparison strategy are fused by the region overlapping method, and the stacked layers are weighted to construct the final fused image. Finally, based on qualitative and quantitative evaluation indicators, the proposed algorithm is compared with six typical image fusion algorithms. The comparison results show that the proposed algorithm can effectively capture the feature information of images in different states and reduce redundant information.

1. Introduction

General image fusion mainly includes multi-focus, multi-modal, and multi-exposure image fusion; multi-focus images include color and gray multi-focus images, and multi-modal images include medical multi-modal images as well as infrared and visible images. With the continual iteration of sensor technology, the sources of image acquisition are becoming increasingly diverse. In many scenarios, multiple sensors must jointly image the target scene to meet different practical needs. In order to make full use of the multi-focus and multi-modal images obtained by multiple sensors, image fusion technology has emerged and plays an important role in photography, machine vision, medicine, military, and other fields.

2. The Related Work

In recent years, with the development of artificial intelligence algorithms, there have been new attempts to fuse multi-source images. Nunes et al. [1] extended empirical mode decomposition to bidimensional empirical mode decomposition (BEMD). From a new perspective, the application of BEMD fully demonstrates the advantages of non-stationary time-frequency analysis and of nonlinear, adaptive capabilities in multi-source image fusion processing. It divides the input signal into different scales, which represent different frequency components of the input; these components are called frequency layers. BEMD differs from other time-frequency signal processing methods such as the discrete Fourier transform [2] and the discrete wavelet transform (DWT) [3]: it does not rely on any prior basis function but carries out an adaptive decomposition driven by the data's own features, and it can better represent signal features in terms of time-frequency localization.
In the multi-source image fusion process based on BEMD by Xie et al. [4], most of the detailed features in the image are extracted into the first bidimensional intrinsic mode function (BIMF), and the remaining detailed features are embedded into subsequent BIMFs and residuals. In the image decomposition process, the BEMD algorithm depends strongly on correct interpolation. Due to the lack of local extremum constraints, incorrect interpolation appears at the edges of the interpolated image, which is the edge effect in BEMD decomposition; this edge effect has not yet been well solved. After the image is decomposed into multiple BIMFs and residuals, the ideal parts are selected for fusion. A pixel-based fusion strategy is used to combine each BEMD component, but pixel-level fusion rules cannot capture the significant information of the source images well and produce fuzzy fusion results. When fusing multi-modal images based on an energy-maximum selection strategy, details of the fusion result are lost because the frequency content of multi-modal images differs, which affects the selection of image sharpness values for fusion.
Li et al. [5] constructed a novel convolutional neural network based on the Dempster–Shafer theory to deal with the problems of overlapping focus regions and depth of field. The Dempster–Shafer theory is introduced to fuse the results of different branches, which improves the reliability of the results. The gradient residuals are designed to improve the utilization of edge information and reduce the dimension of branch layer feature mapping, thus improving the performance of the network and reducing the number of training parameters. Compared with other advanced fusion methods, the fusion map obtained by this method is more accurate.
Considering image sharpness evaluation, a new image sharpness evaluation function is proposed by Huang [6]. It enhances the high-frequency information of the image by histogram equalization, reduces the influence of jitter artifacts and stray light by DWT, and quantitatively evaluates the sharpness of the image by calculating the edge gradient with the Tenengrad function to obtain the sharpness value of the entire image.
Ojdanić et al. [7] designed an algorithm combining the Tenengrad operator and the hill climbing search function for the high-speed autofocus module of UAVs to solve the problem of low noise resistance and real-time performance of the traditional gray gradient algorithm. The UAV focusing system consists of two telescopes and deep learning-based target detection, complemented by suitable linear phase and passive focusing algorithms to achieve fast autofocus.
The purpose of the image fusion methods mentioned above is to extract all useful features from a given set of multi-source images and finally obtain a composite image containing all features of the multi-source images while reducing noise. Although many methods can fuse the feature information from two source images into one fused image, the quality of the final fused image fluctuates greatly with changes in imaging focus and illumination and with inappropriate fusion strategies. Considering this, an improved image fusion scheme is proposed to make up for the technical defects of recent research and overcome the above shortcomings. The experimental results indicate that this method is superior to traditional fusion methods in both qualitative and quantitative evaluation.
The contributions of the article can be summarized as follows:
  • A region overlapping multi-source image fusion method based on BEMD block sharpness guidance is proposed. First, the input image pairs are decomposed by BEMD into bidimensional intrinsic mode functions (BIMFs) and residuals, i.e., multiple two-dimensional image layers. Gaussian filtering is applied to the obtained image layers, so that the filtered image features are more obvious and the noise introduced in the interpolation process is smoothed.
  • A new fusion method is designed to improve the quality of the final fused image: the filtered image is divided into different regions, and the improved Sobel weighted operator is used to calculate the sharpness value of each region, which is an important reference for region selection.
  • For the final region selection, we adopt region stitching and region overlapping, which can reduce the influence of artifacts on the final fused image. First, the regions with higher sharpness values are selected. Based on the sharpness comparison strategy, the GBIMFs and residuals are fused by subregion splicing and region overlapping, and the overlaid layers are weighted to build the final fused image.
The remainder of this article is organized as follows. In Section 3, the basic knowledge about BEMD is given. Section 4 discusses the proposed method in detail. Section 5 conducts qualitative and quantitative comparisons of the proposed method and other state-of-the-art methods. Concluding remarks are given in Section 6.

3. Bidimensional Empirical Mode Decomposition

Multi-source image fusion synthesizes information about the same target scene obtained by different sensors or by the same sensor at different times. A single image suffers from defects such as incomplete scene information and large differences in gray-scale distribution range and brightness, as illustrated in Figure 1. Source image A is a picture with a clear foreground and a fuzzy background, and source image B is a picture with a fuzzy foreground and a clear background. Fused image C exhibits both a clear foreground and a clear background. When image fusion is carried out, the feature information of multiple images is optimized and combined to obtain a final fused image (fused image C), which contains rich and complete scene information, is more in line with human visual experience, is more convenient for machine recognition, and improves computing efficiency.
How to extract the feature information from image pairs quickly and accurately is the key technology in image fusion processing. BEMD usually decomposes the input image pairs into multiple BIMFs, ordered from a high-frequency layer to a low-frequency layer, plus a residual component. The low-frequency layers express the color information and the high-frequency layers express the texture and contour information.
In addition, the BEMD method has a boundary effect when processing the four edges of a 2D image. Symmetric padding is used to expand the image size, ensuring that the processed image is consistent with the original image size.
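As a concrete illustration of this padding step, the following minimal Python sketch mirror-pads a layer before envelope interpolation and crops it back afterwards; the pad width is an assumed parameter, since the text does not specify how far the image is extended.

```python
import numpy as np

def pad_symmetric(layer: np.ndarray, pad: int) -> np.ndarray:
    """Mirror-pad a 2D layer so that envelope interpolation near the borders has support."""
    return np.pad(layer, pad_width=pad, mode="symmetric")

def crop_to_original(padded: np.ndarray, pad: int) -> np.ndarray:
    """Remove the padding so the processed layer matches the original image size."""
    return padded[pad:-pad, pad:-pad]
```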
The decomposition process of the image signal according to the BEMD algorithm [8] can be represented by the flow chart shown in Figure 2.
The specific decomposition process is as follows:
  • Initialize externally and input the image function f(x,y).
  • If it is a multi-channel image such as a three-channel color image, the image is automatically layered and cyclically processed; if it is a single channel, it is directly input.
  • Select the local maximum point and minimum point by sliding the local window.
  • Process the local maximum and minimum points of the image separately through interpolation, obtaining the upper and lower envelope surfaces Emax(x,y) and Emin(x,y), and then calculate their mean value E(x,y):
    $E(x,y) = \frac{E_{\max}(x,y) + E_{\min}(x,y)}{2}$ (1)
  • Subtract the envelope mean E(x,y) from the input image f(x,y) to obtain a new intermediate variable L(x,y):
    $L(x,y) = f(x,y) - E(x,y)$ (2)
  • Verify whether L(x,y) meets the BIMF sifting termination criterion:
    $SD = \frac{\sum_{x=1}^{M}\sum_{y=1}^{N}\left|L_{m-1}(x,y) - L_{m}(x,y)\right|^{2}}{\sum_{x=1}^{M}\sum_{y=1}^{N}L_{m-1}^{2}(x,y)}$ (3)
  • If the criterion is satisfied, L(x,y) = BIMFi and Equation (5) is applied to obtain the residual image; otherwise, the following sifting step is repeated until the criterion SD < 0.03 is satisfied:
    $L_{n}(x,y) = L_{n-1}(x,y) - E(x,y)$ (4)
  • The decomposed BIMF components are summed and subtracted from the original image to obtain the residual Rm(x,y):
    $R_{m}(x,y) = f(x,y) - \sum_{i=1}^{m} BIMF_{i}(x,y)$ (5)
  • The input image f(x,y) can then be represented by the BEMD decomposition as
    $f(x,y) \approx \sum_{i=1}^{m} BIMF_{i}(x,y) + R_{m}(x,y)$ (6)
The source image is taken as the initial input and, after m cycles that meet the termination conditions, m BIMF components and a residual image satisfying the conditions are obtained.
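The following Python sketch outlines one possible implementation of this decomposition loop. It assumes cubic scattered-data interpolation for the envelope surfaces and a 3 × 3 extremum window, neither of which is prescribed by the text, so it should be read as an illustration of the procedure rather than the exact implementation used in the paper.

```python
import numpy as np
from scipy.interpolate import griddata
from scipy.ndimage import maximum_filter, minimum_filter

def _envelope(mask, values, shape):
    # Interpolate the scattered extrema into a full envelope surface
    # (cubic inside the convex hull, nearest-neighbour fill at the borders).
    ys, xs = np.nonzero(mask)
    grid_y, grid_x = np.mgrid[0:shape[0], 0:shape[1]]
    cubic = griddata((ys, xs), values[ys, xs], (grid_y, grid_x), method="cubic")
    nearest = griddata((ys, xs), values[ys, xs], (grid_y, grid_x), method="nearest")
    cubic[np.isnan(cubic)] = nearest[np.isnan(cubic)]
    return cubic

def bemd(f, max_imfs=4, sd_thresh=0.03, window=3):
    """Decompose a 2D array into BIMFs (high to low frequency) plus a residual."""
    residue = f.astype(float)
    bimfs = []
    for _ in range(max_imfs):
        h = residue.copy()
        while True:
            maxima = h == maximum_filter(h, size=window)   # local extrema in a sliding window
            minima = h == minimum_filter(h, size=window)
            if maxima.sum() < 4 or minima.sum() < 4:       # too few extrema to interpolate
                break
            e_mean = 0.5 * (_envelope(maxima, h, h.shape)
                            + _envelope(minima, h, h.shape))           # Equation (1)
            h_new = h - e_mean                                         # Equations (2)/(4)
            sd = np.sum((h - h_new) ** 2) / (np.sum(h ** 2) + 1e-12)   # Equation (3)
            h = h_new
            if sd < sd_thresh:
                break
        bimfs.append(h)
        residue = residue - h           # running residual; Equation (5) after the last BIMF
    return bimfs, residue               # f is approximately sum(bimfs) + residue, Equation (6)
```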

4. Multi-Source Image Fusion Algorithm

4.1. Selection Strategy by Sharpness Comparison

In order to reduce the noise and suppress the halo artifacts introduced in the interpolation process, the salient map is calculated by Gaussian filtering, and the GBIMFs are obtained to facilitate the subsequent sharpness comparison and fusion. The Gaussian filter [9] model is as follows:
$G(x,y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$ (7)
where σ represents the standard deviation of the bidimensional Gaussian and is set to 1. Gaussian filtering convolves the image with an A × A Gaussian kernel; the sum of all elements of the kernel is 1, and the kernel window size is set to 5 × 5.
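As a small illustration, this filtering step can be realized with a standard library call; the use of scipy.ndimage.gaussian_filter below is an assumption of the sketch, with the truncation chosen so that the effective kernel matches the 5 × 5, σ = 1 setting above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_layer(layer: np.ndarray, sigma: float = 1.0, ksize: int = 5) -> np.ndarray:
    """Apply the Gaussian of Equation (7); truncate is chosen so the kernel spans ksize taps."""
    radius = (ksize - 1) // 2
    return gaussian_filter(layer.astype(float), sigma=sigma, truncate=radius / sigma)
```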
The filtered BIMFs and residual are denoted as GBIMFs and GRes, respectively, and can be collectively referred to as G-layers. Precisely speaking, each image layer includes a pair of image layers, one from source image A and one from source image B. Each G-layer is segmented into multiple small regions, and the sharpness values of the image layer pairs, namely the corresponding regions of the same G-layer from images A and B, are calculated by an improved Sobel operator of the Tenengrad function. The regions with higher sharpness are selected. The whole procedure is shown in Figure 3.
Sharpness value calculation is usually based on the gray gradient, and the values of feature edge pixels are usually higher than those of non-edge pixels. Existing evaluation functions are vulnerable to noise and generally evaluate the gray gradient over the whole image, resulting in large evaluation errors. For small image regions, the traditional Tenengrad gradient function selects the region with the larger gradient value rather than the region with higher sharpness. Such misjudgment results in unsmooth or blurred borders in the fused image, especially in the area where the foreground and background adjoin. Therefore, it is necessary to improve the sharpness evaluation function to ensure that the regions with higher sharpness are correctly selected.
The Tenengrad function extracts the gradient values of pixels in each direction by convolution with the Sobel operator [10], which can reduce the influence of noise interference. The Tenengrad function can be expressed as:
$F = \sum_{x}\sum_{y}\left[G_{x}^{2}(x,y) + G_{y}^{2}(x,y)\right]$ (8)
where Gx(x,y) and Gy(x,y) are the gradient values at pixel (x,y) in the horizontal and vertical directions, formulated as:
$G_{x}(x,y) = f(x,y) \otimes g_{x}$ (9)
$G_{y}(x,y) = f(x,y) \otimes g_{y}$ (10)
where f(x,y) is the image pixel value; ⊗ is the convolution symbol; gx and gy are the horizontal and vertical templates of the Sobel operator and are defined as:
$g_{x} = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}$ (11)
$g_{y} = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}$ (12)
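For reference, a minimal sketch of this baseline Tenengrad measure (Equations (8)-(12)) might look as follows; the boundary handling (nearest-neighbour extension) is an assumption of the sketch.

```python
import numpy as np
from scipy.ndimage import convolve

GX = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)   # g_x, Equation (11)
GY = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)   # g_y, Equation (12)

def tenengrad(region: np.ndarray) -> float:
    """Baseline Tenengrad sharpness: sum of squared Sobel responses, Equation (8)."""
    f = region.astype(float)
    gx = convolve(f, GX, mode="nearest")   # Equation (9)
    gy = convolve(f, GY, mode="nearest")   # Equation (10)
    return float(np.sum(gx ** 2 + gy ** 2))
```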
Due to the limitation of the traditional Sobel operator, the gradient response across the different directions of an image tends to be non-uniform. In the image sharpness calculation, gradients computed in only a few directions cannot yield a sufficiently accurate sharpness value, which leads to mistakes in selecting the segmented image regions. The commonly used DFT method, Tenengrad method, and Laplacian function method all consider the change of gray value only in the horizontal and vertical directions when calculating sharpness, so they are easily affected by fringe noise, their results fluctuate greatly, and their accuracy is poor. Aiming at this problem, an operator with eight directions, as shown in Figure 4, is designed for the sharpness calculation.
In Figure 4, the expressions of the eight directional operators are, respectively, as follows:
$G_{1}(x,y) = \left[f(x-1,y+1) + f(x,y+1) + f(x+1,y+1)\right] - \left[f(x-1,y-1) + f(x,y-1) + f(x+1,y-1)\right]$
$G_{2}(x,y) = \left[f(x-1,y+1) + f(x-1,y) + f(x-1,y-1)\right] - \left[f(x+1,y+1) + f(x+1,y) + f(x+1,y-1)\right]$
$G_{3}(x,y) = \left[f(x-1,y+1) + f(x,y+1) + f(x-1,y)\right] - \left[f(x+1,y) + f(x,y-1) + f(x+1,y-1)\right]$
$G_{4}(x,y) = \left[f(x,y+1) + f(x+1,y+1) + f(x+1,y)\right] - \left[f(x-1,y) + f(x-1,y-1) + f(x,y-1)\right]$
$G_{5}(x,y) = 2f(x,y) - f(x,y+1) - f(x-1,y)$
$G_{6}(x,y) = 2f(x,y) - f(x+1,y) - f(x,y-1)$
$G_{7}(x,y) = 2f(x,y) - f(x,y+1) - f(x+1,y)$
$G_{8}(x,y) = 2f(x,y) - f(x-1,y) - f(x,y-1)$
(13)
Through experimental comparison, it is found that the operator takes into account the change trend of sharpness in eight directions. The above eight operators are used to convolve the GBIMF images, respectively; for each pixel, the maximum absolute value among the eight responses is selected as the result, and this response determines the final sharpness value used as the key for evaluation. The resulting fusion image quality is greatly improved. Figure 5 shows the final fusion results of the DFT operator, the Laplacian operator [11], and the improved Sobel operator. It can be seen that the improved Sobel operator yields better clarity and more obvious details in the final fused image. As shown in the red box in the figure, the resulting image has uniform brightness and less noise, and the quality of the final fused image is improved.

4.2. Improved Sobel Weighting Operator

BEMD decomposes the input images into multiple BIMFs and a residual component, from a high-frequency layer to a low-frequency layer. The high-frequency layers contain contour information and rich energy information; thus, the fusion of the high-frequency layers decides whether the feature information and the background information of the image can be retained. The Tenengrad function is selected to distinguish contour sharpness.
When performing image fusion, the image is segmented into small regions, and the salient parts of the image are extracted for image fusion to minimize the influence of background information. However, many regions with low gradient information need to be saved. For instance, the hat body regions in image A and image B both show low gradient information. The difference of the gradient information of the hat body in image A or B may be very small. But the hat body of image A needs to be saved and that of image B needs to be removed. It is easy to make mistakes in judging the regions to be preserved and produce noise in the final image.
To address this issue, an improved Sobel operator of the Tenengrad function is designed. As depicted in Figure 4, the operator templates differ from the templates gx and gy in Equations (11) and (12); there are eight operator templates in eight directions. The gradient values Gi(x,y) are obtained from the operator templates shown in Figure 4 by
$G_{i}(x,y) = f(x,y) \otimes g_{i}, \quad i = 1, \ldots, 8$ (14)
The operators take into account the gray change trend in eight directions and are convolved with the G-layers, respectively, to obtain the pixel gradient values Gi(x,y).
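A possible realization of these eight templates and of Equation (14) is sketched below. The exact coefficients, in particular the signs, are reconstructed from the structure of Equation (13) rather than quoted from the original templates, so the kernels should be treated as an assumption; only the magnitudes of the responses are used downstream.

```python
import numpy as np
from scipy.ndimage import convolve

# Eight 3 x 3 directional templates g1..g8, reconstructed from Equation (13):
# g1-g4 are differences of two opposite three-pixel groups, g5-g8 are second
# differences between the centre pixel and two adjacent neighbours.
G_TEMPLATES = [
    np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], float),
    np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]], float),
    np.array([[0, 1, 1], [-1, 0, 1], [-1, -1, 0]], float),
    np.array([[1, 1, 0], [1, 0, -1], [0, -1, -1]], float),
    np.array([[0, -1, 0], [-1, 2, 0], [0, 0, 0]], float),
    np.array([[0, 0, 0], [0, 2, -1], [0, -1, 0]], float),
    np.array([[0, -1, 0], [0, 2, -1], [0, 0, 0]], float),
    np.array([[0, 0, 0], [-1, 2, 0], [0, -1, 0]], float),
]

def directional_gradients(layer: np.ndarray) -> np.ndarray:
    """Convolve a G-layer with all eight templates (Equation (14)); returns an (8, H, W) stack."""
    f = layer.astype(float)
    return np.stack([convolve(f, g, mode="nearest") for g in G_TEMPLATES])
```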
The eight operators act on the same region, yielding eight pixel gradient values Gi(x,y), and it is necessary to integrate them into a single value. To obtain an appropriate normalized gradient weighting coefficient, the maximum-absolute-value scheme is generally used, but this strategy is sensitive to noise and some feature information may be lost. Instead, a correlation analysis between each pixel and its neighborhood is carried out using the neighborhood average value. Based on this idea, an algorithm named optimal mean of neighborhood grayscale is proposed. By evaluating the correlation between the grayscale value of the pixel and the optimal mean of its neighborhood, the gradient weighting coefficient Q is obtained as follows:
$Q = 1 - \frac{2\, f(x,y)\, f_{avg}}{f^{2}(x,y) + f_{avg}^{2}}$ (15)
This method effectively enhances the sensitivity and noise resistance of the gradient weighting algorithm.
To obtain the optimal mean value favg, the pixel values in the neighborhood are filtered to ensure the reliability of the neighborhood grayscale mean, as shown in Equation (16).
$f_{avg} = \frac{1}{8}\sum_{(i,j)\in S} f(x+i, y+j)\, \psi_{i,j}$ (16)
where f(x,y) is the gray value of the pixel, S is the set of eight neighborhoods of the pixel, and i and j are the horizontal and vertical coordinates of the neighborhood of the pixel, respectively.
The maximum gray value fmax and the minimum gray value fmin are found by traversing the gray values of the eight neighbors and are used as the standard to judge which neighborhood grayscale values are valid, so as to eliminate the interference of noise points and enhance the quality of image fusion. This process can be expressed as:
$\psi_{i,j} = \begin{cases} 1, & f_{\min} < f(x+i,y+j) < f_{\max} \\ 0, & f(x+i,y+j) = f_{\min} \ \text{or} \ f(x+i,y+j) = f_{\max} \end{cases}$ (17)
The sum of the weighted gradients is the sharpness value FTenengrad of the region, expressed as:
$F_{Tenengrad} = \sum_{x}\sum_{y}\left[G_{1}^{2}(x,y) + G_{2}^{2}(x,y) + \cdots + G_{8}^{2}(x,y)\right] \times Q$ (18)
where Q is the gradient weighting coefficient.
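Putting Equations (15)-(18) together, a region's weighted sharpness could be computed as in the sketch below. It assumes that Q is evaluated per pixel before the summation and reuses directional_gradients() from the previous sketch, so it is an interpretation of the formulas rather than the paper's exact code.

```python
import numpy as np

def region_sharpness(region: np.ndarray, grads: np.ndarray) -> float:
    """Weighted eight-direction Tenengrad sharpness of one region, Equations (15)-(18).

    `region` is the grey-level patch and `grads` the matching (8, H, W) stack
    returned by directional_gradients().
    """
    f = region.astype(float)
    h, w = f.shape
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            nbrs = np.delete(f[y - 1:y + 2, x - 1:x + 2].ravel(), 4)   # the 8 neighbours
            keep = (nbrs > nbrs.min()) & (nbrs < nbrs.max())           # psi, Equation (17)
            f_avg = float(np.sum(nbrs * keep)) / 8.0                   # Equation (16)
            denom = f[y, x] ** 2 + f_avg ** 2
            q = (1.0 - 2.0 * f[y, x] * f_avg / denom) if denom > 0 else 0.0   # Equation (15)
            total += float(np.sum(grads[:, y, x] ** 2)) * q            # Equation (18)
    return total
```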
The image processed by the Gaussian filter is then selected based on the maximum region sharpness value, and the specific implementation method is as follows:
$H_{i}^{j} = \begin{cases} GIMF_{i1}^{j}, & F_{Tenengrad}(GIMF_{i1}^{j}) \geq F_{Tenengrad}(GIMF_{i2}^{j}) \\ GIMF_{i2}^{j}, & F_{Tenengrad}(GIMF_{i1}^{j}) < F_{Tenengrad}(GIMF_{i2}^{j}) \end{cases}$ (19)
where $GIMF_{i}^{j}$ represents the j-th region of the i-th GBIMF or of the residual. By comparing the sharpness values FTenengrad of the corresponding regions in the same GBIMF layer or residual of the input source images IA and IB, the region $H_{i}^{j}$ with the higher sharpness value in the j-th region of layer i is obtained. Over all j, the i-th layer is reorganized into the layer of highest sharpness (relative to the i-th layer in image A or B), as shown in Figure 1. $H_{i}$ and $H_{R}$ are obtained by the sharpness selection strategy based on the improved region operator weighting algorithm: $H_{i}$ is the i-th GBIMF layer with maximum sharpness value and $H_{R}$ is the residual layer with maximum sharpness value. Together they form a sequence of layers with maximum sharpness values and are used to guide the subsequent fusion.
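The region-wise comparison of Equation (19) can then be sketched as follows. The block size is an assumed parameter (the paper segments each G-layer into small regions but does not state their size here), and the straight, non-overlapping tiling shown is exactly the variant whose splicing marks Section 4.3 addresses.

```python
import numpy as np

def select_regions(layer_a, layer_b, grads_a, grads_b, block=32):
    """Region-wise selection of Equation (19): keep, per block, the sharper source layer."""
    fused = np.zeros_like(layer_a, dtype=float)
    h, w = layer_a.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            sl = (slice(y, min(y + block, h)), slice(x, min(x + block, w)))
            sharp_a = region_sharpness(layer_a[sl], grads_a[:, sl[0], sl[1]])
            sharp_b = region_sharpness(layer_b[sl], grads_b[:, sl[0], sl[1]])
            fused[sl] = layer_a[sl] if sharp_a >= sharp_b else layer_b[sl]
    return fused
```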

4.3. The Maximum Sharpness Value Guidance of the Region Overlapping Fusion Algorithm

To merge the two original images IA and IB into one fused image, based on the maximum-sharpness layers obtained above, the formula can be expressed as:
$I_{f} = \sum_{i=1}^{K} H_{i} + H_{R}$ (20)
When sharpness comparison selection is implemented, the region method can finely capture prominent features in the corresponding regions of the same layer of the two input images; that is, the regions with rich and sharp information are taken into the fused image. However, the region fusion method may lead to gray-level and feature discontinuity at the junctions of different regions, resulting in an uneven transition across the concatenated boundaries. For example, the region in row 1, column 1 may be taken entirely from image I1, including IMF layers 1 to K and the residual layer, while the region in row 1, column 2 is taken entirely from image I2. The fused image is then prone to obvious splicing marks at the boundary between the first and second columns of the first row.
To solve the problem that the boundary transition between different regions is not smooth, the region overlapping method [12] can be used. An M × M window slides over the layers $H_{i}$ to $H_{R}$ obtained by the sharpness value selection strategy. In the traversal of the two-dimensional layer matrix, if the step size S of each move is less than M, then overlapping areas are formed between adjacent regions and the number of overlapping rows or columns is N = M − S. The choice of the overlapping value N directly affects the degree of the splicing marks; therefore, an appropriate value is crucial.
The region overlapping method is illustrated in Figure 6. If the window size is 6 × 6 and the sliding step size is 2, then as the window slides over the image there are overlapping areas between adjacent regions, and the regions are stacked evenly. The overlapping number of the different regions in the image is shown in Table 1. The gray value accumulated in each region divided by its overlapping number gives the final gray value of the corresponding region.
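A sketch of this overlapping traversal is given below, reusing region_sharpness() from above. The window and step default to the 6 and 2 of the example, remainder pixels at the right and bottom edges are ignored for brevity, and the division by the per-pixel overlap count implements the stack-layer normalization of Table 1.

```python
import numpy as np

def overlap_fuse(layer_a, layer_b, grads_a, grads_b, window=6, step=2):
    """Region-overlapping fusion: accumulate the sharper window at every slide position
    and normalize by how many windows covered each pixel (its stack layer, Table 1)."""
    h, w = layer_a.shape
    acc = np.zeros((h, w), dtype=float)
    count = np.zeros((h, w), dtype=float)
    for y in range(0, h - window + 1, step):
        for x in range(0, w - window + 1, step):
            sl = (slice(y, y + window), slice(x, x + window))
            sharp_a = region_sharpness(layer_a[sl], grads_a[:, sl[0], sl[1]])
            sharp_b = region_sharpness(layer_b[sl], grads_b[:, sl[0], sl[1]])
            acc[sl] += layer_a[sl] if sharp_a >= sharp_b else layer_b[sl]
            count[sl] += 1.0
    return acc / np.maximum(count, 1.0)
```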
The region overlapping fusion can reduce the splicing marks compared with the method without overlapping. The comparison of pixel-based fusion strategy, region energy comparison strategy, and region sharpness comparison strategy are given in Figure 7. Pixel-based fusion strategy is based on the selection strategy of maximum pixel values. Therefore, for the image in the first row of Figure 7, when the gray value of the clock digit in image A is smaller than that in image B, the fusion method selects the clock digit with a blurred appearance of image B, resulting in blurred numbers in the fused image. The extracted numbers are unclear and the fused image is not so satisfactory. The fusion performance of the region energy comparison method is superior to the above method. However, in terms of sharpness and smoothness, it is still inferior to the region sharpness comparison method.
The pseudo code of the Algorithm 1 flow is as follows:
Algorithm 1 Multi-source image fusion based on BEMD and region sharpness guidance region overlapping algorithm
Input: two source images I = (IA, IB)
1: initialize the residue R0 = I (a multi-channel image is processed channel by channel) and set i = 1
2: repeat
3: find all local maxima (minima) of each channel image of Ri−1 by comparing the value of each pixel with its neighbours in the 3 × 3 window centered on it
4: obtain the upper and lower envelope surfaces Emax(x,y) and Emin(x,y) by interpolation and compute their mean E(x,y) by Equation (1)
5: get a new intermediate variable L(x,y) by Equation (2)
6: obtain the i-th BIMF and the residue Ri
7: i = i + 1
8: until the residue Ri is a constant or a monotonic function, or the number of BIMFs exceeds a given threshold
9: filter the BIMFs and the residual through Equation (7) to get the GBIMFs and GRes
10: use the improved Sobel weighted operator to calculate the sharpness of the GBIMF and GRes regions
11: use Equation (19) to select the regions with higher sharpness
12: fuse the selected regions through overlapping segmentation
Output: fused image If
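Expressed with the sketches introduced in Sections 3 and 4, the overall flow of Algorithm 1 for one grey-scale image pair might read as follows; a color image would simply run this per channel, and the number of BIMFs is an assumed parameter of the sketch.

```python
def fuse_pair(img_a, img_b, max_imfs=4):
    """End-to-end sketch of Algorithm 1 using the helper functions defined above."""
    stacks = []
    for src in (img_a, img_b):                       # steps 1-9: decompose and smooth
        bimfs, residue = bemd(src, max_imfs=max_imfs)
        stacks.append([smooth_layer(layer) for layer in bimfs + [residue]])
    fused_layers = []
    for layer_a, layer_b in zip(*stacks):            # steps 10-12: per-layer fusion
        grads_a = directional_gradients(layer_a)
        grads_b = directional_gradients(layer_b)
        fused_layers.append(overlap_fuse(layer_a, layer_b, grads_a, grads_b))
    return sum(fused_layers)                         # Equation (20)
```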

5. Experimental Results

To further evaluate the effectiveness and feasibility of the proposed method, a comparative evaluation is conducted against other notable methods, which mainly include the Laplacian pyramid algorithm (LP) [11], the ratio of low-pass pyramid fusion algorithm (RP) [12], DWT [13], the dual-tree complex wavelet transform algorithm (DTCWT) [14], the curvelet transform algorithm (CVT) [15], and the nonsubsampled contourlet transform algorithm (NSCT) [16].
A widely used dataset for image fusion is selected for the experiments. It consists of 20 pairs of color and 10 pairs of gray multi-focus images, 16 pairs of medical images, and 8 pairs of infrared and visible multi-modal images, namely 2 categories and 4 sub-categories. All images are 520 × 520 in size. For images of the same type, all quantitative indicators are averaged to reduce random errors and improve credibility. All experiments are simulated in MATLAB R2020a on a computer with an Intel(R) Core(TM) i5-8250U CPU and 8.00 GB of installed memory (RAM).

5.1. Qualitative Evaluation

Four sub-categories, namely color multi-focus, gray multi-focus, medical, and infrared and visible images, are conducted for image fusion performance evaluation. We list one image for each category in Figure 8.
There are four kinds of source image pairs for image fusion evaluation. (a) The first- and second-row images are an image with a clear foreground and fuzzy background and an image with a fuzzy foreground and clear background. (b) The first- and second-row images are likewise an image with a clear foreground and fuzzy background and an image with a fuzzy foreground and clear background. (c) Medical images captured from different instruments. (d) The first- and second-row images are visible and infrared images.
Image fusion results of four sub-categories by LP, RP, DWT, DTCWT, CVT, NSCT, and the proposed method are displayed in Figure 9 to Figure 12, respectively.
It can be seen from Figure 9 that, at the junction between the foreground and the background shown in detail 1 of the detail figure, the RP, DWT, NSCT, and CVT algorithms produce uneven edges and artifacts on the edge of the club, while the improved Sobel operator in our algorithm calculates gradients in eight directions, and the final sharpness value obtained by combining the sharpness weights resolves the unsmooth transitions caused by image fusion. In the final fusion strategy, the segmentation and overlapping strategy reduces the appearance of spatial artifacts. As shown in detail 2 in Figure 9, the other algorithms perform less well than the improved algorithm in the collar region.
Figure 10 shows the result of gray multi-focus image fusion. Obvious speckle noise appears in detail 1 after fusion with the RP algorithm, which has a great impact on the quality of the fused image. Fringe noise appears in the DWT result, and the other algorithms show more or less blurring and non-smoothness at the junctions, while our algorithm smooths the noise that may occur in the fusion process through Gaussian filtering. After smoothing, the improved Sobel operator can effectively calculate the image sharpness, making the detail features of the final fused image more obvious. As can be seen from detail 2, there is an obvious artifact at the border of the clock in the other methods; the reason is that the block with higher sharpness cannot be effectively selected during fusion.
Figure 11 shows the detail diagrams of multi-modal medical image fusion. The RP, LP, DTCWT, CVT, and NSCT algorithms show blocky segmentation and unsmooth edges in details 1 and 2. We use Gaussian filtering to reduce the noise and unsmoothness caused by interpolation, and the overlapping segmentation method used in fusion makes the edges of the image smoother.
Figure 12 shows that, for infrared and visible images, the RP algorithm suffers serious detail loss, the DWT algorithm has a breakpoint at the body, and the brightness of the human figure is not obvious in the LP, DTCWT, CVT, and NSCT results. Our algorithm makes up for these shortcomings by weighting the sharpness value to improve the brightness of the fused image: salient features such as the car and the pedestrian become significantly brighter, the brightness information is smooth, there is less noise, the contours and details of the objects show fewer splicing marks, and the texture is fully preserved.
To sum up, for multi-focus image sets, the proposed algorithm can reduce the splicing marks at the junction of two focal images, and the fusion effect of detail features is satisfactory. For multi-modal image sets, the proposed algorithm can smooth the overall brightness and make the image clearer.

5.2. Quantitative Evaluation

Seven objective indicators, which are commonly used to evaluate the effect of image fusion, are selected for quantitative comparative analysis among the different fusion methods, including mutual information (MI) [17], feature mutual information (FMI) [18,19], peak signal-to-noise ratio (PSNR) [20], standard deviation (SD), the structural-similarity-based QY [20], QVIF [21], and running time. For all indicators except time, the larger the value, the better the fusion effect.
The numerical evaluation results of the six indicators in the four sub-category data sets are given in Figure 13. There are 54 image pairs, composed of 20 color and 10 gray multi-focus, 16 medical, and 8 infrared and visible multi-mode image pairs.
The average value of the four sub-categories for the six evaluation indicators are listed in Table 2, Table 3, Table 4 and Table 5. The optimal value is marked in bold font.
For the color multi-focus image set (Table 2), the six indicators of the proposed method are optimal. This is consistent with the intuitive effect. For the gray multi-focus set (Table 3), the proposed method is superior to other algorithms in texture detail and sharpness. Thus, MI, PSNR, SD, and QY are better. Compared with color images, edge features of grayscale images are weak. The proposed algorithm extracts more feature information from GBIMFs and less background information, resulting in unsatisfactory FMI and QVIF indicators.
For the medical image set (Table 4), the proposed algorithm is optimal in all six indicators, which reflects the effectiveness of the significant feature extraction effect, and it is also clearer and more comfortable from the visual perspective, as displayed in Figure 11.
For the infrared and visible image set (Table 5), all indicators of the proposed algorithm except FMI are optimal. The FMI indicator is deficient in mutual feature information after fusion due to the small difference in the background environment. This is because, in the process of using BEMD to guide significant feature extraction, some background information is lost through interpolation and iteration. However, our method still highlights the salient target well; only the richness of the feature information needs to be further improved.
As for the time indicator, the proposed method is not optimal on any of the data sets, which is related to the weighted overlap fusion strategy used in the algorithm: the final fused image can only be obtained by repeated calculation of the comparative sharpness values, so the advantage of the algorithm in time is reduced.
To sum up, the proposed algorithm still outperforms the other algorithms in several aspects for all image sets. Therefore, from the perspective of objective evaluation indicators, the proposed method has a good processing effect on images with high resolution and rich image information.
In addition to the six transform-domain methods of LP, RP, DWT, DTCWT, CVT, and NSCT, we also compare our method with a deep convolutional neural network (CNN) method [22], as shown in Table 6. For the color multi-focus data set, the proposed method is superior to the CNN method in five indicators. For the gray multi-focus, medical multi-modal, and infrared and visible data sets, except for the MI indicator on the infrared and visible images, the algorithm proposed in this paper is generally superior to the CNN algorithm, especially in terms of the time indicator: the running time of the CNN algorithm fluctuates greatly across data sets, whereas the proposed algorithm consumes less time and fluctuates less.

6. Discussion and Conclusions

Image fusion is currently a hot research topic in the fields of medicine, machine vision, and photography. The work in this article is an attempt at multi-focus and multi-modal image fusion. Our method can preserve important image features and rich detailed information.
Each stage of the proposed method contributes to the quality of the fused image. Our algorithm decomposes the image into BIMFs and residuals via BEMD; the BIMFs and residual obtained after the interpolation process are noisy, so we remove the noise and smooth the image by Gaussian filtering. In the fusion process, the improved eight-direction Sobel operator is combined with the weighting function to calculate the sharpness value, which is the key to the next step of fusion. Finally, the fusion strategy of segmentation and overlap is selected, which makes the image edges smoother and the details of the obtained image more obvious.
All in all, from a visual point of view, it can be seen from Figures 9-12 and Tables 2-5 that the fusion image quality produced by our proposed method is superior to that of the other fusion methods, which demonstrates the superiority of our method. While the proposed work shows advantages over recent fusion approaches, it still has limitations. It performs best in multi-focus image fusion but slightly worse in multi-modal image fusion, and its calculation time is not ideal. In future studies, we will emphasize reducing time consumption while maintaining the improved fusion image quality. At the same time, our approach is still likely to improve in the following aspects. Our method can reduce artifacts in the fusion of multi-focus image boundary regions, as shown in Figure 9; in multi-focus image fusion this is an open question, and a recent study [23] has given us much inspiration in this regard, since such problems can be reduced by means of image enhancement. In multi-modal image fusion, contrast, illumination, noise, and other factors have a great impact on the accurate judgment of the image, and preprocessing and image enhancement methods [24,25] can improve the imaging effect. The preprocessing of these two data sets will be the main direction of our subsequent improvement.

Author Contributions

Conceptualization, X.-T.G. and X.-J.D.; methodology, X.-T.G. and X.-J.D.; software, X.-T.G. and X.-J.D.; validation, X.-T.G., X.-J.D. and H.-H.K.; formal analysis, X.-T.G. and X.-J.D.; investigation, X.-J.D.; resources, X.-T.G.; data curation, X.-J.D.; writing—original draft preparation, X.-J.D.; writing—review and editing, X.-J.D.; visualization, X.-J.D.; supervision, X.-T.G. and H.-H.K.; project administration, X.-T.G. and H.-H.K.; funding acquisition, X.-T.G. and H.-H.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Fundamental Research Program of Shanxi [202103021223194], and Shanxi Key Laboratory of Signal Capturing and Processing.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data used to support the findings of this study are available upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Nunes, J.C.; Bouaoune, Y.; Delechelle, E.; Niang, O.; Bunel, P. Image analysis by bidimensional empirical mode decomposition. Image Vis. Comput. 2003, 21, 1019–1026. [Google Scholar] [CrossRef]
  2. Cheng, H.; Shen, H.; Meng, L.; Ben, C.; Jia, P. A Phase Correction Model for Fourier Transform Spectroscopy. Appl. Sci. 2024, 14, 1838. [Google Scholar] [CrossRef]
  3. Régal, X.; Cumunel, G.; Bornert, M.; Quiertant, M. Assessment of 2D Digital Image Correlation for Experimental Modal Analysis of Transient Response of Beams Using a Continuous Wavelet Transform Method. Appl. Sci. 2023, 13, 4792. [Google Scholar] [CrossRef]
  4. Xie, Q.; Hu, J.; Wang, X.; Zhang, D.; Qin, H. Novel and fast EMD-based image fusion via morphological filter. Vis. Comput. 2022, 39, 4249–4265. [Google Scholar] [CrossRef]
  5. Li, L.; Li, C.; Lu, X.; Wang, H.; Zhou, D. Multi-focus image fusion with convolutional neural network based on Dempster-Shafer theory. Optik 2023, 272, 170223. [Google Scholar] [CrossRef]
  6. Huang, W.; Jing, Z. Evaluation of focus measures in multi-focus image fusion. Pattern Recognit. Lett. 2007, 28, 493–500. [Google Scholar] [CrossRef]
  7. Ojdanić, D.; Zelinskyi, D.; Naverschnigg, C.; Sinn, A.; Schitter, G. High-speed telescope autofocus for UAV detection and tracking. Opt. Express 2024, 32, 7147–7157. [Google Scholar] [CrossRef]
  8. Pan, J.; Tang, Y.Y. A mean approximation based bidimensional empirical mode decomposition with application to image fusion. Digit. Signal Process. 2016, 50, 61–71. [Google Scholar]
  9. Garg, B.; Sharma, G. A quality-aware Energy-scalable Gaussian Smoothing Filter for image processing applications. Microsystems 2016, 45, 1–9. [Google Scholar] [CrossRef]
  10. Vairalkar, M.K.; Nimbhorkar, S.U. Edge detection of images using Sobel operator. Int. J. Emerg. Technol. Adv. Eng. 2012, 2, 291–293. [Google Scholar]
  11. Zhang, H.; Shen, H.; Yuan, Q.; Guan, X. Multispectral and SAR Image Fusion Based on Laplacian Pyramid and Sparse Representation. Remote Sens. 2022, 14, 870. [Google Scholar] [CrossRef]
  12. Pampanoni, V.; Fascetti, F.; Cenci, L.; Laneve, G.; Santella, C.; Boccia, V. Analysing the Relationship between Spatial Resolution, Sharpness and Signal-to-Noise Ratio of Very High Resolution Satellite Imagery Using an Automatic Edge Method. Remote. Sens. 2024, 16, 1041. [Google Scholar] [CrossRef]
  13. Liu, Y.; Liu, S.; Wang, Z. A general framework for image fusion based on multi-scale transform and sparse representation. Inf. Fusion 2015, 24, 147–164. [Google Scholar] [CrossRef]
  14. Ioannidou, S.; Karathanassi, V. Investigation of the Dual-Tree Complex and Shift-Invariant Discrete Wavelet Transforms on Quickbird Image Fusion. IEEE Geosci. Remote Sens. Lett. 2007, 4, 166–170. [Google Scholar] [CrossRef]
  15. Zhan, L.; Zhuang, Y.; Huang, L. Infrared and visible images fusion method based on discrete wavelet transform. J. Comput. 2017, 28, 57–71. [Google Scholar] [CrossRef]
  16. Zhao, X.; Jin, S.; Bian, G.; Cui, Y.; Wang, J.; Zhou, B. A Curvelet-Transform-Based Image Fusion Method Incorporating Side-Scan Sonar Image Features. J. Mar. Sci. Eng. 2023, 11, 1291. [Google Scholar] [CrossRef]
  17. Anandhi, D.; Valli, S. An algorithm for multi-sensor image fusion using maximum a posteriori and nonsubsampled contourlet transform. Comput. Electr. Eng. 2018, 65, 139–152. [Google Scholar] [CrossRef]
  18. Liu, Y.; Wang, L.; Cheng, J.; Li, C.; Chen, X. Multi-focus image fusion: A Survey of the state of the art. Inf. Fusion 2020, 64, 71–91. [Google Scholar] [CrossRef]
  19. Haghighat, M.B.A.; Aghagolzadeh, A.; Seyedarabi, H. A non-reference image fusion metric based on mutual information of image features. Comput. Electr. Eng. 2011, 37, 744–756. [Google Scholar] [CrossRef]
  20. Guo, W.; Xiong, N.; Chao, H.-C.; Hussain, S.; Chen, G. Design and Analysis of Self-Adapted Task Scheduling Strategies in Wireless Sensor Networks. Sensors 2011, 11, 6533–6554. [Google Scholar] [CrossRef]
  21. Wunsch, L.; Tenorio, C.G.; Anding, K.; Golomoz, A.; Notni, G. Data Fusion of RGB and Depth Data with Image Enhancement. J. Imaging 2024, 10, 73. [Google Scholar] [CrossRef] [PubMed]
  22. Liu, Y.; Chen, X.; Peng, H.; Wang, Z. Multi-focus image fusion with a deep convolutional neural network. Inf. Fusion 2017, 36, 191–207. [Google Scholar] [CrossRef]
  23. Bhutto, J.A.; Lianfang, T.; Du, Q.; Soomro, T.A.; Lubin, Y.; Tahir, M.F. An enhanced image fusion algorithm by combined histogram equalization and fast gray level grouping using multi-scale decomposition and gray-PCA. IEEE Access 2020, 8, 157005–157021. [Google Scholar] [CrossRef]
  24. Bhutto, J.A.; Tian, L.; Du, Q.; Sun, Z.; Yu, L.; Soomro, T.A. An improved infrared and visible image fusion using an adaptive contrast enhancement method and deep learning network with transfer learning. Remote Sens. 2022, 14, 939. [Google Scholar] [CrossRef]
  25. Bhutto, J.A.; Tian, L.; Du, Q.; Sun, Z.; Yu, L.; Tahir, M.F. CT and MRI medical image fusion using noise-removal and contrast enhancement scheme with convolutional neural network. Entropy 2022, 24, 393. [Google Scholar] [CrossRef]
Figure 1. Multi-focus color image fusion. (a) Source image A, (b) source image B, (c) fused image. Image A is a picture with a clear foreground and a fuzzy background, and image B is a picture with a fuzzy foreground and a clear background. Fused image C exhibits both a clear foreground and a clear background.
Figure 2. Flow diagram of BEMD algorithm. Decompose the input image signal f(x,y) into multiple 2D image layers including BIMFs and residual.
Figure 3. Multi-source image fusion based on BEMD and region sharpness guidance region overlapping algorithm.
Figure 4. Improved Sobel operator of Tenengrad function. There are 8 operator templates in 8 directions.
Figure 5. (a) DFT operator, (b) Laplacian operator, and (c) improved Sobel operator.
Figure 6. Region overlapping for 7 × 12 image with window size of 6 × 6 and sliding step size of 2.
Figure 7. (a) Source image A, (b) source image B, (c) pixel-based fusion strategy, (d) image fusion based on region energy comparison, (e) image fusion based on region sharpness comparison.
Figure 8. (a) Color multi-focus, (b) gray multi-focus, (c) medical images, (d) infrared and visible images.
Figure 9. Comparison of different algorithms for color multi-focus image fusion. (a) LP algorithm, (b) RP algorithm, (c) DWT algorithm, (d) DTCWT algorithm, (e) CVT algorithm, (f) NSCT algorithm, (g) our algorithm.
Figure 10. Comparison of different algorithms for gray multi-focus image fusion. (a) LP algorithm, (b) RP algorithm, (c) DWT algorithm, (d) DTCWT algorithm, (e) CVT algorithm, (f) NSCT algorithm, (g) our algorithm.
Figure 11. Comparison of different algorithms for medical multi-model image fusion. (a) LP algorithm, (b) RP algorithm, (c) DWT algorithm, (d) DTCWT algorithm, (e) CVT algorithm, (f) NSCT algorithm, (g) our algorithm.
Figure 12. Comparison of different algorithms for Infrared and visible image fusion. (a) LP algorithm, (b) RP algorithm, (c) DWT algorithm, (d) DTCWT algorithm, (e) CVT algorithm, (f) NSCT algorithm, (g) our algorithm.
Figure 13. Quantitative results of our method with six representative methods and six evaluation indicators on the widely used dataset. Notably, the red dotted lines describe the results of our method. The 54 image pairs are sequentially color and gray multi-focus, medical multi-mode, and infrared and visible multi-mode images.
Table 1. Number of overlapping layers in different regions.
Region              Stack Layers
a, h, p, v          1
b, f, i, o, q, u    2
c, d, e, r, s, t    3
j, n                4
k, l, m             6
Table 2. Comparison results of different algorithms in color multi-focus image set.
Methods   MI       FMI      PSNR      SD        QY       QVIF     Time
LP        0.9811   0.9009   35.5011   10.2568   0.9657   1.5879   0.0170
RP        0.9678   0.8972   34.9039   10.2615   0.9547   1.5732   0.3491
DWT       0.9075   0.8989   34.9948   10.2527   0.9461   1.5197   0.7552
DTCWT     0.9401   0.9006   35.1847   10.2470   0.9647   1.5482   1.4975
CVT       0.9041   0.9001   35.1226   10.2472   0.9447   1.5347   6.2568
NSCT      0.9526   0.9006   35.2878   10.2500   0.9616   1.5667   17.1029
Ours      1.1235   0.9010   36.4724   10.2797   0.9791   1.6023   1.0652
Table 3. Comparison results of different algorithms in gray multi-focus image set.
Methods   MI       FMI      PSNR      SD        QY       QVIF     Time
LP        0.9519   0.9024   35.7084   9.7644    0.9264   1.5326   0.0315
RP        0.9405   0.8964   34.8109   9.7586    0.9148   1.5020   0.0916
DWT       0.8481   0.8970   34.9988   9.7636    0.8883   1.4222   0.1350
DTCWT     0.8969   0.9015   35.4490   9.7546    0.9260   1.4885   0.3092
CVT       0.9476   0.9004   35.3754   9.7594    0.9025   1.4694   1.7779
NSCT      0.9138   0.9009   35.5876   9.7618    0.9275   1.5062   4.8504
Ours      1.0506   0.8897   35.7860   10.1688   0.9616   1.5071   0.3329
Table 4. Comparison results of different algorithms in medical multi-modal image set.
Methods   MI       FMI      PSNR      SD       QY       QVIF     Time
LP        0.7116   0.8965   31.9464   9.4535   0.8205   0.9016   0.0570
RP        0.7171   0.8689   30.6170   9.4336   0.7635   0.8500   0.0259
DWT       0.6736   0.8828   31.8465   9.3930   0.7918   0.7842   0.0755
DTCWT     0.6272   0.8928   31.7666   9.4326   0.6999   0.7896   0.1352
CVT       0.5818   0.8873   31.4522   9.4282   0.5682   0.7174   0.7658
NSCT      0.6685   0.8929   32.0096   9.4237   0.7483   0.8430   1.6713
Ours      0.7819   0.8984   32.1865   9.5780   0.9616   0.9579   1.1853
Table 5. Comparison results of different algorithms in infrared and visual image sets.
Methods   MI       FMI      PSNR      SD       QY       QVIF     Time
LP        0.4047   0.8052   28.4983   8.4960   0.8295   0.9444   0.0061
RP        0.3667   0.8562   27.6723   8.3892   0.6932   0.8668   0.0370
DWT       0.3844   0.8923   28.4818   8.4584   0.7630   0.7499   0.0633
DTCWT     0.3756   0.9013   28.4693   8.3869   0.8079   0.7787   0.1444
CVT       0.3513   0.8988   28.4634   8.3969   0.7668   0.7252   0.8411
NSCT      0.3864   0.9025   28.4964   8.4097   0.8275   0.8329   2.3144
Ours      0.5300   0.8991   29.3041   8.9991   0.8328   0.9826   1.0568
Table 6. Comparison of deep convolutional neural network and the algorithm in this paper on 4 data sets.
Data Set                   Methods   MI       FMI      PSNR      SD        QY       QVIF     Time
Color multi-focus set      CNN       1.1512   0.8913   36.4127   10.1764   0.9801   1.6011   85.1135
                           Ours      1.1235   0.9010   36.4724   10.2797   0.9791   1.6023   1.0652
Gray multi-focus set       CNN       0.9976   0.8762   35.4385   9.9277    0.9386   1.4995   49.1315
                           Ours      1.0506   0.8897   35.7860   10.1688   0.9616   1.5071   0.3329
Medical multi-modal set    CNN       0.7362   0.8928   31.4792   9.5564    0.7982   0.8869   24.9169
                           Ours      0.7819   0.8984   32.1865   9.5780    0.9616   0.9579   1.1853
Infrared and visual sets   CNN       0.6283   0.7936   28.5707   8.5180    0.7920   0.8163   23.1417
                           Ours      0.5300   0.8991   29.3041   8.9991    0.8328   0.9826   1.0568