#### *2.2. Image Fusion Using Low-Rank and Sparse Matrix*

According to the fusion algorithm processing flow in Figure 4, the image fusion algorithm was verified. In this process, the matrix structure of the source images for the multifocus NVG images was as shown in Figure 6. In particular, **I** is a two-dimensional image matrix, *t* is the total number of images, the image height is *n*, and the width is *m*.

**Figure 6.** Diagram of matrix structure of the source image.

First, the single source images were converted into one-dimensional vectors. The data arrangement in this step was as shown in Figure 7, where **IR** is a one-dimensional vector. Each one-dimensional vector **IR**<sup>t</sup> was stacked from top to bottom according to the frame sequence. After combination, the result was named the **D** matrix (data matrix).

**Figure 7.** Diagram of the image matrix converted into one-dimensional vectors.
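The stacking step above can be sketched as follows; the array names and sizes here are illustrative, not from the paper:

```python
import numpy as np

# Illustrative sizes (not from the paper): t frames, each n x m.
t, n, m = 4, 3, 5
rng = np.random.default_rng(0)
images = [rng.random((n, m)) for _ in range(t)]

# Flatten each source image I into a 1 x L row vector IR (L = n * m),
# then stack the rows top to bottom in frame order to form D (t x L).
D = np.vstack([img.reshape(1, n * m) for img in images])

print(D.shape)  # (4, 15)
```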

Then, the deep semi-NMF model proposed by Trigeorgis, Bousmalis, Zafeiriou, and Schuller [34] was used to obtain the low-dimensional representation, as shown in Equation (1):

$$\mathbf{D} \approx \mathbf{Z} \times \mathbf{H}.\tag{1}$$

In particular, **D** is the data matrix, **Z** is the loadings matrix, and **H** is the features matrix. This study adopted the method with low-dimensional characteristics to obtain the **A** and **E** matrices, where **A** is the low-rank matrix and **E** is the sparse matrix:

$$\mathbf{A} = \mathbf{Z} \times \mathbf{H},\tag{2}$$

$$\mathbf{E} = \mathbf{D} - \mathbf{A}.\tag{3}$$
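A minimal sketch of the decomposition in Equations (1)–(3). Note that a truncated SVD is used below only as a stand-in for the deep semi-NMF factorization of [34]; both produce a loadings/features product with **A** = **Z** × **H** and an exact residual **E** = **D** − **A**:

```python
import numpy as np

def low_rank_sparse_split(D, r):
    """Split D into a rank-r part A and residual E = D - A (Eqs. 2-3).
    A truncated SVD stands in here for the deep semi-NMF of the paper;
    both yield a loadings/features product A = Z @ H."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    Z = U[:, :r] * s[:r]   # loadings matrix Z (t x r)
    H = Vt[:r, :]          # features matrix H (r x L)
    A = Z @ H              # low-rank matrix, Eq. (2)
    E = D - A              # sparse residual matrix, Eq. (3)
    return A, E

D = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 2.0, 3.1]])
A, E = low_rank_sparse_split(D, r=1)
print(np.allclose(A + E, D))  # True: the split is exact by construction
```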

In the **A** matrix, the relationship between the respective row vectors and images was as shown in Equation (4):

$$\mathbf{A} = \begin{bmatrix} {}^{\mathbf{A}}\mathbf{I}\_1^R \\ {}^{\mathbf{A}}\mathbf{I}\_2^R \\ \vdots \\ {}^{\mathbf{A}}\mathbf{I}\_t^R \end{bmatrix}, \quad \text{where each } {}^{\mathbf{A}}\mathbf{I}\_i^R \text{ is of size } 1 \times L. \tag{4}$$

In particular, the length *L* is *n* × *m*. In the **A** matrix, the respective row vectors are **AIR**<sup>1</sup>, **AIR**<sup>2</sup>, ..., **AIR**<sup>*t*</sup>. In this paper, the **D** matrix formed by the respective images was decomposed into the **A** and **E** matrices; therefore, these two characteristics were targeted for processing. In particular, processing **A** involved reshaping the row vectors **AIR**<sup>1</sup> through **AIR**<sup>*t*</sup> into the two-dimensional images **AI**<sup>1</sup> − **AI**<sup>*t*</sup> and obtaining their mean value. The result was called **AIbest**, and the process was as shown in Equation (5):

$${}^{\mathbf{A}}\mathbf{I}\_{\mathbf{best}} = \left({}^{\mathbf{A}}\mathbf{I}\_1 + {}^{\mathbf{A}}\mathbf{I}\_2 + \dots + {}^{\mathbf{A}}\mathbf{I}\_t\right) / t \tag{5}$$
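The averaging step amounts to reshaping each row of **A** back into an *n* × *m* image and taking the per-pixel mean over the *t* frames; a sketch with illustrative sizes:

```python
import numpy as np

# Illustrative sizes: t row vectors of length L = n * m in the low-rank matrix A.
t, n, m = 3, 2, 4
A = np.arange(t * n * m, dtype=float).reshape(t, n * m)

# Reshape each row vector back to an n x m image, then take the
# per-pixel mean over the t frames to obtain A_I_best, Eq. (5).
A_images = A.reshape(t, n, m)
A_I_best = A_images.mean(axis=0)

print(A_I_best.shape)  # (2, 4)
```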

The mask offset manipulation of **EI**<sup>1</sup>, the image corresponding to the first frame of the sparse matrix, was as shown in Figure 8.


**Figure 8.** Diagram of the corresponding image in sparse matrix and mask offset.

The corresponding image in the sparse matrix was as shown in Equations (6) and (7):

$$OptIndex\_z = \underset{x \in [1,\,t]}{\operatorname{argmax}} \left( \operatorname{Var}\!\left(\mathbf{m}\_z \mathbf{I}\_x\right) \right), \quad z = 1, 2, \dots, len, \tag{6}$$

$${}^{\mathbf{E}}\mathbf{I}\_{\mathbf{best}} = {}^{\mathbf{E}}\mathbf{I}\left(OptIndex\_z\right) \tag{7}$$

In particular, Var in Equation (6) is the computed variance, and *len* is the total number of images undergoing mask offset. The best label obtained according to Equation (6) was used to acquire the **EIbest** image in Equation (7), through which the best edge information was retained. Finally, the **AIbest** image obtained in Equation (5) was added to the **EIbest** image obtained in Equation (7) to get the best fused image **Ibest**, as shown in Equation (8):

$$\mathbf{I}\_{\rm best} = \,^\mathbf{A}\mathbf{I}\_{\rm best} + \,^\mathbf{E}\mathbf{I}\_{\rm best} \tag{8}$$
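Equations (6)–(8) can be sketched as a patch-wise selection. The paper's exact mask-offset scheme is not fully specified here, so the fixed non-overlapping patches below, and the names `fuse` and `patch`, are illustrative assumptions:

```python
import numpy as np

def fuse(E_images, A_I_best, patch=8):
    """Patch-wise sketch of Eqs. (6)-(8): for each patch position z, keep the
    frame x whose sparse-matrix patch has the largest variance (Eq. 6), copy
    that patch into E_I_best (Eq. 7), and add the low-rank mean image (Eq. 8).
    Fixed non-overlapping patches are an assumption; the paper's mask-offset
    scheme may differ."""
    t, n, m = E_images.shape
    E_I_best = np.zeros((n, m))
    for i in range(0, n, patch):
        for j in range(0, m, patch):
            block = E_images[:, i:i + patch, j:j + patch]
            best = np.argmax(block.reshape(t, -1).var(axis=1))  # Eq. (6)
            E_I_best[i:i + patch, j:j + patch] = block[best]    # Eq. (7)
    return A_I_best + E_I_best                                  # Eq. (8)

# Toy usage: two 8 x 8 sparse frames; frame 1 carries all the detail.
rng = np.random.default_rng(1)
E_images = np.zeros((2, 8, 8))
E_images[1] = rng.random((8, 8))
fused = fuse(E_images, np.ones((8, 8)))
```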

#### *2.3. Autofocus Using Sparse Matrix*

In addition to image fusion, an autofocus process using the sparse matrix method was also put forward in this study. The sparse feature of the sparse matrix was mainly used to test the correlation of the focus stripes generated from the testing bench. Since the low-rank matrix held the main components of the focus stripe, the sparse matrix carried the complementary significance pointed out in Equation (3), where the sparse matrix is the original image minus the low-rank matrix. Hence, the corresponding frame information in the sparse matrix can be used as a reference for sharpness. This concept was applied in actual practice. First, 110 images at different focal distances were compiled into the **D** matrix according to the arrangement diagrams in Figures 6 and 7. As shown in Equations (1) through (4), the matrix was decomposed into the low-rank matrix **A** and the sparse matrix **E**. Then, the row vectors **EIR**<sup>*i*</sup>, *i* = 1, 2, 3, ..., *t*, corresponding to the different focal distances in the **E** matrix were used directly to tally the values of the images corresponding to each single **EIR**<sup>*i*</sup>. The image frame corresponding to the lowest value was the sharpest frame; in particular, the lowest point in Figure 4 was the frame with the best sharpness. The calculation can be simplified into Equation (9).

$$FP = \underset{i \in [1,\,t]}{\operatorname{argmin}} \left( \sum\_{k=1}^{N} {}^{\mathbf{E}}\mathbf{I}\_i^R(k) \right) = \underset{i \in [1,\,t]}{\operatorname{argmin}} \left( \sum\_{x=1}^{m} \sum\_{y=1}^{n} {}^{\mathbf{E}}\mathbf{I}\_i(x, y) \right) \tag{9}$$

where *FP* is the focus position and *N* = *m* × *n* is the image size.
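A sketch of Equation (9); summing absolute values is an assumption here (a plain signed sum could cancel residuals of opposite sign), and the returned frame index is zero-based:

```python
import numpy as np

def focus_position(E):
    """Sketch of Eq. (9): the frame whose sparse-matrix row has the smallest
    total magnitude is taken as the sharpest one. Summing absolute values is
    an assumption (a signed sum could cancel opposite residuals); the
    returned frame index is zero-based."""
    return int(np.argmin(np.abs(E).sum(axis=1)))

# Toy E matrix: frame 2 has the least residual energy, so FP = 2.
E = np.array([[3.0, -1.0, 2.0],
              [1.0,  1.0, 1.0],
              [0.1,  0.0, -0.1]])
print(focus_position(E))  # 2
```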

#### **3. Experiment Results and Discussion**

#### *3.1. Image Fusion Results*

The image fusion method proposed in Figure 4 was compared and verified against the variance-based method calculated in the discrete cosine transform domain without consistency verification (DctVar), the variance-based method calculated in the discrete cosine transform domain with consistency verification (DctVarCv) [12,13], and structure-aware image fusion (SAIF) [11]. The quality of image fusion was evaluated using the relevant indicators compiled by Liu et al. [42]. The original images verified were as shown in the description of the tested-images section. The fusion results of the respective images were as shown in Figures 9–14, in which (a) is the method proposed in this study, (b) is the fused result of the DctVar method, (c) is the fused result of the DctVarCv method, and (d) is the fused result of the SAIF method. The fusion indicator results of the various images were as shown in Tables 1–6. For these four fusion methods, a gray background in the tables marks the best result for each of the 12 fusion quality indicators, which in sequence were: *Q*MI, *Q*TE, *Q*NCIE, *QG*, *QM*, *Q*SF, *QP*, *QS*, *QC*, *QY*, *Q*CV, *Q*CB. The higher the value, the better the fusion quality.

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 9.** The fusion result of the aircraft image.

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 10.** The fusion result of the clock image.

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 11.** The fusion result of the disk image.

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 12.** The fusion result of the lab image.

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 13.** The fusion result of the leopard image.

**Figure 14.** The fusion result of the toy image.

**Table 1.** List of aircraft fusion results and fusion quality indicators.


Optimum rule: for each index, the maximum value indicates the best result. Normalized mutual information (*Q*MI); Fusion metric based on Tsallis entropy (*Q*TE); Nonlinear correlation information entropy (*Q*NCIE); Gradient-based fusion performance (*QG*); Image fusion metric based on a multiscale scheme (*QM*); Image fusion metric based on spatial frequency (*Q*SF); Image fusion metric based on phase congruency (*QP*); Piella's metric (*QS*); Cvejie's metric (*QC*); Yang's metric (*QY*); Chen–Varshney metric (*Q*CV); Chen–Blum metric (*Q*CB).

**Table 2.** List of clock fusion results and fusion quality indicators.




**Table 3.** List of disk fusion results and fusion quality indicators.


**Table 4.** List of leopard fusion results and fusion quality indicators.



#### *3.2. Discussion of Image Fusion Results*

In the aircraft image fusion result, seven indicators pointed out that the method in this study derived the best fusion result. From subjective human-eye observation, the fusion result using the DctVar method was clearly the poorest, while the other results were closely comparable. In the clock image fusion result, seven indicators showed this study derived the best fusion result, while subjective human-eye observation deemed the DctVar and DctVarCv results to be the poorest; this study and the SAIF method were equally matched in terms of detail. In the disk image fusion result, five indicators showed the DctVarCv method derived the best fusion result, while subjective human-eye observation deemed the DctVar and DctVarCv results to be the poorest, as the square effect clearly existed in the images; on the other hand, the SAIF method derived the best fusion result. There was a considerable difference between the indicator results and human-eye cognition; it was speculated that the square effect exerted less influence on the ratings of the indicators. In the lab image fusion results, five indicators showed DctVarCv derived the best fusion results. Subjective human-eye observation deemed the DctVar and SAIF results to be the poorest, with the square effect and halo lines in the head region; DctVarCv and this study derived the best fusion results. In the leopard image fusion results, seven indicators showed SAIF derived the best fusion result, but the respective indicator values were basically quite close, and subjective eye observation deemed the respective methods derived the same results. In the toy image fusion results, six indicators showed DctVarCv derived the best fusion result. Subjective eye observation deemed this study and SAIF derived the best fusion results, while the square effect existed in the details of DctVar and DctVarCv. Therefore, it was speculated that the indicators were unable to determine the influence of the square effect.


**Table 5.** List of lab fusion results and fusion quality indicators.



**Table 6.** List of toy fusion results and fusion quality indicators.


Overall, each method showed its own advantages. When the square effect was not considered, the DctVar and DctVarCv results produced advantageous ratings under various indicators, and under subjective observation, the image details were also sound. Without taking into account the halo effect, SAIF had the best details. Compared to the other methods, this study was almost unaffected by the square effect and the halo effect, while DctVar and DctVarCv showed a strong square effect that, under subjective observation, already severely affected the fusion results. In the subjective rating of details, this study was on par with SAIF; however, in terms of indicator ratings, this study received 18 best ratings, which was superior to SAIF with 15 best ratings. Further, this study was not affected by the halo effect and thus gave a relatively more stable fusion result.

The NVG images in Figure 5m,n were used to test fusion quality. The fusion results of the proposed, DctVar, DctVarCv, and SAIF methods were as shown in Figure 15a–d. The results showed that the images produced by the DctVar method presented many squares and were completely unusable; hence, this study was significantly superior to the DctVar method. Around the focus stripe of DctVarCv there was a large square, while a circular halo rose in the center for SAIF. Therefore, the fusion test results of the NVG images pointed out that this study was also superior to DctVarCv and SAIF.

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 15.** The fusion result of the NVG image (60 and 96 degrees).

In addition to the night vision goggle image explored above, this study also observed the fusion of night vision goggle images with different focal lengths. The results were as shown in Figures 16–18. Intuitively, the fusion results showed that the method proposed in this study still has some incomplete treatment in the peripheral edge area. Otherwise, it remained superior to the other methods, and the observations were consistent with the previous paragraph.

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 16.** The fusion result of the NVG image (61 and 96 degrees).

(**a**) Proposed method (**b**) DctVar

(**c**) DctVarCv (**d**) SAIF

**Figure 17.** The fusion result of the NVG image (62 and 96 degrees).
