*2.6. Dictionary Learning*

The high-resolution dictionary and the low-resolution dictionary were constructed via Equation (7) using the corresponding pair of training images. The two relations can be written compactly as Equation (8).

$$\begin{cases} X^{\text{high}} = D^{\text{high}} \beta, \\ X^{\text{low}} = D^{\text{low}} \beta \end{cases} \tag{7}$$

$$X = D\beta,\tag{8}$$

where $\beta \in \mathbb{R}^{N\_d \times N\_t}$ represents the sparse representation, $X = \begin{bmatrix} X^{\text{high}} \\ X^{\text{low}} \end{bmatrix}$, and $D = \begin{bmatrix} D^{\text{high}} \\ D^{\text{low}} \end{bmatrix}$. The dictionary was obtained by solving the optimization problem of Equation (9), in which regularization conditions and constraints are added to Equation (8).
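The coupled formulation amounts to vertically stacking the high- and low-resolution training matrices so that a single sparse code reconstructs both halves. A minimal numpy sketch of this stacking (all shapes and the random data are illustrative assumptions, not values from the paper):

```python
import numpy as np

# Illustrative dimensions (assumptions): p*p-dim high-res patches,
# 4*p*p-dim low-res feature vectors, N_t training patches, N_d atoms.
p, N_t, N_d = 8, 100, 64
rng = np.random.default_rng(0)

X_high = rng.standard_normal((p * p, N_t))       # high-resolution patches
X_low = rng.standard_normal((4 * p * p, N_t))    # low-res feature patches

X = np.vstack([X_high, X_low])                   # X = [X_high; X_low]
D = rng.standard_normal((X.shape[0], N_d))       # D = [D_high; D_low]
beta = rng.standard_normal((N_d, N_t))           # shared sparse code

# The same beta multiplies both halves of D, as in Equation (7):
assert np.allclose((D @ beta)[: p * p], D[: p * p] @ beta)
```

Because both dictionaries are learned against the same code matrix, a sparse code estimated later from a low-resolution input can be applied directly to the high-resolution half.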

$$\underset{D,\,\beta}{\operatorname{argmin}} \frac{1}{2} \|X - D\beta\|\_{2}^{2} + \lambda \|\beta\|\_{1} \text{ s.t. } \|d\_{i}\|\_{2} \le 1, \; i = 1, 2, \cdots, N\_{t}, \tag{9}$$

where λ is the regularization parameter.

The training data used for dictionary learning were, as shown in Table 1, training PAN images for the high-resolution dictionary and the feature maps obtained from the corresponding low-resolution training PAN images for the low-resolution dictionary. $X^{\text{high}}$ and $X^{\text{low}}$ were obtained from a high-resolution training PAN image and its corresponding low-resolution training PAN image. Given a high-resolution training PAN image, it was divided into regions of size $p \times p$. The high-resolution patch $x\_i^{\text{high}}$ was then obtained for each region by $x\_i^{\text{high}} = x\_{\text{raw},i}^{\text{high}} - \bar{x}\_{\text{raw},i}^{\text{high}}$, where $x\_{\text{raw},i}^{\text{high}}$ is the $p \times p$ image and $\bar{x}\_{\text{raw},i}^{\text{high}}$ is its mean intensity value, and $X^{\text{high}} = \left[ x\_1^{\text{high}}, \cdots, x\_{N\_t}^{\text{high}} \right]$ is obtained. Given the low-resolution training PAN image, the feature maps were calculated with four filters, the first-derivative and second-derivative filters, defined by
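The high-resolution patch extraction above can be sketched in a few lines of numpy: split the image into non-overlapping $p \times p$ regions, subtract each region's mean, and stack the raster-scanned patches as columns. The image size and $p$ below are illustrative assumptions.

```python
import numpy as np

def high_res_patches(img, p):
    """Build X_high: mean-removed, raster-scanned p x p patches as columns."""
    h, w = img.shape
    cols = []
    for r in range(0, h - p + 1, p):
        for c in range(0, w - p + 1, p):
            patch = img[r:r + p, c:c + p].astype(float)
            patch = patch - patch.mean()   # x_i = x_raw,i minus its mean
            cols.append(patch.ravel())     # raster-scan vectorization
    return np.stack(cols, axis=1)          # one column per patch

# Toy 16 x 16 "PAN image" (assumption, for illustration only)
img = np.arange(16 * 16, dtype=float).reshape(16, 16)
X_high = high_res_patches(img, p=4)        # 16 patches of length 16
```

Subtracting the mean makes the dictionary model local structure rather than absolute brightness, which is restored separately at reconstruction time.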

$$F\_1 = [-1, 0, 1], \; F\_2 = F\_1^T, \; F\_3 = [1, 0, -2, 0, 1], \; F\_4 = F\_3^T,$$

where $T$ indicates transposition. The feature maps were divided into patches of $p \times p$, and each patch was normalized. Since the feature maps were calculated from the entire image, each patch contains information from its adjacent patches. By arranging the normalized feature maps in raster-scan order for each patch,

$$X^{\text{low}} = \left\{ x\_1^{\text{low}}, \dots, x\_{N\_l}^{\text{low}} \right\}, \quad x\_i^{\text{low}} = \begin{bmatrix} F\_1(i) \\ F\_2(i) \\ F\_3(i) \\ F\_4(i) \end{bmatrix}, \quad i = 1, 2, \dots, N\_l$$

was obtained.
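The low-resolution feature construction can be sketched as follows: filter the whole image with $F\_1$–$F\_4$, then cut each feature map into $p \times p$ patches, stack the four responses per patch, and normalize. The patch ordering and boundary handling (`mode="same"` convolution) are illustrative assumptions, as the paper does not specify them.

```python
import numpy as np

F1 = np.array([-1.0, 0.0, 1.0])             # horizontal first derivative
F3 = np.array([1.0, 0.0, -2.0, 0.0, 1.0])   # horizontal second derivative

def filt_rows(img, f):
    # Convolve each row; applying to img.T gives the transposed (vertical)
    # filters F2 = F1^T and F4 = F3^T.
    return np.apply_along_axis(lambda r: np.convolve(r, f, mode="same"), 1, img)

def low_res_features(img, p):
    """Build X_low: per-patch stacked, normalized responses of F1..F4."""
    maps = [filt_rows(img, F1), filt_rows(img.T, F1).T,   # F1, F2
            filt_rows(img, F3), filt_rows(img.T, F3).T]   # F3, F4
    h, w = img.shape
    cols = []
    for r in range(0, h - p + 1, p):
        for c in range(0, w - p + 1, p):
            v = np.concatenate([m[r:r + p, c:c + p].ravel() for m in maps])
            n = np.linalg.norm(v)
            cols.append(v / n if n > 0 else v)  # per-patch normalization
    return np.stack(cols, axis=1)

img = np.random.default_rng(1).standard_normal((16, 16))  # toy image
X_low = low_res_features(img, p=4)          # each column has length 4*p*p
```

Because the filters run over the whole image before patch extraction, the responses near a patch border depend on neighboring pixels, which is exactly the cross-patch context the text describes.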

The dictionary learning algorithm is described as follows.


(3) Estimate the sparse representation β by solving the optimization problem of Equation (10) with the dictionary *D* fixed.

$$\beta = \operatorname\*{argmin}\_{\beta} \frac{1}{2} \|X - D\beta\|\_2^2 + \lambda \|\beta\|\_1 \tag{10}$$

(4) Estimate the dictionary *D* by solving the optimization problem of Equation (11) with the sparse representation β fixed.

$$D = \underset{D}{\text{argmin}} \|X - D\beta\|\_2^2 \text{ s.t. } \|d\_i\|\_2 \le 1, \ i = 1, 2, \cdots, N\_t \tag{11}$$
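The alternating steps (3) and (4) can be sketched in numpy as below. The paper does not name its solvers, so the sketch uses ISTA (iterative soft-thresholding) for the Lasso subproblem of Equation (10) and a least-squares fit followed by projecting the columns onto the unit ball for Equation (11); both solver choices, and all dimensions, are assumptions for illustration.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code(X, D, lam, n_iter=100):
    # Step (3): Lasso in beta with D fixed (Equation (10)), solved by ISTA.
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    beta = np.zeros((D.shape[1], X.shape[1]))
    for _ in range(n_iter):
        beta = soft_threshold(beta - D.T @ (D @ beta - X) / L, lam / L)
    return beta

def update_dictionary(X, beta):
    # Step (4): least-squares fit of D (Equation (11)); columns are then
    # projected onto the unit ball to enforce ||d_i||_2 <= 1.
    D = np.linalg.lstsq(beta.T, X.T, rcond=None)[0].T
    return D / np.maximum(np.linalg.norm(D, axis=0), 1.0)

def learn_dictionary(X, n_atoms, lam=0.1, n_outer=10, seed=0):
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)           # start from unit-norm atoms
    for _ in range(n_outer):
        beta = sparse_code(X, D, lam)        # step (3)
        D = update_dictionary(X, beta)       # step (4)
    return D, beta

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 50))            # toy training matrix (assumption)
D, beta = learn_dictionary(X, n_atoms=30)
```

In the coupled setting, `X` would be the stacked matrix of high-resolution patches and low-resolution feature patches, so the learned `D` splits row-wise into the high- and low-resolution dictionaries.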

