**5. Discussion**

#### *5.1. Choice of Scales*

To select a suitable number of scales, networks with different scales are tested on three classes of test images: building, freeway, and airplane. Figure 13 shows the results. In the single-scale network, the recurrent ConvLSTM modules are replaced by a convolution layer so that the number of convolution layers stays the same.

As the number of scales increases, all three evaluation metrics improve, which suggests that exploiting multi-scale information helps network performance. However, the improvement becomes small once the scale exceeds 3. We therefore choose *s* = 3 in our network to balance performance against complexity.
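The selection rule described above (stop adding scales once the gain becomes small) can be sketched as follows. This is only an illustration: the function name `pick_scale`, the `min_gain` threshold, and the example scores are hypothetical placeholders, not the procedure or the values behind Figure 13.

```python
def pick_scale(scores, min_gain):
    """Return the smallest scale after which one more scale adds less than min_gain.

    scores:   dict mapping scale -> metric value (higher is better, e.g. PSNR).
    min_gain: hypothetical threshold below which an improvement counts as "small".
    """
    scales = sorted(scores)
    for current, following in zip(scales, scales[1:]):
        # Stop at the first scale whose successor improves the metric only marginally.
        if scores[following] - scores[current] < min_gain:
            return current
    # Every step still gave a meaningful gain, so keep the largest tested scale.
    return scales[-1]


# Placeholder scores for illustration only (NOT measurements from the paper).
placeholder_scores = {1: 20.0, 2: 22.0, 3: 23.0, 4: 23.1}
chosen = pick_scale(placeholder_scores, min_gain=0.5)
```

With these placeholder numbers the gain from scale 3 to scale 4 falls below the threshold, so the sketch returns 3; in practice the threshold would have to be weighed against the extra cost of each additional scale.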

**Figure 13.** Test results of the proposed network with scales 1, 2, 3, and 4 when the noise level is L = 2.

#### *5.2. Loss Function*

The influence of the loss function on network performance is also examined. Here, the L2 loss is used to train our MSR-net in place of the L1 loss. The L2-norm loss, also called the Euclidean loss, is the most commonly used loss function in despeckling tasks. It is defined as:

$$\mathcal{L}_2(\Theta) = \frac{1}{2N} \sum_{y=1}^{W} \sum_{x=1}^{H} \left\| \varphi \left( \mathbf{X}(x, y); \Theta \right) - \mathbf{C}(x, y) \right\|_2^2 \tag{14}$$

where Θ denotes the filter parameters updated during training, **C** is the noise-free ground truth image, **X** is the input image with speckle noise, and *ϕ*(·) is the output after despeckling. Training aims to minimize this cost: a smaller loss value indicates a smaller error between the network output and its corresponding ground truth.
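To make Equation (14) concrete, the following NumPy sketch evaluates it on a batch of images. It rests on assumptions that this section does not state: *N* is taken to be the number of images in the batch, the arrays have shape `(N, H, W)`, and the L1 counterpart `l1_loss` uses a plain 1/*N* scaling chosen here for comparison, which may differ from the L1 definition given elsewhere in the paper.

```python
import numpy as np


def l2_loss(output, target):
    """Equation (14): squared pixel errors summed over the batch, scaled by 1/(2N).

    output, target: arrays of shape (N, H, W); N is assumed to be the batch size.
    """
    output = np.asarray(output, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    n = output.shape[0]
    return float(np.sum((output - target) ** 2) / (2.0 * n))


def l1_loss(output, target):
    """Assumed L1 counterpart: absolute pixel errors summed over the batch, scaled by 1/N."""
    output = np.asarray(output, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    n = output.shape[0]
    return float(np.sum(np.abs(output - target)) / n)


# Toy example: one 2x2 image whose output is off by 10 at a single pixel.
clean = np.zeros((1, 2, 2))
despeckled = clean.copy()
despeckled[0, 0, 0] = 10.0

l2_value = l2_loss(despeckled, clean)  # squared error: 100 / (2 * 1)
l1_value = l1_loss(despeckled, clean)  # absolute error: 10 / 1
```

In this toy case the single large residual contributes 50 to the L2 value but only 10 to the L1 value, illustrating that the squared term weights large errors more heavily than the absolute term does.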

**Figure 14.** Evaluation index (PSNR, SSIM, and EFKR) values of MSR-net with L1 and L2 Loss.

As shown in Figure 14, the network trained with the L2 loss tends to obtain a higher PSNR only for building images, whereas the network trained with the L1 loss obtains slightly higher PSNR and SSIM on the other images. For EFKR, the advantage of the L1 loss over the L2 loss is significant. Overall, the L1 loss is better suited to the SAR despeckling task.
