### **5. Results**

### *5.1. Image Quality Assessment*

Figures 2 and 3 show examples of MR images generated by the proposed super-resolution networks together with their original HR images, and magnified views of them, respectively. Table 1 summarizes the average SSIM and PSNR between the super-resolved images produced by each method and their ground-truth high-resolution images.

The output images of the network without SPS, i.e., plain ESRGAN (column (2)), are visibly blurry, and most of the structural features are lost, which leads to lower SSIM/PSNR values. With the proposed SPS (column (3)), the generated images are significantly sharper and look natural. However, grid-shaped intensity shifts appear at the joints between patches (Figure 3 (3)). In contrast, almost all of these intensity shifts are suppressed in the images generated with the proposed discriminator (column (4)).
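To make the reported metrics concrete, the sketch below shows how PSNR and a simplified SSIM can be computed between an SR image and its ground truth. This is an illustrative implementation only (the paper does not specify its metric code); `global_ssim` uses a single global window rather than the sliding-window SSIM typically used in practice, and all names here are hypothetical.

```python
import numpy as np

def psnr(hr, sr, data_range=1.0):
    """Peak signal-to-noise ratio (dB) between two same-sized images."""
    mse = np.mean((hr.astype(np.float64) - sr.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

def global_ssim(hr, sr, data_range=1.0):
    """Simplified SSIM computed over the whole image (no sliding window)."""
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = hr.mean(), sr.mean()
    var_x, var_y = hr.var(), sr.var()
    cov = ((hr - mu_x) * (sr - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )

# Toy example: a clean image versus a mildly noisy version of itself.
rng = np.random.default_rng(0)
hr = rng.random((64, 64))
sr = np.clip(hr + rng.normal(0.0, 0.05, hr.shape), 0.0, 1.0)
```

An identical pair yields infinite PSNR and SSIM of exactly 1; the noisy pair scores lower on both, which is the pattern Table 1 quantifies across methods.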

**Figure 2.** Examples of generated images with their input and ground-truth images. (1) low-resolution input image, (2) ESRGAN without SPS and ASD, (3) ESRGAN + SPS, **(4) ESRGAN + SPS + ASD (proposed)**, and (5) ground truth high-resolution image.

**Figure 3.** Magnified example of the generated images: (1) LR image, (2) ESRGAN, (3) ESRGAN + SPS, **(4) ESRGAN + SPS + ASD (proposed)**, and (5) ground truth high-resolution image.

**Table 1.** Values of image quality measurements with each method.


### *5.2. Effectiveness on Classification Performance*

Table 2 shows the diagnostic performance of the Alzheimer's disease classifier trained and tested with the super-resolved images. Plain ESRGAN performs worst because it fails to generate usable images, whereas the proposed method outperforms all the other methods.

**Table 2.** AUC scores on Alzheimer's disease diagnosis of the networks trained with images generated by each SR method.


Table 3 and Figure 4 summarize the classification accuracies obtained with the downsampled images, with and without super-resolution. For comparison, results on the original (non-downsampled) images are also listed and plotted.

**Table 3.** AUC scores on Alzheimer's disease diagnosis of the networks trained with downsampled images.


**Figure 4.** AUC score comparison of networks, trained with downsampled images.

### **6. Discussion**

### *6.1. Qualities of Generated SR Images*

ESRGAN, a sophisticated GAN-based super-resolution method that requires many training images, could not generate any usable images from only about 30 training images; its result is even worse than bi-cubic interpolation of the LR images. By introducing patch-based learning with the proposed SPS, we confirmed that images can be generated with a certain level of accuracy. However, as mentioned earlier, the discontinuities between patches are noticeable.
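The patch-wise pipeline and the seam problem it creates can be sketched as follows. This is a minimal illustration, not the paper's SPS implementation: images are tiled into non-overlapping patches, each patch is processed independently (here mimicked by a per-patch intensity offset), and naive stitching leaves the grid-shaped shifts visible in Figure 3 (3). All function names are hypothetical.

```python
import numpy as np

def split_into_patches(img, p):
    """Tile an image whose sides are multiples of p into p x p patches (row-major)."""
    h, w = img.shape
    return [img[i:i + p, j:j + p] for i in range(0, h, p) for j in range(0, w, p)]

def stitch_patches(patches, h, w, p):
    """Reassemble patches row-major; independently processed patches can leave seams."""
    out = np.empty((h, w), dtype=patches[0].dtype)
    k = 0
    for i in range(0, h, p):
        for j in range(0, w, p):
            out[i:i + p, j:j + p] = patches[k]
            k += 1
    return out

img = np.zeros((8, 8))
patches = split_into_patches(img, 4)
# A different intensity offset per patch stands in for independent per-patch SR,
# producing grid-shaped shifts at patch joints when stitched back together.
shifted = [pt + 0.1 * n for n, pt in enumerate(patches)]
rec = stitch_patches(shifted, 8, 8, 4)
```

Splitting and stitching unchanged patches reconstructs the image exactly; the offset patches reconstruct into a blocky grid, which is precisely the artifact the ASD is designed to suppress.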

With the introduction of the proposed ASD, the discontinuities are mostly suppressed, and more natural-looking super-resolution images are generated. Beyond quantitative metrics such as PSNR and SSIM, the line profile shown in Figure 5 demonstrates that the proposed method can reproduce the details of finer tissues, which are known to be more challenging to capture with conventional MRI scanners.

In regular GAN training, a generator and a discriminator are trained adversarially: by trying to identify whether its input is "real" or "fake" (i.e., HR or SR), the discriminator indirectly pushes the generator to produce more natural images. In contrast, the proposed ASD takes a two-channel input that always contains an HR image in one of the channels. Therefore, in addition to the usual adversarial effect, the discriminator implicitly receives information about the relative location of a patch within the whole-brain image, which drives the SR images closer to the HR ones. Because more information is given to the networks during training, one might expect overfitting; nevertheless, in our experiment with different patients, no adverse effect was observed.
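The two-channel input construction described above can be sketched as a simple channel-stacking step. This is an illustrative reading of the ASD input, with hypothetical names: an HR reference patch occupies one channel, and the candidate patch (SR for "fake" pairs, HR for "real" pairs) occupies the other.

```python
import numpy as np

def asd_input(reference_hr, candidate):
    """Stack a fixed HR reference patch with a candidate (SR or HR) patch
    along a new channel axis, yielding a (2, H, W) discriminator input."""
    assert reference_hr.shape == candidate.shape
    return np.stack([reference_hr, candidate], axis=0)

rng = np.random.default_rng(1)
hr_patch = rng.random((32, 32))
sr_patch = rng.random((32, 32))

real_pair = asd_input(hr_patch, hr_patch)  # "real": HR in both channels
fake_pair = asd_input(hr_patch, sr_patch)  # "fake": SR in the second channel
```

Because the HR reference channel is always present, every input carries spatial context about where the patch sits in the brain, regardless of the real/fake label.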

### *6.2. Impact of Super-Resolution on the Disease Classification Problem*

The removal of unwanted boundary discontinuities by the ASD improved the AUC score in diagnosing Alzheimer's disease by 5.4%. The increased visibility of essential structures, as shown in Figure 5, is thought to have contributed to the improved diagnostic performance.

In the experiment with downsampled images, we first confirmed that the AUC score drops as the input resolution decreases, as intuitively expected. Table 3 also indicates that images enhanced by the proposed method recover up to 45% of the gap between the downsampled result and the achievable upper bound of the score.
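For reference, the AUC scores compared in Tables 2 and 3 can be computed with the standard rank-based (Mann-Whitney) formulation: the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The sketch below is a generic illustration, not the paper's evaluation code.

```python
def auc_score(labels, scores):
    """AUC via the Mann-Whitney U statistic: the fraction of positive/negative
    pairs in which the positive receives the higher score (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Perfectly separated scores give an AUC of 1.0, chance-level scoring gives 0.5, and partial separation falls in between, which is the scale on which the 5.4% improvement is reported.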

**Figure 5.** Line profile of the thalamus. The thalamus contains subnuclei with characteristic signal intensity, but it is challenging to identify thalamic subnuclei because low-resolution MRI does not provide sufficient contrast. Our SR method can obtain intrathalamic contrast equivalent to that of HR images.

### *6.3. Limitations and Future Work*

In the proposed method, a low-resolution training image is obtained by simply downsampling the corresponding high-resolution image. However, the actual differences between images from high-field scanners and ordinary ones involve not only resolution but also intensity contrast, the amount of noise, and so on. The network would likely perform better if trained on paired images acquired with both high-field and actual ordinary scanners. In future work, we will also use more HR images to develop a better method.
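The simple LR-generation scheme discussed above can be sketched as block-averaging. This is one plausible reading of "simply downsampling" (the paper does not specify the kernel); as noted, it reduces resolution only and does not model the contrast and noise differences of a real low-field scan.

```python
import numpy as np

def downsample(hr, factor):
    """Synthesize an LR image by averaging non-overlapping factor x factor
    blocks of the HR image (sides must be multiples of the factor)."""
    h, w = hr.shape
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

hr = np.arange(16, dtype=np.float64).reshape(4, 4)
lr = downsample(hr, 2)  # each LR pixel is the mean of a 2 x 2 HR block
```

Each output pixel is the mean of one HR block, so the LR image preserves average intensity while discarding high-frequency structure, the information the SR network is then trained to restore.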
