Bidirectional Temporal-Recurrent Propagation Networks for Video Super-Resolution
Abstract
1. Introduction
- We propose a novel end-to-end bidirectional temporal-recurrent propagation network (BTRPN), which avoids the complicated combination of an optical flow estimation network and a super-resolution network. To better integrate the two subnetworks, we use a channel attention mechanism to fuse the extracted temporal and spatial information.
- We propose a progressive up-sampling version of BTRPN. Compared to one-step up-sampling, progressive up-sampling solves the SR optimization problem in a smaller solution space, which makes learning easier and improves the quality of the reconstructed images.
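The channel-attention fusion named in the first contribution can be illustrated with a squeeze-and-excitation style gate. The following is a minimal NumPy sketch, not the authors' implementation: the function name `channel_attention_fuse`, the reduction ratio `r`, the random weights, and the choice to concatenate the temporal and spatial features before gating are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention_fuse(temporal, spatial, w_down, w_up):
    """Fuse two feature maps with an SE-style channel gate (illustrative only).

    temporal, spatial: arrays of shape (C, H, W), hypothetical subnetwork outputs.
    w_down: (2C // r, 2C) and w_up: (2C, 2C // r), hypothetical bottleneck weights.
    """
    feats = np.concatenate([temporal, spatial], axis=0)  # (2C, H, W)
    squeezed = feats.mean(axis=(1, 2))                   # global average pool -> (2C,)
    hidden = np.maximum(w_down @ squeezed, 0.0)          # bottleneck followed by ReLU
    weights = sigmoid(w_up @ hidden)                     # one gate in (0, 1) per channel
    return feats * weights[:, None, None], weights


rng = np.random.default_rng(0)
C, H, W, r = 4, 8, 8, 2                                  # assumed toy sizes
temporal = rng.standard_normal((C, H, W))
spatial = rng.standard_normal((C, H, W))
w_down = rng.standard_normal((2 * C // r, 2 * C)) * 0.1
w_up = rng.standard_normal((2 * C, 2 * C // r)) * 0.1

fused, weights = channel_attention_fuse(temporal, spatial, w_down, w_up)
print(fused.shape, weights.shape)  # (8, 8, 8) (8,)
```

Each channel of the concatenated features is rescaled by a single learned weight, so the gate can emphasize temporal or spatial channels without changing the spatial layout.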
2. Related Work
2.1. Single-Image Super-Resolution
2.2. Video Super-Resolution
3. The Progressive Up-Sampling Bidirectional Temporal-Recurrent Propagation Network
3.1. Network Architecture
3.2. TRP Unit
3.3. Bidirectional Network
3.4. Attentional Mechanism
3.5. Progressive Up-Sampling
4. Experiments
4.1. Datasets and Training Details
4.2. Model Analysis
4.2.1. Depth and Channel Analysis
4.2.2. Bidirectional Model Analysis
4.2.3. Attention Mechanism
4.2.4. Progressive Up-Sampling
4.3. Comparison with State-of-the-Art Algorithms
4.3.1. Quantitative and Qualitative Comparison
4.3.2. Parameters and Test Time Comparison
5. Conclusions
Author Contributions
Funding
Conflicts of Interest
Time Axis | Scale | Calendar PSNR | Calendar SSIM | City PSNR | City SSIM | Foliage PSNR | Foliage SSIM | Walk PSNR | Walk SSIM | Average PSNR | Average SSIM |
---|---|---|---|---|---|---|---|---|---|---|---|
positive | 4 | 22.86 | 0.754 | 27.07 | 0.774 | 25.46 | 0.722 | 29.23 | 0.889 | 26.16 | 0.785 |
reverse | 4 | 22.77 | 0.749 | 27.08 | 0.775 | 25.24 | 0.714 | 29.26 | 0.889 | 26.09 | 0.782 |
Model | Scale | Calendar PSNR | Calendar SSIM | City PSNR | City SSIM | Foliage PSNR | Foliage SSIM | Walk PSNR | Walk SSIM | Average PSNR | Average SSIM |
---|---|---|---|---|---|---|---|---|---|---|---|
BTRPN10-64 | 4 | 23.30 | 0.780 | 27.62 | 0.794 | 25.91 | 0.743 | 30.04 | 0.897 | 26.69 | 0.804 |
BTRPN20-64 | 4 | 23.39 | 0.786 | 27.68 | 0.799 | 25.99 | 0.746 | 30.26 | 0.900 | 26.83 | 0.808 |
BTRPN10-128 | 4 | 23.56 | 0.794 | 27.78 | 0.804 | 26.15 | 0.754 | 30.44 | 0.904 | 26.98 | 0.814 |
BTRPN20-128 | 4 | 23.69 | 0.804 | 27.84 | 0.811 | 26.37 | 0.766 | 30.72 | 0.909 | 27.15 | 0.822 |
Model | Scale | Iterations | Parameters | Training Time |
---|---|---|---|---|
BTRPN10-64 | 4 | 50,000 | 670 K | 40 min |
BTRPN20-64 | 4 | 50,000 | 2600 K | 45–50 min |
BTRPN10-128 | 4 | 50,000 | 1040 K | 50–55 min |
BTRPN20-128 | 4 | 50,000 | 4070 K | 1 h |
Model | Scale | Parameters | Test Time |
---|---|---|---|
BTRPN10-64 | 4 | 670 K | 0.016 s |
BTRPN20-64 | 4 | 2600 K | 0.036 s |
BTRPN10-128 | 4 | 1040 K | 0.027 s |
BTRPN20-128 | 4 | 4070 K | 0.066 s |
Model | Time Axis | Calendar PSNR | Calendar SSIM | City PSNR | City SSIM | Foliage PSNR | Foliage SSIM | Walk PSNR | Walk SSIM | Average PSNR | Average SSIM |
---|---|---|---|---|---|---|---|---|---|---|---|
TRPN-7L | positive | 23.01 | 0.766 | 27.13 | 0.778 | 25.63 | 0.733 | 29.38 | 0.891 | 26.29 | 0.792 |
TRPN-7L | reverse | 22.92 | 0.760 | 27.14 | 0.779 | 25.26 | 0.718 | 29.41 | 0.892 | 26.18 | 0.787 |
BTRPN-5L | positive | 22.95 | 0.761 | 27.23 | 0.785 | 25.54 | 0.728 | 29.74 | 0.897 | 26.36 | 0.793 |
BTRPN-5L | reverse | 22.95 | 0.761 | 27.23 | 0.785 | 25.54 | 0.728 | 29.73 | 0.897 | 26.36 | 0.793 |
Attention Mechanism | Calendar PSNR | Calendar SSIM | City PSNR | City SSIM | Foliage PSNR | Foliage SSIM | Walk PSNR | Walk SSIM | Average PSNR | Average SSIM |
---|---|---|---|---|---|---|---|---|---|---|
not used | 22.95 | 0.761 | 27.23 | 0.785 | 25.54 | 0.728 | 29.74 | 0.897 | 26.36 | 0.793 |
used | 23.34 | 0.784 | 27.62 | 0.796 | 25.95 | 0.746 | 30.20 | 0.899 | 26.78 | 0.807 |
Up-Sampling | Calendar PSNR | Calendar SSIM | City PSNR | City SSIM | Foliage PSNR | Foliage SSIM | Walk PSNR | Walk SSIM | Average PSNR | Average SSIM |
---|---|---|---|---|---|---|---|---|---|---|
one-step up-sampling | 23.34 | 0.784 | 27.62 | 0.796 | 25.95 | 0.746 | 30.20 | 0.899 | 26.78 | 0.807 |
progressive up-sampling | 23.56 | 0.794 | 27.78 | 0.804 | 26.15 | 0.754 | 30.44 | 0.904 | 26.98 | 0.814 |
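To make the two strategies compared in the table above concrete, the sketch below contrasts a single ×4 stage with two chained ×2 stages. It is an illustration under stated assumptions, not the network from the paper: it assumes sub-pixel (pixel shuffle) up-sampling and 1×1 convolutions, and the helper names `pixel_shuffle`, `conv1x1`, `upsample_one_step` and `upsample_progressive`, the channel counts, and the random weights are all hypothetical.

```python
import numpy as np


def pixel_shuffle(x, r):
    """Rearrange (C * r * r, H, W) into (C, H * r, W * r)."""
    c_in, h, w = x.shape
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)


def conv1x1(x, weight):
    """1x1 convolution as channel mixing; weight has shape (C_out, C_in)."""
    return np.tensordot(weight, x, axes=([1], [0]))


def upsample_one_step(feat, w):
    """Single x4 stage; w has shape (16 * c_out, c_in)."""
    return pixel_shuffle(conv1x1(feat, w), 4)


def upsample_progressive(feat, w1, w2):
    """Two x2 stages; w1 is (4 * c_mid, c_in), w2 is (4 * c_out, c_mid)."""
    mid = pixel_shuffle(conv1x1(feat, w1), 2)
    return pixel_shuffle(conv1x1(mid, w2), 2)


rng = np.random.default_rng(0)
c_in, c_mid, c_out = 8, 8, 3                       # assumed toy channel counts
feat = rng.standard_normal((c_in, 16, 16))         # hypothetical low-resolution features

one_step = upsample_one_step(feat, rng.standard_normal((16 * c_out, c_in)))
progressive = upsample_progressive(
    feat,
    rng.standard_normal((4 * c_mid, c_in)),
    rng.standard_normal((4 * c_out, c_mid)),
)
print(one_step.shape, progressive.shape)  # (3, 64, 64) (3, 64, 64)
```

Both paths reach the same output resolution; the progressive path only ever has to learn a ×2 mapping per stage, which is the property the contribution statement appeals to.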
Algorithm | Calendar PSNR | Calendar SSIM | City PSNR | City SSIM | Foliage PSNR | Foliage SSIM | Walk PSNR | Walk SSIM | Average PSNR | Average SSIM |
---|---|---|---|---|---|---|---|---|---|---|
Bicubic | 20.39 | 0.572 | 25.16 | 0.603 | 23.47 | 0.567 | 26.10 | 0.797 | 23.78 | 0.635 |
RCAN | 22.33 | 0.725 | 26.10 | 0.696 | 24.74 | 0.665 | 28.65 | 0.872 | 25.46 | 0.740 |
VSRNet | - | - | - | - | - | - | - | - | 24.84 | 0.705 |
VESPCN | - | - | - | - | - | - | - | - | 25.35 | 0.756 |
DRVSR | 22.16 | 0.747 | 27.00 | 0.757 | 25.43 | 0.721 | 28.91 | 0.876 | 25.88 | 0.775 |
Bayesian | - | - | - | - | - | - | - | - | 26.16 | 0.815 |
 | 21.66 | 0.704 | 26.45 | 0.720 | 24.98 | 0.698 | 28.26 | 0.859 | 25.34 | 0.745 |
BRCN | - | - | - | - | - | - | - | - | 24.43 | 0.662 |
SOF-VSR | 22.64 | 0.745 | 26.93 | 0.752 | 25.45 | 0.718 | 29.19 | 0.881 | 26.05 | 0.767 |
FRVSR | - | - | - | - | - | - | - | - | 26.69 | 0.822 |
DUF-16L | - | - | - | - | - | - | - | - | 26.81 | 0.815 |
RBPN | 23.99 | 0.807 | 27.73 | 0.803 | 26.22 | 0.757 | 30.70 | 0.909 | 27.12 | 0.808 |
BTRPN | 23.69 | 0.804 | 27.84 | 0.811 | 26.37 | 0.766 | 30.72 | 0.909 | 27.15 | 0.822 |
Model | Scale | Test Time |
---|---|---|
BRCN | 4 | 0.024 s |
SOF-VSR | 4 | 0.120 s |
DUF-16L | 4 | 0.420 s |
DUF-28L | 4 | 0.500 s |
RBPN | 4 | 0.50 s |
BTRPN | 4 | 0.066 s |
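The PSNR columns in the tables above use a standard image-quality metric. As a reference for how such values are obtained, here is a minimal sketch of the usual definition, PSNR = 10 log10(MAX² / MSE). It assumes 8-bit intensities (`max_value=255`); whether the reported numbers were computed on the luminance channel or with border cropping is not stated here, so the sketch makes no such choice.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Toy example: a constant error of 25.5 gray levels gives MSE = 650.25,
# so PSNR = 10 * log10(65025 / 650.25) = 10 * log10(100) = 20 dB.
ref = np.zeros((4, 4))
est = np.full((4, 4), 25.5)
print(round(psnr(ref, est), 2))  # 20.0
```

Because the scale is logarithmic, a gain of about 0.2 dB, as between the one-step and progressive rows, corresponds to a reduction in mean squared error of roughly 4 to 5 percent.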
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Han, L.; Fan, C.; Yang, Y.; Zou, L. Bidirectional Temporal-Recurrent Propagation Networks for Video Super-Resolution. Electronics 2020, 9, 2085. https://doi.org/10.3390/electronics9122085