End-to-End Light Field Image Compression with Multi-Domain Feature Learning
Abstract
1. Introduction
- (1) A novel multi-domain feature learning-based light field image compression network (MFLFIC-Net) is proposed. It improves compression efficiency by exploiting multi-domain features and their correlation to obtain a complete angle feature and to reduce the redundancy among multi-domain features.
- (2) An EPI-based angle completion module (E-ACM) is developed to obtain a complete angle feature by fully exploring the large-disparity angle information contained in the EPI domain.
- (3) A spatial-angle joint transform module (SAJTM) is proposed to reduce redundancy by modeling the intrinsic correlation between spatial and complete angle features.
- (4) Experimental results on the EPFL dataset demonstrate that MFLFIC-Net achieves superior performance on MS-SSIM and PSNR metrics compared to public state-of-the-art methods.
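
The contributions above refer to three views of the same data: spatial, angle, and EPI (epipolar plane image) features. As a rough illustration of what these domains are, and not of the learned modules themselves, the following sketch slices a 4D light field array with NumPy. The axis order `lf[u, v, y, x]`, the function names, and the toy scene are assumptions made for this example; they do not come from the paper.

```python
import numpy as np

# Assumed (hypothetical) layout: lf[u, v, y, x], with angular indices (u, v)
# and spatial indices (y, x). Real datasets may order the axes differently.

def spatial_view(lf, u, v):
    """Spatial domain: the sub-aperture image seen from view (u, v)."""
    return lf[u, v, :, :]          # shape (H, W)

def angular_patch(lf, y, x):
    """Angular domain: one pixel location (y, x) as seen from every view."""
    return lf[:, :, y, x]          # shape (U, V)

def horizontal_epi(lf, u, y):
    """EPI domain: fix view row u and image row y, keep (v, x)."""
    return lf[u, :, y, :]          # shape (V, W)

def vertical_epi(lf, v, x):
    """EPI domain: fix view column v and image column x, keep (u, y)."""
    return lf[:, v, :, x]          # shape (U, H)

def toy_light_field(U=5, V=5, H=8, W=12, disparity=2, seed=0):
    """Toy scene at a single depth: every view is the centre view shifted
    horizontally (circularly) by `disparity` pixels per horizontal view step."""
    rng = np.random.default_rng(seed)
    centre = rng.random((H, W))
    lf = np.empty((U, V, H, W))
    for u in range(U):
        for v in range(V):
            lf[u, v] = np.roll(centre, disparity * (v - V // 2), axis=1)
    return lf

lf = toy_light_field()
epi = horizontal_epi(lf, u=2, y=3)

# In the EPI, each scene point traces a line whose slope is set by the
# disparity. Undoing the per-view shift aligns every EPI row with the
# centre row, which is what the check below verifies.
aligned = np.stack([np.roll(epi[v], -2 * (v - 2)) for v in range(5)])
print(epi.shape, np.allclose(aligned, epi[2]))
```

In this toy case a larger `disparity` value tilts the EPI lines further, which gives some intuition for why the EPI domain is a natural place to look for large-disparity angular information.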
2. Related Works
2.1. Traditional Light Field Image Compression
2.2. Learning-Based Light Field Image Compression
3. Method
3.1. Architecture of MFLFIC-Net
3.2. EPI-Based Angle Completion Module
3.3. Spatial-Angle Joint Transform Module
3.4. Implementation Details
4. Experiments
4.1. Experimental Settings
4.2. Comparison Results
4.3. Ablation Study
4.4. Complexity Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
| Methods | BD-PSNR (dB) | BD-Rate (%) |
|---|---|---|
| GPR | −2.253 | 138.095 |
| VTM 10.0 | 1.021 | −35.557 |
| SADN | 1.795 | −44.256 |
| Proposed | 2.149 | −65.585 |

| Methods | BD-PSNR (dB) | BD-Rate (%) |
|---|---|---|
| w/o E-ACM and w/o SAJTM | 1.541 | −48.167 |
| w/o SAJTM | 1.918 | −54.970 |
| Proposed | 2.149 | −65.585 |
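
The BD-PSNR and BD-Rate columns above summarize the gap between two rate-distortion curves in a single number. The sketch below shows the commonly used cubic-fit Bjøntegaard formulation, assuming four rate/PSNR points per codec. This section does not state which variant or which anchor codec the authors used, so the function names and the sample rate/PSNR points are illustrative assumptions and not the paper's data.

```python
import numpy as np

def _mean_gap(x_anchor, y_anchor, x_test, y_test):
    """Fit y(x) with a cubic for each codec and return the mean of
    (test - anchor) over the x range that both curves cover."""
    p_a = np.polyfit(x_anchor, y_anchor, 3)
    p_t = np.polyfit(x_test, y_test, 3)
    lo = max(np.min(x_anchor), np.min(x_test))
    hi = min(np.max(x_anchor), np.max(x_test))
    P_a, P_t = np.polyint(p_a), np.polyint(p_t)   # antiderivatives
    int_a = np.polyval(P_a, hi) - np.polyval(P_a, lo)
    int_t = np.polyval(P_t, hi) - np.polyval(P_t, lo)
    return (int_t - int_a) / (hi - lo)

def bd_psnr(rate_a, psnr_a, rate_t, psnr_t):
    """Average PSNR gain (dB) of the test codec at equal bitrate."""
    rate_a, rate_t = np.asarray(rate_a, float), np.asarray(rate_t, float)
    psnr_a, psnr_t = np.asarray(psnr_a, float), np.asarray(psnr_t, float)
    return _mean_gap(np.log10(rate_a), psnr_a, np.log10(rate_t), psnr_t)

def bd_rate(rate_a, psnr_a, rate_t, psnr_t):
    """Average bitrate change (%) of the test codec at equal PSNR.
    Negative values mean the test codec needs fewer bits."""
    rate_a, rate_t = np.asarray(rate_a, float), np.asarray(rate_t, float)
    psnr_a, psnr_t = np.asarray(psnr_a, float), np.asarray(psnr_t, float)
    avg_log = _mean_gap(psnr_a, np.log10(rate_a), psnr_t, np.log10(rate_t))
    return (10.0 ** avg_log - 1.0) * 100.0

# Made-up points (bits per pixel, dB): the test codec reaches the same
# PSNR as the anchor at exactly half the bitrate.
anchor_rate = [0.1, 0.2, 0.4, 0.8]
anchor_psnr = [30.0, 33.0, 36.0, 39.0]
test_rate = [0.05, 0.1, 0.2, 0.4]
test_psnr = [30.0, 33.0, 36.0, 39.0]

print(bd_rate(anchor_rate, anchor_psnr, test_rate, test_psnr))
print(bd_psnr(anchor_rate, anchor_psnr, test_rate, test_psnr))
```

Read this way, a positive BD-PSNR together with a negative BD-Rate, as in the "Proposed" rows, indicates higher quality at the same bitrate and fewer bits at the same quality relative to the anchor.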
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ye, K.; Li, Y.; Li, G.; Jin, D.; Zhao, B. End-to-End Light Field Image Compression with Multi-Domain Feature Learning. Appl. Sci. 2024, 14, 2271. https://doi.org/10.3390/app14062271