Information Bottleneck Driven Deep Video Compression—IBOpenDVCW
Abstract
1. Introduction
2. Wavelet Theory
3. Information Bottleneck (IB)
- I(X;T) represents the compression cost: the mutual information between X and T.
- I(T;Y) represents the relevance: the mutual information between T and Y.
- β is a Lagrange multiplier that balances compression and relevance.
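The three quantities listed above combine into the IB trade-off, which is usually written as minimizing I(X;T) − β·I(T;Y) over the encoder p(t|x). The following is a minimal sketch for small discrete distributions, not the training loss used in this work. The function names, the toy distribution p(x, y), and the two example encoders are illustrative assumptions.

```python
import math


def mutual_information(joint):
    """Mutual information in bits for a joint probability table joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for a, row in enumerate(joint):
        for b, p in enumerate(row):
            if p > 0:  # zero-probability cells contribute nothing
                mi += p * math.log2(p / (p_a[a] * p_b[b]))
    return mi


def ib_lagrangian(p_xy, p_t_given_x, beta):
    """I(X;T) - beta * I(T;Y), assuming the Markov chain T - X - Y."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(p_t_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # p(x, t) = p(x) * p(t | x)
    p_xt = [[p_x[x] * p_t_given_x[x][t] for t in range(n_t)] for x in range(n_x)]
    # p(t, y) = sum_x p(t | x) * p(x, y)
    p_ty = [[sum(p_t_given_x[x][t] * p_xy[x][y] for x in range(n_x))
             for y in range(n_y)] for t in range(n_t)]
    return mutual_information(p_xt) - beta * mutual_information(p_ty)


# Toy data (hypothetical): X is uniform on {0, 1, 2, 3} and Y = X // 2.
p_xy = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]

# Encoder A keeps every value of X (T = X); encoder B merges pairs (T = X // 2).
identity_encoder = [[1.0 if t == x else 0.0 for t in range(4)] for x in range(4)]
merging_encoder = [[1.0 if t == x // 2 else 0.0 for t in range(2)] for x in range(4)]

print(ib_lagrangian(p_xy, identity_encoder, beta=1.0))
print(ib_lagrangian(p_xy, merging_encoder, beta=1.0))
```

In this toy case both encoders retain the full 1 bit of information about Y, but the merging encoder pays 1 bit of compression cost instead of 2, so it reaches the lower objective value.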
4. Proposed Method
5. Training Strategy
6. Experiments and Results
7. Summary and Discussion
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest

Wavelet_Name | Low-Pass Filter (LPF) | High-Pass Filter (HPF) |
---|---|---|
haar | 0.7071067811865476, 0.7071067811865476 | −0.7071067811865476, 0.7071067811865476 |
db2 | −0.12940952255126037, 0.2241438680420134, 0.8365163037378079, 0.48296291314453416 | −0.48296291314453416, 0.8365163037378079, −0.2241438680420134, −0.12940952255126037 |
db3 | 0.03522629188570953, −0.08544127388202666, −0.13501102001025458, 0.45987750211849154, 0.8068915093110925, 0.33267055295008263 | −0.33267055295008263, 0.8068915093110925, −0.45987750211849154, −0.13501102001025458, 0.08544127388202666, 0.03522629188570953 |
sym3 | 0.035226291882100656, −0.08544127388224149, −0.13501102001039084, 0.4598775021193313, 0.8068915093133388, 0.3326705529509569 | −0.3326705529509569, 0.8068915093133388, −0.4598775021193313, −0.13501102001039084, 0.08544127388224149, 0.035226291882100656 |
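The two filter columns in the table above are not independent: each HPF row is the corresponding LPF row reversed, with alternating signs starting from a minus. The sketch below checks this relation, together with the usual orthonormality conditions (coefficients summing to √2 and unit energy), against the tabulated values. The function name `hpf_from_lpf` and the dictionary layout are illustrative assumptions; the coefficients are copied from the table.

```python
import math

# (LPF, HPF) coefficient pairs copied from the table above.
FILTERS = {
    "haar": ([0.7071067811865476, 0.7071067811865476],
             [-0.7071067811865476, 0.7071067811865476]),
    "db2": ([-0.12940952255126037, 0.2241438680420134,
             0.8365163037378079, 0.48296291314453416],
            [-0.48296291314453416, 0.8365163037378079,
             -0.2241438680420134, -0.12940952255126037]),
    "db3": ([0.03522629188570953, -0.08544127388202666, -0.13501102001025458,
             0.45987750211849154, 0.8068915093110925, 0.33267055295008263],
            [-0.33267055295008263, 0.8068915093110925, -0.45987750211849154,
             -0.13501102001025458, 0.08544127388202666, 0.03522629188570953]),
    "sym3": ([0.035226291882100656, -0.08544127388224149, -0.13501102001039084,
              0.4598775021193313, 0.8068915093133388, 0.3326705529509569],
             [-0.3326705529509569, 0.8068915093133388, -0.4598775021193313,
              -0.13501102001039084, 0.08544127388224149, 0.035226291882100656]),
}


def hpf_from_lpf(lpf):
    """Reverse the low-pass filter and alternate signs, starting with a minus."""
    n = len(lpf)
    return [(-1) ** (k + 1) * lpf[n - 1 - k] for k in range(n)]


for name, (lpf, hpf) in FILTERS.items():
    matches_table = hpf_from_lpf(lpf) == hpf
    coefficient_sum = sum(lpf)            # close to sqrt(2) for these filters
    energy = sum(c * c for c in lpf)      # close to 1 for these filters
    print(f"{name}: HPF matches={matches_table}, "
          f"sum={coefficient_sum:.10f}, energy={energy:.10f}")
```

Storing only the low-pass filter and deriving the high-pass one is a common design choice for orthogonal wavelets, since it halves the number of coefficients that must be kept consistent.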

Level | AveragePooling2D | DWT-haar | DWT-db2 | DWT-db3 | DWT-sym3 | DWT-bior1.3 |
---|---|---|---|---|---|---|
level 0 | 7.821273 | 7.821273 | 7.821273 | 7.821273 | 7.821273 | 7.821273 |
level 1 | 6.825602 | 6.825602 | 6.836400 | 6.848318 | 6.848318 | 6.846506 |
level 2 | 5.832305 | 5.832305 | 5.877017 | 5.900455 | 5.900455 | 5.895880 |
level 3 | 4.842934 | 4.842934 | 4.930845 | 5.019609 | 5.019609 | 5.011204 |
level 4 | 3.855431 | 3.855431 | 4.032167 | 4.192275 | 4.192275 | 4.178278 |
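The AveragePooling2D and DWT-haar columns in the table above coincide at every level. One fact consistent with this is that the low-low (LL) band of a single-level 2D Haar transform equals 2×2 average pooling multiplied by a constant factor of 2, so any measure that is insensitive to a constant rescaling gives the same value for both. The sketch below verifies the identity on a small matrix. Whether this is the reason for the identical columns here is an assumption, and the function names and test matrix are illustrative.

```python
import math

S = 1 / math.sqrt(2)  # Haar low-pass tap (0.7071067811865476)


def avg_pool_2x2(img):
    """Non-overlapping 2x2 average pooling on a list-of-lists image."""
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
             for c in range(0, len(img[0]), 2)]
            for r in range(0, len(img), 2)]


def haar_ll(img):
    """LL band of a single-level 2D Haar DWT (low-pass on rows, then columns)."""
    # Low-pass filter each row with [S, S] and keep every second sample.
    rows = [[S * (row[c] + row[c + 1]) for c in range(0, len(row), 2)]
            for row in img]
    # Apply the same filter down each column.
    return [[S * (rows[r][c] + rows[r + 1][c]) for c in range(len(rows[0]))]
            for r in range(0, len(rows), 2)]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]

pooled = avg_pool_2x2(image)
ll_band = haar_ll(image)

# Each LL coefficient is (up to rounding) twice the matching pooled value.
for pooled_row, ll_row in zip(pooled, ll_band):
    print(pooled_row, [round(v, 6) for v in ll_row])
```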
Wavelet_Name | Mutual_Information |
---|---|
haar | 2.894183 |
db2 | 3.159218 |
db3 | 3.215511 |
sym3 | 3.215256 |

No. | Wavelet_Name | Mutual_Information |
---|---|---|
1 | coif2 | 3.267934 |
2 | coif3 | 3.282238 |
3 | sym2 | 3.159884 |
4 | bior1.3 | 2.858263 |
5 | bior2.2 | 3.270967 |
6 | rbio1.3 | 3.246917 |
7 | rbio2.2 | 2.962761 |
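To compare the wavelets across the two mutual-information tables above, the values can be merged and ranked. The sketch below assumes that both tables report the same quantity measured under the same setup, which this text does not state explicitly. The variable names are illustrative, and the numbers are copied from the tables.

```python
# Mutual information per wavelet, copied from the two tables above.
mutual_information = {
    "haar": 2.894183,
    "db2": 3.159218,
    "db3": 3.215511,
    "sym3": 3.215256,
    "coif2": 3.267934,
    "coif3": 3.282238,
    "sym2": 3.159884,
    "bior1.3": 2.858263,
    "bior2.2": 3.270967,
    "rbio1.3": 3.246917,
    "rbio2.2": 2.962761,
}

# Sort from highest to lowest mutual information.
ranked = sorted(mutual_information.items(), key=lambda item: item[1], reverse=True)

for position, (name, value) in enumerate(ranked, start=1):
    print(f"{position:2d}. {name:8s} {value:.6f}")
```

Under that assumption, coif3 has the highest tabulated value and bior1.3 the lowest.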
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Leiderman, T.; Ben Ezra, Y. Information Bottleneck Driven Deep Video Compression—IBOpenDVCW. Entropy 2024, 26, 836. https://doi.org/10.3390/e26100836