Rate–Distortion–Perception Trade-Off in Information Theory, Generative Models, and Intelligent Communications
Abstract
1. Introduction
- Perceptual constraints fundamentally alter the rate–distortion trade-off, requiring additional rate even with unlimited common randomness.
- Generative models naturally align with RDP objectives, making them powerful tools for developing perceptually optimized communication systems.
- The practical implementation of RDP-aware systems faces challenges related to computational complexity and the need for robustness across diverse channel conditions.
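The first bullet can be stated precisely via the information rate–distortion–perception function of Blau and Michaeli, which augments the classical rate–distortion function with a divergence constraint on the reconstruction distribution:

```latex
R(D, P) \;=\; \min_{p_{\hat{X}\mid X}} \; I(X;\hat{X})
\quad \text{s.t.} \quad
\mathbb{E}\!\left[\Delta(X,\hat{X})\right] \le D,
\qquad
d\!\left(p_X, p_{\hat{X}}\right) \le P.
```

Here $\Delta$ is a per-letter distortion measure (e.g., squared error), $d$ is a divergence between distributions (e.g., total variation or a Wasserstein distance), and setting $P = 0$ enforces perfect realism; Sections 2 and 3 make these choices concrete.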
Notations
2. Distortion and Perceptual Measures
2.1. Distortion Measures
2.2. Perceptual Measures in Information Theory
2.2.1. Total Variation Distance
2.2.2. KL Divergence
2.2.3. Wasserstein Distance
2.2.4. Common Assumptions
2.3. Interplay Between Distortion and Perception Constraints
3. Information-Theoretic Results
3.1. Realism Constraints
3.1.1. Weak and Strong Realism Constraints
3.1.2. Perfect and Imperfect Realism Constraints
3.2. Information-Theoretic System Model
3.3. Coding Theorems
3.4. The Rate–Distortion–Perception Frontier
- Common randomness and private randomness both help in achieving (near-)perfect realism.
- The introduction of the perceptual constraint results in a higher rate requirement, even when common randomness and decoder randomness are unlimited.
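The second bullet admits a simple closed-form illustration (a toy example for intuition, not a result specific to this survey): for a unit-variance Gaussian source under mean-squared error, the classical distortion–rate function is $2^{-2R}$, while under a perfect-realism constraint (reconstruction distributed exactly as the source, with the optimal jointly Gaussian coupling) the best achievable distortion is $2(1 - \sqrt{1 - 2^{-2R}})$, which is strictly larger but never more than twice the classical value:

```python
import math

def d_classical(rate, var=1.0):
    """Shannon distortion-rate function of a Gaussian source under MSE."""
    return var * 2.0 ** (-2.0 * rate)

def d_perfect_realism(rate, var=1.0):
    """Minimum MSE at rate R when the reconstruction must be distributed
    exactly as the source N(0, var), i.e., perfect realism."""
    return 2.0 * var * (1.0 - math.sqrt(1.0 - 2.0 ** (-2.0 * rate)))

for r in [0.25, 0.5, 1.0, 2.0, 4.0]:
    dc, dp = d_classical(r), d_perfect_realism(r)
    # Perfect realism costs extra distortion at every rate,
    # but the penalty is bounded by a factor of two.
    assert dc < dp <= 2.0 * dc
```

The factor-of-two bound matches the known result that imposing perfect perceptual quality at most doubles the achievable MSE.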
4. Generative Modeling as a Distribution Approximation Process
4.1. Objectives, Tasks, and Architectures
4.1.1. Variational Auto-Encoder
4.1.2. Generative Adversarial Network
4.1.3. Transformer
4.1.4. Diffusion Model
4.2. Practical Distortion and Perceptual Quality Measures
4.3. Experimental Validation in Image Compression
5. AI-Empowered Communication with Perceptual Constraint
5.1. Source–Channel Coding for Perception-Oriented Communications
5.2. AI-Empowered Semantic Communication
5.2.1. Theoretical Foundation and Practical Motivation
Pilot Signals and Synchronization
Implementation of Private Randomness
5.2.2. Generative Modeling in User Equipment
5.2.3. Generative Modeling in Base Station
5.2.4. Generative Modeling in Core Network
5.2.5. Basic Unit for AI-Empowered Communications
6. Conclusions and Future Directions
6.1. Conclusions
6.2. Future Directions
6.2.1. Information-Theoretic Directions
6.2.2. Architectural Improvement for Intelligent Communication Systems
Interactive AI Agents
Perceptually Optimized Data Transmission
Human-Centric Quality Systems
6.2.3. Implementation Challenges in Deploying Generative Models for Intelligent Communications
Computational Complexity
- Model architecture simplification through techniques like depthwise separable convolutions and reduced latent space dimensions.
- Knowledge distillation methods where smaller models are trained to mimic the behavior of larger, more complex generative models.
- Quantization of model weights and activations to reduce precision requirements and computational intensity.
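The quantization technique in the last bullet can be sketched minimally as symmetric per-tensor post-training quantization (the function names here are illustrative, not from any specific framework):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: w ~ scale * q."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float32 tensor from the int8 codes."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# int8 storage is 4x smaller than float32, with small relative error.
rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
assert q.nbytes == w.nbytes // 4
assert rel_err < 0.05
```

Production systems typically refine this with per-channel scales and calibration data, but the storage/compute saving comes from the same reduced-precision representation.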
Latency Considerations
Hardware Constraints
- On-device model compression techniques, including pruning and quantization.
- Federated learning approaches where model training is distributed across multiple devices.
- Hybrid architectures that perform partial processing at the edge and partial processing in the cloud.
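The pruning approach from the first bullet above can be sketched as unstructured magnitude pruning (an illustrative toy, assuming a dense weight matrix; real deployments combine this with fine-tuning and sparse kernels):

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the smallest-magnitude fraction of weights in w."""
    k = int(np.ceil(w.size * sparsity))
    if k == 0:
        return w.copy()
    # k-th smallest absolute value serves as the pruning threshold.
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) > thresh, w, 0.0)

rng = np.random.default_rng(1)
w = rng.normal(size=(128, 128))
w_sparse = magnitude_prune(w, sparsity=0.9)

# At least 90% of the entries are now exactly zero.
assert (w_sparse == 0).mean() >= 0.9
```

Combined with int8 quantization of the surviving weights, this kind of compression is what makes on-device generative decoding plausible under the hardware constraints discussed here.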
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- Blau, Y.; Michaeli, T. The perception-distortion tradeoff. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6228–6237. [Google Scholar]
- Dahl, R.; Norouzi, M.; Shlens, J. Pixel recursive super resolution. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 5439–5448. [Google Scholar]
- Tschannen, M.; Agustsson, E.; Lucic, M. Deep generative models for distribution-preserving lossy compression. Adv. Neural Inf. Process. Syst. 2018, 31, 5929–5940. [Google Scholar]
- Blau, Y.; Michaeli, T. Rethinking lossy compression: The rate-distortion-perception tradeoff. In Proceedings of the International Conference on Machine Learning. PMLR, Long Beach, CA, USA, 9–15 June 2019; pp. 675–685. [Google Scholar]
- Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690. [Google Scholar]
- Agustsson, E.; Tschannen, M.; Mentzer, F.; Timofte, R.; Gool, L.V. Generative adversarial networks for extreme learned image compression. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 221–231. [Google Scholar]
- Mentzer, F.; Toderici, G.D.; Tschannen, M.; Agustsson, E. High-fidelity generative image compression. Adv. Neural Inf. Process. Syst. 2020, 33, 11913–11924. [Google Scholar]
- Yan, Z.; Wen, F.; Ying, R.; Ma, C.; Liu, P. On perceptual lossy compression: The cost of perceptual reconstruction and an optimal training framework. In Proceedings of the International Conference on Machine Learning. PMLR, Virtual, 18–24 July 2021; pp. 11682–11692. [Google Scholar]
- Zhang, G.; Qian, J.; Chen, J.; Khisti, A. Universal rate-distortion-perception representations for lossy compression. Adv. Neural Inf. Process. Syst. 2021, 34, 11517–11529. [Google Scholar]
- Yan, Z.; Wen, F.; Liu, P. Optimally Controllable Perceptual Lossy Compression. In Proceedings of the International Conference on Machine Learning. PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 24911–24928. [Google Scholar]
- Salehkalaibar, S.; Phan, T.B.; Chen, J.; Yu, W.; Khisti, A. On the choice of perception loss function for learned video compression. Adv. Neural Inf. Process. Syst. 2023, 36, 48226–48274. [Google Scholar]
- O’shea, T.; Hoydis, J. An introduction to deep learning for the physical layer. IEEE Trans. Cogn. Commun. Netw. 2017, 3, 563–575. [Google Scholar]
- Bourtsoulatze, E.; Burth Kurka, D.; Gündüz, D. Deep Joint Source-Channel Coding for Wireless Image Transmission. IEEE Trans. Cogn. Comms. Netw. 2019, 5, 567–579. [Google Scholar] [CrossRef]
- Bégaint, J.; Racapé, F.; Feltman, S.; Pushparaja, A. CompressAI: A PyTorch library and evaluation platform for end-to-end compression research. arXiv 2020, arXiv:2011.03029. [Google Scholar]
- Cheng, Z.; Sun, H.; Takeuchi, M.; Katto, J. Learned image compression with discretized gaussian mixture likelihoods and attention modules. In Proceedings of the IEEE/CVF conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 7939–7948. [Google Scholar]
- Gregory, R.L. Perceptions as hypotheses. Philos. Trans. R. Soc. Lond. B Biol. Sci. 1980, 290, 181–197. [Google Scholar]
- Rao, R.P.; Ballard, D.H. Predictive coding in the visual cortex: A functional interpretation of some extra-classical receptive-field effects. Nat. Neurosci. 1999, 2, 79–87. [Google Scholar]
- Jiang, L.P.; Rao, R.P. Predictive Coding Theories of Cortical Function. 2022. Available online: https://oxfordre.com/neuroscience/display/10.1093/acrefore/9780190264086.001.0001/acrefore-9780190264086-e-328 (accessed on 26 February 2025).
- Niu, X.; Bai, B.; Deng, L.; Han, W. Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory. arXiv 2024, arXiv:2405.08707. [Google Scholar]
- Lin, B.; Ye, Y.; Zhu, B.; Cui, J.; Ning, M.; Jin, P.; Yuan, L. Video-LLaVA: Learning United Visual Representation by Alignment Before Projection. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Miami, FL, USA, 12–16 November 2024; pp. 5971–5984. [Google Scholar]
- Fei, W.; Niu, X.; Xie, G.; Zhang, Y.; Bai, B.; Deng, L.; Han, W. Retrieval meets reasoning: Dynamic in-context editing for long-text understanding. arXiv 2024, arXiv:2406.12331. [Google Scholar]
- Meng, C.; He, Y.; Song, Y.; Song, J.; Wu, J.; Zhu, J.Y.; Ermon, S. SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations. In Proceedings of the International Conference on Learning Representations, Online, 25–29 April 2022. [Google Scholar]
- Cover, T.M. Elements of Information Theory; John Wiley & Sons: Hoboken, NJ, USA, 1999. [Google Scholar]
- Theis, L.; Agustsson, E. On the advantages of stochastic encoders. arXiv 2021, arXiv:2102.09270. [Google Scholar]
- Wagner, A.B. The Rate-Distortion-Perception Tradeoff: The Role of Common Randomness. arXiv 2022, arXiv:2202.04147. [Google Scholar]
- Chen, J.; Yu, L.; Wang, J.; Shi, W.; Ge, Y.; Tong, W. On the Rate-Distortion-Perception Function. IEEE J. Sel. Areas Inf. Theory 2022, 3, 664–673. [Google Scholar]
- Hamdi, Y.; Wagner, A.B.; Gündüz, D. The Rate-Distortion-Perception Trade-off: The Role of Private Randomness. arXiv 2024, arXiv:2404.01111. [Google Scholar]
- Li, M.; Klejsa, J.; Kleijn, W.B. Distribution preserving quantization with dithering and transformation. IEEE Signal Process. Lett. 2010, 17, 1014–1017. [Google Scholar]
- Li, M.; Klejsa, J.; Kleijn, W.B. On distribution preserving quantization. arXiv 2011, arXiv:1108.3728. [Google Scholar]
- Saldi, N.; Linder, T.; Yüksel, S. Randomized quantization and source coding with constrained output distribution. IEEE Trans. Inf. Theory 2014, 61, 91–106. [Google Scholar]
- Saldi, N.; Linder, T.; Yüksel, S. Output constrained lossy source coding with limited common randomness. IEEE Trans. Inf. Theory 2015, 61, 4984–4998. [Google Scholar]
- Tian, C.; Chen, J.; Narayanan, K. Source-Channel Separation Theorems for Distortion Perception Coding. arXiv 2025, arXiv:2501.17706. [Google Scholar]
- Qiu, Y.; Wagner, A.B.; Ballé, J.; Theis, L. Wasserstein distortion: Unifying fidelity and realism. In Proceedings of the 2024 58th Annual Conference on Information Sciences and Systems (CISS), Princeton, NJ, USA, 13–15 March 2024; IEEE: New York, NY, USA, 2024; pp. 1–6. [Google Scholar]
- Marton, K. Bounding d¯-distance by informational divergence: A method to prove measure concentration. Ann. Probab. 1996, 24, 857–866. [Google Scholar]
- Marton, K. Measure concentration for a class of random processes. Probab. Theory Relat. Fields 1998, 110, 427–439. [Google Scholar]
- Yassaee, M.H.; Aref, M.R.; Gohari, A. Achievability proof via output statistics of random binning. IEEE Trans. Inf. Theory 2014, 60, 6760–6786. [Google Scholar]
- Pinsker, M.S. Information and information stability of random variables and processes. J. R. Stat. Soc. Ser. Appl. Stat. 1964, 13, 134–135. [Google Scholar]
- Csiszár, I.; Talata, Z. Context tree estimation for not necessarily finite memory processes, via BIC and MDL. IEEE Trans. Inf. Theory 2006, 52, 1007–1016. [Google Scholar]
- Cuff, P. Distributed channel synthesis. IEEE Trans. Inf. Theory 2013, 59, 7071–7096. [Google Scholar]
- Villani, C. Optimal Transport: Old and New; Springer: Berlin/Heidelberg, Germany, 2009; Volume 338. [Google Scholar]
- Cuff, P.W.; Permuter, H.H.; Cover, T.M. Coordination capacity. IEEE Trans. Inf. Theory 2010, 56, 4181–4206. [Google Scholar]
- Raginsky, M. Empirical processes, typical sequences, and coordinated actions in standard Borel spaces. IEEE Trans. Inf. Theory 2012, 59, 1288–1301. [Google Scholar]
- Niu, X.; Gündüz, D.; Bai, B.; Han, W. Conditional Rate-Distortion-Perception Trade-Off. In Proceedings of the 2023 IEEE International Symposium on Information Theory (ISIT), Taipei, Taiwan, 25–30 June 2023; IEEE: New York, NY, USA, 2023; pp. 1068–1073. [Google Scholar]
- Matsumoto, R. Introducing the perception-distortion tradeoff into the rate-distortion theory of general information sources. IEICE Commun. Express 2018, 7, 427–431. [Google Scholar]
- Matsumoto, R. Rate-distortion-perception tradeoff of variable-length source coding for general information sources. IEICE Commun. Express 2019, 8, 38–42. [Google Scholar]
- Hamdi, Y.; Gündüz, D. The Rate-Distortion-Perception Trade-off with Side Information. In Proceedings of the 2023 IEEE International Symposium on Information Theory (ISIT), Taipei, Taiwan, 25–30 June 2023; IEEE: New York, NY, USA, 2023; pp. 1056–1061. [Google Scholar]
- Theis, L.; Wagner, A.B. A coding theorem for the rate-distortion-perception function. arXiv 2021, arXiv:2104.13662. [Google Scholar]
- Li, C.T.; El Gamal, A. Strong functional representation lemma and applications to coding theorems. IEEE Trans. Inf. Theory 2018, 64, 6967–6978. [Google Scholar] [CrossRef]
- Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011. [Google Scholar]
- Chen, C.; Niu, X.; Ye, W.; Wu, S.; Bai, B.; Chen, W.; Lin, S.J. Computation of rate-distortion-perception functions with Wasserstein barycenter. In Proceedings of the 2023 IEEE International Symposium on Information Theory (ISIT), Taipei, Taiwan, 25–30 June 2023; IEEE: New York, NY, USA, 2023; pp. 1074–1079. [Google Scholar]
- Chen, C.; Niu, X.; Ye, W.; Wu, H.; Bai, B. Computation and Critical Transitions of Rate-Distortion-Perception Functions with Wasserstein Barycenter. arXiv 2024, arXiv:2404.04681. [Google Scholar]
- Serra, G.; Stavrou, P.A.; Kountouris, M. Computation of the Multivariate Gaussian Rate-Distortion-Perception Function. In Proceedings of the 2024 IEEE International Symposium on Information Theory (ISIT), Athens, Greece, 7–12 July 2024; IEEE: New York, NY, USA, 2024; pp. 1077–1082. [Google Scholar]
- Freirich, D.; Michaeli, T.; Meir, R. A theory of the distortion-perception tradeoff in wasserstein space. Adv. Neural Inf. Process. Syst. 2021, 34, 25661–25672. [Google Scholar]
- Chen, C.; Mo, J. IQA-PyTorch: PyTorch Toolbox for Image Quality Assessment. 2022. Available online: https://github.com/chaofengc/IQA-PyTorch (accessed on 26 February 2025).
- Graves, A. Sequence transduction with recurrent neural networks. arXiv 2012, arXiv:1211.3711. [Google Scholar]
- Ackley, D.H.; Hinton, G.E.; Sejnowski, T.J. A learning algorithm for Boltzmann machines. Cogn. Sci. 1985, 9, 147–169. [Google Scholar]
- Fan, A.; Lewis, M.; Dauphin, Y. Hierarchical Neural Story Generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Melbourne, Australia, 15–20 July 2018; pp. 889–898. [Google Scholar]
- Holtzman, A.; Buys, J.; Du, L.; Forbes, M.; Choi, Y. The Curious Case of Neural Text Degeneration. In Proceedings of the International Conference on Learning Representations, Addis Ababa, Ethiopia, 30 April 2020. [Google Scholar]
- Meister, C.; Pimentel, T.; Wiher, G.; Cotterell, R. Locally typical sampling. Trans. Assoc. Comput. Linguist. 2023, 11, 102–121. [Google Scholar]
- Kingma, D.P.; Welling, M. Auto-encoding variational Bayes. In Proceedings of the 2nd International Conference on Learning Representations, Banff, AB, Canada, 14–16 April 2014. [Google Scholar]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; Volume 27. [Google Scholar]
- Sohl-Dickstein, J.; Weiss, E.; Maheswaranathan, N.; Ganguli, S. Deep Unsupervised Learning using Nonequilibrium Thermodynamics. In Proceedings of the International Conference on Machine Learning (ICML), Lille, France, 7–9 July 2015. [Google Scholar]
- Ho, J.; Jain, A.; Abbeel, P. Denoising Diffusion Probabilistic Models. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Virtual, 6–12 December 2020. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008. [Google Scholar]
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein generative adversarial networks. In Proceedings of the International Conference on Machine Learning. PMLR, Sydney, NSW, Australia, 6–11 August 2017; pp. 214–223. [Google Scholar]
- Kingma, D.P.; Mohamed, S.; Jimenez Rezende, D.; Welling, M. Semi-supervised learning with deep generative models. Adv. Neural Inf. Process. Syst. 2014, 27, 3581–3589. [Google Scholar]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial networks. Commun. ACM 2020, 63, 139–144. [Google Scholar] [CrossRef]
- Nowozin, S.; Cseke, B.; Tomioka, R. f-GAN: Training generative neural samplers using variational divergence minimization. In Proceedings of the Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Volume 29. [Google Scholar]
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Farnia, F.; Tse, D. A convex duality framework for GANs. In Proceedings of the Advances in Neural Information Processing Systems, Montréal, QC, Canada, 3–8 December 2018. [Google Scholar]
- Esser, P.; Rombach, R.; Ommer, B. Taming transformers for high-resolution image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 12873–12883. [Google Scholar]
- Mentzer, F.; Toderici, G.; Minnen, D.; Hwang, S.J.; Caelles, S.; Lucic, M.; Agustsson, E. VCT: A video compression transformer. In Proceedings of the 36th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 28 November–9 December 2022; pp. 13091–13103. [Google Scholar]
- Song, Y.; Ermon, S. Generative Modeling by Estimating Gradients of the Data Distribution. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Vancouver, BC, Canada, 8–14 December 2019. [Google Scholar]
- Song, Y.; Ermon, S. Improved Techniques for Training Score-Based Generative Models. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Virtual, 6–12 December 2020. [Google Scholar]
- Dhariwal, P.; Nichol, A. Diffusion Models Beat GANs on Image Synthesis. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Online, 6–14 December 2021; pp. 8780–8794. [Google Scholar]
- Kingma, D.; Gao, R. Understanding diffusion objectives as the elbo with simple data augmentation. Adv. Neural Inf. Process. Syst. 2023, 36, 65484–65516. [Google Scholar]
- Wang, Z.; Simoncelli, E.P.; Bovik, A.C. Multiscale structural similarity for image quality assessment. In Proceedings of the Thrity-Seventh Asilomar Conference on Signals, Systems & Computers, Pacific Grove, CA, USA, 9–12 November 2003; IEEE: New York, NY, USA, 2003; Volume 2, pp. 1398–1402. [Google Scholar]
- Wiegand, T.; Sullivan, G.J.; Bjontegaard, G.; Luthra, A. Overview of the H. 264/AVC video coding standard. IEEE Trans. Circuits Syst. Video Technol. 2003, 13, 560–576. [Google Scholar] [CrossRef]
- Sullivan, G.J.; Ohm, J.R.; Han, W.J.; Wiegand, T. Overview of the high efficiency video coding (HEVC) standard. IEEE Trans. Circuits Syst. Video Technol. 2012, 22, 1649–1668. [Google Scholar] [CrossRef]
- Bross, B.; Wang, Y.K.; Ye, Y.; Liu, S.; Chen, J.; Sullivan, G.J.; Ohm, J.R. Overview of the versatile video coding (VVC) standard and its applications. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 3736–3764. [Google Scholar]
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local nash equilibrium. Adv. Neural Inf. Process. Syst. 2017, 30, 6626–6637. [Google Scholar]
- Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 586–595. [Google Scholar]
- Bhardwaj, S.; Fischer, I.; Ballé, J.; Chinen, T. An unsupervised information-theoretic perceptual quality metric. In Proceedings of the Advances in Neural Information Processing Systems, Virtual, 6–12 December 2020. [Google Scholar]
- Alon, N.; Orlitsky, A. Source coding and graph entropies. IEEE Trans. Inf. Theory 1996, 42, 1329–1339. [Google Scholar] [CrossRef]
- Harangi, V.; Niu, X.; Bai, B. Conditional graph entropy as an alternating minimization problem. IEEE Trans. Inf. Theory 2023, 70, 904–919. [Google Scholar] [CrossRef]
- Harangi, V.; Niu, X.; Bai, B. Generalizing Körner’s graph entropy to graphons. Eur. J. Comb. 2023, 114, 103779. [Google Scholar] [CrossRef]
- Theis, L.; Salimans, T.; Hoffman, M.D.; Mentzer, F. Lossy compression with Gaussian diffusion. arXiv 2022, arXiv:2206.08889. [Google Scholar]
- Yang, R.; Mandt, S. Lossy image compression with conditional diffusion models. Adv. Neural Inf. Process. Syst. 2023, 36, 64971–64995. [Google Scholar]
- Elata, N.; Michaeli, T.; Elad, M. Zero-Shot Image Compression with Diffusion-Based Posterior Sampling. arXiv 2024, arXiv:2407.09896. [Google Scholar]
- Fei, W.; Niu, X.; Zhou, P.; Hou, L.; Bai, B.; Deng, L.; Han, W. Extending Context Window of Large Language Models via Semantic Compression. In Proceedings of the Findings of the Association for Computational Linguistics: ACL 2024, Association for Computational Linguistics, Bangkok, Thailand, 11–16 August 2024; pp. 5169–5181. [Google Scholar]
- Arda, E.; Yener, A. A Rate-Distortion Framework for Summarization. arXiv 2025, arXiv:2501.13100. [Google Scholar]
- Fei, W.; Niu, X.; Xie, G.; Liu, Y.; Bai, B.; Han, W. Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference. arXiv 2025, arXiv:2501.12959. [Google Scholar]
- Ballé, J.; Minnen, D.; Singh, S.; Hwang, S.J.; Johnston, N. Variational image compression with a scale hyperprior. In Proceedings of the International Conference on Learning Representations, Vancouver, BC, Canada, 30 April–3 May 2018. [Google Scholar]
- Toderici, G.; Theis, L.; Johnston, N.; Agustsson, E.; Mentzer, F.; Ballé, J.; Shi, W.; Timofte, R. CLIC 2020: Challenge on learned image compression. 2020. Available online: https://www.tensorflow.org/datasets/catalog/clic (accessed on 26 February 2025).
- Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. 1948, 27, 379–423. [Google Scholar]
- Goldsmith, A. Joint source/channel coding for wireless channels. In Proceedings of the IEEE Vehicular Technology Conference, Chicago, IL, USA, 25–28 July 1995; pp. 614–618. [Google Scholar]
- Vembu, S.; Verdu, S.; Steinberg, Y. The source-channel separation theorem revisited. IEEE Trans. Inf. Theory 1995, 41, 44–54. [Google Scholar]
- Gündüz, D.; Qin, Z.; Aguerri, I.E.; Dhillon, H.S.; Yang, Z.; Yener, A.; Wong, K.K.; Chae, C.B. Beyond Transmitting Bits: Context, Semantics, and Task-Oriented Communications. IEEE J. Sel. Areas Commun. 2023, 41, 5–41. [Google Scholar]
- Kurka, D.B.; Gündüz, D. Bandwidth-agile image transmission with deep joint source-channel coding. IEEE Trans. Wirel. Commun. 2021, 20, 8081–8095. [Google Scholar]
- Tung, T.Y.; Gündüz, D. DeepWiVe: Deep-Learning-Aided Wireless Video Transmission. IEEE J. Sel. Areas Commun. 2022, 40, 2570–2583. [Google Scholar]
- Wang, M.; Zhang, Z.; Li, J.; Ma, M.; Fan, X. Deep Joint Source-Channel Coding for Multi-Task Network. IEEE Signal Process. Lett. 2021, 28, 1973–1977. [Google Scholar]
- Yang, M.; Bian, C.; Kim, H.S. OFDM-guided Deep Joint Source Channel Coding for Wireless Multipath Fading Channels. IEEE Trans. Cogn. Commun. Netw. 2022, 8, 584–599. [Google Scholar]
- Shao, Y.; Gunduz, D. Semantic Communications With Discrete-Time Analog Transmission: A PAPR Perspective. IEEE Wirel. Commun. Lett. 2023, 12, 510–514. [Google Scholar] [CrossRef]
- Wu, H.; Shao, Y.; Mikolajczyk, K.; Gündüz, D. Channel-Adaptive Wireless Image Transmission With OFDM. IEEE Wirel. Commun. Lett. 2022, 11, 2400–2404. [Google Scholar] [CrossRef]
- Niu, X.; Wang, X.; Gündüz, D.; Bai, B.; Chen, W.; Zhou, G. A hybrid wireless image transmission scheme with diffusion. In Proceedings of the 2023 IEEE 24th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Shanghai, China, 25–28 September 2023; IEEE: New York, NY, USA, 2023; pp. 86–90. [Google Scholar]
- Kountouris, M.; Pappas, N. Semantics-empowered communication for networked intelligent systems. IEEE Commun. Mag. 2021, 59, 96–102. [Google Scholar] [CrossRef]
- Gündüz, D.; Chiariotti, F.; Huang, K.; Kalør, A.E.; Kobus, S.; Popovski, P. Timely and massive communication in 6G: Pragmatics, learning, and inference. IEEE BITS Inf. Theory Mag. 2023, 3, 27–40. [Google Scholar] [CrossRef]
- Li, Z.; Wang, Q.; Wang, Y.; Chen, T. The Architecture of AI and Communication Integration towards 6G: An O-RAN Evolution. In Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, Washington, DC, USA, 18–22 November 2024; pp. 2329–2334. [Google Scholar]
- Cui, Q.; You, X.; Wei, N.; Nan, G.; Zhang, X.; Zhang, J.; Lyu, X.; Ai, M.; Tao, X.; Feng, Z.; et al. Overview of AI and Communication for 6G Network: Fundamentals, Challenges, and Future Research Opportunities. arXiv 2024, arXiv:2412.14538. [Google Scholar]
- Tao, M.; Zhou, Y.; Shi, Y.; Lu, J.; Cui, S.; Lu, J.; Letaief, K.B. Federated Edge Learning for 6G: Foundations, Methodologies, and Applications; IEEE: New York, NY, USA, 2024. [Google Scholar]
- Park, J.; Ko, S.W.; Choi, J.; Kim, S.L.; Choi, J.; Bennis, M. Towards semantic MAC protocols for 6G: From protocol learning to language-oriented approaches. IEEE BITS Inf. Theory Mag. 2024, 4, 59–72. [Google Scholar] [CrossRef]
- Van Huynh, N.; Wang, J.; Du, H.; Hoang, D.T.; Niyato, D.; Nguyen, D.N.; Kim, D.I.; Letaief, K.B. Generative AI for physical layer communications: A survey. IEEE Trans. Cogn. Commun. Netw. 2024, 10, 706–728. [Google Scholar] [CrossRef]
- Han, X.; Wu, Y.; Gao, Z.; Feng, B.; Shi, Y.; Gündüz, D.; Zhang, W. SCSC: A Novel Standards-Compatible Semantic Communication Framework for Image Transmission. IEEE Trans. Commun. 2025; early access. [Google Scholar]
- Tao, Z.; Guo, Y.; He, G.; Huang, Y.; You, X. Deep learning-based modeling of 5G core control plane for 5G network digital twin. IEEE Trans. Cogn. Commun. Netw. 2023, 10, 238–251. [Google Scholar] [CrossRef]
- Zheng, J.; Du, B.; Du, H.; Kang, J.; Niyato, D.; Zhang, H. Energy-Efficient Resource Allocation in Generative AI-Aided Secure Semantic Mobile Networks. IEEE Trans. Mob. Comput. 2024, 23, 11422–11435. [Google Scholar] [CrossRef]
- Liu, Z.; Du, H.; Huang, L.; Gao, Z.; Niyato, D. Joint Model Caching and Resource Allocation in Generative AI-Enabled Wireless Edge Networks. arXiv 2024, arXiv:2411.08672. [Google Scholar]
- Wang, X.; Feng, L.; Zhou, F.; Li, W. Joint Power Allocation and Reliability Optimization with Generative AI for Wireless Networked Control Systems. In Proceedings of the 2024 IEEE/CIC International Conference on Communications in China (ICCC Workshops), Hangzhou, China, 7–9 August 2024; IEEE: New York, NY, USA, 2024; pp. 197–202. [Google Scholar]
- Tolba, B.; Elsabrouty, M.; Abdu-Aguye, M.G.; Gacanin, H.; Kasem, H.M. Massive MIMO CSI feedback based on generative adversarial network. IEEE Commun. Lett. 2020, 24, 2805–2808. [Google Scholar] [CrossRef]
- Zeng, Y.; Qiao, L.; Gao, Z.; Qin, T.; Wu, Z.; Khalaf, E.; Chen, S.; Guizani, M. CSI-GPT: Integrating generative pre-trained transformer with federated-tuning to acquire downlink massive MIMO channels. IEEE Trans. Veh. Technol. 2024, 74, 5187–5192. [Google Scholar] [CrossRef]
- Zhao, Z.; Meng, F.; Li, H.; Li, X.; Zhu, G. Mining Limited Data Sufficiently: A BERT-inspired Approach for CSI Time Series Application in Wireless Communication and Sensing. arXiv 2024, arXiv:2412.06861. [Google Scholar]
- Balevi, E.; Doshi, A.; Jalal, A.; Dimakis, A.; Andrews, J.G. High dimensional channel estimation using deep generative networks. IEEE J. Sel. Areas Commun. 2020, 39, 18–30. [Google Scholar] [CrossRef]
- Arvinte, M.; Tamir, J.I. MIMO channel estimation using score-based generative models. IEEE Trans. Wirel. Commun. 2022, 22, 3698–3713. [Google Scholar] [CrossRef]
- Fesl, B.; Strasser, M.B.F.; Joham, M.; Utschick, W. Diffusion-based generative prior for low-complexity MIMO channel estimation. IEEE Wirel. Commun. Lett. 2024, 13, 3493–3497. [Google Scholar] [CrossRef]
- Yang, T.; Zhang, P.; Zheng, M.; Shi, Y.; Jing, L.; Huang, J.; Li, N. WirelessGPT: A Generative Pre-trained Multi-task Learning Framework for Wireless Communication. arXiv 2025, arXiv:2502.06877. [Google Scholar]
- Xie, H.; Qin, Z.; Li, G.Y.; Juang, B.H. Deep learning enabled semantic communication systems. IEEE Trans. Signal Process. 2021, 69, 2663–2675. [Google Scholar] [CrossRef]
- Erdemir, E.; Tung, T.Y.; Dragotti, P.L.; Gunduz, D. Generative Joint Source-Channel Coding for Semantic Image Transmission. arXiv 2022, arXiv:2211.13772. [Google Scholar] [CrossRef]
- Zhang, G.; Li, H.; Cai, Y.; Hu, Q.; Yu, G.; Qin, Z. Progressive Learned Image Transmission for Semantic Communication Using Hierarchical VAE. IEEE Trans. Cogn. Commun. Netw. 2025; early access. [Google Scholar]
- Zhang, M.; Wu, H.; Zhu, G.; Jin, R.; Chen, X.; Gündüz, D. Semantics-Guided Diffusion for Deep Joint Source-Channel Coding in Wireless Image Transmission. arXiv 2025, arXiv:2501.01138. [Google Scholar]
- Zhang, H.; Tao, M. SNR-EQ-JSCC: Joint Source-Channel Coding with SNR-Based Embedding and Query. IEEE Wirel. Commun. Lett. 2025, 14, 881–885. [Google Scholar]
- Li, B.; Liu, Y.; Niu, X.; Bait, B.; Han, W.; Deng, L.; Gunduz, D. Extreme Video Compression with Prediction Using Pre-trained Diffusion Models. In Proceedings of the 2024 16th International Conference on Wireless Communications and Signal Processing (WCSP), Hefei, China, 24–26 October 2024; IEEE: New York, NY, USA, 2024; pp. 1449–1455. [Google Scholar]
- Yilmaz, S.F.; Niu, X.; Bai, B.; Han, W.; Deng, L.; Gündüz, D. High perceptual quality wireless image delivery with denoising diffusion models. In Proceedings of the IEEE INFOCOM 2024-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Vancouver, BC, Canada, 20 May 2024; IEEE: New York, NY, USA, 2024; pp. 1–5. [Google Scholar]
- Guo, L.; Chen, W.; Sun, Y.; Ai, B.; Pappas, N.; Quek, T. Diffusion-Driven Semantic Communication for Generative Models with Bandwidth Constraints. arXiv 2024, arXiv:2407.18468. [Google Scholar] [CrossRef]
- Pei, J.; Feng, C.; Wang, P.; Tabassum, H.; Shi, D. Latent Diffusion Model-Enabled Low-Latency Semantic Communication in the Presence of Semantic Ambiguities and Wireless Channel Noises. IEEE Trans. Wirel. Commun. 2025; early access. [Google Scholar]
- Tung, T.Y.; Kurka, D.B.; Jankowski, M.; Gündüz, D. DeepJSCC-Q: Constellation Constrained Deep Joint Source-Channel Coding. IEEE J. Sel. Areas Inf. Theory 2022, 3, 720–731. [Google Scholar]
- Kurka, D.B.; Gündüz, D. DeepJSCC-f: Deep joint source-channel coding of images with feedback. IEEE J. Sel. Areas Inf. Theory 2020, 1, 178–193. [Google Scholar]
- Geng, Y.; Niu, X.; Bai, B.; Han, W. Capacity Bounds of Broadcast Channel with a Full-Duplex Base-User Pair. In Proceedings of the 2024 IEEE Information Theory Workshop (ITW), Shenzhen, China, 24–28 November 2024; IEEE: New York, NY, USA, 2024; pp. 145–150. [Google Scholar]
- Jiao, T.; Ye, C.; Huang, Y.; Feng, Y.; Xiao, Z.; Xu, Y.; He, D.; Guan, Y.; Yang, B.; Chang, J.; et al. 6G-Oriented CSI-Based Multi-Modal Pre-Ttaining and Downstream Task Adaptation Paradigm. In Proceedings of the 2024 IEEE International Conference on Communications Workshops (ICC Workshops), Denver, CO, USA, 9–13 June 2024; IEEE: New York, NY, USA, 2024; pp. 1389–1394. [Google Scholar]
- Delfani, E.; Pappas, N. Optimizing Information Freshness in Constrained IoT Systems: A Token-Based Approach. IEEE Trans. Commun. 2024; early access. [Google Scholar]
- Li, J.; Zhang, W. Asymptotically Optimal Joint Sampling and Compression for Timely Status Updates: Age-Distortion Tradeoff. IEEE Trans. Veh. Technol. 2024, 74, 2338–2352. [Google Scholar]
- Qiao, L.; Mashhadi, M.B.; Gao, Z.; Gündüz, D. Token-Domain Multiple Access: Exploiting Semantic Orthogonality for Collision Mitigation. arXiv 2025, arXiv:2502.06118. [Google Scholar]
- Bachmann, R.; Allardice, J.; Mizrahi, D.; Fini, E.; Kar, O.F.; Amirloo, E.; El-Nouby, A.; Zamir, A.; Dehghan, A. FlexTok: Resampling Images into 1D Token Sequences of Flexible Length. arXiv 2025, arXiv:2502.13967. [Google Scholar]
- Sargent, K.; Hsu, K.; Johnson, J.; Fei-Fei, L.; Wu, J. Flow to the Mode: Mode-Seeking Diffusion Autoencoders for State-of-the-Art Image Tokenization. arXiv 2025, arXiv:2503.11056. [Google Scholar]
- Yu, L.; Lezama, J.; Gundavarapu, N.B.; Versari, L.; Sohn, K.; Minnen, D.; Cheng, Y.; Gupta, A.; Gu, X.; Hauptmann, A.G.; et al. Language Model Beats Diffusion: Tokenizer Is Key to Visual Generation. In Proceedings of the International Conference on Learning Representations, Vienna, Austria, 7–11 May 2024. [Google Scholar]
- Yang, W.; Du, H.; Liew, Z.Q.; Lim, W.Y.B.; Xiong, Z.; Niyato, D.; Chi, X.; Shen, X.; Miao, C. Semantic communications for future internet: Fundamentals, applications, and challenges. IEEE Commun. Surv. Tutor. 2022, 25, 213–250. [Google Scholar]
- Luo, X.; Chen, H.H.; Guo, Q. Semantic communications: Overview, open issues, and future research directions. IEEE Wirel. Commun. 2022, 29, 210–219. [Google Scholar] [CrossRef]
- Guo, S.; Wang, Y.; Zhang, N.; Su, Z.; Luan, T.H.; Tian, Z.; Shen, X. A survey on semantic communication networks: Architecture, security, and privacy. IEEE Commun. Surv. Tutor. 2024; early access. [Google Scholar]
- Chaccour, C.; Saad, W.; Debbah, M.; Han, Z.; Poor, H.V. Less data, more knowledge: Building next generation semantic communication networks. IEEE Commun. Surv. Tutor. 2024, 27, 37–76. [Google Scholar] [CrossRef]
- Wu, H.; Chen, G.; Gündüz, D. Actions Speak Louder Than Words: Rate-Reward Trade-off in Markov Decision Processes. In Proceedings of the Thirteenth International Conference on Learning Representations, Singapore, 24–28 April 2025. [Google Scholar]
- Hamdi, Y.; Niu, X.; Bai, B.; Gündüz, D. Non-interactive Remote Coordination. In Proceedings of the Workshop on Machine Learning and Compression, NeurIPS, Vancouver, BC, Canada, 15 December 2024. [Google Scholar]
- Zhang, G.; Yue, Y.; Li, Z.; Yun, S.; Wan, G.; Wang, K.; Cheng, D.; Yu, J.X.; Chen, T. Cut the crap: An economical communication pipeline for LLM-based multi-agent systems. In Proceedings of the International Conference on Learning Representations, Singapore, 24–28 April 2025. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Niu, X.; Bai, B.; Guo, N.; Zhang, W.; Han, W. Rate–Distortion–Perception Trade-Off in Information Theory, Generative Models, and Intelligent Communications. Entropy 2025, 27, 373. https://doi.org/10.3390/e27040373