Spread Spectrum Image Watermarking Through Latent Diffusion Model
Abstract
1. Introduction
- The proposed scheme embeds the watermark information in a spread-spectrum manner. It requires neither a pre-trained embedding network nor a decoder structure, which makes the framework more lightweight.
- The watermark message is hidden in the noisy feature representation of the latent vector. Thanks to the reversibility of diffusion models, the watermarked image can be produced without compromising perceptual quality. The VAE encoder also acts as a robust feature extractor, which offers a suitable place to accommodate the watermark. In addition, running diffusion on the lower-dimensional vector produced by the VAE encoder is computationally efficient.
- Experimental results show that the framework is robust to common attacks such as JPEG compression, brightness adjustment, and blurring. More importantly, it also resists image regeneration and image editing.
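
The spread-spectrum idea in the first two contributions can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the random array merely stands in for a noisy latent, no VAE or diffusion model is involved, and the names `make_carriers`, `embed`, `extract`, the strength `alpha`, and the carrier seed are all hypothetical choices made for this illustration. Each message bit modulates its own pseudo-random ±1 carrier, the carriers are added to the latent, and the bits are read back from the sign of the correlation with each carrier.

```python
import numpy as np

def make_carriers(n_bits, dim, seed):
    """One pseudo-random +/-1 carrier per message bit, derived from a secret seed (assumption)."""
    rng = np.random.default_rng(seed)
    return rng.choice([-1.0, 1.0], size=(n_bits, dim))

def embed(latent, bits, carriers, alpha=0.1):
    """Additively spread each bit over the whole latent with strength alpha (assumption)."""
    symbols = 2.0 * np.asarray(bits, dtype=float) - 1.0   # map {0, 1} to {-1, +1}
    flat = latent.reshape(-1)
    watermarked = flat + alpha * symbols @ carriers        # sum of modulated carriers
    return watermarked.reshape(latent.shape)

def extract(latent, carriers):
    """Blind extraction: the sign of the correlation with each carrier gives the bit."""
    corr = carriers @ latent.reshape(-1)
    return (corr > 0).astype(int)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent = rng.standard_normal((4, 64, 64))   # stand-in for a noisy latent vector
    bits = rng.integers(0, 2, size=32)          # 32-bit message
    carriers = make_carriers(bits.size, latent.size, seed=1234)
    watermarked = embed(latent, bits, carriers)
    recovered = extract(watermarked, carriers)
    print("bit errors:", int(np.sum(recovered != bits)))
```

Because every bit is spread over all latent coordinates, the correlation for each carrier grows with the latent dimension while interference from the host and from moderate noise grows only with its square root, which is the usual intuition for why spread-spectrum watermarks tolerate distortion.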
2. Method
2.1. Diffusion and Inversion
2.2. Watermark Embedding
2.3. Watermark Extraction
2.4. Loss Function
3. Experiments
3.1. Experimental Settings
3.2. Imperceptibility Test
3.3. Robustness Test
Methods | Common 1 | Common 2 | Common 3 | Common 4 | Common 5 | Common 6 | Common 7 | Regeneration B | Regeneration C | Regeneration D | Editing P | Editing U
---|---|---|---|---|---|---|---|---|---|---|---|---
COCO | | | | | | | | | | | |
HiDDeN | 0.310 | 0.151 | 0.154 | 0.223 | 0.188 | 0.182 | 0.295 | 0.417 | 0.378 | 0.416 | 0.461 | 0.457
Stegastamp | 0.001 | 0.001 | 0.002 | 0.003 | 0.001 | 0.186 | 0.001 | 0.002 | 0.002 | 0.137 | 0.058 | 0.200
SSL | 0.032 | 0.001 | 0.001 | 0.036 | 0.001 | 0.201 | 0.122 | 0.318 | 0.290 | 0.207 | 0.131 | 0.409
Stable Signature | 0.124 | 0.006 | 0.008 | 0.025 | 0.066 | 0.012 | 0.169 | 0.377 | 0.284 | 0.522 | 0.511 | 0.583
Proposed | 0.014 | 0.001 | 0.001 | 0.041 | 0.001 | 0.217 | 0.021 | 0.067 | 0.070 | 0.018 | 0.010 | 0.079
Proposed (64) | 0.013 | 0.001 | 0.001 | 0.032 | 0.001 | 0.221 | 0.019 | 0.059 | 0.070 | 0.013 | 0.014 | 0.063
DiffusionDB | | | | | | | | | | | |
HiDDeN | 0.319 | 0.155 | 0.155 | 0.226 | 0.184 | 0.187 | 0.305 | 0.420 | 0.385 | 0.422 | 0.481 | 0.448
Stegastamp | 0.001 | 0.002 | 0.003 | 0.003 | 0.001 | 0.189 | 0.001 | 0.002 | 0.003 | 0.152 | 0.073 | 0.258
SSL | 0.038 | 0.001 | 0.001 | 0.031 | 0.001 | 0.211 | 0.139 | 0.322 | 0.298 | 0.260 | 0.163 | 0.376
Stable Signature | 0.113 | 0.005 | 0.006 | 0.013 | 0.062 | 0.008 | 0.146 | 0.375 | 0.275 | 0.521 | 0.482 | 0.552
Proposed | 0.025 | 0.001 | 0.001 | 0.049 | 0.001 | 0.227 | 0.029 | 0.077 | 0.081 | 0.028 | 0.018 | 0.100
Proposed (64) | 0.023 | 0.001 | 0.001 | 0.047 | 0.001 | 0.220 | 0.030 | 0.081 | 0.089 | 0.018 | 0.025 | 0.094
3.4. Capacity
3.5. Discussions
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Methods | COCO PSNR ↑ | COCO SSIM ↑ | COCO LPIPS ↓ | DiffusionDB PSNR ↑ | DiffusionDB SSIM ↑ | DiffusionDB LPIPS ↓
---|---|---|---|---|---|---
HiDDeN | 31.70 | 0.93 | 0.02 | 31.42 | 0.94 | 0.02
Stegastamp | 28.73 | 0.89 | 0.07 | 28.05 | 0.89 | 0.07
SSL | 32.07 | 0.87 | 0.11 | 32.11 | 0.88 | 0.10
Stable Signature | 26.43 | 0.75 | 0.06 | 25.78 | 0.75 | 0.06
Proposed | 30.85 | 0.90 | 0.05 | 30.49 | 0.90 | 0.05
Proposed (64) | 28.28 | 0.88 | 0.10 | 28.63 | 0.89 | 0.09
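
For readers who want to reproduce the PSNR column on their own image pairs, the standard definition is PSNR = 10 log10(peak^2 / MSE). The sketch below assumes 8-bit images (peak value 255) and uses a hypothetical helper name `psnr`; it illustrates the metric only and is not the evaluation code behind the table.

```python
import numpy as np

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    original = np.asarray(original, dtype=np.float64)
    distorted = np.asarray(distorted, dtype=np.float64)
    mse = np.mean((original - distorted) ** 2)   # mean squared error over all pixels
    if mse == 0:
        return float("inf")                      # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

Higher values mean the watermarked image is closer to the cover image, which is why the PSNR column is marked with an upward arrow.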
Method | Capacity | Transparency PSNR ↑ | Transparency SSIM ↑ | Transparency LPIPS ↓ | Robustness Common | Robustness Regeneration | Robustness Editing
---|---|---|---|---|---|---|---
Proposed | 32 | 30.85 | 0.90 | 0.05 | 0.042 | 0.052 | 0.045
Proposed | 64 | 28.28 | 0.88 | 0.10 | 0.041 | 0.071 | 0.039
Proposed | 96 | 22.73 | 0.78 | 0.27 | 0.026 | 0.022 | 0.026
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wu, H.; Lin, X.; Tan, G. Spread Spectrum Image Watermarking Through Latent Diffusion Model. Entropy 2025, 27, 428. https://doi.org/10.3390/e27040428