RoSe-Mix: Robust and Secure Deep Neural Network Watermarking in Black-Box Settings via Image Mixup
Abstract
1. Introduction
- We provide a security analysis of this method, formalizing the complexity an attacker faces in breaking the watermarking scheme and demonstrating that breaking the watermarking protocol is exponentially hard in the size of the selected secret key.
- We conduct experiments on several datasets and models to evaluate the performance of our proposed method.
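The exponential-hardness claim can be illustrated with a back-of-the-envelope calculation: an attacker who must guess a uniformly random b-bit key blindly faces 2^b equally likely candidates, so the expected work doubles with every added bit. A minimal sketch (the key lengths and success threshold are illustrative, not the paper's parameters):

```python
import math

def brute_force_trials(key_bits: int, success_prob: float = 0.5) -> int:
    """Number of distinct key guesses needed to reach the given success
    probability, assuming a uniformly random key of `key_bits` bits."""
    keyspace = 2 ** key_bits
    # Guesses covering `success_prob` of the key space.
    return math.ceil(success_prob * keyspace)

# Each additional key bit doubles the attacker's expected workload.
assert brute_force_trials(128) == 2 ** 127
assert brute_force_trials(129) == 2 * brute_force_trials(128)
```

This is why the analysis below focuses on what secret material the attacker must recover (the bitstream in RoSe, the Mixup parameters in Mixer) rather than on the model itself.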
2. Related Works
3. Proposed Method
3.1. Motivation
- Trigger set vulnerability: RoSe’s main limitation lies in its trigger set, which acts as the Owner’s key. Since the same trigger set is used for both watermark embedding and verification, it is vulnerable to a potentially untrustworthy verifier. Furthermore, increasing the number of watermark samples from the training data can degrade the model’s performance on the main task, limiting the scalability of RoSe [23].
- Domain space overlap: In Mixer, the verification trigger set differs from the trigger samples used during watermark embedding. However, since both sets are drawn from the same data domain, represented as a sphere in the feature space, an attacker could generate keys very similar to the Owner’s keys. This would place the attacker in the same domain space, making it difficult for the verifier to distinguish the legitimate Owner from the attacker in case of a dispute [25].
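Mixer builds its trigger samples with Mixup [24], which blends pairs of images and labels by convex interpolation. A minimal sketch of that operation (the Beta parameter `alpha` and the toy array shapes are illustrative assumptions, not the settings used in RoSe-Mix):

```python
import numpy as np

def mixup_pair(x1, x2, y1, y2, alpha=0.4, rng=None):
    """Blend two images and their one-hot labels with a Beta-sampled ratio.

    Illustrative sketch of the Mixup operation; `alpha` is a placeholder
    hyperparameter, not the value used in the paper.
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    x = lam * x1 + (1.0 - lam) * x2       # convex combination of inputs
    y = lam * y1 + (1.0 - lam) * y2       # matching combination of labels
    return x, y, lam

# Toy usage: mix two 2x2 "images" with one-hot labels.
x1, x2 = np.zeros((2, 2)), np.ones((2, 2))
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
x, y, lam = mixup_pair(x1, x2, y1, y2)
assert np.allclose(x, (1.0 - lam) * x2)   # holds because x1 is all zeros
```

Because every blended sample lies on a segment between two domain points, all triggers stay inside the same region of feature space, which is precisely the overlap risk described above.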
3.2. RoSe-Mix: A Hybrid Watermarking Approach
3.3. Security Analysis
- a. The bitstream (in Equation (1)) used to generate the seed in the RoSe method. (We can approximate the work required for RoSe-Mix by adopting the RoSe analysis: instead of guessing the correct labels, the attacker must guess the correct bitstream and thereby the seed. This assumes that the function is known.)
- b. The parameters used in the Mixer method.
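The bitstream-to-seed step above can be sketched as deriving a PRNG seed from a secret bitstream through a cryptographic hash: any single-bit error in the attacker's guess yields an unrelated seed, so guessing the seed amounts to guessing the full bitstream. (The hash choice and names below are illustrative stand-ins; Equation (1) in the paper defines the actual construction.)

```python
import hashlib
import random

def seed_from_bitstream(bitstream: bytes) -> int:
    """Derive a deterministic PRNG seed from a secret bitstream.

    Illustrative stand-in for Equation (1): SHA-256 is one-way, so
    recovering the bitstream from observed behavior is infeasible.
    """
    digest = hashlib.sha256(bitstream).digest()
    return int.from_bytes(digest, "big")

owner = seed_from_bitstream(b"owner-secret-bitstream")
attacker = seed_from_bitstream(b"owner-secret-bitstream!")  # off by one byte
assert owner != attacker

# The seed then drives trigger-sample selection deterministically.
rng = random.Random(owner)
```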
3.3.1. RoSe
- a. Generating black-box adversarial examples.
- b. Regenerating hash-based triggers for verification.
3.3.2. Mixer
3.3.3. RoSe-Mix
4. Experimental Results
4.1. Settings
4.1.1. Dataset and Models
4.1.2. Metrics
4.2. Modulation Parameters
4.3. Fidelity
4.4. Robustness
- a. Mixup-based embedding: Unlike traditional backdoor-based watermarking schemes, RoSe-Mix does not rely on a fixed set of trigger inputs. Instead, Mixup continuously generates new trigger samples, preventing an adversary from isolating and removing the watermark through fine-tuning or pruning. This also explains why the watermark accuracy remains significantly higher than TA under aggressive pruning: the watermark is encoded as a distributed manifold rather than as discrete patterns.
- b. Cryptographic hashing mechanism: The use of one-way hash functions ensures that, even if an attacker attempts to regenerate trigger samples, they cannot feasibly recover the correct key-label mappings without access to the private key. This is evident from the minimal USR values in Table 2, which show that unauthorized attempts to extract the watermark fail at high rates.
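The hashing argument above can be made concrete: if trigger labels are assigned by a keyed one-way hash of each sample, a verifier holding the key can recompute them exactly, while an attacker without the key sees labels that are indistinguishable from uniform. A minimal sketch (the HMAC construction and class count are assumptions for illustration, not the paper's exact mechanism):

```python
import hashlib
import hmac

def trigger_label(secret_key: bytes, sample_bytes: bytes, num_classes: int = 10) -> int:
    """Map a trigger sample to a label with a keyed one-way hash.

    Without `secret_key`, recovering the mapping would require inverting
    HMAC-SHA256, which is computationally infeasible.
    """
    tag = hmac.new(secret_key, sample_bytes, hashlib.sha256).digest()
    return int.from_bytes(tag, "big") % num_classes

key = b"owner-private-key"
label = trigger_label(key, b"trigger-sample-0")
# The same key and sample always reproduce the same label at verification time.
assert label == trigger_label(key, b"trigger-sample-0")
```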
4.5. Security
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- Azzouzi, M.E.; Coatrieux, G.; Bellafqira, R.; Delamarre, D.; Riou, C.; Oubenali, N.; Cabon, S.; Cuggia, M.; Bouzillé, G. Automatic de-identification of French electronic health records: A cost-effective approach exploiting distant supervision and deep learning models. BMC Med Informat. Decis. Mak. 2024, 24, 54.
- Mohammadi Foumani, N.; Miller, L.; Tan, C.W.; Webb, G.I.; Forestier, G.; Salehi, M. Deep learning for time series classification and extrinsic regression: A current survey. ACM Comput. Surv. 2024, 56, 1–45.
- Nafea, A.A.; Alameri, S.A.; Majeed, R.R.; Khalaf, M.A.; AL-Ani, M.M. A Short Review on Supervised Machine Learning and Deep Learning Techniques in Computer Vision. Babylon. J. Mach. Learn. 2024, 2024, 48–55.
- Buchholz, K. The Extreme Cost of Training AI Models. 2024. Available online: https://www.statista.com/chart/33114/estimated-cost-of-training-selected-ai-models/ (accessed on 3 March 2025).
- Uchida, Y.; Nagai, Y.; Sakazawa, S.; Satoh, S. Embedding watermarks into deep neural networks. In Proceedings of the 2017 ACM on International Conference on Multimedia Retrieval, Bucharest, Romania, 6–9 June 2017; pp. 269–277.
- Sun, Y.; Liu, L.; Yu, N.; Liu, Y.; Tian, Q.; Guo, D. Deep Watermarking for Deep Intellectual Property Protection: A Comprehensive Survey. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4697020 (accessed on 25 March 2025).
- Lansari, M.; Bellafqira, R.; Kapusta, K.; Thouvenot, V.; Bettan, O.; Coatrieux, G. When federated learning meets watermarking: A comprehensive overview of techniques for intellectual property protection. Mach. Learn. Knowl. Extr. 2023, 5, 1382–1406.
- Darvish Rouhani, B.; Chen, H.; Koushanfar, F. DeepSigns: An end-to-end watermarking framework for ownership protection of deep neural networks. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, Providence, RI, USA, 13–17 April 2019; pp. 485–497.
- Fan, L.; Ng, K.W.; Chan, C.S. Rethinking deep neural network ownership verification: Embedding passports to defeat ambiguity attacks. In Proceedings of the 33rd International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019.
- Wang, T.; Kerschbaum, F. RIGA: Covert and robust white-box watermarking of deep neural networks. In Proceedings of the Web Conference 2021, Ljubljana, Slovenia, 19–23 April 2021; pp. 993–1004.
- Bellafqira, R.; Coatrieux, G. Diction: Dynamic robust white box watermarking scheme. arXiv 2022, arXiv:2210.15745.
- Lv, P.; Li, P.; Zhang, S.; Chen, K.; Liang, R.; Ma, H.; Zhao, Y.; Li, Y. A robustness-assured white-box watermark in neural networks. IEEE Trans. Dependable Secur. Comput. 2023, 20, 5214–5229.
- Chen, H.; Liu, C.; Zhu, T.; Zhou, W. When deep learning meets watermarking: A survey of application, attacks and defenses. Comput. Stand. Interfaces 2024, 89, 103830.
- Lansari, M.; Bellafqira, R.; Kapusta, K.; Kallas, K.; Thouvenot, V.; Bettan, O.; Coatrieux, G. FedCrypt: A Dynamic White-Box Watermarking Scheme for Homomorphic Federated Learning. TechRxiv 2024.
- Adi, Y.; Baum, C.; Cisse, M.; Pinkas, B.; Keshet, J. Turning your weakness into a strength: Watermarking deep neural networks by backdooring. In Proceedings of the 27th USENIX Security Symposium (USENIX Security 18), Baltimore, MD, USA, 15–17 August 2018; pp. 1615–1631.
- Guo, J.; Potkonjak, M. Watermarking deep neural networks for embedded systems. In Proceedings of the 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Diego, CA, USA, 5–8 November 2018; pp. 1–8.
- Zhang, J.; Gu, Z.; Jang, J.; Wu, H.; Stoecklin, M.P.; Huang, H.; Molloy, I. Protecting intellectual property of deep neural networks with watermarking. In Proceedings of the 2018 on Asia Conference on Computer and Communications Security, Incheon, Republic of Korea, 4–8 June 2018; pp. 159–172.
- Le Merrer, E.; Perez, P.; Trédan, G. Adversarial frontier stitching for remote neural network watermarking. Neural Comput. Appl. 2020, 32, 9233–9244.
- Yadollahi, M.M.; Shoeleh, F.; Dadkhah, S.; Ghorbani, A.A. Robust black-box watermarking for deep neural network using inverse document frequency. In Proceedings of the 2021 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), Virtual, 25–28 October 2021; pp. 574–581.
- Wang, Y.; Wu, H. Protecting the intellectual property of speaker recognition model by black-box watermarking in the frequency domain. Symmetry 2022, 14, 619.
- Gloaguen, T.; Jovanović, N.; Staab, R.; Vechev, M. Black-box detection of language model watermarks. arXiv 2024, arXiv:2405.20777.
- Leroux, S.; Vanassche, S.; Simoens, P. Multi-bit Black-box Watermarking of Deep Neural Networks in Embedded Applications. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 2121–2130.
- Kallas, K.; Furon, T. RoSe: A robust and secure DNN watermarking. In Proceedings of the 2022 IEEE International Workshop on Information Forensics and Security (WIFS), Online, 12–16 December 2022; pp. 1–6.
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond empirical risk minimization. arXiv 2017, arXiv:1710.09412.
- Kallas, K.; Furon, T. Mixer: DNN watermarking using image mixup. In Proceedings of the ICASSP 2023—2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023; pp. 1–5.
- Boenisch, F. A systematic review on model watermarking for neural networks. Front. Big Data 2021, 4, 729663.
- Oh, G.; Kim, S.; Cho, W.; Lee, S.; Chung, J.; Song, D.; Yu, Y. SEAL: Entangled White-box Watermarks on Low-Rank Adaptation. arXiv 2025, arXiv:2501.09284.
- Downer, J.; Wang, R.; Wang, B. Watermarking Graph Neural Networks via Explanations for Ownership Protection. arXiv 2025, arXiv:2501.05614.
- Krizhevsky, A.; Hinton, G. Learning Multiple Layers of Features from Tiny Images. 2009. Available online: https://www.cs.utoronto.ca/~kriz/learning-features-2009-TR.pdf (accessed on 25 March 2025).
- Liang, J.; Wang, R. FedCIP: Federated client intellectual property protection with traitor tracking. arXiv 2023, arXiv:2306.01356.
- LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
- LeCun, Y.; Cortes, C. MNIST Handwritten Digit Database. 2010. Available online: https://www.semanticscholar.org/paper/The-mnist-database-of-handwritten-digits-LeCun-Cortes/dc52d1ede1b90bf9d296bc5b34c9310b7eaa99a2 (accessed on 25 March 2025).
- Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A novel image dataset for benchmarking machine learning algorithms. arXiv 2017, arXiv:1708.07747.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Stallkamp, J.; Schlipsing, M.; Salmen, J.; Igel, C. The German Traffic Sign Recognition Benchmark: A multi-class classification competition. In Proceedings of the IEEE International Joint Conference on Neural Networks, San Jose, CA, USA, 31 July–5 August 2011; pp. 1453–1460.
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. ImageNet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255.
- Szyller, S.; Atli, B.G.; Marchal, S.; Asokan, N. DAWN: Dynamic adversarial watermarking of neural networks. In Proceedings of the 29th ACM International Conference on Multimedia, Virtual, 20–24 October 2021; pp. 4417–4425.
- Pascal, L.; Michiardi, P.; Bost, X.; Huet, B.; Zuluaga, M.A. Maximum roaming multi-task learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 9331–9341.
- Natarajan, N.; Dhillon, I.S.; Ravikumar, P.K.; Tewari, A. Learning with noisy labels. In Proceedings of the 27th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; Volume 26.
| Metric | Host DNN | Watermarked DNN | Fine-Tune | Dyn. Quant. | Full Uint8 Quant. | Full Int8 Quant. | Float16 Quant. | JPEG55 |
|---|---|---|---|---|---|---|---|---|
| **MNIST** | | | | | | | | |
| TA | 98.64 | 99.11 | 99.11 | 99.14 | 99.14 | 99.14 | 99.11 | 99.09 |
| | 13.54 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | 12.50 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| **FashionMNIST** | | | | | | | | |
| TA | 88.44 | 88.63 | 88.63 | 88.62 | 88.62 | 88.62 | 88.63 | 87.73 |
| | 8.33 | 89.4 | 89.4 | 90.2 | 90.2 | 90.2 | 89.4 | 88.6 |
| | 7.8 | 88 | 88 | 88 | 88 | 88 | 88 | 88.4 |
| **Cifar10** | | | | | | | | |
| TA | 70.52 | 78 | 78 | 78.04 | 78.04 | 78.04 | 77.99 | 75.42 |
| | 10.42 | 100 | 100 | 100 | 100 | 100 | 100 | 100 |
| | 7.29 | 96.8 | 96.8 | 96.8 | 96.8 | 96.8 | 96.8 | 96.4 |
| **Transfer Learning** | | | | | | | | |
| TA | 86.54 | 86.57 | 71.8 | 86.63 | 86.63 | 86.63 | 86.56 | 80.36 |
| | 11.46 | 100 | 67 | 100 | 100 | 100 | 100 | 100 |
| | 9.38 | 100 | 70.2 | 100 | 100 | 100 | 100 | 100 |
| **GTSRB** | | | | | | | | |
| TA | 78.89 | 90.1 | 80.84 | 90.08 | 90.08 | 90.08 | 90.1 | 86.45 |
| | 2.2 | 100 | 89.4 | 100 | 100 | 100 | 100 | 92.8 |
| | 1.4 | 88 | 94.8 | 88.6 | 88.6 | 88.6 | 88.2 | 79.4 |
| Dataset | Mixer USR (%) | RoSe-Mix USR (%) |
|---|---|---|
| MNIST | 51.1 | 20.78 |
| FashionMNIST | 39.3 | 15.32 |
| Cifar10 | 38.4 | 15.0 |
| Transfer Learning | 39.5 | 14.7 |
| GTSRB | 12.2 | 4.69 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
El Hajjar, T.; Lansari, M.; Bellafqira, R.; Coatrieux, G.; Kapusta, K.; Kallas, K. RoSe-Mix: Robust and Secure Deep Neural Network Watermarking in Black-Box Settings via Image Mixup. Mach. Learn. Knowl. Extr. 2025, 7, 32. https://doi.org/10.3390/make7020032