A Transformer-Based DeepFake-Detection Method for Facial Organs
Abstract
1. Introduction
- We propose a novel deepfake-detection method based on the transformer architecture. It can identify edits to facial details and remains robust when detecting synthetic faces that are occluded by masks or sunglasses.
- The method focuses on organ-based forgery detection and trains a separate transformer for each facial organ. Each transformer can work independently and flexibly. At the same time, the weights of obscured and stained organs are reduced automatically.
- We propose a deepfake-detection dataset, the Facial Organ Fake Detection Test Dataset (FOFDTD). It consists of 750 authentic images, 750 GAN-generated images, and 900 forged images made by humans, covering faces with masks, faces with sunglasses, and undecorated faces. All the authentic images in the FOFDTD were collected from the Internet, and the dataset is mainly intended for deepfake detection in real-world scenarios.
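The second contribution describes per-organ transformers whose outputs are combined while obscured organs are down-weighted. As a minimal sketch of that idea, assuming a simple visibility-weighted average, the combination could look like the following. The fusion rule, the function name `fuse_organ_scores`, and the example scores and weights are illustrative assumptions and not the paper's actual mechanism.

```python
from typing import Dict

# Hypothetical per-organ outputs: probability that each organ region is forged.
# In the described method these would come from separate organ-level transformers.
scores: Dict[str, float] = {
    "eyebrows": 0.20,
    "eyes": 0.30,
    "nose": 0.25,
    "mouth": 0.90,  # unreliable here: the region is covered by a mask
    "ears": 0.10,
}

# Assumed visibility weights in [0, 1]; 0 means the organ is fully occluded.
mask_visibility: Dict[str, float] = {
    "eyebrows": 1.0,
    "eyes": 1.0,
    "nose": 1.0,
    "mouth": 0.0,
    "ears": 0.0,
}


def fuse_organ_scores(fake_scores: Dict[str, float],
                      visibility: Dict[str, float]) -> float:
    """Return the visibility-weighted average of per-organ fake probabilities.

    Organs missing from `visibility` are treated as fully occluded (weight 0).
    """
    total_weight = sum(visibility.get(organ, 0.0) for organ in fake_scores)
    if total_weight <= 0.0:
        raise ValueError("no visible organ to score")
    weighted_sum = sum(score * visibility.get(organ, 0.0)
                       for organ, score in fake_scores.items())
    return weighted_sum / total_weight


fused = fuse_organ_scores(scores, mask_visibility)
print(f"fused fake probability: {fused:.2f}")
```

With the mouth and ears given zero weight, the fused score depends only on the eyebrows, eyes, and nose, so an unreliable score from an occluded region cannot dominate the decision.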
2. Related Work
3. Proposed Method
3.1. Overview of the Framework
3.2. Organ Selection and Feature Extraction
3.3. Organ-Level Transformer
3.4. Whole-Face Transformer and Classifier Network
3.5. Loss Functions
4. Facial-Organ Forgery-Detection Test Dataset
4.1. GAN-Generated Image
4.2. Artificial Image
5. Experimental Results
5.1. Implementation Details
5.2. Ablation Study
5.3. Comparison with Other Methods
5.4. Cross-Dataset Evaluation
5.5. Complexity Measure
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
References
- Nataraj, L.; Mohammed, T.M.; Manjunath, B.S.; Chandrasekaran, S.; Flenner, A.; Bappy, J.H.; Roy-Chowdhury, A.K. Detecting GAN generated fake images using co-occurrence matrices. Electron. Imaging 2019, 2019, 532-1–532-7.
- Barni, M.; Kallas, K.; Nowroozi, E.; Tondi, B. CNN detection of GAN-generated face images based on cross-band co-occurrences analysis. In Proceedings of the 2020 IEEE International Workshop on Information Forensics and Security (WIFS), New York, NY, USA, 6–11 December 2020; pp. 1–6.
- Mi, Z.; Jiang, X.; Sun, T.; Xu, K. Gan-generated image detection with self-attention mechanism against gan generator defect. IEEE J. Sel. Top. Signal Process. 2020, 14, 969–981.
- Wang, S.Y.; Wang, O.; Zhang, R.; Owens, A.; Efros, A.A. CNN-generated images are surprisingly easy to spot... for now. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8695–8704.
- Hu, J.; Liao, X.; Wang, W.; Qin, Z. Detecting Compressed Deepfake Videos in Social Networks Using Frame-Temporality Two-Stream Convolutional Network. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 1089–1102.
- Chen, B.; Liu, X.; Zheng, Y.; Zhao, G.; Shi, Y.Q. A robust GAN-generated face detection method based on dual-color spaces and an improved Xception. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 3527–3538.
- He, Y.; Yu, N.; Keuper, M.; Fritz, M. Beyond the spectrum: Detecting deepfakes via re-synthesis. IJCAI 2021, 2534–2541.
- Ni, Y.; Meng, D.; Yu, C.; Quan, C.; Ren, D.; Zhao, Y. CORE: Consistent Representation Learning for Face Forgery Detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 12–21.
- Wang, J.; Wu, Z.; Ouyang, W.; Han, X.; Chen, J.; Jiang, Y.G.; Li, S.N. M2tr: Multi-modal multi-scale transformers for deepfake detection. In Proceedings of the 2022 International Conference on Multimedia Retrieval, Newark, NJ, USA, 27–30 June 2022; pp. 615–623.
- Yuan, Y.; Fu, X.; Wang, G.; Li, Q.; Li, X. Forgery-Domain-Supervised Deepfake Detection with Non-Negative Constraint. IEEE Signal Process. Lett. 2022.
- Ciftci, U.A.; Demir, I.; Yin, L. Fakecatcher: Detection of synthetic portrait videos using biological signals. IEEE Trans. Pattern Anal. Mach. Intell. 2020.
- Agarwal, S.; Farid, H.; El-Gaaly, T.; Lim, S.N. Detecting deep-fake videos from appearance and behavior. In Proceedings of the 2020 IEEE International Workshop on Information Forensics and Security (WIFS), New York, NY, USA, 6–11 December 2020; pp. 1–6.
- Hu, S.; Li, Y.; Lyu, S. Exposing GAN-generated Faces Using Inconsistent Corneal Specular Highlights. In Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 2500–2504.
- Matern, F.; Riess, C.; Stamminger, M. Exploiting visual artifacts to expose deepfakes and face manipulations. In Proceedings of the 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), Waikoloa Village, HI, USA, 9–11 January 2019; pp. 83–92.
- Nirkin, Y.; Wolf, L.; Keller, Y.; Hassner, T. DeepFake detection based on discrepancies between faces and their context. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 6111–6121.
- Dang, H.; Liu, F.; Stehouwer, J.; Liu, X.; Jain, A.K. On the detection of digital face manipulation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5781–5790.
- Fernando, T.; Fookes, C.; Denman, S.; Sridharan, S. Detection of fake and fraudulent faces via neural memory networks. IEEE Trans. Inf. Forensics Secur. 2020, 16, 1973–1988.
- Xu, Y.; Jia, G.; Huang, H.; Duan, J.; He, R. Visual-Semantic Transformer for Face Forgery Detection. In Proceedings of the 2021 IEEE International Joint Conference on Biometrics (IJCB), Shenzhen, China, 4–7 August 2021; pp. 1–7.
- Fridrich, J.; Kodovsky, J. Rich models for steganalysis of digital images. IEEE Trans. Inf. Forensics Secur. 2012, 7, 868–882.
- Cozzolino, D.; Poggi, G.; Verdoliva, L. Recasting residual-based local descriptors as convolutional neural networks: An application to image forgery detection. In Proceedings of the 5th ACM Workshop on Information Hiding and Multimedia Security, Philadelphia, PA, USA, 20–22 June 2017; pp. 159–164.
- Tan, M.; Le, Q. Efficientnet: Rethinking model scaling for convolutional neural networks. In Proceedings of the International Conference on Machine Learning, Long Beach, CA, USA, 10–15 June 2019; pp. 6105–6114.
- Afchar, D.; Nozick, V.; Yamagishi, J.; Echizen, I. Mesonet: A compact facial video forgery detection network. In Proceedings of the 2018 IEEE International Workshop on Information Forensics and Security (WIFS), Hong Kong, China, 11–13 December 2018; pp. 1–7.
- Güera, D.; Delp, E.J. Deepfake video detection using recurrent neural networks. In Proceedings of the 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), Auckland, New Zealand, 27–30 November 2018; pp. 1–6.
- Chollet, F. Xception: Deep learning with depthwise separable convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1251–1258.
- Qian, Y.; Yin, G.; Sheng, L.; Chen, Z.; Shao, J. Thinking in frequency: Face forgery detection by mining frequency-aware clues. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 86–103.
- Luo, Y.; Zhang, Y.; Yan, J.; Liu, W. Generalizing face forgery detection with high-frequency features. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 16317–16326.
- Chen, H.S.; Rouhsedaghat, M.; Ghani, H.; Hu, S.; You, S.; Kuo, C.C.J. Defakehop: A light-weight high-performance deepfake detector. In Proceedings of the 2021 IEEE International Conference on Multimedia and Expo (ICME), Shenzhen, China, 5–9 July 2021; pp. 1–6.
- Chen, S.; Yao, T.; Chen, Y.; Ding, S.; Li, J.; Ji, R. Local relation learning for face forgery detection. AAAI Conf. Artif. Intell. 2021, 35, 1081–1088.
- Zhao, H.; Zhou, W.; Chen, D.; Wei, T.; Zhang, W.; Yu, N. Multi-attentional deepfake detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 2185–2194.
- Dlib. Dlib C++ Library [EB/OL]. Available online: http://dlib.net/ (accessed on 14 September 2022).
- Choi, Y.; Choi, M.; Kim, M.; Ha, J.W.; Kim, S.; Choo, J. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8789–8797.
- Rossler, A.; Cozzolino, D.; Verdoliva, L.; Riess, C.; Thies, J.; Nießner, M. Faceforensics++: Learning to detect manipulated facial images. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 17 October–2 November 2019; pp. 1–11.
- Li, Y.; Yang, X.; Sun, P.; Qi, H.; Lyu, S. Celeb-df: A large-scale challenging dataset for deepfake forensics. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3207–3216.
- Dolhansky, B.; Howes, R.; Pflaum, B.; Baram, N.; Ferrer, C.C. The deepfake detection challenge (dfdc) preview dataset. arXiv 2019, arXiv:1910.08854.
- Dolhansky, B.; Bitton, J.; Pflaum, B.; Lu, J.; Howes, R.; Wang, M.; Ferrer, C.C. The Deepfake Detection Challenge (DFDC) Dataset. arXiv 2020, arXiv:2006.07397.
- Zhao, T.; Xu, X.; Xu, M.; Ding, H.; Xiong, Y.; Xia, W. Learning self-consistency for deepfake detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 15023–15033.
Combination | Simulated Scene | ACC | AUC |
---|---|---|---|
eyebrows + eyes + nose | Face with mask | 86.73 | 83.24 |
nose + mouth + ears | Face with sunglasses | 86.52 | 84.13 |
eyes + nose + mouth | Cover the eyebrows | 95.43 | 93.64 |
eyes + nose + mouth + eyebrows + ears | Undecorated face | 99.67 | 99.93 |
Methods | RAW ACC | RAW AUC | HQ ACC | HQ AUC | LQ ACC | LQ AUC |
---|---|---|---|---|---|---|
Steg.Features [19] | 97.63 | - | 70.97 | - | 55.98 | - |
LD-CNN [20] | 98.57 | - | 78.45 | - | 58.69 | - |
MesoNet [22] | 95.23 | - | 83.10 | - | 70.47 | - |
F3-Net [25] | 99.95 | 99.80 | 97.52 | 98.10 | 90.43 | 93.30 |
RFAM [28] | 99.87 | 99.92 | 97.59 | 99.46 | 91.47 | 95.21 |
Multi-attention [29] | - | - | 97.60 | 99.29 | 88.69 | 90.41 |
CORE [8] | 99.97 | 100.00 | 97.61 | 99.66 | 87.99 | 90.61 |
M2TR [9] | 99.50 | 99.92 | 97.93 | 99.51 | 92.89 | 95.31 |
Ours | 99.67 | 99.93 | 98.12 | 99.67 | 94.14 | 96.43 |
Methods | DFD | DFDC-P | Celeb-DF |
---|---|---|---|
Xception [24] | 87.86 | - | 73.04 |
Local-relation [28] | 89.24 | 76.53 | 78.26 |
HFF [26] | 91.90 | - | 79.40 |
LSC [36] | - | 74.37 | 81.80 |
CORE [8] | 93.74 | 75.74 | 79.45 |
Ours | 94.32 | 75.93 | 82.43 |
Methods | All ACC | All AUC | Mask Face ACC | Mask Face AUC | Sunglasses ACC | Sunglasses AUC | Undecorated ACC | Undecorated AUC |
---|---|---|---|---|---|---|---|---|
Efficientnet-B4 [21] | 49.1 | 78.5 | 48.7 | 68.5 | 48.8 | 89.4 | 49.7 | 83.2 |
M2TR [9] | 61.7 | 63.8 | 57.6 | 58.8 | 66.5 | 70.6 | 61.0 | 65.4 |
Ours | 64.2 | 66.7 | 60.5 | 60.2 | 68.9 | 72.3 | 63.3 | 67.4 |
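The ACC and AUC columns in the tables above are standard binary-classification metrics. As a rough sketch of how such values can be computed from per-image scores, the following uses a pairwise (rank-based) definition of AUC. The label convention (1 = fake, 0 = real), the 0.5 decision threshold, and the toy data are assumptions for illustration and are not taken from the paper's evaluation code.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random fake scores higher than a random real.

    Ties between a fake and a real score count as half a win.
    """
    fakes = [s for y, s in zip(labels, scores) if y == 1]
    reals = [s for y, s in zip(labels, scores) if y == 0]
    if not fakes or not reals:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for f in fakes:
        for r in reals:
            if f > r:
                wins += 1.0
            elif f == r:
                wins += 0.5
    return wins / (len(fakes) * len(reals))


# Toy example (hypothetical scores): three fakes followed by three reals.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

acc_value = accuracy(labels, scores)
auc_value = auc(labels, scores)
print(f"ACC = {acc_value:.4f}, AUC = {auc_value:.4f}")
```

ACC depends on the chosen threshold, whereas AUC only depends on how the scores rank fakes against reals, so the two metrics can differ noticeably for the same detector.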
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xue, Z.; Liu, Q.; Shi, H.; Zou, R.; Jiang, X. A Transformer-Based DeepFake-Detection Method for Facial Organs. Electronics 2022, 11, 4143. https://doi.org/10.3390/electronics11244143