Deep Speech Synthesis and Its Implications for News Verification: Lessons Learned in the RTVE-UGR Chair
Abstract
1. Introduction
- Develop and apply advanced methodologies to verify the authenticity of audio recordings, identifying manipulations, deepfakes, and any type of fake audio.
- Test and evaluate existing fake audio detection models to determine their effectiveness.
- Generate audio datasets using artificial intelligence techniques to train the selected models with voices of interest.
- Train and optimize models to improve their ability to detect fake audio.
- Create useful and accessible tools for journalists that facilitate the quick and effective verification of audios.
- Train journalists in the use of these tools.
2. State of the Art
2.1. Verification of Deep Audio Fakes: A Challenge for Journalism
2.2. Detection of Audio Fakes: A Technological Challenge
2.3. Deep Audio in News Verification: Multidisciplinary Initiatives
- WeVerify: This project expanded and improved existing verification tools, including a web browser plugin, a deepfake detector, and a domain credibility service [53].
- FANDANGO: This project analyzed the spread of fake news on social networks using big data and artificial intelligence techniques, and developed tools to detect and combat disinformation [54].
- AI4Media: This project focuses on research and training in artificial intelligence for the media sector. Its objective is to develop tools and resources that help the media combat misinformation and improve the quality of information [55].
- IBERIFIER (https://iberifier.eu/observatorio/, accessed on 23 October 2024): A digital media observatory focused on Spain and Portugal, supported by the European Commission and part of the European Digital Media Observatory (EDMO). It is coordinated by the University of Navarra and brings together twelve universities, five news verification organizations, several news agencies, and six multidisciplinary research centers. Its objectives include researching the characteristics and trends of the digital media ecosystem, developing technologies for the detection of disinformation, verifying disinformation, preparing reports on disinformation threats, and promoting media literacy initiatives.
- Incorporate emerging technologies: Explore new applications of artificial intelligence in the analysis and detection of fake audio, ensuring that this technology is useful and effective.
- Promote multidisciplinary collaboration: We seek to contribute multidisciplinary approaches and strengthen our collaboration with other universities in the IVERES project as well as with the rest of the multidisciplinary team of the RTVE-UGR chair.
- Training and dissemination: We aim to train journalism professionals in the use of news verification tools and in understanding and evaluating the ethical implications of these technologies, as well as to disseminate the advances of this chair.
3. Our Approach to Multidisciplinary Collaboration Toward Deep Audio Detection
3.1. Research and Identification of Voices
3.2. Datasets and Models
3.3. Web Audio Verification Tool
4. Results Achieved Following Our Pipeline
4.1. Datasets and Models Used
4.2. Description of the Resulting Detection Models
Implemented Models
- Audio labeling: Each audio sample is labeled as “authentic” or “fake”. This step is crucial to ensure the quality of training and evaluation, since the model needs to know whether each audio is real or fake at training time.
- Pre-processing: Audio samples are processed to normalize features such as volume and to extract relevant features using FastAudio. This step is critical to transform the data into a form that the models can process effectively.
- Model training: The model is trained with 80% of the labeled and pre-processed data.
- Testing: The models are tested with the remaining 20% of the data to evaluate their accuracy and generalization ability.
- Deployment: Once validated, the models are deployed on a server, where they can be accessed through an API to perform verifications by sending an audio file from the web tool. A minimal sketch of this pipeline is given after this list.
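As a rough illustration of this labeling, pre-processing, 80/20 training and testing pipeline, the sketch below trains and evaluates a toy binary classifier. It is only a minimal sketch under stated assumptions: the feature loader and the logistic-regression classifier are placeholders (names such as `load_labeled_features` are invented for illustration), standing in for the FastAudio front-ends and detection models actually used in the chair.

```python
# Minimal sketch of the labeling -> pre-processing -> training -> testing steps.
# Assumptions (not from the paper): features are already extracted into NumPy
# arrays, and a logistic-regression classifier stands in for the real detector.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

def load_labeled_features():
    """Placeholder for loading pre-processed, labeled audio features.

    In the real pipeline the features would come from a learnable front-end
    such as FastAudio; here a small synthetic set is simulated instead.
    """
    rng = np.random.default_rng(0)
    fake = rng.normal(loc=1.0, size=(200, 40))    # label 1 = fake
    real = rng.normal(loc=-1.0, size=(200, 40))   # label 0 = authentic
    return np.vstack([fake, real]), np.array([1] * 200 + [0] * 200)

X, y = load_labeled_features()

# 80% of the data for training, the remaining 20% for testing, as listed above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Evaluate accuracy and generalization on the held-out 20%.
print(classification_report(y_test, model.predict(X_test),
                            target_names=["authentic", "fake"]))
```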
4.3. Description of the Website Created as a Frontend for the Models
- Audio upload: The user uploads the audio file through the interface.
- Model selection: The user chooses whether to apply a specific model or the general one.
- Results display: The results are returned to the user through the web interface (a sketch of this request flow is shown after this list).
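To make the upload, model selection, and results steps concrete, the following client-side sketch sends an audio file to a verification endpoint and prints the returned decision. The endpoint URL, form fields, and JSON response keys are assumptions invented for illustration; they are not the actual API of the RTVE-UGR tool.

```python
# Hypothetical client-side sketch of the upload -> model selection -> results flow.
# The endpoint URL, form fields, and JSON keys are assumptions, not the real API.
import requests

API_URL = "https://example.org/api/verify"  # placeholder endpoint

def verify_audio(path: str, model: str = "general") -> dict:
    """Upload an audio file and request verification with the chosen model."""
    with open(path, "rb") as audio_file:
        response = requests.post(
            API_URL,
            files={"audio": audio_file},
            data={"model": model},  # "general" or a speaker-specific model id
            timeout=60,
        )
    response.raise_for_status()
    return response.json()  # assumed to contain a label and a confidence score

if __name__ == "__main__":
    print(verify_audio("sample.wav", model="general"))
```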
5. Evaluation of the Deepfake Detection Models
- ASVspoof2019: A dataset provided by the ASVspoof 2019 challenge, consisting of 108,978 fake audios and 12,483 real audios (https://zenodo.org/records/4837263, accessed on 23 October 2024).
- ASVspoof2021: The evaluation dataset from the ASVspoof 2021 challenge, containing 335,497 audios (https://doi.org/10.5281/zenodo.4837263, accessed on 23 October 2024). These audios are unlabeled, but the challenge provides an evaluation tool to measure model performance by submitting scores to the ASVspoof evaluation server (a sketch of a score-file submission is given after this list).
- RTVE-UGR dataset: A dataset generated by Monoceros Labs, containing 519 real and 1,097 fake audios.
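As an illustration of evaluating against the ASVspoof 2021 server, the sketch below writes model scores to a plain-text score file, one trial per line. The two-column "utterance-ID score" layout and the example trial IDs are assumptions for illustration; the authoritative submission format is the one defined by the challenge organizers.

```python
# Hypothetical sketch of preparing a score file for submission to the ASVspoof
# evaluation server. The "utt_id score" layout and IDs are assumptions; the
# authoritative format is specified by the challenge organizers.
def write_score_file(scores: dict[str, float], path: str = "scores.txt") -> None:
    """Write one 'utt_id score' line per evaluation trial."""
    with open(path, "w", encoding="utf-8") as f:
        for utt_id, score in scores.items():
            f.write(f"{utt_id} {score:.6f}\n")

if __name__ == "__main__":
    # Toy scores only; the sign convention (bona fide vs. spoof) depends on the
    # detector and on the challenge rules.
    write_score_file({"LA_E_0000001": 2.31, "LA_E_0000002": -4.87})
```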
- True Positive (TP) refers to a fake audio correctly identified as fake.
- False Positive (FP) refers to a real audio incorrectly identified as fake.
- False Negative (FN) refers to a fake audio incorrectly identified as real.
- True Negative (TN) refers to a real audio correctly identified as real. These four counts feed the standard metrics sketched after this list.
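From these four counts the usual detection metrics can be derived. The short sketch below computes accuracy, precision, recall, and F1 from confusion-matrix counts; it is a generic illustration (equal-error-rate figures, as reported by the ASVspoof tooling, need full score distributions rather than counts and are therefore not reproduced here).

```python
# Generic computation of detection metrics from confusion-matrix counts,
# where "positive" means an audio flagged as fake.
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

if __name__ == "__main__":
    # Counts taken from one of the confusion matrices reported in this section.
    print(detection_metrics(tp=220, fp=3, fn=0, tn=101))
```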
5.1. Experiment 1: Replicating FastAudio Experimental Results
5.2. Experiment 2: Testing an Antispoof Model with Our Dataset
5.3. Experiment 3: Evaluation of the Specific Models Developed for Individual Speakers of Interest
5.4. Experiment 4: Evaluation of a General Model Developed with Our Dataset
6. Conclusions and Future Work
- Developing more models: We will continue to develop models with more voices of persons of interest, expanding our dataset and improving the accuracy of our models.
- Improved detection capabilities: We are working on introducing more voices generated using different fake voice creation methods to improve the detection of tampering and spoofing.
- Audio analysis by fragments: Another area of future development is the analysis of audio in short fragments, such as 4-second intervals, to detect possible manipulations within long audios. This technique will allow us to identify fake fragments inserted into authentic recordings, increasing the precision and reliability of our verification tools (a sketch of such fragment-based analysis follows this list).
- Multidisciplinary collaboration: We will continue to strengthen collaboration with universities, research centers, and technology companies to enrich our methodologies and tools, ensuring a diversity of perspectives and knowledge.
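As an illustration of the fragment-based analysis mentioned above, the sketch below splits a long recording into 4-second windows and scores each window with a placeholder detector. The `score_fragment` function and the file name are assumptions invented for illustration, since the fragment-level detectors themselves are still future work; in practice each window would be passed to one of the deployed detection models.

```python
# Sketch of fragment-based analysis: split a long recording into 4-second
# windows and score each one. The scoring function is a placeholder assumption.
import numpy as np
import soundfile as sf

FRAGMENT_SECONDS = 4

def score_fragment(fragment: np.ndarray, sample_rate: int) -> float:
    """Placeholder for a real deepfake detector returning a fake-ness score."""
    # Assumption: a deployed detection model would be called here instead.
    return float(np.clip(np.abs(fragment).mean(), 0.0, 1.0))

def analyse_by_fragments(path: str):
    audio, sr = sf.read(path)
    if audio.ndim > 1:                 # down-mix stereo to mono
        audio = audio.mean(axis=1)
    hop = FRAGMENT_SECONDS * sr
    results = []
    for start in range(0, len(audio), hop):
        fragment = audio[start:start + hop]
        results.append((start / sr, score_fragment(fragment, sr)))
    return results                     # (start time in seconds, score) pairs

if __name__ == "__main__":
    for start, score in analyse_by_fragments("long_interview.wav"):
        print(f"{start:7.1f}s  score={score:.3f}")
```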
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Tan, X. Neural Text-to-Speech Synthesis; Springer: Berlin/Heidelberg, Germany, 2023. [Google Scholar] [CrossRef]
- Cai, Z.; Yang, Y.; Li, M. Cross-lingual multi-speaker speech synthesis with limited bilingual training data. Comput. Speech Lang. 2023, 77, 101427. [Google Scholar] [CrossRef]
- Eren, E.; Demiroglu, C. Deep learning-based speaker-adaptive postfiltering with limited adaptation data for embedded text-to-speech synthesis systems. Comput. Speech Lang. 2023, 81, 101520. [Google Scholar] [CrossRef]
- Mehrish, A.; Majumder, N.; Bharadwaj, R.; Mihalcea, R.; Poria, S. A review of deep learning techniques for speech processing. Inf. Fusion 2023, 99, 101869. [Google Scholar] [CrossRef]
- James, J.; Balamurali, B.T.; Watson, C.I.; MacDonald, B. Empathetic Speech Synthesis and Testing for Healthcare Robots. Int. J. Soc. Robot. 2021, 13, 2119–2137. [Google Scholar] [CrossRef]
- Angrick, M.; Luo, S.; Rabbani, Q.; Candrea, D.N.; Shah, S.; Milsap, G.W.; Anderson, W.S.; Gordon, C.R.; Rosenblatt, K.R.; Clawson, L.; et al. Online speech synthesis using a chronically implanted brain–computer interface in an individual with ALS. Sci. Rep. 2024, 14, 9617. [Google Scholar] [CrossRef]
- Xie, Q.; Tian, X.; Liu, G.; Song, K.; Xie, L.; Wu, Z.; Li, H.; Shi, S.; Li, H.; Hong, F.; et al. The Multi-Speaker Multi-Style Voice Cloning Challenge 2021. In Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 8613–8617. [Google Scholar] [CrossRef]
- Luong, H.T.; Yamagishi, J. NAUTILUS: A Versatile Voice Cloning System. IEEE/ACM Trans. Audio Speech Lang. Process. 2020, 28, 2967–2981. [Google Scholar] [CrossRef]
- Ijiga, O.M.; Idoko, I.P.; Enyejo, L.A.; Akoh, O.; Ugbane, S.I.; Ibokette, A.I. Harmonizing the voices of AI: Exploring generative music models, voice cloning, and voice transfer for creative expression. World J. Adv. Eng. Technol. Sci. 2024, 11, 372–394. [Google Scholar] [CrossRef]
- Hu, W.; Zhu, X. A real-time voice cloning system with multiple algorithms for speech quality improvement. PLoS ONE 2023, 18, e0283440. [Google Scholar] [CrossRef]
- Chadha, A.; Kumar, V.; Kashyap, S.; Gupta, M. Deepfake: An Overview. In Proceedings of the Second International Conference on Computing, Communications, and Cyber-Security, Singapore, 2–4 December 2020; pp. 557–566. [Google Scholar] [CrossRef]
- Nguyen, T.T.; Nguyen, Q.V.H.; Nguyen, D.T.; Nguyen, D.T.; Huynh-The, T.; Nahavandi, S.; Nguyen, T.T.; Pham, Q.V.; Nguyen, C.M. Deep learning for deepfakes creation and detection: A survey. Comput. Vis. Image Underst. 2022, 223, 103525. [Google Scholar] [CrossRef]
- Sadekova, T.; Gogoryan, V.; Vovk, I.; Popov, V.; Kudinov, M.; Wei, J. A Unified System for Voice Cloning and Voice Conversion through Diffusion Probabilistic Modeling. In Proceedings of the Interspeech 2022, Incheon, Republic of Korea, 18–22 September 2022; pp. 3003–3007. [Google Scholar] [CrossRef]
- Rodríguez-Ortega, Y.; Ballesteros, D.M.; Renza, D. A Machine Learning Model to Detect Fake Voice. In Proceedings of the ICAI 2020, Ota, Nigeria, 29–31 October 2020; Florez, H., Misra, S., Eds.; Springer: Cham, Switzerland, 2020; pp. 3–13. [Google Scholar] [CrossRef]
- Lyu, S. Deepfake Detection: Current Challenges and Next Steps. In Proceedings of the 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), London, UK, 6–10 July 2020; pp. 1–6. [Google Scholar] [CrossRef]
- Helmus, T.C. Artificial Intelligence, Deepfakes, and Disinformation: A Primer; Technical Report; RAND Corporation: Santa Monica, CA, USA, 2022. [Google Scholar]
- Gambín, A.F.; Yazidi, A.; Vasilakos, A.; Haugerud, H.; Djenouri, Y. Deepfakes: Current and future trends. Artif. Intell. Rev. 2024, 57, 64. [Google Scholar] [CrossRef]
- Gregory, S. Fortify the Truth: How to Defend Human Rights in an Age of Deepfakes and Generative AI. J. Hum. Rights Pract. 2023, 15, 702–714. [Google Scholar] [CrossRef]
- Naitali, A.; Ridouani, M.; Salahdine, F.; Kaabouch, N. Deepfake Attacks: Generation, Detection, Datasets, Challenges, and Research Directions. Computers 2023, 12, 216. [Google Scholar] [CrossRef]
- Diakopoulos, N.; Johnson, D. Anticipating and addressing the ethical implications of deepfakes in the context of elections. New Media Soc. 2021, 23, 2072–2098. [Google Scholar] [CrossRef]
- Mcuba, M.; Singh, A.; Ikuesan, R.A.; Venter, H. The Effect of Deep Learning Methods on Deepfake Audio Detection for Digital Investigation. Procedia Comput. Sci. 2023, 219, 211–219. [Google Scholar] [CrossRef]
- Almutairi, Z.; Elgibreen, H. A Review of Modern Audio Deepfake Detection Methods: Challenges and Future Directions. Algorithms 2022, 15, 155. [Google Scholar] [CrossRef]
- Khanjani, Z.; Watson, G.; Janeja, V.P. Audio deepfakes: A survey. Front. Big Data 2023, 5, 1001063. [Google Scholar] [CrossRef]
- Akhtar, Z.; Pendyala, T.L.; Athmakuri, V.S. Video and Audio Deepfake Datasets and Open Issues in Deepfake Technology: Being Ahead of the Curve. Forensic Sci. 2024, 4, 289–377. [Google Scholar] [CrossRef]
- Wang, Y.; Huang, H. Audio–visual deepfake detection using articulatory representation learning. Comput. Vis. Image Underst. 2024, 248, 104133. [Google Scholar] [CrossRef]
- OECD. Facts Not Fakes: Tackling Disinformation, Strengthening Information Integrity; Organisation for Economic Co-Operation and Development: Paris, France, 2024. [Google Scholar]
- Guo, Z.; Schlichtkrull, M.; Vlachos, A. A Survey on Automated Fact-Checking. Trans. Assoc. Comput. Linguist. 2022, 10, 178–206. [Google Scholar] [CrossRef]
- Díaz-Lucena, A.; Hidalgo-Cobo, P. Verification Agencies on TikTok: The Case of MediaWise and Politifact. Societies 2024, 14, 59. [Google Scholar] [CrossRef]
- López-Marcos, C.; Vicente-Fernández, P. Fact Checkers Facing Fake News and Disinformation in the Digital Age: A Comparative Analysis between Spain and United Kingdom. Publications 2021, 9, 36. [Google Scholar] [CrossRef]
- Valero-Pastor, J. Plataformas, Consumo Mediático y Nuevas Realidades Digitales: Hacia Una Perspectiva Integradora; Dykinson: Madrid, Spain, 2021. [Google Scholar]
- Tejedor, S.; Vila, P. Exo Journalism: A Conceptual Approach to a Hybrid Formula between Journalism and Artificial Intelligence. Journal. Media 2021, 2, 830–840. [Google Scholar] [CrossRef]
- Gao, Y.; Wang, X.; Zhang, Y.; Zeng, P.; Ma, Y. Temporal Feature Prediction in Audio–Visual Deepfake Detection. Electronics 2024, 13, 3433. [Google Scholar] [CrossRef]
- Schäfer, K.; Choi, J.E.; Zmudzinski, S. Explore the world of audio deepfakes: A guide to detection techniques for non-experts. In Proceedings of the 3rd ACM International Workshop on Multimedia AI Against Disinformation, Phuket, Thailand, 10–13 June 2024; pp. 13–22. [Google Scholar] [CrossRef]
- Yamagishi, J.; Wang, X.; Todisco, M.; Sahidullah, M.; Patino, J.; Nautsch, A.; Liu, X.; Lee, K.A.; Kinnunen, T.; Evans, N.; et al. ASVspoof 2021: Accelerating progress in spoofed and deepfake speech detection. In Proceedings of the 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge, Online, 16 September 2021; pp. 47–54. [Google Scholar] [CrossRef]
- Yi, J.; Tao, J.; Fu, R.; Yan, X.; Wang, C.; Wang, T.; Zhang, C.Y.; Zhang, X.; Zhao, Y.; Ren, Y.; et al. ADD 2023: The Second Audio Deepfake Detection Challenge. arXiv 2023, arXiv:2305.13774. [Google Scholar] [CrossRef]
- Yi, J.; Fu, R.; Tao, J.; Nie, S.; Ma, H.; Wang, C.; Wang, T.; Tian, Z.; Bai, Y.; Fan, C.; et al. ADD 2022: The First Audio Deep Synthesis Detection Challenge. arXiv 2022, arXiv:2202.08433. [Google Scholar] [CrossRef]
- Wu, Z.; Kinnunen, T.; Evans, N.; Yamagishi, J.; Hanilçi, C.; Sahidullah, M.; Sizov, A. ASVspoof 2015: The first automatic speaker verification spoofing and countermeasures challenge. In Proceedings of the Interspeech 2015, Dresden, Germany, 6–10 September 2015; pp. 2037–2041. [Google Scholar] [CrossRef]
- Kinnunen, T.; Sahidullah, M.; Delgado, H.; Todisco, M.; Evans, N.; Yamagishi, J.; Lee, K.A. The ASVspoof 2017 Challenge: Assessing the Limits of Replay Spoofing Attack Detection. In Proceedings of the Interspeech 2017, Stockholm, Sweden, 20–24 August 2017; pp. 2–6. [Google Scholar] [CrossRef]
- Todisco, M.; Wang, X.; Vestman, V.; Sahidullah, M.; Delgado, H.; Nautsch, A.; Yamagishi, J.; Evans, N.; Kinnunen, T.; Lee, K.A. ASVspoof 2019: Future Horizons in Spoofed and Fake Audio Detection. arXiv 2019, arXiv:1904.05441. [Google Scholar] [CrossRef]
- Lai, C.I.; Chen, N.; Villalba, J.; Dehak, N. ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks. arXiv 2019, arXiv:1904.01120. [Google Scholar] [CrossRef]
- Lai, C.I.; Abad, A.; Richmond, K.; Yamagishi, J.; Dehak, N.; King, S. Attentive Filtering Networks for Audio Replay Attack Detection. arXiv 2018, arXiv:1810.13048. [Google Scholar] [CrossRef]
- Kang, W.H.; Alam, J.; Fathan, A. CRIM’s System Description for the ASVSpoof2021 Challenge. In Proceedings of the 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge, Online, 16 September 2021; pp. 100–106. [Google Scholar] [CrossRef]
- Tak, H.; Jung, J.w.; Patino, J.; Kamble, M.; Todisco, M.; Evans, N. End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection. arXiv 2021, arXiv:2107.12710. [Google Scholar] [CrossRef]
- Tak, H.; Patino, J.; Todisco, M.; Nautsch, A.; Evans, N.; Larcher, A. End-to-End anti-spoofing with RawNet2. In Proceedings of the 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 6369–6373. [Google Scholar] [CrossRef]
- Chen, X.; Zhang, Y.; Zhu, G.; Duan, Z. UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021. arXiv 2021, arXiv:2107.12018. [Google Scholar] [CrossRef]
- Cáceres, J.; Font, R.; Grau, T.; Molina, J. The Biometric Vox System for the ASVspoof 2021 Challenge. In Proceedings of the 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge, Online, 16 September 2021; pp. 68–74. [Google Scholar] [CrossRef]
- Wang, X.; Qin, X.; Zhu, T.; Wang, C.; Zhang, S.; Li, M. The DKU-CMRI System for the ASVspoof 2021 Challenge: Vocoder based Replay Channel Response Estimation. In Proceedings of the 2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge, Online, 16 September 2021; pp. 16–21. [Google Scholar] [CrossRef]
- Fu, Q.; Teng, Z.; White, J.; Powell, M.; Schmidt, D.C. FastAudio: A Learnable Audio Front-End for Spoof Speech Detection. arXiv 2021, arXiv:2109.02774. [Google Scholar] [CrossRef]
- Wang, X.; Yamagishi, J.; Todisco, M.; Delgado, H.; Nautsch, A.; Evans, N.; Sahidullah, M.; Vestman, V.; Kinnunen, T.; Lee, K.A.; et al. ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Comput. Speech Lang. 2020, 64, 101114. [Google Scholar] [CrossRef]
- Masood, M.; Nawaz, M.; Malik, K.M.; Javed, A.; Irtaza, A.; Malik, H. Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward. Appl. Intell. 2023, 53, 3974–4026. [Google Scholar] [CrossRef]
- Teyssou, D. Applying Design Thinking Methodology: The InVID Verification Plugin. In Video Verification in the Fake News Era; Mezaris, V., Nixon, L., Papadopoulos, S., Teyssou, D., Eds.; Springer International Publishing: Berlin/Heidelberg, Germany, 2019; pp. 263–279. [Google Scholar] [CrossRef]
- Teyssou, D.; Leung, J.M.; Apostolidis, E.; Apostolidis, K.; Papadopoulos, S.; Zampoglou, M.; Papadopoulou, O.; Mezaris, V. The InVID Plug-in: Web Video Verification on the Browser. In Proceedings of the First International Workshop on Multimedia Verification, New York, NY, USA, 23–27 October 2017; pp. 23–30. [Google Scholar] [CrossRef]
- Marinova, Z.; Spangenberg, J.; Teyssou, D.; Papadopoulos, S.; Sarris, N.; Alaphilippe, A.; Bontcheva, K. Weverify: Wider and Enhanced Verification for You Project Overview and Tools. In Proceedings of the 2020 IEEE International Conference on Multimedia & Expo Workshops (ICMEW), London, UK, 6–10 July 2020; pp. 1–4. [Google Scholar] [CrossRef]
- Nucci, F.; Boi, S.; Magaldi, M. Artificial Intelligence Against Disinformation: The FANDANGO Practical Case. In Proceedings of the First International Forum on Digital and Democracy. Towards A Sustainable Evolution, Venice, Italy, 10–11 December 2020. [Google Scholar]
- Tsalakanidou, F.; Papadopoulos, S.; Mezaris, V.; Kompatsiaris, I.; Gray, B.; Tsabouraki, D.; Kalogerini, M.; Negro, F.; Montagnuolo, M.; de Vos, J.; et al. The AI4Media Project: Use of Next-Generation Artificial Intelligence Technologies for Media Sector Applications. In Proceedings of the Artificial Intelligence Applications and Innovations, Crete, Greece, 25–27 June 2021; Maglogiannis, I., Macintyre, J., Iliadis, L., Eds.; Springer International Publishing: Berlin/Heidelberg, Germany, 2021; pp. 81–93. [Google Scholar] [CrossRef]
- Pawlicka, A.; Pawlicki, M.; Kozik, R.; Andrychowicz-Trojanowska, A.; Choraś, M. AI vs linguistic-based human judgement: Bridging the gap in pursuit of truth for fake news detection. Inf. Sci. 2024, 679, 121097. [Google Scholar] [CrossRef]
- Whittaker, L.; Mulcahy, R.; Letheren, K.; Kietzmann, J.; Russell-Bennett, R. Mapping the deepfake landscape for innovation: A multidisciplinary systematic review and future research agenda. Technovation 2024, 123, 102784. [Google Scholar] [CrossRef]
- de Lima-Santos, M.F. ProPublica’s Data Journalism: How Multidisciplinary Teams and Hybrid Profiles Create Impactful Data Stories. Media Commun. 2022, 10, 5–15. [Google Scholar] [CrossRef]
- Bisiani, S.; Abellan, A.; Arias Robles, F.; García-Avilés, J.A. The Data Journalism Workforce: Demographics, Skills, Work Practices, and Challenges in the Aftermath of the COVID-19 Pandemic. J. Pract. 2023, 1–21. [Google Scholar] [CrossRef]
- Mtchedlidze, J. Technical Expertise in Newsrooms: Understanding Data Journalists’ Roles and Practices. Journal. Media 2024, 5, 1316–1328. [Google Scholar] [CrossRef]
- de Lima-Santos, M.F.; Yeung, W.N.; Dodds, T. Guiding the way: A comprehensive examination of AI guidelines in global media. AI Soc. 2024. [Google Scholar] [CrossRef]
Roles | Experience | Tasks Performed |
---|---|---|
Computer science researchers (CSR) | Experts in speech technologies, AI, machine learning and signal processing | Research and development of deepfake detection models |
Experts in speech synthesis and voice cloning (SSVC) | Specialists in speech synthesis and audio technologies | Creating synthetic voices to generate fake samples for training |
Web developer (WD) | Experience in frontend development and UX/UI design | Design and maintenance of the user interface |
Backend developer (BD) | Experience in databases, servers and APIs | Implementation of the backend logic and deployment of the models on the server |
Systems engineer (SE) | Specialization in IT infrastructure and networks | Technical support and server maintenance |
News verification journalists (NVJ) | Experience in verification journalism and fact checking | Study the incidence of voice fakes, identify persons who are frequent targets of fakes, test and feedback on the use of tools |
Generalistic journalists (GJ) | Various specializations in journalism different from verification | Test and feedback on the use of tools |
Journalism researchers (JR) | Specialization in journalistic investigation | Analysis on the adequacy and potential impact of the tools developed |
Experiment id | Training Set (lang.) | Test Set (lang.) | Type of Test |
---|---|---|---|
Exp. 1 | ASVspoof2019 (en) | ASVspoof2021 (en) | Unknown speaker, general model |
Exp. 2 | ASVspoof2019 (en) | RTVE-UGR (es) | Unknown speaker, general model |
Exp. 3 | RTVE-UGR (es) | RTVE-UGR (es) | Known speaker, specific model |
Exp. 4 | RTVE-UGR (es) | RTVE-UGR (es) | Known speaker, general model |
| | Fake Audios | Real Audios |
|---|---|---|
True Positives (TP) | 1098 | — |
False Negatives (FN) | 0 | — |
False Positives (FP) | — | 520 |
True Negatives (TN) | — | 0 |
| | Fake Audios | Real Audios |
|---|---|---|
True Positives (TP) | 220 | — |
False Negatives (FN) | 0 | — |
False Positives (FP) | — | 3 |
True Negatives (TN) | — | 101 |
| | Fake Audios | Real Audios |
|---|---|---|
True Positives (TP) | 97 | — |
False Negatives (FN) | 0 | — |
False Positives (FP) | — | 0 |
True Negatives (TN) | — | 102 |
| | Fake Audios | Real Audios |
|---|---|---|
True Positives (TP) | 354 | — |
False Negatives (FN) | 0 | — |
False Positives (FP) | — | 4 |
True Negatives (TN) | — | 252 |