Enhancing Web Application Security: Advanced Biometric Voice Verification for Two-Factor Authentication
Abstract
1. Introduction
2. Related Works
- Universality: every individual should possess the considered factor;
- Uniqueness: the factor should ensure a high degree of differentiation among individuals;
- Collectability: the factor should be measurable through practical means;
- Performance: the accuracy, speed, and reliability that the factor can achieve;
- Acceptability: society should not have reservations about the technology used by the specific factor;
- Spoofing: the level of difficulty of intercepting and falsifying a data sample for the respective factor.
3. Methods
3.1. Signal Acquisition
3.2. Signal Pre-Processing
3.3. Extraction of Distinctive Features
3.4. Selection of Distinctive Features
3.5. Normalization of Distinctive Features
3.6. Creation of Voice Models
3.7. Decision-Making System
- H0 (null hypothesis): the voice signal X comes from speaker k,
- H1 (alternative hypothesis): the voice signal X comes from another speaker ~k from the population.
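The choice between H0 and H1 is commonly made with a log-likelihood ratio compared against a decision threshold. The sketch below illustrates that idea only and is not the authors' implementation: it assumes each hypothesis is represented by a single one-dimensional Gaussian, whereas a practical system would use Gaussian mixtures over multi-dimensional feature vectors. The function names, model parameters, frame values, and the threshold of 0.0 are all hypothetical.

```python
import math


def gaussian_log_pdf(x, mean, var):
    """Log density of a one-dimensional Gaussian (a stand-in for a full voice model)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def avg_log_likelihood(frames, mean, var):
    """Average per-frame log-likelihood of the signal under one model."""
    return sum(gaussian_log_pdf(x, mean, var) for x in frames) / len(frames)


def verify(frames, claimed_model, background_model, threshold=0.0):
    """Accept H0 when the log-likelihood ratio reaches the threshold.

    claimed_model    -- (mean, var) for speaker k (hypothesis H0)
    background_model -- (mean, var) for "another speaker ~k" (hypothesis H1)
    """
    llr = (avg_log_likelihood(frames, *claimed_model)
           - avg_log_likelihood(frames, *background_model))
    return llr, llr >= threshold


# Hypothetical models and feature frames, for illustration only.
speaker_k = (1.0, 0.5)    # assumed model of the claimed speaker
background = (0.0, 2.0)   # assumed model of the remaining population
genuine_frames = [0.9, 1.1, 1.3, 0.8]
impostor_frames = [-1.0, 0.2, -0.5, 0.1]

genuine_llr, genuine_accepted = verify(genuine_frames, speaker_k, background)
impostor_llr, impostor_accepted = verify(impostor_frames, speaker_k, background)
print(genuine_accepted, impostor_accepted)
```

Averaging over frames keeps the score comparable across utterances of different lengths, so a single threshold can be applied to all of them.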
3.8. Normalizing the Outcome of the Verification
4. Results
4.1. Impact of the Normalization of Distinctive Features on the Effectiveness of the ASR System
4.2. Optimization of the Adopted Alternative Hypothesis in the Decision-Making System
4.3. Optimization of the Decision Threshold
4.4. Comparison with Other Speaker Verification Methods
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Factor | Universality | Uniqueness | Collectability | Performance | Acceptability | Spoofing |
---|---|---|---|---|---|---|
Password | n/a | L | H | H | H | H |
Token | n/a | M | H | H | H | H |
Voice | M | L | M | L | H | H |
Facial | H | L | M | L | H | M |
Ocular-based | H | H | M | M | L | H |
Fingerprint | M | H | M | H | M | H |
Hand geometry | M | M | M | M | M | M |
Location | n/a | L | M | H | M | H |
Vein | M | M | M | M | M | M |
Thermal image | H | H | L | M | H | H |
Behavior | H | H | L | L | L | L |
Beam-forming | n/a | M | L | L | L | H |
OCS 1 | n/a | L | L | L | L | M |
ECG 2 | L | H | L | M | M | L |
EEG 3 | L | H | L | M | L | L |
DNA | H | H | L | H | L | L |
Name of the Speaker Verification Method | Optimized Custom GMM | I-Vector | YAMNET |
---|---|---|---|
Number of features | 23 | 60 | 64 * |
EER | 9.69% | 10.91% | 11.21% |
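The equal error rate (EER) reported above is the operating point at which the false acceptance rate equals the false rejection rate. As a rough sketch of how such a figure can be estimated from verification scores, the code below sweeps every observed score as a candidate threshold and returns the point where the two error rates are closest. It is not the authors' evaluation procedure; the function name and the score lists are hypothetical and unrelated to the results in the table.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Estimate the EER by sweeping each observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    Returns (eer, threshold) at the point where FAR and FRR are closest.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Made-up scores, for illustration only.
genuine = [2.1, 1.7, 0.9, 1.4, 0.2]
impostor = [-1.0, 0.4, -0.3, 1.0, -0.8]

eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With few trials the two error curves may not cross exactly, which is why the sketch reports the mean of FAR and FRR at the closest point; larger evaluation sets usually interpolate between neighboring thresholds instead.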
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Kamiński, K.A.; Dobrowolski, A.P.; Piotrowski, Z.; Ścibiorek, P. Enhancing Web Application Security: Advanced Biometric Voice Verification for Two-Factor Authentication. Electronics 2023, 12, 3791. https://doi.org/10.3390/electronics12183791