1. Introduction
With the continuous advancement of science, technology, and computer engineering, many traditional practices are rapidly transitioning to digital methods, and personal identification and authentication are no exception. Biometric recognition technology offers a compelling solution by leveraging an individual’s unique physiological or behavioral traits for identification and verification. Because it provides a secure, accurate, fast, and convenient means of distinguishing and verifying individuals, it has undergone significant advancements in academic research and across industries in response to societal needs and interests.
It is estimated that the global biometrics market will grow from a projected 47.8 billion USD in 2023 to 86.1 billion USD by 2028 [1], showcasing the increasing influence of digital technology in everyday life. Biometric recognition technology can be classified into physiological features, which extract information directly from the body, and behavioral features, which measure specific individual behavior patterns. Examples of physiological features include fingerprints, faces, irises, retinas, veins, hand geometry, palm prints, and brainwaves, while behavioral features encompass signatures, voiceprints, gait, and keystroke dynamics. Key methods and characteristics of the commonly used biometrics are described in Table 1 [2].
As biometric recognition technology continues to evolve, its security and convenience are expected to improve further. However, addressing privacy concerns and ensuring ethical practices are crucial for its widespread adoption. By striking a balance between innovation and responsibility, biometric authentication has the potential to revolutionize personal identification, enhancing security and convenience across various industries and aspects of daily life. Among these biometric recognition and human identification technologies, the face is the most common characteristic used by humans for recognition [3]. Owing to its simplicity and convenience, and to the widespread use of high-resolution smart devices, facial recognition technology has gained considerable attention in daily life [4]. With the increasing deployment of face recognition in many applications in cyber-physical systems and industrial control systems, such as remote-sensing entrance guards, surveillance systems, and intelligent human-machine interfaces, its security has become an increasingly important concern [5].
Concurrently, the development of computing technology, advances in personal computers and digital cameras, the emergence of high-performance image processing hardware and software, and the revolutionary progress of 3D printing have empowered individuals to easily create, modify, and manipulate various types of information. Consequently, facial information can also be forged easily and with great sophistication. Forgery, also known as spoofing, counterfeiting, or manipulation, involves intentionally altering or creating false information with deceptive intent. The use of forged information can lead to the irreparable loss of property and trust, requiring significant capital, time, and effort for recovery [6]. Technological progress has thus produced a paradox: the same tools that empower individuals for legitimate purposes have also opened a Pandora’s box of manipulation, making facial information, once considered a reliable identifier, susceptible to sophisticated forgery. Significant facial forgery attack types and their characteristics are described in Table 2 [7,8,9].
Therefore, verifying the authenticity of biometric information has become a critical issue. Alongside the adoption and advancement of facial recognition technology, various facial forgery techniques have emerged, involving manipulated photos, digital reproductions on smart devices, 3D masks, and cosmetic makeup [4,10].
Anti-spoofing refers to technologies that detect such forgeries to ascertain the authenticity of facial features. Traditional methods, utilizing techniques such as Local Binary Patterns (LBP) or Histogram of Oriented Gradients (HOG), focus on detecting alterations in images. However, these methods are often limited to specific forgery types or scenarios and lack adaptability to newly discovered forgery methods. In addition, research is actively addressing the detection of human-specific features, such as eye blinking and subtle movements, which are characteristic of genuine individuals [11]. Nevertheless, these methods may require several seconds of continuous video capture and may be vulnerable to video replay attacks. While significant progress has been made in the field of biometric facial recognition, challenges persist in developing comprehensive anti-spoofing measures that can adapt to evolving forgery techniques and effectively address real-world scenarios.
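To make the traditional texture-based approach concrete, the sketch below shows how an LBP histogram combined with a simple classifier could serve as a baseline spoofing detector. The use of scikit-image and scikit-learn, the parameter choices, and the helper names are assumptions for illustration, not a description of the cited methods.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

# Hypothetical parameters: 8 neighbors on a radius-1 circle, "uniform" patterns.
P, R = 8, 1
N_BINS = P + 2  # uniform LBP yields P + 2 distinct codes

def lbp_histogram(gray_face):
    """Describe a grayscale face crop by its normalized LBP histogram."""
    codes = local_binary_pattern(gray_face, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=N_BINS, range=(0, N_BINS), density=True)
    return hist

def train_baseline(gray_faces, labels):
    """Fit a simple SVM on LBP histograms (labels: 1 = genuine, 0 = forged)."""
    features = np.stack([lbp_histogram(f) for f in gray_faces])
    clf = SVC(kernel="rbf", probability=True)
    clf.fit(features, labels)
    return clf
```

Such a pipeline also illustrates why texture descriptors struggle to generalize: the histogram captures only the micro-texture statistics of the forgery types seen during training.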
Over the years, advancements in fields such as mathematics, mechanics, and cognitive science have paved the way for the continuous development of various image processing and computer vision algorithms [12]. In particular, recent breakthroughs in deep learning have significantly benefited the domain of image processing, with diverse applications. Notably, convolutional neural networks (CNNs) have emerged as one of the most successful deep learning models, particularly excelling in object recognition and decision-making tasks [13]. This integration of deep learning and computer vision, providing machines with visual perception, offers promising solutions to long-standing problems that humanity has sought to address [14].
Research also continues on deep learning approaches to identifying forged features. One notable example is an algorithm that efficiently distinguishes facial forgery by self-learning genuine faces [15]. Another proposed algorithm distinguishes facial forgery by adaptively learning the weights of the 3 × 3 gradient filter used in convolution [16]. Furthermore, deep learning techniques have driven significant advances in face recognition itself [3]. However, research on facial forgery prevention is still largely focused on addressing known forgery techniques, while studies on effectively responding to emerging techniques remain limited. Although technological advances enable ever more diverse and sophisticated facial forgery methods, countermeasure research has not kept pace with these developments, and proactive research and responses are needed.
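As a rough illustration of the idea of learning the weights of a 3 × 3 gradient filter within a convolution (a sketch in the spirit of [16], not its exact formulation), a layer could be initialized from a Sobel-like gradient kernel while keeping its weights trainable. The class name, framework (PyTorch), and initialization below are assumptions.

```python
import torch
import torch.nn as nn

class LearnableGradientConv(nn.Module):
    """3x3 convolution whose kernel starts as a Sobel-like gradient filter
    but remains trainable, so the network can adapt the gradient weights."""

    def __init__(self, in_channels=1, out_channels=1):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3,
                              padding=1, bias=False)
        sobel_x = torch.tensor([[-1., 0., 1.],
                                [-2., 0., 2.],
                                [-1., 0., 1.]])
        with torch.no_grad():
            self.conv.weight.copy_(
                sobel_x.expand(out_channels, in_channels, 3, 3))

    def forward(self, x):
        return self.conv(x)

# Usage sketch: a single-channel face map passed through the layer.
layer = LearnableGradientConv()
features = layer(torch.randn(1, 1, 112, 112))
```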
The accuracy and reliability of deep learning results are significantly influenced by the quality and quantity of the data used for training. A key bottleneck in deep learning research is securing a sufficient amount of high-quality data, as this directly impacts model performance. Without adequate, well-prepared data, even the most advanced algorithms can underperform, leading to inaccurate or biased predictions. In classification tasks, achieving a balance in the number of samples across categories is equally crucial: imbalanced data can skew the learning process, causing the model to favor the majority class and neglect the minority class, resulting in poor generalization and inaccurate predictions. Data quality and balance are therefore fundamental to deep learning and form the foundation for objective, reliable, and fair results in machine learning systems.
Data imbalance refers to the situation where the number of data samples in certain classes is significantly smaller than in other classes. This issue can negatively impact the training of machine learning models, particularly by reducing the prediction performance for the minority class. Models trained on imbalanced data tend to be biased toward the majority class, as they encounter it more frequently during training. This can result in poor generalization and inaccurate predictions for the minority class. Data imbalance is a common problem across various domains, but it is especially prevalent in the following areas [17]:
Medical Field: Medical datasets often suffer from imbalance because rare diseases or abnormal findings are, by nature, scarce. For example, images of rare medical conditions or abnormal test results are much harder to collect than those of normal or healthy cases. This disparity can significantly affect the diagnostic accuracy of deep learning models. Models may become highly adept at identifying common conditions but fail to accurately diagnose rare or critical conditions, which can lead to harmful misdiagnoses.
Finance: In fraud detection or credit risk prediction, fraudulent transactions or cases of high credit risk occur far less frequently than normal transactions. This creates challenges in developing models that are sensitive enough to detect fraudulent behavior without overfitting to the normal transactions that make up the vast majority of the data, and it can severely compromise the model’s ability to detect fraudulent activities, which are often the most critical to identify.
Cybersecurity: In areas such as malware detection, intrusion detection, or DDoS (Distributed Denial of Service) attack prevention, the amount of malicious activity data is typically much smaller compared to normal network traffic data. Since malicious events are relatively rare compared to regular network activity, this data imbalance can hinder a security system’s ability to accurately detect attacks, leaving the system vulnerable to undetected threats.
Law Enforcement and Forensics: Data imbalance also exists in fields such as crime pattern detection and forensic analysis. For example, rare events such as serial killings or specific criminal behaviors (e.g., anonymous threat letters) are difficult to model due to their infrequency. This scarcity of data makes it hard for predictive models to identify or generalize patterns in such cases, which can have significant real-world implications, such as delays in identifying criminal activity.
To address the issue of data imbalance, various methods have been proposed, each with its own set of advantages and disadvantages [18]. One of the most common approaches is oversampling, where additional samples are generated for the minority class to balance the dataset. This can be achieved by duplicating existing samples or through techniques such as data augmentation, where synthetic samples are created by slightly altering existing data points. Image augmentation methods include random shifts, flips, rotations, brightness adjustments, and combinations of these applied to the designated image data. While oversampling is simple to implement, it can increase the risk of overfitting, especially if duplicated data leads the model to “memorize” specific patterns in the minority class rather than learning to generalize.
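A minimal sketch of such augmentation-based oversampling, assuming PyTorch/torchvision and a hypothetical tensor of minority-class face images, might look as follows.

```python
import torch
from torchvision import transforms

# Hypothetical augmentation pipeline combining the operations mentioned above:
# random shifts, horizontal flips, small rotations, and brightness changes.
augment = transforms.Compose([
    transforms.RandomAffine(degrees=10, translate=(0.05, 0.05)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ColorJitter(brightness=0.2),
])

def oversample_minority(minority_images, target_count):
    """Grow the minority class to target_count by augmenting existing samples.

    minority_images: tensor of shape (N, C, H, W); returns (target_count, C, H, W).
    """
    samples = [img for img in minority_images]
    while len(samples) < target_count:
        src = minority_images[torch.randint(len(minority_images), (1,)).item()]
        samples.append(augment(src))
    return torch.stack(samples)
```

Because each added sample is a perturbed copy rather than an exact duplicate, this mitigates, though it does not eliminate, the memorization risk noted above.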
Another approach is undersampling, which involves reducing the number of samples in the majority class to create a more balanced dataset. This can be conducted randomly or using algorithms that selectively remove redundant data. Undersampling reduces the likelihood of overfitting but comes with the disadvantage of potentially discarding valuable information from the majority class, which may negatively affect the model’s overall performance.
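Random undersampling can be sketched in a few lines; the NumPy-based helper below assumes integer class labels and is purely illustrative.

```python
import numpy as np

def random_undersample(features, labels, rng=None):
    """Randomly drop majority-class samples until all classes are the same size."""
    if rng is None:
        rng = np.random.default_rng(0)
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.min()  # size of the smallest (minority) class
    keep = []
    for cls in classes:
        idx = np.flatnonzero(labels == cls)
        keep.append(rng.choice(idx, size=target, replace=False))
    keep = np.concatenate(keep)
    return features[keep], labels[keep]
```

Selecting exactly `target` indices per class discards majority-class samples at random, which is precisely where the potential loss of valuable information arises.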
Addressing data imbalance is critical for ensuring that machine learning models perform well, especially in applications where the accurate detection of rare events is vital. Choosing the right technique for handling data imbalance depends on the specific problem, the available data, and the performance requirements of the model. Both oversampling and undersampling have their place, but careful consideration must be given to avoid overfitting or the loss of important data. As machine learning continues to be applied in high-stakes domains such as healthcare, finance, and cybersecurity, finding effective solutions to the data imbalance problem remains an essential challenge for researchers and practitioners alike.
In this study, a novel facial forgery detection technique is proposed for effective biometric recognition using innovative deep learning networks and algorithms. The proposed algorithm effectively addresses the chronic issue of data imbalance in deep learning. Additionally, the approach allows for the determination of forgery authenticity without being constrained by specific forgery methods. The implementation and experimental results demonstrate an average classification error rate (ACER) of approximately 0.06, achieving a maximum performance improvement compared to previous studies.
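For reference, ACER is commonly defined as the mean of the attack presentation classification error rate (APCER) and the bona fide presentation classification error rate (BPCER), i.e., ACER = (APCER + BPCER) / 2. The small helper below computes it from binary decisions; the label encoding and function name are illustrative assumptions.

```python
import numpy as np

def acer(y_true, y_pred):
    """ACER = (APCER + BPCER) / 2.

    y_true: 1 = genuine (bona fide), 0 = forgery (attack).
    y_pred: model decisions with the same encoding.
    """
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    attacks = y_true == 0
    bona_fide = y_true == 1
    apcer = np.mean(y_pred[attacks] == 1)    # attacks accepted as genuine
    bpcer = np.mean(y_pred[bona_fide] == 0)  # genuine faces rejected
    return (apcer + bpcer) / 2
```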
The rest of this paper is organized as follows. In Section 2, dataset acquisition and the proposed algorithm are described. The experimental results are analyzed in Section 3, and the overall discussion is presented in Section 4.
The terms “forgery”, “spoofing”, “counterfeiting”, and “manipulation” can have different definitions depending on the context, in both general and technical usage. In this paper, the term “forgery” is used to encompass all of these meanings, as they all involve the creation of false or counterfeit facial information. Likewise, the term “genuine” refers to original, unaltered facial information. This clarification of terms helps avoid confusion and ensures that the results of this study are interpreted correctly.