Siamese Neural Network for User Authentication in Field-Programmable Gate Arrays (FPGAs) for Wearable Applications

Choi, Hyun-Sik

doi:10.3390/electronics12194030

Hyun-Sik Choi

Department of Electronic Engineering, College of IT Convergence Engineering, Chosun University, Gwangju 61452, Republic of Korea

Electronics 2023, 12(19), 4030; https://doi.org/10.3390/electronics12194030

Submission received: 17 August 2023 / Revised: 20 September 2023 / Accepted: 22 September 2023 / Published: 25 September 2023

Abstract

User authentication has traditionally been performed using methods such as passwords or fingerprints. However, passwords have security vulnerabilities, and fingerprints may hinder user convenience. To address these issues, a novel user authentication method based on biosignals, specifically electromyogram (EMG) signals, is proposed. Using biosignals like EMG offers several advantages, including the ability to acquire data without user awareness, independence from the user’s environment, rapid acquisition, and enhanced security. However, one challenge with using EMG signals for authentication has been their relatively low accuracy. In this paper, a neural network is implemented using a small number of parameters (fewer than 7000) to produce a wearable device using biosignals, and user authentication accuracy is secured using the maximal overlap discrete wavelet transform (MODWT) method and the Siamese network. The MODWT method is highly effective for the time and frequency analysis of time series data, and the Siamese network is a representative method for few-shot learning. The proposed neural network is verified using Chosun University’s user authentication dataset, encompassing data from 100 individuals. Finally, this proposed network is implemented on an edge device such as field-programmable gate arrays (FPGAs) so that it can be applied to a wearable user authentication system. By implementing the Siamese network in FPGA-based edge devices, it was possible to secure user authentication performance at 94% accuracy and an authentication speed within 1.5 ms. In the case of accuracy, it is expected to be further improved by using the multimodal technique of biosignals. Also, the proposed system can be easily fabricated for digital integrated chips (ICs).

Keywords:

user authentication; biosignals; electromyogram (EMG); neural network; maximal overlap discrete wavelet transform (MODWT); Siamese network; edge device; field-programmable gate arrays (FPGAs); wearable user authentication system

1. Introduction

User authentication has been studied for a long time. Knowledge-based authentication, which includes passwords and personal identification numbers (PINs), is the least secure approach because it relies on what the user remembers. Possession-based authentication relies on something the user holds, such as a key; typically, a one-time password (OTP) or a smart card is used. These methods are also vulnerable, because the wireless communication can be duplicated [1,2,3]. Biometric authentication is divided into methods that use a biometric signal and methods that use a behavioral signal [4,5,6]. Biometric signals reflect an individual’s unique, lifelong characteristics and include fingerprint and iris recognition. The facial recognition technology recently adopted in smartphones and other devices also belongs to the biometric category. Behavioral signals, on the other hand, include signature and voice recognition, where authentication is based on behavioral patterns. Each of these methods has been used for a long time and has many advantages, but each also has technical limitations. For example, fingerprints offer high security and recognition accuracy, but the user must actively take part in fingerprint acquisition [7,8,9]. Facial recognition is convenient but vulnerable in terms of security [10]. Voice signals are easy to acquire, but recognition accuracy may decrease when short words are used, and they are also vulnerable in terms of security [11].

A trait used for user authentication must be universal, that is, possessed by everyone, and must be clearly distinguishable for each individual. In addition, the recognition rate should be high, users should have no objection to providing the trait, and the possibility of counterfeiting should be low. That is, user authentication technology requires high security, high accuracy, fast response speed, and non-awareness. Recently, biosignal-based user authentication that can satisfy these conditions has been extensively researched [12,13,14,15,16]. Because various biosignals are generated unconsciously and carry different information for each individual, a high recognition rate can be obtained. A continuous authentication technique using electrocardiogram (ECG) biometrics was proposed [17,18,19]. That method extracts features of ECG signals using one-dimensional multi-resolution local binary patterns (1DMRLBP) and performs user authentication based on them. Experimental results demonstrate that it provides higher authentication performance than other continuous authentication techniques. User authentication using EMG signals was also proposed [15]. That method includes a feature extraction stage and a classification stage. In the feature extraction stage, statistical features and wavelet packet energy features are extracted from the raw EMG signals. Experimental results on a dataset of 21 subjects show an accuracy of about 99% in personal recognition using EMG signals. However, this accuracy may be inflated by the small number of test subjects. There are also proposals for user authentication using both photoplethysmography (PPG) and electroencephalography (EEG) signals [20,21,22,23,24]. In this paper, a user authentication method based on EMG signals is proposed because these signals are easy to acquire and well suited to wearable devices. 
While biosignals offer user convenience, their recognition rate remains a concern, especially in wearable environments with limited hardware resources [25]. Additionally, the small dataset of EMG signals available for user authentication can significantly impact the recognition rate. Efforts have been made to implement a neural network with limited hardware resources [26,27,28]. However, the use of small hardware resources leads to a decrease in accuracy. For example, in the case of a structure using an artificial neural network, only 81.6% user authentication accuracy could be secured [28].

Compared to existing user authentication systems using biosignals, the biggest advantages of the proposed method in this paper are (A) the small neural network with guaranteed accuracy, (B) the validation process involving Chosun University’s extensive dataset, encompassing data from 100 individuals, and (C) the practical implementation in FPGAs for wearable applications. To secure (A), the MODWT method and the Siamese network are used. The MODWT method is advantageous for implementation on low-resource hardware due to its low computation requirements and optimized performance for time series data analysis. The Siamese network, known for its few-shot learning capability, enables easy learning when new data are added and ensures a high recognition rate with a small amount of training data [29,30,31,32,33]. This network consists of twin networks with identical structures and weights. These twin networks receive a pair of inputs and calculate the similarity between the two inputs by calculating the features of each input. This can be applied to classification problems, and the network learns in a way that minimizes the distance between data of the same class and increases the distance between data belonging to different classes. Although it may be challenging to apply to generalized networks, it is well suited for the user authentication environment using EMG signals [34,35,36,37].

Regarding (C), from a hardware deployment perspective, this method is implemented using FPGAs, which can be easily converted to digital logic.

A wearable system is crucial for biosignal-based user authentication, and for this purpose, deployment to edge devices is required. There are many types of edge devices, but in this paper, since digital logic implementation is targeted, it was first implemented in FPGA. Digital logic implemented in FPGAs can be easily manufactured with ICs. As edge devices, a small-scale CPU/GPU can be used, but the CPU/GPU performs sequential operations according to instruction sets, so it takes a long time to execute them [38,39,40]. In contrast, FPGAs can perform parallel operations, making them suitable for high-speed user authentication. However, since FPGA is a hardware implementation, it has limited resources for configuring networks. To address this constraint, the network’s size is reduced while maintaining accuracy. This is achieved by employing a one-dimensional convolution layer and limiting the number of parameters in each layer to 4000 or fewer. As a result, a lightweight network with approximately 7000 parameters achieves 94% recognition accuracy with hardware deployment. Furthermore, the use of MODWT and a Siamese network also enables the creation of lightweight networks with high accuracy. Additionally, a bit quantization method is implemented to reduce resource usage, confirming the feasibility of a low-power, high-speed, and highly secure user authentication system using EMG with edge devices, even in low-cost FPGAs designed for artificial intelligence applications.

Section 2 of this paper discusses the EMG signal acquisition method for the user authentication system, while Section 3 provides an in-depth exploration of the Siamese network’s architecture. Section 4 presents the overall hardware structure and verification process through high-level synthesis (HLS) for hardware deployment. Finally, Section 5 covers the conclusions, discussions, and future research directions.

2. Signal Acquisition

2.1. Chosun University’s Dataset

For the EMG signal input, “Chosun University’s dataset” is used [41]. This dataset provides EMG signals from the palmaris longus and extensor digitorum of the right arm for 100 subjects, with an average age of 24 years (ranging from 19 to 70 years). The data were acquired using a Biopac MP160 instrument in two channels. Although increasing the number of channels in the EMG signal acquisition could improve accuracy, this study used two channels to prioritize user convenience in wearable environments [42]. The EMG measurements were performed for 12 motions, but only the data for fist clenching were used for user authentication. This selection simplifies the EMG signal acquisition process and aligns with user convenience. The EMG signal was sampled at a rate of 2000 Hz with a 16-bit analog-to-digital converter (ADC) resolution, resulting in a total of 8000 data points over 4 s. Since the primary information in EMG signals lies within the range of 10 to 250 Hz, a sampling rate of 2000 Hz is sufficient for EMG signal measurement [43,44,45]. Each operation was repeated twice during the 4 s interval. Of the 100 individuals in the EMG dataset, 80 were used for the learning phase, and 20 were used for testing. A pair of EMG signals from the same person was labeled class 1, while a pair of EMG signals from different individuals was labeled class 0. The Siamese network takes two EMG recordings (measured from the same person or from different persons) as input and outputs the corresponding class. The measured EMG signals were normalized to have a mean of 0 and a standard deviation of 1. Figure 1 and Figure 2 illustrate the normalized EMG signals measured when two different people perform the same motion and when the same person performs the same motion, respectively. An EMG signal is obtained by converting the small muscle vibrations that occur during motion into an electrical signal. 
Consequently, unlike ECG signals with distinct characteristics, EMG signals exhibit highly fluctuating waveforms, making it challenging to distinguish between signals from the same person and different individuals [46]. For this study, no additional data augmentation was conducted, and learning and evaluation were performed using only the measured signal itself.
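The normalization and pair labeling described above can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's actual preprocessing: the function names, the subject identifiers, the toy signals, the use of the population standard deviation, and the choice to form every possible pair are all assumptions.

```python
import itertools
import statistics

def zscore(signal):
    """Scale a signal to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    return [(x - mean) / std for x in signal]

def make_pairs(recordings):
    """Build (signal_a, signal_b, label) triples.

    `recordings` maps a subject id to a list of that subject's signals.
    Label 1 marks a same-person pair, label 0 a different-person pair.
    """
    items = [(sid, sig) for sid, sigs in recordings.items() for sig in sigs]
    pairs = []
    for (sid_a, sig_a), (sid_b, sig_b) in itertools.combinations(items, 2):
        pairs.append((sig_a, sig_b, 1 if sid_a == sid_b else 0))
    return pairs

# Toy data: two hypothetical subjects with two short recordings each.
recordings = {
    "s01": [zscore([1.0, 2.0, 3.0, 4.0]), zscore([2.0, 2.5, 3.5, 5.0])],
    "s02": [zscore([4.0, 1.0, 3.0, 0.0]), zscore([5.0, 0.5, 2.0, 1.0])],
}
pairs = make_pairs(recordings)
```

Forming all pairs produces many more class 0 pairs than class 1 pairs as the number of subjects grows, so a real pipeline would probably balance the two classes by sampling.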

2.2. Feature Extraction

EMG signals are primarily time series data, but the highest accuracy achieved using the raw time series alone was about 60%. This indicates that the time series characteristics of EMG signals alone do not provide sufficient features for accurate user authentication. To improve accuracy, methods utilizing empirical mode decomposition (EMD) and continuous wavelet transform (CWT) have been proposed [47,48,49]. EMD is a technique for decomposing nonlinear and nonstationary time series data into intrinsic mode functions (IMFs), which constitute the characteristic elements of the time series. Unlike the traditional filter bank method, EMD calculates IMFs based on the unique oscillation structure of the signal, without limiting the frequency bands. Implementing the EMD and CWT methods resulted in an accuracy of over 90%.

In this paper, feature extraction was conducted using the MODWT method, similar to the approaches mentioned above. The MODWT method offers advantages in hardware implementation as it requires less computation compared to CWT and maintains a constant dataset shape, making it easy to calculate. Additionally, it provides various frequency characteristics, contributing to increased accuracy. While the CWT method involves substantial data and computation, the MODWT is a special form of DWT that maintains data size and enables straightforward calculation. In the hardware implementation, a finite impulse response (FIR) filter is utilized with digital logic [50]. By applying the MODWT method, time and frequency analysis are employed for classifying EMG signals, leading to high accuracy with fewer hardware resources. The algorithm for MODWT is based on a time series X with an arbitrary sample size N, where the jth level MODWT wavelet (W_j*) and scaling (V_j*) coefficients are computed [50].

W_{j,t}^{*} = \sum_{l=0}^{L_{j}-1} h_{j,l}^{*} X_{(t-l) \bmod N}

(1)

V_{j,t}^{*} = \sum_{l=0}^{L_{j}-1} g_{j,l}^{*} X_{(t-l) \bmod N}

(2)

where h_{j,l}^{*} = h_{j,l}/2^{j/2} are the MODWT wavelet filters, and g_{j,l}^{*} = g_{j,l}/2^{j/2} are the MODWT scaling filters. In addition, the decomposition level was set to 4 because the performance increased up to level 4. This is related to the frequency bands into which the signal is decomposed.

The results of MODWT with a decomposition level of 4 are shown in Figure 3. Like the Siamese network, the MODWT module is also implemented inside the FPGA using digital logic. The MODWT outputs are given as inputs to the Siamese network. Because a level-4 decomposition yields four sets of wavelet coefficients (W_1^{*} to W_4^{*}) and one set of scaling coefficients (V_4^{*}), the number of input channels of the Siamese network is 5.
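To make the decomposition concrete, the following Python sketch computes a level-4 MODWT with circular indexing and returns five channels. It is only an illustration under stated assumptions: the paper does not name the wavelet, so the Haar filter is assumed here for brevity, and the function name and test signal are hypothetical. The code uses the pyramid form, in which each level filters the previous scaling coefficients with filter taps spaced 2^(j-1) samples apart; this is equivalent to applying the level-j filters of Equations (1) and (2) directly to X.

```python
import math

def modwt_haar(x, levels=4):
    """Return [W_1, ..., W_levels, V_levels] for the sequence x.

    Uses the pyramid form of the MODWT with circular (mod N) indexing.
    The Haar filter is an assumption made for brevity.
    """
    n = len(x)
    # DWT Haar filters divided by sqrt(2) give the MODWT filters.
    h = [0.5, -0.5]   # wavelet (high-pass) filter
    g = [0.5, 0.5]    # scaling (low-pass) filter
    v = list(x)
    bands = []
    for j in range(1, levels + 1):
        step = 2 ** (j - 1)          # filter taps are spaced 2^(j-1) apart
        w_next = [0.0] * n
        v_next = [0.0] * n
        for t in range(n):
            for l in range(len(h)):
                sample = v[(t - step * l) % n]
                w_next[t] += h[l] * sample
                v_next[t] += g[l] * sample
        bands.append(w_next)
        v = v_next
    bands.append(v)   # final scaling coefficients
    return bands

# Hypothetical test signal with two frequency components.
signal = [math.sin(0.3 * t) + 0.5 * math.sin(2.1 * t) for t in range(64)]
channels = modwt_haar(signal, levels=4)
```

Every channel keeps the length of the input, which matches the constant data shape noted in the text, and the total energy of the five channels equals the energy of the input signal.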

3. Siamese Network

The Siamese network’s strength lies in its ability to learn effectively with a small dataset, which is particularly advantageous for user authentication using EMG data. Moreover, it can achieve high accuracy even when new datasets are introduced through transfer learning [35]. Given the difficulty of achieving accuracy with biosignals such as EMG, the Siamese network is applied to maintain high accuracy despite limited learning data. Using the Siamese network, it is possible to secure user authentication accuracy of over 95% with only 10 individuals’ EMG data in the proposed structure in this paper. However, since the Siamese network uses a twin network structure, it includes twice as many hardware resources when implemented. This poses challenges for applying Siamese networks in edge devices, as there is a trade-off between accuracy and network size. Thus, when configuring the Siamese network, the priority is placed on using minimal hardware resources while maintaining high accuracy levels.

To minimize hardware resource usage, a one-dimensional convolution layer is employed, and the number of convolution layers in the Siamese network is kept to a minimum. The Keras framework is used to train the entire network. For hardware application, quantized Keras (QKeras), which uses quantized bits during learning, can be more efficient. By constructing the network with QKeras, high accuracy can be achieved with a reduced number of bits during the learning phase. This reduction in bit precision significantly reduces hardware resource usage while maintaining high accuracy. In QKeras, “quantized_bits” is the syntax for bit quantization. For example, “quantized_bits(9,3)” means that the total quantized bits are 9 bits, and the number that can be represented is a value between −4 and 3. Unfortunately, however, it has been confirmed that accuracy decreases significantly when complex networks such as the proposed Siamese network are deployed to hardware, or when QKeras and Keras are used together. Therefore, in this paper, the network is trained using Keras, and the number of bits is reduced during hardware deployment. In this process, accuracy may be slightly reduced, but the use of hardware resources can be greatly reduced, so bit optimization is very important. The first in, first out (FIFO) logic that stores data after each operation occupies the largest part of the total hardware resources, and the bit precision of the operation result has the most substantial impact on the size of this logic. Thus, hardware resource usage is highly sensitive to the number of bits.

To address overfitting, batch normalization and dropout layers are used for each convolution layer. A dropout rate of about 0.3 is used. The activation function for the one-dimensional convolution layer is the rectified linear unit (ReLU), and padding is not utilized. The kernel of the convolution layer uses L1 regularization with a parameter of 0.001. After each convolution layer, a max pooling layer is added to effectively reduce the number of features. Each Siamese network comprises four convolution layers and three max pooling layers, with no pruning techniques used. Since the number of weights and biases in the proposed structure is already small, applying pruning leads to a significant increase in loss. The twin networks are designed to share weights and biases, resulting in a decrease in the number of trainable parameters. The shape of each layer was chosen using KerasTuner to obtain the smallest structure while minimizing the decrease in accuracy.

The resulting feature values after the computations are transformed into distances for each feature using a merge layer, where smaller distances correspond to values closer to class 1. The distances are derived through fundamental Euclidean distance calculations. To combine them, two additional dense layers are added, enabling weighted sum operations. The final output is represented using the sigmoid activation function. The overall structure of the Siamese network is visually depicted in Figure 4. In Siamese networks, considering shared weights, the number of parameters decreases. The total number of parameters is 6835, with approximately 6643 of them being trainable, a notably small number. This demonstrates the efficient utilization of hardware resources. About 41% of the trainable parameters are found in the convolution layers, while about 59% are located in the dense layers. However, the number of layers increases in Siamese networks, leading to an upsurge in hardware resource utilization for storing intermediate results.
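The merge stage described above can be illustrated with a small Python sketch: a per-feature distance between the two branch outputs, two dense layers that form weighted sums, and a sigmoid output. This is not the trained network. The feature vectors stand in for the outputs of the shared convolution branch, and all weights, biases, layer sizes, and the ReLU in the hidden layer are hypothetical values chosen only so that a small distance gives a score near class 1.

```python
import math

def feature_distance(feat_a, feat_b):
    """Per-feature Euclidean distance; for scalar features this is |a - b|."""
    return [math.sqrt((a - b) ** 2) for a, b in zip(feat_a, feat_b)]

def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def similarity(feat_a, feat_b, w1, b1, w2, b2):
    """Map two feature vectors to a score in (0, 1); near 1 means same person."""
    d = feature_distance(feat_a, feat_b)
    hidden = [max(0.0, v) for v in dense(d, w1, b1)]  # ReLU is assumed here
    return sigmoid(dense(hidden, w2, b2)[0])

# Hypothetical weights: the hidden unit sums the distances and the output
# unit turns a large summed distance into a low score.
w1, b1 = [[1.0, 1.0, 1.0]], [0.0]
w2, b2 = [[-4.0]], [2.0]

same = similarity([0.2, 0.9, 0.4], [0.2, 0.9, 0.4], w1, b1, w2, b2)
different = similarity([0.2, 0.9, 0.4], [0.9, 0.1, 0.8], w1, b1, w2, b2)
```

Because both inputs pass through the same branch before this stage, the branch weights are stored only once, which is why sharing weights keeps the parameter count low.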

The learning process was conducted based on the proposed Siamese networks. The training dataset was composed of EMG data from 80 individuals. In addition, the learning process was performed using the validation data of 25% of the total training data. In this case, binary cross-entropy was used as the loss function for training. The adaptive moment estimation (adam) optimizer was used. In the learning process, optimization of various hyperparameters is required. For example, in the case of a representative hyperparameter such as learning rate (lr), the initial starting value was 0.0001. However, this is too small in the current structure, so there was a problem in finding a global solution. With the help of several optimization tools, the final starting value of lr was set to 0.001. Through this, the starting value of lr was increased to make it easy to find a global solution. In this case, the lr value was changed using the step decay function during the learning process, and the change factor was set to 0.5. The minimum lr value was set to 10⁻⁷. This approach allowed for the fine tuning of lr to an appropriate value. Similar to this method, the batch size that is another hyperparameter for the learning process is determined as 32 after the optimization process. Table 1 provides the initial and final values of parameters related to the structural optimization and learning algorithms used in this paper.
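The step decay schedule can be written as a short function. The initial value of 0.001, the factor of 0.5, and the minimum of 10⁻⁷ come from the text; the number of epochs between steps is not stated, so the value of 10 used here is purely an assumption, as is the function name.

```python
def step_decay(epoch, initial_lr=1e-3, factor=0.5, epochs_per_step=10, min_lr=1e-7):
    """Halve the learning rate every `epochs_per_step` epochs, with a floor.

    `epochs_per_step` is a hypothetical value; the source does not give it.
    """
    lr = initial_lr * factor ** (epoch // epochs_per_step)
    return max(lr, min_lr)

# Learning rate at the start of a few selected epochs.
schedule = {epoch: step_decay(epoch) for epoch in (0, 10, 20, 200)}
```

A function with this shape could be passed to a per-epoch learning rate callback in a training framework, with the floor preventing the rate from shrinking below the stated minimum.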

The devised neural network has a total of 6643 trainable parameters. Such a small number of trainable parameters is essential for keeping hardware resource usage low. High accuracy can be secured with so few parameters because of the accurate time and frequency feature extraction by MODWT and the use of the Siamese network. The parameters are updated through the learning process, after which the accuracy, recall, and precision for user authentication can be calculated. These evaluation indicators are calculated from four counts: true positives (TP), actual true data that the model correctly recognizes as true; true negatives (TN), actual false data that the model correctly recognizes as false; false positives (FP), actual false data that the model mistakenly recognizes as true; and false negatives (FN), actual true data that the model mistakenly recognizes as false. The following equations define accuracy, recall, and precision, respectively. These metrics offer valuable insights into the model’s performance in user authentication scenarios.

\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}

(3)

\mathrm{Recall} = \frac{TP}{TP + FN}

(4)

\mathrm{Precision} = \frac{TP}{TP + FP}

(5)

Using the above parameters, the F1 score, which is often used for comparison between neural networks, can be defined as follows:

\mathrm{F1\ score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}

(6)

The accuracy of training data including validation data is about 95%, while the recall of training data is approximately 99.6%. Finally, the precision of training data including validation data is about 91.3%. The F1 score, a crucial metric for assessment, is calculated to be approximately 0.953. These results are significant as they are based on measurement data for various age groups, making user authentication for 80 individuals highly meaningful. The user authentication performance on the dataset actually measured from a diverse set of individuals shows the possibility of user authentication using EMG signals. Particularly noteworthy is that this accuracy is achieved with only 6835 weights and biases. This design allows for the creation of hardware suitable for a wearable user authentication system. To achieve higher accuracy, the utilization of multiple biosignals can be considered, and increasing the size of the Siamese network may significantly improve accuracy.
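As a quick check of Equations (3) to (6), the metrics can be computed with a few Python functions. The function names and the confusion-matrix counts below are hypothetical; only the precision and recall passed to the last call are taken from the text.

```python
def accuracy(tp, tn, fp, fn):
    """Equation (3): share of all pairs that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    """Equation (4): share of same-person pairs that were accepted."""
    return tp / (tp + fn)

def precision(tp, fp):
    """Equation (5): share of accepted pairs that were truly same-person."""
    return tp / (tp + fp)

def f1_score(prec, rec):
    """Equation (6): harmonic mean of precision and recall."""
    return 2 * prec * rec / (prec + rec)

# Hypothetical counts, used only to exercise the formulas.
tp, tn, fp, fn = 90, 80, 20, 10
acc = accuracy(tp, tn, fp, fn)
rec = recall(tp, fn)
prec = precision(tp, fp)

# Precision and recall reported for the training data.
reported_f1 = f1_score(0.913, 0.996)
```

With the reported precision of 91.3% and recall of 99.6%, Equation (6) gives a value that rounds to 0.953, in agreement with the F1 score stated for the training data.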

For the Siamese network, accuracy, recall, and precision were also analyzed on new data without retraining. Siamese networks adapt well to new data because they determine similarity by calculating feature distances. For this analysis, EMG data from 20 individuals who were not included in the initial learning process were used. The accuracy, recall, and precision for the new data without retraining were approximately 93%, 96.5%, and 89.4%, respectively, and the F1 score was about 0.93. This result reflects the relatively high adaptability of the Siamese network to new data. Even when the network was retrained using transfer learning, accuracy improved by only approximately 1.5%. Consequently, retraining is not required when a new user is added to the previously trained Siamese network.

Finally, Figure 5 shows the receiver operating characteristic (ROC) curve and area under curve (AUC). A ROC curve is often used to evaluate the performance of a model that distinguishes classes, and there is a false positive rate (FPR) on the x-axis and a true positive rate (TPR) on the y-axis. Currently, the AUC is over 99% for the EMG data of 80 individuals used in the training dataset. In this study, higher accuracy was achieved with fewer parameters by using the MODWT method and the Siamese network.
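The quantities plotted in an ROC curve can be computed directly from the network's output scores. The sketch below is a generic illustration with hypothetical scores and labels, not the evaluation code used in the paper. It computes one (FPR, TPR) point for a given threshold and obtains the AUC from its rank interpretation: the probability that a randomly chosen class 1 pair receives a higher score than a randomly chosen class 0 pair.

```python
def roc_point(scores, labels, threshold):
    """Return (FPR, TPR) when scores >= threshold are accepted as class 1."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos

def roc_auc(scores, labels):
    """AUC as the chance that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical sigmoid outputs and true classes for six input pairs.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
fpr, tpr = roc_point(scores, labels, threshold=0.5)
auc = roc_auc(scores, labels)
```

Sweeping the threshold from 1 down to 0 and plotting the resulting points traces the full curve; the pairwise AUC computation is quadratic in the number of samples, which is acceptable for a small illustration.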

4. Hardware Deployment

The proposed neural network must be implemented on an edge device to be applied to wearable devices. A small CPU/GPU can be used for this purpose, but because it operates on sequential instructions, it takes a long time to run. Computation time is especially long for the small-scale processors suitable for wearable devices. In this paper, the edge device is implemented using FPGAs to ensure rapid operation. Circuits implemented with FPGAs can be transitioned directly to an IC built from digital logic. Figure 6 shows a conceptual diagram of the entire wearable system for user authentication, including an EMG sensor and an edge device for a neural network.

The Siamese network, initially implemented with Keras, can be translated into digital logic through the use of the Verilog hardware description language (HDL) for hardware deployment. Verilog HDL is a language specifically designed for digital logic and is relatively easy to configure, as its syntax is similar to that of C/C++. In Verilog HDL, the network is transformed into NAND gates, NOR gates, etc., through synthesis, creating the digital logic. Siamese networks implemented in Verilog HDL can be easily programmed into FPGAs. However, for Siamese networks with repetitive convolution layers and dense layers, it is more efficient to define and utilize each layer using high-level synthesis (HLS).

HLS is a technique that automatically converts software code written in C/C++, a language familiar to users, into Verilog HDL. Writing large logic directly in Verilog HDL can be time-consuming, but HLS significantly reduces hardware development time and increases user convenience [50]. Additionally, HLS provides various #pragma-based methods for efficient hardware deployment, enabling hardware optimization. For example, #pragma directives enable hardware features such as pipelining and loop unrolling. The HLS code written in this manner is then converted into Verilog HDL and synthesized into digital logic. This streamlined process allows the creation of digital logic suitable for application-specific integrated circuits (ASICs), enabling the manufacturing of user authentication systems using EMG signals in wearable devices.

Figure 7 shows, as an example, the HLS code for the dense layer, written in C/C++. The dense layer, as defined in Keras, is described in C/C++ and then converted to Verilog HDL using HLS. When writing directly in Verilog HDL, memory elements for the weights need to be created, and the digital logic for each operation must account for factors such as the number of bits. When using HLS, however, only the multiplication of inputs and weights and the addition of these products need to be defined, as in ordinary C/C++. HLS automatically handles the hardware-related parts, maximizing user convenience. Each layer, such as the one-dimensional convolution layer, batch normalization layer, max pooling layer, and activation layer, was implemented using HLS to match the proposed Siamese network developed in Keras. To simplify this process, high-level synthesis for machine learning (hls4ml) eases hardware deployment by configuring and combining basic utilities corresponding to each layer [51]. Other hardware implementation methods include Vivado SDAccel and NVIDIA deep learning accelerator (NVDLA), where the core part directly converts a neural network written in Caffe or similar frameworks into register transfer level (RTL) [52]. In this paper, hls4ml served as the primary hardware deployment method.

When hls4ml is used for hardware deployment, an error may occur because the reuse factor is not defined in the structure for the merge layer. Thus, reuse_factor = 1 must be defined in the merge_config structure in the “nnet_merge.h” file in nnet_utils before use. The resource-based method was used for optimization, with a reuse factor of about 50. The advantage of hls4ml is that it can create a structure with reduced hardware resources by employing a reuse factor. The reuse factor implies the use of the same hardware for multiple calculations. A high reuse factor is bad for latency, but it is an effective way to reduce the usage of hardware resources. However, even in this case, additional hardware resources are required due to the use of FIFOs for data storage, so an appropriate reuse factor needs to be set. In terms of bit precision, the data type of HLS used with the similar meaning as “quantized_bits(9,3)” used in QKeras is “ap_fixed<9,4>” with sign bit. Bit precision directly affects accuracy and hardware resource usage, and an optimal solution was constructed after experimenting with various configurations. In the case of Keras, a floating-point number of 32 bits is used as a default. Through bit optimization, a fixed-point number is set so that the use of hardware resources is reduced while the reduction in accuracy is minimized. Based on this, “ap_fixed<16,4>” was used for the convolution layer, and “ap_fixed<20,4>” was used for the dense layers. This bit optimization can reduce the use of hardware resources while minimizing the reduction in accuracy. The accuracy, recall, and precision of the implemented hardware for EMG data from 80 individuals are approximately 94%, 99%, and 90%, respectively. The F1 score reaches about 0.943. The AUC is around 99% using hardware deployment. Figure 8 depicts the ROC curve in the digital logic implemented on the FPGA.
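The effect of a fixed-point type such as “ap_fixed<16,4>” can be approximated in Python to see how much precision a given bit width keeps. In ap_fixed<W,I>, W is the total number of bits and I is the number of integer bits including the sign bit, so W − I bits remain for the fraction. The sketch below is a simplified software model with a hypothetical function name: it truncates toward minus infinity, which corresponds to the default quantization mode of the type, but it saturates on overflow for readability, whereas the default hardware behavior is to wrap around.

```python
import math

def quantize_ap_fixed(value, width, int_bits):
    """Approximate ap_fixed<width, int_bits> (signed, int_bits includes sign).

    Truncation toward minus infinity mirrors the default AP_TRN mode.
    Saturation on overflow is a simplification; the HLS default wraps.
    """
    frac_bits = width - int_bits
    scale = 2 ** frac_bits
    lowest = -(2 ** (int_bits - 1))
    highest = 2 ** (int_bits - 1) - 1 / scale
    quantized = math.floor(value * scale) / scale
    return min(max(quantized, lowest), highest)

# The same weight stored with the convolution-layer and 9-bit formats.
conv_weight = quantize_ap_fixed(0.3, 16, 4)    # 12 fractional bits
coarse_weight = quantize_ap_fixed(0.3, 9, 4)   # 5 fractional bits
conv_error = abs(0.3 - conv_weight)
coarse_error = abs(0.3 - coarse_weight)
```

With four integer bits, the representable range is from −8 up to just below 8 in both formats; the additional fractional bits only reduce the rounding error, which is the trade-off between accuracy and FIFO size discussed above.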

When configured in an actual FPGA, communication is established using the advanced extensible interface (AXI) structure, and the generated intellectual property (IP) for the artificial neural network is utilized. The FPGA chipset “xcku035-fbva676-3-e” was used, and the device utilization is shown in Figure 9. Using a Siamese network increases the size of the network due to the use of two identical networks, leading to an increase in hardware resources due to numerous calculations between layers, resulting in a higher utilization of block random access memory (BRAM) resources for FIFOs. Switching to address-based operations like ultra RAM (URAM) can reduce resource usage in the FPGA. Through this hardware deployment on an FPGA or ASIC, after acquiring an EMG signal from the wrist, user authentication can be performed using the Siamese network, and the result can be transmitted immediately. This enables wearable user authentication and the production of an accurate and security-enhanced user authentication system.

Finally, a performance analysis focusing on operation speed was conducted. Unlike software applications, the wearable user authentication system does not need to transmit EMG measurement data to a server or receive results, which makes real-time operation possible. Additionally, because it is deployed on an edge device capable of parallel computation, computation is faster than in software applications. Fast response time is a crucial factor in a user authentication system. Table 2 compares the accuracy and operation time of the CPU and the FPGA for the proposed Siamese network. The CPU used is an Intel Core i7-7700HQ, and the inference time for 3000 samples was considered. In the CPU case, the measured EMG must be transmitted to a server, which requires additional time for transmission and reception. In contrast, the FPGA performs inference in about half that time owing to parallel computation and eliminates the need for data transmission to an external server, facilitating real-time operation. The FPGA latency is increased by the high reuse factor chosen to reduce hardware resource usage. As a result, the difference in inference time is small compared with a high-performance CPU, but the FPGA is still fast compared with a low-performance CPU. The proposed system is capable of completing the operation within 1.5 ms, accurately determining whether the user is authorized.

5. Conclusions and Discussions

In conclusion, this paper introduces a wearable user authentication system designed for user convenience, authentication without user awareness, and high security. To pursue these objectives, EMG signals, which are easily acquired, were chosen for authentication without user awareness. However, achieving high accuracy with EMG signals can be challenging. To address this, a small neural network with high accuracy, suitable for wearable devices, is proposed using MODWT and a Siamese network. For accurate feature extraction, the MODWT method, which is capable of time and frequency analysis, is selected. Furthermore, because the training data available in a user authentication environment are usually limited, the Siamese network, which is suitable for one-shot learning and achieves high accuracy, is chosen. The proposed network structure is optimized in Keras. The optimized neural network is implemented in FPGAs for edge devices and can be easily converted to digital logic. The small number of parameters and the small structure make it possible to implement it as a wearable system, and the low bit precision helps to make use of limited hardware resources. To analyze the accuracy and response time of the proposed system, Chosun University’s EMG dataset was used, taking EMG data from two channels for a single motion. Using data from about 100 individuals supports the reliability of the proposed user authentication system. In addition, 20 datasets from this pool were set aside as new data, and it was confirmed that user authentication is possible without retraining, because the Siamese network can work with the existing training results. The accuracy, recall, and precision of the proposed hardware system on the FPGA for 80 individuals are about 94%, 99%, and 90%, respectively.
In addition, the computation time for one inference is about 1.5 ms, and data transmission to an external server is not required, so a wearable system capable of real-time operation can be manufactured. By achieving high accuracy and fast response time with a small network, this method demonstrates the feasibility of a wearable user authentication system using biosignals such as EMG signals. However, the accuracy must be increased for real applications by using the multimodal technique of biosignals. In addition, efficient computational algorithms must be developed to use fewer hardware resources. In the future, the implementation of the entire circuit, including the EMG sensor, and an analysis of power consumption are planned.

Funding

This study was supported by a research fund from Chosun University (2022).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data supporting the findings of the article are available in reference number [41].

Conflicts of Interest

The author declares no conflict of interest.

References

1. Ahmed, A.A.; Wendy, K.; Kabir, M.N.; Sadiq, A.S. Dynamic reciprocal authentication protocol for mobile cloud computing. IEEE Syst. J. 2020, 15, 727–737.
2. Ahmad, S.; Abdeljaber, H.A.; Nazeer, J.; Uddin, M.Y.; Lingamuthu, V.; Kaur, A. Issues of clinical identity verification for healthcare applications over mobile terminal platform. Wirel. Commun. Mob. Comput. 2022, 1, 6245397.
3. Govindraj, V.J.; Yashwanth, P.V.; Bhat, S.V.; Ramesh, T.K. Smart door using biometric NFC band and OTP based methods. In Proceedings of the International Conference for Emerging Technology (INCET), Belgaum, India, 5–7 June 2020; pp. 1–4.
4. Baig, A.F.; Eskeland, S. Security, privacy, and usability in continuous authentication: A survey. Sensors 2021, 21, 5967.
5. El_Tokhy, M.S. Robust multimodal biometric authentication algorithms using fingerprint, iris and voice features fusion. J. Intell. Fuzzy Syst. 2021, 40, 647–672.
6. Maiorana, E. A survey on biometric recognition using wearable devices. Pattern Recognit. Lett. 2022, 156, 29–37.
7. Abdullahi, S.M.; Wang, H.; Li, T. Fractal coding-based robust and alignment-free fingerprint image hashing. IEEE Trans. Inf. Forensics Secur. 2020, 15, 2587–2601.
8. Tan, T.N.; Lee, H. High-secure fingerprint authentication system using ring-LWE cryptography. IEEE Access 2019, 7, 23379–23387.
9. Bian, W.; Gope, P.; Cheng, Y.; Li, Q. Bio-AKA: An efficient fingerprint based two factor user authentication and key agreement scheme. Future Gener. Comput. Syst. 2020, 109, 45–55.
10. Anwarul, S.; Dahiya, S. A comprehensive review on face recognition methods and factors affecting facial recognition accuracy. Proc. ICRIC Recent Innov. Comput. 2020, 1, 495–514.
11. Bunrit, S.; Inkian, T.; Kerdprasop, N.; Kerdprasop, K. Text-independent speaker identification using deep learning model of convolution neural network. Int. J. Mach. Learn. Comput. 2019, 9, 143–148.
12. Prakash, A.J.; Patro, K.K.; Hammad, M.; Tadeusiewicz, R.; Pławiak, P. BAED: A secured biometric authentication system using ECG signal based on deep learning techniques. Biocybern. Biomed. Eng. 2022, 42, 1081–1093.
13. Ahamed, F.; Farid, F.; Suleiman, B.; Jan, Z.; Wahsheh, L.A.; Shahrestani, S. An intelligent multimodal biometric authentication model for personalized healthcare services. Future Internet 2022, 14, 222.
14. Tatar, A.B. Biometric identification system using EEG signals. Neural Comput. Appl. 2023, 35, 1009–1023.
15. Lu, L.; Mao, J.; Wang, W.; Ding, G.; Zhang, Z. A study of personal recognition method based on EMG signal. IEEE Trans. Biomed. Circuits Syst. 2020, 14, 681–691.
16. Bidgoly, A.J.; Bidgoly, H.J.; Arezoumand, Z. A survey on methods and challenges in EEG based authentication. Comput. Secur. 2020, 93, 101788.
17. Louis, W.; Komeili, M.; Hatzinakos, D. Continuous authentication using one-dimensional multi-resolution local binary patterns (1DMRLBP) in ECG biometrics. IEEE Trans. Inf. Forensics Secur. 2016, 11, 2818–2832.
18. Arteaga-Falconi, J.S.; Al Osman, H.; El Saddik, A. ECG authentication for mobile devices. IEEE Trans. Instrum. Meas. 2015, 65, 591–600.
19. Hosseinzadeh, M.; Vo, B.; Ghafour, M.Y.; Naghipour, S. Electrocardiogram signals-based user authentication systems using soft computing techniques. Artif. Intell. Rev. 2021, 54, 667–709.
20. Hinatsu, S.; Suzuki, D.; Ishizuka, H.; Ikeda, S.; Oshiro, O. Evaluation of PPG feature values toward biometric authentication against presentation attacks. IEEE Access 2022, 10, 41352–41361.
21. Siam, A.I.; Elazm, A.A.; El-Bahnasawy, N.A.; El Banby, G.M.; Abd El-Samie, F.E. PPG-based human identification using Mel-frequency cepstral coefficients and neural networks. Multimed. Tools Appl. 2021, 80, 26001–26019.
22. Labati, R.D.; Piuri, V.; Rundo, F.; Scotti, F. Photoplethysmographic biometrics: A comprehensive survey. Pattern Recognit. Lett. 2022, 156, 119–125.
23. Stergiadis, C.; Kostaridou, V.D.; Veloudis, S.; Kazis, D.; Klados, M.A. A Personalized User Authentication System Based on EEG Signals. Sensors 2022, 22, 6929.
24. Zeynali, M.; Seyedarabi, H. EEG-based single-channel authentication systems with optimum electrode placement for different mental activities. Biomed. J. 2019, 42, 261–267.
25. He, J.; Jiang, N. Biometric from surface electromyogram (sEMG): Feasibility of user verification and identification based on gesture recognition. Front. Bioeng. Biotechnol. 2020, 8, 58.
26. Xu, H.; Guo, M.; Nedjah, N.; Zhang, J.; Li, P. Vehicle and pedestrian detection algorithm based on lightweight YOLOv3-promote and semi-precision acceleration. IEEE Trans. Intell. Transp. Syst. 2022, 23, 19760–19771.
27. Wang, X.A.; Weng, J.; Ma, J.; Yang, X. Cryptanalysis of a public authentication protocol for outsourced databases with multi-user modification. Inf. Sci. 2019, 488, 13–18.
28. Shin, S.; Jung, J.; Kim, Y.T. A study of an EMG-based authentication algorithm using an artificial neural network. In Proceedings of the IEEE Sensors, Glasgow, UK, 29 October–1 November 2017; pp. 1–3.
29. Zhang, Z.; Peng, H. Deeper and wider siamese networks for real-time visual tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4591–4600.
30. Shen, J.; Tang, X.; Dong, X.; Shao, L. Visual object tracking by hierarchical attention siamese network. IEEE Trans. Cybern. 2019, 50, 3068–3080.
31. Fu, K.; Fan, D.P.; Ji, G.P.; Zhao, Q.; Shen, J.; Zhu, C. Siamese network for RGB-D salient object detection and beyond. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 5541–5559.
32. Liang, Z.; Shen, J. Local semantic siamese networks for fast tracking. IEEE Trans. Image Process. 2019, 29, 3351–3364.
33. Roy, S.K.; Harandi, M.; Nock, R.; Hartley, R. Siamese networks: The tale of two manifolds. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 3046–3055.
34. Yousif, A.S.; Omar, Z.; Sheikh, U.U. An improved approach for medical image fusion using sparse representation and Siamese convolutional neural network. Biomed. Signal Process. Control 2022, 72, 103357.
35. Hazratifard, M.; Agrawal, V.; Gebali, F.; Elmiligi, H.; Mamun, M. Ensemble Siamese Network (ESN) Using ECG Signals for Human Authentication in Smart Healthcare System. Sensors 2023, 23, 4727.
36. Li, Z.; Wang, H.; Liu, X. A one-dimensional Siamese few-shot learning approach for ECG classification under limited data. In Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), virtual, Mexico, 1–5 November 2021; pp. 455–458.
37. Fan, B.; Liu, X.; Su, X.; Hui, P.; Niu, J. Emgauth: An emg-based smartphone unlocking system using siamese network. In Proceedings of the IEEE International Conference on Pervasive Computing and Communications (PerCom), Austin, TX, USA, 23–27 March 2020; pp. 1–10.
38. Xu, X.; Li, H.; Xu, W.; Liu, Z.; Yao, L.; Dai, F. Artificial intelligence for edge service optimization in internet of vehicles: A survey. Tsinghua Sci. Technol. 2021, 27, 270–287.
39. Deng, S.; Zhao, H.; Fang, W.; Yin, J.; Dustdar, S.; Zomaya, A.Y. Edge intelligence: The confluence of edge computing and artificial intelligence. IEEE Internet Things J. 2020, 7, 7457–7469.
40. Amin, S.U.; Hossain, M.S. Edge intelligence and Internet of Things in healthcare: A survey. IEEE Access 2020, 9, 45–59.
41. Kim, J.S.; Song, C.H.; Bak, E.; Pan, S.B. Multi-Session Surface Electromyogram Signal Database for Personal Identification. Sustainability 2022, 14, 5739.
42. Zwarts, M.J.; Stegeman, D.F. Multichannel surface EMG: Basic aspects and clinical utility. Muscle Nerve Off. J. Am. Assoc. Electrodiagn. Med. 2003, 28, 1–17.
43. Ives, J.C.; Wigglesworth, J.K. Sampling rate effects on surface EMG timing and amplitude measures. Clin. Biomech. 2003, 18, 543–552.
44. Doheny, E.P.; Goulding, C.; Flood, M.W.; Mcmanus, L.; Lowery, M.M. Feature-based evaluation of a wearable surface EMG sensor against laboratory standard EMG during force-varying and fatiguing contractions. IEEE Sens. J. 2019, 20, 2757–2765.
45. Asghar, A.; Jawaid Khan, S.; Azim, F.; Shakeel, C.S.; Hussain, A.; Niazi, I.K. Review on electromyography based intention for upper limb control using pattern recognition for human-machine interaction. Proc. Inst. Mech. Eng. Part H J. Eng. Med. 2022, 236, 628–645.
46. Zhao, K.; Guo, J.; Guo, S.; Fu, Q. Design of fatigue grade classification system based on human lower limb surface emg signal. In Proceedings of the IEEE International Conference on Mechatronics and Automation (ICMA), Guilin, China, 7–10 August 2022; pp. 1015–1020.
47. Jadhav, P.; Rajguru, G.; Datta, D.; Mukhopadhyay, S. Automatic sleep stage classification using time–frequency images of CWT and transfer learning using convolution neural network. Biocybern. Biomed. Eng. 2020, 40, 494–504.
48. Belkhou, A.; Achmamad, A.; Jbari, A. Classification and diagnosis of myopathy EMG signals using the continuous wavelet transform. In Proceedings of the Scientific Meeting on Electrical-Electronics & Biomedical Engineering and Computer Science (EBBT), Istanbul, Turkey, 24–26 April 2019; pp. 1–4.
49. Ozdemir, M.A.; Kisa, D.H.; Guren, O.; Akan, A. Hand gesture classification using time–frequency images and transfer learning based on CNN. Biomed. Signal Process. Control 2022, 77, 103787.
50. Choi, H.-S. Electromyogram (EMG) Signal Classification Based on Light-Weight Neural Network with FPGAs for Wearable Application. Electronics 2023, 12, 1398.
51. Fast Machine Learning Lab. Available online: https://github.com/fastmachinelearning/ (accessed on 5 August 2022).
52. Farshchi, F.; Huang, Q.; Yun, H. Integrating NVIDIA deep learning accelerator (NVDLA) with RISC-V SoC on FireSim. In Proceedings of the 2nd Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications (EMC2), Washington, DC, USA, 17 February 2019; pp. 21–25.

Figure 1. EMG signals obtained from two different individuals, (a) a 27-year-old person and (b) a 43-year-old person.

Figure 2. EMG signals obtained from the same person, 45 years of age. (a) Leading time and (b) lagging time.

Figure 3. MODWT execution results. (a) Original signal, (b) level 1, (c) level 2, (d) level 3, (e) level 4, (f) residual.

Figure 4. Structure of Siamese network.

Figure 5. ROC curve for training data.

Figure 6. Conceptual diagram of the entire wearable system for user authentication.

Figure 7. HLS code for dense layer.

Figure 8. ROC curve for hardware deployment.

Figure 9. Device utilization for FPGA chipset “xcku035-fbva676-3-e”.

Table 1. Initial and final values of parameters related to structural optimization and learning algorithms.

Category	Final Value	Initial Value
Number of convolution layers	4	6
Filter number	[8, 8, 16, 16]	32~64
Learning rate (starting value, change factor)	0.001, 0.5	0.0001, 0.5
Dropout rate	0.3	0.5
Batch size	32	256

Table 2. Comparison of accuracy and response speed between CPU and FPGA.

Category	Accuracy	Recall	Precision	Inference Time (3000 Samples)
CPU	95%	99.6%	91.3%	8.64 s
FPGA	94%	99%	90%	4.32 s
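
As a consistency check on the FPGA row of Table 2, the sketch below computes accuracy, recall, and precision from a confusion matrix. The paper does not report the confusion matrix, so the counts are hypothetical: they assume that the 3000 test pairs are split evenly between genuine and impostor pairs and are chosen so that the three metrics reproduce the reported values.

```python
def metrics(tp, fn, fp, tn):
    """Return (accuracy, recall, precision) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn)        # share of genuine pairs that are accepted
    precision = tp / (tp + fp)     # share of accepted pairs that are genuine
    return accuracy, recall, precision

# Hypothetical counts (assumption): 1500 genuine and 1500 impostor pairs.
tp, fn = 1485, 15      # genuine pairs accepted / rejected
fp, tn = 165, 1335     # impostor pairs accepted / rejected

accuracy, recall, precision = metrics(tp, fn, fp, tn)
print(f"accuracy={accuracy:.2%} recall={recall:.2%} precision={precision:.2%}")
```

Under this assumed even split, a recall of 99% and a precision of 90% imply an accuracy of 94%, so the three reported FPGA values are mutually consistent. The lower precision relative to recall corresponds to impostor pairs being accepted more often than genuine pairs are rejected.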

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
