Article

DDP-FedFV: A Dual-Decoupling Personalized Federated Learning Framework for Finger Vein Recognition

1
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
2
Jiangsu High Technology Research Key Laboratory for Wireless Sensor Networks, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
3
College of Automation & College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
4
College of Information Science and Technology and College of Artificial Intelligence, Nanjing Forestry University, Nanjing 210037, China
*
Author to whom correspondence should be addressed.
Sensors 2024, 24(15), 4779; https://doi.org/10.3390/s24154779
Submission received: 20 June 2024 / Revised: 21 July 2024 / Accepted: 22 July 2024 / Published: 23 July 2024
(This article belongs to the Special Issue Biometrics Recognition Systems)

Abstract

Finger vein recognition methods, as emerging biometric technologies, have attracted increasing attention in identity verification due to their high accuracy and live detection capabilities. However, as privacy protection awareness increases, traditional centralized finger vein recognition algorithms face privacy and security issues. Federated learning, a distributed training method that protects data privacy without sharing data across endpoints, is gradually being promoted and applied. Nevertheless, its performance is severely limited by heterogeneity among datasets. To address these issues, this paper proposes a dual-decoupling personalized federated learning framework for finger vein recognition (DDP-FedFV). The DDP-FedFV method combines generalization and personalization. In the first stage, the DDP-FedFV method implements a dual-decoupling mechanism involving model and feature decoupling to optimize feature representations and enhance the generalizability of the global model. In the second stage, the DDP-FedFV method implements a personalized weight aggregation method, federated personalization weight ratio reduction (FedPWRR), to optimize the parameter aggregation process based on data distribution information, thereby enhancing the personalization of the client models. To evaluate the performance of the DDP-FedFV method, theoretical analyses and experiments were conducted based on six public finger vein datasets. The experimental results indicate that the proposed algorithm outperforms centralized training models without increasing communication costs or privacy leakage risks.

1. Introduction

Finger vein recognition technology is an emerging biometric technology that uses the distribution pattern of veins within the finger for identity verification. This technology is highly accurate, has live detection capabilities, and is noninvasive, making it suitable for various identity verification applications.
To extract more representative finger vein features, deep learning algorithms have become the mainstream feature extraction methods in finger vein recognition, considerably improving recognition accuracy [1,2,3,4]. Training a robust deep learning model typically requires a large amount of finger vein image data [4,5]. However, due to privacy protection laws [6,7,8,9], research institutions and companies cannot collect or share extensive finger vein datasets. As a result, these organizations can train models using only limited local data, leading to suboptimal accuracy and generalizability. Federated learning (FL) addresses this issue by enabling multiple clients to collaboratively train a model while ensuring data privacy and security. FL methods facilitate the training of high-performance models without sharing datasets, which aligns with the privacy and high-accuracy requirements of biometric recognition, and thus show great application potential. Currently, FL methods have been applied in areas such as facial recognition [10,11,12,13,14] and iris recognition [15,16], but research on finger vein recognition remains limited: only one published study has proposed an FL-based method, FedFV [17]. In that study, only the size of each client’s dataset was used to calculate the aggregation weight matrix, while the heterogeneity among clients’ finger vein datasets was not considered, resulting in significantly different performance across clients.
The heterogeneity among client datasets must be addressed when applying FL to finger vein recognition. In practice, institutions and companies collect finger vein images under different conditions, with various equipment, resolutions, external environments, and capture areas, resulting in highly heterogeneous image sets. Existing methods have not effectively addressed this heterogeneity, leading to suboptimal performance. Figure 1 shows images from two different client datasets. The main parts of these images are similar, consisting of vein patterns, but there are substantial differences in lighting, angle, background, and other aspects. In finger vein model training, the primary task is to accurately capture and analyze vein pattern features to construct a highly robust recognition model. However, during actual training, the model learns its parameters from the entire input image, which includes both valuable foreground vein information and considerable background information. When the parameters of different clients, which couple both kinds of information, are uploaded to the server for aggregation, the global model parameters are influenced by the client-specific background information, which can reduce the generalizability of the aggregated model. A reasonable approach is to distinguish foreground and background information. Foreground information, as common knowledge in finger vein recognition tasks, should be aggregated at the server. In contrast, background information, reflecting the personalized characteristics of each client, should remain local to mitigate the negative impact of dataset heterogeneity on model performance.
Based on the above analysis, this paper introduces the concept of decoupling [14] to better utilize both types of information. Decoupling is performed at two levels: the model level and the feature level. Model decoupling involves separating the feature extractor from the classifier, with the former focusing on improving model generalizability and the latter focusing on capturing personalized information. Additionally, feature decoupling is introduced to distinguish between foreground and background information. The decoupled common foreground information is used for collaborative training to obtain a global domain-invariant model with strong generalizability. The personalized background information is used in the second stage for client similarity measurements to train personalized models for each client. Based on the above ideas, this paper proposes the two-stage dual-decoupling personalized federated learning framework for finger vein recognition (DDP-FedFV), which protects data privacy while training high-performance personalized models for each client. The main contributions of this paper can be summarized as follows.
(1) In this paper, the concepts of generalizability and personalization are combined by designing a two-stage FL algorithm. In the first stage, the generalizability of the global server model is improved, while in the second stage, personalized information is emphasized during fine-tuning to derive model parameters better suited for each client (N-Model Personalized FL).
(2) To enhance model generalizability, this paper introduces a dual-decoupling strategy in the first stage of training. This is the first work in the field of finger vein recognition to incorporate this approach. The server aggregates the common information from various heterogeneous datasets to train a more robust model.
(3) To obtain personalized models for each client, an innovative personalized federated aggregation scheme, FedPWRR, is proposed in this framework. FedPWRR fully leverages the similarity between clients and the size of the datasets to calculate personalized aggregation weights. This weighted aggregation strategy enables the development of high-performance personalized models for each client.
(4) This paper presents a theoretical analysis of the convergence of the DDP-FedFV method. Experiments were conducted based on six public heterogeneous finger vein datasets, with the results demonstrating that the DDP-FedFV algorithm performs stably and reliably for clients with significantly different data distributions.

2. Related Work

2.1. Finger Vein Recognition

There has been significant progress in finger vein recognition technology in recent years. The main processes include preprocessing, finger vein feature extraction, and finger vein feature matching. Feature extraction is the core step, and two main types of methods have been developed: traditional methods and deep learning-based methods.
Traditional finger vein feature extraction methods can be divided into four categories: minutia-based, dimensionality reduction-based, local texture-based, and vein pattern-based methods [18,19,20,21]. These methods can achieve a certain level of accuracy. However, because they rely on manually designed features, they are highly subjective and may require redesigning features or adjusting parameters when handling new datasets, resulting in poor scalability and robustness. In contrast, deep learning-based methods learn more comprehensive features without manual intervention. Thus, deep learning-based finger vein recognition methods have gradually replaced traditional methods and become popular research topics. Liu et al. [22] employed a seven-layer convolutional neural network (CNN) with five convolutional layers and two fully connected layers to process finger vein images. Chen et al. [23] proposed a finger vein recognition algorithm based on feature block fusion and a CNN, improving the deep network input by using a set of feature points in the finger vein images. Abbas [24] introduced a new hybrid architecture combining a CNN and a long short-term memory (LSTM) network to enhance vein detection performance. Ma et al. [25] used an ant colony algorithm for region of interest (ROI) extraction and integrated a dual attention network (DANet) with EfficientNetV2, providing new ideas for deep learning-based finger vein recognition methods.
Overall, deep learning methods outperform traditional methods that require manual feature extraction. However, deep learning methods typically require large amounts of data for model training. Due to privacy protection, data security, and cost constraints, it is challenging for individual research institutions to collect large amounts of finger vein data, and they cannot share their datasets. Consequently, the lack of sufficient data is increasingly becoming a bottleneck for enhancing network performance. In this context, enabling multiple institutions to collaboratively train more effective models without sharing raw data, and thus while meeting privacy requirements, has become increasingly important.

2.2. Finger Vein Recognition Based on Federated Learning

As a distributed machine learning framework, FL [26] enables model training through collaboration among clients without sharing raw data. FL methods simultaneously protect the privacy of client data and enhance model performance for each client, making FL highly suitable for development requirements in the biometric recognition field.
Currently, FL has demonstrated its unique value and broad application potential in biometric recognition, with substantial progress in face recognition [10,11,12,13,14] and iris recognition [15,16]. However, the application of FL to finger vein recognition is still in the exploratory stage. Among the limited research, Lian et al. [17] were the first to propose an FL-based finger vein recognition algorithm, FedFV. They designed a personalized parameter aggregation method to address the long processing times and high computational costs of existing personalized aggregation algorithms. This method also mitigates a limitation of traditional FL algorithms, in which clients with less data contribute less to the model. However, due to the strong heterogeneity of finger vein datasets, FedFV does not provide an optimal solution, resulting in limited model performance and poorer performance for some clients.
To address the aforementioned limitations, in this paper, generalizability and personalization are both considered to design a two-stage FL model. In the first stage, a dual-decoupling mechanism is utilized to enhance the generalizability of the global server model, mitigating the adverse effects of heterogeneity among different datasets. In the second stage, the FedPWRR algorithm is designed to use the similarity information of the decoupled background features from the first stage to customize the model parameters for each client.

3. Methodology

This section provides a detailed introduction to the problem definition of the FL-based finger vein recognition algorithm. Building on this, the overall framework of the DDP-FedFV method and the detailed design of each module are introduced.

3.1. Problem Description

This section defines the problem addressed by the FL-based finger vein recognition algorithm. Assume there are $N$ different clients in the client set $C$, denoted as $C_k$ ($k = 1, \ldots, N$). The $k$-th client holds a local private dataset $D_k \subseteq X_k \times Y_k$, where $X_k$ and $Y_k$ represent the feature space and label space of the $k$-th client’s dataset, respectively. The size of each private dataset is $M_k$, and the total size of all the clients’ datasets is $M$. In addition to the $N$ clients, a server is defined to perform operations such as model parameter initialization and aggregation during each round of communication. After the global $(t - 1)$-th round of communication, the server aggregates the model parameters $\theta_t$ and distributes them to each client. The objective of the FL-based finger vein recognition algorithm is shown in Equation (1).
$$\min_{\theta} R_{emp}(\theta) = \frac{1}{N} \sum_{k=1}^{N} f_k(\theta) \tag{1}$$
Here, $R_{emp}(\theta)$ represents the empirical risk of the model parameters $\theta$ across all participating clients. $f_k(\theta) = \mathbb{E}_{(x,y) \sim D_k}\left[\mathcal{L}(\theta; x, y)\right]$ denotes the empirical risk of the model parameters $\theta$ for the $k$-th client. $\mathcal{L}(\theta; x, y)$ is the loss function defined by the algorithm, where $x$ and $y$ represent the features and labels of the samples in dataset $D_k$.

3.2. Framework of the DDP-FedFV Method

This section introduces the framework of the DDP-FedFV method. A schematic diagram is shown in Figure 2. The overall training process is divided into two stages. In the first stage, the heterogeneity among finger vein datasets is addressed by applying a dual-decoupling strategy to the local models. In the second stage, a similarity-based aggregation scheme is used to personalize and fine-tune the performance of the global model obtained in the first stage.
The pseudocode for the DDP-FedFV method is shown in Algorithm 1. For more detailed descriptions of these two stages, see Section 3.3 and Section 3.4. We define the total number of global communication rounds for the entire framework as $T$. The proportion of global communication rounds allocated to the first stage is $\rho$, meaning that the number of global communication rounds in the first stage, $T_g$, is $\rho T$. Consequently, the number of global communication rounds in the second stage is $T - T_g$.
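The split of communication rounds between the two stages can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_rounds` is hypothetical, and rounding $\rho T$ to the nearest integer is an assumption, since the text only states $T_g = \rho T$.

```python
from typing import Tuple


def split_rounds(total_rounds: int, rho: float) -> Tuple[int, int]:
    """Return (T_g, T_p): rounds for the generalization and personalization stages."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    # Assumption: rho * T is rounded to the nearest integer number of rounds.
    t_g = int(round(rho * total_rounds))
    return t_g, total_rounds - t_g


# Illustrative values only; the paper's actual T and rho are not assumed here.
t_g, t_p = split_rounds(100, 0.8)
```

With these illustrative values, the first 80 rounds would run the generalization stage and the remaining 20 the personalization stage.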
Algorithm 1. DDP-FedFV
Parameters: number of global epochs $T$, number of local epochs $E$, weight aggregation matrix $W \in S^{N \times N}$, generalization ratio $\rho$
Initialize $\omega^{DI}$, $\omega^{DS}$, $\omega^{Dec}$, $\varphi$
Compute $T_g \leftarrow \rho T$; $T_P \leftarrow T - T_g$
Generalization Phase
Server executes:
  for $t = 0, 1, \ldots, T_g - 1$ do
    for each client $k$ in parallel do
       $(\omega_{t,k}^{DI}, \omega_{t,k}^{DS}, \omega_{t,k}^{Dec}, \varphi_{t,k}) \leftarrow$ ClientUpdate$(k, \omega_{t,k}^{DI}, \omega_{t,k}^{DS}, \omega_{t,k}^{Dec}, \varphi_{t,k})$
       $\omega_{t,k}^{DS}$, $\omega_{t,k}^{Dec}$, and $\varphi_{t,k}$ are always stored locally without communication.
     Obtain $\omega_{t}^{DI}$ for this round $t$ through Equation (6) and send it to all clients.
Personalization Phase
Server executes:
  Use FedPWRR to compute the weight matrix $W$
  for $t = T_g, \ldots, T$ do
    for each client $k$ in parallel do
       $(\omega_{t,k}^{DI}, \omega_{t,k}^{DS}, \omega_{t,k}^{Dec}, \varphi_{t,k}) \leftarrow$ ClientUpdate$(k, \omega_{t,k}^{DI}, \omega_{t,k}^{DS}, \omega_{t,k}^{Dec}, \varphi_{t,k})$
       $\omega_{t,k}^{DS}$, $\omega_{t,k}^{Dec}$, and $\varphi_{t,k}$ are always stored locally without communication.
    for each client $k$ in parallel do
       Obtain $\omega_{t+1,k}^{DI}$ through Equation (8)
       Send $\omega_{t+1,k}^{DI}$ to client $k$ and simultaneously update the parameters.

ClientUpdate$(k, \omega_{t}^{DI}, \omega_{t}^{DS}, \omega_{t}^{Dec}, \varphi_{t})$:    // run for client $k$
   Set $\omega_{t,k}^{DI} \leftarrow \omega_{t}^{DI}$, $\omega_{t,k}^{DS} \leftarrow \omega_{t}^{DS}$, $\omega_{t,k}^{Dec} \leftarrow \omega_{t}^{Dec}$, $\varphi_{t,k} \leftarrow \varphi_{t}$
   $L(\omega_{k}^{DI}, \omega_{k}^{DS}, \omega_{k}^{Dec}, \varphi_{k})$ = Equation (2)
  for each local epoch do
     $(\omega_{t+1,k}^{DI}, \omega_{t+1,k}^{DS}, \omega_{t+1,k}^{Dec}, \varphi_{t+1,k}) = (\omega_{t,k}^{DI}, \omega_{t,k}^{DS}, \omega_{t,k}^{Dec}, \varphi_{t,k}) - \eta \nabla L(\omega_{t,k}^{DI}, \omega_{t,k}^{DS}, \omega_{t,k}^{Dec}, \varphi_{t,k})$
   return $\omega_{t+1,k}^{DI}, \omega_{t+1,k}^{DS}, \omega_{t+1,k}^{Dec}, \varphi_{t+1,k}$

3.3. The First Phase of the DDP-FedFV Method

The primary function of this phase is to train a global model with strong generalizability. The core idea is to use a dual-decoupling mechanism, namely, model decoupling and feature decoupling. This section provides a detailed explanation of these mechanisms. Additionally, specific details of the FL process, such as the configuration of the loss function and the parameter aggregation method executed on the server, are thoroughly discussed.
The first decoupling performed for each client is model decoupling, which involves separating the neural network into two parts: the feature extractor and the classifier. In traditional FL frameworks, the model parameters for all clients are uploaded to the server for aggregation. However, this method is not suitable for practical finger vein recognition applications because the number of finger vein categories varies across clients, leading to different dimensions in the classifiers of each client. This discrepancy directly hinders parameter aggregation. Therefore, in the DDP-FedFV method, the model parameters are decoupled into two parts, $\theta_k = (\omega_k, \varphi_k)$, where $\omega_k$ is the feature extractor of the $k$-th client and $\varphi_k$ is the classifier of the $k$-th client. During the FL process, the feature extractor parameters for each client, which have the same dimensions, are uploaded for aggregation. This helps to improve model generalizability. However, the classifier parameters of each client remain local because they effectively capture the distribution patterns of the local data [27,28], thereby benefiting the personalization of the client models.
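Model decoupling can be illustrated with a short sketch. It assumes, purely for illustration, that each client's parameters are stored in a dictionary whose classifier entries share a name prefix; the names `split_model`, `extractor.w`, and `classifier.w` are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Tuple

Params = Dict[str, List[float]]


def split_model(params: Params, classifier_prefix: str = "classifier.") -> Tuple[Params, Params]:
    """Split one client's parameters into (feature extractor, classifier)."""
    extractor = {name: value for name, value in params.items()
                 if not name.startswith(classifier_prefix)}
    classifier = {name: value for name, value in params.items()
                  if name.startswith(classifier_prefix)}
    return extractor, classifier


# Two hypothetical clients: same extractor shape, different numbers of classes.
client_a = {"extractor.w": [0.1, 0.2], "classifier.w": [0.0] * 3}
client_b = {"extractor.w": [0.3, 0.4], "classifier.w": [0.0] * 5}

upload_a, local_a = split_model(client_a)  # only upload_a would leave the client
upload_b, local_b = split_model(client_b)
```

Because the two classifiers have different lengths, they cannot be averaged element by element, whereas the extractor parts have matching shapes and can be aggregated.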
Next, the second decoupling of the model, feature decoupling, is performed. This involves extracting domain-invariant and domain-specific features from finger vein images using two corresponding models. After the first decoupling, the trained feature extractor parameters still contain considerable domain-specific information related to the background information and image noise in the finger vein images. These features vary significantly among the finger vein datasets of different clients. Directly aggregating these coupled feature extractor parameters is not conducive to improving model generalizability. Inspired by [14], this paper introduces domain disentanglement learning. The feature extractor of the $k$-th client ($k = 1, 2, \ldots, N$) is divided into two parts: the domain-invariant model $\omega_k^{DI}$ and the domain-specific model $\omega_k^{DS}$. This approach decouples the feature representations in the client’s dataset into domain-invariant and domain-specific features. The domain-invariant model parameters are uploaded to the server to train a global model with strong generalizability. Moreover, both the classifier $\varphi$ and the domain-specific model parameters are kept local to capture the distribution characteristics of the local dataset. These can be used in the second stage for client similarity measurements.
Next, we provide a detailed introduction to the training process, including the choice of the backbone network, the loss function configuration, and the server parameter aggregation method used in this phase. We adopted MobileNetV3 [29] as the backbone network for the feature extractor. The loss function for the k-th finger vein client is designed as shown in Equation (2).
$$L(\omega_k^{DI}, \omega_k^{DS}, \omega_k^{Dec}, \varphi_k) = L_J(\omega_k^{DI}, \varphi_k) + L_D(\omega_k^{DI}, \omega_k^{DS}) + L_R(\omega_k^{DI}, \omega_k^{DS}, \omega_k^{Dec}) \tag{2}$$
The combined loss function $L_J$ is used to obtain the classification loss, as calculated in Equation (3). In this equation, $L_S$ represents SoftmaxLoss [30], $L_C$ represents CenterLoss [31], and $\lambda$ is a hyperparameter used to balance the weights of SoftmaxLoss and CenterLoss. During the training of the local model of the $k$-th client, the domain-invariant feature $\chi_k^{I}$ is fed into the classifier $\varphi_k$ to obtain the classification loss.
$$L_J(\omega_k^{DI}, \varphi_k) = L_S + \lambda L_C \tag{3}$$
$L_D$ is the orthogonality loss between the domain-invariant and domain-specific features, which is calculated as shown in Equation (4). To ensure that the features extracted by $\omega_k^{DI}$ and $\omega_k^{DS}$ adequately represent the domain-invariant and domain-specific information of the finger vein data, the soft subspace orthogonality constraint [14] is applied in Equation (4).
$$L_D(\omega_k^{DI}, \omega_k^{DS}) = \sum_{x \sim D_k} \left\| (\chi_k^{I})^{\top} \chi_k^{S} \right\|_F^2 \tag{4}$$
$L_R$ is the reconstruction loss, which is calculated as shown in Equation (5). The domain-invariant feature $\chi_k^{I}$ and domain-specific feature $\chi_k^{S}$ extracted for the $k$-th client are input into a decoder to reconstruct an image consistent with the original finger vein image. The difference between the reconstructed image and the actual input image is the reconstruction error. Minimizing this loss enhances the model’s ability to capture important information from the original data, allowing the model to learn high-quality parameters. Here, $\omega_{t,k}^{Dec}$ is the local image decoder of the $k$-th client in the global $t$-th communication round. This decoder uses domain-invariant and domain-specific features to reconstruct the finger vein image, which is used to calculate the reconstruction loss.
$$L_R(\omega_k^{DI}, \omega_k^{DS}, \omega_k^{Dec}) = \sum_{x \sim D_k} \left\| \mathrm{Dec}_k(\chi_k^{I} + \chi_k^{S}) - x \right\|_2^2 \tag{5}$$
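The loss terms in Equations (2) to (5) can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions rather than the authors' implementation: features are given as lists of per-sample vectors, the decoder output is passed in already computed (the decoder itself is not modelled), and the function names are hypothetical.

```python
from typing import Sequence


def orthogonality_loss(feat_inv: Sequence[Sequence[float]],
                       feat_spec: Sequence[Sequence[float]]) -> float:
    """Squared Frobenius norm of feat_inv^T @ feat_spec (rows are samples)."""
    d_inv, d_spec = len(feat_inv[0]), len(feat_spec[0])
    total = 0.0
    for a in range(d_inv):
        for b in range(d_spec):
            # Entry (a, b) of the matrix product feat_inv^T @ feat_spec.
            entry = sum(ri[a] * rs[b] for ri, rs in zip(feat_inv, feat_spec))
            total += entry * entry
    return total


def reconstruction_loss(reconstructed: Sequence[float], original: Sequence[float]) -> float:
    """Squared L2 distance between a decoded image and the input image (flattened)."""
    return sum((r - x) ** 2 for r, x in zip(reconstructed, original))


def total_loss(l_softmax: float, l_center: float, l_orth: float,
               l_rec: float, lam: float) -> float:
    """Equation (2): joint classification loss plus orthogonality and reconstruction terms."""
    joint = l_softmax + lam * l_center  # Equation (3)
    return joint + l_orth + l_rec
```

The orthogonality term is zero when the columns of the two feature matrices are mutually orthogonal over the batch, which is the sense in which the constraint pushes the two extractors toward encoding different information.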
In the first stage, the domain-invariant models of each client are aggregated on the server using the method described in Equation (6).
$$\bar{\omega}_{t+1}^{DI} = \mathbb{E}_k\left[\omega_{t+1,k}^{DI}\right] \tag{6}$$
Here, $\bar{\omega}_{t+1}^{DI}$ refers to the expected value obtained by the server after aggregating the domain-invariant model parameters from all clients in the $t$-th global communication round. The server then distributes this aggregated value to each client.
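A minimal sketch of the first-stage aggregation follows. It assumes that the expectation in Equation (6) is taken uniformly over clients, so that it reduces to a plain average; the source does not state the weighting explicitly, and the function name `aggregate_mean` is hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def aggregate_mean(client_params: List[Params]) -> Params:
    """Average the domain-invariant parameters of all clients, entry by entry."""
    n = len(client_params)
    if n == 0:
        raise ValueError("at least one client is required")
    reference = client_params[0]
    return {
        name: [sum(p[name][i] for p in client_params) / n
               for i in range(len(reference[name]))]
        for name in reference
    }


# Two hypothetical clients with a single two-element parameter tensor.
global_di = aggregate_mean([{"w": [1.0, 3.0]}, {"w": [3.0, 5.0]}])
```

Only the domain-invariant parameters are passed to this function; the domain-specific model, decoder, and classifier never leave the client.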

3.4. The Second Phase of the DDP-FedFV Method

This section primarily introduces the core algorithm used in the second stage, FedPWRR, and the FL process applied during this stage. The main objectives in this phase are to measure the similarity of the data distributions across clients and to obtain personalized aggregated model parameters for each client, which further improves model performance.
The main objective of FedPWRR is to utilize the similarity of the data distributions across clients to generate a weight matrix. In the server, this matrix is used to customize the model parameters for each client. The similarity of the data distributions is measured based on domain-specific model information. After the first stage of training, the domain-specific model captures information that reflects the distribution of the finger vein datasets. The similarity between domain-specific models is used to obtain the similarity matrix between clients. FedPWRR calculates the weight aggregation matrix based on this similarity matrix and the size of each client’s dataset to personalize the domain-invariant model parameters aggregated for each client.
To describe the FedPWRR algorithm, the trans function is defined, as shown in Equation (7).
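$$\mathrm{trans}(q_1, q_2, \ldots, q_N) = \left( \frac{q_1}{\sum_{i=1}^{N} q_i}, \frac{q_2}{\sum_{i=1}^{N} q_i}, \ldots, \frac{q_N}{\sum_{i=1}^{N} q_i} \right) \tag{7}$$
The input to the trans function is a one-dimensional vector $(q_1, q_2, \ldots, q_N)$, and the output is a one-dimensional vector of the same length whose entries sum to 1.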
The pseudocode for the second stage of the algorithm, FedPWRR, is shown in Algorithm 2.
Algorithm 2. FedPWRR
Parameters: size of each client dataset $M_k$ ($k = 1, 2, \ldots, N$), domain-specific feature extractor of each finger vein client $\omega_{T_g,k}^{DS}$, weight assigned to clients with negative similarity $r$, base weight scaling factor for each client $rr$
Initialize: two empty matrices $s$ and $W0$ to store the results of intermediate calculations
Server executes:
  Step 1: Obtain $(e_1, e_2, \ldots, e_N)$ via the trans operation based on $(M_1, M_2, \ldots, M_N)$.
  Step 2: The similarity assessment is performed based on the parameters of the last layer of $\omega_{T_g,k}^{DS}$ uploaded by each finger vein client, and a symmetric similarity matrix $\Phi$ is obtained using the cosine similarity assessment algorithm.
  Step 3:
  for each client $k$ in parallel do
    Count the clients with negative similarity to client $k$ as $cnt$.
    for each $i$-th client with negative similarity to client $k$ do
       $s[k][i] = \frac{r}{\max(cnt, 1)}$
    for each $j$-th client with positive similarity to client $k$ do
       $s[k][j] = \frac{\Phi[k][j]}{\sum_{u:\, \Phi[k][u] > 0} \Phi[k][u]} \times (1 - r)$
    Set $s[k][k] = 1$ and then let $s[k] = \mathrm{trans}(s[k])$
    Then, $W0[k] = \mathrm{trans}((e_1, e_2, \ldots, e_N) \circ s[k])$
  Step 4: The individual weights of the final personalized aggregation matrix can be calculated using the following equation:
$$W[k][j] = \begin{cases} (1 - rr) + rr \times W0[k][j], & \text{when } k = j \\ rr \times W0[k][j], & \text{otherwise} \end{cases}$$
  Return the weight matrix $W$
Through the above steps, the server can compute the aggregation weight matrix W at the beginning of the second stage, which is then used in the subsequent personalized aggregation process. In the second stage, FedPWRR needs to calculate the aggregation weight matrix only once, resulting in low computational overhead. Moreover, this approach comprehensively considers the similarity of data distributions across clients, making more efficient use of the available information.
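The steps of FedPWRR can be sketched in plain Python as below. This is a hedged illustration rather than the authors' code. It assumes that trans normalizes a vector to sum to 1, that $\circ$ denotes the element-wise product, that clients with exactly zero similarity receive zero weight (the source does not cover this case), and that each client's last domain-specific layer is supplied as a flat list; all function and variable names are hypothetical.

```python
import math
from typing import List, Sequence


def trans(q: Sequence[float]) -> List[float]:
    """Assumed reading of trans: scale the vector so that its entries sum to 1."""
    total = sum(q)
    if total == 0:
        raise ValueError("cannot normalize a vector that sums to zero")
    return [x / total for x in q]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def fed_pwrr(sizes: Sequence[float], ds_last_layers: Sequence[Sequence[float]],
             r: float, rr: float) -> List[List[float]]:
    n = len(sizes)
    e = trans(sizes)                                        # Step 1
    phi = [[cosine(ds_last_layers[k], ds_last_layers[j])    # Step 2
            for j in range(n)] for k in range(n)]
    weights = []
    for k in range(n):                                      # Step 3
        negative = [i for i in range(n) if phi[k][i] < 0]
        positive = [j for j in range(n) if phi[k][j] > 0]   # includes k itself
        pos_sum = sum(phi[k][u] for u in positive)
        s = [0.0] * n                                       # zero similarity keeps weight 0 (assumption)
        for i in negative:
            s[i] = r / max(len(negative), 1)
        for j in positive:
            s[j] = phi[k][j] / pos_sum * (1 - r)
        s[k] = 1.0
        s = trans(s)
        w0 = trans([e[j] * s[j] for j in range(n)])
        # Step 4: reserve (1 - rr) for the client's own model.
        weights.append([(1 - rr) + rr * w0[j] if j == k else rr * w0[j]
                        for j in range(n)])
    return weights


# Hypothetical inputs: three clients, two-element "last layers".
W = fed_pwrr([100, 200, 300], [[1.0, 0.0], [1.0, 0.1], [-1.0, 0.0]], r=0.1, rr=0.5)
```

In this toy example, client 0 gives more weight to the similar client 1 than to the dissimilar client 2, even though client 2 holds more data, and every row of `W` sums to 1 so that each aggregated model is a convex combination of client models.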
The following section describes how the server aggregates personalized model parameters for each client during the second stage of training. The server uses the aggregation weight matrix W calculated by FedPWRR to update the domain-invariant model parameters for the k-th client according to Equation (8).
$$\omega_{t+1,k}^{DI} = \sum_{z \in C} W_{kz} \times \omega_{t,z}^{DI} \tag{8}$$
where $\omega_{t+1,k}^{DI}$ represents the domain-invariant feature extractor parameters of the $k$-th client in the $(t + 1)$-th global communication round; $\omega_{t,z}^{DI}$ represents the domain-invariant feature extractor parameters of the $z$-th client in the $t$-th global communication round; $W$ is the aggregation weight matrix calculated by the FedPWRR algorithm; and $C$ represents the set of all finger vein clients.
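The personalized aggregation of Equation (8) amounts to one weighted sum per client. The sketch below assumes, for illustration only, that each client's domain-invariant parameters are flattened into a single list; the function name is hypothetical.

```python
from typing import List, Sequence


def personalized_aggregate(weights: Sequence[Sequence[float]],
                           client_params: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return one aggregated parameter vector per client: row k of W times the stacked parameters."""
    n = len(client_params)
    dim = len(client_params[0])
    return [
        [sum(weights[k][z] * client_params[z][i] for z in range(n)) for i in range(dim)]
        for k in range(n)
    ]


# Hypothetical weights: client 0 keeps its own model, client 1 averages both.
new_params = personalized_aggregate([[1.0, 0.0], [0.5, 0.5]],
                                    [[2.0, 4.0], [4.0, 8.0]])
```

Unlike the first stage, each client receives a different parameter vector, determined by its own row of the weight matrix.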

4. Convergence Analysis

This section provides a proof of the convergence of the proposed algorithm. The algorithm consists of two stages, with the first generalization stage being the key component of the proposed algorithm. Therefore, the convergence proof focuses primarily on the first stage, and demonstrating the convergence of the first stage implies the overall convergence of the algorithm. Additionally, this section proves that the DDP-FedFV algorithm maintains good convergence properties even when some clients do not participate. The relevant assumptions and definitions for the proof are as follows.
  • $\mu$-strong convexity of $L_k$
$$\left\| \nabla L_k(w_1) - \nabla L_k(w_2) \right\| \ge \mu \left\| w_1 - w_2 \right\|$$
  • Jensen inequality
If f is a convex function defined on a real interval I and X is a random variable taking values in I, then the Jensen inequality can be stated as:
$$f(\mathbb{E}[X]) \le \mathbb{E}[f(X)]$$
where E[X] represents the expected value of the random variable X. This inequality is frequently used in the proof.
  • $\gamma$-inexact solution [32]
For the local client function $h_k(w_k^{DI}, w_k^{DS}, w_k^{Dec}, \varphi_k; w_k^{DI}) = L_k(w_k^{DI}, w_k^{DS}, w_k^{Dec}, \varphi_k)$ with $\gamma \in [0, 1]$, $w_k^{*,DI}$ is defined as the $\gamma$-inexact solution of
$$\min_{w_k^{DI}} h_k(w_k^{DI}, w_{t+1,k}^{DS}, w_{t+1,k}^{Dec}, \varphi_{t+1,k})$$
if:
$$\left\| \nabla h_k(w_k^{*,DI}, w_{t+1,k}^{DS}, w_{t+1,k}^{Dec}, \varphi_{t+1,k}) \right\| \le \gamma \left\| \nabla h_k(w_{t,k}^{DI}, w_{t+1,k}^{DS}, w_{t+1,k}^{Dec}, \varphi_{t+1,k}) \right\|$$
  • $B$-local dissimilarity [32]
The local function $L_k$ is $B$-locally dissimilar at $w_t^{DI}$ if:
$$\mathbb{E}_k\left[ \left\| \nabla L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k}) \right\|^2 \right] \le \left\| \nabla f(w_t^{DI}) \right\|^2 B^2$$
and $B$ is defined as:
$$B(w_t^{DI}) = \sqrt{\frac{\mathbb{E}_k\left[ \left\| \nabla L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k}) \right\|^2 \right]}{\left\| \nabla f(w_t^{DI}) \right\|^2}}$$
Based on the above assumptions and definitions, we can derive the convergence result of the algorithm.
Theorem 1. 
If the chosen learning rate η, the total number of participating clients K, γ, and μ satisfy
α = η 2 B 2 + 1 B L ( 1 + γ ) 2 μ ( 2 2 + 2 ) B 2 L ( 1 + γ ) 2 μ K 2 B ( 1 + γ ) μ K > 0
Then, according to the global objective at round t, the expected decrease is expressed as shown in Equation (15).
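$$\mathbb{E}_{S_t}\left[ f(w_{t+1}^{DI}) \right] \le f(w_t^{DI}) - \alpha \left\| \nabla f(w_t^{DI}) \right\|^2 \tag{15}$$
where $S_t$ is the set of clients selected in round $t$. The corresponding convergence form is then derived as follows:
$$\frac{1}{T} \sum_{t=0}^{T-1} \left\| \nabla f(w_t^{DI}) \right\|^2 \le \frac{f(w_0^{DI}) - f^*}{\alpha T} \tag{16}$$
where $f^*$ is the minimum value of the global objective and $w_0^{DI}$ denotes the initial model parameters.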
The detailed proof process is provided in Appendix A. The proof results demonstrate that the convergence of the DDP-FedFV framework is theoretically guaranteed. Additionally, in our proof, we allow some clients to perform only a few local updates or incomplete local training while still participating in global model aggregation. This setting is closer to real-world applications.
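For readers who want the intermediate step, the averaged bound follows from the per-round expected decrease by a standard telescoping argument. The sketch below is not the Appendix A proof; it assumes $\alpha > 0$, that expectations are taken over the client selections of all rounds, and that $f(w_T^{DI}) \ge f^*$.

$$
\begin{aligned}
\alpha \, \mathbb{E}\left\| \nabla f(w_t^{DI}) \right\|^2 &\le \mathbb{E}\left[ f(w_t^{DI}) \right] - \mathbb{E}\left[ f(w_{t+1}^{DI}) \right] && \text{(rearranged per-round decrease)} \\
\alpha \sum_{t=0}^{T-1} \mathbb{E}\left\| \nabla f(w_t^{DI}) \right\|^2 &\le f(w_0^{DI}) - \mathbb{E}\left[ f(w_T^{DI}) \right] \le f(w_0^{DI}) - f^* && \text{(sum over } t \text{, terms telescope)} \\
\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\left\| \nabla f(w_t^{DI}) \right\|^2 &\le \frac{f(w_0^{DI}) - f^*}{\alpha T} && \text{(divide by } \alpha T \text{)}
\end{aligned}
$$

The right-hand side decays at rate $1/T$, so the average squared gradient norm approaches zero as the number of communication rounds grows.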

5. Experiments

This section describes the specific dataset information, testing methods, and evaluation metrics. In addition, the results of each experiment are analyzed.

5.1. Datasets and Verification Method

In this paper, to simulate real-world finger vein recognition applications, six public datasets are used for FL: SDUMLA-HMT [33], MMCBNU-6000 [31], HKPU-FV [21], NUPT-FV [34], VERA [35], and UTFVP [36]. The SDUMLA-HMT dataset contains 636 classes, with 6 photos per class; the MMCBNU-6000 dataset contains 600 classes, with 10 photos per class; the HKPU-FV dataset contains 600 classes, with 6 photos per class; the NUPT-FV dataset contains 1680 classes, with 10 photos per class; the VERA dataset contains 220 classes, with 2 photos per class; and the UTFVP dataset contains 360 classes, with 4 photos per class.
These six datasets are treated as different clients. Table 1 provides a brief introduction to these six datasets, including specific experimental settings. These finger vein datasets have different data distributions due to factors such as various acquisition devices and background noise, which exemplify the classic non-iid problem in FL. Mu et al. [37] explored the heterogeneity characteristics of finger vein datasets. NUPT, as the largest finger vein dataset currently available, shows poor performance in federated learning with other datasets due to different finger placement orientations. In contrast, HKPU, SDUMLA, and UTFVP exhibit similar finger placement positions, image background lighting, and other characteristics, resulting in good compatibility. VERA, being the smallest dataset in terms of data size, negatively impacts the performance of models trained in cooperation with other datasets.
In the experiments, the open-set testing method, which is commonly employed in biometrics, was used. This method accurately reflects the model’s robustness and accuracy in real finger vein recognition scenarios and demonstrates the generalizability of the finger vein models better than other approaches. The specific datasets are divided as follows: each client’s finger vein dataset is randomly split into training and testing sets at an 8:2 ratio.
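The 8:2 split can be illustrated with a minimal sketch. It assumes the split is made at the class (finger) level, so that identities in the testing set never appear in training, which is the usual meaning of open-set testing. The function name `open_set_split`, the fixed seed, and the toy labels are hypothetical and are not taken from the paper.

```python
import random


def open_set_split(labels, train_ratio=0.8, seed=0):
    """Split sample indices so that training and testing classes are disjoint.

    labels: one class label per image (hypothetical layout).
    Returns (train_indices, test_indices).
    """
    classes = sorted(set(labels))
    rng = random.Random(seed)          # fixed seed only for reproducibility
    rng.shuffle(classes)
    n_train = int(round(train_ratio * len(classes)))
    train_classes = set(classes[:n_train])
    train_idx = [i for i, y in enumerate(labels) if y in train_classes]
    test_idx = [i for i, y in enumerate(labels) if y not in train_classes]
    return train_idx, test_idx


# Toy client: 10 classes with 6 images each.
labels = [c for c in range(10) for _ in range(6)]
train_idx, test_idx = open_set_split(labels)
print(len(train_idx), len(test_idx))
```

Splitting by class rather than by image keeps every test identity unseen during training, so the reported metrics reflect generalization to new fingers.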
In the training phase, only the training set is used for model training. Once the model is trained, during the testing phase, verification matching pairs for each client are generated from the dataset images. Similarity measurements are performed based on the feature vectors extracted by the model. The image matching rule is as follows: if the similarity is greater than or equal to the preset threshold and the two images come from the same class, the match is considered correct; if the similarity is less than the preset threshold and the two images come from different classes, the match is also considered correct. All other cases are considered incorrect.
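The matching rule above can be sketched as follows. The paper does not state which similarity measure is used, so cosine similarity is assumed here purely for illustration; the helper names, the toy feature vectors, and the threshold of 0.5 are likewise hypothetical.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def make_pairs(labels):
    """All unordered image pairs as (i, j, same_class)."""
    return [(i, j, labels[i] == labels[j])
            for i, j in combinations(range(len(labels)), 2)]


def match_is_correct(similarity, same_class, threshold):
    """Correct if an intraclass pair is accepted or an interclass pair is rejected."""
    accepted = similarity >= threshold
    return accepted == same_class


# Toy example: two images of one finger and one image of another.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
labels = ["finger_a", "finger_a", "finger_b"]
threshold = 0.5
results = [
    match_is_correct(cosine_similarity(features[i], features[j]), same, threshold)
    for i, j, same in make_pairs(labels)
]
print(results)
```

Any other similarity measure can be substituted for `cosine_similarity` without changing the decision rule in `match_is_correct`.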

5.2. Evaluation Metrics

This subsection introduces the evaluation metrics used in the experiments, namely, the equal error rate (EER) and TAR@FAR = 0.01 [17,38,39,40], to comprehensively assess the performance of the proposed framework. Additionally, to provide a more intuitive comparison of algorithm performance, three statistical metrics were used.
In finger vein recognition, the EER is a widely used evaluation metric. The EER is the value at which the False Accept Rate (FAR) and False Reject Rate (FRR) are equal; the lower the value is, the better the model’s performance. The related FAR and FRR metrics are defined as shown in Equations (17) and (18).
$$
\mathrm{FAR} = \frac{\mathrm{NFA}}{\mathrm{NIRA}} \times 100\% \tag{17}
$$
$$
\mathrm{FRR} = \frac{\mathrm{NFR}}{\mathrm{NGRA}} \times 100\% \tag{18}
$$
where NFA is the number of false acceptances, NIRA is the number of interclass matching pairs, NFR is the number of false rejections, and NGRA is the number of intraclass matching pairs.
In addition, the TAR value when the FAR value is 0.01, abbreviated as TAR@FAR = 0.01, is a key metric in this experiment. Here, TAR is defined as TAR = 1 − FRR. The higher the value of TAR@FAR = 0.01 is, the better the model’s performance.
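As a minimal sketch of how these metrics can be computed from similarity scores, the code below sweeps the decision threshold over the observed scores. It assumes that FAR = 0.01 is a fraction (that is, 1%) and that the EER is approximated at the threshold where FAR and FRR are closest; the function names and toy scores are hypothetical.

```python
def far_frr(genuine, impostor, threshold):
    """FAR and FRR in percent for one threshold.

    genuine: similarity scores of intraclass pairs (NGRA = len(genuine)).
    impostor: similarity scores of interclass pairs (NIRA = len(impostor)).
    """
    nfa = sum(s >= threshold for s in impostor)   # false acceptances
    nfr = sum(s < threshold for s in genuine)     # false rejections
    return 100.0 * nfa / len(impostor), 100.0 * nfr / len(genuine)


def equal_error_rate(genuine, impostor):
    """Approximate EER: mean of FAR and FRR where they are closest."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2.0
    return best_eer


def tar_at_far(genuine, impostor, target_far=1.0):
    """Highest TAR (percent) over thresholds whose FAR <= target_far (percent).

    target_far=1.0 corresponds to FAR = 0.01 read as a fraction (assumption).
    """
    best_tar = 0.0
    for t in sorted(set(genuine) | set(impostor)) + [float("inf")]:
        far, frr = far_frr(genuine, impostor, t)
        if far <= target_far:
            best_tar = max(best_tar, 100.0 - frr)   # TAR = 1 - FRR
    return best_tar


genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
print(far_frr(genuine, impostor, 0.5))
print(equal_error_rate(genuine, impostor))
print(tar_at_far(genuine, impostor))
```

With only a handful of scores the threshold sweep is coarse; with the large numbers of matching pairs used in practice, the FAR and FRR curves cross much more precisely.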
To more intuitively evaluate the algorithm’s performance, this paper also introduces three statistical metrics: best performance, worst performance, and average performance.
The best performance is defined as the best result achieved by the algorithm among all client metrics. This metric reflects the optimal performance the algorithm can obtain with different datasets and settings.
The worst performance is defined as the result obtained for the client with the poorest performance, which aids in evaluating the robustness and generalizability of the algorithm. In practical scenarios, finger vein data are distributed across different clients, each with distinct data distributions and characteristics. This metric allows us to better understand how the algorithm performs in complex environments with heterogeneous finger vein data, reflecting the model’s adaptability to the most challenging client data.
The average performance is a straightforward average metric that is well suited for measuring overall performance across all clients in FL. The use of this metric ensures that the performance of the model for each client is considered equally, regardless of the size of the corresponding client data, providing a fair and consistent evaluation standard that reflects the comprehensive effectiveness of the FL algorithm across different clients.
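The three statistics can be computed as in the short sketch below. The function name `summarize` and the per-client values are hypothetical placeholders, not results from the paper; the only point illustrated is that "best" means the minimum for EER and the maximum for TAR@FAR = 0.01, and that the average is unweighted.

```python
def summarize(per_client, lower_is_better):
    """Best, worst, and unweighted average of a per-client metric."""
    values = list(per_client.values())
    best = min(values) if lower_is_better else max(values)
    worst = max(values) if lower_is_better else min(values)
    average = sum(values) / len(values)   # every client counts equally
    return {"best": best, "worst": worst, "average": average}


# Placeholder values for three clients (not taken from the paper).
eer = {"client_a": 1.0, "client_b": 2.0, "client_c": 3.0}
tar = {"client_a": 99.0, "client_b": 97.0, "client_c": 95.0}
print(summarize(eer, lower_is_better=True))
print(summarize(tar, lower_is_better=False))
```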

5.3. Experimental Results and Analysis

To evaluate the performance of the DDP-FedFV framework, four sets of experiments were conducted. In the first set of experiments, the DDP-FedFV method was compared with client-independent training and client-centralized training methods to validate the effectiveness of our framework. In the second set of experiments, the baseline results were compared with those of the first stage of the training method to verify the effectiveness of the first stage. In the third set of experiments, the performance using only the first stage was compared with that using the complete DDP-FedFV algorithm to test the effectiveness and importance of the second stage. In the fourth set of experiments, a comparative analysis with existing FL algorithms was conducted to validate the superiority of our proposed algorithm.
(1) Comparison with client-independent training and client-centralized training methods
In this experiment, “Local” represents client-independent training, where clients do not communicate with each other. “Centralized” indicates centralized training involving all clients. To better compare the performance of our algorithm with that of the other two algorithms, we set up two validation experiments: “Centralized_1” and “Centralized_2”. In both experiments, the model parameters from centralized training were used. The difference is that in the “Centralized_1” experiment, the metrics for each of the six clients were measured separately, while in the “Centralized_2” experiment, the six clients were combined into one large dataset and the overall metrics were measured. “DDP-FedFV” is the method proposed in this paper. In the EER column, smaller values indicate better performance. In the TAR@FAR = 0.01 column, higher values indicate superior performance.
The experimental results in Table 2 demonstrate that the “Centralized” training method performs considerably better than the “Local” training method. However, due to the heterogeneity of different finger vein data distributions, the performance of this model still has certain limitations. By using feature decoupling to obtain domain-invariant information for communication and employing a personalized aggregation mechanism in the second stage, the proposed method effectively overcomes the strong heterogeneity among different finger vein datasets. This further enhances the performance of the trained model. Therefore, the performance of the proposed DDP-FedFV framework is comparable or superior to that of the centralized training method.
To analyze the experimental results more intuitively, we plotted them as boxplots, as shown in Figure 3.
The per-client results of each algorithm are also compared statistically, and outliers are marked in the boxplots. According to the EER results, the “Local” method obtains a wide distribution of values, with a median of approximately 3%, indicating relatively high variability and poor overall performance. The “Centralized_1” method has a lower median EER than the “Local” method but still shows high variability. The “DDP-FedFV” method obtains the lowest median EER, indicating the best performance among these methods, and the data are relatively concentrated, suggesting that the proposed method performs consistently across different clients. Next, the TAR@FAR = 0.01 results were analyzed. The “Local” method again performs the worst, with a significant outlier indicating much poorer performance for one client than for the others. The “Centralized_1” method performs better than the “Local” method, with a median exceeding 90% and relatively low variability, suggesting that it maintains high performance across multiple clients. The “DDP-FedFV” method shows outstanding performance in terms of the TAR@FAR = 0.01 metric, with a median close to 99% and more tightly clustered data, further confirming the stability and superiority of the proposed algorithm across different clients. These results also demonstrate the necessity and effectiveness of introducing FL in finger vein recognition.
(2) Experiments to prove the validity of the first phase
The ablation experiments were primarily performed to validate the effectiveness of the designed two-stage approach. In this subsection, the performance of the first-stage method was compared with that of the baseline training method to verify the effectiveness of the first stage. FedPer [27], which uses a server aggregation method similar to FedAvg, was selected as the baseline. Additionally, to better demonstrate the feature decoupling in the first stage of our model, the Grad-CAM [41] method was employed to visualize the ROIs in the decoupled model. This comparison between our first-stage model and the baseline aims to demonstrate the effectiveness of the refined feature extraction approach applied in the first stage.
The results of the ablation experiments are shown in Table 3 and visualized as boxplots in Figure 4. Clearly, the two metrics of FedPer exhibit large interquartile ranges, indicating marked performance variability among clients, with performance on the VERA dataset falling well below the average level. This is due to the substantial distribution differences among the six finger vein datasets, which lead to serious “client drift” issues. In contrast, the metrics obtained from the first stage of our proposed algorithm have very small interquartile ranges, indicating that the first stage yields highly consistent and reliable performance across clients. Even in the worst cases, it maintains high recognition accuracy. The dual-decoupling design ensures fine-grained feature utilization, encourages the exchange of common information, and ensures good performance for all clients. Although the EER value of 4.71% for the VERA client is considered an outlier on the boxplot, this value is 64.076% less than the 13.111% EER of FedPer, and it is optimized further in the second stage. Overall, the experiments confirm that the first stage uses the domain-invariant model for global communication, resulting in a generalizable global model, which provides a solid foundation for ensuring good overall model performance. Next, we use neural network visualization methods to further validate the effectiveness of the feature-decoupling method applied in the first stage of the algorithm.
The model feature map visualizations are shown in Figure 5. As shown by the sample images for each finger vein dataset in the first row, there are appreciable differences in background information such as clarity, background color, and exposure, but the core vein region features are largely consistent among the different images. In the second row, the heatmaps obtained with the baseline training method show that the trained single model does not always capture the core region well. This is because the single model cannot adapt to the data distribution of each client, leading to deviations in the model’s focus areas from the core vein distribution regions for some clients.
In contrast, our method can finely separate the two types of image information. According to the domain-invariant (DI) model visualization results in the third row, the core finger vein region shows high brightness, indicating that the decoupled DI model effectively focuses on information that is common across datasets. This type of information helps enhance the performance of the global model and is crucial for communication during FL. According to the domain-specific (DS) model visualization results in the fourth row, the model focuses more on areas outside the finger vein, such as the edges of the finger. These areas are unique to each client and should remain local without participating in global communication. The visualization results strongly support the effectiveness of the dual-decoupling method applied in the first stage. This method is important for mitigating the heterogeneity among datasets.
(3) Experiments to prove the validity of the second phase
In this section, we compare the experimental results of models with and without the second stage, that is, with and without the personalized aggregation method FedPWRR, to verify the effectiveness and importance of the second stage.
The specific values of the aggregated weight matrices obtained in the second stage are detailed in Appendix B. Table 4 shows a comparison of the experimental results of the DDP-FedFV method without the second stage and those of the complete DDP-FedFV framework. Clearly, incorporating the second stage results in improvements across different clients. The average EER of the DDP-FedFV method was 1.622%, which is 25.902% less than that of the algorithm without the second stage. Moreover, the average TAR@FAR = 0.01 of the DDP-FedFV method was 97.46%, which is 1.163% higher than that of the model without the second stage, indicating that the second stage plays a crucial role in enhancing the overall performance. The worst EER of the DDP-FedFV method was 2.563%, which is 45.584% less than that of the algorithm without the second stage, and the worst TAR@FAR = 0.01 of the DDP-FedFV method was 95.15%, which is 4.285% higher than that of the model without the second stage. The models with the worst performance are typically associated with clients with strong data heterogeneity, and the second-stage design further mitigates the impact of this heterogeneity.
Furthermore, the boxplot in Figure 6 reveals that the model trained without the FedPWRR method exhibits high variability, with the EER metric for the VERA client classified as an outlier. This shows that the method without the FedPWRR method has greater variability, with the models for some clients showing considerably poor performance. In contrast, the DDP-FedFV metric data are more concentrated and denser, and the median metrics are clearly improved compared to those of the algorithm without the second stage. This visually demonstrates the effectiveness of the proposed FedPWRR method and the rationality of the two-stage training paradigm.
(4) Experiments comparing the performance with those of existing algorithms
To our knowledge, the only FL-based finger vein recognition method is FedFV [17]. To comprehensively evaluate the DDP-FedFV method, in addition to reproducing FedFV, we selected several mainstream FL algorithms for comparison, such as Moon [42], which combines FL with contrastive learning, and the personalized FL method pFedSim [43]. The analysis results in Table 5 and Figure 7 show that the DDP-FedFV method obtains the best results on most of the best, worst, and average performance metrics. Notably, the model trained by our method ensures that even the worst-performing client model shows good performance. The average EER of the DDP-FedFV method was 1.622%, which was 7.473% and 23.237% lower than those of Moon and pFedSim, respectively, but 8.64% higher than that of FedFV (1.493%).
The average TAR@FAR = 0.01 of the DDP-FedFV method was 97.46%, which was 1.394%, 1.817%, and 2.838% greater than those of Moon, pFedSim, and FedFV, respectively, indicating an improvement in overall performance. The worst EER of the DDP-FedFV method was 2.563%, which was 2.769%, 32.178%, and 6.152% lower than those of Moon, pFedSim, and FedFV, respectively. The worst TAR@FAR = 0.01 of the DDP-FedFV method was 95.15%, which was 2.356%, 5.946%, and 22.996% greater than those of Moon, pFedSim, and FedFV, respectively. This demonstrates that our algorithm can maintain good performance even for clients with considerable heterogeneity, showing robust overall performance. Moreover, the boxplot in Figure 7 reveals that the models of each client obtained using our algorithm show good performance, with low overall variability and more stable performance. These results collectively demonstrate the advantages of the DDP-FedFV method over existing algorithms.
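The percentage differences quoted in this section are relative changes with respect to the method being compared against. The sketch below reproduces two of them from values stated in the text (the VERA EER of 13.111% for FedPer versus 4.71% for the first stage, and the average EER of 1.493% for FedFV versus 1.622% for DDP-FedFV); the helper name `relative_change` is hypothetical.

```python
def relative_change(reference, value):
    """Relative change of `value` with respect to `reference`, in percent.

    Negative results mean `value` is lower than `reference`.
    """
    return 100.0 * (value - reference) / reference


# VERA client EER: FedPer (13.111%) vs. first stage of DDP-FedFV (4.71%).
vera_change = relative_change(13.111, 4.71)
# Average EER: FedFV (1.493%) vs. DDP-FedFV (1.622%).
fedfv_change = relative_change(1.493, 1.622)
print(round(vera_change, 3), round(fedfv_change, 2))
```

For EER a negative relative change is an improvement, whereas for TAR@FAR = 0.01 a positive one is.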

6. Conclusions

This paper proposes a novel dual-stage FL framework, DDP-FedFV, which combines generalization and personalization. In the first stage, we focus on enhancing the generalizability of the global model, while in the second stage, we fine-tune the model parameters for each client. To improve the model’s generalizability, a dual-decoupling mechanism is applied in the first stage, and the parameters of the domain-invariant feature extractors for each client are uploaded for aggregation. To obtain customized high-performance models for each client, the FedPWRR algorithm is designed based on the first-stage decoupling algorithm. This algorithm uses client similarity information and dataset size to personalize the parameters aggregated for each client’s model. A theoretical analysis of the DDP-FedFV method shows the convergence of the algorithm. The experimental results demonstrate that the proposed DDP-FedFV framework outperforms existing methods, with superior results across multiple metrics.

Author Contributions

Conceptualization, Z.G. and J.G.; methodology, Z.G.; software, Z.G. and Y.H.; validation, Z.G., Y.H. and Y.Z.; formal analysis, Z.G. and Y.Z.; investigation, Z.G.; resources, J.G.; data curation, Y.H. and Y.Z.; writing—original draft preparation, Z.G. and J.G.; writing—review and editing, J.G. and Y.H.; visualization, Z.G.; supervision, J.G. and H.R.; project administration, J.G. and H.R.; funding acquisition, J.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (grant no. 62272242) and the National Innovation and Entrepreneurship Training Program for College Students (grant no. 202310293018Z).

Institutional Review Board Statement

Ethical review and approval were waived for this study because the study was conducted with publicly available images in finger vein datasets and did not involve experimental studies with humans.

Informed Consent Statement

Informed consent was obtained from all the subjects involved in the study.

Data Availability Statement

The data presented in this study are publicly available in the SDUMLA-HMT, MMCBNU-6000, HKPU-FV, NUPT-FV, VERA, and UTFVP datasets [21,31,33,34,35,36].

Acknowledgments

The authors thank everyone who participated in the testing process.

Conflicts of Interest

The authors declare that there are no conflicts of interest.

Appendix A. Proof of Convergence

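Based on the L-Lipschitz smoothness of f and its Taylor expansion, we can obtain the following inequality, as shown in Equation (A1).
$$
f(\bar{w}_{t+1}^{DI}) \le f(w_t^{DI}) + \langle \nabla f(w_t^{DI}),\, \bar{w}_{t+1}^{DI} - w_t^{DI} \rangle + \frac{L}{2}\,\| \bar{w}_{t+1}^{DI} - w_t^{DI} \| \tag{A1}
$$
$$
\mathbb{E}_{S_t}\langle \nabla f(w_t^{DI}),\, \bar{w}_{t+1}^{DI} - w_t^{DI} \rangle = \mathbb{E}_{S_t}\langle \nabla f(w_t^{DI}),\, \mathbb{E}_k[w_{t+1,k}^{DI}] - w_t^{DI} \rangle = -\eta\, \mathbb{E}_{S_t}\langle \nabla f(w_t^{DI}),\, \mathbb{E}_k[\nabla L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k})] \rangle \tag{A2}
$$
Additionally, for vectors a and b, we have Equation (A3).
$$
\langle a, b \rangle = \frac{1}{2}\left( \|a\|^2 + \|b\|^2 - \|a - b\|^2 \right) \tag{A3}
$$
Below, we obtain the bounds for $\mathbb{E}_{S_t}\langle \nabla f(w_t^{DI}),\, \bar{w}_{t+1}^{DI} - w_t^{DI} \rangle$ and $\| \bar{w}_{t+1}^{DI} - w_t^{DI} \|$ via scaling. Equations (A3), (A2) and (10) can be combined, yielding Equation (A4).
$$
\begin{aligned}
\mathbb{E}_{S_t}\langle \nabla f(w_t^{DI}),\, \bar{w}_{t+1}^{DI} - w_t^{DI} \rangle
&\le -\frac{\eta}{2}\, \mathbb{E}_{S_t}\Big( \|\nabla f(w_t^{DI})\|^2 + \|\nabla L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k})\|^2 - \|\nabla f(w_t^{DI}) - \nabla L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k})\|^2 \Big) \\
&\le -\frac{\eta}{2}\, \mathbb{E}_{S_t}\Big( \|\nabla f(w_t^{DI})\|^2 + \|\nabla L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k})\|^2 \Big)
\end{aligned} \tag{A4}
$$
Equation (A5) can be obtained by combining Equations (12) and (A4).
$$
\mathbb{E}_{S_t}\langle \nabla f(w_t^{DI}),\, \bar{w}_{t+1}^{DI} - w_t^{DI} \rangle \le -\frac{\eta}{2}\,(B^2 + 1)\, \|\nabla f(w_t^{DI})\|^2 \tag{A5}
$$
The upper bound for $\| \bar{w}_{t+1}^{DI} - w_t^{DI} \|$ is derived as follows. Let $\hat{w}_{t+1,k}^{DI} = \arg\min_{w^{DI}} L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k})$. Combining Equations (9) and (11), we obtain Equations (A6) and (A7).
$$
\| \hat{w}_{t+1,k}^{DI} - w_{t+1,k}^{DI} \| \le \frac{\gamma}{\mu}\, \|\nabla L_k(w_{t,k}^{DI}, w_{t+1,k}^{DS}, w_{t+1,k}^{Dec}, \varphi_{t+1,k})\| \tag{A6}
$$
$$
\| \hat{w}_{t+1,k}^{DI} - w_t^{DI} \| \le \frac{1}{\mu}\, \|\nabla L_k(w_{t,k}^{DI}, w_{t+1,k}^{DS}, w_{t+1,k}^{Dec}, \varphi_{t+1,k})\| \tag{A7}
$$
Additionally, for vectors a, b, and c, we have the triangle inequality shown in Equation (A8).
$$
\| a - b \| \le \| a - c \| + \| b - c \| \tag{A8}
$$
Equation (A9) can be obtained by combining Equations (A6), (A7) and (A8).
$$
\| w_{t+1,k}^{DI} - w_t^{DI} \| \le \| \hat{w}_{t+1,k}^{DI} - w_{t+1,k}^{DI} \| + \| \hat{w}_{t+1,k}^{DI} - w_t^{DI} \| \le \frac{\gamma + 1}{\mu}\, \|\nabla L_k(w_{t,k}^{DI}, w_{t+1,k}^{DS}, w_{t+1,k}^{Dec}, \varphi_{t+1,k})\| \tag{A9}
$$
Therefore, we can deduce Equation (A10).
$$
\| \bar{w}_{t+1}^{DI} - w_t^{DI} \| \le \mathbb{E}_k \| w_{t+1,k}^{DI} - w_t^{DI} \| \le \frac{\gamma + 1}{\mu}\, \mathbb{E}_k \|\nabla L_k(w_{t,k}^{DI}, w_{t,k}^{DS}, w_{t,k}^{Dec}, \varphi_{t,k})\|^2 \le \frac{B(1 + \gamma)}{\mu}\, \|\nabla f(w_t^{DI})\|^2 \tag{A10}
$$
Combining Equations (A1), (A5) and (A10) leads to Equation (A11).
$$
\begin{aligned}
f(\bar{w}_{t+1}^{DI}) &\le f(w_t^{DI}) + \langle \nabla f(w_t^{DI}),\, \bar{w}_{t+1}^{DI} - w_t^{DI} \rangle + \frac{L}{2}\, \| \bar{w}_{t+1}^{DI} - w_t^{DI} \| \\
&\le f(w_t^{DI}) + \left( \frac{L B (1 + \gamma)}{2\mu} - \frac{\eta}{2}\,(B^2 + 1) \right) \|\nabla f(w_t^{DI})\|^2
\end{aligned} \tag{A11}
$$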
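Next, the local Lipschitz smoothness of f is utilized to find the relationship between $\mathbb{E}[f(w_{t+1}^{DI})]$ and $f(w_t^{DI})$ to obtain Equation (A12).
$$
f(w_{t+1}^{DI}) \le f(\bar{w}_{t+1}^{DI}) + L_0\, \| \bar{w}_{t+1}^{DI} - w_{t+1}^{DI} \| \tag{A12}
$$
where $L_0$ is the local Lipschitz continuity constant for function f and is bounded as shown in (A13).
$$
L_0 \le \|\nabla f(w_t^{DI})\| + L \max\left( \| \bar{w}_{t+1}^{DI} - w_t^{DI} \|,\, \| w_{t+1}^{DI} - w_t^{DI} \| \right) \le \|\nabla f(w_t^{DI})\| + L \left( \| \bar{w}_{t+1}^{DI} - w_t^{DI} \| + \| w_{t+1}^{DI} - w_t^{DI} \| \right) \tag{A13}
$$
Therefore, taking the expectation over the devices selected in round t, we define Equation (A14).
$$
\mathbb{E}_{S_t}[f(w_{t+1}^{DI})] \le f(\bar{w}_{t+1}^{DI}) + O_t \tag{A14}
$$
where $O_t$ satisfies the inequality shown in (A15).
$$
\begin{aligned}
O_t &\le \mathbb{E}_{S_t}\Big[ \Big( \|\nabla f(w_t^{DI})\| + L \big( \| \bar{w}_{t+1}^{DI} - w_t^{DI} \| + \| w_{t+1}^{DI} - w_t^{DI} \| \big) \Big)\, \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \| \Big] \\
&\le \Big( \|\nabla f(w_t^{DI})\| + 2L\, \| \bar{w}_{t+1}^{DI} - w_t^{DI} \| \Big)\, \mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \| + L\, \mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \|^2
\end{aligned} \tag{A15}
$$
The terms $\mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \|^2$ and $\mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \|$ are bounded based on the inequality in (A16).
$$
\mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \| \le \sqrt{ \mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \|^2 } \tag{A16}
$$
$$
\begin{aligned}
\mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \|^2
&\le \mathbb{E}_{S_t} \Big\| \frac{1}{K} \sum_{k \in S_t} w_{t+1,k}^{DI} - \bar{w}_{t+1}^{DI} \Big\|^2 \\
&\le \frac{1}{K^2}\, \mathbb{E}_{S_t} \Big\| \sum_{k \in S_t} w_{t+1,k}^{DI} - K \bar{w}_{t+1}^{DI} \Big\|^2 \\
&\le \frac{1}{K^2}\, \mathbb{E}_{S_t} \sum_{k \in S_t} \| w_{t+1,k}^{DI} - \bar{w}_{t+1}^{DI} \|^2 \\
&= \frac{1}{K}\, \mathbb{E}_k \| w_{t+1,k}^{DI} - \bar{w}_{t+1}^{DI} \|^2 \\
&= \frac{1}{K}\, \mathbb{E}_k \| w_{t+1,k}^{DI} - \mathbb{E}_k[w_{t+1,k}^{DI}] \|^2 \\
&\le \frac{1}{K} \Big( \mathbb{E}_k \| w_{t+1,k}^{DI} - w_t^{DI} \|^2 + \mathbb{E}_k \| \mathbb{E}_k[w_{t+1,k}^{DI}] - w_t^{DI} \|^2 \Big) \\
&\le \frac{2}{K}\, \mathbb{E}_k \| w_{t+1,k}^{DI} - w_t^{DI} \|^2
\end{aligned}
$$
Considering Equation (A9), we obtain the inequality shown in (A17).
$$
\mathbb{E}_{S_t} \| w_{t+1}^{DI} - \bar{w}_{t+1}^{DI} \|^2 \le \frac{2 B^2 (1 + \gamma)^2}{K \mu^2}\, \|\nabla f(w_t^{DI})\|^2 \tag{A17}
$$
The inequality in (A18) is obtained by substituting Equations (A10), (A16) and (A17) into Equation (A15).
$$
O_t \le \left( \frac{\sqrt{2}\, B (1 + \gamma)}{\mu \sqrt{K}} + \frac{(2\sqrt{2} + 2)\, B^2 L (1 + \gamma)^2}{\mu^2 K} \right) \|\nabla f(w_t^{DI})\|^2 \tag{A18}
$$
Combining Equations (A11), (A14) and (A18) leads to Equation (A19).
$$
\mathbb{E}_{S_t}[f(w_{t+1}^{DI})] \le f(w_t^{DI}) - \left( \frac{\eta}{2}\,(B^2 + 1) - \frac{B L (1 + \gamma)}{2\mu} - \frac{(2\sqrt{2} + 2)\, B^2 L (1 + \gamma)^2}{\mu^2 K} - \frac{\sqrt{2}\, B (1 + \gamma)}{\mu \sqrt{K}} \right) \|\nabla f(w_t^{DI})\|^2 \tag{A19}
$$
Then, the learning rate η, the total number of participating clients K, γ, and μ are selected to satisfy Equation (A20).
$$
\alpha = \frac{\eta}{2}\,(B^2 + 1) - \frac{B L (1 + \gamma)}{2\mu} - \frac{(2\sqrt{2} + 2)\, B^2 L (1 + \gamma)^2}{\mu^2 K} - \frac{\sqrt{2}\, B (1 + \gamma)}{\mu \sqrt{K}} > 0 \tag{A20}
$$
Then, the expected decline in the global objective in round t is shown as the inequality in (A21).
$$
\mathbb{E}_{S_t}[f(w_{t+1}^{DI})] \le f(w_t^{DI}) - \alpha\, \|\nabla f(w_t^{DI})\|^2 \tag{A21}
$$
We obtain the inequality in (A22) by summing the inequality in (A21) over the T rounds of iteration.
$$
f^* \le f(w_0^{DI}) - \alpha \sum_{t=0}^{T-1} \|\nabla f(w_t^{DI})\|^2 \tag{A22}
$$
Therefore, the convergence result is finally derived as shown in Equation (A23).
$$
\frac{1}{T} \sum_{t=0}^{T-1} \|\nabla f(w_t^{DI})\|^2 \le \frac{f(w_0^{DI}) - f^*}{\alpha T} \tag{A23}
$$
This equation shows that as the number of iterations T increases, the upper bound on the average squared gradient norm decreases at a rate of O(1/T), so the iterates approach a stationary point.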
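The step from (A21) to (A23) can be spelled out as a telescoping sum. The following is a sketch under the assumptions that $f^*$ denotes a lower bound of $f$ and that the expectation is taken over the client selections of all rounds; it restates the argument above rather than adding a new result.

$$
\begin{aligned}
\mathbb{E}[f(w_T^{DI})] - f(w_0^{DI}) &= \sum_{t=0}^{T-1} \Big( \mathbb{E}[f(w_{t+1}^{DI})] - \mathbb{E}[f(w_t^{DI})] \Big) \le -\alpha \sum_{t=0}^{T-1} \mathbb{E}\,\|\nabla f(w_t^{DI})\|^2, \\
\frac{1}{T} \sum_{t=0}^{T-1} \mathbb{E}\,\|\nabla f(w_t^{DI})\|^2 &\le \frac{f(w_0^{DI}) - \mathbb{E}[f(w_T^{DI})]}{\alpha T} \le \frac{f(w_0^{DI}) - f^*}{\alpha T}.
\end{aligned}
$$

Because the minimum over rounds is no larger than the average, the smallest expected squared gradient norm among the first T rounds obeys the same O(1/T) bound.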

Appendix B. The Aggregation Weight Matrix in the Experiments

As shown in Figure A1, the rows of this weight matrix correspond to the six clients, SDU, MMC, HKPU, NUPT, VERA, and UTFVP, in order from top to bottom. w[k][z] denotes the proportion of the model parameters uploaded by the z-th client that is used when calculating the parameters of the k-th client. The degree to which each client depends on information from the other clients differs. Taking SDU in the first row as an example, to aggregate a model that is better suited to the distribution of the SDU dataset, the locally trained model parameters account for a large proportion, and only a small amount of knowledge is drawn from the other clients. For the VERA client, in contrast, the other clients must provide more information to aggregate a model that performs better on the VERA data.
Figure A1. Visual example of the personalized parameter aggregation process.
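A minimal sketch of how such a weight matrix can be applied is given below. It assumes that each row of w sums to 1 (since the entries are described as proportions) and that each client's uploaded parameters are flattened into a list of floats; the function name and the toy values are hypothetical, and the computation of the weights themselves by FedPWRR is not reproduced here.

```python
def personalized_aggregate(weights, client_params):
    """Aggregate one personalized parameter vector per client.

    weights[k][z]: proportion of client z's uploaded parameters used for client k.
    client_params[z]: flattened parameters uploaded by client z.
    """
    n_clients = len(client_params)
    dim = len(client_params[0])
    aggregated = []
    for k in range(n_clients):
        row = weights[k]
        if abs(sum(row) - 1.0) > 1e-9:   # assumption: rows are normalized
            raise ValueError("row %d of the weight matrix does not sum to 1" % k)
        aggregated.append([
            sum(row[z] * client_params[z][i] for z in range(n_clients))
            for i in range(dim)
        ])
    return aggregated


# Toy example with two clients and two parameters each (not the paper's values).
weights = [[0.5, 0.5],
           [0.25, 0.75]]
client_params = [[0.0, 2.0],
                 [4.0, 6.0]]
print(personalized_aggregate(weights, client_params))
```

A row dominated by its diagonal entry, as described for SDU, keeps the aggregated model close to the client's own parameters, whereas a flatter row, as described for VERA, draws more on the other clients.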

References

  1. Zhang, Y.; Liu, Z. Research on Finger Vein Recognition Based on Sub-Convolutional Neural Network. In Proceedings of the International Conference on Computer Network, Electronic and Automation (ICCNEA), Xi’an, China, 25–27 September 2020.
  2. Wan, Z.C.; Chen, L.; Wang, T.; Wan, G.C. An Optimization Algorithm to Improve the Accuracy of Finger Vein Recognition. IEEE Access 2022, 10, 127440–127449.
  3. Al-Obaidy, N.A.I.; Mahmood, B.S.; Alkababji, A.M.F. Finger Veins Verification by Exploiting the Deep Learning Technique. Ing. Syst. Inf. 2022, 27, 923–931.
  4. Yang, H.; Fang, P.; Hao, Z. A GAN-Based Method for Generating Finger Vein Dataset. In Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence (ACAI), Sanya, China, 24–26 December 2020.
  5. Zhang, J.; Lu, Z.; Li, M.; Wu, H. GAN-Based Image Augmentation for Finger-Vein Biometric Recognition. IEEE Access 2019, 7, 183118–183132.
  6. Stallings, W. Handling of Personal Information and Deidentified, Aggregated, and Pseudonymized Information Under the California Consumer Privacy Act. IEEE Secur. Priv. 2020, 18, 61–64.
  7. Goldman, E. An Introduction to the California Consumer Privacy Act (CCPA). SSRN Electron. J. 2018. Available online: https://ssrn.com/abstract=3211013 (accessed on 1 July 2024).
  8. Bygrave, L.A. Article 25 Data Protection by Design and by Default. In The EU General Data Protection Regulation (GDPR); Oxford University Press: Oxford, UK, 2020.
  9. Drev, M.; Delak, B. Conceptual Model of Privacy by Design. J. Comput. Inf. Syst. 2022, 62, 888–895.
  10. Liu, C.-T.; Wang, C.-Y.; Chien, S.-Y.; Lai, S.-H. FedFR: Joint Optimization Federated Framework for Generic and Personalized Face Recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 22 February–1 March 2022.
  11. Aggarwal, D.; Zhou, J.; Jain, A.K. FedFace: Collaborative Learning of Face Recognition Model. In Proceedings of the 2021 IEEE International Joint Conference on Biometrics (IJCB), Shenzhen, China, 4–7 August 2021.
  12. Niu, Y.; Deng, W. Federated Learning for Face Recognition with Gradient Correction. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 22 February–1 March 2022.
  13. Meng, Q.; Zhou, F.; Ren, H.; Feng, T.; Liu, G.; Lin, Y. Improving Federated Learning Face Recognition via Privacy-Agnostic Clusters. arXiv 2022, arXiv:2201.12467.
  14. Shao, R.; Perera, P.; Yuen, P.C.; Patel, V.M. Federated Generalized Face Presentation Attack Detection. IEEE Trans. Neural Netw. Learn. Syst. 2022, 35, 103–116.
  15. Gupta, H.; Rajput, T.K.; Vyas, R.; Vyas, O.P.; Puliafito, A. Biometric Iris Identifier Recognition with Privacy Preserving Phenomenon: A Federated Learning Approach. In Proceedings of the Communications in Computer and Information Science, Neural Information Processing, New Delhi, India, 22–26 November 2023.
  16. Luo, Z.; Wang, Y.; Wang, Z.; Sun, Z.; Tan, T. FedIris: Towards More Accurate and Privacy-Preserving Iris Recognition via Federated Template Communication. In Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR), New Orleans, LA, USA, 18–24 June 2022.
  17. Lian, F.-Z.; Huang, J.-D.; Liu, J.-X.; Chen, G.; Zhao, J.-H.; Kang, W.-X. FedFV: A Personalized Federated Learning Framework for Finger Vein Authentication. Mach. Intell. Res. 2023, 20, 683–696.
  18. Wu, J.-D.; Ye, S.-H. Driver Identification Using Finger-Vein Patterns with Radon Transform and Neural Network. Expert. Syst. Appl. 2009, 36, 5793–5799.
  19. Miura, N.; Nagasaka, A.; Miyatake, T. Feature Extraction of Finger-Vein Patterns Based on Repeated Line Tracking and Its Application to Personal Identification. Mach. Vis. Appl. 2004, 15, 194–203.
  20. Miura, N.; Nagasaka, A.; Miyatake, T. Extraction of Finger-Vein Patterns Using Maximum Curvature Points in Image Profiles. Mach. Vis. Appl. 2005, E90-D, 1185–1194.
  21. Kumar, A.; Zhou, Y. Human Identification Using Finger Images. IEEE Trans. Image Process 2012, 21, 2228–2244.
  22. Liu, W.; Li, W.; Sun, L.; Zhang, L.; Chen, P. Finger Vein Recognition Based on Deep Learning. In Proceedings of the 2017 12th IEEE Conference on Industrial Electronics and Applications (ICIEA), Siem Reap, Cambodia, 18–20 June 2017.
  23. Azmat, F.; Li, P.; Wu, Z.; Zhang, J.; Chen, C. A Finger Vein Recognition Algorithm Based on Deep Learning. Int. J. Embed. Syst. 2017, 9, 220.
  24. Abbas, T. Finger Vein Recognition with Hybrid Deep Learning Approach. J. La Multiapp 2023, 4, 23–33.
  25. Ma, X.; Luo, X. Finger Vein Recognition Method Based on Ant Colony Optimization and Improved EfficientNetV2. Math. Biosci. Eng. 2023, 20, 11081–11100.
  26. McMahan, H.B.; Moore, E.; Ramage, D.; Hampson, S.; Arcas, B. Communication-Efficient Learning of Deep Networks from Decentralized Data. arXiv 2016, arXiv:1602.05629.
  27. Arivazhagan, M.; Aggarwal, V.; Singh, A.; Choudhary, S. Federated Learning with Personalization Layers. arXiv 2019, arXiv:1912.00818.
  28. Collins, L.; Hassani, H.; Mokhtari, A.; Shakkottai, S. Exploiting Shared Representations for Personalized Federated Learning. arXiv 2021, arXiv:2102.07078.
  29. Howard, A.; Sandler, M.; Chen, B.; Wang, W.; Chen, L.-C.; Tan, M.; Chu, G.; Vasudevan, V.; Zhu, Y.; Pang, R.; et al. Searching for MobileNetV3. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019.
  30. Liu, W.; Wen, Y.; Yu, Z.; Yang, M. Large-Margin Softmax Loss for Convolutional Neural Networks. arXiv 2017, arXiv:1612.02295.
  31. Lu, Y.; Xie, S.J.; Yoon, S.; Wang, Z.; Park, D.S. An Available Database for the Research of Finger Vein Recognition. In Proceedings of the 2013 6th International Congress on Image and Signal Processing (CISP), Hangzhou, China, 16–18 December 2013.
  32. Li, T.; Sahu, A.; Zaheer, M.; Sanjabi, M.; Talwalkar, A.; Smith, V. Federated Optimization in Heterogeneous Networks. arXiv 2018, arXiv:1812.06127.
  33. Yin, Y.; Liu, L.; Sun, X. SDUMLA-HMT: A Multimodal Biometric Database. In Proceedings of the Biometric Recognition, Lecture Notes in Computer Science (CCBR), Beijing, China, 3–4 December 2011.
  34. Ren, H.; Sun, L.; Guo, J.; Han, C. A Dataset and Benchmark for Multimodal Biometric Recognition Based on Fingerprint and Finger Vein. IEEE Trans. Inf. Forensics Secur. 2022, 17, 2030–2043.
  35. Tome, P.; Vanoni, M.; Marcel, S. On the Vulnerability of Finger Vein Recognition to Spoofing. In Proceedings of the 2014 International Conference of the Biometrics Special Interest Group (BIOSIG), Darmstadt, Germany, 10–12 September 2014.
  36. Ton, B.T.; Veldhuis, R.N.J. A High Quality Finger Vascular Pattern Dataset Collected Using a Custom Designed Capturing Device. In Proceedings of the 2013 International Conference on Biometrics (ICB), Madrid, Spain, 4–7 June 2013.
  37. Mu, H.; Guo, J.; Han, C.; Sun, L. PAFedFV: Personalized and Asynchronous Federated Learning for Finger Vein Recognition. arXiv 2024, arXiv:2404.13237.
  38. Huang, Y.; Wang, J.; Li, P.; Xiang, L.; Li, P.; He, Z. Generative Iris Prior Embedded Transformer for Iris Restoration. In Proceedings of the 2023 IEEE International Conference on Multimedia and Expo (ICME), Brisbane, Australia, 10–14 July 2023.
  39. Alonso-Fernandez, F.; Farrugia, R.A.; Bigun, J.; Fierrez, J.; Gonzalez-Sosa, E. A survey of super-resolution in iris biometrics with evaluation of dictionary-learning. IEEE Access 2018, 7, 6519–6544.
  40. Liu, J.; Qin, H.; Wu, Y.; Liang, D. Anchorface: Boosting tar@far for practical face recognition. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 22 February–1 March 2022.
  41. Selvaraju, R.R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; Batra, D. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017.
  42. Li, Q.; He, B.; Song, D. Model-Contrastive Federated Learning. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021.
  43. Tan, J.; Zhou, Y.; Liu, G.; Wang, J.; Yu, S. pFedSim: Similarity-Aware Model Aggregation Towards Personalized Federated Learning. arXiv 2023, arXiv:2305.15706.
Figure 1. Example diagram illustrating the necessity of decoupling. Finger vein images can be divided into two parts: background and foreground information. In this paper, we refer to them as domain-specific and domain-invariant information, respectively. The former reflects the personalized information of specific datasets, such as exposure and imaging conditions for different clients. The latter refers to the core vein texture, which is common and is the key feature for finger vein recognition technology.
Figure 2. DDP-FedFV framework for finger vein recognition. The training process of this framework is divided into two main stages. In the first stage (generalization phase), the model is decoupled into two parts: a feature extractor and a classifier. To further decouple the dataset into foreground and background information, two feature extractors are devised for each client: domain-invariant and domain-specific extractors. This decouples the local dataset’s feature representations into domain-invariant and domain-specific parts. During this phase, the parameters of the domain-invariant model for each client are uploaded to the server, allowing the training of a model with strong generalizability. In the second stage (personalization phase), the server uses the size and distribution similarity information of each client’s dataset to determine the aggregation weight matrix W, customizing the model for each client to enhance personalization.
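As a rough illustration of the personalization phase described in this caption, the sketch below builds a row-normalized aggregation weight matrix W from client dataset sizes and a pairwise similarity matrix, then gives each client its own weighted average of the uploaded parameters. The combination rule (size share times similarity, followed by row normalization), the function names, and the toy numbers are all assumptions made for illustration; this is not the FedPWRR formula, which the caption does not specify.

```python
from typing import List

Matrix = List[List[float]]


def aggregation_weights(sizes: List[int], similarity: Matrix) -> Matrix:
    """Hypothetical rule: W[i][j] is proportional to size_j * similarity[i][j]."""
    total = float(sum(sizes))
    weights = []
    for i in range(len(sizes)):
        raw = [(sizes[j] / total) * similarity[i][j] for j in range(len(sizes))]
        norm = sum(raw)
        weights.append([r / norm for r in raw])  # each row sums to 1
    return weights


def personalize(weights: Matrix, client_params: Matrix) -> Matrix:
    """Client i receives the W[i]-weighted average of all uploaded parameter vectors."""
    dim = len(client_params[0])
    return [
        [sum(w_ij * params[d] for w_ij, params in zip(row, client_params)) for d in range(dim)]
        for row in weights
    ]


# Toy example: three clients with flattened parameters of length 2 (made-up numbers).
sizes = [3054, 4800, 352]
similarity = [
    [1.0, 0.6, 0.2],
    [0.6, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
client_params = [[0.10, 0.20], [0.30, 0.10], [0.90, 0.50]]

W = aggregation_weights(sizes, similarity)
personalized = personalize(W, client_params)
```

Because every row of W sums to 1, each personalized model is a convex combination of the client models, so a client with a small dataset still receives a model that leans toward itself and toward the clients it most resembles.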
Figure 3. Boxplot of this experiment.
Figure 4. Boxplot depicting the experimental comparison with the baseline.
Figure 5. Grad-CAM visualization of the regions focused on by the domain-invariant (DI) and domain-specific (DS) models. In this subsection, the Grad-CAM [41] method is used to visualize the feature maps of the deep learning model, highlighting the areas of the image that the model focuses on most. The first row shows sample images for each client. The second row displays the heatmaps obtained with the baseline method selected in this paper. The third row presents heatmaps of the areas on which the domain-invariant (DI) model focuses. The fourth row shows heatmaps of the areas on which the domain-specific (DS) model focuses.
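For readers unfamiliar with Grad-CAM [41], the following minimal sketch shows the core computation behind heatmaps of this kind: each feature-map channel is weighted by the spatial average of the class-score gradient for that channel, the weighted channels are summed, and a ReLU keeps only positively contributing regions. It assumes the activations and gradients of one convolutional layer have already been extracted (in practice they come from an autograd framework); the function name, the nested-list representation, and the toy values are illustrative assumptions, not the authors' code.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel, H x W


def grad_cam(activations: List[FeatureMap], gradients: List[FeatureMap]) -> FeatureMap:
    """Grad-CAM map: ReLU of the gradient-weighted sum of activation channels."""
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial average of the class-score gradient for that channel.
    alphas = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for alpha, channel in zip(alphas, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += alpha * channel[i][j]
    # ReLU keeps only regions that push the class score up.
    cam = [[max(v, 0.0) for v in row] for row in cam]
    peak = max(max(row) for row in cam)
    # Scale to [0, 1] for display; an all-zero map is returned unchanged.
    return [[v / peak for v in row] for row in cam] if peak > 0 else cam


# Made-up example: 2 channels, each 2 x 2.
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[0.5, 0.5], [0.5, 0.5]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

In this toy case the second channel has a negative weight, so the locations it activates are suppressed by the ReLU and only the first channel's locations remain in the heatmap.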
Figure 6. Boxplot of the ablation experiment.
Figure 7. Boxplot of the experiment.
Table 1. Introduction to the datasets used in the experiment.
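| FV Datasets | Number of Fingers | Total Images | Training Set | Test Set | Number of Authentication Pairs |
|---|---|---|---|---|---|
| SDUMLA-HMT | 636 | 3816 | 3054 | 762 | 289,941 |
| MMCBNU-6000 | 600 | 6000 | 4800 | 1200 | 719,400 |
| HKPU-FV | 312 | 1872 | 1500 | 372 | 69,006 |
| NUPT-FV | 1680 | 16,800 | 13,440 | 3360 | 5,643,120 |
| VERA | 220 | 440 | 352 | 88 | 3828 |
| UTFVP | 360 | 1440 | 1152 | 288 | 41,328 |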
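The pair counts in Table 1 appear to equal the number of unordered pairs of test images, n(n - 1)/2, where n is the number of images left after removing the training set. That reading is inferred from the numbers rather than stated in the table, so the short check below should be taken as a sanity check under that assumption; the dictionary layout and function name are illustrative.

```python
from math import comb

# (total images, training images, reported authentication pairs) from Table 1.
table1 = {
    "SDUMLA-HMT": (3816, 3054, 289_941),
    "MMCBNU-6000": (6000, 4800, 719_400),
    "HKPU-FV": (1872, 1500, 69_006),
    "NUPT-FV": (16_800, 13_440, 5_643_120),
    "VERA": (440, 352, 3_828),
    "UTFVP": (1440, 1152, 41_328),
}


def pair_count(total: int, train: int) -> int:
    """Unordered pairs among the held-out images: C(n, 2) = n (n - 1) / 2."""
    return comb(total - train, 2)


computed = {name: pair_count(total, train) for name, (total, train, _) in table1.items()}
matches = {name: computed[name] == reported for name, (_, _, reported) in table1.items()}
```

Under this assumption every dataset's computed count agrees with the reported one, which suggests that each test image is compared against every other test image exactly once.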
Table 2. Comparison of local training, centralized training, and FL training.
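| Datasets | Local EER | Local TAR@FAR = 0.01 | Centralized_1 EER | Centralized_1 TAR@FAR = 0.01 | Centralized_2 EER | Centralized_2 TAR@FAR = 0.01 | DDP-FedFV EER | DDP-FedFV TAR@FAR = 0.01 |
|---|---|---|---|---|---|---|---|---|
| SDU | 3.687% | 92.05% | 3.485% | 91.27% | 4.280% | 92.47% | 2.163% | 96.92% |
| MMC | 1.365% | 98.13% | 1.272% | 98.43% | | | 0.647% | 99.78% |
| HKPU | 2.265% | 95.47% | 2.072% | 96.03% | | | 1.225% | 98.59% |
| NUPT | 1.025% | 98.93% | 1.048% | 98.86% | | | 0.995% | 99.02% |
| VERA | 8.749% | 65.43% | 6.685% | 82.07% | | | 2.563% | 95.29% |
| UTFVP | 4.802% | 82.45% | 5.236% | 87.90% | | | 2.141% | 95.15% |
| Best | 1.025% | 98.93% | 1.048% | 98.86% | 4.280% | 92.47% | 0.647% | 99.78% |
| Worst | 8.749% | 65.43% | 6.685% | 82.07% | | | 2.563% | 95.15% |
| Average | 3.649% | 88.74% | 3.300% | 92.43% | | | 1.622% | 97.46% |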
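The two metrics reported in these tables can be computed from genuine (same-finger) and impostor (different-finger) match scores. The sketch below uses the common definitions: FAR is the fraction of impostor pairs accepted, FRR is the fraction of genuine pairs rejected, EER is the error rate at the threshold where the two are closest, and TAR@FAR = 0.01 is the best true accept rate among thresholds with FAR at most 0.01. The function names, the convention that higher scores mean greater similarity, and the toy scores are assumptions; the paper's exact evaluation code is not shown in this section.

```python
from typing import List, Tuple


def rates(genuine: List[float], impostor: List[float], threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def eer(genuine: List[float], impostor: List[float]) -> float:
    """Approximate EER: average FAR and FRR at the threshold where they are closest."""
    candidates = sorted(set(genuine) | set(impostor))
    far, frr = min(
        (rates(genuine, impostor, t) for t in candidates),
        key=lambda r: abs(r[0] - r[1]),
    )
    return (far + frr) / 2


def tar_at_far(genuine: List[float], impostor: List[float], target_far: float = 0.01) -> float:
    """Highest true accept rate among thresholds whose FAR does not exceed target_far."""
    # The infinite threshold rejects everything, so at least one candidate qualifies.
    candidates = sorted(set(genuine) | set(impostor)) + [float("inf")]
    return max(
        1.0 - frr
        for far, frr in (rates(genuine, impostor, t) for t in candidates)
        if far <= target_far
    )


# Made-up similarity scores (higher means more likely the same finger).
genuine = [0.91, 0.85, 0.78, 0.66, 0.95, 0.88, 0.72, 0.81, 0.59, 0.93]
impostor = [0.12, 0.35, 0.41, 0.22, 0.62, 0.30, 0.18, 0.48, 0.27, 0.70]

toy_eer = eer(genuine, impostor)
toy_tar = tar_at_far(genuine, impostor, target_far=0.01)
```

A lower EER and a higher TAR at a fixed FAR both indicate better separation between genuine and impostor scores, which is why the two columns move in opposite directions across the rows.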
Table 3. Performance comparison between the baseline algorithm and the first-stage algorithm.
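| Datasets | FedPer EER | FedPer TAR@FAR = 0.01 | The First Phase Method EER | The First Phase Method TAR@FAR = 0.01 |
|---|---|---|---|---|
| SDU | 7.875% | 79.41% | 2.239% | 96.58% |
| MMC | 1.867% | 96.49% | 0.930% | 99.14% |
| HKPU | 4.348% | 87.19% | 1.654% | 97.59% |
| NUPT | 2.026% | 97.06% | 1.063% | 98.90% |
| VERA | 13.111% | 58.05% | 4.710% | 91.24% |
| UTFVP | 6.362% | 74.17% | 2.540% | 94.61% |
| Best | 1.867% | 97.06% | 0.930% | 99.14% |
| Worst | 13.111% | 58.05% | 4.710% | 91.24% |
| Average | 5.931% | 82.06% | 2.189% | 96.34% |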
Table 4. Comparison between the performance of the first-stage algorithm and the complete algorithm.
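| Datasets | DDP-FedFV W/O FedPWRR EER | DDP-FedFV W/O FedPWRR TAR@FAR = 0.01 | DDP-FedFV EER | DDP-FedFV TAR@FAR = 0.01 |
|---|---|---|---|---|
| SDU | 2.239% | 96.58% | 2.163% | 96.92% |
| MMC | 0.930% | 99.14% | 0.647% | 99.78% |
| HKPU | 1.654% | 97.59% | 1.225% | 98.59% |
| NUPT | 1.063% | 98.90% | 0.995% | 99.02% |
| VERA | 4.710% | 91.24% | 2.563% | 95.29% |
| UTFVP | 2.540% | 94.61% | 2.141% | 95.15% |
| Best | 0.930% | 99.14% | 0.647% | 99.78% |
| Worst | 4.710% | 91.24% | 2.563% | 95.15% |
| Average | 2.189% | 96.34% | 1.622% | 97.46% |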
Table 5. Comparison of the performance of the proposed method with those of existing finger vein recognition algorithms based on FL.
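| Datasets | Moon [42] EER | Moon [42] TAR@FAR = 0.01 | pFedSim [43] EER | pFedSim [43] TAR@FAR = 0.01 | FedFV [17] EER | FedFV [17] TAR@FAR = 0.01 | DDP-FedFV EER | DDP-FedFV TAR@FAR = 0.01 |
|---|---|---|---|---|---|---|---|---|
| SDU | 2.383% | 94.43% | 2.211% | 95.61% | 1.938% | 97.10% | 2.163% | 96.92% |
| MMC | 1.013% | 98.90% | 0.596% | 99.68% | 0.719% | 99.41% | 0.647% | 99.78% |
| HKPU | 1.416% | 97.80% | 1.557% | 97.40% | 0.736% | 99.26% | 1.225% | 98.59% |
| NUPT | 0.740% | 99.39% | 1.098% | 98.80% | 0.995% | 99.03% | 0.995% | 99.02% |
| VERA | 2.330% | 93.24% | 3.779% | 89.81% | 2.731% | 77.36% | 2.563% | 95.29% |
| UTFVP | 2.636% | 92.96% | 3.438% | 93.02% | 1.837% | 96.45% | 2.141% | 95.15% |
| Best | 0.740% | 99.39% | 0.596% | 99.68% | 0.719% | 99.41% | 0.647% | 99.78% |
| Worst | 2.636% | 92.96% | 3.779% | 89.81% | 2.731% | 77.36% | 2.563% | 95.15% |
| Average | 1.753% | 96.12% | 2.113% | 95.72% | 1.493% | 94.77% | 1.622% | 97.46% |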
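The Best, Worst, and Average rows appear to be computed column by column, so the worst EER and the worst TAR need not come from the same dataset. The sketch below reproduces the three summary rows for the DDP-FedFV column from its six per-dataset entries; the column-wise reading is inferred from the numbers, and the variable names are illustrative.

```python
# DDP-FedFV column of Table 5: dataset -> (EER %, TAR@FAR = 0.01 %).
ddp_fedfv = {
    "SDU": (2.163, 96.92),
    "MMC": (0.647, 99.78),
    "HKPU": (1.225, 98.59),
    "NUPT": (0.995, 99.02),
    "VERA": (2.563, 95.29),
    "UTFVP": (2.141, 95.15),
}

eers = [eer_value for eer_value, _ in ddp_fedfv.values()]
tars = [tar_value for _, tar_value in ddp_fedfv.values()]

summary = {
    "Best": (min(eers), max(tars)),    # lowest EER, highest TAR
    "Worst": (max(eers), min(tars)),   # highest EER, lowest TAR
    "Average": (round(sum(eers) / len(eers), 3), round(sum(tars) / len(tars), 2)),
}
```

For this column the worst EER comes from VERA while the worst TAR comes from UTFVP, which is why the Worst row does not match any single dataset row.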

Guo, Z.; Guo, J.; Huang, Y.; Zhang, Y.; Ren, H. DDP-FedFV: A Dual-Decoupling Personalized Federated Learning Framework for Finger Vein Recognition. Sensors 2024, 24, 4779. https://doi.org/10.3390/s24154779
