1. Introduction
With the growing concern for information security, biometric technology has been recognized as one of the most reliable security technologies in the field of information security [
1]. Compared to other biometric technologies, finger vein recognition has unique advantages, such as low cost, non-contact acquisition, and the uniqueness of finger vein information; especially in the current worldwide epidemic situation, its contactless collection method has been widely accepted [
2]. In practice, however, the finger vein images are often blurred, noisy, or have missing features [
3]. For example, unstable light sources and image sensors will increase blurring and noise of the finger vein images (
Figure 1a,b), the temperature of the acquisition environment can cause expansion or contraction of the vein, and oil and dust on the filter of the acquisition device can cause the vein features to be missing (
Figure 1c,d). These are known as low-quality finger vein images; they increase the difficulty of finger vein feature extraction and reduce recognition performance. Hence, selecting appropriate enhancement and restoration methods to improve the quality and resolution of low-quality images is clearly essential and meaningful. In this paper, low-quality finger vein image restoration tasks are classified into three types: 1. restoration of finger vein images with motion blur and Gaussian blur; 2. restoration of damaged finger vein images with oil and water stains; and 3. enhancement restoration of finger vein images with noise and texture blur.
In the field of image deblurring, the current methods for finger vein image deblurring restoration are divided into traditional deblurring algorithms and deblurring algorithms based on deep learning. The typical traditional method [
4,
5] requires the use of a blur kernel to perform deconvolution to restore a clear finger vein image, provided that the blur kernel is known. However, the current deblurring approaches are mostly impractical for several reasons. First, most blur kernels are unknown in the real environment, and estimating blur kernels is very time consuming. Second, even if the convolution kernel is known, different acquisition devices have variations in their blur kernels due to environmental, physical, and other factors, and the processed images still suffer degraded recognition performance. Several improved algorithms have been proposed to deblur the finger veins based on skin scattering or optical blurring. Li et al. [
6] proposed a blurred finger vein image restoration method to restore blurred finger vein images and improve recognition performance by considering optical blurring components and scattering blurring components. Yang et al. [
6,
7] devised a deblurring optical model by calculating the light scattering component of an organism. However, these traditional approaches require the measurement of multiple parameters to improve the performance of the model, resulting in a large amount of undesired processing time. Recently, deblurring algorithms based on deep learning have been widely used to restore clear images by predicting the blur kernel; ref. [
8] used generative adversarial networks (GAN) [
9] to achieve image deblurring. GAN provides an adversarial game idea of generators and discriminators, and a better deblurring effect can be obtained by inputting the blurred image into the trained generative network. Cui et al. [
10] proposed a modified correction-based GAN for restoring the optical blur conditions contained in the original finger vein images; ref. [
3] improved the DeblurGAN network [
11] and proposed a method that can restore finger vein images with motion blur. However, the frequent use of down-sampling in the feature extraction process inevitably results in the loss of crucial vein texture information, especially for finger vein images with blurred texture features, making it difficult to further improve the deblurred images.
In the field of image restoration for damaged regions, there are currently two main types of methods for restoring damaged areas of finger vein images: traditional methods and deep learning-based methods. The representative algorithms among traditional approaches are the total variation (TV) algorithm [
12] and the curvature driven diffusions (CDD) algorithm [
13]. However, TV and CDD algorithms do not take the texture constraints into account, leading to inaccurate restored finger vein images that lack texture information and are easily distorted. The work [
14] proposed a fast-marching method (FMM) that prioritizes the restoration order by estimating parameters such as the light direction, and then fills the damaged area using a weighted average of neighboring pixels. FMM improves time efficiency but blindly uses all known information in the patch for the weighting calculation; because texture constraints are not considered, the restored vein textures are not accurately characterized. Yang H et al. [
1] proposed a restoration algorithm based on Gabor texture feature constraints. During the restoration process, the information of the neighboring blocks is selectively weighted according to the features of similar textures in the local region, making the vein texture of the restored image more coherent. However, although these methods are suitable for cases where the vein structure information is unclear or missing due to image blurring, they are less effective for cases where the vein image itself is broken or missing due to external factors, and they do not generalize well. Recently, deep learning approaches have emerged as a promising technology for image inpainting [
15]. Deep learning-based methods are mainly built on convolutional modules, because convolutional neural networks (CNN) can learn deep image features and have better generalization capabilities; such methods have demonstrated promising solutions for restoring the missing parts of an image. Gao et al. [
16] proposed a deep learning-based algorithm for restoring damaged areas of finger veins, but the algorithm did not make full use of texture information in the restoration process, resulting in texture blurring in the restored images, which presented low performance in matching authentication. With this problem in mind, Jiang et al. [
17] proposed a finger vein restoration algorithm based on neighbor binary Wasserstein generative adversarial networks (NB-WGAN). The method uses a texture loss as part of the generator loss function to recover more vein texture details. However, the method only constrains the network through the loss; it still does not allow the network to adequately learn the finger vein texture features or to guide the repair of the defective image based on the vein features, so the repaired finger vein images still lose detail.
In this study, we propose an effective and reliable restoration method for damaged finger vein images and a finger vein image deblurring method to address the critical challenges in the field of finger vein image restoration and enhancement. We have designed a DeblurGAN-v2 network based on the Inception-Resnet-v2 [
18] backbone to address the problems of existing finger vein deblurring methods. DeblurGAN-v2 [
19] is an improvement on DeblurGAN, and incorporates the feature pyramid network (FPN) [
20], to achieve faster and better performance in learning finger vein feature information and reducing the loss of vein texture information. Furthermore, in the feature extraction process, we have adopted Inception-Resnet-v2 as the backbone in collaboration with the FPN to form the generative network. To handle the various blurs and distortions that exist in images captured in real situations, we used generative adversarial networks for training. The GAN generates images by finding the best filter using weights trained from the training data, and the trained network remains robust and generalizable. Moreover, to address the loss of detail suffered by current restoration methods for damaged finger vein images, we propose a finger vein image restoration method based on vein feature texture guidance. This consists of two stages: (1) a finger vein texture feature restoration network, and (2) an original image restoration network. Specifically, the original finger vein image with damage is first subjected to feature extraction to obtain the damaged texture feature image. Then, the texture feature restoration network in the first stage is used to restore the damaged texture feature image, with the texture information of the damaged area predicted from the trained weights. Finally, the second-stage original image restoration network restores the defective original image using the restored texture feature information as a guide. As a result, feature guidance yields a significantly lower loss of vein feature detail in the restored original image.
In the process of our research, we have found that existing methods are only able to deal with a single finger vein image problem; when the finger vein image has multiple vision problems (e.g., defects, motion blur, noise), it is challenging to select the appropriate restoration methods and the proper order of image preprocessing steps for each problem. For example, if the finger vein image is missing structural information and is also blurred, then, as a rule of thumb, the ideal combination of restoration steps is to restore the damage first and then deblur the restored image; if only the damage restoration method is used, the restored finger vein image will still be blurred, whereas if only the deblurring restoration method is used, the image quality of the finger vein will still be poor due to the lack of detail. For finger vein images with both blur and noise, using restoration methods that enhance feature details can increase the degree of blur and noise in the image, which makes it difficult to subsequently extract clear vein feature information. If multiple restoration methods are applied to an image by trial and error, it takes a significant amount of time to find the best combination of steps. Assuming that there are n restoration methods, there are n! possible orderings of processing steps to generate a restored image, while only one of them effectively generates the high-quality restored image. Therefore, the fast and efficient selection of restoration tasks and their combination for processing low-quality finger vein images is an essential research topic in this field.
In this work, we hereby propose an adaptive selection and restoration method for finger vein images based on Deep Reinforcement Learning, referred to as DRL-FVRestore. We have divided the restoration tasks into three categories: image denoising and enhancement restoration, image deblurring restoration, and image inpainting restoration. We firstly propose to treat the finger vein restoration task as a sequential decision-making process and use the idea of deep reinforcement learning (DRL) to train an agent that can select the image restoration task adaptively according to the state of the finger vein image, to gradually restore and improve the quality of the finger vein image. Very recently, DRL has been used for some image processing applications. Yu et al. [
21] were the first to attempt to apply DRL to learn a strategy to gradually restore damaged images by selecting appropriate operations from a predefined toolbox. Their improved version [
22] can dynamically select suitable paths for different image regions in a multi-path CNN for spatially varying image denoising. Furuta et al. [
23] proposed PixelRL, the first framework to perform pixel-wise restoration. Based on these studies, we propose the first application of DRL to finger vein image restoration. Furthermore, to enable the trained agent to better learn the finger vein feature information and select the best restoration method based on this information, we design a reward function based on vein feature constraints to guide the agent to fully learn the vein feature information.
We summarize our contributions as follows:
- (1)
A DeblurGAN-v2 generative adversarial network with an Inception-Resnet-v2 backbone has been designed for finger vein image deblurring.
- (2)
A vein feature-guided finger vein image inpainting network for the restoration of damaged finger vein images has been proposed, where the network comprises two stages: a feature image restoration stage and an original finger vein restoration stage.
- (3)
A deep reinforcement learning-based method for the adaptive selection of finger vein image restoration tasks has been proposed for the first time, in which a reward function with vein feature constraints guides the agent to learn vein feature information for the optimal selection of restoration tasks.
The rest of this paper is organized as follows.
Section 2 describes the proposed methodology.
Section 3 describes how to design the training and testing datasets.
Section 4 presents extensive experiments on the proposed method and analyses the experimental results in detail. Finally,
Section 5 summarizes the work and provides an outlook on future research.
2. Materials and Methods
Our goal is to restore low-quality finger vein images
x to high-quality finger vein images
y. Specifically, we propose the DRL-FVRestore that can self-select the optimal combination of restoration tasks and links according to the quality status of the finger vein image itself, enabling an adaptive restoration of low-quality images of finger veins. When applying DRL to finger vein image restoration task selection, it is initially necessary to consider how the Markov decision process (MDP) can be built in this area to achieve reliable and effective results. The overall flow chart of the proposed method for DRL-FVRestore is shown in
Figure 2. During the processing of DRL-FVRestore, the set of low-quality finger vein images is obtained from the environment. Each finger vein image is considered as a state, and the different image restoration tasks are considered as actions to be executed; the set of available actions is called the action space. Given the current finger vein image input to the trained model as the state, the agent finds the best image restoration action from the action space to process the low-quality finger vein image based on the weights trained from the training data. The design of the DRL-FVRestore method is divided into three parts: (a) building of the reinforcement learning interactive environment; (b) realization of the image deblurring restoration, image damage inpainting restoration, and image enhancement and denoising restoration methods; and (c) training of an agent for adaptive selection of image restoration tasks based on DRL. The focus of the environment is the design of the dataset for training the agent, which will be described in detail in
Section 3. In
Section 2.1, we introduce the structure of the DeblurGAN-v2 network based on the Inception-Resnet-v2 backbone and its loss function design applied to the finger vein image deblurring restoration task. In
Section 2.2, we then introduce the structure of a vein feature-guided finger vein image damage inpainting network and the design of its loss function applied to the image damage inpainting restoration task. In
Section 2.3, we describe representative algorithms used for the restoration task applied to finger vein image enhancement and denoising. Finally, in
Section 2.4, we will discuss in detail the network structure of the DRL-FVRestore method proposed in this paper, the design ideas, and the design of the reward function for the method.
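As an illustration of this adaptive selection process, a minimal sketch of the inference loop is given below; the agent interface, the restore_ops list, the STOP index, and the fixed step limit are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of the DRL-FVRestore inference loop described above.
# RestoreAgent-style interface, restore_ops and STOP index are illustrative assumptions.
import torch

STOP = 7  # assumed index of the "stop" action in the 8-dimensional action space

def restore(image, agent, restore_ops, max_steps=10):
    """Iteratively apply restoration actions chosen by the trained agent."""
    value_vec = torch.zeros(8)               # value vector from the "previous" step
    for _ in range(max_steps):
        q_values = agent(image, value_vec)   # agent maps (state, previous values) -> values
        action = int(torch.argmax(q_values)) # pick the restoration task with the largest value
        if action == STOP:
            break                            # agent decides the image is good enough
        image = restore_ops[action](image)   # apply the selected restoration method
        value_vec = q_values                 # feed the values forward to the next step
    return image
```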
2.1. The Design of a Method for Finger Vein Image Deblurring Restoration Tasks
The blurred finger vein image can be expressed as follows [
19]:
$I_{blur} = I_{sharp} \otimes K$
where $I_{blur}$ refers to a blurred finger vein image, $I_{sharp}$ refers to a clear finger vein image, $K$ denotes the blur kernel, and ⊗ denotes the convolution operation. For the blurring problems encountered with finger vein images, we have divided them into two categories: Gaussian blurring and motion blurring. Gaussian blur is an imaging blur caused by the focal length of the image sensor of the finger vein capture device; motion blur is caused by the relative motion between the image sensor of the finger vein capture device and the finger of the person being captured. The blurring of finger vein images varies with external factors and the physical environment, so different blurring kernels need to be selected to deal with different finger vein image blurring problems. However, in practical applications, this approach is neither robust nor generalizable, because the blurring kernels of the image are unknown and significant computing resources are required to identify the appropriate convolution kernel.
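For illustration, the two blur types can be simulated with OpenCV as follows; the file name, kernel sizes, and sigma are example values rather than the paper's settings.

```python
# Illustrative simulation of the two blur types described above using OpenCV.
import cv2
import numpy as np

img = cv2.imread("finger_vein_roi.png", cv2.IMREAD_GRAYSCALE)  # assumed sample image

# Gaussian blur: I_blur = I_sharp convolved with a Gaussian kernel K
gaussian_blurred = cv2.GaussianBlur(img, (9, 9), sigmaX=3)

# Motion blur: convolve with a linear kernel that averages pixels along one direction
k = 15
motion_kernel = np.zeros((k, k), dtype=np.float32)
motion_kernel[k // 2, :] = 1.0 / k        # horizontal motion of length k
motion_blurred = cv2.filter2D(img, -1, motion_kernel)
```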
In this study, assuming that the blurring kernels are unknown, we propose a deblurring restoration method for finger vein images that is appropriate for various environments. Accordingly, we have designed a finger vein deblurring generative adversarial network based on the DeblurGAN-v2 network with the Inception-Resnet-v2 backbone and evaluated the framework on diverse datasets. While the original DeblurGAN showed good performance in the deblurring task, the improved DeblurGAN-v2 has shown exceptional performance in both the deblurring effect and the extraction of image features. In the following discussion, the design of the deblurring method in this paper is described in detail in terms of three aspects, namely the generator, the discriminator, and the loss function of the network.
2.1.1. The DeblurGAN-v2 Generator Architecture and Loss Function
The overview of the DeblurGAN-v2 generator architecture is shown in
Figure 3. It restores a sharp finger vein image from a single finger vein blurred image, via the trained generator.
Most of the existing CNN [24] frameworks for image deblurring follow the design of the ResNet architecture, and most state-of-the-art methods for handling different levels of blurring make use of multi-stream CNNs and input image pyramids of different scales [19]. However, processing multi-scale images is time consuming and requires a large amount of memory. Restoration network backbones therefore tend to adopt an FPN to combine multi-scale finger vein features for the finger vein image deblurring restoration task. By applying the FPN to finger vein images, multiple feature layers of finger veins can be generated, and these layers encode different levels of finger vein image feature information. Specifically, the FPN consists of bottom–up and top–down paths. The bottom–up path is mainly a convolutional network structure that extracts and compresses the major semantic feature information of the blurred finger vein image; the top–down path allows the FPN to reconstruct clear finger vein images from the semantically rich feature layers. In this study, we designed DeblurGAN-v2 with the FPN as the backbone architecture and obtained final feature maps at five different scales as output. These feature maps are further upsampled to 1/4 of the input size and concatenated into a tensor containing different levels of finger vein feature information. Moreover, to pursue strong deblurring performance while reducing the loss of vein texture information during feature extraction, we used Inception-Resnet-v2 as the backbone in collaboration with the FPN to form the generative network.
The role of the generator loss is to provide a metric that compares the original finger vein image with the restored image during the training phase. In this paper, we chose to use a pixel-space loss, a perceptual loss, and an adversarial loss as joint losses to jointly constrain the training of the deblurring network, where the perceptual loss is calculated as the Euclidean distance between feature maps of the pre-trained VGG19 [25] network, and the adversarial loss is obtained from the global discriminator and the local discriminator. The specific loss function is formulated as follows:
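The explicit form of the joint loss is not reproduced above; a commonly used form, following the DeblurGAN-v2 convention with placeholder weights, is:

```latex
% Hedged sketch of the joint generator loss; the weights \lambda_i are illustrative
% placeholders, not necessarily the values used in the paper.
L_G = \lambda_{1}\, L_{pixel} + \lambda_{2}\, L_{p} + \lambda_{3}\, L_{adv},
\qquad
L_{p} = \big\lVert \phi(I_{sharp}) - \phi(G(I_{blur})) \big\rVert_{2}^{2}
```

where φ denotes a feature map of the pre-trained VGG19 network.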
2.1.2. The DeblurGAN-v2 Architecture of Discriminator and Loss Function
To solve the problem of varying degrees of blurring in images captured in real situations, we use a GAN that can be trained to generate images by finding the best filter using weights trained from the training data, and the trained network remains robust and generalizable. The traditional GAN consists of two models: the discriminator
D and the generator
G. The objective function of the traditional GAN network is shown as follows:
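For reference, the standard minimax objective of the original GAN is:

```latex
% Standard minimax objective of the original GAN, given here for reference.
\min_{G}\max_{D}\; V(D,G) =
\mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] +
\mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```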
It must be noted that the objective function described above is difficult to optimize and can also produce mode collapse or vanishing gradients during training. To address this shortcoming and stabilize the training model, Least Squares GAN (LSGAN) [26] introduces a loss function that provides smoother, non-saturating gradients. Better training stability is obtained by introducing the least-squares loss, which provides a gradient proportional to the distance from the decision boundary, so that fake samples further away from the boundary are penalised more. Specifically, the LSGAN objectives are shown in Equations (4) and (5) as follows:
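For reference, the usual LSGAN objectives (the standard form of such equations, with target label 1 for real and 0 for fake samples) are:

```latex
% Standard LSGAN objectives, given as a reference form.
\min_{D}\; V_{LSGAN}(D) =
\tfrac{1}{2}\,\mathbb{E}_{x \sim p_{data}}\big[(D(x)-1)^{2}\big] +
\tfrac{1}{2}\,\mathbb{E}_{z \sim p_{z}}\big[D(G(z))^{2}\big]
\min_{G}\; V_{LSGAN}(G) =
\tfrac{1}{2}\,\mathbb{E}_{z \sim p_{z}}\big[(D(G(z))-1)^{2}\big]
```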
Meanwhile, the relativistic discriminator [
27] was used on LSGAN to estimate the probability that the provided real data are more real than the randomly sampled fake data. The relativistic discriminator showed more stable and computationally efficient training results. Accordingly, this discriminator network design idea was adopted in this paper. The loss function of the discriminator is expressed as shown in Equation (
6):
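A relativistic least-squares discriminator loss of the kind used in DeblurGAN-v2 has the following reference form (given here as a standard formulation, not copied from the paper):

```latex
% Reference form of a relativistic least-squares (RaGAN-LS) discriminator loss.
L_{D} =
\mathbb{E}_{x \sim p_{data}}\Big[\big(D(x) - \mathbb{E}_{z}\,D(G(z)) - 1\big)^{2}\Big] +
\mathbb{E}_{z \sim p_{z}}\Big[\big(D(G(z)) - \mathbb{E}_{x}\,D(x) + 1\big)^{2}\Big]
```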
Discriminator networks are used to determine the degree to which the generated images are counterfeit. Processing local image patches produces sharper results than a standard discriminator that processes the global image. Most restoration networks currently use the PatchGAN discriminator [28], which operates on image blocks of size 70 × 70. However, research has demonstrated that a standard discriminator processing the global image is still essential when severe blurring exists [11]. Therefore, in order to exploit both the global and local features of finger vein images, we designed a double-scale discriminator comprising a local discriminator and a global discriminator. The local discriminator uses the PatchGAN structure, which divides the finger vein image into image blocks as the input to the discriminator; the global discriminator takes the global finger vein image directly as input, as shown in
Figure 4.
2.2. The Design of a Task Method for the Restoration of Damaged Finger Vein Images
To address the current problem of vein feature information loss in the restored finger vein images, we propose a feature-guided finger vein image restoration method, which consists of two stages: (1) finger vein texture feature restoration network; and (2) original image restoration network. As shown in
Figure 5, both stages are based on a generative adversarial model consisting of a generator and a discriminator. Let G1 and D1 be the generator and discriminator of the finger vein texture feature restoration network, and G2 and D2 be the generator and discriminator of the original image restoration network, respectively. Specifically, the generator of each stage consists of an encoder that down-samples twice, eight residual blocks in which the standard convolutions are replaced by dilated convolutions with a dilation factor of 2, and an up-sampling decoder. We use the PatchGAN architecture, which determines whether or not overlapping image patches are real. The damaged texture feature image is input to the first-stage texture feature inpainting network for restoration, which is trained to predict the texture information of the damaged area according to the learned weights. Then, the original image inpainting network further restores the damaged original image, using the restored texture feature information as a guide. Guiding the restoration of the original image through its own features avoids the loss of vein feature detail information. In the following discussion, the specific design of the two stages and of their loss functions is described separately.
2.2.1. The First Stage of the Finger Vein Image Damage Restoration Task: Finger Vein Feature Image Restoration
Assuming that $I$ is a region of interest (ROI) image of the finger vein without missing regions, we extract the texture structure of $I$ using the maximum curvature feature extraction method [29] and obtain the feature texture image $T$. We use a randomly generated mask image for $I$. Assuming that the generated mask image is $M$, then our input finger vein image with damage in this network is $I_{m} = I \odot M$ and the corresponding damaged finger vein feature texture image is $T_{m} = T \odot M$, where ⊙ is the matrix Hadamard product. Our generator network is designed to predict the finger vein texture information in the area covered by the mask image and obtain the repaired finger vein texture image $\hat{T}$, which is defined as follows:
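Written in an EdgeConnect-style notation, the first-stage prediction can be expressed as follows; the exact set of inputs to G1 is an assumption based on the description above:

```latex
% Plausible form of the first-stage texture prediction; notation assumed.
\hat{T} = G_{1}\big(I_{m},\, T_{m},\, M\big)
```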
We use the generated image $\hat{T}$ from the generator and the undamaged finger vein feature image $T$ as inputs to the discriminator to predict whether the finger vein texture image is realistic or not. In summary, the network is trained and optimized based on the following loss functions, which include the adversarial loss $L_{adv,1}$ and the feature-matching loss $L_{FM}$ [30], where the feature-matching loss $L_{FM}$ compares the activation maps produced during the discriminator's feature extraction process. This is similar to the perceptual loss, in which activation maps are compared with those of a pre-trained VGG network. In Equation (9), $n$ is the final convolution layer of the discriminator, $N_{i}$ denotes the $i$th activation layer, and $D_{1}^{(i)}$ is the activation in the $i$th layer of the discriminator. For our experiments, the final loss in finger vein feature image inpainting is defined as:
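Reference forms of the feature-matching loss and of a weighted first-stage objective (pix2pixHD-style, with placeholder weights) are:

```latex
% Reference forms; the weights \lambda_{adv,1} and \lambda_{FM} are placeholders,
% not the paper's values.
L_{FM} = \mathbb{E}\left[\sum_{i=1}^{n}\frac{1}{N_{i}}
\big\lVert D_{1}^{(i)}(T) - D_{1}^{(i)}(\hat{T}) \big\rVert_{1}\right],
\qquad
L_{G_{1}} = \lambda_{adv,1}\, L_{adv,1} + \lambda_{FM}\, L_{FM}
```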
2.2.2. The Second Stage of the Finger Vein Image Damage Restoration Task: Original Image Restoration
The vein damage image inpainting stage uses the texture feature image obtained after feature inpainting to guide the restoration of the original image. The finger vein images generated in this way are guided and constrained by their own vein features, substantially preserving the vein feature information of the image. The input of this stage consists of the original ROI image $I$, the image to be restored $I_{m}$ after masking, and the feature image $\hat{T}$ from the first-stage restoration, where $T_{m}$ is combined with the restored $\hat{T}$ to form the composite texture image $T_{comp}$. We designed a generator network that predicts the vein image of the damaged region to obtain the restored finger vein image $\hat{I}$, which is defined as:
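A plausible form of the second-stage prediction, with the composite texture image defined above, is (notation assumed, not taken from the paper):

```latex
% Plausible form of the second-stage prediction; notation assumed.
\hat{I} = G_{2}\big(I_{m},\, T_{comp}\big)
```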
We used a joint loss consisting of the $\ell_{1}$ loss, the perceptual loss $L_{perc}$, the style transfer loss $L_{style}$, and the generative adversarial loss $L_{adv,2}$ for the training of the second-stage restoration network, where the generative adversarial loss $L_{adv,2}$ is defined analogously to $L_{adv,1}$ in the first stage. The perceptual loss $L_{perc}$ that we use for training in this task is similar to the perceptual loss in the image deblurring task: $\phi_{i}$ denotes the activation maps of the $i$th layer of the VGG-19 network pre-trained on the ImageNet dataset, and $L_{perc}$ measures the degree of difference between the network-generated finger vein images and the original finger vein images at the perceptual level through the distance between these activation maps. In this study, we also used the activation maps of the pre-trained network to calculate the style loss $L_{style}$. Given feature maps of size $C_{j} \times H_{j} \times W_{j}$, $L_{style}$ is computed from the $C_{j} \times C_{j}$ Gram matrices constructed from the activation maps. For our experiments, the final loss of the original image restoration of the finger vein images is defined as a weighted combination of these four terms.
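For reference, commonly used forms of the perceptual loss, the style loss based on Gram matrices, and a weighted joint objective are given below; the layer choices and weights are illustrative assumptions.

```latex
% Reference forms; the weights \lambda_{*} are placeholders, not the paper's values.
L_{perc} = \mathbb{E}\left[\sum_{i}\frac{1}{N_{i}}
\big\lVert \phi_{i}(I) - \phi_{i}(\hat{I}) \big\rVert_{1}\right],
\qquad
L_{style} = \mathbb{E}_{j}\left[\big\lVert G^{\phi}_{j}(\hat{I}) - G^{\phi}_{j}(I) \big\rVert_{1}\right],
\quad
G^{\phi}_{j} = \frac{\phi_{j}\,\phi_{j}^{\top}}{C_{j} H_{j} W_{j}},
L_{G_{2}} = \lambda_{\ell_{1}}\, L_{\ell_{1}} + \lambda_{p}\, L_{perc}
          + \lambda_{s}\, L_{style} + \lambda_{a}\, L_{adv,2}
```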
2.3. The Design of a Restoration Task Method for Finger Vein Image Enhancement and Denoising
In order to address other problems experienced in finger vein recognition, such as the low contrast of finger vein feature information and the various noises present in the image due to physical and electromagnetic interference, several traditional finger vein image enhancement and denoising algorithms were selected as restoration tasks. The algorithms mainly include the following (a code sketch of these operations is given after the list):
- (a)
Mean Filter. The Mean Filter is used to smooth out the noise that exists in an image. It is based on the principle that the centre pixel is replaced by the average of all surrounding pixels. The calculation formula is shown in Equation (16): $g(x,y) = \frac{1}{M}\sum_{(i,j)\in S} f(i,j)$, where $f$ is the input image, $S$ is the filter template centred on the current pixel $(x, y)$, and $M$ is the total number of pixels in the template including the current pixel.
- (b)
Laplacian Filter. The Laplacian Filter is used to enhance areas of the image where there is a sudden change in grey level and to attenuate areas where the grey level changes slowly. The Laplacian Filter for image enhancement is calculated as in Equation (17): $g(x,y) = f(x,y) + c\,\nabla^{2} f(x,y)$, where $f(x,y)$ is the input image, $g(x,y)$ is the enhanced image, and the value of $c$ depends on the centre coefficient of the mask. $\nabla^{2}$ is the Laplacian operator, which is defined in Equation (18) as $\nabla^{2} f = \frac{\partial^{2} f}{\partial x^{2}} + \frac{\partial^{2} f}{\partial y^{2}}$.
- (c)
Contrast Limited Adaptive Histogram Equalization (CLAHE). CLAHE [
31] method is an improvement on the traditional adaptive histogram equalization (AHE) method. AHE can highlight detailed image information when some areas of the image are significantly brighter or darker than others. However, AHE also amplifies noise, and CLAHE solves this problem by limiting the contrast in each pixel region.
- (d)
Multi-Scale Retinex (MSR). MSR [
32] can effectively solve the problem of low image contrast caused by uneven IR illumination. The formula for MSR is shown in Equation (19): $R_{MSR}(x,y) = \sum_{i} w_{i}\, R_{i}(x,y)$, where $I(x,y)$ is the original input image, $R_{i}(x,y)$ is the enhanced (single-scale Retinex) image at scale $i$, $w_{i}$ is the weight corresponding to each scale, and the weights over all scales must sum to 1.
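The four operations above can be realized with standard OpenCV/NumPy routines. The following sketch is illustrative only; kernel sizes, the clip limit, the MSR scales, and the sign of the Laplacian coefficient c are assumed example values, not the settings used in the paper.

```python
# Illustrative implementations of the four enhancement/denoising operations.
import cv2
import numpy as np

def mean_filter(img, k=3):
    return cv2.blur(img, (k, k))                      # average over the k*k template

def laplacian_enhance(img, c=-1.0):
    # g = f + c * laplacian(f); c is typically -1 for the standard OpenCV
    # Laplacian kernel, whose centre coefficient is negative.
    lap = cv2.Laplacian(img, cv2.CV_64F)
    out = img.astype(np.float64) + c * lap
    return np.clip(out, 0, 255).astype(np.uint8)

def clahe_enhance(img, clip=2.0, tiles=(8, 8)):
    clahe = cv2.createCLAHE(clipLimit=clip, tileGridSize=tiles)
    return clahe.apply(img)                           # contrast-limited AHE

def msr_enhance(img, sigmas=(15, 80, 250)):
    img = img.astype(np.float64) + 1.0                # avoid log(0)
    weights = [1.0 / len(sigmas)] * len(sigmas)       # weights sum to 1
    out = np.zeros_like(img)
    for w, s in zip(weights, sigmas):
        blurred = cv2.GaussianBlur(img, (0, 0), s)    # Gaussian surround at scale s
        out += w * (np.log(img) - np.log(blurred))    # weighted single-scale Retinex
    out = (out - out.min()) / (out.max() - out.min()) * 255.0
    return out.astype(np.uint8)
```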
2.4. The Design of an Adaptive Selection Method for Finger Vein Image Restoration Tasks Based on Deep Reinforcement Learning
In practical application situations, captured low-quality finger vein images may have multiple image problems, and it is not desirable to consider only one image problem and perform restoration. For this reason, we introduce DRL to the finger vein restoration task for the first time, viewing the restoration task as a sequential action of restoring images and the low-quality finger vein images as states. DRL is used to train the agent to select restoration behaviors based on the state of the image. The method can gradually restore the image, combining different restoration methods into the best means of restoration. If there are multiple image problems, the agent selects the restoration behaviors consecutively. In this section, we describe in detail the design ideas and the network structure of the DRL-FVRestore method proposed in this paper.
2.4.1. The Design Idea of the DRL-FVRestore Method
The architecture of the DRL-FVRestore at step $t$ is shown in
Figure 6. As can be seen from the figure, the state is the input to the agent: at step $t$, the agent receives the current input finger vein image $I_{t}$ and its input value vector $V_{t-1}$, which is obtained from the network output of the agent at step $t-1$. Based on the maximum entry of the output value vector $V_{t}$, the action $a_{t}$ is selected, and the corresponding method is chosen to restore the current finger vein image in the image restoration task. After the restoration process, the restored image $I_{t+1}$ and the value vector $V_{t}$ are obtained; based on the maximum entry of the next output, the following action is selected, and the restoration process continues iteratively on $I_{t+1}$ until the desired value is reached. At this point, the agent chooses the stop action and ends the restoration process. Therefore, to design an agent that can select a restoration action based on the state of the image, it is first necessary to determine the ‘state’, ‘action’, and ‘reward’ of the agent.
The state of the agent. The agent needs an ‘observation’ to perceive the input finger vein image information, and in reinforcement learning, the ‘state’ is this observation. As shown in
Figure 6, the input ‘state’ of the agent consists of the current finger vein image $I_{t}$ to be processed and the vector of values $V_{t-1}$ obtained by the agent at the previous step. Knowledge of the previous decision can help in the selection of the restoration action for the current step.
The action of the agent. In this work, we define the action space as an 8-dimensional vector, as shown in
Figure 6. Among them, the first seven actions each correspond to a finger vein image restoration method, and the eighth is the stopping action; the restoration task for the current image ends when the agent selects the stop action.
The reward of the agent. The reward drives the training of the agent as it maximizes the cumulative reward. The ‘reward’ evaluates the image quality of the input finger vein image after restoration. To achieve a final restored finger vein image with clear details and satisfactory quality, we introduce a loss on the vein features as a texture constraint in the reward function, together with the perceptual loss $L_{perc}$ of the original image, to jointly constrain the training of the agent, where $L_{perc}$ is the same as the perceptual loss used for the motion blur removal task. Assuming that the agent is at step $t$, the specific equations are shown in Equations (20) and (21), which are computed from the image restored in the previous step of the agent and from the finger vein image obtained by the agent at the end of the current step. The vein features are extracted based on the direction valley method [33]. To ensure that the image quality is improved at each step, our stepwise reward design is shown in Equation (22):
In this study, we designed reasonable reward and punishment mechanisms in Equations (20) and (21) to constrain the agent’s behavior. In order to improve the quality of the finger vein image after each restoration, we designed a progressive reward $r_{t}$, as shown in Equation (22). Assuming that the value function computed by Equations (20) and (21) for the current input image is $v_{t-1}$, and the value function computed for the output image after restoration is $v_{t}$, then $r_{t} = v_{t-1} - v_{t}$. By examining the value of $r_{t}$, the agent’s behavior can be controlled: if $r_{t} \le 0$, the agent stops the restoration behavior; if $r_{t} > 0$, the agent continues with the restoration behaviors.
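The exact forms of Equations (20)–(22) are not reproduced above. One plausible instantiation, consistent with the description (a vein-feature texture term plus a perceptual term, rewarded by the decrease of their sum), is sketched below; the symbols $F(\cdot)$ (vein feature extraction), $I_{ref}$ (the clear reference image available during training), and the weight $\lambda$ are assumptions.

```latex
% Hedged sketch of a possible value function and stepwise reward.
v_{t} = \big\lVert F(\hat{I}_{t}) - F(I_{ref}) \big\rVert_{1}
      + \lambda\, L_{perc}\big(\hat{I}_{t},\, I_{ref}\big),
\qquad
r_{t} = v_{t-1} - v_{t}
```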
2.4.2. The Network Architecture of the DRL-FVRestore Method
The agent of DRL-FVRestore consists of three modules, as shown in
Figure 6. The first module is the feature extraction module, where we use SE-ResNeXt [
34] to perform feature extraction on the input finger vein image. The second module retains the information of the previously selected action by encoding the finger vein restoration action; it is implemented using a one-hot encoder, which has an 8-dimensional input and a 7-dimensional output. The third module is a long short-term memory (LSTM) [
35]. The LSTM not only observes the state of the current input image, but also saves the state of the historically restored images, providing contextual information about the historically restored images and actions. Finally, an FC layer is added behind the LSTM to output a value vector for the selection of finger vein restoration tasks.
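As a rough illustration of this three-module design, the following PyTorch sketch assembles a feature extractor (a small convolutional stand-in for SE-ResNeXt), an action encoder, an LSTM, and a final FC layer; all layer sizes and the class/variable names are illustrative assumptions.

```python
# Minimal PyTorch sketch of the agent: feature extractor -> action encoder -> LSTM -> FC.
import torch
import torch.nn as nn

class RestoreAgent(nn.Module):
    def __init__(self, n_actions=8, feat_dim=256, action_embed_dim=7, hidden=128):
        super().__init__()
        # Module 1: image feature extractor (stand-in for SE-ResNeXt)
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        # Module 2: encodes the previously selected action (one-hot, 8 -> 7 dims)
        self.action_encoder = nn.Linear(n_actions, action_embed_dim)
        # Module 3: LSTM keeps context over previously restored images and actions
        self.lstm = nn.LSTM(feat_dim + action_embed_dim, hidden, batch_first=True)
        # Output: value vector over the restoration actions
        self.fc = nn.Linear(hidden, n_actions)

    def forward(self, image, prev_action_onehot, hidden_state=None):
        feat = self.backbone(image)                       # (B, feat_dim)
        act = self.action_encoder(prev_action_onehot)     # (B, action_embed_dim)
        x = torch.cat([feat, act], dim=1).unsqueeze(1)    # (B, 1, feat_dim + embed)
        out, hidden_state = self.lstm(x, hidden_state)
        values = self.fc(out.squeeze(1))                  # (B, n_actions)
        return values, hidden_state
```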
3. Experimental Data
All models are trained in parallel on two RTX 3090 GPUs with an Intel(R) Xeon(R) Silver CPU E5-2678 v3 @ 2.50 GHz, 128 GB RAM, and the Ubuntu operating system, using the PyTorch framework. For the image deblurring networks, the models were trained with the Adam [36] optimizer at a fixed learning rate for 200 epochs, followed by another 200 epochs with a linear decay of the learning rate. For the image damage restoration network, we optimized the model using the Adam optimizer. In this procedure, the generators G1 and G2 are each trained with an initial learning rate; when the loss has smoothed out, we reduce the learning rate and continue training G1 and G2 until convergence. Assuming that the amount of data in the training dataset is m and the number of network iterations is N, the model is trained on m × N images with different damages. In our training, the loss of the image defect repair model tends to stabilize when N reaches 1000. In DRL-FVRestore, the DQN network [37] was used to train the agent. For each training step, the target network calculates the target value using the discount factor, and the agent is updated by minimizing the loss between the predicted and target values.
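For reference, a standard DQN update of the kind referenced above could look as follows; the discount factor, optimizer handling, and replay-batch format are illustrative assumptions rather than the paper's exact settings.

```python
# Sketch of a standard DQN update step.
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    states, actions, rewards, next_states, done = batch
    # Q-value of the action actually taken
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Target value computed by the (periodically synchronised) target network
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * q_next * (1.0 - done)
    loss = F.mse_loss(q_pred, q_target)     # minimise the target loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```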
3.1. The Introduction of Public Datasets
In this study, four public datasets were used for experimental validation, including MMCBNU_6000 [
38], FV-USM [
39], UTFVP [
40], and SDUMLA-FV [
41], the details of which are shown in
Table 1. We divide each dataset into training and testing sets in a ratio of 8:2 to train and test the proposed network model. In particular, 20% of the training set is set aside as a validation set. We use the method of [42] to extract the ROI from the images in each dataset. To prevent overfitting, we augment the data by flipping, adding noise, mirroring, and offsetting the input images. Before training, we standardize the size of the input images to 256 × 256.
In this study, we classify the blurred images captured from finger veins into two general types of blurs: motion blur and Gaussian blur. Among them, Gaussian blur is mainly determined by the standard deviation
and the convolution kernel, where the larger the standard deviation the greater the blur, and the larger the convolution kernel the greater the blur. The motion blur is generated by a random matrix of motion blur kernels. In this study we use the albumentations library [
43] to add motion blur to the images in these public datasets. The albumentations motion blur operation takes a blur_limit parameter that specifies the range of the convolution kernel size; the larger the convolution kernel, the more pronounced the blur effect.
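A minimal albumentations pipeline of the kind described, with illustrative blur_limit ranges and file names (not the exact values used in the paper), is:

```python
# Example of adding motion blur or Gaussian blur with the albumentations library.
import albumentations as A
import cv2

transform = A.Compose([
    A.OneOf([
        A.MotionBlur(blur_limit=(3, 15), p=1.0),      # random motion-blur kernel size
        A.GaussianBlur(blur_limit=(3, 9), p=1.0),     # random Gaussian-blur kernel size
    ], p=1.0),
])

img = cv2.imread("finger_vein_roi.png", cv2.IMREAD_GRAYSCALE)
blurred = transform(image=img)["image"]
```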
Training dataset. To ensure that the network model can handle different blur kernels, the motion blur kernel, the standard deviation of the Gaussian blur and the convolution kernel of the Gaussian blur are all randomly generated for the public dataset, which is closer to the real case and improves the training effect and generalization ability of the model. When training, the data input to the network usually consists of several groups of images. Each set of images is a sharp image as the label and a blurred image corresponding to it as the input to the network. Finally, we input the processed data into the network for training to obtain the weights of the finger vein image restoration network with motion deblurring, and the weights of the finger vein image restoration network with Gaussian deblurring.
Testing dataset. To test the ability of the trained network to repair finger vein images, in this study we selected, for Gaussian blur, a set of standard deviation values and a set of convolution kernel sizes; the images of the test set are processed with these values as the parameters for Gaussian blurring, taking each convolution kernel size in turn for each standard deviation. For motion blur, a set of motion blur kernel sizes is used. By processing the data after randomly selecting blurring kernel parameters from these sets, a test dataset of finger vein images with different degrees of blurring can be obtained. This is closer to the real situation, and using it to test and validate the deblurring repair task is therefore more reliable and convincing.
3.2. The Design of Image Damaged Datasets
In this study, none of the four public datasets used provides images with damaged finger vein information that simulate the problems encountered in practice. The performance of the network in the finger vein image damage inpainting restoration task depends heavily on the design of the training dataset. The designed dataset should be kept as close as possible to the actual application; otherwise, the generalization ability of the network will be reduced. Therefore, in this study, we build the training dataset by using random masks to simulate damaged images, which is a common method of constructing damaged images [
44].
Training dataset. To continue this study, we designed an appropriate method to build a dataset with damaged finger vein information that simulates the actual problem. In real-world application situations, the damaged region of the finger vein often does not cover more than half of the image; therefore, in this study, the limit on the size of the constructed damaged region is set to half of the image size. To make the network model able to cope with different locations and sizes of damage, and to improve the generalization ability of the network so that it matches the real situation more closely, damaged blocks were placed in the finger vein images with both the location and the size of the damaged area randomized. Simultaneously, we used the maximum curvature feature extraction method to produce the feature image datasets used as vein texture structure features for the first stage of network training. Following the above approach, we built the training data of simulated defective images on the four public finger vein datasets.
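A minimal sketch of how such a randomly positioned, randomly sized damaged block could be generated (assuming a mask value of 0 for the damaged region) is shown below; it is illustrative rather than the authors' exact procedure.

```python
# Sketch of building a simulated damaged (masked) training image.
import numpy as np

def random_damage(image, rng=np.random.default_rng()):
    h, w = image.shape[:2]
    bh = int(rng.integers(10, h // 2))          # block height, at most half the image
    bw = int(rng.integers(10, w // 2))          # block width, at most half the image
    top = int(rng.integers(0, h - bh))
    left = int(rng.integers(0, w - bw))
    mask = np.ones_like(image, dtype=np.float32)
    mask[top:top + bh, left:left + bw] = 0.0    # 0 marks the damaged region
    damaged = image.astype(np.float32) * mask   # Hadamard product with the mask
    return damaged.astype(image.dtype), mask
```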
Testing dataset. The test dataset is independent of the training dataset, and, therefore, the test dataset must contain a variety of damage cases to ensure the reliability of the results. To test the ability of the trained network to repair finger vein images, in this study we created the test dataset according to the location of the damaged area of the finger image and the area of the damaged area. The damaged areas were divided into the left, middle and right sides of the finger image, and the damaged areas were divided into 20 × 20, 40 × 40, 60 × 60, 80 × 80, 100 × 100, and 120 × 120.
3.3. The Design of the DRL-FVRestore Dataset
In this paper, we propose an adaptive selection method for finger vein image repair tasks based on DRL. This method selects an appropriate restoration method for the restoration process based on the state of the low-quality finger vein images with image problems (blurred, damaged, noisy, etc.). Therefore, DRL-FVRestore needs to be trained with finger vein images that have image problems. To build this dataset, we divide the processing of the dataset into three steps. The first step is to determine the number of image problems, where each image needs to have at least one image problem. The second step is to identify the specific image problems and to process the clear finger vein image accordingly to obtain the finger vein image to be restored. The third step is to randomly generate the parameter values of each image problem. In this study, the image problems are divided into the blur problem, whose parameters are the standard deviation and the blur kernel, and the defect problem, whose parameters are the defect location and the defect area.
4. Experiments and Analysis of Results
In this section, we design the following experiments to verify the effectiveness of our proposed method. We divide the experimental part into the reliability validation of the motion blur restoration task, the Gaussian blur restoration task, the image damage restoration task, and DRL-FVRestore. We test our proposed method on four public datasets. In this paper, the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) [
45] are adopted as quality evaluation metrics to compare different algorithms for restoration tasks. PSNR and SSIM have been widely used in image quality evaluation. The larger the PSNR and SSIM values, the better the image quality.
The existing finger vein image matching algorithms can be broadly divided into texture feature based matching algorithms and feature point based matching algorithms. Image matching can be used to evaluate the performance improvement of the restored finger vein feature texture image. In order to verify the reliability of the methods, in this experiment we use both methods to simultaneously evaluate the performance improvement of the repaired images by the proposed method. The matching algorithm we adopt is described in detail below.
- (a)
Matching algorithm based on texture feature extraction. In this study, a single finger in each dataset was considered as a single individual. The template used for matching was obtained by fusing all ROI images of a single finger. We used the template matching method for recognition matching. As an example of the template matching method, we use the MMCBNU_6000 (6000/10) dataset. The template images were first obtained by fusing the 10 ROI images of the same category, and then the maximum curvature method [28] was used to obtain the vein feature images. Each original feature image is matched with the template feature image of its own class, i.e., 6000 intra-class matches, and with the templates of all other classes for the inter-class matches.
False acceptance rate (FAR) and false rejection rate (FRR) were used to assess the performance of the matches, which are two important metrics in the field of finger vein recognition. FRR is the probability that an authorized object is falsely rejected, and FAR is the probability that a non-authorized object is accepted as an authorized one [
3]. Different matching thresholds give different FAR and FRR values, while the equal error rate (EER) is the value at which FAR and FRR are equal. The Receiver Operating Characteristic (ROC) curve is drawn according to FAR and FRR. In the ROC curve, which plots FAR against FRR at different thresholds on the matching score, the lower the curve is located, the better the image quality. A sketch of how the EER can be computed from matching scores is given after this list. Each part of the experiment is described in detail below.
- (b)
Matching algorithms based on feature points. We use the SURF algorithm [
46] for feature point extraction from finger vein images. SURF is a local feature point description algorithm with accelerated robustness; its main steps are: (1) construct the Hessian matrix; (2) construct the Gaussian pyramid scale space; (3) apply non-maximum suppression to the initially detected feature points to precisely locate the extreme points; (4) compute the horizontal and vertical Haar wavelet responses in a neighbourhood of each feature point to obtain a 64-dimensional feature vector; and (5) finally, use the Euclidean distance between feature vectors to match feature points between finger vein images. A higher number of matched feature points for intra-class matching and a lower number for inter-class matching indicates a better-quality image with richer vein texture information. In this experiment, we compare the intra-class and inter-class probability density distribution plots and the feature point histogram statistics of different algorithms to analyze the reliability of the algorithms in this paper.
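For reference, the FAR/FRR/EER computation described in item (a) can be sketched as follows, assuming genuine (intra-class) and impostor (inter-class) matching score arrays where higher scores indicate better matches:

```python
# Sketch of computing FAR, FRR and EER from genuine and impostor matching scores.
import numpy as np

def compute_eer(genuine_scores, impostor_scores, n_thresholds=1000):
    genuine = np.asarray(genuine_scores)
    impostor = np.asarray(impostor_scores)
    thresholds = np.linspace(min(impostor.min(), genuine.min()),
                             max(impostor.max(), genuine.max()), n_thresholds)
    frr = np.array([(genuine < t).mean() for t in thresholds])   # false rejection rate
    far = np.array([(impostor >= t).mean() for t in thresholds]) # false acceptance rate
    idx = np.argmin(np.abs(far - frr))                           # point where FAR ~= FRR
    eer = (far[idx] + frr[idx]) / 2.0
    return eer, thresholds[idx]
```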
4.1. Reliability Verification of Motion Blur Restoration Tasks
To verify the reliability of the proposed motion deblurring restoration method for finger vein image restoration, several experiments were designed. In a practical application, the finger vein image of the subject is captured by the capture device. The system creates a template of the finger vein image and stores it in the database. When the subject needs to verify their identity, the capture device recaptures the current finger vein image and matches it to the template in the database to obtain a match score. However, when creating the templates, the captured finger vein images may have motion blur due to external factors and need to be restored before being made into templates. Therefore, depending on how the templates and matched images are processed, we considered the following cases, which are close to the actual application.
Case 1: The captured finger vein image is image restored by the method of this paper and then made into a template, and the template is matched with the finger vein image with motion blur during validation.
Case 2: The captured finger vein image is restored by the method in this paper and then made into a template, and the template is matched with the restored finger vein image by the method in this paper.
Case 3: The captured finger vein image from the acquisition device is made into a template directly without any processing, and the template is matched with the finger vein image with motion blur during verification.
Case 4: The finger vein image captured by the capture device is made into a template directly without any processing, and the template is matched to the restored finger vein images during validation.
Table 2 and
Table 3 show the values of EER obtained by doing matching authentication calculations on the public dataset for these four cases on different motion blur convolutions. Through the display of the experimental results in these tables, it can be found that as the convolution kernel becomes larger, the textured areas of the finger veins become more affected by the blurring, resulting in a serious lack of vein texture information and finally the performance of the recognition degrades. In Case 3 and Case 4, it is easy to find that the images captured by the acquisition device have motion blur, and after making the templates the feature information of the templates is not obvious due to the blurring, resulting in larger EER and lower recognition performance. Notably, the results of Case 4 were better than those of Case 3 due to the deblurring process of the matched images by the method in this paper. In Case 1 and Case 2, the captured images are deblurred by the method in this paper and then made into templates. The feature information of the templates was restored and the EER values calculated were lower and the recognition performance was better compared to Case 3 and Case 4. In Case 2, the matched data are deblurred compared to Case 1, which results in a better performance of Case 2. It is not difficult to see that, when comparing all the cases, only the template image and the matched image, which are both deblurred by the method in this paper, have smaller EER, higher image quality and better recognition performance in Case 2. Moreover, the generalization capability of our model can be seen by processing different motion blur convolution kernels.
To verify the effectiveness and reliability of the method proposed in this paper in the field of finger vein image deblurring, we processed the finger vein images from the test datasets by adding random motion blur kernels and restored them with the trained model. The PSNR and SSIM were used to quantitatively evaluate the image quality and similarity to the original image after restoration with the state-of-the-art method and the method used in this paper, as shown in
Table 4. It can be found that the method used in this paper has higher values of PSNR and SSIM, indicating that the motion blurred image is closer to the original image and has better image quality. In the process, we plotted the ROC curves of the images restored by the state-of-the-art method and the method used in this paper on the four publicly available datasets compared to the original images, as shown in
Figure 7. It can be found that the method adopted in this paper exhibits lower ROC curves, lower EER and better recognition performance compared to the other state-of-the-art methods. The deblurred images from this paper showed an average reduction of 4.31% in EER compared to the restored images from the state-of-the-art method on the four datasets.
The above experiments validate the effectiveness and reliability of our method in terms of both the quality of the restored images and the texture information. In order to verify that the restored images are rich in feature information, we have completed the intra-class and inter-class probability density distribution plots and feature point histogram statistical plots of different algorithms to analyze the reliability of the algorithms in this paper, as shown in
Figure 8 and
Figure 9. It can be seen that, for intra-class matching, the images restored by the method in this paper have a feature point probability density curve shifted further to the right and larger feature point counts in the histogram statistics; for inter-class matching, the probability density curve is closer to the zero point and the histogram counts are smaller and shifted to the left. Thus, the accuracy of matching between features of the same class is higher.
4.2. Reliability Verification of Gaussian Blur Restoration Tasks
To verify the reliability of the proposed deblurring method for finger vein image restoration, we designed the following experiments. We used the original image as a template and matched the template with the blurred finger vein image and the finger vein image restored after the deblurring task, respectively, as shown in
Table 5 and
Table 6. We chose convolutional kernel parameters different from those used for training as the test dataset to verify the generalization ability of the trained model. As the convolution kernel and variance become larger, the matched EER values become larger and the recognition performance decreases. In contrast, the recognition performance is improved by matching the restored finger vein images. This illustrates the positive impact of the Gaussian deblurring restoration task on subsequent recognition.
4.3. Reliability Validation of Finger Vein Image Damage Restoration Tasks
To verify the reliability of the proposed damaged finger vein image inpainting method for finger vein image restoration in this paper, we designed the following experiment. To simulate the actual application situation, we divided the damaged areas of the images into 20 × 20, 40 × 40, 60 × 60, 80 × 80, 100 × 100, and 120 × 120 for testing, and divided the damaged locations into the middle, left, and right of the image. We calculated the EER values of the simulated damaged images in the public dataset by template matching with the images restored by the method proposed in this paper, as shown in
Table 7,
Table 8 and
Table 9, respectively. The experimental results show that the recognition accuracy decreases as the damaged area increases, owing to the loss of finger vein texture detail; once the damaged area becomes large, the EER value is significantly higher. Finger vein images restored by the method proposed in this paper showed smaller EER values and better matching performance than the unrepaired damaged images in template matching.
The values of the EER for template matching of finger vein images after restoration by the method proposed in this paper were reduced by an average of 0.41% over the four datasets when the damaged area was between 20 and 80, compared to the images before restoration. When the damaged area is between 80 and 120, there is an average reduction of 1.71% compared to the unrestored images on the four datasets. It can be seen that the proposed method has better restoration performance and generalization ability for different damaged areas and damage sizes of the test dataset, especially for large damaged areas.
To verify the effectiveness and reliability of the proposed method in the area of finger vein image damaged restoration, the finger vein images from the test datasets were processed by adding a random simulation of the damaged area and size, and the trained model was used for restoration. We used PSNR to quantitatively assess the image quality and image similarity of the images restored with the state-of-the-art method, as well as the method used in this paper to the original images, as shown in
Table 10. It can be readily seen that the method used in this paper has a higher value of PSNR, indicating that the images after damaged restoration by the method used in this paper are of higher quality and have more vein detail information. At the same time, we plotted the ROC curves for the images recovered using the state-of-the-art method and the method used in this paper on each of the four public datasets, as well as for the original images, as shown in
Figure 10. It can be found that the damaged finger vein images have low matching performance due to the lack of finger vein detail information. The restored finger vein images using the proposed inpainting method showed an improved performance in matching authentication compared to the unrestored images, with an average reduction in EER of 3.77% across the four datasets. This result is mainly due to the two-stage restoration process used in this paper, where the second stage is guided by the vein structure. The restored images have more visible vein details which show a lower ROC curve and lower values of EER compared to the state-of-the-art method, with an average reduction of 1.12% over the four datasets. To verify validity still further, we compared the network with only one-stage restoration with the two-stage restoration proposed in this paper. On the four datasets, the EER of the two-stage restoration proposed in this paper is on average 1.33% lower than that of the one-stage restoration. This is a sound and reliable indication of the effectiveness of the two-stage restoration as proposed in this paper.
In order to verify that the images restored by this paper are rich in feature information, we did experiments based on feature point matching on four publicly available datasets, and plotted the intra-class and inter-class probability density distribution plots and feature point histogram statistical plots of different algorithms to analyze the reliability of the algorithms in this paper, as shown in
Figure 11 and
Figure 12. It can be seen that, after the defective images are repaired, the intra-class feature point probability density curve is shifted further to the right and the feature point histogram statistics are larger in number and shifted to the right, whereas for the inter-class comparisons the feature point probability density curve of the repaired images is closer to the zero point and the histogram statistics are smaller in number and shifted to the left. This indicates that the images repaired by the method in this paper recover more detailed information than other defect restoration methods.
4.4. Reliability Verification of the DRL-FVRestore
In this study, we propose the DRL-FVRestore method to select the appropriate restoration task for image restoration processing based on the current state of the input image. By analyzing the results from the previous experiments, the effectiveness and reliability of the proposed DRL-FVRestore method for the restoration tasks of the motion deblurring restoration task, the Gaussian deblurring restoration task, and the damaged image restoration task were verified. The above restoration methods can effectively improve the quality of images and increase the accuracy of recognition when there is an image problem in the matched image. However, when the matched image is faced with complex, multi-image problems, merely using a single repair method is not effective in improving the performance of the system. In the case of image problems, if the system only relies on a trial-and-error approach to processing, without implementing a restoration task selected according to the image state, it requires a great deal of time and the restoration results are less than optimal.
Figure 13 shows the output images of each stage of the DRL-FVRestore restoration. The first and second groups are images of defective finger veins, which were essentially repaired by the first stage of restoration, and the repaired images have clear feature information. The third and fourth groups are images with motion blur, noise and other problems; after each stage of denoising and deblurring, the image quality is improved and the vein detail is enhanced. Below the images are the PSNR values for each image, and it can be seen that the quality of the output image improves at each stage of the DRL-FVRestore restoration.
The method proposed in this paper can effectively solve this problem. Based on the idea of reinforcement learning, the restoration task is selected by the trained agent. The agent is restored based on the state of the image. To verify the effectiveness and reliability of the DRL-FVRestore method in solving the multiple complex problems present in low-quality finger vein images, two image problems were randomly added to the test datasets. The ROC curves are plotted using the trained network for processing, as shown in
Figure 14. When finger vein images have complex image problems, the restored finger vein images obtained by processing with the DRL-FVRestore method proposed in this paper are compared with those restored by a single restoration method (specifically, motion deblurring restoration, Gaussian deblurring restoration, or damage restoration). This reveals that the method in this paper has a lower EER value, an ROC curve closer to the coordinate axes, and better recognition performance. The average EER value of the method in this paper was reduced by 3.98% on the four datasets compared to the single methods.
This is because the trained agent performs restoration processing in several steps, namely: 1. observing the state of the image; 2. selecting a damage restoration task to restore the image if the image is damaged; 3. performing a deblurring restoration task if blur is present; and, finally, 4. selecting an enhancement method for image texture enhancement, applied in appropriate amounts, based on the quality and texture detail of the image.
In contrast, matching recognition using images processed by only a single motion deblurring or Gaussian deblurring restoration method was not as effective as the DRL-FVRestore method, although its EER was also reduced. The reason for this is that the randomly added image problems in the test set contain missing regions, which significantly degrade the detail of the image features and thereby reduce performance.
In addition, we added experiments based on feature point matching and plotted the intra-class and inter-class probability density distribution plots and feature point histogram statistical plots of different algorithms to analyze the reliability of the algorithms in this paper, as shown in
Figure 15 and
Figure 16. It can be seen that the intra-class probability density distributions and feature point histogram statistics of the images restored by DRL-FVRestore are shifted further to the right than those of a single restoration method, indicating that the restored images are richer in detail. The DRL-FVRestore method proposed in this paper selects the appropriate restoration method according to the state of the image and can apply restorations continuously, which is beneficial for handling finger vein images with multiple problems.