• Prewitt Operator

The Prewitt filter detects the horizontal and vertical edges of an image by locating the pixels where the gray values change steeply [26]. The Prewitt operator consists of two 3 × 3 convolution masks [25–27]:

$$G_{y} = \begin{bmatrix} +1 & 0 & -1 \\ +1 & 0 & -1 \\ +1 & 0 & -1 \end{bmatrix} \ast A(x, y) \tag{2}$$

$$G_{x} = \begin{bmatrix} +1 & +1 & +1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix} \ast A(x, y) \tag{3}$$

where *A* is the source image and ∗ denotes the 2D convolution operation.
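As an illustration, the two masks in Equations (2) and (3) can be applied with a plain 2D convolution. The NumPy sketch below (not part of the paper's MATLAB implementation) computes both responses and combines them into a gradient magnitude:

```python
import numpy as np

def convolve2d(A, K):
    """Plain 2D convolution (kernel flipped), 'valid' region only."""
    K = np.flipud(np.fliplr(K))  # flip kernel for true convolution
    kh, kw = K.shape
    h, w = A.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(A[i:i + kh, j:j + kw] * K)
    return out

# Prewitt masks from Equations (2) and (3)
Gy_mask = np.array([[ 1, 0, -1],
                    [ 1, 0, -1],
                    [ 1, 0, -1]], dtype=float)
Gx_mask = np.array([[ 1,  1,  1],
                    [ 0,  0,  0],
                    [-1, -1, -1]], dtype=float)

def prewitt_edges(A):
    """Combine both directional responses into a gradient magnitude."""
    gx = convolve2d(A, Gx_mask)
    gy = convolve2d(A, Gy_mask)
    return np.hypot(gx, gy)
```

On a vertical intensity step, only the first mask responds, which is how the two directions are separated before the magnitudes are combined.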

• The Laplacian operator

The Laplace operator is computed using second-order derivative approximations of the image function *A*(*x*, *y*). It is sensitive to noise, so it is often combined with a Gaussian filter to decrease this sensitivity [28]. The Laplacian filter searches for the zero-crossing points of the second-order derivatives of the image function and identifies the rapid changes in adjacent pixel values that belong to an edge [28,29].

$$\nabla^2 A(x, y) = \frac{\partial^2 A(x, y)}{\partial x^2} + \frac{\partial^2 A(x, y)}{\partial y^2} \tag{4}$$

A zero value indicates areas of constant intensity, while negative or positive values occur in the vicinity of an edge.
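The behavior described above can be checked with the standard 3 × 3 discrete approximation of Equation (4) (an assumed kernel choice, not taken from the paper): a constant region gives zero response, while the two sides of an edge give responses of opposite sign.

```python
import numpy as np

# Standard 3x3 discrete approximation of the Laplacian in Equation (4)
LAPLACIAN = np.array([[0,  1, 0],
                      [1, -4, 1],
                      [0,  1, 0]], dtype=float)

def laplacian(A):
    """Apply the discrete Laplacian kernel over the 'valid' region."""
    h, w = A.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(A[i:i + 3, j:j + 3] * LAPLACIAN)
    return out
```

The zero crossing between the positive and negative responses marks the edge location.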

• The Laplacian of Gaussian (*LoG*) operator

For an image *A*(*x*, *y*) with pixel intensity values at coordinates (*x*, *y*), combining the Laplacian and Gaussian functions generates a new operator, *LoG* [20], centered on zero and with Gaussian standard deviation σ:

$$LoG(x,y) = -\frac{1}{\pi \sigma^4} \left[ 1 - \frac{x^2 + y^2}{2\sigma^2} \right] e^{-\frac{x^2 + y^2}{2\sigma^2}} \tag{5}$$

The Gaussian operator suppresses the noise before the Laplace operator is used for edge detection. The *LoG* operator detects areas where the intensity changes rapidly: the function's values are positive on the darker side (pixel values close to 0) and negative on the brighter side (pixel values close to 255) [30].
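A *LoG* kernel can be obtained by sampling Equation (5) on a grid centered on zero; the sketch below does this in NumPy (the zero-mean normalization step is an assumption added here so that flat regions give exactly zero response, not a detail stated in the paper):

```python
import numpy as np

def log_kernel(size=9, sigma=1.4):
    """Sample Equation (5) on a size x size grid centered on zero."""
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    s2 = x**2 + y**2
    k = (-(1.0 / (np.pi * sigma**4))
         * (1 - s2 / (2 * sigma**2))
         * np.exp(-s2 / (2 * sigma**2)))
    return k - k.mean()  # zero-mean so constant-intensity regions give zero
```

Convolving an image with this kernel applies Gaussian smoothing and the Laplacian in a single pass; the strongest (negative) coefficient sits at the kernel center.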

#### 2.2.2. Dataset

The analyzed fingerprint images belong to the FVC2004 database, which is the property of the University of Bologna, Italy [31]. The image data are described, in detail, in Table 1.

**Table 1.** Dataset characteristics.


To mitigate the limitations of low-quality fingerprint images, both the Prewitt and *LoG* filters were used to enhance the edges that separate the ridges and valleys in the fingerprint images [9,10]. Figure 1 displays examples of image enhancement from each database and filter used.

#### 2.2.3. Data Augmentation

Optimizing a CNN on a small dataset requires avoiding convergence of the network to a local minimum. This issue is addressed by augmentation, which extends the training dataset and prevents overfitting during training. The number of images was increased nine-fold. Data augmentation was performed by applying rotations in ±30° steps over the 0° to 360° range. This provided a total of 3528 images, of which 2469 were used as training samples. Additionally, each image was resized from 256 × 256 pixels to 80 × 80 pixels to make it more suitable as network input and to reduce the training time. Figure 2 shows an example of data augmentation.
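The rotation-and-resize pipeline above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the nearest-neighbor interpolation and the nine-angle schedule (multiples of 30°) are assumptions made here for concreteness.

```python
import numpy as np

def rotate_nn(img, deg):
    """Nearest-neighbor rotation about the image center (same output size)."""
    th = np.deg2rad(deg)
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    yy, xx = np.mgrid[0:h, 0:w]
    # inverse-map each output pixel back into the source image
    xs = np.cos(th) * (xx - cx) + np.sin(th) * (yy - cy) + cx
    ys = -np.sin(th) * (xx - cx) + np.cos(th) * (yy - cy) + cy
    xs = np.clip(np.round(xs), 0, w - 1).astype(int)
    ys = np.clip(np.round(ys), 0, h - 1).astype(int)
    return img[ys, xs]

def resize_nn(img, new_h, new_w):
    """Nearest-neighbor resize, e.g. 256x256 -> 80x80 as in the text."""
    h, w = img.shape
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[np.ix_(rows, cols)]

def augment(img, step=30, copies=9):
    """Hypothetical schedule: `copies` rotated versions in `step`-degree increments."""
    return [resize_nn(rotate_nn(img, k * step), 80, 80) for k in range(1, copies + 1)]
```

Each source image thus yields nine rotated, downsized variants, matching the nine-fold dataset increase described above.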

#### 2.2.4. Convolutional Neural Network

A CNN architecture aggregates convolutional modules that perform feature extraction, pooling layers, and fully connected layers [32]. CNNs perform well in image recognition tasks. To optimize the performance of a CNN model, it has to be trained to extract the most important deep discriminative features. As a general description of a CNN architecture and its learning strategy, a gradual training process designed to capture increasingly complex concepts is employed. Thus, the early layers detect the general features of a given image, and from layer to layer the convolution filters are trained to detect more and more complex patterns, such as object features (Figure 3). The model architecture

is given in Table 2. The parameter settings and the hyperparameters selected for the proposed CNN model are also presented (Table 3); they were established before training. An epoch means that the whole dataset passes once forward and backward through the neural network. Usually, the number of epochs is chosen at the point where the validation accuracy starts decreasing, even if the training accuracy is still increasing. In addition, one epoch is too large to be run through the model as a whole, so it is divided into several smaller subsamples called batches. A higher number of epochs increases both the computational cost and the risk of overfitting. The variation in the number of epochs stops when the validation loss no longer improves.
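The epoch/batch relationship described above can be made concrete with a minimal iterator. The 2469 training samples come from Section 2.2.3; the batch size of 32 is an assumed value here (the actual setting is listed in Table 3).

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, rng):
    """One epoch = one full shuffled pass over the dataset, split into batches."""
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        sel = order[start:start + batch_size]
        yield X[sel], y[sel]

# 2469 training samples of 80x80 pixels, as stated in Section 2.2.3;
# batch size 32 is an illustrative assumption
X_train = np.zeros((2469, 80, 80), dtype=np.float32)
y_train = np.zeros(2469, dtype=int)
rng = np.random.default_rng(0)
batches = list(iterate_minibatches(X_train, y_train, 32, rng))
```

With these numbers, one epoch consists of 78 batches, the last one smaller than the rest; repeating the loop once per epoch gives the forward-and-backward passes counted in the figures.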

**Figure 1.** Samples of fingerprints from our evaluation datasets and examples of enhanced fingerprint images. Columns: **left**—raw grayscale images; **middle**—edge enhancement using the Prewitt filter; **right**—edge enhancement using the *LoG* filter.

**Figure 2.** Data augmentation: fingerprint rotated by ±30°.

**Figure 3.** CNN model architecture.

**Table 2.** Model architecture and parameter settings.


Total parameters: 229,843; trainable parameters: 229,843. 'None' indicates that any positive integer may occur, so the model is able to process batches of any size.


**Table 3.** The hyper-parameters of the CNN model.

The convolutional layers extract features from the input images. The pooling layers reduce the size of the input representation: they aggregate features, and their basic task is to reduce the feature dimensions, i.e., the feature map size. The ReLU (rectified linear unit) activation function does not activate all the neurons at the same time: if the output of the linear transformation is negative, the neuron is deactivated. The fully connected or dense layers exploit the learned high-level features and act as classifiers. Additionally, a large amount of data is required to reduce overfitting and enhance the CNN performance; this issue is addressed by augmentation. To evaluate the performance of the proposed method, the accuracy of fingerprint identification is calculated [33].
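The two elementary operations described above, ReLU activation and max pooling, can be sketched in a few lines of NumPy (single-channel feature maps only, for illustration; the actual model uses the Keras layers listed in Table 2):

```python
import numpy as np

def relu(x):
    """Neurons whose linear output is negative are deactivated (output 0)."""
    return np.maximum(x, 0.0)

def max_pool_2x2(fmap):
    """Reduce the feature map size by keeping the maximum of each 2x2 block."""
    h, w = fmap.shape
    f = fmap[:h - h % 2, :w - w % 2]          # drop odd trailing row/column
    return f.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))
```

Each pooling stage therefore halves both spatial dimensions, which is how the 80 × 80 input shrinks toward the dense classifier layers.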

### **3. Results and Discussion**

The experiment was carried out in MATLAB R2018a (The MathWorks, Natick, MA, USA), using the proposed approach and the image processing toolbox. The CNN was implemented using Python (Jupyter Notebook) and the open-source platform Keras for the TensorFlow machine learning toolbox. It was run in Google Colaboratory (Colab).

The image datasets were stored in Google Drive and the workspaces were connected. Of the total dataset, 60%, 20%, and 20% were used as the training set, validation set, and test set, respectively. The training and validation classification accuracy rates of the CNN model over 10, 20, 30, and 50 epochs are shown in Figures 4–7. The classification accuracy is defined as the ratio between the correct predictions and the total number of predictions in the training or validation data.

As shown in Figures 4–7, during the training of the CNN, the accuracy of the training set (blue line) continued to increase and the network was learning constantly. The validation set (orange line) first increased, then overfitting occurred and the accuracy showed an unstable variation. The same behavior was observed for loss curves. Consequently, we investigated the number of epochs and selected the best number to solve the overfitting problem.

The number of epochs was set to 10, 20, 30, and 50 in order to keep the training time at an acceptable value of 1.8 s/epoch, on average. Prior to each training epoch, the training data was randomly shuffled. The performance of the proposed model is summarized in Table 4.

**Figure 4.** Illustration of model accuracy rate for 50, 30, 20, and 10 epochs for DB1 dataset acquired using an optical sensor "V300" by CrossMatch.

**Figure 5.** Illustration of model accuracy rate for 50, 30, 20, and 10 epochs for DB2 dataset acquired using an optical sensor "U.are.U 4000".

**Figure 6.** Illustration of model accuracy rate for 50, 30, 20, and 10 epochs for DB3 dataset acquired using a thermal sweeping sensor "FingerChip FCD4B14CB" by Atmel.

**Figure 7.** Illustration of model accuracy rate for 50, 30, 20, and 10 epochs for DB4 dataset generated as synthetic fingerprints.


**Table 4.** The performance of the proposed model.

The results in Figures 4–7 indicate that, while the accuracy and loss on the training data have very good values, the accuracy and loss on the validation dataset are influenced by the epoch number, indicating overfitting or underfitting. Our data indicate that the model starts overfitting at the seventh epoch. In this case, it is necessary to stop training early by tuning this hyperparameter. The CNN performance is strongly influenced by the quality of a fingerprint image and by its local and global structures. The accuracy of the proposed CNN model depends on the amount and quality of the training images, which in our case show considerable variability from dataset to dataset. The performance on the test data (20% of each dataset) is lower than the accuracy obtained on the training data, although it should be noted that the number of samples in the test set is smaller. However, the LoG filter increased the accuracy compared to the Prewitt filter and is therefore a better solution for enhancing the edges in fingerprint images. Additionally, the raw images in the DB2 dataset, which were acquired using the optical sensor "U.are.U 4000", had a low quality that affected the classification performance.
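A minimal sketch of such an early-stopping criterion, monitoring validation loss with a hypothetical patience parameter (the specific rule used in the paper is not stated, so this is an assumed formulation):

```python
def early_stopping_epoch(val_losses, patience=1):
    """Return the 1-based epoch at which training should stop: the first
    epoch after which validation loss has failed to improve for
    `patience` consecutive epochs (hypothetical criterion)."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            since_best = 0
        else:
            since_best += 1
            if since_best >= patience:
                return epoch
    return len(val_losses)   # never triggered: run all epochs
```

With a loss curve that improves for six epochs and then degrades, this rule stops at epoch seven, consistent with the overfitting onset observed above.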

According to the data in Table 4, the accuracy values determined on the training and validation sets are in line with results reported in the literature. In [3], an accuracy of 85% was reported for images belonging to the FVC2004 database, processed using a CNN-based automatic latent fingerprint matching system that uses local minutiae features. Mohamed et al. [14] reported a 99.2% classification accuracy for the training set in an experiment that used the NIST DB4 dataset and 4000 fingerprint images. Militello et al. [15] reported an accuracy value of 91.67% for a pre-trained CNN, used together with the PolyU and NIST fingerprint databases.

The accuracy values determined for the test set were slightly worse, indicating a small drop in performance. However, it should be noted that our proposed method uses whole fingerprint images and that its computation time is small compared to other reported methods. For example, in [34], an accuracy of 94.4% and a testing time of 39 ms/image were reported for a pre-trained CNN architecture of the VGG-F network type, and an accuracy of 95.05% and a testing time of 77 ms/image for the VGG-S network.

In addition, CNN architectures have some drawbacks, such as a poor generalization capacity, the requirement for a huge training dataset, and low stability to geometric deformation and rotation. In the present study, the low generalization capacity was overcome by increasing the training data size to allow the network to train on as many samples as possible. The obtained results indicate that the CNN performance is greatly influenced by the quality of the fingerprint images. The low stability of the network is due to the diversity of finger scanners used to acquire the fingerprints, including optical and thermal sweeping sensors, as well as synthetic fingerprints. Finding a solution that provides a good classification performance for this variety of data posed a considerable challenge for our method. Our approach integrates the monitoring of training and validation by setting the number of epochs as a form of regularization, and uses learning curve graphs to decide on model convergence.

#### **4. Conclusions**

The work conducted in this paper is mainly devoted to fingerprint identification using a CNN that performs fingerprint classification by considering whole fingerprint images. The proposed algorithm uses poor-quality original raw fingerprint images. These were processed using the Prewitt and LoG filters to enhance the edges and, in order to reduce the expensive training cost, image resizing was applied. Hyper-parameter tuning, using various epoch numbers, was considered to improve the classification performance. Our results indicate that the model starts overfitting at the seventh epoch. The classification accuracy varied from 67.6% to 98.7% for the validation set, and from 70.2% to 75.6% for the test set. Following these considerations, we would argue that the proposed method can achieve a very good performance compared to traditional hand-crafted feature methods, despite the fact that it uses raw data and does not perform any handcrafted feature extraction operations.

For future developments, we are interested in improving the performance of classification by using other pre-processing techniques correlated to extensive hyper-parameter tuning. Additionally, other fingerprint databases will be used to assess the generalization capabilities of CNN architectures.

**Author Contributions:** Conceptualization, S.M. and L.M.; methodology, S.M. and L.M.; software, S.M., A.-M.D.L. and L.M.; validation, S.M. and A.-M.D.L.; formal analysis, A.-M.D.L.; investigation, S.M. and A.-M.D.L.; writing—original draft preparation, S.M., A.-M.D.L. and L.M.; writing—review and editing, S.M. and L.M. All authors have read and agreed to the published version of the manuscript.

**Funding:** This research received no external funding.

**Institutional Review Board Statement:** Not applicable.

**Informed Consent Statement:** Not applicable.

**Data Availability Statement:** Not applicable.

**Acknowledgments:** The authors thank the anonymous referees whose comments helped to improve the paper.

**Conflicts of Interest:** The authors declare no conflict of interest.

#### **References**

