*3.1. Local Appearance-Based Techniques*

Local appearance-based techniques are geometrical techniques, also called feature-based or analytic techniques. In this case, the face image is represented by a set of low-dimensional distinctive vectors or small regions (patches). Local appearance-based techniques focus on critical points of the face, such as the nose, mouth, and eyes, to capture more detail. They also take into account the particularity of the face as a natural form to identify, and use a reduced number of parameters. In addition, these techniques describe the local features through pixel orientations, histograms [13,26], geometric properties, and correlation planes [3,33,41].

• Local binary pattern (LBP) and its variants: LBP is a general texture technique used to extract features from any object [16]. It has been widely applied in many applications such as face recognition [3], facial expression recognition, texture segmentation, and texture classification. The LBP technique first divides the facial image into spatial arrays. Next, within each array square, a 3 × 3 pixel matrix (p1, ..., p8) is mapped across the square. Each pixel of this matrix is thresholded against the value of the center pixel (p0) (i.e., the intensity value i0 of the center pixel serves as the threshold) to produce a binary code: if a neighbor pixel's value is lower than the center pixel value, it is assigned a zero; otherwise, it is assigned a one. The binary code contains information about the local texture. Finally, for each array square, a histogram of these codes is built, and the histograms are concatenated to form the feature vector. The LBP operator over a matrix of size 3 × 3 is defined in Equation (1).

$$\text{LBP} = \sum_{p=1}^{8} 2^{p-1}\, s(i_p - i_0), \quad \text{with } s(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0 \end{cases} \tag{1}$$

where *i*<sub>0</sub> and *i<sub>p</sub>* are the intensity values of the center pixel and the neighborhood pixels, respectively. Figure 3 illustrates the procedure of the LBP technique.
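The per-pixel coding of Equation (1) and the concatenation of per-cell histograms can be sketched as follows. This is an illustrative NumPy sketch, not the implementation used by any of the cited works; the `grid` parameter (number of array squares per dimension) is an assumption.

```python
import numpy as np

def lbp_image(img):
    """Basic 3x3 LBP code for each interior pixel of a 2-D grayscale image.

    Border pixels are skipped for simplicity. Implements Equation (1):
    bit p is set when neighbor i_p >= center i_0.
    """
    img = img.astype(np.int32)
    h, w = img.shape
    codes = np.zeros((h - 2, w - 2), dtype=np.uint8)
    # Offsets of the 8 neighbors p1..p8, clockwise from the top-left.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    center = img[1:h - 1, 1:w - 1]
    for p, (dy, dx) in enumerate(offsets):
        neighbor = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # s(i_p - i_0): 1 if the neighbor is >= the center pixel, else 0.
        codes |= (neighbor >= center).astype(np.uint8) << p
    return codes

def lbp_histogram(img, grid=(8, 8)):
    """Concatenate per-cell histograms of LBP codes into one feature vector."""
    codes = lbp_image(img)
    feats = []
    for row in np.array_split(codes, grid[0], axis=0):
        for cell in np.array_split(row, grid[1], axis=1):
            hist, _ = np.histogram(cell, bins=256, range=(0, 256))
            feats.append(hist)
    return np.concatenate(feats)
```

For an 8 × 8 grid of cells, the resulting descriptor has 64 × 256 entries; real systems usually shrink this with uniform-pattern binning.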

Khoi et al. [20] proposed a fast face recognition system based on LBP, the pyramid of local binary patterns (PLBP), and the rotation-invariant local binary pattern (RI-LBP). Xi et al. [15] introduced a new unsupervised deep-learning-based technique, called the local binary pattern network (LBPNet), to extract hierarchical representations of data. LBPNet maintains the same topology as a convolutional neural network (CNN). Experimental results obtained on public benchmarks (i.e., LFW and FERET) showed that LBPNet is comparable to other unsupervised techniques. Laure et al. [40] implemented a method that helps to solve face recognition issues under large variations of parameters such as expression, illumination, and pose. This method combines two techniques: LBP and K-NN. Owing to its invariance to rotation of the target image, LBP has become one of the important techniques used for face recognition. Bonnen et al. [42] proposed a variant of the LBP technique named the multiscale local binary pattern (MLBP) for feature extraction. Another LBP extension is the local ternary pattern (LTP) technique [43], which is less sensitive to noise than the original LBP. Instead of a binary threshold, LTP encodes the difference between each neighboring pixel and the central pixel with three states (above, within, or below a tolerance band around the central value). Hussain et al. [36] developed the local quantized pattern (LQP) technique for face representation. LQP is a generalization of local pattern features and is intrinsically robust to illumination conditions. LQP features use a disk layout to sample pixels from the local neighborhood and obtain a pair of binary codes using ternary split coding. These codes are quantized, each one using a separately learned codebook.
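The three-state coding and ternary split used by LTP can be sketched for a single 3 × 3 patch as below. This is a minimal illustration of the idea in [43]; the tolerance value `t` is a hypothetical parameter, not one taken from the cited paper.

```python
import numpy as np

def ltp_codes(patch, t=5):
    """Local ternary pattern of one 3x3 patch (illustrative sketch).

    Each neighbor is coded +1 / 0 / -1 depending on whether it lies above,
    within, or below a tolerance band of half-width t around the center
    pixel; the ternary code is then split into an "upper" and a "lower"
    binary code, as in the LTP scheme.
    """
    patch = np.asarray(patch, dtype=np.int32)
    center = patch[1, 1]
    # Neighbors p1..p8, clockwise from the top-left corner.
    neighbors = np.array([patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                          patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]])
    ternary = np.where(neighbors >= center + t, 1,
                       np.where(neighbors <= center - t, -1, 0))
    upper = sum(1 << p for p, v in enumerate(ternary) if v == 1)
    lower = sum(1 << p for p, v in enumerate(ternary) if v == -1)
    return upper, lower
```

Because small intensity fluctuations fall inside the ±t band and map to state 0, the resulting codes are more stable under noise than plain LBP.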

• Histogram of oriented gradients (HOG) [44]: HOG is one of the best descriptors for shape and edge description. The HOG technique can describe the face shape using the distribution of edge directions or light-intensity gradients. The technique proceeds by dividing the whole face image into cells (small regions or areas); a histogram of pixel edge directions or gradient directions is generated for each cell; and, finally, the histograms of all cells are combined to form the feature vector of the face image. The feature-vector computation by the HOG descriptor proceeds as follows [10,13,26,45]: firstly, divide the local image into regions called cells, and then calculate the amplitude of the first-order gradients of each cell in both the horizontal and vertical directions. The most common method is to apply a 1D mask, [−1 0 1]:

$$G_x(x, y) = I(x+1, y) - I(x-1, y), \tag{2}$$

$$G_y(x, y) = I(x, y+1) - I(x, y-1), \tag{3}$$

where *I*(*x*, *y*) is the pixel value of the point (*x*, *y*) and *Gx*(*x*, *y*) and *Gy*(*x*, *y*) denote the horizontal gradient amplitude and the vertical gradient amplitude, respectively. The magnitude of the gradient and the orientation of each pixel (*x*, *y*) are computed as follows:

$$G(x, y) = \sqrt{G_x(x, y)^2 + G_y(x, y)^2}, \tag{4}$$

$$\theta(x, y) = \tan^{-1}\left(\frac{G_y(x, y)}{G_x(x, y)}\right). \tag{5}$$

The magnitude of the gradient and the orientation of each pixel in the cell are voted into nine bins with tri-linear interpolation. A histogram is thus generated for each cell from the pixel gradient directions, and, finally, the histograms of all cells are combined to extract the feature vector of the face image. Karaaba et al. [44] proposed a combination of different histograms of oriented gradients (HOG) to build a robust face recognition system. This technique is named "multi-HOG".

The authors create a vector of distances between the target and reference face images for identification. Arigbabu et al. [46] proposed a novel face recognition system based on the Laplacian filter and the pyramid histogram of gradients (PHOG) descriptor. In addition, a support vector machine (SVM) with different kernel functions is used to investigate the face recognition problem.
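The pipeline of Equations (2)–(5) followed by per-cell voting can be sketched as below. This is a simplified illustration under stated assumptions: unsigned orientations over [0, 180), magnitude-weighted votes without tri-linear interpolation, and no block normalization (a full HOG implementation would add both).

```python
import numpy as np

def hog_cell_histogram(cell, nbins=9):
    """Nine-bin orientation histogram of one cell, per Equations (2)-(5)."""
    cell = cell.astype(np.float64)
    gx = np.zeros_like(cell)
    gy = np.zeros_like(cell)
    gx[:, 1:-1] = cell[:, 2:] - cell[:, :-2]       # Eq. (2): [-1 0 1] mask
    gy[1:-1, :] = cell[2:, :] - cell[:-2, :]       # Eq. (3)
    mag = np.hypot(gx, gy)                         # Eq. (4): gradient magnitude
    ang = np.degrees(np.arctan2(gy, gx)) % 180.0   # Eq. (5), unsigned orientation
    # Each pixel votes for its orientation bin, weighted by its magnitude.
    hist, _ = np.histogram(ang, bins=nbins, range=(0, 180), weights=mag)
    return hist

def hog_descriptor(img, cell_size=8, nbins=9):
    """Concatenate the per-cell histograms into one feature vector."""
    h, w = img.shape
    feats = [hog_cell_histogram(img[y:y + cell_size, x:x + cell_size], nbins)
             for y in range(0, h - cell_size + 1, cell_size)
             for x in range(0, w - cell_size + 1, cell_size)]
    return np.concatenate(feats)
```

On a 16 × 16 image with 8 × 8 cells this yields 4 cells × 9 bins = 36 features; a purely horizontal intensity ramp puts all of its mass in the 0° bin.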

• Correlation filters: Face recognition systems based on correlation filters (CFs) have given good results in terms of robustness, location accuracy, efficiency, and discrimination. In the field of facial recognition, correlation techniques have attracted great interest since the first use of an optical correlator [47]. These techniques provide the following advantages: high discrimination ability, good noise robustness, shift invariance, and inherent parallelism. On the basis of these advantages, many optoelectronic hybrid correlation-filter solutions have been introduced, such as the joint transform correlator (JTC) [48] and the VanderLugt correlator (VLC) [47]. The purpose of these techniques is to calculate the degree of similarity between target and reference images; the decision is taken by detecting a correlation peak. Both techniques (VLC and JTC) are based on the "4*f*" optical configuration [37]. This configuration is created by two convergent lenses (Figure 4). The face image *F* is Fourier-transformed (via the fast Fourier transform, FFT) by the first lens into the Fourier plane *SF*. In this Fourier plane, a specific filter P is applied (for example, the phase-only filter (POF) [2]) using optoelectronic interfaces. Finally, the inverse FFT (IFFT) performed by the second lens in the output plane yields the filtered face image (the correlation plane).

For example, the VLC technique is realized by two cascaded Fourier-transform structures implemented with two lenses [4], as presented in Figure 5. It proceeds as follows: firstly, a 2D FFT is applied to the target image to obtain its spectrum *S*. Then, in the Fourier plane, this spectrum is multiplied by the filter obtained from the 2D FFT of a reference image. Finally, an inverse FFT of this product yields the correlation result, recorded in the correlation plane.

The correlation result, described by the peak intensity, is used to determine the similarity degree between the target and reference images.

$$C = \mathrm{FFT}^{-1}\{S^{*} \circ \mathrm{POF}\}, \tag{6}$$

where *FFT*<sup>−1</sup> stands for the inverse fast Fourier transform (IFFT) operation, \* represents the conjugate operation, and ◦ denotes element-wise array multiplication. To enhance the matching process, Horner and Gianino [49] proposed the phase-only filter (POF), which produces sharp correlation peaks with enhanced discrimination capability. The POF is an optimized filter defined as follows:

$$H_{\mathrm{POF}}(u, v) = \frac{S^{*}(u, v)}{\left| S(u, v) \right|}, \tag{7}$$

where *S*\*(*u*, *v*) is the complex conjugate of the 2D FFT of the reference image. To evaluate the decision, the peak-to-correlation energy (PCE) is defined as the energy of the correlation-peak intensity normalized by the overall energy of the correlation plane:

$$\mathrm{PCE} = \frac{\sum_{i,j}^{N} E_{\mathrm{peak}}(i, j)}{\sum_{i,j}^{M} E_{\mathrm{correlation\text{-}plane}}(i, j)}, \tag{8}$$

where *i*, *j* are the coefficient coordinates; *N* and *M* are the size of the correlation-peak spot and the size of the correlation plane, respectively; *E*<sub>peak</sub> is the energy of the correlation peak; and *E*<sub>correlation-plane</sub> is the overall energy of the correlation plane. Correlation techniques are widely applied in recognition and identification applications [4,37,50–53]. For example, in the work of [4], the authors demonstrated the efficiency of the VLC technique based on the "4f" configuration for identification using an Nvidia GeForce 8400 GS GPU; the POF filter is used for the decision. Another important work in this area is presented by Leonard et al. [50], who demonstrated the good performance and simplicity of correlation filters for face recognition. In addition, several specific filters, such as POF, BPOF, Ad, and IF, are compared to select the best filter based on its sensitivity to rotation, scale, and noise. Napoléon et al. [3] introduced a novel system for identification and verification based on an optimized 3D model built under different illumination conditions, which allows reconstructing faces in different poses. In particular, to deform the synthetic model, an active shape model detecting a set of key points on the face is used (Figure 6). The VanderLugt correlator performs the identification, and the LBP descriptor is used to optimize the performance of the correlation technique under different illumination conditions. The experiments are performed on the Pointing Head Pose Image Database (PHPID) with elevations ranging from −30° to +30°.
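A digital emulation of the VLC pipeline with a POF filter (Equations (6)–(8)) can be sketched as below. This is an illustrative NumPy sketch, not the optical or GPU implementation of the cited works; the `peak_radius` window size in the PCE computation is an assumed parameter.

```python
import numpy as np

def pof_filter(ref):
    """Phase-only filter built from a reference image, Equation (7)."""
    S = np.fft.fft2(ref)
    # Guard against division by zero for spectral coefficients that vanish.
    return np.conj(S) / np.maximum(np.abs(S), 1e-12)

def correlation_plane(target, ref):
    """VLC-style correlation: FFT of the target, multiplication by the POF
    filter in the Fourier plane, then an inverse FFT (Equation (6))."""
    S = np.fft.fft2(target)
    return np.abs(np.fft.ifft2(S * pof_filter(ref)))

def pce(plane, peak_radius=2):
    """Peak-to-correlation energy, Equation (8): energy in a small window
    around the correlation peak over the total energy of the plane."""
    energy = plane ** 2
    py, px = np.unravel_index(np.argmax(energy), energy.shape)
    y0, y1 = max(py - peak_radius, 0), py + peak_radius + 1
    x0, x1 = max(px - peak_radius, 0), px + peak_radius + 1
    return energy[y0:y1, x0:x1].sum() / energy.sum()
```

When target and reference are the same image, the spectrum product reduces to |S|, so the correlation plane shows a sharp peak at the origin and the PCE is high; an unrelated target spreads the energy and lowers the PCE.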


**Figure 3.** The local binary pattern (LBP) descriptor [19].

**Figure 4.** The "4f" optical configuration [37].

**Figure 5.** Flowchart of the VanderLugt correlator (VLC) technique [4]. FFT, fast Fourier transform; POF, phase-only filter.

**Figure 6.** (**a**) Creation of the 3D face of a person, (**b**) results of the detection of 29 landmarks of a face using the active shape model, (**c**) results of the detection of 26 landmarks of a face [3].
