*2.3. Function Features-Based Online Signature Verification*

One of the advantages of online signature verification is that the signature is captured by specialized sensor-based devices, so dynamic information can be recorded and used for verification, which makes verification more accurate and reliable. In function features-based methods, a set of function features, such as position, pressure, velocity, and acceleration, is first captured. The features of the test signature are then matched against those of the reference, and a decision is made.

## 2.3.1. Function Features Extraction

Usually, many features can be obtained directly from specialized electronic devices. The horizontal and vertical position, pressure, and timestamp of each sample point are the basic measurements. Let *x*, *y*, *p*, *t* be these basic measurements, *n* = 1, 2, 3, ..., *N* be the discrete time index of the temporal functions, and *N* be the time duration of a signature in sampling units [14]. Based on them, various features can be derived, from which 20 frequently used function features are selected. The features are grouped according to their properties: position-related, pressure-related, velocity-related, acceleration-related, and angle-related. They are listed in Table 1.


**Table 1.** Function features extracted for online signature verification.
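As an illustration of how function features can be derived from the basic measurements, the following minimal sketch computes velocity, acceleration, and angle sequences from *x*, *y*, and *t* using first-order differences. The function name `derive_features`, the choice of forward differences, and the returned keys are assumptions made for this example; the exact definitions of the 20 features are those of Table 1 and may differ.

```python
import math


def derive_features(x, y, t):
    """Derive a few function features from raw samples (illustrative only).

    x, y, t are equal-length sequences of positions and timestamps.
    Timestamps are assumed to be strictly increasing.
    """
    n = len(x)
    if not (n == len(y) == len(t)) or n < 2:
        raise ValueError("x, y, t must have the same length (at least 2)")

    def diff(seq):
        # Forward difference divided by the time step; the last value is
        # repeated so that the derived sequence keeps length N.
        d = [(seq[i + 1] - seq[i]) / (t[i + 1] - t[i]) for i in range(n - 1)]
        d.append(d[-1])
        return d

    vx = diff(x)                                         # horizontal velocity
    vy = diff(y)                                         # vertical velocity
    v = [math.hypot(a, b) for a, b in zip(vx, vy)]       # total velocity
    theta = [math.atan2(b, a) for a, b in zip(vx, vy)]   # path-tangent angle
    acc = diff(v)                                        # total acceleration
    return {"vx": vx, "vy": vy, "v": v, "theta": theta, "acc": acc}


features = derive_features([0, 3, 6], [0, 4, 8], [0, 1, 2])
print(features["v"])  # total velocity at each sample point
```

Pressure-related features could be handled the same way by passing the pressure sequence through the same difference operator.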

## 2.3.2. Matching Based on Shape Context-Dynamic Time Warping (SC-DTW)

Feature matching is critical for function features-based verification. In recent years, DTW has been widely applied as the matching technique in online signature verification. The DTW method locally compresses or expands the time axes of two temporal functions to align them.

Consider two time series *T* = (*t*<sub>1</sub>, *t*<sub>2</sub>, ..., *t*<sub>*N*</sub>) and *R* = (*r*<sub>1</sub>, *r*<sub>2</sub>, ..., *r*<sub>*M*</sub>) of lengths *N* and *M*, respectively. The similarity between the *n*th point of *T* and the *m*th point of *R* is calculated according to a defined similarity rule. All the similarity values constitute a DTW cost matrix, denoted by *d*(*m*, *n*) and defined as:

$$d(m, n) = \left\| r_m - t_n \right\| \tag{6}$$

The overall distance is calculated by the following recursion:

$$D(m,n) = d(m,n) + \min\left\{ \begin{array}{l} D(m,n-1) + C \\ D(m-1,n-1) \\ D(m-1,n) + C \end{array} \right.\tag{7}$$

where *D*(*m*, *n*) is the cumulative distance up to the current element and *C* is the gap cost. To compensate for signatures of different lengths, the distance is normalized by Equation (8).

$$d = \frac{D}{\sqrt{M \times N}}\tag{8}$$
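The recursion in Equations (6)–(8) can be sketched in a few lines of Python for scalar-valued time series. This is a minimal illustration, not the implementation used in the experiments: the function name `dtw_distance` is hypothetical, the absolute difference stands in for the norm in Equation (6), and the boundary condition (cumulative cost 0 before the first pair, infinity elsewhere on the border) is an assumption, since it is not stated in the text.

```python
import math


def dtw_distance(r, t, gap_cost=0.0):
    """Normalized DTW distance between scalar sequences r (length M) and t (length N)."""
    M, N = len(r), len(t)
    # D has one extra row and column so that the recursion needs no special cases.
    D = [[math.inf] * (N + 1) for _ in range(M + 1)]
    D[0][0] = 0.0  # assumed boundary condition
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            d = abs(r[m - 1] - t[n - 1])      # local cost, Equation (6)
            D[m][n] = d + min(                # cumulative cost, Equation (7)
                D[m][n - 1] + gap_cost,       # horizontal step, penalized by C
                D[m - 1][n - 1],              # diagonal step, no penalty
                D[m - 1][n] + gap_cost,       # vertical step, penalized by C
            )
    return D[M][N] / math.sqrt(M * N)         # length normalization, Equation (8)


print(dtw_distance([1, 2, 3], [1, 2, 2, 3], gap_cost=1.0))
```

In this example the two sequences differ only by one repeated sample, so every local cost on the optimal path is zero and the result reflects a single gap penalty divided by the normalization factor.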

DTW is an effective method for finding the alignment between two signatures of different lengths. However, a time series has both a numerical nature and a shape nature. DTW warps time series according to the similarity of their numerical values, as in Equation (6), but ignores their shape properties, which can sometimes lead to abnormal alignments. Zhang and Tang [38] proposed a variant of DTW named SC-DTW, which uses shape context in place of the raw observed values used in conventional DTW and has shown advantages in time series data mining. In this paper, we adopt SC-DTW for function features-based verification to further improve accuracy.

Specifically, the alignment of two points is decided by the matching cost of their shape contexts, that is,

$$d(n,m) = C_{nm} \tag{9}$$

where *Cnm* is defined in Equation (5).

In this setting, a function feature is treated both as a 1-D array and as a 2-D shape, so the problem of measuring the similarity of two function features becomes that of measuring how similar the two shapes are. Figure 4 shows the process of SC-DTW. Figure 4a,b show the time series of the 11th function feature *v* listed in Table 1 for two signatures from the same user. Figure 4c,d show the corresponding shape context histograms. Similar shape contexts indicate that the sample points in the two time series are well matched. Note that shape contexts are used only to find the alignment between the two time series. The cumulative distance between them is still computed from the original cost matrix, for the convenience of the subsequent classification.

**Figure 4.** SC-DTW. (**a**,**b**) Time series of total velocity *v* from two signatures and a pair of corresponding points found by shape context. (**c**,**d**) show the shape context histograms of the points marked in (**a**,**b**), respectively.

Given an *N*(*k*) × *D* feature set *X*(*k*) extracted from a reference signature and an *N*(*q*) × *D* feature set *X*(*q*) extracted from a signature claimed to belong to the same user, a *D*-dimensional vector *z*(*k*,*q*), called the 'similarity feature vector', can be derived by calculating the similarity between each pair of corresponding function features using the SC-DTW described above.
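The two-stage idea of SC-DTW, in which the warping path is found from the shape-context cost while the reported distance is read from the original cost matrix, can be sketched as follows. This is a minimal illustration under several assumptions: the shape-context cost matrix of Equation (5) is supplied precomputed as `shape_cost` (it is not computed here), the function names are hypothetical, the absolute difference is used as the original local cost, and the summed cost along the path is normalized as in Equation (8).

```python
import math


def warping_path(cost, gap_cost=0.0):
    """Return the optimal warping path through an M x N cost matrix."""
    M, N = len(cost), len(cost[0])
    D = [[math.inf] * (N + 1) for _ in range(M + 1)]
    D[0][0] = 0.0  # assumed boundary condition
    prev = {}
    for m in range(1, M + 1):
        for n in range(1, N + 1):
            candidates = [
                (D[m - 1][n - 1], (m - 1, n - 1)),      # diagonal step
                (D[m][n - 1] + gap_cost, (m, n - 1)),   # horizontal step
                (D[m - 1][n] + gap_cost, (m - 1, n)),   # vertical step
            ]
            best, origin = min(candidates, key=lambda c: c[0])
            D[m][n] = cost[m - 1][n - 1] + best
            prev[(m, n)] = origin
    # Backtrack from the last cell to the start, collecting 0-based index pairs.
    path, cell = [], (M, N)
    while cell != (0, 0):
        path.append((cell[0] - 1, cell[1] - 1))
        cell = prev[cell]
    path.reverse()
    return path


def sc_dtw_distance(r, t, shape_cost, gap_cost=0.0):
    """Align with shape_cost, then measure the distance on the raw values."""
    path = warping_path(shape_cost, gap_cost)
    total = sum(abs(r[m] - t[n]) for m, n in path)  # original local cost along the path
    return total / math.sqrt(len(r) * len(t)), path


# Toy shape-context costs that favor matching point i with point i.
shape_cost = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
distance, path = sc_dtw_distance([0, 0, 5], [0, 5, 5], shape_cost)
print(path, distance)
```

Applying `sc_dtw_distance` to each of the *D* function features of a reference and a test signature would yield one entry of the similarity feature vector per feature.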

## 2.3.3. Classification Based on Interval-Valued Symbolic Representation

The concept of symbolic data analysis has been applied in document image analysis and cluster analysis. Interval-valued and histogram-valued symbolic representations can capture the variability and distribution of feature values. Guru and Prakash [39] extracted global features of signatures to form interval-valued feature vectors and proposed a method for verification and recognition based on this symbolic representation. Pal and Alaei [5] also proposed an interval-valued symbolic representation-based method for offline verification. In this paper, we first use the interval-valued symbolic representation to model the similarity features derived from SC-DTW and then build a classifier for verification.

Let [*ref*<sub>1</sub>, *ref*<sub>2</sub>, ··· , *ref*<sub>*n*</sub>] be a set of *n* enrolled reference signatures of a user. Denote by *D* the similarity feature vector of a user, where *D*<sup>*r*</sup><sub>*ij*</sub> is the SC-DTW distance of feature *r* between signatures *ref*<sub>*i*</sub> and *ref*<sub>*j*</sub>, as shown in Table 2. Each user has such a feature vector. For the *k*th feature, we compute the mean *μ*<sub>*k*</sub> and standard deviation *σ*<sub>*k*</sub>, and the lower and upper bounds of the interval are computed as in Equation (10).

$$\begin{aligned} sf_k &= ([f_k^-, f_k^+], \mu_k, \sigma_k) \\ f_k^- &= \mu_k - \alpha\sigma_k \\ f_k^+ &= \mu_k + \alpha\sigma_k \\ \mu_k &= \operatorname{mean}(f_k) \\ \sigma_k &= \operatorname{std}(f_k) \end{aligned} \tag{10}$$

where *sf*<sub>*k*</sub> is the symbolic representation of the *k*th feature of a user, consisting of an interval value and two continuous values, and *α* is a scalar that controls the upper and lower limits of each feature. The symbolic feature vectors are computed for all users and stored in the template base for future verification.
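Equation (10) can be illustrated with a short sketch that turns the SC-DTW distances of one feature into its symbolic representation. The function names `symbolic_feature` and `symbolic_template` are hypothetical, and the use of the population standard deviation is an assumption, since the text does not specify which estimator is used.

```python
import statistics


def symbolic_feature(values, alpha=1.0):
    """Return ((f_minus, f_plus), mu, sigma) for one feature, as in Equation (10)."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # assumption: population standard deviation
    return (mu - alpha * sigma, mu + alpha * sigma), mu, sigma


def symbolic_template(distances_per_feature, alpha=1.0):
    """Build a user's template: one symbolic representation per feature."""
    return [symbolic_feature(values, alpha) for values in distances_per_feature]


# Hypothetical pairwise reference distances for a single feature.
interval, mu, sigma = symbolic_feature([2, 4, 4, 4, 5, 5, 7, 9], alpha=2.0)
print(interval, mu, sigma)
```

A larger *α* widens every interval, which makes the resulting classifier more tolerant of variation among genuine signatures.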


**Table 2.** Similarity feature vector of each individual.

For the signature verification problem, the test signature is compared with all the reference signatures belonging to the claimed ID. Let *F*<sub>*t*</sub> = [*f*<sub>*t*1</sub>, *f*<sub>*t*2</sub>, ··· , *f*<sub>*tD*</sub>] denote a *D*-dimensional feature vector representing the average SC-DTW distance to the reference signatures. In addition, denote by *sf* = [[*f*<sup>−</sup><sub>*r*1</sub>, *f*<sup>+</sup><sub>*r*1</sub>], [*f*<sup>−</sup><sub>*r*2</sub>, *f*<sup>+</sup><sub>*r*2</sub>], ··· , [*f*<sup>−</sup><sub>*rD*</sub>, *f*<sup>+</sup><sub>*rD*</sub>]] the reference signatures of the claimed identity, described by an interval-valued symbolic feature vector. Each feature value of the test signature is compared with the corresponding interval in *sf* to examine where it lies relative to the interval. The feature value represents the dissimilarity of two signatures: the more similar the two signatures, the smaller the value and the closer it is to 0. The score accumulated over all features of a test signature determines how similar the test signature is to genuine ones, as shown in Equations (11) and (12). Define *A* as the measure of the degree of authenticity:

$$A = \sum_{i=1}^{D} C(f_{ti}, [f_{ri}^{-}, f_{ri}^{+}]) \tag{11}$$

where

$$C(f_{ti}, [f_{ri}^{-}, f_{ri}^{+}]) = \begin{cases} 2 & \text{if } 0 \le f_{ti} \le f_{ri}^{-} \\ 1 & \text{if } f_{ri}^{-} < f_{ti} \le f_{ri}^{+} \\ 0 & \text{otherwise} \end{cases} \tag{12}$$

If the acceptance count *A* is greater than a threshold *th*, the test signature is classified as a genuine signature of the claimed user. In the user-dependent scenario, the threshold is computed per person from that person's training samples: a value of *A* is obtained for each training signature, and these values are averaged to give *A*<sub>*m*</sub>. The threshold *th* is equal to *β* × *A*<sub>*m*</sub>.
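The scoring rule of Equations (11) and (12) and the user-dependent threshold can be sketched as follows. The function names and the example numbers are hypothetical and chosen only for illustration; the intervals would in practice come from the stored symbolic template of the claimed user.

```python
def interval_score(f_t, lower, upper):
    """Score one test feature against its reference interval, Equation (12)."""
    if 0 <= f_t <= lower:
        return 2  # at or below the lower bound: very close to the references
    if lower < f_t <= upper:
        return 1  # inside the interval
    return 0      # above the upper bound (or negative)


def authenticity(test_vector, intervals):
    """Acceptance count A summed over all D features, Equation (11)."""
    return sum(interval_score(f, lo, hi) for f, (lo, hi) in zip(test_vector, intervals))


def is_genuine(test_vector, intervals, a_mean, beta):
    """Accept if A exceeds the user-dependent threshold th = beta * A_m."""
    return authenticity(test_vector, intervals) > beta * a_mean


intervals = [(1.0, 3.0), (2.0, 4.0), (0.5, 1.5)]  # hypothetical reference intervals
test_vector = [0.5, 3.0, 2.0]                     # hypothetical average SC-DTW distances
print(authenticity(test_vector, intervals))
print(is_genuine(test_vector, intervals, a_mean=4.0, beta=0.7))
```

Here `a_mean` plays the role of *A*<sub>*m*</sub>, obtained by averaging the acceptance counts of the user's own training signatures, so that `beta` directly trades false acceptances against false rejections.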
