*3.1. Database and Evaluation Measurement*

To evaluate the effectiveness of the proposed method, we run a set of experiments on the public database SVC 2004 Task 2. It contains 40 users, each with 20 genuine signatures and 20 skilled forgeries. The genuine signatures were collected in two sessions spaced at least one week apart. The skilled forgeries were contributed by forgers who could replay the writing sequence of a signature on the computer screen and practice the forgery a few times until they were confident enough to proceed to the actual data collection. The signatures are mostly in either English or Chinese [40]. In our experiments, for each user we randomly select five genuine signatures for enrolment as reference signatures; these may come from either the first or the second session. The remaining 15 genuine signatures (those not selected for enrolment) and the 20 skilled forgeries are used to test the performance of our proposal. For the random-forgeries scenario, for each user we randomly select 20 signatures from the other users. The trial is conducted ten times for each user.
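To make the protocol concrete, the sketch below implements one possible version of a single trial for one user. The function `split_user_trial`, the fixed random seed, and the placeholder signature IDs are our illustrative assumptions, not part of the SVC 2004 distribution.

```python
import random

def split_user_trial(genuine, skilled, other_users_pool, rng):
    """One trial of the evaluation protocol for a single user."""
    # Randomly pick 5 of the user's 20 genuine signatures as enrolment references.
    ref_idx = set(rng.sample(range(len(genuine)), 5))
    references = [genuine[i] for i in ref_idx]
    # The remaining 15 genuine signatures become positive test samples.
    test_genuine = [genuine[i] for i in range(len(genuine)) if i not in ref_idx]
    # All 20 skilled forgeries serve as negative test samples.
    test_skilled = list(skilled)
    # Random-forgeries scenario: 20 signatures drawn from the other users.
    test_random = rng.sample(other_users_pool, 20)
    return references, test_genuine, test_skilled, test_random

rng = random.Random(0)
# Hypothetical placeholder data: string IDs stand in for signature records.
genuine = [f"g{i}" for i in range(20)]
skilled = [f"f{i}" for i in range(20)]
pool = [f"u{u}_g{i}" for u in range(39) for i in range(20)]
# The trial is repeated ten times per user with independent random draws.
trials = [split_user_trial(genuine, skilled, pool, rng) for _ in range(10)]
```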

We evaluate the performance primarily using the Equal Error Rate (EER), the error rate at the operating point where the false acceptance rate (FAR) equals the false rejection rate (FRR). We consider two ways of computing the EER: EER-*commonThreshold* and EER-*userThreshold*. EER-*commonThreshold* is calculated with a global decision threshold: the feature values from all training signatures of all users are used to find the single threshold that minimises the EER, and this threshold is shared by all users. EER-*userThreshold* uses a user-specific decision threshold, derived from the feature values of each user's own training samples; for each user, the best threshold is the one that yields his/her lowest EER. Since the SVC 2004 database contains multiple users, we report the average EER across all users as the overall performance of the method in the user-threshold setting.
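As an illustration of the two variants, the sketch below estimates the EER from scalar match scores, assuming higher scores indicate a more genuine-looking signature. The function `eer`, the names `scores` and `users`, and the synthetic score distributions are hypothetical stand-ins; the paper's actual threshold search operates on feature values from the training signatures.

```python
import numpy as np

def eer(genuine_scores, forgery_scores):
    """Find the operating point where FAR and FRR are (approximately) equal."""
    thresholds = np.sort(np.concatenate([genuine_scores, forgery_scores]))
    best_gap, best_eer = 2.0, 1.0
    for t in thresholds:
        frr = np.mean(genuine_scores < t)   # genuine samples rejected
        far = np.mean(forgery_scores >= t)  # forgeries accepted
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2
    return best_eer

rng = np.random.default_rng(0)
users = range(40)
# Hypothetical match scores per user; genuine scores are higher on average.
scores = {u: {"genuine": rng.normal(0.7, 0.1, 15),
              "forgery": rng.normal(0.4, 0.1, 20)} for u in users}

# EER-commonThreshold: pool all users' scores, search one global threshold.
eer_common = eer(np.concatenate([scores[u]["genuine"] for u in users]),
                 np.concatenate([scores[u]["forgery"] for u in users]))

# EER-userThreshold: per-user threshold search, then average across users.
eer_user = np.mean([eer(scores[u]["genuine"], scores[u]["forgery"])
                    for u in users])
print(f"common-threshold EER: {eer_common:.3f}, user-threshold EER: {eer_user:.3f}")
```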
