1. Introduction
Owing to the need for effective data representation, Dictionary Learning (DL) has attracted considerable interest in the past decade and achieved much success in various applications, such as image denoising [1,2], person re-identification [3,4] and visual recognition [5,6,7,8]. Generally speaking, DL methods are developed based on a basic hypothesis, namely that a test signal can be well approximated by a linear combination of some atoms in a dictionary. Thus the dictionary usually plays an important role in the success of these applications. Traditionally, DL methods can be roughly divided into two categories: the unsupervised DL methods and the supervised DL methods [9,10].
In the unsupervised DL methods, a dictionary is optimized to reconstruct all the training samples without any label assignment; hence, there is no class information in the learned dictionary. By further integrating the label information into dictionary learning, the supervised DL methods can achieve better classification performance than the unsupervised ones for image classification. Supervised dictionary learning encodes the input signals using the learned dictionary, then utilizes the representation coefficients or the residuals for classification. Thus the discriminative ability of the dictionary and the representative ability of the coding coefficients play the key roles in this kind of approach. According to the type of dictionary, the supervised DL methods can be further divided into three categories: the class-shared DL methods, the class-specific DL methods and the hybrid DL methods [11].
The class-shared DL methods generally force the coding coefficients to be discriminative via learning a single dictionary shared by all the classes. Based on the K-SVD algorithm, Zhang and Li [5] proposed the Discriminative K-SVD (D-KSVD) method, which constructs a classification error term for learning a linear classifier. Jiang et al. [6] further proposed Label Consistent K-SVD (LC-KSVD), which encourages the coding coefficients from the same class to be as similar as possible. Considering the characteristics of atoms, Song et al. [12] designed an indicator function to regularize the class-shared dictionary and improve the discriminative ability of the coding coefficients. In general, the test stage of the class-shared DL methods is computationally very efficient, but it is hard to make the coefficients discriminative enough for better classification performance, as a single class-shared dictionary is not sufficient for fitting complex data.
In the class-specific dictionary learning, each sub-dictionary is assigned to a single class and the sub-dictionaries of different classes are encouraged to be as independent as possible. As a representative class-specific DL method, Fisher Discrimination Dictionary Learning (FDDL) [13] employs the Fisher discrimination criterion on the coding coefficients, then utilizes the representation residual of each class to establish the discriminative term. Using an incoherence constraint, Ramirez et al. [14] proposed a structured dictionary learning scheme to promote the discriminative ability of the class-specific sub-dictionaries. Akhtar et al. [15] developed a Joint discriminative Bayesian Dictionary and Classifier learning (JBDC) model that associates the dictionary atoms with the class labels via Bernoulli distributions. The class-specific DL methods usually associate a dictionary atom directly with a single class; hence, the reconstruction error with respect to each class can be used for classification. However, the test stage of this category often requires computing the coefficients of the test data over many sub-dictionaries.
In the hybrid dictionary learning, a dictionary is designed to have a set of class-shared atoms in addition to the class-specific sub-dictionaries. Wang and Kong [16] proposed a hybrid dictionary dubbed DL-COPAR to explicitly separate the common and particular features of the data, which also encourages the class-specific sub-dictionaries to be incoherent. Vu et al. [17] developed a Low-Rank Shared Dictionary Learning (LRSDL) method to preserve the common features of samples. Gao et al. [18] developed a Category-specific and Shared Dictionary Learning (CSDL) approach for fine-grained image classification. Wang et al. [19] designed a structured dictionary consisting of label-particular atoms corresponding to individual classes and shared atoms commonly used by all the classes, and introduced a Cross-Label Suppression based Discriminative Dictionary Learning (CLSDDL) method to generate approximately sparse coding vectors for classification. To some extent, the hybrid dictionary is very effective at preserving the complex structure of visual data. However, it is nontrivial to design the class-specific and shared dictionaries with the proper numbers of atoms, which often has a severe effect on classification performance.
In addition to utilizing the class label information, more and more supervised DL approaches have been proposed to incorporate the locality information of the data into the learned dictionary. By calculating the distances between the bases (atoms) and the training samples, Wang et al. [20] developed a Locality-constrained Linear Coding (LLC) model that selects the k-nearest-neighbor bases for coding and sets the coding coefficients of the other atoms to zero. Wei et al. [21] proposed locality-sensitive dictionary learning to enhance the discriminative power of sparse coding. Song et al. [22] integrated locality constraints into a multi-layer discriminative dictionary to avoid over-fitting. By coupling the locality reconstruction and the label reconstruction, the LCLE-DL method [7] ensures that the locality-based and label-based coding coefficients are as close to each other as possible. It is noted that the locality constraint in LCLE-DL may cause the dictionary atoms from different classes to be similar, which weakens the discriminative ability of the learned dictionary.
It is observed that real-world object categories not only exhibit marked differences, but are also strongly correlated in terms of visual properties; e.g., faces of different persons often share similar illumination and pose variations, and objects in the Caltech 101 dataset [23] have correlated backgrounds. These correlations are not very helpful for distinguishing the different categories, but without them the data with common features cannot be well represented. Thus, a dictionary learning approach should learn the distinctive features with a class-specific dictionary, and simultaneously exploit the common features of the correlated classes by learning a commonality dictionary. To this end, we propose the locality preserving and label-aware constraint-based hybrid dictionary learning (LPLC-HDL) method for image classification, which is composed of a label-aware constraint, a group regularization and a locality constraint. The main contributions are summarized as follows.
- [1].
The proposed LPLC-HDL method learns the hybrid dictionary by fully exploiting the locality information and the label information of the data. In this way, the learned hybrid dictionary can not only preserve the complex structural information of the data, but also have strong discriminative ability for image classification.
- [2].
In LPLC-HDL, a locality constraint is constructed to encourage the samples from different classes with similar features to have similar commonality representations; then, a label-aware constraint is integrated to make the class-specific dictionary sparsely represent the samples from the same class, so that a robust particularity–commonality representation can be obtained by the proposed LPLC-HDL.
- [3].
In a departure from the competing methods which impose the sparsity-inducing ℓ0-norm or ℓ1-norm on the coefficients, LPLC-HDL consists of ℓ2-norm constraints that can be calculated efficiently. The objective function is solved elegantly by employing an alternating optimization technique.
The rest of this paper is outlined as follows. Section 2 reviews the work related to our LPLC-HDL method. Section 3 presents the details of LPLC-HDL, and an effective optimization is introduced in Section 4. To verify the efficiency of our method for image classification, experiments are conducted in Section 5. Finally, the conclusion is summarized in Section 6.
2. Notation and Background
In this section, we first provide the notation used in this paper, then review the LCLE-DL algorithm and the objective function of hybrid dictionary learning (DL), which can be taken as the theoretical background of our LPLC-HDL method.
2.1. Notation
Let X be a set of N training samples of dimension m with class labels, where C is the number of classes and the samples of the ith class form a sub-matrix of X. The hybrid dictionary D learned from the training samples X is composed of a shared dictionary and a class-specific dictionary, the latter being the concatenation of the C class sub-dictionaries; the atom numbers of the shared dictionary, of the whole class-specific dictionary and of each class sub-dictionary are specified separately. Let Z be the coding coefficients of the training samples X over the hybrid dictionary D; the corresponding sub-matrices of Z then represent the coding coefficients over the shared dictionary and over each class sub-dictionary, respectively.
According to [24], a row vector of the coefficient matrix Z can be defined as the profile of the corresponding dictionary atom. Therefore, the row of Z associated with an atom is the profile of that atom over all the training samples, and its sub-vector restricted to the training samples of the cth class is the sub-profile for that class.
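To make the profile notation concrete, here is a minimal numpy sketch (the array `Z` and the index map `class_columns` are illustrative names, not from the original):

```python
import numpy as np

# Z: K x N coefficient matrix (K atoms as rows, N training samples as columns).
K, N = 8, 12
Z = np.random.randn(K, N)

# Column indices of the training samples belonging to each class (illustrative split).
class_columns = {1: np.arange(0, 6), 2: np.arange(6, 12)}

j, c = 3, 2
profile_j = Z[j, :]                      # profile of the j-th atom over all samples
sub_profile_jc = Z[j, class_columns[c]]  # sub-profile of atom j for the c-th class
```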
2.2. The LCLE-DL Algorithm
To improve the classification performance, the LCLE-DL algorithm [7] takes both the locality and label information of dictionary atoms into account in the learning process. This algorithm first constructs the locality constraint to ensure that similar profiles correspond to similar atoms, then establishes the label embedding constraint to encourage the atoms of the same class to have similar profiles. The objective function of LCLE-DL is defined as follows.
where the two coefficient matrices denote the locality-based and the label-based coding coefficients, respectively; L is the graph Laplacian matrix constructed from the similarity of the atoms in the dictionary D; and the scaled label matrix is constructed using the label matrix of the dictionary. The first term combined with the second term encodes the reconstruction under the locality constraint; the third term combined with the fourth term encodes the reconstruction under the label embedding; and the last term transfers the label constraint to the locality constraint. Two regularization parameters weight these constraints, and the unit-norm constraint on the atoms avoids the scaling issue.
The LCLE-DL algorithm first exploits the K-SVD algorithm to learn a sub-dictionary for each class from the corresponding training samples. Similar to the label matrix Y of the training samples, the label matrix of the dictionary can be obtained from the class memberships of the atoms. A weighted label matrix is then constructed from it, and the label embedding of the atoms is defined in terms of the scaled label matrix of the dictionary; these terms make the coding coefficients have a block-diagonal structure with strong discriminative information.
The learned dictionary inherits the manifold structure of the data via the derived graph Laplacian matrix L, and the optimal representation of the samples can be obtained with the label embedding of the dictionary atoms. By combining the double reconstructions, LCLE-DL ensures that the label-based and the locality-based coding coefficients are as close to each other as possible. However, the locality constraint in the LCLE-DL algorithm is imposed on the class-specific dictionary, which may cause the dictionary atoms from different classes to be similar; thus the discriminative ability of the dictionary is weakened.
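The displayed objective of LCLE-DL is not reproduced above; the following is only a plausible LaTeX sketch consistent with the definitions in this subsection, where A and S denote the locality-based and label-based coding coefficients, U the scaled label matrix, L the atom-similarity graph Laplacian, and α, β the two regularization parameters (the exact grouping of terms in the original formulation may differ):

```latex
\min_{D, A, S}\;
  \|X - DA\|_F^2 + \alpha\,\mathrm{tr}\!\left(A^{\top} L A\right)
+ \|X - DS\|_F^2 + \alpha\,\|S - U\|_F^2
+ \beta\,\|A - S\|_F^2
\quad \text{s.t.}\; \|d_j\|_2 = 1 \;\; \forall j .
```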
2.3. The Objective Function of Hybrid DL
In recent years, hybrid DL [16,19,25,26] has been attracting more and more attention in classification problems. The hybrid dictionary has been shown to perform better than the other types of dictionaries, as it can preserve both the class-specific and the common information of the data. To learn such a dictionary, we can define the objective function of hybrid DL as follows.
where the hybrid dictionary consists of a dictionary shared by all the classes and the C class-specific sub-dictionaries; the coding coefficients of the samples from the ith class are split into the part over the shared dictionary and the part over the ith class sub-dictionary; and two regularization functions act on the hybrid dictionary and on the coding coefficients, respectively. The constraint on the coding coefficients typically adopts a sparsity-inducing norm for sparse coding, as in [26] or [25,27].
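The displayed equation is missing above; a generic form consistent with the surrounding definitions would be the following, where $D_0$ is the shared dictionary, $D_i$ the $i$th class sub-dictionary, $A_i$ and $B_i$ the coefficients of the $i$th class samples $X_i$ over $D_0$ and $D_i$, and $f$, $g$ the regularizers on the dictionary and the coefficients (all symbol names are ours, not necessarily those of the cited works):

```latex
\min_{D,\;\{A_i, B_i\}}\;
  \sum_{i=1}^{C} \big\| X_i - D_0 A_i - D_i B_i \big\|_F^2
  + \lambda_1\, f(D) + \lambda_2\, g\big(\{A_i, B_i\}\big)
\quad \text{s.t.}\; \|d_j\|_2 = 1 \;\; \forall j .
```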
3. The Proposed Method
By learning the shared dictionary, the previous hybrid DL algorithms can capture the common features of the data, but they do not consider the correlation among these features, which reduces the robustness of the learned dictionary. In this section, we first utilize the locality information of the atoms in the shared dictionary to construct a locality constraint, then impose it on the corresponding coding coefficients, so that the correlation of the common features is captured explicitly and the learned dictionary is robust for the commonality representation. Moreover, once the correlated common component is separated out, the classification of a query sample will be dominated by the class-specific sub-dictionary corresponding to the correct class, which minimizes the data fidelity.
To endow the class-specific dictionary with discriminative ability, we further introduce a label-aware constraint as well as a group regularization on the distinctive coding coefficients for the particularity representation. Since this constraint is integrated with the locality constraint to reconstruct the input data, they reinforce each other in the learning process, which results in a discriminative hybrid dictionary for image classification.
Accordingly, the objective function of the proposed LPLC-HDL method can be formulated as follows.
where the three regularization parameters adjust the weights of the label-aware constraint, the group regularization and the locality constraint, respectively. Here we set the Euclidean length of all the atoms in both the shared dictionary and the class-specific dictionary to 1, which avoids the scaling issue.
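Since the displayed objective is not reproduced above, the following is only a hedged sketch of its structure, assembled from the four ingredients named in this section: the data fidelity over the hybrid dictionary, the label-aware constraint (Section 3.2.1), the group regularization (Section 3.2.2) and the locality constraint (Section 3.1). The symbols $A_i$, $B_i$ (coefficients of the $i$th class samples over the shared dictionary $D_0$ and the class-specific dictionary $\bar{D} = [D_1, \ldots, D_C]$), $P_i$ (the class-$i$ selection matrix), $\tilde{L}_i$ (the within-class normalized Laplacian), $L$ (the atom-graph Laplacian), $A = [A_1, \ldots, A_C]$ and the weights $\lambda_1$, $\lambda_2$, $\lambda_3$ are our own notation:

```latex
\min_{D,\;\{A_i, B_i\}}\;
  \sum_{i=1}^{C} \big\| X_i - D_0 A_i - \bar{D} B_i \big\|_F^2
  + \lambda_1 \sum_{i=1}^{C} \big\| P_i B_i \big\|_F^2
  + \lambda_2 \sum_{i=1}^{C} \mathrm{tr}\!\big( B_i \tilde{L}_i B_i^{\top} \big)
  + \lambda_3\, \mathrm{tr}\!\big( A^{\top} L A \big)
\quad \text{s.t.}\; \|d_j\|_2 = 1 \;\; \forall j .
```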
3.1. The Locality Constraint for Commonality Representation
Locality information of the data has played an important role in many real applications. By incorporating locality information into dictionary learning, we can ensure that samples with common features tend to have similar coding coefficients [7]. Furthermore, the dictionary atoms, which are more robust to noise and outliers than the original samples, can be used to measure the similarity of the samples. Hence, we use the atoms of the shared dictionary to capture the correlation among the common features and construct a locality constraint.
Based on the shared dictionary, we can construct a nearest neighbor graph as follows, where the edge weight is computed with an exponential function whose width is controlled by a parameter, kNN(·) denotes the k nearest neighbors of an atom, and the resulting weight indicates the similarity between two atoms. For convenience of calculation, we keep the width parameter and the neighborhood size k fixed, as they have stable values in the experiments.
Once the similarity graph is calculated, we construct the graph Laplacian matrix L as follows. Since L is constructed based on the shared dictionary, it will be updated in coordination with the dictionary during the learning process.
We can now obtain a locality constraint term based on the graph Laplacian matrix L as follows. Because each profile and its atom have a one-to-one correspondence, the above term ensures that similar atoms encourage similar profiles [7]. Hence, the correlated information of the common features can be inherited by the coefficient matrix through the graph Laplacian matrix L.
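As a concrete illustration, the sketch below builds the kNN similarity graph over the shared-dictionary atoms with an exponential (heat-kernel) weight, forms the unnormalized graph Laplacian, and evaluates the trace-form locality term; the function name, the symmetrization step and the default parameter values are our own assumptions, not specifics of the original:

```python
import numpy as np

def locality_constraint(D0, A, k=5, sigma=1.0):
    """Build a kNN similarity graph over the shared-dictionary atoms (columns of D0)
    and return the graph Laplacian L together with the locality term tr(A^T L A),
    where the rows of A are the atom profiles."""
    K0 = D0.shape[1]
    sq = np.sum(D0 ** 2, axis=0)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (D0.T @ D0), 0.0)
    W = np.zeros((K0, K0))
    for i in range(K0):
        nn = np.argsort(dist2[i])[1:k + 1]        # k nearest neighbors (skip self)
        W[i, nn] = np.exp(-dist2[i, nn] / sigma)  # exponential (heat-kernel) weight
    W = np.maximum(W, W.T)                        # symmetrize the graph
    L = np.diag(W.sum(axis=1)) - W                # unnormalized graph Laplacian
    locality = np.trace(A.T @ L @ A)  # = 0.5 * sum_ij W_ij ||profile_i - profile_j||^2
    return L, locality
```

During learning, L would be rebuilt whenever the shared dictionary is updated, matching the coordination described above.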
3.2. The Constraints for Particularity Representation
3.2.1. The Label-Aware Constraint
To obtain the particularity representation for classification, we assign labels to the atoms of the class-specific dictionary, as presented in [7,19]. If an atom belongs to the ith class sub-dictionary, the label i is assigned to that atom and kept invariant over the iterations. We take one index set for the atoms of the ith class sub-dictionary, and another index set for all the atoms of the class-specific dictionary. For the particularity representation of the ith class samples, it is desirable that the large coefficients mainly occur on the atoms of the ith class; in other words, the sub-profiles associated with the atoms of the other classes need to be suppressed to some extent.
For the ith class samples, we construct a selection matrix that picks up, from their representation, the sub-profiles located at the atoms of the other classes rather than at the atoms of the ith class, so that we can define a label-aware constraint term as follows, where each entry of the selection matrix indicates whether the corresponding atom belongs to the ith class. For the particularity representation, minimizing the label-aware constraint suppresses the large values in the sub-profiles associated with the atoms of the other classes, and encourages large values in the sub-profiles associated with the ith class atoms. Therefore, this constraint with a proper scalar is expected to make the particularity representation approximately sparse. Besides, once the series of selection matrices is constructed, they remain unchanged over the iterations; thus coding over the class-specific dictionary is very efficient.
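A small numpy sketch of one way to build such selection matrices and evaluate the label-aware term follows; the 0/1 diagonal form and all names (`B_list`, `class_atom_rows`) are illustrative assumptions rather than the exact construction of the original:

```python
import numpy as np

def label_aware_term(B_list, class_atom_rows, K_total):
    """B_list[i]: coefficients of the i-th class samples over the class-specific
    dictionary (K_total x N_i). class_atom_rows[i]: row indices of the atoms that
    belong to class i. Returns sum_i ||P_i B_i||_F^2, where the diagonal matrix P_i
    keeps only the sub-profiles located at atoms of the *other* classes."""
    total = 0.0
    for i, B_i in enumerate(B_list):
        mask = np.ones(K_total)
        mask[class_atom_rows[i]] = 0.0     # zero out the rows of the i-th class atoms
        P_i = np.diag(mask)                # built once, reused in every iteration
        total += np.linalg.norm(P_i @ B_i, 'fro') ** 2
    return total
```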
3.2.2. The Group Regularization
Furthermore, to promote the consistency of the particularity representation within each class, we introduce a group regularization on the distinctive coding coefficients. In light of the label information of the training samples, each sample is associated with a vertex; the vertices corresponding to the samples of the same class are connected to and neighbor each other, so each class forms a densely connected sub-graph. Considering the training samples of one class and their coding coefficients, we first define one graph map per atom of the corresponding sub-dictionary, each mapping the class graph to a line consisting of as many points as there are samples in that class, as follows. Here the kth graph map is the kth component of the class coefficients, which corresponds to the kth atom in the ith class sub-dictionary.
Then, we can calculate the variation of these graph maps as follows, where the normalized Laplacian of the overall graph for the ith class can be derived from its adjacency and degree matrices. For different classes, the vertices related to their samples are not connected; thus, the graphs of the C classes are isolated from each other. Therefore, the total variation of the graph maps for all C classes can be obtained by summing the per-class variations. Keeping this group regularization small promotes the consistency of the representation for samples of the same class. Moreover, by combining it with the label-aware constraint, the coding coefficients of different classes become remarkably distinct, with the large coefficients located in different areas, which is very favorable for the classification task.
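The per-class construction can be sketched in numpy as follows, assuming each class forms a complete graph over its own samples and using the standard symmetric normalized Laplacian; the names and the complete-graph assumption are ours:

```python
import numpy as np

def group_regularization(B_list):
    """B_list[i]: coefficients of the N_i samples of class i, whose rows are the
    graph maps described above. Returns sum_i tr(B_i L_i B_i^T), where L_i is the
    normalized Laplacian of the complete graph over the class-i samples (N_i > 1)."""
    total = 0.0
    for B_i in B_list:
        N_i = B_i.shape[1]
        W = np.ones((N_i, N_i)) - np.eye(N_i)                 # complete within-class graph
        d_inv_sqrt = np.diag(1.0 / np.sqrt(W.sum(axis=1)))
        L_norm = np.eye(N_i) - d_inv_sqrt @ W @ d_inv_sqrt    # normalized Laplacian
        total += np.trace(B_i @ L_norm @ B_i.T)
    return total
```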
5. Experiments
In this section, we compare LPLC-HDL with representative dictionary learning methods including D-KSVD [5], LC-KSVD [6], LCLE-DL [7], FDDL [13], DL-COPAR [16] and CLSDDL [19] on the Yale face dataset [29], the Extended YaleB face dataset [30] and the Labeled Faces in the Wild (LFW) dataset [31] for face recognition, and on the Caltech-101 object dataset [23] and the Oxford 102 Flowers dataset [32] for object classification and flower classification, respectively. Moreover, we further compare it with the Sparse Representation based Classification (SRC) [33], the Collaborative Representation based Classifier (CRC) [34], the Probabilistic CRC (ProCRC) [35], the Sparsity Augmented Collaborative Representation (SACR) [36] and some other state-of-the-art methods on particular datasets.
In the proposed LPLC-HDL method, the three regularization parameters are selected by five-fold cross-validation on the training set, and their optimal values for each dataset are shown in Table 1. In addition, the atom numbers of the shared dictionary and the class-specific dictionary for each dataset are detailed in the following experiments.
5.1. Experiments on the Yale Face Dataset
In the experiments, we first consider the Yale face dataset, which contains 165 gray-scale images of 15 individuals with 11 images per category. Each image of an individual corresponds to a different facial expression or configuration: left-light, center-light, right-light, w/glasses, w/o glasses, happy, sad, normal, surprised, sleepy and wink, as shown in Figure 2a. Each image is resized and normalized to a 576-dimensional vector for representation. Following the setting in [19], six images of each individual are randomly selected for training and the rest are used for testing. In addition, the number of dictionary atoms is set to 15 × 4 + 60 = 120, i.e., four class-specific atoms for each individual and 60 shared atoms for all the individuals. The parameters of our LPLC-HDL on the Yale face dataset can be seen in Table 1.
To acquire a stable recognition accuracy, we run LPLC-HDL 30 times rather than 10 times with independent training/testing splits. The comparison results on the Yale face dataset are listed in Table 2. Since the variations in facial expression are complex, the variations in the testing samples cannot be well represented by directly using the training data; hence SRC has worse accuracy than the dictionary learning methods. We can also see that the hybrid dictionary learning methods, including DL-COPAR [16], LRSDL [17] and CLSDDL [19], outperform the remaining competing approaches, and the proposed LPLC-HDL method achieves the best recognition accuracy of 97.01%, which illustrates that our method can distinguish the shared and class-specific information of face images more appropriately.
We further illustrate the performance of LPLC-HDL in terms of face recognition accuracy with different sizes of the shared and class-specific dictionaries, as shown in Figure 3. From it we can see that a higher recognition accuracy on the Yale face dataset can be obtained by increasing the atom numbers of both the shared and class-specific dictionaries. Besides, Figure 3 also shows that when the number of class-specific atoms is small, the recognition accuracy is sensitive to the number of shared atoms, as a large shared dictionary is harmful to the discriminative power of the class-specific dictionary in this case. However, increasing the number of class-specific atoms beyond the number of training samples of each class brings no notable increase in recognition accuracy. Considering that a large dictionary increases both the training and testing time, the atom numbers of the shared and class-specific dictionaries are not set to large values in the experiments.
5.2. Experiments on the Extended YaleB Face Dataset
The Extended YaleB face dataset contains 2414 frontal face images of 38 people. For each person, there are about 64 face images, and the original images have 192 × 168 pixels. This dataset is challenging due to varying poses and illumination conditions, as displayed in Figure 2b. For comparison, we use the normalized, resized images instead of the original pixel information. In addition, we randomly select 20 images per category for training and take the rest as the testing images. The parameters of LPLC-HDL on the Extended YaleB face dataset are also shown in Table 1. The hybrid dictionary size is set to 38 × 13 + 76 = 570, i.e., 13 class-specific atoms for each person and 76 shared atoms for all the persons, with the same structure adopted by the other hybrid dictionary learning methods.
We repeatedly run LPLC-HDL and all the comparison methods 10 times for reliable accuracy, and the average recognition rates are listed in Table 3. As shown in Table 3, compared with the K-SVD, D-KSVD, FDDL and LC-KSVD methods, LCLE-DL achieves a better recognition result with the same dictionary size. The reason for this behavior is that the LCLE-DL method can effectively utilize the locality and label information of the atoms in dictionary learning. It is also shown that the hybrid dictionary learning methods, including DL-COPAR, LRSDL, CLSDDL and LPLC-HDL, generally outperform the other DL methods, which demonstrates the discriminative ability of the hybrid dictionary. By integrating the locality and label information into the hybrid dictionary, our LPLC-HDL method obtains the best recognition rate of 97.25%, which outperforms the second-best approach CLSDDL-GC by 0.8% and is at least 1.2% higher than the other competing methods.
5.3. Experiments on the LFW Face Dataset
The LFW face dataset has more than 13,000 images labeled with the name of the person pictured, all of which are collected from the web for unconstrained face recognition and verification. Following the prior work [7], we use a subset of the LFW face dataset which consists of 1215 images from 86 persons. In this subset, there are around 11–20 images for each person, and all the images are resized to the same resolution. Some samples from this face dataset are shown in Figure 2c. For each person, eight samples are randomly selected for training and the remaining samples are used for testing. The parameters of LPLC-HDL on the LFW face dataset are also shown in Table 1. In addition, the hybrid dictionary size is set to 86 × 4 + 86 = 430, i.e., four class-specific atoms for each individual and 86 shared atoms for all the individuals.
We repeatedly run LPLC-HDL and the comparison methods 10 times; the average recognition rates are reported in Table 4, where the symbol ± denotes the standard deviation of the average recognition rates. Similar to the results on the Extended YaleB face dataset, the LCLE-DL method achieves a higher recognition rate than the other shared or class-specific DL methods, and the hybrid dictionary learning methods, e.g., DL-COPAR, LRSDL, CLSDDL and LPLC-HDL, show superior performance in general. The proposed LPLC-HDL method obtains the best result with an average recognition rate of 42.39%, which is significantly better than all the comparison methods.
5.4. Object Classification
In this subsection, we evaluate LPLC-HDL on the Caltech-101 dataset [23] for object classification. This dataset contains a total of 9146 images from 101 object categories and a background category. The number of images per category varies from a minimum of 31 to a maximum of 800. The resolution of each image is roughly 300 × 200 pixels, as shown in Figure 4. Following the settings in [6,40], we adopt the Spatial Pyramid Features (SPFs) on this dataset. Firstly, we partition each image into 2^L × 2^L sub-regions at different spatial scales L = 0, 1, 2, then extract SIFT descriptors over each sub-region with a spacing of 8 pixels and concatenate them as the SPFs. Next, we encode the SPFs with a codebook of size 1024. Finally, the dimension of the features is reduced to 3000 using the Principal Component Analysis (PCA) algorithm.
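For illustration, a simplified numpy/scikit-learn sketch of this pipeline is given below; it assumes the dense SIFT descriptors and their positions have already been extracted, uses hard assignment to the 1024-word codebook, and is only an approximation of the exact SPF construction used in [6,40]:

```python
import numpy as np
from sklearn.decomposition import PCA

def spatial_pyramid_feature(descriptors, positions, img_size, codebook, levels=(0, 1, 2)):
    """descriptors: (n, 128) dense SIFT descriptors; positions: (n, 2) (x, y) locations;
    codebook: (1024, 128) visual words. Pools a word histogram in every 2^L x 2^L cell."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)                     # nearest visual word per descriptor
    w, h = img_size
    feats = []
    for L in levels:
        cells = 2 ** L
        cx = np.minimum((positions[:, 0] / w * cells).astype(int), cells - 1)
        cy = np.minimum((positions[:, 1] / h * cells).astype(int), cells - 1)
        for i in range(cells):
            for j in range(cells):
                hist = np.bincount(words[(cx == i) & (cy == j)], minlength=len(codebook))
                feats.append(hist / max(hist.sum(), 1))
    return np.concatenate(feats)

# After stacking the SPFs of all images into a matrix F (n_images x dim),
# the dimensionality is reduced to 3000 with PCA, e.g.:
# F_reduced = PCA(n_components=3000).fit_transform(F)
```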
For this dataset, 10 samples of each category are selected as the training data and the remaining are used for testing. The parameters of LPLC-HDL on this dataset can be seen in Table 1. The hybrid dictionary size is set to 102 × 9 + 100 = 1018, i.e., nine class-specific atoms for each category and 100 shared atoms for all the categories. The proposed LPLC-HDL and the comparison methods are run 10 times; the average classification rates are reported in Table 5. As can be seen in Table 5, our LPLC-HDL method achieves the best classification result again, with improvement margins of at least 1.3% over the comparison methods.
5.5. Flower Classification
We finally evaluate the proposed LPLC-HDL on the Oxford 102 Flowers dataset [32] for fine-grained image classification; this dataset consists of 8189 images from 102 categories, and each category contains at least 40 images. The dataset is very challenging because there exist large variations within the same category but small differences across several categories. The flowers appear at different scales, poses and lighting conditions; some flower images are shown in Figure 5. For each category, 10 images are used for training, 10 for validation, and the rest for testing, as in [32]. For ease of comparison, we take the convolutional neural network (CNN) features provided by Cai et al. [35] as the image-level features.
The parameters of LPLC-HDL on the Oxford 102 Flowers dataset can be seen in Table 1. The size of the hybrid dictionary is set to 102 × 9 + 100 = 1018, i.e., nine class-specific atoms for each category and 100 shared atoms for all the categories. We compare LPLC-HDL with different kinds of representative methods for flower classification, such as the basic classifiers (Softmax and linear SVM [38]), the recent ProCRC [35], the related DDL methods and the deep learning models. Following the common measurement [41,42], we evaluate the comparison methods by the average classification accuracy over all the categories, and the results are reported in Table 6. From them we can see that, taking the CNN features as the image-level features, the proposed LPLC-HDL method outperforms the basic classifiers and the competing DL methods. Compared with the other kinds of methods, LPLC-HDL is greatly superior to SparBases [43] and SMP [44]. Moreover, our method outperforms the recent deep learning models including GoogLeNet-GAP [45], ASPD [46] and AugHS [47], which need specially designed CNN architectures for flower classification.
5.6. Parameter Sensitivity
In the proposed LPLC-HDL method, there are three key parameters, which are used to balance the importance of the label-aware constraint, the group regularization and the locality constraint. To analyze the sensitivity of the parameters, we define a candidate set of values for them and then run LPLC-HDL with different combinations of the parameters on the Yale face dataset and the Caltech 101 dataset. By fixing one of the parameters, the classification accuracy versus different values of the other two parameters is shown in Figure 6a and Figure 7a. As can be seen in the figures, the best classification result can be obtained when these two parameters lie in a feasible range. When the weight of the group regularization is very small, its effect is limited, leading to weak discrimination of the class-specific dictionary. On the other hand, when it becomes very large, the classification accuracy drops as the remaining terms in (4) become less important, which decreases the representation ability of the hybrid dictionary.
By fixing two of the parameters, the classification accuracy versus different values of the remaining parameter is shown in Figure 6b and Figure 7b. From them we can see that the classification accuracy is insensitive to this parameter when its value lies within a certain range, as observed on the Yale face dataset. It should be noted that the incoherence between the shared and class-specific dictionaries increases as this parameter increases, which influences the reconstruction of the test data and decreases the classification accuracy.
Due to the diversity of the datasets, it is still an open problem to adaptively select the optimal parameters for different datasets. In the experiments, we use an effective and simple way to find the optimal values of the three parameters. Based on the previous analysis, we first fix one parameter to a small value such as 0.01, then search the candidate combination of the other two parameters over a coarse set of values. According to the best coarse combination, we further define a fine candidate set in which their optimal values may lie. Then we run the proposed LPLC-HDL again with different combinations of the two parameters selected from the fine candidate set. In this way, we can obtain the optimal values of the parameters for all the experimental datasets; hence, the best classification results are guaranteed.
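A possible implementation of this coarse-to-fine search is sketched below; `make_model` is a hypothetical constructor returning a scikit-learn-style estimator wrapping LPLC-HDL, and the refinement rule (a few points around each coarse optimum) is our own choice:

```python
import itertools
import numpy as np
from sklearn.model_selection import cross_val_score

def coarse_to_fine_search(make_model, X, y, coarse_grid, fixed_value=0.01, n_fine=5):
    """Fix the third parameter to a small value, grid-search the other two over a
    coarse set with five-fold cross-validation, then refine around the best pair."""
    def cv_acc(p1, p2):
        return cross_val_score(make_model(p1, p2, fixed_value), X, y, cv=5).mean()

    best = max(itertools.product(coarse_grid, coarse_grid), key=lambda p: cv_acc(*p))
    fine1 = np.linspace(best[0] / 2.0, best[0] * 2.0, n_fine)   # fine set around optimum
    fine2 = np.linspace(best[1] / 2.0, best[1] * 2.0, n_fine)
    return max(itertools.product(fine1, fine2), key=lambda p: cv_acc(*p))
```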
Besides the three key parameters in our LPLC-HDL method, there are also the width parameter of the exponential function and the neighborhood size k in the proposed locality constraint. In the experiments, we find that these two parameters have stable values on the experimental datasets. For example, Figure 8a,b show the classification accuracies of our LPLC-HDL method with respect to the width parameter, obtained by fixing the remaining parameters, on the Yale face dataset and the Caltech 101 dataset. From the subfigures, we can see that the classification accuracy is insensitive to this parameter, and approximately the best result can be obtained over a fairly wide range of its values. A similar phenomenon can be observed for the parameter k on the experimental datasets. Thus we fix these two parameters in our LPLC-HDL method for convenience of calculation.
5.7. Evaluation of Computational Time
We also conducted experiments to evaluate the running time of the proposed LPLC-HDL and other representative DL methods on the two face datasets and the Caltech-101 dataset; the comparison results are listed in Table 7. "Train" denotes the running time of each iteration, and "Test" is the average processing time for classifying one test sample. All the experiments were conducted on a 64-bit computer with an Intel i7-7700 3.6 GHz CPU and 12 GB RAM under the MATLAB R2019b programming environment. From Table 7, we can see that, although slower than CLSDDL-LC [19], the training efficiency of LPLC-HDL is obviously higher than that of D-KSVD [5], LC-KSVD [6], FDDL [13] and DL-COPAR [16]. In the testing stage, the proposed LPLC-HDL has testing efficiency similar to that of D-KSVD and LC-KSVD, and the testing process of LPLC-HDL is much faster than that of FDDL and DL-COPAR. Specifically, the average time for classifying a test image by LPLC-HDL is always less than that of CLSDDL-LC in the experiments.