Approximate Nearest Neighbor Search Using Enhanced Accumulative Quantization

Ai, Liefu; Cheng, Hongjun; Wang, Xiaoxiao; Chen, Chunsheng; Liu, Deyang; Zheng, Xin; Wang, Yuanzhi

doi:10.3390/electronics11142236


by Liefu Ai ^1,2,*, Hongjun Cheng ^1, Xiaoxiao Wang ^1, Chunsheng Chen ^1, Deyang Liu ^1,2, Xin Zheng ^1,2 and Yuanzhi Wang

^1 School of Computer and Information, Anqing Normal University, Anqing 246133, China

^2 The University Key Laboratory of Intelligent Perception and Computing of Anhui Province, Anqing Normal University, Anqing 246133, China

^* Author to whom correspondence should be addressed.

Electronics 2022, 11(14), 2236; https://doi.org/10.3390/electronics11142236

Submission received: 15 June 2022 / Revised: 9 July 2022 / Accepted: 15 July 2022 / Published: 17 July 2022

(This article belongs to the Special Issue Advances in Signal, Image and Information Processing)


Abstract

Approximate nearest neighbor (ANN) search is fundamental for fast content-based image retrieval. While vector quantization is one key to performing an effective ANN search, in order to further improve ANN search accuracy, we propose an enhanced accumulative quantization (E-AQ). Based on our former work, we introduced the idea of the quarter point into accumulative quantization (AQ). Instead of finding the nearest centroid, the quarter vector was used to quantize the vector and was computed for each vector according to its nearest centroid and second nearest centroid. Then, the error produced through codebook training and vector quantization was reduced without increasing the number of centroids in each codebook. To evaluate the accuracy to which vectors were approximated by their quantization outputs, we realized an E-AQ-based exhaustive method for ANN search. Experimental results show that our approach gained up to 0.996 and 0.776 Recall@100 with eight size 256 codebooks on SIFT and GIST datasets, respectively, which is at least 1.6% and 4.9% higher than six other state-of-the-art methods. Moreover, based on the experimental results, E-AQ needs fewer codebooks while still providing the same ANN search accuracy.

Keywords:

approximate nearest neighbor search; vector quantization; accumulative quantization; quarter point

1. Introduction

With the rapid growth of image data in daily life, large-scale image retrieval has attracted much attention in the image processing, computer vision and machine learning research communities [1]. In particular, nearest neighbor search has been widely studied for fast image retrieval [2]. Compared with exact nearest neighbor (NN) search, approximate nearest neighbor (ANN) search has lower complexity and higher feasibility [3]. The goal of ANN search is to find the sample in the coding space that is, with high probability, closest to the query under a given metric [4]. Normally, to perform an efficient search, the vectors in a dataset need to be stored in memory, which is a key challenge for large-scale datasets [5]. Therefore, ANN-related methods usually apply compact codes to represent each vector in the dataset. According to the current research focus and achievements, ANN search methods fall mainly into two categories: vector quantization and binary hashing [6]. Vector quantization adopts the Euclidean distance to measure the similarity between vectors, while binary hashing usually uses the Hamming distance.

Binary hashing algorithms commonly learn hashing functions to map the image features from high-dimensional space to low-dimensional Hamming space [7,8,9,10,11,12,13,14], matching query and database vectors by their hashing codes [15,16]. However, the representational ability of binary hashing codes is limited by the number of binary codes. As a result, the accuracy of ANN hashing-related methods is generally inferior to that of vector quantization methods [17].

In recent years, many effective quantization methods have been proposed. Among them, product quantization (PQ) [18] was the first to be proposed and applied in ANN search. This method segments the vectors into sub-vectors in the original dimension space and quantizes them separately to reduce the complexity of the algorithm. It then constructs the approximate representation of a feature vector through the Cartesian product of the sub-quantization codes. In this way, PQ can significantly limit the growth in storage required by the quantized codes. Many variants based on PQ [5,19,20,21,22,23,24] further improved the performance of ANN search. In addition, PQ can be applied to construct an index for non-exhaustive ANN search [25], which can improve the time efficiency of ANN search.

Besides PQ and its evolutions, some excellent methods have also been proposed to figure out the optimal scheme of the feature vector compression problem. These different methods were also effectively applied in ANN search. Iterative quantization (ITQ) [26], which has higher precision compared with product quantization, uses the quantization rotation matrix to obtain an optimal result. In addition, some analogous works introduced tree dynamics [27], orthogonality constraints [28] and the additive quantization scheme [29], respectively, to optimize quantization performance. Residual vector quantization (RVQ) [30] has multiple different low-complexity quantizers to reduce the quantization error sequentially. Moreover, to optimize the effect of quantization, some significant methods [31,32,33] have been proposed to provide modified strategies. On the other hand, an analogous method called accumulative quantization (AQ) [34] also constructs approximations by addition.

Recently, some research [1,35,36,37,38,39,40,41] has proposed deep quantization by combining classical quantization methods with a deep neural network. Different from the above methods using artificial image features, deep quantization realizes the end-to-end learning process from image feature extraction to quantization encoding and also works effectively in some applications.

In addition, vector quantization is also used in image compression to reduce the memory required for storing an image. Representative works include [42,43], where codebook design was treated as an optimization task to minimize the error between an image and its reconstruction. Under bio-inspired metaheuristics, FA-LBG [44] was improved by allowing greater influence of the training set in the codebook design [42]. Additionally, FSS-LBG [43] was presented, in which a new breeding operation based on fish school search [45] was proposed to favor exploration ability.

In this paper, we focus on further improving the accuracy of ANN search. We propose an enhanced accumulative quantization that combines accumulative quantization with the idea of the quarter point. Based on our former work of accumulative quantization, after decomposing vectors into several partial vectors, the quarter vector was computed and used as the quantization output instead of the nearest centroid for both training codebooks and quantizing vectors. Due to the lower error produced by the quarter vector than the nearest centroid, the codebooks were used to approximate the vectors more precisely. In addition, we designed a corresponding method of efficiently computing the Euclidean distance between the query vector and a vector in the dataset.

2. Related Works

2.1. Quarter-Point Product Quantization

Like PQ, quarter-point product quantization (QPQ) [24] divides the vectors into M independent sub-vectors, each of dimension D/M, where D is the dimension of the original vectors. Then, a dictionary $C$ consisting of M sub-codebooks $\{C_{1}, C_{2}, \dots, C_{M}\}$ is obtained by performing k-means on the sub-vectors, where $C_{i} = \{c_{i, 1}, c_{i, 2}, \dots, c_{i, K}\}$ ($i = 1, \dots, M$, and $K$ is the number of centroids in each sub-codebook).

After dividing a vector y into M sub-vectors $\{y (i), i = 1, \dots, M\}$, each $y (i)$ is quantized by the corresponding codebook $C_{i}$. Instead of using the nearest centroid as the quantization output, QPQ uses the quarter point $c_{i}^{'} (y (i))$, which is generated from the nearest centroid $c_{i}^{1} (y (i))$ and the second nearest centroid $c_{i}^{2} (y (i))$. The quarter point $c_{i}^{'} (y (i))$ is computed by Formula (1):

c_{i}^{'} (y (i)) = \frac{3}{4} c_{i}^{1} (y (i)) + \frac{1}{4} c_{i}^{2} (y (i))

(1)
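As a minimal illustration of Formula (1), the following Python sketch finds the nearest and second nearest centroids of a sub-vector and forms the quarter point. The function name `quarter_point` and the toy codebook are assumptions introduced for this example; they are not taken from [24].

```python
import math

def quarter_point(v, codebook):
    """Return 3/4 * nearest centroid + 1/4 * second nearest centroid (Formula (1))."""
    # Rank centroid indices by Euclidean distance to v (requires K >= 2).
    order = sorted(range(len(codebook)), key=lambda j: math.dist(v, codebook[j]))
    c1, c2 = codebook[order[0]], codebook[order[1]]
    return [0.75 * a + 0.25 * b for a, b in zip(c1, c2)]

# Hypothetical toy codebook with K = 3 centroids in two dimensions.
codebook = [[0.0, 0.0], [4.0, 0.0], [0.0, 8.0]]
v = [1.0, 0.0]
qp = quarter_point(v, codebook)

err_nearest = math.dist(v, codebook[0])  # error when using the nearest centroid
err_quarter = math.dist(v, qp)           # error when using the quarter point
```

In this toy case the quarter point coincides with `v`, so the error drops from 1 to 0. This is not guaranteed in general: a vector lying on the far side of its nearest centroid, away from the second nearest one, is approximated worse by the quarter point than by the nearest centroid.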

Then, the Euclidean distance d(,) between query vector q and vector y can be approximated with Equation (2), in which the squared distances over the M sub-vectors are summed:

d {(q, y)}^{2} \approx \sum_{i = 1}^{M} d {(q (i), c_{i}^{'} (y (i)))}^{2}

(2)

2.2. Accumulative Quantization

In our former work [34], after decomposing vectors into partial vectors of the same dimension, each codebook was trained on the partial vectors. Accumulative quantization (AQ) then approximates a vector by the sum of the quantization outputs of its partial vectors, each of which is quantized with the corresponding codebook.

Denoting the partial vectors of vector y as $y (i)$ $(i = 1, \dots, M)$, the quantization output of $y (i)$ is denoted as $c_{i} (y (i))$. Thus, y can be approximated by Equation (3):

y \approx \sum_{i = 1}^{M} c_{i} (y (i))

(3)

where $c_{i} (y (i))$ is the nearest centroid to $y (i)$ in codebook $C_{i}$ according to the Euclidean distance.

The asymmetric distance between query vector q and the vector y can be computed with the following Equation (4), in which y is replaced by its approximation from Equation (3):

d (q, y) \approx d (q, \sum_{i = 1}^{M} c_{i} (y (i)))

(4)

where d(,) denotes the Euclidean distance.

The QPQ approach divides the vectors into several low-dimensional sub-vectors. It is based on the assumption that the components of a vector are independent of each other, which generally does not hold for real data. AQ is performed on the whole vector with several codebooks so that the dependence among the components of a vector is preserved. Although the accuracy of ANN search can be improved by increasing the number M of codebooks or the number K of centroids in each codebook, this incurs additional cost. Against this background, we combined AQ with the idea of the quarter point in order to both preserve the dependence among the vector components and improve the accuracy of ANN search without increasing K or M.

3. Enhanced Accumulative Quantization

Generally, for accumulative quantization, the ANN search accuracy can be improved by increasing the number K of centroids when training codebooks. However, larger K results in more training costs. Built on AQ, we propose an enhanced accumulative quantization (E-AQ) to reduce the errors during training codebooks under the same K, so that ANN search accuracy can be improved.

E-AQ consists of codebook training and vector quantization, where codebook training comprises initial codebook training and codebook optimization.

3.1. Codebook Training

By combining the idea of the quarter point with AQ, E-AQ aims to train M codebooks of K centroids each (K × M centroids in total) to quantize and approximate vectors. Instead of the nearest centroid, the quarter point is used as the quantization output.

3.1.1. Initial Codebook Training

Let $X = \{x_{n} \in R^{D}\}$, $n = 1, \dots, N$, be the set of training vectors. Similar to AQ, E-AQ decomposes X into M training sets in which the dimension of the vectors stays unchanged. Each vector in X is equal to the sum of its corresponding M partial vectors. As shown in Figure 1, each vector was first divided into M sub-vectors of dimension D/M. Then, each sub-vector was padded with zeros until the vector dimension was restored to D. Consequently, X was partitioned into M sets $X (1), \dots, X (M)$.
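The decomposition described above can be sketched in a few lines of Python. The helper name `decompose` is hypothetical, and the sketch assumes that D is divisible by M.

```python
def decompose(x, M):
    """Split x into M zero-padded partial vectors, each with the same dimension as x."""
    D = len(x)
    step = D // M  # assumption: D is divisible by M
    parts = []
    for i in range(M):
        part = [0.0] * D
        # Copy the i-th block of x; all other components stay zero.
        part[i * step:(i + 1) * step] = x[i * step:(i + 1) * step]
        parts.append(part)
    return parts

x = [1.0, 2.0, 3.0, 4.0]
parts = decompose(x, 2)
# Summing the partial vectors component-wise restores the original vector.
restored = [sum(col) for col in zip(*parts)]
```

Because the non-zero blocks do not overlap, the partial vectors sum exactly to the original vector, which is the property the text relies on.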

For each training set $X (i)$, k-means was performed to generate an initial codebook $C_{i}$ containing K centroids $\{c_{i, 1}, c_{i, 2}, \dots, c_{i, K}\}$.

Then the training error was measured by the mean squared error (MSE) between vectors and their reconstructions, shown in Formula (5):

M S E = \frac{1}{N} (\sum_{n = 1}^{N} {‖x_{n} - {\hat{x}}_{n}‖}^{2})

(5)

where ${\hat{x}}_{n}$ is the sum of the M quantization outputs obtained after decomposing $x_{n}$ into M partial vectors.

Inspired by the idea of the quarter point, in order to reduce the quantization error, the nearest centroid $c_{i}^{1} (x_{n} (i))$ and the second nearest centroid $c_{i}^{2} (x_{n} (i))$ in each codebook $C_{i}$ were obtained to compute the quarter vector, which was used as the quantization output ${\hat{x}}_{n} (i)$ of the partial vector $x_{n} (i)$.

Then, ${\hat{x}}_{n}$ was formulated as Equation (6):

\begin{array}{l} {\hat{x}}_{n} = \sum_{i = 1}^{M} [{\hat{x}}_{n} (i)] \\ = \sum_{i = 1}^{M} [\frac{3}{4} c_{i}^{1} (x_{n} (i)) + \frac{1}{4} c_{i}^{2} (x_{n} (i))] \end{array}

(6)
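To make Formulas (5) and (6) concrete, the following Python sketch reconstructs a vector as the sum of the quarter vectors of its partial vectors and then measures the MSE. The function names and the two toy codebooks are assumptions made for illustration only.

```python
import math

def quarter_vector(v, codebook):
    """3/4 * nearest centroid + 1/4 * second nearest centroid."""
    order = sorted(range(len(codebook)), key=lambda j: math.dist(v, codebook[j]))
    c1, c2 = codebook[order[0]], codebook[order[1]]
    return [0.75 * a + 0.25 * b for a, b in zip(c1, c2)]

def reconstruct(partials, codebooks):
    """Formula (6): sum of the quarter vectors of the M partial vectors."""
    outputs = [quarter_vector(p, cb) for p, cb in zip(partials, codebooks)]
    return [sum(col) for col in zip(*outputs)]

def mse(vectors, reconstructions):
    """Formula (5): mean squared Euclidean error over the N vectors."""
    total = sum(math.dist(x, xh) ** 2 for x, xh in zip(vectors, reconstructions))
    return total / len(vectors)

# Hypothetical toy setting: D = 2, M = 2, K = 2, zero-padded centroids.
codebooks = [
    [[0.0, 0.0], [4.0, 0.0]],  # C_1 covers the first component
    [[0.0, 0.0], [0.0, 8.0]],  # C_2 covers the second component
]
x = [1.5, 2.0]
partials = [[1.5, 0.0], [0.0, 2.0]]  # zero-padded partial vectors of x
x_hat = reconstruct(partials, codebooks)
error = mse([x], [x_hat])
```

Here `x_hat` is `[1.0, 2.0]`, so the single training vector contributes a squared error of 0.25.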

However, each codebook $C_{i}$ was trained independently so as to minimize the error between $x_{n} (i)$ and ${\hat{x}}_{n} (i)$. Although the quarter point mechanism was adopted, the overall error between $x_{n}$ and ${\hat{x}}_{n}$ was still not minimized. Therefore, a codebook optimization step was needed for this purpose.

3.1.2. Codebooks Optimization

After training the initial codebooks, the error vector $e_{n}$ associated with each training vector $x_{n}$ can be written as Formula (7):

e_{n} = x_{n} - {\hat{x}}_{n} (1 \leq n \leq N)

(7)

The M initial codebooks were optimized sequentially in an iterative manner, as detailed in Algorithm 1. One iteration was complete once all M codebooks had been updated. The MSE was then computed and compared with the MSE of the previous iteration to determine whether to end codebook optimization.

Algorithm 1: Codebook optimization pseudocode.
Inputs: initial codebooks { $C_{i}$ , i = 1, …, M}, quantization outputs { ${\hat{x}}_{n} (i)$ , i = 1, …, M; n = 1, …, N}, overall error vectors { $e_{n}$ , n = 1, …, N}, the number K of centroids in each codebook
Outputs: optimized codebooks { $C_{i}$ , i = 1, …, M}
1.	while true do
2.	for i from 1 to M
3.	compute input vectors by $x_{n}^{'} (i) = {\hat{x}}_{n} (i) + e_{n}$ (n = 1, …, N)
4.	find the nearest centroid in codebook $C_{i}$ for each input vector
5.	for j from 1 to K
6.	collect the input vectors whose nearest centroid is $c_{i, j}$ in $C_{i}$
7.	set the new $c_{i, j}$ to the mean vector of the input vectors collected in step 6
8.	end for
9.	with updated $C_{i}$ , recompute the quarter point for each input vector and take it as new quantization output ${\hat{x}}_{n} (i)$
10.	update each error vector by $e_{n} = x_{n}^{'} (i) - {\hat{x}}_{n} (i)$
11.	end for
12.	if the MSE computed by Formula (5) converges
13.	terminate the algorithm
14.	end if
15.	end while
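The loop structure of Algorithm 1 can be sketched in plain Python as follows. This is an illustrative reading of the pseudocode, not the authors' implementation: the function names, the iteration cap `max_iter`, the tolerance `tol`, and the rule that an empty cluster keeps its old centroid are all assumptions.

```python
import math

def nearest_two(v, codebook):
    """Indices of the nearest and second nearest centroids of v."""
    order = sorted(range(len(codebook)), key=lambda j: math.dist(v, codebook[j]))
    return order[0], order[1]

def quarter_vector(v, codebook):
    j1, j2 = nearest_two(v, codebook)
    return [0.75 * a + 0.25 * b for a, b in zip(codebook[j1], codebook[j2])]

def mean_squared_error(errors):
    """Formula (5), evaluated on the error vectors e_n = x_n - x_hat_n."""
    return sum(sum(c * c for c in e) for e in errors) / len(errors)

def optimize_codebooks(codebooks, outputs, errors, max_iter=10, tol=1e-9):
    """outputs[n][i] is x_hat_n(i); errors[n] is e_n. All arguments are updated in place."""
    prev = mean_squared_error(errors)
    for _ in range(max_iter):
        for i, codebook in enumerate(codebooks):
            # Step 3: input vectors x'_n(i) = x_hat_n(i) + e_n.
            inputs = [[a + b for a, b in zip(out[i], e)]
                      for out, e in zip(outputs, errors)]
            # Step 4: nearest centroid of each input vector.
            assign = [nearest_two(t, codebook)[0] for t in inputs]
            # Steps 5-8: replace each centroid by the mean of its members.
            for j in range(len(codebook)):
                members = [t for t, a in zip(inputs, assign) if a == j]
                if members:  # assumption: an empty cluster keeps its old centroid
                    codebook[j] = [sum(col) / len(members) for col in zip(*members)]
            # Steps 9-10: new quarter vectors and residual error vectors.
            for n, t in enumerate(inputs):
                outputs[n][i] = quarter_vector(t, codebook)
                errors[n] = [a - b for a, b in zip(t, outputs[n][i])]
        # Steps 12-14: stop once the MSE no longer changes noticeably.
        cur = mean_squared_error(errors)
        if abs(prev - cur) < tol:
            break
        prev = cur
    return codebooks

# Hypothetical toy run: N = 4, D = 2, M = 2, K = 2.
X = [[1.0, 2.0], [3.0, 1.0], [0.5, 7.0], [3.5, 6.0]]
codebooks = [[[0.0, 0.0], [4.0, 0.0]], [[0.0, 0.0], [0.0, 8.0]]]
partials = [[[x[0], 0.0], [0.0, x[1]]] for x in X]
outputs = [[quarter_vector(p, cb) for p, cb in zip(ps, codebooks)] for ps in partials]
errors = [[a - sum(o[d] for o in out) for d, a in enumerate(x)]
          for x, out in zip(X, outputs)]
optimize_codebooks(codebooks, outputs, errors)
```

Throughout the run, each training vector equals the sum of its M quantization outputs plus its error vector, so the MSE can be read directly from `errors` without reconstructing the vectors again.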

3.2. Quantizing Vector Based on E-AQ

To quantize a vector y, we generally followed AQ, which comprises initial vector quantization and quantization output optimization. However, instead of the nearest centroid, E-AQ used the quarter point as the quantization output so that y could be approximated more accurately.

(1) Initial vector quantization. After decomposing y into M partial vectors $y (m)$ $(m = 1, \dots, M)$, each $y (m)$ was quantized with the corresponding codebook $C_{m}$, where the nearest centroid $c_{m}^{1} (y (m))$ and second nearest centroid $c_{m}^{2} (y (m))$ were found in order to compute the quarter vector according to Formula (1). Then, M initial quantization outputs $\hat{y} (m)$ $(m = 1, \dots, M)$ were obtained; meanwhile, the overall quantization error vector $e$ was computed by Formulas (6) and (7).

(2) Quantization output optimization. Like AQ, the M quantization outputs were optimized sequentially to minimize the quantization error with fixed codebooks. The details are shown in Algorithm 2. One iteration was complete once all M quantization outputs had been optimized. The newly generated quantization outputs were then compared with those of the previous iteration, and the optimization ended if the quantization outputs had not changed. Figure 2 shows that the curve for optimizing quantization outputs flattened within a small number of iterations, from which we concluded that this process converged rapidly.

Algorithm 2: Quantization output optimization pseudocode.
Inputs: codebooks { $C_{i}$ , i = 1, …, M}, initial quantization outputs { $\hat{y} (m)$ , m = 1, …, M} of vector y, overall error vector e
Outputs: optimized quantization outputs { $\hat{y} (m)$ , m = 1, …, M}
1.	while true do
2.	for m from 1 to M
3.	compute the input vector by $T = \hat{y} (m) + e$
4.	compute the quarter vector and take it as new quantization output $\hat{y} (m)$
5.	compute the residual vector by $e = T - \hat{y} (m)$
6.	end for
7.	if quantization outputs do not change
8.	terminate the algorithm
9.	end if
10.	end while
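A compact Python sketch of Algorithm 2 is given below, under the assumption that the codebooks are fixed and that an iteration cap (`max_iter`, not part of the pseudocode) is added as a safeguard. The function names and toy data are hypothetical.

```python
import math

def quarter_vector(v, codebook):
    """3/4 * nearest centroid + 1/4 * second nearest centroid."""
    order = sorted(range(len(codebook)), key=lambda j: math.dist(v, codebook[j]))
    c1, c2 = codebook[order[0]], codebook[order[1]]
    return [0.75 * a + 0.25 * b for a, b in zip(c1, c2)]

def optimize_outputs(y, codebooks, outputs, max_iter=20):
    """outputs[m] is y_hat(m); returns the optimized outputs and the final error e."""
    # Overall error e = y - sum_m y_hat(m), following Formulas (6) and (7).
    e = [a - sum(o[d] for o in outputs) for d, a in enumerate(y)]
    for _ in range(max_iter):  # assumption: cap on the number of iterations
        changed = False
        for m, codebook in enumerate(codebooks):
            t = [a + b for a, b in zip(outputs[m], e)]  # step 3: T = y_hat(m) + e
            new = quarter_vector(t, codebook)           # step 4: new quarter vector
            e = [a - b for a, b in zip(t, new)]         # step 5: e = T - y_hat(m)
            if new != outputs[m]:
                changed = True
            outputs[m] = new
        if not changed:                                 # steps 7-9: outputs are stable
            break
    return outputs, e

# Hypothetical toy setting: D = 2, M = 2, K = 2.
codebooks = [[[0.0, 0.0], [4.0, 0.0]], [[0.0, 0.0], [0.0, 8.0]]]
y = [1.5, 2.0]
initial = [quarter_vector(p, cb)
           for p, cb in zip([[1.5, 0.0], [0.0, 2.0]], codebooks)]
outputs, e = optimize_outputs(y, codebooks, initial)
```

In this toy case the initial outputs are already stable, so the loop stops after one pass with `e == [0.5, 0.0]`.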

4. E-AQ-Based Exhaustive ANN Search Method

To evaluate the accuracy of approximating vectors by E-AQ, an exhaustive search for ANN was implemented. Then the Euclidean distance d(,) between query vector q and vector y in the dataset was approximated using the Equation (8):

d (q, y) \approx d (q, \hat{y})

(8)

Suppose the M quantization outputs of y are $\hat{y} (m)$, $m = 1, \dots, M$. By applying Formula (6), Formula (8) can be transformed into the following Formula (9):

\begin{array}{l} d {(q, \hat{y})}^{2} \\ = {‖q - \hat{y}‖}^{2} = {‖q - \sum_{m = 1}^{M} \hat{y} (m)‖}^{2} \\ = {‖q - \sum_{m = 1}^{M} [\frac{3}{4} c_{m}^{1} (y (m)) + \frac{1}{4} c_{m}^{2} (y (m))]‖}^{2} \\ = {‖q‖}^{2} + {‖\hat{y}‖}^{2} - 2 \sum_{m = 1}^{M} 〈q, (\frac{3}{4} c_{m}^{1} (y (m)) + \frac{1}{4} c_{m}^{2} (y (m)))〉 \\ = {‖q‖}^{2} + {‖\hat{y}‖}^{2} - \sum_{m = 1}^{M} [\frac{3}{2} q^{T} \cdot c_{m}^{1} (y (m)) + \frac{1}{2} q^{T} \cdot c_{m}^{2} (y (m))] \end{array}

(9)

The vectors q and y were both assumed to be column vectors; consequently, the terms $q^{T} \cdot c_{m}^{1} (y (m))$ and $q^{T} \cdot c_{m}^{2} (y (m))$ were inner products between q and centroids in the mth codebook, which were calculated and stored in a look-up table when q was submitted. The term ${‖\hat{y}‖}^{2}$ was pre-computed once y had been fully quantized. The term ${‖q‖}^{2}$ was constant for all database vectors and did not affect the ranking in the ANN search, so it did not need to be computed.
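The look-up-table evaluation of Formula (9) can be sketched as follows. The sketch assumes that each database vector is stored as M pairs of centroid indices (nearest, second nearest) together with its pre-computed squared norm; this storage layout and the function names are assumptions, since the text does not specify them.

```python
def build_lookup(q, codebooks):
    """table[m][k] = inner product <q, c_{m,k}>, computed once per query."""
    return [[sum(a * b for a, b in zip(q, c)) for c in cb] for cb in codebooks]

def approx_sq_distance(table, codes, y_hat_sq_norm):
    """Formula (9) without the constant ||q||^2 term.

    codes[m] = (index of nearest centroid, index of second nearest centroid).
    """
    dot = sum(1.5 * table[m][k1] + 0.5 * table[m][k2]
              for m, (k1, k2) in enumerate(codes))
    return y_hat_sq_norm - dot

# Hypothetical toy setting: D = 2, M = 2, K = 2.
codebooks = [[[0.0, 0.0], [4.0, 0.0]], [[0.0, 0.0], [0.0, 8.0]]]
q = [1.0, 1.0]
codes = [(0, 1), (0, 1)]  # y_hat = [1.0, 0.0] + [0.0, 2.0] = [1.0, 2.0]
y_hat_sq_norm = 5.0       # ||y_hat||^2, pre-computed at quantization time
table = build_lookup(q, codebooks)
score = approx_sq_distance(table, codes, y_hat_sq_norm)
```

Adding back the omitted constant, here 2.0, gives 1.0, which matches the squared distance between `q` and `[1.0, 2.0]` computed directly. Each database vector therefore costs 2M table look-ups per query rather than a D-dimensional distance computation.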

5. Experiments

5.1. Datasets

Two publicly available datasets, the SIFT descriptor dataset and the GIST descriptor dataset, were used to evaluate performance.

Both the SIFT and GIST datasets had three subsets: a learning set, a database set and a query set. The learning set was used to train the codebooks, and the database and query sets were used for evaluating quantization performance and ANN search performance. For the SIFT dataset, the learning set was extracted from Flickr images [46] and the database and query vectors were extracted from INRIA Holidays images [47]. For the GIST dataset, the learning set consisted of the tiny image set of [48]. The database set was the Holidays image set combined with Flickr 1M. The query vectors were extracted from the Holidays image queries. All the descriptors were high-dimensional float vectors. The details of the datasets are given in Table 1.

5.2. Convergence of Training Codebooks

For ease of implementation, a fixed total of 10 iterations was used as the stopping condition when optimizing codebooks, instead of a preset convergence threshold. The training error produced in training codebooks was used to measure convergence.

The number of centroids in each codebook was denoted as the parameter K, which took values in {64, 256}. The number of codebooks was denoted as the parameter M, which took values in {2, 4, 8, 16}.

Figure 3 and Figure 4 show the convergence of training codebooks under different parameters of K and M on SIFT-1M and GIST-1M, respectively. When the iteration number is 0, the training error represents the error produced while training the initial codebooks.

The training error was measured by the MSE, which was computed according to Formula (5).

With larger K or M, the final training error became smaller. Compared to the error from training the initial codebooks, the training error was reduced significantly after codebook optimization. Moreover, the process of optimizing codebooks converged within a few iterations, as can be observed from Figure 3 and Figure 4: it took only 5 iterations on SIFT-1M and only 7 iterations on GIST-1M.

5.3. Comparison of Training Error and Quantization Error

Figure 5 compares the training error produced by training codebooks with E-AQ and with AQ [34] under different parameters on SIFT-1M and GIST-1M. The training error was measured by the MSE, computed according to Formula (5). The parameter K denotes the number of centroids in each codebook, with values in {64, 256}. The parameter M denotes the number of codebooks, with values in {2, 4, 8, 16}.

It can be seen from Figure 5 that the training error of both AQ and E-AQ reduced with increasing values of K and M. Meanwhile, under the same K or M, E-AQ had a smaller training error than AQ on both SIFT-1M and GIST-1M. We may draw the conclusion that E-AQ can further improve the discrimination of codebooks by combining with the quarter point mechanism.

Figure 6 compares the errors produced through quantizing 1M database vectors in SIFT and GIST datasets under different K and M. The overall quantization error was measured with the MSE between vectors and their reconstructions. It can be seen that E-AQ had lower overall quantization error than AQ under the same K and M. This means that vectors can be approximated more accurately by E-AQ than AQ.

5.4. ANN Search Performance

Exhaustive ANN search was implemented on SIFT-1M and GIST-1M to evaluate the accuracy to which vectors were approximated by their quantization output.

This section mainly discusses the impacts of K and M on the accuracy of ANN search, which was measured with Recall@R. Recall@R was defined as the proportion of query vectors for which the true nearest neighbor was ranked within the first R positions. The larger the Recall@R, the better the accuracy of the ANN search.
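The Recall@R measure defined above can be computed as in the following Python sketch. The function name `recall_at_r` and the toy rankings are assumptions for illustration.

```python
def recall_at_r(ranked_lists, true_neighbors, R):
    """Fraction of queries whose true nearest neighbor is among the first R results."""
    hits = sum(1 for ranked, nn in zip(ranked_lists, true_neighbors)
               if nn in ranked[:R])
    return hits / len(true_neighbors)

# Hypothetical example with two queries; entries are database vector ids.
ranked_lists = [[3, 1, 2], [0, 2, 1]]  # ANN results, best first
true_neighbors = [1, 1]                # exact nearest neighbor of each query

r1 = recall_at_r(ranked_lists, true_neighbors, 1)
r2 = recall_at_r(ranked_lists, true_neighbors, 2)
r3 = recall_at_r(ranked_lists, true_neighbors, 3)
```

For this toy input the values are 0.0, 0.5 and 1.0, illustrating that Recall@R is non-decreasing in R.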

Figure 7 and Figure 8 show the Recall@R (R = 1, 2, 5, 10, 20, 50, 100) when performing E-AQ for exhaustive ANN search on SIFT-1M and GIST-1M, respectively. The parameter K was set as $K \in \{64, 128, 256, 512\}$, and the parameter M was set as $M \in \{2, 4, 8, 16\}$.

Each curve shows that the ANN search accuracy improved with increasing values of K under the same M. Comparing panels (a), (b), (c) and (d) in Figure 7 and Figure 8 shows that larger M brought better ANN search accuracy under the same K. Comparing Figure 5 with Figure 7 and Figure 8, larger K and larger M led to lower training error as well as a more accurate ANN search. Therefore, the conclusion can be drawn that the accuracy of E-AQ-based ANN search can be improved by reducing the error in training codebooks.

5.5. Comparison with the State of the Art

We compared our method with six state-of-the-art methods regarding the accuracy of exhaustive ANN search. These methods included PQ [18], OPQ [19], RVQ [30], QPQ [24], NOCQ [28] and AQ [34].

The parameters in these 6 methods were typically set as K = 256 and M = 8 in the corresponding references. For PQ, OPQ and QPQ, these approaches divided the vectors into 8 sub-vectors and each sub-vector was quantized with a codebook of 256 centroids. For RVQ, NOCQ and AQ, these approaches quantized the vectors with 8 codebooks where each codebook possessed 256 centroids. Hence, we set the parameters K and M to be the same in this experiment for consistency.

Figure 9 compares the exhaustive ANN search accuracy between E-AQ and these 6 approaches on SIFT-1M and GIST-1M, where the accuracy was measured with Recall@R.

From Figure 9a, it can be seen that E-AQ was superior to PQ, OPQ, RVQ, QPQ, NOCQ and AQ under the same scale of codebooks. Due to the larger dimension of the GIST vector compared to the SIFT vector, the Recall@R of all the methods decreased on the GIST-1M dataset, especially for PQ and QPQ. However, E-AQ still had the best ANN search accuracy.

Comparing E-AQ with AQ, E-AQ had better accuracy than AQ on both the SIFT-1M and GIST-1M datasets, demonstrating that E-AQ can further improve AQ by combining with the idea of the quarter point so that a vector can be approximated by its quantization outputs more accurately.

Table 2 and Table 3 also show the comparison between E-AQ and the other 6 methods of exhaustive ANN search, in which the search result of E-AQ with M = 6 and M = 7 is further detailed.

In some cases the dimension D of a vector is not divisible by M when E-AQ decomposes the vector into M partial vectors, such as M = 7 on a SIFT vector. In that case, we set the number of non-zero components in each of the first M − 1 partial vectors (the first 6 for M = 7) as floor(D/M) and the number of non-zero components in the last partial vector as floor(D/M) + (D % M).
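The splitting rule for a dimension that is not divisible by M amounts to the short Python sketch below. The function name `partial_sizes` is hypothetical, and D = 128 is assumed for SIFT descriptors.

```python
def partial_sizes(D, M):
    """Number of non-zero components in each of the M partial vectors."""
    base = D // M
    # The first M - 1 partial vectors get floor(D/M) components;
    # the last one also absorbs the remainder D % M.
    return [base] * (M - 1) + [base + D % M]

sizes = partial_sizes(128, 7)  # assumed SIFT dimension D = 128, M = 7
```

For D = 128 and M = 7 this gives six partial vectors with 18 non-zero components and a last one with 20, which together cover all 128 components.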

From Table 2 and Table 3, we can see that E-AQ outperformed the other methods under the same K and M. Moreover, E-AQ with M = 7 had better Recall@R than PQ, OPQ, RVQ, NOCQ and QPQ and Recall@R comparable to that of AQ on both SIFT-1M and GIST-1M. This demonstrates that E-AQ needs fewer codebooks than the other methods to reach the same search accuracy.

6. Conclusions

By combining AQ with the idea of the quarter point, we propose E-AQ for improving the accuracy of ANN search. After decomposing vectors into partial vectors, the quarter vector of each input vector is computed as the corresponding quantization output instead of using the nearest centroid in both training codebooks and quantizing vectors; as a result, vectors can be approximated by their quantization outputs more accurately. The experimental results show that E-AQ obtained better ANN search accuracy than state-of-the-art methods. Furthermore, under the condition of gaining the same ANN search accuracy, E-AQ needed fewer codebooks than other methods. This demonstrates that our approach was effective in further improving the accuracy of exhaustive ANN search.

Our approach achieved an improvement compared to AQ without increasing K and M. However, the process was performed based on the original dimension of the vectors. Therefore, the vector dimensions are still a restriction on performance. We will further investigate a method that performs in low dimensions while preserving the dependence among vector components. The k-means method was used to train the initial codebooks in our approach. It is simple and effective, but has some limitations, such as pre-setting the cluster number and only using the Euclidean distance to measure similarity. We will try to study bio-inspired metaheuristics for training codebooks. Additionally, as exhaustive ANN search needs to compute the distance between the query vector and all the vectors in a dataset, all the distances are sorted; as a result, the search time efficiency mostly depends on the scale of the dataset. In the future, we will investigate an efficient non-exhaustive ANN search with E-AQ.

Author Contributions

Conceptualization, L.A. and H.C.; methodology, L.A. and H.C.; validation, L.A., H.C., X.W., C.C., D.L. and X.Z.; formal analysis, L.A.; investigation, H.C., X.W. and C.C.; resources, L.A., D.L. and X.Z.; writing—original draft preparation, H.C.; writing—review and editing, L.A., D.L. and X.Z.; visualization, H.C., X.W. and C.C.; supervision, L.A.; project administration, L.A., D.L. and X.Z.; funding acquisition, L.A., D.L., X.Z. and Y.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China, grant number 61801006; the National Natural Science Foundation of Anhui Province in China, grant numbers 1608085MF144 and 1908085MF194; the University Science Research Project of Anhui Province in China, grant numbers KJ2020A0498 and AQKJ2015B006; and the National Key Research and Development Program of China, grant number SQ2020YFF0402315.

Data Availability Statement

The data presented in this study are openly available in the cited references. Experimental code related to this paper can be obtained by contacting the corresponding author.

Acknowledgments

The authors would like to thank the anonymous reviewers for their constructive comments that greatly improved the quality of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest regarding the publication of this paper. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Jang, Y.K.; Cho, N.I. Self-supervised Product Quantization for Deep Unsupervised Image Retrieval. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 12085–12094.
2. Li, W.; Zhang, Y.; Sun, Y.; Wang, W.; Li, M.; Zhang, W.; Lin, X. Approximate nearest neighbor search on high dimensional data—Experiments, analyses, and improvement. IEEE Trans. Knowl. Data Eng. 2019, 32, 1475–1488.
3. Ozan, E.C.; Kiranyaz, S.; Gabbouj, M. K-subspaces quantization for approximate nearest neighbor search. IEEE Trans. Knowl. Data Eng. 2016, 28, 1722–1733.
4. Liu, H.; Ji, R.; Wang, J.; Shen, C. Ordinal Constraint Binary Coding for Approximate Nearest Neighbor Search. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 941–955.
5. Pan, Z.; Wang, L.; Wang, Y.; Liu, Y. Product quantization with dual codebooks for approximate nearest neighbor search. Neurocomputing 2020, 401, 59–68.
6. Ai, L.; Yu, J.; He, Y.; Guan, T. High-dimensional indexing technologies for large scale content-based image retrieval: A review. J. Zhejiang Univ.-Sci. C 2013, 14, 505–520.
7. Paulevé, L.; Jégou, H.; Amsaleg, L. Locality sensitive hashing: A comparison of hash function types and querying mechanisms. Pattern Recognit. Lett. 2010, 31, 1348–1358.
8. Xu, H.; Wang, J.; Li, Z.; Zeng, G.; Li, S.; Yu, N. Complementary hashing for approximate nearest neighbor search. In Proceedings of the International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 1631–1638.
9. Zhang, X.; Liu, W.; Dundar, M.; Badve, S.; Zhang, S. Towards large-scale histopathological image analysis: Hashing-based image retrieval. IEEE Trans. Med. Imaging 2014, 34, 496–506.
10. Huang, Q.; Feng, J.; Zhang, Y.; Fang, Q.; Ng, W. Query-aware locality-sensitive hashing for approximate nearest neighbor search. Proc. VLDB Endow. 2015, 9, 1–12.
11. Lu, X.; Zheng, X.; Li, X. Latent semantic minimal hashing for image retrieval. IEEE Trans. Image Process. 2016, 26, 355–368.
12. Gu, Y.; Yang, J. Multi-level magnification correlation hashing for scalable histopathological image retrieval. Neurocomputing 2019, 351, 134–145.
13. Cai, D. A revisit of hashing algorithms for approximate nearest neighbor search. IEEE Trans. Knowl. Data Eng. 2019, 33, 2337–2348.
14. Lu, H.; Zhang, M.; Xu, X.; Li, Y.; Shen, H.T. Deep fuzzy hashing network for efficient image retrieval. IEEE Trans. Fuzzy Syst. 2020, 29, 166–176.
15. Liu, X.; Du, B.; Deng, C.; Liu, M.; Lang, B. Structure sensitive hashing with adaptive product quantization. IEEE Trans. Cybern. 2015, 46, 2252–2264.
16. Ozan, E.C.; Kiranyaz, S.; Gabbouj, M. M-pca binary embedding for approximate nearest neighbor search. In Proceedings of the 2015 IEEE Trustcom/BigDataSE/ISPA, Helsinki, Finland, 20–22 August 2015; pp. 1–5.
17. Wang, J.; Zhang, T.; Song, J.; Sebe, N.; Shen, H.T. A survey on learning to hash. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 769–790.
18. Jegou, H.; Douze, M.; Schmid, C. Product quantization for nearest neighbor search. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 117–128.
19. Ge, T.; He, K.; Ke, Q.; Sun, J. Optimized product quantization. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 36, 744–755.
20. Kalantidis, Y.; Avrithis, Y. Locally optimized product quantization for approximate nearest neighbor search. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2321–2328.
21. Heo, J.P.; Lin, Z.; Yoon, S.E. Distance encoded product quantization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2131–2138.
22. Zhan, J.; Mao, J.; Liu, Y.; Guo, J.; Zhang, M.; Ma, S. Jointly optimizing query encoder and product quantization to improve retrieval performance. In Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Queensland, Australia, 1–5 November 2021; pp. 2487–2496.
23. Chen, T.; Li, L.; Sun, Y. Differentiable product quantization for end-to-end embedding compression. In Proceedings of the International Conference on Machine Learning, Vienna, Austria, 12–18 July 2020; pp. 1617–1626.
24. An, S.; Huang, Z.; Bai, S.; Che, G.; Ma, X.; Luo, J.; Chen, Y. Quarter-Point Product Quantization for approximate nearest neighbor search. Pattern Recognit. Lett. 2019, 125, 187–194.
25. Yuan, X.; Liu, Q.; Long, J.; Hu, L.; Wang, S. Multi-PQTable for Approximate Nearest-Neighbor Search. Information 2019, 10, 190.
26. Gong, Y.; Lazebnik, S.; Gordo, A.; Perronnin, F. Iterative quantization: A procrustean approach to learning binary codes for large-scale image retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 35, 2916–2929.
Babenko, A.; Lempitsky, V. Tree quantization for large-scale similarity search and classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 4240–4248. [Google Scholar]
Wang, J.; Zhang, T. Composite quantization. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 41, 1308–1322. [Google Scholar] [CrossRef]
Babenko, A.; Lempitsky, V. Additive quantization for extreme vector compression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 931–938. [Google Scholar]
Chen, Y.; Guan, T.; Wang, C. Approximate nearest neighbor search by residual vector quantization. Sensors 2010, 10, 11259–11273.
Ai, L.; Yu, J.; Wu, Z.; He, Y.; Guan, T. Optimized residual vector quantization for efficient approximate nearest neighbor search. Multimed. Syst. 2017, 23, 169–181.
Wei, B.; Guan, T.; Yu, J. Projected residual vector quantization for ANN search. IEEE Multimed. 2014, 21, 41–51.
Ai, L.; Cheng, H.; Tao, Y.; Yu, J.; Zheng, X.; Liu, D. Codewords-Expanded Enhanced Residual Vector Quantization for Approximate Nearest Neighbor Search. J. Comput.-Aided Des. Comput. Graph. 2022, 34, 459–469.
Ai, L.; Tao, Y.; Cheng, H.; Wang, Y.; Xie, S.; Liu, D. Accumulative Quantization for Approximate Nearest Neighbor Search. Comput. Intell. Neurosci. 2022, 2022, 4364252.
Yu, T.; Yuan, J.; Fang, C.; Jin, H. Product quantization network for fast image retrieval. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 9–14 September 2018; pp. 186–201.
Klein, B.; Wolf, L. End-to-end supervised product quantization for image search and retrieval. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 5041–5050.
Jang, Y.K.; Cho, N.I. Generalized product quantization network for semi-supervised image retrieval. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 3420–3429.
Zhai, Q.; Jiang, M. Deep Product Quantization for Large-Scale Image Retrieval. In Proceedings of the IEEE 4th International Conference on Big Data Analytics, Suzhou, China, 15–18 March 2019; pp. 198–202.
Yu, T.; Meng, J.; Fang, C.; Jin, H.; Yuan, J. Product Quantization Network for Fast Visual Search. Int. J. Comput. Vis. 2020, 128, 2325–2343.
Liu, M.; Dai, Y.; Bai, Y.; Duan, L.-Y. Deep Product Quantization Module for Efficient Image Retrieval. In Proceedings of the ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, Barcelona, Spain, 4–8 May 2020; pp. 4382–4386.
Feng, Y.; Chen, B.; Dai, T.; Xia, S.T. Adversarial attack on deep product quantization network for image retrieval. In Proceedings of the AAAI Conference on Artificial Intelligence, New York, NY, USA, 7–12 February 2020; pp. 10786–10793.
Severo, V.; Leitao, H.; Lima, J.; Lopes, W.; Madeiro, F. Modified firefly algorithm applied to image vector quantisation codebook design. Int. J. Innov. Comput. Appl. 2016, 7, 202–213.
Fonseca, C.; Ferreira, F.; Madeiro, F. Vector quantization codebook design based on Fish School Search algorithm. Appl. Soft Comput. J. 2018, 73, 958–968.
Horng, M. Vector quantization using the firefly algorithm for image compression. Expert Syst. Appl. 2012, 39, 1078–1091.
Filho, C.; Neto, F.; Lins, A.; Nascimento, A.; Lima, M.P. A novel search algorithm based on fish school behavior. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Singapore, 12–15 October 2008; pp. 2646–2651.
Jégou, H.; Douze, M.; Schmid, C. Hamming embedding and weak geometric consistency for large scale image search. In Proceedings of the European Conference on Computer Vision, Marseille, France, 12–18 October 2008; pp. 304–317.
The INRIA Holidays Dataset. Available online: http://lear.inrialpes.fr/people/jegou/data.php#holidays (accessed on 13 June 2022).
Torralba, A.; Fergus, R.; Freeman, W.T. 80 million tiny images: A large database for non-parametric object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2008, 30, 1958–1970.

Figure 1. The data processing flow chart of codebook generation.

Figure 2. The convergence of quantization output optimization on a representative dataset of 1M SIFT database vectors with 8 codebooks, each codebook containing 256 centroids.

Figure 3. The convergence of training codebooks on SIFT-1M.

Figure 4. The convergence of training codebooks on GIST-1M.

Figure 5. The errors produced by training M codebooks on different datasets.

Figure 6. The quantization errors on 1M database vectors under different K and M.

Figure 7. ANN search accuracies under different K and M on SIFT-1M.

Figure 8. ANN search accuracies under different K and M on GIST-1M.

Figure 9. Comparison of exhaustive ANN searches.

Table 1. Summary of the used datasets.

Information	SIFT-1M	GIST-1M
Dimension	128	960
Size of learning set	100,000	500,000
Size of database set	1,000,000	1,000,000
Size of query set	10,000	1000

Table 2. Comparison of exhaustive ANN searches on SIFT-1M.

Methods	Recall@1	Recall@10	Recall@100
PQ	0.228	0.603	0.922
OPQ	0.245	0.639	0.940
RVQ	0.257	0.659	0.952
NOCQ	0.290	0.715	0.970
QPQ	0.315	0.745	0.975
AQ	0.300	0.740	0.980
E-AQ (M = 6)	0.284	0.737	0.963
E-AQ (M = 7)	0.368	0.811	0.989
E-AQ (M = 8)	0.401	0.852	0.996
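
The Recall@R columns can be read with the usual definition in mind: the fraction of queries whose true nearest neighbor appears among the first R returned results. The following minimal Python sketch assumes that definition. The function name `recall_at_r` and the toy rankings are illustrative and are not taken from the paper.

```python
from typing import Sequence


def recall_at_r(ranked_ids: Sequence[Sequence[int]],
                true_nn: Sequence[int],
                r: int) -> float:
    """Fraction of queries whose true nearest neighbor is in the top-r results.

    ranked_ids[i] is the ranked list of database ids returned for query i,
    and true_nn[i] is the id of its exact nearest neighbor.
    """
    if len(ranked_ids) != len(true_nn):
        raise ValueError("one ground-truth id is required per query")
    if not true_nn:
        raise ValueError("at least one query is required")
    hits = sum(1 for ranked, gt in zip(ranked_ids, true_nn) if gt in ranked[:r])
    return hits / len(true_nn)


# Toy data (hypothetical): four queries, three ranked results each.
ranked = [[7, 3, 9], [4, 1, 8], [2, 5, 6], [0, 9, 1]]
truth = [7, 8, 5, 3]

for r in (1, 2, 3):
    print(f"Recall@{r} = {recall_at_r(ranked, truth, r):.2f}")
```

Under this definition Recall@R cannot decrease as R grows, which matches the left-to-right increase in every row of the table.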

Table 3. Comparison of exhaustive ANN searches on GIST-1M.

Methods	Recall@1	Recall@10	Recall@100
PQ	0.054	0.121	0.319
OPQ	0.128	0.344	0.696
RVQ	0.113	0.325	0.676
NOCQ	0.140	0.378	0.730
QPQ	0.063	0.152	0.377
AQ	0.110	0.350	0.740
E-AQ (M = 6)	0.121	0.332	0.669
E-AQ (M = 7)	0.123	0.360	0.744
E-AQ (M = 8)	0.145	0.410	0.776
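
The gap between E-AQ (M = 8) and the strongest competing method at Recall@100 can be checked directly from Tables 2 and 3. The sketch below assumes that the percentages quoted in the abstract are relative gains over the best baseline; the helper name `relative_gain_pct` is illustrative.

```python
# Recall@100 values copied from Tables 2 and 3.
baselines = {
    "SIFT-1M": {"PQ": 0.922, "OPQ": 0.940, "RVQ": 0.952,
                "NOCQ": 0.970, "QPQ": 0.975, "AQ": 0.980},
    "GIST-1M": {"PQ": 0.319, "OPQ": 0.696, "RVQ": 0.676,
                "NOCQ": 0.730, "QPQ": 0.377, "AQ": 0.740},
}
e_aq_m8 = {"SIFT-1M": 0.996, "GIST-1M": 0.776}


def relative_gain_pct(ours: float, baseline: float) -> float:
    """Relative improvement of `ours` over `baseline`, in percent."""
    return 100.0 * (ours - baseline) / baseline


best = {}   # strongest baseline per dataset
gains = {}  # smallest relative gain of E-AQ (M = 8) per dataset
for dataset, scores in baselines.items():
    name = max(scores, key=scores.get)
    best[dataset] = name
    gains[dataset] = relative_gain_pct(e_aq_m8[dataset], scores[name])
    print(f"{dataset}: best baseline {name}, gain {gains[dataset]:.1f}%")
```

With the table values this gives roughly 1.6% on SIFT-1M and 4.9% on GIST-1M, in both cases relative to AQ, which is the closest competitor at Recall@100.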

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
