Abstract
We suggest a new technique for developing noisy tree data structures. We call it a “walking tree”. As applications of the technique we present a noisy Self-Balanced Binary Search Tree (we use a Red–Black tree as an implementation) and a noisy segment tree. The asymptotic complexity of the main operations for the tree data structures does not change compared to the case without noise. We apply the data structures in quantum algorithms for several problems on strings like the string-sorting problem and auto-complete problem. For both problems, we obtain quantum speed-up. Moreover, for the string-sorting problem, we show a quantum lower bound.
Keywords: noisy computation; self-balanced search tree; segment tree; quantum computation; quantum algorithms; string processing; sorting
MSC: 68P05; 68W20; 68Q12
1. Introduction
Tree data structures are well-known and used in different algorithms. At the same time, when we construct algorithms with random behavior, like randomized and quantum algorithms, we should consider the error probability. We suggest a general method for updating a tree data structure in the noisy case. We call it a Walking tree. For a tree of height h, we consider an operation that processes all nodes from the root to a target node. Suppose the running time of the operation is O(h · T), where T is the running time required to process one node. Then, if the navigation by the tree can have an error, our technique allows us to carry out the operation in O(T · (h + log(1/ε))) running time, where ε is the error probability for the whole operation. Note that the standard way to handle probabilistic navigation is success probability boosting (repetition of the noisy action), with O(T · h · log(h/ε)) complexity.
Our technique is based on results for the noisy Binary Search algorithm [1]. The authors of that paper present an idea based on a random walk over a balanced binary tree that can be constructed for the binary search algorithm. We generalize the idea to a tree of arbitrary structure, which allows us to apply the method to a wide class of tree data structures. Different algorithms for noisy search, especially noisy tree and graph processing and search, were considered in [2,3,4,5,6,7,8]. We apply our technique to two tree data structures. The first one is the Red–Black tree [9], which is an implementation of a self-balanced binary search tree [9]. If the key-comparing procedure has a bounded error, then our noisy self-balanced binary search tree allows us to conduct add, remove, and search operations in O(log N + log(1/ε)) running time, where ε is the error probability for a whole operation and N is the number of nodes in the tree. In the case of ε = 1/poly(N), we have O(log N) running time, and the noisy key-comparing procedure does not affect the running time (asymptotically). At the same time, if we use the success probability boosting technique, then the running time is O(log N · log((log N)/ε)). The second one is the Segment tree [10,11]. If the index-comparing procedure has a bounded error, then our noisy segment tree allows us to conduct update and request operations in O(log N + log(1/ε)) running time, where ε is the error probability for a whole operation and N is the number of leaves. In the case of ε = 1/poly(N), we have O(log N) running time. So, we obtain a similar advantage. We use these data structures in the context of quantum computation [12], which has been one of the hot topics of the last decades. There are many problems where we can obtain a quantum speed-up. Some of them can be found in [13,14], including graph problems [15,16,17,18,19,20,21] and string processing problems [22,23,24,25,26,27,28]. Quantum algorithms have randomized behavior, so it is important to use noisy data structures for this model.
We use the quantum query model [29] as the main computational model for the quantum algorithms. We apply the walking tree method to the following problems.
The first one is the string-sorting problem. We want to sort n strings of length l in the lexicographical order. However, quantum algorithms cannot sort arbitrary comparable objects faster than Ω(n log n) [30,31]. At the same time, some results improve the hidden constant [32,33]. Other researchers investigated the space-bounded case [34]. The situation with sorting strings is a little different. We know that the classical Radix sort algorithm has O(n · l) running time [9] for a finite-size alphabet. That is faster than sorting algorithms for arbitrary comparable objects. Here, the lower bound for classical (randomized or deterministic) algorithms is Ω(n · l). In the quantum case, faster algorithms with Õ(n√l) running time are known [35,36]. Here, Õ does not consider log factors. In this paper, we suggest a simpler implementation based on a noisy red–black tree.
The second one is the Auto-Complete Problem. We have two kinds of queries: adding a string s to the dictionary and querying the most frequent completion of a string t from the dictionary. We call s a completion of t if t is a prefix of s. Assume that L is the total sum of the lengths of the strings from all queries. We solve the problem using a quantum string-comparing algorithm [35,36,37,38,39] and a noisy Red–Black tree. The running time of the quantum algorithm is Õ(√(n · L)). The lower bound for the quantum running time is Ω(√L). At the same time, the best classical algorithm based on the trie (prefix tree) [40,41,42,43] has O(L) running time. That is also the classical (deterministic or randomized) lower bound Ω(L). So, we obtain a quantum speed-up if most of the strings have length ω(log² n).
2. Preliminaries
In the paper, we use the following notation.
- log x means the logarithm of x with base 2.
- poly(N) means a polynomial dependence on N. Formally, f(N) = poly(N) if f(N) ≤ c₂ · N^c₁ for some constants c₁ and c₂.
- O(f(N)) means big-O notation, an upper bound. Formally, g(N) = O(f(N)) if g(N) ≤ c · f(N) for some constant c > 0 and all sufficiently large N.
- Õ(f(N)) means big-O notation with ignored log factors. Formally, g(N) = Õ(f(N)) if g(N) = O(f(N) · (log N)^c) for some constant c > 0.
- Ω(f(N)) means big-Omega notation, a lower bound. Formally, g(N) = Ω(f(N)) if g(N) ≥ c · f(N) for some constant c > 0 and all sufficiently large N.
In the paper, for two strings s and t, the notation s < t means that s precedes t in the lexicographical order. Let |s| be the length of a string s.
2.1. Graph Theory
Let us consider a rooted tree G. Let V be the set of nodes (vertices), and let E be the set of edges. Let one fixed node be the root of the tree. Assume that a procedure GetTheRoot(G) returns it. A path P is a sequence of nodes P = (v₁, …, v_k) that are connected by edges, i.e., (v_i, v_{i+1}) ∈ E for i ∈ {1, …, k − 1}. Note that there are no duplicates among v₁, …, v_k. Here, k is the length of the path. We use the notation v ∈ P if there is j such that v_j = v. The notation is reasonable because there are no duplicates in a path. Because G is a tree, the path between any two nodes u and v is unique. The distance dist(v, u) between two nodes v and u is the length of the path between them. The height h(v) of a node v is the distance between it and the root, that is, h(v) = dist(v, GetTheRoot(G)). Let h(G) = max over v ∈ V of h(v) be the tree's height. For a node v, Parent(v) is the parent node, where (Parent(v), v) ∈ E and h(Parent(v)) = h(v) − 1; the set of children is Children(v) = {u : Parent(u) = v}.
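For concreteness, these notions can be computed for an explicitly given tree. The following sketch (our illustration, not part of the paper; the class and field names are ours) derives Parent, Children, and the heights with a breadth-first traversal over an edge list.

```python
from collections import deque

class RootedTree:
    """Stores Parent, Children, and heights for a rooted tree given by an edge list."""
    def __init__(self, n, edges, root=0):
        adj = {v: [] for v in range(n)}
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)
        self.root = root
        self.parent = {root: None}
        self.children = {v: [] for v in range(n)}
        self.h = {root: 0}                  # h(v) = dist(v, root)
        q = deque([root])
        while q:
            v = q.popleft()
            for u in adj[v]:
                if u not in self.h:         # not visited yet
                    self.parent[u] = v
                    self.children[v].append(u)
                    self.h[u] = self.h[v] + 1
                    q.append(u)

    def height(self):                       # h(G) = max over all nodes
        return max(self.h.values())

t = RootedTree(5, [(0, 1), (0, 2), (1, 3), (1, 4)])
```

For the example above, node 0 is the root with children 1 and 2, and the leaves 3 and 4 are at height 2, so h(G) = 2.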
2.2. Quantum Query Model
In Section 6, we suggest quantum algorithms as applications of our data structures. We have only one quantum subroutine, and the rest of the algorithm is classical. One of the most popular computation models for quantum algorithms is the query model. We use the standard form of the quantum query model. Let f : {0,1}^M → {0,1} be an M-variable function. We wish to compute f on an input x ∈ {0,1}^M. We are given oracle access to the input x, i.e., it is implemented by a specific unitary transformation that is usually defined as |i⟩|z⟩|w⟩ → |i⟩|z ⊕ x_i⟩|w⟩, where the |i⟩ register indicates the index of the variable we are querying, |z⟩ is the output register, and |w⟩ is some auxiliary work-space. The z ⊕ x_i operation is implemented by the CNOT gate. An algorithm in the query model consists of alternating applications of arbitrary unitaries independent of the input and the query unitary, and a measurement in the end. The smallest number of queries for an algorithm that outputs f(x) with probability at least 2/3 on all x is called the quantum query complexity of the function f, and the notation Q(f) is used for it. We use the term running time instead of query complexity to remove confusion with the word "query" in the definition of the problems in Section 6. In the paper, we use modifications of Grover's search algorithm [44,45] as quantum subroutines. For these subroutines, the time complexity is larger than the query complexity by an additional log factor [46,47]. Note that, in the general case, we can consider a function f with non-Boolean arguments. It can be simulated by the Boolean-argument case using a binary representation of the arguments.
The modification of Grover’s search algorithm [48] in the case of a known number of solutions can be used in our problems. We refer the readers to [12,29] for more details on quantum computing.
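Grover's search itself is easy to simulate classically on small inputs. The following sketch (our illustration; the function names are ours) iterates the oracle phase flip and the inversion-about-the-mean diffusion on a plain vector of amplitudes, showing that about (π/4)·√N iterations drive the success probability close to 1.

```python
import numpy as np

def grover_success_probability(n_items, marked, iterations):
    """Simulate Grover's search on a vector of real amplitudes."""
    v = np.full(n_items, 1.0 / np.sqrt(n_items))  # uniform superposition
    for _ in range(iterations):
        v[marked] *= -1.0                # oracle: phase flip on the marked item
        v = 2.0 * v.mean() - v           # diffusion: inversion about the mean
    return float(v[marked] ** 2)

N = 1024
k = int(np.floor(np.pi / 4 * np.sqrt(N)))   # ~ (pi/4) * sqrt(N) iterations
p = grover_success_probability(N, marked=7, iterations=k)
```

With zero iterations the success probability is the trivial 1/N; after k ≈ (π/4)√N iterations it is close to 1, which is the quadratic speed-up over the classical Θ(N) scan.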
3. Main Technique: A Walking Tree
In this section, we present a rooted tree that we call a walking tree. It is a utility data structure enabling noisy computation for the main data structure. Here we use it for the following data structures: (i) Binary Search Tree, where we assume that the element-comparing procedure can have errors; (ii) Segment Tree, where we assume that the comparing procedure for indexes (borders of segments) can have errors.
Note that the walking tree is a general technique, and it can be used for other tree data structures. Let us present the general idea of the tree. The technique is motivated by [1]. Assume that G is a rooted tree. We are interested in the operation of moving from the root to a specific (target) node. Assume that we have the following procedures:
- GetTheRoot(G) returns the root node of the tree G.
- SelectChild(v) returns the child of the node that should be reached from the node v. We assume that there is only one child that should be reached from a node.
- IsItTheTarget(v) returns True if the node is the last node that should be visited in the operation; and returns False otherwise.
- ProcessANode(v) processes the node in the required way.
- IsANodeCorrect(v) returns True if the node should be visited during the operation; and returns False if the node is visited because of an error.
Assume that the operation has the following form (Algorithm 1).
| Algorithm 1 An operation on the tree G |
v ← GetTheRoot(G)
ProcessANode(v)
while IsItTheTarget(v) = False do
  v ← SelectChild(v)
  ProcessANode(v)
end while
Let us consider the operation such that the "navigation" procedures (that are SelectChild, IsANodeCorrect, and IsItTheTarget) can return an answer with an error probability p, where p ≤ 0.5 − δ for some constant δ > 0. We assume that the error events are independent. We need to isolate p from 0.5. We cannot be sure that the algorithm works if p approaches 0.5 as the input grows. That is why we use the p ≤ 0.5 − δ statement. Our goal is to conduct the operation with an error probability at most ε. Note that, in the general case, ε can be non-constant and depend on the number of tree nodes. Let h = h(G) be the height of the tree. The standard technique is boosting the success probability. On each step, we repeat the SelectChild procedure O(log(h/ε)) times and choose the most frequent answer. In that case, the error probability of the operation is at most ε, and the running time of the operation is O(T · h · log(h/ε)), where T is the complexity of the ProcessANode procedure. Our goal is to have O(T · (h + log(1/ε))) running time.
Let us construct a rooted tree W from G such that the set of nodes of W has a one-to-one correspondence with the nodes of G, and the same holds for the sets of edges. We call W a walking tree. Let ψ and ψ⁻¹ be the bijections between these two sets of nodes. For simplicity, we define procedures for W similar to the procedures for G. Suppose u = ψ(v); then GetTheRoot(W) = ψ(GetTheRoot(G)); SelectChild(u) = ψ(SelectChild(ψ⁻¹(u))); IsItTheTarget(u) = IsItTheTarget(ψ⁻¹(u)); IsANodeCorrect(u) = IsANodeCorrect(ψ⁻¹(u)). Note that the navigation procedures are noisy (have an error). We reduce the error probability to a small enough constant β < 1/6 by a constant number of repetitions (using the boosting success probability technique). Additionally, we associate a counter c(u) with each node u; it is a non-negative integer number. Initially, the values of the counters are 0 for all nodes, i.e., c(u) = 0 for each node u of W.
We invoke a random walk on the walking tree W. The walk starts from the root node u = GetTheRoot(W). Let us discuss processing a node u. Firstly, we check the counter's value c(u). If c(u) = 0, then we carry out Steps 1.1 to 1.3.
Step 1.1. We check the correctness of the current node using the IsANodeCorrect(u) procedure. If the result is True, then we go to Step 1.2. If the result is False, then we are here because of an error, and we go up by the assignment u ← Parent(u). If the node u is the root, then we stay in u.
Step 1.2. We check whether the current node is the target using the IsItTheTarget(u) procedure. If it is True, then we increase the counter c(u) ← c(u) + 1. If it is False, then we go to Step 1.3.
Step 1.3. We go to the child u ← SelectChild(u).
If c(u) > 0, then we carry out Step 2.1. We can say that the counter c(u) is a measure of confidence that u is the target node. If c(u) = 0, then we should continue walking. If c(u) > 0, then we think that u is the target node. A bigger value of c(u) means we are more confident that it is the target node.
Step 2.1. If IsItTheTarget(u) = True, then we increase the counter c(u) ← c(u) + 1. Otherwise, we decrease the counter c(u) ← c(u) − 1. So, we become more or less confident in the fact that the node u is the target.
The walking process stops after s steps, where s = O(h(G) + log(1/ε)). The stopping node u is declared the target one. After that, we carry out the operation with the original tree G. We store the path (v₁, …, v_k) such that v_k = ψ⁻¹(u), v_{i−1} = Parent(v_i) for i ∈ {2, …, k}, and v₁ is the root node of G. Then, we process these nodes using ProcessANode(v_i) for i from 1 to k. Let a procedure OneStep(u) be one step of the walking process on the walking tree W. It accepts the current node u and returns the new node. The code representation of the procedure is in Algorithm 2.
| Algorithm 2 One step of the walking process, OneStep(u). The input is the node u, and the result is the node for the next step of the walking |
if c(u) = 0 then
  if IsANodeCorrect(u) = False then ▹ Step 1.1
    if u ≠ GetTheRoot(W) then
      u ← Parent(u)
    end if
  else
    if IsItTheTarget(u) = True then ▹ Step 1.2
      c(u) ← c(u) + 1
    else
      u ← SelectChild(u) ▹ Step 1.3
    end if
  end if
else
  if IsItTheTarget(u) = True then ▹ Step 2.1
    c(u) ← c(u) + 1
  else
    c(u) ← c(u) − 1
  end if
end if
return u
The whole algorithm is presented in Algorithm 3.
| Algorithm 3 The walking algorithm for s steps |
u ← GetTheRoot(W)
for i ∈ {1, …, s} do
  u ← OneStep(u)
end for
v ← ψ⁻¹(u), P ← (v), k ← 1
while v ≠ GetTheRoot(G) do
  v ← Parent(v)
  P ← (v) ∘ P ▹ Here ∘ means the concatenation of two sequences. The line adds the node v to the beginning of the path sequence P
  k ← k + 1 ▹ The length of the path sequence
end while
for i ∈ {1, …, k} do
  ProcessANode(v_i)
end for
Let us discuss the algorithm and its properties. On each node, we have two options: we go in the direction of the target node or in the opposite direction.
Assume that c(u) = 0 for the current node u. If we are in a wrong branch, then the only correct direction is to the parent node. If we are in the correct branch, then the only correct direction is to the correct child node. All other directions are wrong. Assume that c(u) > 0. If we are in the target node, then the only correct direction is increasing the counter, and the wrong direction is decreasing the counter. Otherwise, the only correct direction is decreasing the counter.
Choosing the direction is based on the results of at most three invocations of the navigation procedures (SelectChild, IsANodeCorrect, and IsItTheTarget). Remember that we reach error probability β < 1/6 for each invocation using a constant number of repetitions. Due to the independence of the error events, the total error probability of choosing a direction is at most 3β < 0.5. So, the probability of moving in the correct direction is at least 1 − 3β, and for a wrong direction, it is at most 3β. Let us show that if s = O(h(G) + log(1/ε)), then the error probability for an operation on G is at most ε. Note that ε can be non-constant. In Corollaries 1 and 2, we have ε = 1/poly(N).
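The behavior described above can be checked empirically. The following simulation (our illustration with arbitrary demo constants, not the paper's code) walks a complete binary tree of height h toward a fixed target leaf; every navigation answer is independently wrong with probability P, yet after O(h) extra steps the walk almost always stops at the target.

```python
import random

P = 0.1                       # error probability of each (boosted) navigation answer

def noisy(answer):
    """Flip the Boolean answer with probability P."""
    return answer if random.random() > P else not answer

def one_step(u, counter, target):
    """One step of the walk; nodes are bit-tuples, the root is the empty tuple."""
    if counter == 0:
        correct = target[:len(u)] == u
        if not noisy(correct):               # Step 1.1: node looks wrong -> go up
            return u[:-1], 0                 # (the root stays at the root)
        if noisy(u == target):               # Step 1.2: node looks like the target
            return u, 1
        if len(u) == len(target):            # a leaf has no children: stay put
            return u, 0
        bit = target[len(u)]                 # Step 1.3: SelectChild, may also err
        if random.random() < P:
            bit = 1 - bit
        return u + (bit,), 0
    if noisy(u == target):                   # Step 2.1: adjust the confidence counter
        return u, counter + 1
    return u, counter - 1

random.seed(7)
h = 10
steps = 6 * h                 # s = O(h + log(1/eps)); the constant is illustrative
trials = 300
hits = 0
for _ in range(trials):
    target = tuple(random.randint(0, 1) for _ in range(h))
    u, c = (), 0
    for _ in range(steps):
        u, c = one_step(u, c, target)
    hits += (u == target)
```

Even though every single answer is wrong with probability 0.1, the drift toward the target dominates, and the large majority of trials end exactly at the target leaf, in line with the analysis above.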
Theorem 1.
Given error probability ε, Algorithm 3 completes the same action as Algorithm 1 with a running time of O(T · (h(G) + log(1/ε))).
Proof.
Let us consider the walking tree. We emulate the counter by replacing it with a chain of nodes of length s. Formally, for a node u, we add nodes u¹, …, u^s such that Parent(u¹) = u and Parent(u^{i+1}) = u^i for i ∈ {1, …, s − 1}. The only child of u^i is u^{i+1} for i ∈ {1, …, s − 1}, and u^s does not have children.
In that case, increasing the counter from i to i + 1 can be emulated by moving from u^i to u^{i+1}. Decreasing can be emulated by moving from u^{i+1} to u^i. We can assume that u⁰ is the node u itself.
Let u_t be the target node, i.e., IsItTheTarget(u_t) = True. Let us consider the distance L between the target node and the current node in the modified tree. The distance L is a random variable. Each step of the walk increases or decreases the distance L by 1. So, we can present L_s = dist(u_t, u_r) + X₁ + ⋯ + X_s, where u_r is the root node of W, L_s is the distance after s steps, and X₁, …, X_s are independent random variables that represent the i-th step and show the increase or decrease of the distance. Let X_i = −1 if we move in the correct direction, and X_i = +1 if we move in the wrong direction. Note that the probability of moving in the correct direction (X_i = −1) is at least 1 − 3β, and the probability of moving in the wrong direction (X_i = +1) is at most 3β. From now on, without loss of generality, we assume that Pr[X_i = −1] = 1 − 3β and Pr[X_i = +1] = 3β.
If L_s ≤ 0, then we are in a node of the chain u_t¹, …, u_t^s in the modified tree, that is, in the node u_t in the original walking tree W. Note that dist(u_t, u_r) ≤ h(W), where h(W) = h(G) by the definition of the height of a tree. Therefore, X₁ + ⋯ + X_s ≤ −h(W) implies L_s ≤ 0. So, the probability of success of the operation is at least the probability of this event, i.e., Pr[X₁ + ⋯ + X_s ≤ −h(W)].
Let Y_i = (1 − X_i)/2 for i ∈ {1, …, s}. We treat Y₁, …, Y_s as independent binary random variables, and let Y = Y₁ + ⋯ + Y_s. For a sum X of independent binary random variables with expectation μ = E[X] and for any 0 < η < 1, the following form of the Chernoff bound [49] holds
Since Pr[Y_i = 1] = 1 − 3β, we have E[Y] = (1 − 3β)s, and the inequality (1) becomes
Substituting Y for X, we obtain
From now on, without loss of generality, we assume that s = c · (h(G) + log(1/ε)) for some constant c.
In the following steps, we relax the inequality by obtaining less tight bounds for the target probability.
Firstly, we obtain a new lower bound
and hence
Secondly, we obtain a new upper bound
Combining the two obtained bounds, we have
and hence
Considering the probability of the opposite event, we finally obtain
□
In the next section, we show several applications of the technique.
4. Noisy Tree Data Structures
4.1. Noisy Binary Search Tree
Let us consider a Self-Balanced Search Tree [9]. It is a binary rooted tree G. Let |V| = N. We associate a comparable element val(v) with a node v ∈ V. (i) For a node v, we have val(v′) < val(v), where v′ is any node from the left sub-tree of v; and val(v) < val(v″), where v″ is any node from the right sub-tree of v. (ii) The height of the tree is h(G) = O(log N).
As an implementation of a Self-Balanced Search Tree, we use a Red–Black Tree [9,50]. It allows us to add and remove a node with a specific value in O(log N) running time. Assume that the procedure comparing two elements has an error probability p. Each operation (removing and adding an element) has three steps: searching, carrying out the action (removing or adding), and re-balancing. Re-balancing does not invoke comparing operations; that is why it does not have an error. So, the only "noisy" procedure (which can have an error) is searching. Let us discuss it.
Let us associate lv(v) and rv(v) with a node v ∈ V. They are the left and right bounds for val(v) with respect to the ancestor nodes; that is, lv(v) < val(v) < rv(v), where lv(v) is the value of an ancestor of v (or −∞) and rv(v) is the value of an ancestor of v (or +∞). We can compute them as follows. If v is the root, then lv(v) = −∞ and rv(v) = +∞. Here, −∞ and +∞ are constants that are a priori less and greater than any val(v′) for v′ ∈ V. Let v be a non-root node. If v is the left child of Parent(v), then lv(v) = lv(Parent(v)) and rv(v) = val(Parent(v)). If v is the right one, then lv(v) = val(Parent(v)) and rv(v) = rv(Parent(v)). Assume that a comparing function for elements Compare(a, b) returns −1 if a < b; 1 if a > b; and 0 if a = b. The error probability for the function is p ≤ 0.5 − δ for some constant δ > 0. Let us present each of the required procedures for the operation of searching an object x. GetTheRoot(G) returns the root node of G. SelectChild(v) returns the left child of v if Compare(x, val(v)) = −1; and returns the right child if Compare(x, val(v)) = 1. IsItTheTarget(v) returns True if Compare(x, val(v)) = 0. ProcessANode(v) does nothing. IsANodeCorrect(v) returns True if lv(v) < x < rv(v); formally, if Compare(lv(v), x) = −1 and Compare(x, rv(v)) = −1. The presented operations satisfy all requirements. Let us present the complexity result that directly follows from Theorem 1.
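A minimal rendering of these procedures may look as follows (our Python sketch; the names Node, select_child, and so on are ours, and the noisy three-way comparison is simulated by corrupting the answer with probability p).

```python
import random

def compare(a, b, p=0.0):
    """Noisy three-way comparison: -1, 0, or 1; a wrong answer with probability p."""
    true = (a > b) - (a < b)
    if random.random() < p:
        return random.choice([r for r in (-1, 0, 1) if r != true])
    return true

INF = float('inf')

class Node:
    def __init__(self, val, lv=-INF, rv=INF):
        self.val, self.lv, self.rv = val, lv, rv   # lv(v) < val(v) < rv(v)
        self.left = self.right = None

    def attach_left(self, val):                    # left child inherits (lv, val(v))
        self.left = Node(val, self.lv, self.val)
        return self.left

    def attach_right(self, val):                   # right child inherits (val(v), rv)
        self.right = Node(val, self.val, self.rv)
        return self.right

def select_child(v, x, p=0.0):
    return v.left if compare(x, v.val, p) == -1 else v.right

def is_it_the_target(v, x, p=0.0):
    return compare(x, v.val, p) == 0

def is_a_node_correct(v, x, p=0.0):                # checks lv(v) < x < rv(v)
    return compare(v.lv, x, p) == -1 and compare(x, v.rv, p) == -1

root = Node(10)
five = root.attach_left(5)
```

With p = 0 the procedures behave deterministically; with p > 0 they become exactly the kind of noisy navigation procedures that the walking tree is designed to tolerate.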
Theorem 2.
Suppose the comparing function for the elements of a Red–Black Tree has an error probability p ≤ 0.5 − δ for some constant δ > 0. Then, using the walking tree, we can carry out searching, adding, and removing operations with O(log N + log(1/ε)) running time and an error probability ε.
Recall that it is important for the proof to isolate p from 0.5. We cannot be sure that the algorithm works if p approaches 0.5 as the input size grows. If ε = 1/poly(N), then the "noisy" setting does not affect the asymptotic complexity.
Corollary 1.
Suppose the comparing function for the elements of the Red–Black Tree has an error probability p ≤ 0.5 − δ for some constant δ > 0. Then, using the walking tree, we can carry out searching, adding, and removing operations with O(log N) running time and an error probability ε = 1/poly(N).
4.2. Noisy Segment Tree
We consider a standard segment tree data structure [10,11] for an array b = (b₁, …, b_n), where n = 2^k for some integer k. The segment tree is a full binary tree such that each node corresponds to a segment of the array b. If a node v corresponds to a segment (b_l, …, b_r), then we store a value t(v) that represents some information about the segment. Let us consider a function f such that t(v) = f(b_l, …, b_r). A segment of a node is the union of the segments that correspond to its two children. Typically, the children correspond to the segments (b_l, …, b_mid) and (b_{mid+1}, …, b_r) for mid = ⌊(l + r)/2⌋. We consider f such that t(v) can be computed from the values of its two children: t(v) = f(t(v_L), t(v_R)), where v_L and v_R are the left and right children of v. Leaves correspond to single elements of the array b. As an example, we can consider integer values and the sum as the function, i.e., t(v) = b_l + ⋯ + b_r for a node v with the corresponding segment (b_l, …, b_r). The data structure allows us to invoke the following requests in O(log n) running time.
- Update. The parameters are an index i and an element x (1 ≤ i ≤ n). The procedure assigns b_i ← x. For this goal, we assign t(w) ← x for the leaf w that corresponds to the segment (b_i) and update the ancestors of w.
- Request. The parameters are two indexes i and j (1 ≤ i ≤ j ≤ n); the procedure computes f(b_i, …, b_j).
The main part of both operations is the following. Given the root node and an index i, we should find the leaf node corresponding to (b_i). The main step is the following: if we are in a node v with the associated segment (b_l, …, b_r), then we compare i with the middle index mid = ⌊(l + r)/2⌋ and choose the left or the right child. Assume that we have a comparing function for indexes Compare(a, b) that returns −1 if a < b; 1 if a > b; and 0 if a = b. The comparing function returns the answer with an error probability p ≤ 0.5 − δ for some constant δ > 0.
Let us present each of the required procedures for searching the leaf with index i in a segment tree G. GetTheRoot(G) returns the root node of the segment tree G. For mid = ⌊(l + r)/2⌋ and the segment (b_l, …, b_r) associated with a node v, the function SelectChild(v) returns the left child of v if i ≤ mid, i.e., Compare(i, mid) ≠ 1; and returns the right child otherwise. IsItTheTarget(v) returns True if the associated segment is (b_i); formally, Compare(l, i) = 0 and Compare(r, i) = 0; and returns False otherwise. ProcessANode(v) recomputes t(v) according to the values of t in the left and the right children. IsANodeCorrect(v) returns True if l ≤ i ≤ r; formally, Compare(l, i) ≠ 1 and Compare(i, r) ≠ 1; and returns False otherwise. Here, the segment (b_l, …, b_r) is associated with v. The presented operations satisfy all requirements. Let us present the complexity result that directly follows from Theorem 1.
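The interplay of SelectChild, IsItTheTarget, and ProcessANode in the Update operation can be sketched as follows (our noiseless Python illustration with f = sum; the names are ours).

```python
class SegNode:
    """Segment tree node for f = sum over the inclusive segment [lo, hi]."""
    def __init__(self, lo, hi):
        self.lo, self.hi, self.t = lo, hi, 0
        self.left = self.right = None
        if lo < hi:
            mid = (lo + hi) // 2
            self.left = SegNode(lo, mid)
            self.right = SegNode(mid + 1, hi)

def update(v, i, x):
    if v.lo == v.hi:                    # IsItTheTarget: the leaf (i, i)
        v.t = x
        return
    mid = (v.lo + v.hi) // 2
    update(v.left if i <= mid else v.right, i, x)   # SelectChild: compare i with mid
    v.t = v.left.t + v.right.t                      # ProcessANode: recompute f

root = SegNode(0, 7)
for i in range(8):
    update(root, i, i + 1)              # b = (1, 2, ..., 8)
```

In the noisy setting, the `i <= mid` comparison is exactly the step that can err, and the walking tree replaces the plain descent while the ProcessANode recomputation stays unchanged.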
Theorem 3.
Suppose the comparing function for the indexes of a segment tree is noisy and has an error probability p ≤ 0.5 − δ for some constant δ > 0. Then, using the walking tree, we can conduct update and request operations with O(log N + log(1/ε)) running time and an error probability ε.
If we take ε = 1/poly(N), then the "noisy" setting does not affect the asymptotic complexity.
Corollary 2.
Suppose the comparing function for the indexes of a segment tree is noisy and has an error probability p ≤ 0.5 − δ for some constant δ > 0. Then, using the walking tree, we can carry out update and request operations with O(log N) running time and an error probability ε = 1/poly(N).
Theorem 3 and Corollary 2 are the direct application of the main technique and results from Theorem 1 to a specific data structure (segment tree). This is one of two applications that are presented in this paper.
Analysis, Discussion, Modifications for Noisy Segment Tree
There are different additional operations with a segment tree. One such example is the segment tree with range updates. In this modification, we can update the values for a range by a value in one request. The reader can find more information in [11] and examples of applications in [51,52]. The main operation with a noisy comparing procedure is the same. So, we can still use the same idea for such modifications of the segment tree.
Remark 1.
If the segment tree is constructed for an array (b₁, …, b_n), then we can extend it to (b₁, …, b_{n′}), where n′ = 2^⌈log n⌉ is the power of 2 closest to n from above, and b_{n+1}, …, b_{n′} are neutral elements for the function f. If we have a node v and the two borders l and r of the segment associated with v, then we can always compute the segments for the left and the right children, which are (b_l, …, b_mid) and (b_{mid+1}, …, b_r) for mid = ⌊(l + r)/2⌋. Additionally, we can compute the segment for the parent, which is (b_l, …, b_{r′}), where r′ = r + (r − l + 1), if the node is the left child of its parent. If the node is the right child of its parent, then the parent's segment is (b_{l′}, …, b_r), where l′ = l − (r − l + 1). Therefore, we need not store the borders of a segment in a node; we can compute them during the walk on the segment tree. Additionally, we need not construct the walking tree explicitly. We can keep it in mind and walk on the segment tree itself using only three variables: the left and right borders of the current segment and a counter if required.
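The border arithmetic of Remark 1 can be written out directly (our sketch; segments are inclusive index pairs over an array whose length is a power of 2, and whether a node is a left child is recovered from the divisibility of its left border).

```python
def children_segments(lo, hi):
    """Segments of the left and right children of the node (lo, hi)."""
    mid = (lo + hi) // 2
    return (lo, mid), (mid + 1, hi)

def parent_segment(lo, hi):
    """Segment of the parent, recovered from the node's own borders alone."""
    length = hi - lo + 1
    if lo % (2 * length) == 0:          # the node is a left child of its parent
        return lo, hi + length
    return lo - length, hi              # otherwise it is a right child
```

So a walk can keep just the two borders of the current segment (plus the counter), with no stored walking tree, exactly as the remark suggests.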
If we have access to the full segment tree, including leaves, then we can conduct operations without the walking tree. We can use the noisy binary search algorithm [1] for searching the leaf that corresponds to the index i, and then process all the ancestors of the leaf. There are at least two useful scenarios for a noisy segment tree.
1. We have access only to the root and have no direct access to leaves.
2. The second one is the compressed segment tree. If, initially, all elements of the array b are empty or neutral for the function f, then we can compress a subtree into one node labeled with a segment of neutral elements. On each step, we do not store any subtree that has only neutral elements; instead, we store only the root of this subtree and mark it as a subtree with neutral elements. This is reasonable if n is very big and storing the whole tree is too expensive. In that case, we could also replace the noisy segment tree with the noisy self-balanced search tree from the previous section: the search tree stores the updated elements in its leaves, and we can search for the required index in this data structure. At the same time, the noisy segment tree uses much less memory with respect to Remark 1. That is why a noisy segment tree is more effective in this case too.
5. Quantum Sort Algorithm for Strings
As an interesting application domain, we consider quantum computing [12,29]. Let us discuss the string-sorting problem as one of the applications.
Problem: There are n strings of length l for some positive integers n and l. The problem is to find a permutation π = (π₁, …, π_n) such that s_{π_i} < s_{π_{i+1}}, or s_{π_i} = s_{π_{i+1}} and π_i < π_{i+1}, for each i ∈ {1, …, n − 1}. The quantum sorting algorithm for strings was presented in [35,36]. The running time of the algorithm is Õ(n√l). We can present an algorithm with the same complexity but in a simpler way. Assume that we have a noisy self-balanced binary search tree with strings as keys and a quantum string-comparing procedure.
There is a quantum algorithm that compares two strings quadratically faster than any classical counterpart. The algorithm is based on modifications [39,53] of Grover's search algorithm [44,45]. The result is the following.
Lemma 1.
([36]). There is a quantum algorithm that compares two strings s and t of lengths |s| and |t| in the lexicographical order with Õ(√(min(|s|, |t|))) running time and a bounded error probability.
We assume that the comparing procedure compares strings in the lexicographical order and, if they are equal, then it compares indexes. In fact, we store the indexes of the strings in the nodes. We assume that if a node stores a key index i, then any node from its left subtree has a key index j such that s_j < s_i or (s_j = s_i and j < i); and any node from its right subtree has a key index j such that s_i < s_j or (s_i = s_j and i < j). Initially, the tree is empty. The algorithm is two for-loops. The first one adds all strings to the tree one by one, for i ∈ {1, …, n}. The second one extracts the index of the minimal string (according to the comparing procedure) from the tree n times; the i-th extracted index is π_i, and the extracted node is removed from the tree. The code representation of the algorithm is in Appendix A. The second for-loop can be replaced by an in-order traversal (DFS) of the tree that constructs the resulting list. This approach has a smaller hidden constant in the big-O. The full idea is presented in Appendix A for completeness.
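The structure of the two loops can be sketched as follows (our Python stand-in: an unbalanced binary search tree with an exact comparator replaces the noisy Red–Black tree with the quantum comparator, so only the skeleton of the algorithm is shown).

```python
class Node:
    def __init__(self, idx):
        self.idx = idx
        self.left = self.right = None

def less(strings, i, j):
    """Stand-in comparator: lexicographic order with ties broken by index.
    In the quantum algorithm this is the noisy O~(sqrt(l)) comparison."""
    return (strings[i], i) < (strings[j], j)

def insert(root, strings, i):
    if root is None:
        return Node(i)
    if less(strings, i, root.idx):
        root.left = insert(root.left, strings, i)
    else:
        root.right = insert(root.right, strings, i)
    return root

def in_order(v, out):
    """In-order traversal yields the indexes in sorted order."""
    if v is not None:
        in_order(v.left, out)
        out.append(v.idx)
        in_order(v.right, out)

strings = ["ba", "ab", "aa", "ab"]
root = None
for i in range(len(strings)):          # first loop: add all strings
    root = insert(root, strings, i)
perm = []
in_order(root, perm)                   # second loop replaced by the traversal
```

The resulting list `perm` is the permutation π; the paper's version keeps the tree balanced (Red–Black) so that each insertion costs O(log n) noisy comparisons.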
Theorem 4.
The quantum running time of sorting n strings of length l is Õ(n√l) and Ω(n + √l).
The upper bound is the complexity of the presented algorithm and of the algorithm from [36]. The proof of the lower bound is presented below.
For simplicity, we assume that the strings are binary, i.e., s_i ∈ {0,1}^l for i ∈ {1, …, n}. Let us formally define the sorting function.
For positive integers n and l, let SORT_{n,l} be a function that obtains n binary strings of length l as input and returns a permutation of n integers that is the result of sorting the input strings. Here, the range of SORT_{n,l} is the set of all permutations of the integers from 0 to n − 1. For x = (x⁰, …, x^{n−1}), we have SORT_{n,l}(x) = π, where π is a permutation such that x^{π_i} < x^{π_{i+1}} or (x^{π_i} = x^{π_{i+1}} and π_i < π_{i+1}), for each i.
Note that in the case of l = 1, the function SORT_{n,1} can be used to compute the majority function. We use SORT_{n,1} to sort the n one-bit strings, and the middle string in the sorted order is the value of the majority function. Therefore, we expect that the complexity of SORT_{n,l} should be Ω(n) [54]. In the case of n = 2, the function is similar to the OR function, so we expect that it requires Ω(√l) queries [54].
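The l = 1 reduction can be made concrete in a few lines (our illustration): sort the n one-bit strings and read the middle element of the sorted order.

```python
def majority_via_sorting(bits):
    """MAJ_n from sorting: after sorting n one-bit strings, the median is the majority
    (for odd n). The sort plays the role of SORT_{n,1}."""
    perm = sorted(range(len(bits)), key=lambda i: (bits[i], i))
    return bits[perm[len(bits) // 2]]
```

Since any algorithm computing SORT_{n,1} also computes the majority this way, a lower bound for the majority function transfers to the sorting function.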
Formally, we prove the following.
Lemma 2.
For positive integers n and l, let MAJ_n : {0,1}^n → {0,1} be the majority function, and let MIN1_l be the function that returns the minimal index of a one in an l-bit input. Then Q(SORT_{n,l}) = Ω(Q(MAJ_n)) (2) and Q(SORT_{n,l}) = Ω(Q(MIN1_l)).
Proof.
Consider an input z ∈ {0,1}^n. Suppose that x = (x⁰, …, x^{n−1}), where xⁱ is the string consisting of the letter z_i repeated l times.
Then the proof of (2) follows from the fact that the value of MAJ_n(z) is the first letter of the string that has the middle position in the order SORT_{n,l}(x).
Take an input y ∈ {0,1}^l. Let x be a pair of words, one of which is y. Now we see that
and this completes the proof. □
Note that [36] also proves a lower bound for the problem. Combining their result with Lemma 2, we obtain the following corollary.
Corollary 3.
The complexity of SORT_{n,l} is Õ(n√l) and Ω(n + √l).
6. Auto-Complete Problem
In this section, we present the Auto-Complete Problem and a quantum algorithm that is another application of the Noisy Binary Search Tree.
Problem: Assume that we use some constant-size alphabet, for example, binary, ASCII, or Unicode. We work with a sequence of strings S = (s_{q₁}, …, s_{q_m}), where m is the length of the sequence and q₁ < ⋯ < q_m are increasing indexes of the strings. Here, the index q_j is the index of the query that adds this string to S. Initially, the sequence S is empty. Then, we have n queries of two types. The first type is adding a string s to the sequence S. Let cnt(u) be the number of occurrences (or the "frequency") of a string u among s_{q₁}, …, s_{q_m}. The second type is querying the most frequent completion from S of a string t. Let us define it formally. If t is a prefix of s, then we say that s is a completion of t. Let C(t) = {u ∈ S : u is a completion of t} be the set of completions for t, and let cmax(t) = max{cnt(u) : u ∈ C(t)} be the maximal "frequency" of strings from C(t). The problem is to find the index of a string u ∈ C(t) with cnt(u) = cmax(t).
We use a Self-Balanced Search tree for our solution. A node v of the tree stores a 4-tuple (i, c, j, cmax), where i is the index of the string that is "stored" in the node, and c = cnt(s_i). The tree is a search tree over these strings, similar to the storing of strings in Section 5. For comparing strings, we use the quantum procedure from Lemma 1. Therefore, our tree is noisy. The index j is the index of the most "frequent" string in the sub-tree whose root is v, and cmax = cnt(s_j). Formally, for any node v′ of this sub-tree, if (i′, c′, j′, cmax′) is associated with v′, then cmax > c′ or (cmax = c′ and j ≤ i′).
Initially, the tree is empty. Let us discuss processing a query of the first type. We want to add a string s to S. We search for a node v with an associated tuple (i, c, j, cmax) such that s_i = s. If we can find it, then we increase c ← c + 1. This means that the j parameter of the node v or of its ancestors may have to be updated. There are at most O(log n) ancestors because the height of the tree is O(log n). So, for each ancestor v′ of v associated with (i′, c′, j′, cmax′), if c > cmax′ or (c = cmax′ and i < j′), then we update j′ ← i and cmax′ ← c.
If we cannot find the string s in the tree, then we add a new node to the tree with the associated 4-tuple (r, 1, r, 1), where r is the index of the query. Note that when we re-balance nodes in the Red–Black tree, we can easily recompute the j and cmax elements of the affected nodes.
Assume that we have a Search(s) procedure that returns a node v with associated (i, f, j, f′) where s_i = s. If there is no such a node, then the procedure returns NULL. A procedure AddAString(r) adds a node with the 4-tuple (r, 1, r, 1) to the tree. A procedure GetTheRoot returns the root of the search tree. The processing of the first type of query is presented in Algorithm 4.
| Algorithm 4 Processing a query of the first type with an argument s and a query number r |
v ← Search(s)
if v ≠ NULL then
  (i, f, j, f′) is associated with v
  f ← f + 1
  if f > f′ or (f = f′ and i < j) then
    j ← i; f′ ← f
  end if
  u ← v
  while u ≠ GetTheRoot() do
    u ← Parent(u)
    (i_u, f_u, j_u, f′_u) is associated with u
    if f > f′_u or (f = f′_u and i < j_u) then
      j_u ← i; f′_u ← f
    end if
  end while
else
  AddAString(r)
end if
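As a sanity check of the update rule above, here is a minimal Python model of the first-type query. This is our own sketch: a plain, unbalanced BST with exact string comparison stands in for the noisy Red–Black tree, and the names `SearchTree`, `Node`, and `better` are ours.

```python
class Node:
    def __init__(self, s, idx, parent=None):
        self.s = s                    # the stored string
        self.i, self.f = idx, 1       # own index and frequency
        self.j, self.fj = idx, 1      # most frequent (index, frequency) in the subtree
        self.parent = parent
        self.left = self.right = None

def better(f_new, i_new, f_old, i_old):
    # a candidate wins if it is more frequent, or equally
    # frequent but was added earlier (smaller index)
    return f_new > f_old or (f_new == f_old and i_new < i_old)

class SearchTree:
    """Plain (unbalanced) BST standing in for the noisy Red-Black tree."""
    def __init__(self):
        self.root = None

    def add(self, s, r):
        """First-type query: add string s arriving as query number r."""
        if self.root is None:
            self.root = Node(s, r)
            return
        v = self.root
        while True:
            if s == v.s:                          # string already stored
                v.f += 1
                if better(v.f, v.i, v.fj, v.j):
                    v.j, v.fj = v.i, v.f
                u = v.parent
                while u is not None:              # walk up O(height) ancestors
                    if better(v.f, v.i, u.fj, u.j):
                        u.j, u.fj = v.i, v.f
                    u = u.parent
                return
            nxt = v.left if s < v.s else v.right
            if nxt is None:                       # new string: node (r, 1, r, 1)
                child = Node(s, r, parent=v)
                if s < v.s:
                    v.left = child
                else:
                    v.right = child
                u = v
                while u is not None:              # propagate the new (index, 1)
                    if better(1, r, u.fj, u.j):
                        u.j, u.fj = r, 1
                    u = u.parent
                return
            v = nxt
```

For example, after adding "b", "a", "b", "c", "a", "a" with query numbers 1–6, the root stores j = 2 and f′ = 3, because "a" (first added by query 2) occurs three times.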
Let us discuss processing the second type of query. All strings that can be a completion for t belong to the set {u : t ≤ u and u < t′}. Here, we can obtain t′ from the string t by replacing the last symbol with the next symbol of the alphabet. Formally, if t = t_1 … t_k, then t′ = t_1 … t_{k−1} σ, where the symbol σ succeeds t_k in the alphabet. We can say that C(t) = {u ∈ S : t ≤ u < t′}. The query processing consists of three steps.
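The construction of t′ can be sketched as follows. The helper `next_string` is our own name, and the sketch assumes the last symbol of t is not the maximal letter of the alphabet.

```python
def next_string(t, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Return t' = t[:-1] + successor(t[-1]), so that the completions
    of t are exactly the strings u with t <= u < t'."""
    last = alphabet.index(t[-1])
    assert last + 1 < len(alphabet), "last symbol must have a successor"
    return t[:-1] + alphabet[last + 1]
```

For instance, next_string("cab") is "cac", and every completion of "cab", such as "cabxyz", satisfies "cab" ≤ "cabxyz" < "cac" in lexicographic order.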
Step 1. We search for a node v such that t should be in the left sub-tree of v and t′ should be in the right sub-tree of v. Formally, for the string s_i stored in v, t ≤ s_i and s_i < t′. For implementing this idea, the procedure IsItTheTarget(v) checks the following condition:
t ≤ s_i < t′.
The procedure SelectChild(v) returns the right child if s_i < t, i.e., Compare(s_i, t) = −1, and returns the left child otherwise.
If we come to the NULL node, then there are no completions of t. If we find v, then we carry out the next two steps. In those two steps, we compute the index j_ans of the required string and f_ans = f(s_{j_ans}).
Step 2. Let us look at the left sub-tree with t. Let us find a node v_min that contains the index of the minimal string s_i ≥ t. For implementing this idea, the procedure IsItTheTarget(v) checks whether the current string is t. Formally, it checks Compare(s_i, t) = 0. Additionally, the procedure saves the node into v_min if s_i ≥ t, i.e., Compare(s_i, t) ≥ 0. The procedure SelectChild(v) works as for searching t. It returns the right child if s_i < t, i.e., Compare(s_i, t) = −1, and returns the left child otherwise. In the end, the target node is stored in v_min.
Then, we go up from this node. Let us consider a node v. If it is the right child of Parent(v), then its string is bigger than the string from Parent(v) and all strings from the left child’s sub-tree, so we do nothing. If it is the left child of Parent(v), then its string is less than the string from Parent(v) and all strings from the right child’s sub-tree, so we update j_ans and f_ans by values from the parent node and the parent’s right child. Formally, if (i_p, f_p, j_p, f′_p) is associated with the Parent(v) node and (i_r, f_r, j_r, f′_r) with its right child node, then we complete the following actions. If f_p > f_ans or (f_p = f_ans and i_p < j_ans), then j_ans ← i_p and f_ans ← f_p. If f′_r > f_ans or (f′_r = f_ans and j_r < j_ans), then j_ans ← j_r and f_ans ← f′_r. This idea is presented in Algorithm 5.
| Algorithm 5 Obtaining the answer of Step 2 by going up from v_min |
j_ans ← i; f_ans ← f, where (i, f, j, f′) is associated with v_min
v ← v_min
while v ≠ v_split do                       ▹ v_split is the node found in Step 1
  u ← Parent(v)
  if v = LeftChild(u) then
    (i_u, f_u, j_u, f′_u) is associated with u
    if f_u > f_ans or (f_u = f_ans and i_u < j_ans) then
      j_ans ← i_u; f_ans ← f_u
    end if
    if RightChild(u) ≠ NULL and u ≠ v_split then
      (i_r, f_r, j_r, f′_r) is associated with RightChild(u)
      if f′_r > f_ans or (f′_r = f_ans and j_r < j_ans) then
        j_ans ← j_r; f_ans ← f′_r
      end if
    end if
  end if
  v ← u
end while
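Our reading of the Step 2 up-walk can be modeled in Python as follows. This is a sketch under the assumption that the walk stops at the split node found in Step 1 and that the split node’s right sub-tree is left to Step 3; the names `Node`, `attach`, and `walk_up_left` are ours.

```python
class Node:
    def __init__(self, i, f):
        self.i, self.f = i, f      # index and frequency of the own string
        self.j, self.fj = i, f     # best (index, frequency) in the subtree
        self.left = self.right = self.parent = None

def attach(parent, child, side):
    setattr(parent, side, child)   # side is "left" or "right"
    child.parent = parent
    return child

def better(f_new, i_new, f_old, i_old):
    return f_new > f_old or (f_new == f_old and i_new < i_old)

def walk_up_left(v, split):
    """Go up from the node v storing the minimal completion to the node
    split found in Step 1.  Whenever v is a left child, the parent's own
    string and the parent's whole right subtree are completions, so we
    merge them into the running answer.  The right subtree of split is
    excluded here: it is processed by the symmetric Step 3."""
    j_ans, f_ans = v.i, v.f
    while v is not split:
        p = v.parent
        if p.left is v:
            if better(p.f, p.i, f_ans, j_ans):          # parent's own string
                j_ans, f_ans = p.i, p.f
            if p is not split and p.right is not None:  # whole right subtree
                if better(p.right.fj, p.right.j, f_ans, j_ans):
                    j_ans, f_ans = p.right.j, p.right.fj
        v = p
    return j_ans, f_ans
```

On a small hand-built tree whose most frequent completion sits in a right sub-tree along the path, the walk returns that sub-tree’s stored (j, f′) pair.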
Step 3. Let us look at the right sub-tree with t′. Let us find a node v_max that contains the index of the maximal string s_i < t′. Then, we go up from this node and carry out the actions symmetric to Step 2.
Each of these three steps requires O(√l · log n) running time for a query string of length l because each of them observes the nodes of a single branch. The complexity of the quantum algorithm is presented in Theorem 5. Before presenting the theorem, let us present a lemma that helps us to prove quantum and classical lower bounds.
Lemma 3.
The Auto-Complete Problem is at least as hard as the unstructured search for a single 1 among Θ(L) bits, where L is the sum of the lengths of all queries.
Proof.
Assume that the alphabet is binary. For any other case, we just consider two letters from the alphabet. Assume that all strings from queries of the first type have length k.
Let m = ⌊(n − 1)/2⌋. Let us consider the following case. We have m queries of the first type of the next form: the j-th query adds a string x^j = x^j_1 … x^j_k. Here, x^j_i = 0 for all j ∈ {1, …, m} and i ∈ {1, …, k − 1}, except possibly one bit. We have two cases.
The first case is the following. There is only one pair (j, i) such that x^j_i = 1, where j ∈ {1, …, m} and i ∈ {1, …, k − 1}. In the second case, there is no such a pair.
The next m queries of the first type have the next form: each of them adds the string 0^{k−1}1, that is, k − 1 zeros followed by a single 1.
If n is even, then we add one more query of the first type that adds the string 1^k; it is not a completion of the string 0, so it does not affect the answer. This makes the total number of queries equal to n.
The last query is of the second type and has the form t = 0.
If we have the first case, then f(0^{k−1}1) = m > m − 1 = f(0^k), and the answer is m + 1, the index of the first query that adds 0^{k−1}1. If we have the second case, then f(0^k) = f(0^{k−1}1) = m, and 0^k has a smaller index. Therefore, the answer is 1.
Hence, answering the query for this input is at least as hard as distinguishing between these two cases. At the same time, it requires searching for a 1 among m(k − 1) = Θ(L) bits. □
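The two input families from the proof can be generated and checked against a brute-force solver as follows. This is a sketch: `hard_instance` and `autocomplete_answer` are our names, and the planted 1-bit is restricted to the first k − 1 positions as in the proof.

```python
def hard_instance(m, k, one_pos=None):
    """Build the first-type queries from the proof of Lemma 3.
    one_pos = (j, i) with i < k - 1 plants a single 1-bit in string j
    at position i (first case); one_pos = None is the all-zeros
    second case.  Indexes j and i are 0-based here."""
    queries = []
    for j in range(m):                     # first batch: almost all zeros
        s = ["0"] * k
        if one_pos is not None and one_pos[0] == j:
            s[one_pos[1]] = "1"
        queries.append("".join(s))
    queries += ["0" * (k - 1) + "1"] * m   # second batch: m copies of 0^{k-1}1
    return queries

def autocomplete_answer(queries, t):
    """Brute-force answer of a second-type query: the 1-based index of
    the earliest-added most frequent completion of t."""
    freq, first = {}, {}
    for idx, s in enumerate(queries, start=1):
        freq[s] = freq.get(s, 0) + 1
        first.setdefault(s, idx)
    comp = [s for s in freq if s.startswith(t)]
    fmax = max(freq[s] for s in comp)
    return min(first[s] for s in comp if freq[s] == fmax)
```

With m = 3 and k = 4, the all-zeros case answers 1, while planting a single 1 makes the answer m + 1 = 4, so the answer reveals which case holds.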
Based on the presented lemma and the discussed algorithm, we can present quantum lower and upper bounds for the problem in the next theorem.
Theorem 5.
The quantum algorithm with a noisy Self-Balanced Search tree for the Auto-Complete Problem has O(√(nL) · log n) running time and error probability O(1/n), where L is the sum of the lengths of all queries. Additionally, the lower bound for the quantum running time is Ω(√L).
Proof.
Let us start with the upper bound. Let us consider the processing of the first type of query. Here we add a string s of length l. Firstly, we find the target node in O(√l · log n) running time and with error probability at most 1/n² according to Corollary 1. Then, we consider at most O(log n) ancestors for updating, with O(log n) running time and no error. So, processing the first type of query works in O(√l · log n) running time and with at most 1/n² error probability.
Let us consider the processing of the second type of query. Here we search for a completion for a string t of length l. Searching for the nodes v, v_min, and v_max works in O(√l · log n) running time and with error probability at most 1/n² according to Corollary 1. Then, we consider at most O(log n) ancestors for updating, with O(log n) running time and no error. So, processing the second type of query works in O(√l · log n) running time and with at most 1/n² error probability.
Let l_1, …, l_n be the lengths of the strings from the queries. So, the total complexity is
O(√l_1 · log n + … + √l_n · log n) = O(log n · (√l_1 + … + √l_n)) = O(log n · √(n · (l_1 + … + l_n))) = O(√(nL) · log n).
The last two equalities are due to the Cauchy–Bunyakovsky–Schwarz inequality and l_1 + … + l_n = L.
The error probability of processing a single query is at most 1/n². Therefore, the error probability of processing n queries is at most n · (1/n²) = 1/n because all error events are independent.
Let us discuss the lower bound. It is known [55] that the quantum running time for the unstructured search among M variables is Ω(√M). So, due to Lemma 3, we obtain the required Ω(√L) lower bound. □
Let us consider the classical (deterministic or randomized) case. If we use the same Self-Balanced Search tree, then the running time is O(L · log n) because each of the O(log n) comparisons for a query costs up to the length of the query string. At the same time, if we use the Trie (prefix tree) data structure [40], then the complexity is O(L).
We store all strings of S in the trie. For each terminal node, we store the “frequency” of the corresponding string. For each node v (even a non-terminal one) that corresponds to a string u as a path from the root to v, additionally to the regular information, we store the index of the most frequent completion of u and its frequency. When we process the first type of query, we update the frequency in the terminal node and update the additional information in all ancestor nodes because they correspond to all possible prefixes of the string. For processing the query of the second type, we just find the node corresponding to t and take the answer from the additional information of the node.
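A minimal sketch of this trie-based classical solution (the class and method names are ours):

```python
class TrieNode:
    __slots__ = ("children", "freq", "best_idx", "best_freq")
    def __init__(self):
        self.children = {}
        self.freq = 0          # frequency of the string ending here
        self.best_idx = None   # index of the most frequent completion
        self.best_freq = 0     # and its frequency

class AutoCompleteTrie:
    def __init__(self):
        self.root = TrieNode()
        self.first_index = {}  # string -> query index of its first addition

    def add(self, s, r):
        """First-type query in O(|s|): update the frequency at the terminal
        node, then refresh (best_idx, best_freq) on the whole root path.
        Only s's frequency changed, so only s can displace the stored best."""
        idx = self.first_index.setdefault(s, r)
        node, path = self.root, [self.root]
        for c in s:
            node = node.children.setdefault(c, TrieNode())
            path.append(node)
        node.freq += 1
        for u in path:         # every node on the path stores a prefix of s
            if (node.freq > u.best_freq
                    or (node.freq == u.best_freq and idx < u.best_idx)):
                u.best_idx, u.best_freq = idx, node.freq

    def query(self, t):
        """Second-type query in O(|t|): walk down to t's node and read off
        the precomputed answer; None means t has no completion."""
        node = self.root
        for c in t:
            if c not in node.children:
                return None
            node = node.children[c]
        return node.best_idx
```

For example, after adding "car", "cat", "cat", "dog", "car", "car" with query numbers 1–6, query("ca") returns 1 ("car" occurs three times, first added by query 1) and query("cat") returns 2.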
We can show that it is also the lower bound for the classical case.
Lemma 4.
The classical running time for the Auto-Complete Problem is Θ(L), where L is the sum of the lengths of all queries.
Proof.
Let us start with the upper bound. Similarly to the proof of Theorem 5, we can show that the trie-based algorithm processes a query with a string of length l in O(l) running time for both types of queries. Let l_1, …, l_n be the lengths of the strings from the queries. So, the total complexity is O(l_1 + … + l_n) = O(L).
Let us discuss the lower bound. It is known [55] that the classical running time for the unstructured search among M variables is Ω(M). So, due to Lemma 3, we obtain the required Ω(L) lower bound. □
If the strings have length ω((log n)²), then we obtain a quantum speed-up.
7. Conclusions
We suggest the Walking tree technique for noisy tree data structures. We apply the technique to the Red–Black tree and the Segment tree. We show that the complexity of the main operations is asymptotically equal to the complexity of the standard (not noisy) versions of these data structures. We use a noisy Red–Black tree for two problems: the String-Sorting Problem and the Auto-Complete Problem. The considered algorithms are quantum because they use the quantum string-comparing algorithm as a subroutine. This subroutine demonstrates a quadratic speed-up, but it has a non-zero error probability. For the String-Sorting Problem, we show lower and upper bounds that are the same up to a log factor. Note that the presented lower bounds may be of independent interest. For the Auto-Complete Problem, we obtain a quantum speed-up if l = ω((log n)²), where n is the number of queries and l is the length of an input string of a query. Future work can include applying the Walking tree technique to other tree data structures and obtaining noisy versions of them with good complexity of the main operations. Also, it is interesting to find more applications for noisy data structures. We assume that quantum algorithms should be one of the fruitful fields for such applications. It is also interesting to close the gaps between the quantum lower and upper bounds for the considered problems.
Author Contributions
The main idea of the technique, K.K.; technical proofs for the main technique, N.S.; lower bounds, M.Z.; applications, K.K. and D.M.; constructions and concepts, K.K. and M.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This paper has been supported by the Kazan Federal University Strategic Academic Leadership Program (PRIORITY-2030).
Data Availability Statement
There is no data for the research.
Acknowledgments
We thank Aliya Khadieva for useful discussions.
Conflicts of Interest
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.
Appendix A. Quantum Sorting Algorithm—The Second Approach
Assume that we want to construct a result list res = (res_1, …, res_n). We use the variable res for the result.
We use the recursive procedure GetListByTree(v) for an in-order traversal (a depth-first search) of the search tree. Here, v is the processed node. Assume that we have GetTheLeftChild(v) for obtaining the left child of v; GetTheRightChild(v) for obtaining the right child of v; and GetIndexOfString(v) for obtaining the index of the string that is stored in v. The procedure is presented in Algorithm A1.
| Algorithm A1 The recursive procedure GetListByTree(v) for in-order traversal (dfs) of the search tree |
if v ≠ NULL then
  GetListByTree(GetTheLeftChild(v))
  append GetIndexOfString(v) to res
  GetListByTree(GetTheRightChild(v))
end if
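A runnable Python model of Algorithms A1 and A2 (our sketch: an ordinary BST with exact comparisons replaces the noisy quantum search tree, and indexes are 0-based here):

```python
class Node:
    def __init__(self, key, index):
        self.key, self.index = key, index   # a string and its input index
        self.left = self.right = None

def bst_insert(root, key, index):
    """Insert a string into the (unbalanced) search tree."""
    if root is None:
        return Node(key, index)
    if key < root.key:
        root.left = bst_insert(root.left, key, index)
    else:
        root.right = bst_insert(root.right, key, index)
    return root

def get_list_by_tree(v, res):
    """Algorithm A1: in-order traversal appending the indexes of the
    stored strings to res in sorted order of the strings."""
    if v is not None:
        get_list_by_tree(v.left, res)
        res.append(v.index)
        get_list_by_tree(v.right, res)

def sort_strings(strings):
    """Algorithm A2: insert all strings, then read them off in order."""
    root = None
    for i, s in enumerate(strings):
        root = bst_insert(root, s, i)
    res = []                                # initially, the list is empty
    get_list_by_tree(root, res)
    return res
```

For example, sort_strings(["banana", "apple", "cherry"]) yields the index permutation [1, 0, 2], i.e., the strings in lexicographic order.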
The total sorting algorithm is Algorithm A2.
| Algorithm A2 Quantum string-sorting algorithm |
for i ∈ {1, …, n} do
  Add(i)                     ▹ add the i-th string to the search tree
end for
res ← ()                     ▹ Initially, the list res is empty
GetListByTree(GetTheRoot())
for i ∈ {1, …, n} do
  output res_i               ▹ the indexes of the strings in sorted order
end for
References
- Feige, U.; Raghavan, P.; Peleg, D.; Upfal, E. Computing with noisy information. SIAM J. Comput. 1994, 23, 1001–1018. [Google Scholar] [CrossRef]
- Pelc, A. Searching with known error probability. Theor. Comput. Sci. 1989, 63, 185–202. [Google Scholar] [CrossRef]
- Karp, R.M.; Kleinberg, R. Noisy binary search and its applications. In Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, Philadelphia, PA, USA, 7–9 January 2007; pp. 881–890. [Google Scholar]
- Emamjomeh-Zadeh, E.; Kempe, D.; Singhal, V. Deterministic and probabilistic binary search in graphs. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, Cambridge, MA, USA, 19–21 June 2016; pp. 519–532. [Google Scholar]
- Dereniowski, D.; Łukasiewicz, A.; Uznański, P. An efficient noisy binary search in graphs via median approximation. In Proceedings of the 32nd International Workshop on Combinatorial Algorithms, Ottawa, ON, Canada, 5–7 July 2021; pp. 265–281. [Google Scholar]
- Deligkas, A.; Mertzios, G.B.; Spirakis, P.G. Binary search in graphs revisited. Algorithmica 2019, 81, 1757–1780. [Google Scholar] [CrossRef]
- Boczkowski, L.; Korman, A.; Rodeh, Y. Searching on trees with noisy memory. arXiv 2016, arXiv:1611.01403. [Google Scholar]
- Dereniowski, D.; Kosowski, A.; Uznanski, P.; Zou, M. Approximation Strategies for Generalized Binary Search in Weighted Trees. In Proceedings of the 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017), Warsaw, Poland, 10–14 July 2017. [Google Scholar]
- Cormen, T.H.; Leiserson, C.E.; Rivest, R.L.; Stein, C. Introduction to Algorithms; McGraw-Hill: New York, NY, USA, 2001. [Google Scholar]
- Mark, D.B.; Otfried, C.; Marc, V.K.; Mark, O. Computational Geometry Algorithms and Applications; Springer: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
- Laaksonen, A. Guide to Competitive Programming; Springer: Berlin/Heidelberg, Germany, 2017. [Google Scholar]
- Nielsen, M.A.; Chuang, I.L. Quantum Computation and Quantum Information; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
- de Wolf, R. Quantum Computing and Communication Complexity; University of Amsterdam: Amsterdam, The Netherlands, 2001. [Google Scholar]
- Jordan, S. Quantum Algorithms Zoo. 2021. Available online: http://quantumalgorithmzoo.org/ (accessed on 20 September 2023).
- Dürr, C.; Heiligman, M.; Høyer, P.; Mhalla, M. Quantum query complexity of some graph problems. SIAM J. Comput. 2006, 35, 1310–1328. [Google Scholar] [CrossRef]
- Khadiev, K.; Safina, L. Quantum Algorithm for Dynamic Programming Approach for DAGs. Applications for Zhegalkin Polynomial Evaluation and Some Problems on DAGs. In Proceedings of the International Conference on Unconventional Computation and Natural Computation, Tokyo, Japan, 3–7 June 2019; Volume 4362, pp. 150–163. [Google Scholar]
- Khadiev, K.; Kravchenko, D.; Serov, D. On the Quantum and Classical Complexity of Solving Subtraction Games. In Proceedings of the 14th International Computer Science Symposium in Russia, Novosibirsk, Russia, 1–5 July 2019; Volume 11532, pp. 228–236. [Google Scholar]
- Khadiev, K.; Safina, L. Quantum Algorithm for Dynamic Programming Approach for DAGs and Applications. Lobachevskii J. Math. 2023, 44, 699–712. [Google Scholar] [CrossRef]
- Lin, C.Y.Y.; Lin, H.H. Upper Bounds on Quantum Query Complexity Inspired by the Elitzur-Vaidman Bomb Tester. In Proceedings of the 30th Conference on Computational Complexity (CCC 2015), Portland, OR, USA, 17–19 June 2015. [Google Scholar]
- Lin, C.Y.Y.; Lin, H.H. Upper Bounds on Quantum Query Complexity Inspired by the Elitzur–Vaidman Bomb Tester. Theory Comput. 2016, 12, 537–566. [Google Scholar] [CrossRef]
- Beigi, S.; Taghavi, L. Quantum speedup based on classical decision trees. Quantum 2020, 4, 241. [Google Scholar] [CrossRef]
- Ramesh, H.; Vinay, V. String matching in O(n+m) quantum time. J. Discret. Algorithms 2003, 1, 103–110. [Google Scholar] [CrossRef]
- Montanaro, A. Quantum pattern matching fast on average. Algorithmica 2017, 77, 16–39. [Google Scholar] [CrossRef]
- Le Gall, F.; Seddighin, S. Quantum Meets Fine-Grained Complexity: Sublinear Time Quantum Algorithms for String Problems. In Proceedings of the 13th Innovations in Theoretical Computer Science Conference (ITCS 2022), Berkeley, CA, USA, 31 January–3 February 2022. [Google Scholar]
- Le Gall, F.; Seddighin, S. Quantum meets fine-grained complexity: Sublinear time quantum algorithms for string problems. Algorithmica 2023, 85, 1251–1286. [Google Scholar] [CrossRef]
- Akmal, S.; Jin, C. Near-optimal quantum algorithms for string problems. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), Alexandria, VA, USA, 9–12 January 2022; pp. 2791–2832. [Google Scholar]
- Charalampopoulos, P.; Pissis, S.P.; Radoszewski, J. Longest Palindromic Substring in Sublinear Time. In Proceedings of the 33rd Annual Symposium on Combinatorial Pattern Matching (CPM), Prague, Czech Republic, 27–29 June 2022. [Google Scholar]
- Ablayev, F.; Ablayev, M.; Salikhova, N. Hybrid classical-quantum text search based on hashing. arXiv 2023, arXiv:2311.01213. [Google Scholar]
- Ambainis, A. Understanding Quantum Algorithms via Query Complexity. Proc. Int. Conf. Math. 2018, 4, 3283–3304. [Google Scholar]
- Høyer, P.; Neerbek, J.; Shi, Y. Quantum complexities of ordered searching, sorting, and element distinctness. In Proceedings of the International Colloquium on Automata, Languages, and Programming, Crete, Greece, 8–12 July 2001; pp. 346–357. [Google Scholar]
- Høyer, P.; Neerbek, J.; Shi, Y. Quantum complexities of ordered searching, sorting, and element distinctness. Algorithmica 2002, 34, 429–448. [Google Scholar]
- Odeh, A.; Elleithy, K.; Almasri, M.; Alajlan, A. Sorting N elements using quantum entanglement sets. In Proceedings of the Third International Conference on Innovative Computing Technology, London, UK, 29–31 August 2013; pp. 213–216. [Google Scholar]
- Odeh, A.; Abdelfattah, E. Quantum sort algorithm based on entanglement qubits {00, 11}. In Proceedings of the 2016 IEEE Long Island Systems, Applications and Technology Conference (LISAT), Farmingdale, NY, USA, 29 April 2016; pp. 1–5. [Google Scholar]
- Klauck, H. Quantum time-space tradeoffs for sorting. In Proceedings of the Thirty-Fifth Annual ACM Symposium on Theory of Computing, San Diego, CA, USA, 9–11 June 2003; pp. 69–76. [Google Scholar]
- Khadiev, K.; Ilikaev, A. Quantum Algorithms for the Most Frequently String Search, Intersection of Two String Sequences and Sorting of Strings Problems. In Proceedings of the International Conference on Theory and Practice of Natural Computing, Kingston, ON, Canada, 9–11 December 2018; pp. 234–245. [Google Scholar]
- Khadiev, K.; Ilikaev, A.; Vihrovs, J. Quantum Algorithms for Some Strings Problems Based on Quantum String Comparator. Mathematics 2022, 10, 377. [Google Scholar] [CrossRef]
- Babu, H.M.H.; Jamal, L.; Dibbo, S.V.; Biswas, A.K. Area and delay efficient design of a quantum bit string comparator. In Proceedings of the 2017 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Bochum, Germany, 3–5 July 2017; pp. 51–56. [Google Scholar]
- Aborot, J.A. An Oracle Design for Grover’s Quantum Search Algorithm for Solving the Exact String Matching Problem. Theory and Practice of Computation. In Proceedings of the Workshop on Computation: Theory and Practice WCTP2017, Osaka, Japan, 12–13 September 2019; pp. 36–48. [Google Scholar]
- Kapralov, R.; Khadiev, K.; Mokut, J.; Shen, Y.; Yagafarov, M. Fast Classical and Quantum Algorithms for Online k-server Problem on Trees. CEUR Workshop Proc. 2022, 3072, 287–301. [Google Scholar]
- Knuth, D. The Art of Computer Programming; Sorting and Searching; Pearson Education: London, UK, 1973; Volume 3. [Google Scholar]
- De La Briandais, R. File searching using variable length keys. In Proceedings of the Western Joint Computer Conference, San Francisco, CA, USA, 3–5 March 1959; pp. 295–298. [Google Scholar]
- Black, P.E. Dictionary of Algorithms and Data Structures; Technical Report; National Institute of Standards and Technology: Gaithersburg, MD, USA, 1998. [Google Scholar]
- Brass, P. Advanced Data Structures; Cambridge University Press: Cambridge, UK, 2008; Volume 193. [Google Scholar]
- Grover, L.K. A fast quantum mechanical algorithm for database search. In Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, Philadelphia, PA, USA, 22–24 May 1996; pp. 212–219. [Google Scholar]
- Boyer, M.; Brassard, G.; Høyer, P.; Tapp, A. Tight bounds on quantum searching. Fortschritte Phys. 1998, 46, 493–505. [Google Scholar] [CrossRef]
- Arunachalam, S.; de Wolf, R. Optimizing the Number of Gates in Quantum Search. Quantum Inf. Comput. 2017, 17, 251–261. [Google Scholar] [CrossRef]
- Grover, L.K. Trade-offs in the quantum search algorithm. Phys. Rev. A 2002, 66, 052314. [Google Scholar] [CrossRef]
- Long, G.L. Grover algorithm with zero theoretical failure rate. Phys. Rev. A 2001, 64, 022307. [Google Scholar] [CrossRef]
- Motwani, R.; Raghavan, P. Randomized Algorithms; Chapman & Hall/CRC: Boca Raton, FL, USA, 2010. [Google Scholar]
- Guibas, L.J.; Sedgewick, R. A dichromatic framework for balanced trees. In Proceedings of the 19th Annual Symposium on Foundations of Computer Science, Washington, DC, USA, 16–18 October 1978; pp. 8–21. [Google Scholar]
- Khadiev, K.; Remidovskii, V. Classical and quantum algorithms for constructing text from dictionary problem. Nat. Comput. 2021, 20, 713–724. [Google Scholar] [CrossRef]
- Khadiev, K.; Remidovskii, V. Classical and Quantum Algorithms for Assembling a Text from a Dictionary. Nonlinear Phenom. Complex Syst. 2021, 24, 207–221. [Google Scholar] [CrossRef]
- Kothari, R. An optimal quantum algorithm for the oracle identification problem. In Proceedings of the 31st International Symposium on Theoretical Aspects of Computer Science, Lyon, France, 5–8 March 2014; p. 482. [Google Scholar]
- Beals, R.; Buhrman, H.; Cleve, R.; Mosca, M.; de Wolf, R. Quantum Lower Bounds by Polynomials. In Proceedings of the 39th Annual Symposium on Foundations of Computer Science, Palo Alto, CA, USA, 8–11 November 1998; pp. 352–361. [Google Scholar]
- Bennett, C.H.; Bernstein, E.; Brassard, G.; Vazirani, U. Strengths and weaknesses of quantum computing. SIAM J. Comput. 1997, 26, 1510–1523. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).