# The cell probe complexity of succinct data structures

The Cell Probe Complexity of Succinct Data Structures
Anna G?l1 and Peter Bro Miltersen2 a
1

Dept. of Computer Science, University of Texas at Austin. panni@cs.utexas.edu 2 Dept. of Computer Science, University of Aarhus. bromille@brics.dk Abstract. We show lower bounds in the cell probe model for the redundancy/query time tradeo? of solutions to static data structure problems.

1

Introduction

In the cell probe model (e.g., [1, 3, 4, 6, 7, 9, 18–21]), a boolean static data structure problem is given by a map f : {0, 1}n × {0, 1}m → {0, 1}, where {0, 1}n is a set of possible data to be stored, {0, 1}m is a set of possible queries and f (x, y) is the answer to question y about data x. For natural problems, we have m n: the question we pose to the database is much shorter than the database itself. Examples of natural data structuring problems include: Substring Search: Given a string x in {0, 1}n we want to store it in a data structure so that given a query string y of length m, we can tell whether y is a substring of x by inspecting the data structure. This problem is modeled by the function f de?ned by f (x, y) = 1 i? y is a substring of x. Pre?x Sum: Given a bit vector x ∈ {0, 1}n, store it in a data structure so that k queries “What is ( i=1 xi ) mod 2?” can be answered. This problem is modeled vy by the function f de?ned by f (x, y) = ( i=1 xi ) mod 2 where y is the binary representation of the integer vy . For Substring Search, both the data to be stored and the query are bit strings, as our framework requires. The only reason for this requirement is that to make our discussion about current lower bound techniques and their limitations clear, we want the parameter n to always refer to the number of bits of the data to be stored, the parameter m to always refer to the number of bits of a query and the output of the query to be a single bit. In general, we don’t necessarily expect the data we want to store to be bit strings, but an arbitrary encoding as bit strings may take care of this, as in the following example. Membership: Given a set S of k binary strings each of length m, store S as a data structure so that given a query y ∈ {0, 1}m, we can tell whether y ∈ S. To make this problem ?t intomthe framework above, the function f would be de?ned by letting n = log2 2k and ?xing, in some arbitrary way, a compact encoding of k-sets as n-bit strings and letting f (S, y) = 1 i? y ∈ S. The framework captures not only the classical “storage and retrieval” static data structure problems but also more general problems of dealing with preprocessed information, such as the classical algebraic problem of polynomial evaluation with preprocessing of coe?cients ([15, pp. 470-479], see also [19]):

Polynomial Evaluation: Store g ∈ F[x], |F| = 2k , g of degree ≤ d as a memory image so that queries “What is g(x)?” can be answered for any x ∈ F. This problem is non-boolean, but can be modeled as a boolean problem by letting n = (d+1)k, m = k +log k, ?xing an arbitrary compact encoding of polynomials and ?eld elements as bit strings and letting f (g, x · y) = vy ’th bit of g(x), where y is the binary notation of vy and · denotes concatenation. In the cell probe model with word size 1 (the bit probe model), a solution with space bound s and time bound t to a problem f is given by a storage scheme φ : {0, 1}n → {0, 1}s, and a query algorithm q so that q(φ(x), y) = f (x, y). The time t of the query algorithm is its bit probe complexity, i.e., the worst case number of bits it reads in φ(x). Every problem possesses two trivial solutions: The solution of explicitly storing the answer to every query (this solution has space s = 2m and time t = 1) and the solution of storing the data verbatim and reading the entire data when answering queries (this solution has space s = n and time t = n, as we only charge for reading bits in φ(x), not for computation). The study of cell probe complexity concerns itself with the tradeo? between s and t that may be obtained by solutions somewhere between the two extremes de?ned by the trivial solutions. Such solutions may be quite non-trivial and depend strongly on the problem considered. A polynomial solution satis?es s = nO(1) and t = mO(1) . For instance, perfect hashing schemes form solutions to Membership with s = O(n) and t = O(m) [11] and even s = n + o(n) and t = O(m) [5, 25]. Substring Search also admits an s = O(n), t = O(m) solution [12] and very recently a solution with s = n + o(n) and t = mO(1) was constructed [13] but no solution with s = n + o(n) and t = O(m) is known. For a problem such as Polynomial Evaluation (and many natural data structure problems, such as partial match type problems [3, 4, 7]), we know of no solution with s = nO(1) , t = mO(1) . Thus, a main concern is to prove that such solutions do not exist. For s = O(n), lower bounds of the form t = ?(m) may be obtained for explicit and natural problems by simple counting arguments [6]. For s = nO(1) , we can do almost as good: Lower bounds of the form t = ?(m/ log n) can be obtained using communication complexity [20]. But no very good (i.e., ω(m)) lower bounds are known on t for any explicit problem f for the case of s = O(n) or s = nO(1) even though counting arguments prove the existence of (non-explicit) problems f with lower bounds of the form t = ?(n), even for m ≈ (log n)2 [18]. Thus, it is consistent with our current knowledge that solutions with s = O(n) and t = O(m) exist for all explicit (e.g., all exponential time computable) problems, though it is certainly a generally believed conjecture that this is not the case! Given our lack of tools strong enough to show statements such as s = O(n) ? t = ω(m) for explicit problems, it seems appropriate to lower our ambitions slightly and try to show such lower bounds for t for any non-trivial value of s. Achieving such goals is well in line with the current trend in the theoretical as well as practical studies of data structures (e.g., [17, 5, 25, 13]) of focusing on succinct data structures where s = n + r for some redundancy r n, i.e., on

structures whose space requirement is close to the information theoretic minimum. Restricting our attention to such succinct structures by no means trivializes obtaining the lower bounds we want to show. For instance, it is open (and remains open, also after this work) whether a solution with r = 0 and t = O(m) exists for the Membership problem. However, in this paper we show that for certain explicit (polynomial computable) problems it is possible to show lower bounds of the form t = ω(m) and even t = ?(n) for structures with a su?ciently strong upper bound on r: Theorem 1. Let k, d be integers larger than 0 so that d < 2k /3. Let F = GF(2k ) and let n = (d + 1)k. Let a storage scheme φ : {f |f ∈ F[x], degree(f ) ≤ d} → {0, 1}n+r and associated query scheme for “What is f (x)?”, x ∈ F with bit probe complexity t be given. Then, (r + 1)t ≥ n/3. In particular, for very small redundancies, we get an almost optimal lower bound stating that the query algorithm has to inspect almost the entire data structure. The theorem is for the (more natural) non-boolean version of the polynomial evaluation problem. A lower bound of (r + 1)t ≥ n/3k for the boolean version of polynomial evaluation we de?ned previously immediately follows. The proof of Theorem 1 (presented in Section 2) is based on the fact that the problem of polynomial evaluation hides an error correcting code: The strings of query answers for each possible data (i.e., each polynomial) form the ReedSolomon code. We can generalize Theorem 1 to any problem hiding an error correcting code in a similar way (see Theorems 4 and 5 in Section 2). However, not many natural data structuring problems contain an error correcting code in this way. In Section 2, we introduce a parameter of data structuring problems called balance and, using the sun?ower lemma of Erd?s and Rado show that o for problems having constant balance, we get a lower bound of the form t(r + 1)2 ≥ ?(n) (Theorem 6). A problem hiding a good error correcting code in the way described above has constant balance, but the converse statement is not necessarily true. Hence Theorem 6 has the potential to prove lower bounds on a wider range of problems than Theorems 4 and 5, though we do not have any natural data structuring problems as examples of this at the moment. The results above are based on combinatorial properties of a coding theoretic ?avor of the problems f to be solved. We don’t know how to prove similar lower bounds for natural storage and retrieval problems such as Substring Search. However, we get a natural restriction of the cell probe model by looking at the case of systematic or index structures. These are storage schemes φ satisfying φ(x) = x · φ? (x) for some map φ? , i.e, we require that the original data is kept “verbatim” in the data structure. We refer to φ? (x) as the index part of φ(x). The restriction only makes sense if there is a canonical way to interpret the data to be stored as a bit-string. It is practically motivated: The data to be encoded may be in read-only memory or belong to someone else or it may be necessary to keep it around for reasons unrelated to answering the queries de?ned by f . For more discussion, see, e.g. Manber and Wu [17]. In the systematic model, we prove a tight lower bound for Pre?x Sum (in fact, we show that the lower

bound is implicit in work of Nisan, Rudich and Saks [24]) and a lower bound for Substring Search. Theorem 2. Θ(n/(r + 1)) bit probes are necessary and su?cient for answering queries in a systematic structure for Pre?x Sum with r bit redundancy. Theorem 3. Consider Substring Search with parameters n, m so that 2 log2 n + 5 ≤ m ≤ 5 log2 n. For any systematic scheme solving it with redundancy r and 1 bit probe complexity t, we have (r + 1)t ≥ 800 n/ log n. Both proofs are presented in Section 3. We are aware of one paper previous to this one where lower bounds of the form t = ω(m) were established for succinct, systematic data structures: Demaine and Lopez-Ortiz [8] show such a lower bound for a variation of the Substring Search problem. In their variation, a query does not just return a boolean value but an index of an occurrence of the substring if it does indeed occur in the string. For this variation, they prove the following lower bound for a value of m which is Θ(log n) as in our bound: t = o(m2 / log m) ? (r + 1)t = ?(n log n). Thus, they give a lower bound on the query time even with linear redundancy which our method cannot. On the other hand, their method cannot give lower bounds on the query time better than ?(m2 / log m) even for very small redundancies which our method can. Furthermore, our lower bound applies to the boolean version of the problem.

2

Lower bounds for non-systematic structures

Proof of Theorem 1. Let a storage scheme φ with redundancy r and an associated query scheme with bit probe complexity t be given. Let s = n + r. Assume to the contrary that the scheme satis?es (r + 1)t < n/3. As r ≥ 0 in any valid scheme, we have t < n/3. We make a randomized construction of another storage scheme φ by randomly removing r + 1 bits of the data structures of storage scheme φ. That is, we pick S ? {1, .., n+r} of size r+1 at random and let φ (x) = φ(x) with bits in positions i ∈ S removed. Thus, φ (x) ∈ {0, 1}n?1. We make an associated query scheme for φ by simulating the query scheme for φ, but whenever a bit has to be read that is no longer there, we immediately answer “Don’t know”. Clearly, if we use our new storage scheme φ and the associated query scheme, we will on every query, either get the right answer or the answer “Don’t know”. Now ?x a polynomial f and a query x and let us look at the probability that the randomized construction gives us the answer “Don’t know” on this particular data/query-pair. The probability is equal to the probability that the random set S intersects the ?xed set T of bits that are inspected on query x in structure φ(f ) according to the old scheme. As |S| = r + 1 and |T | ≤ t, the probability of no intersection can be bounded as Pr[S ∩ T = ?] ≥ ( s?t ) ( s?1?t ) . . . ( s?(r+1)+1?t ) s s?1 s?(r+1)+1
t ≥ (1 ? n )r+1 ≥ 1 ? (r+1)t > 2/3. This means that if we ?x f and count the n number of answers that are not “Don’t know” among all answers to “What is f (x)?”, x ∈ F, the expected number of such valid answers is > 2|F|/3, and the expected number of “Don’t know” answers is < |F|/3. Thus, for ?xed f , the

probability that the number of valid answers for this f is < |F|/3 is < 1/2. De?ne f to be “good” for a particular choice of S if the number of valid answers for f is at least |F|/3. Thus, for random S, the probability that a particular ?xed f is good is > 1/2, by the above calculation, so if we count among all 2 n possible f ’s the number of good f ’s, the expectation of this number is > 2n /2. Thus, we can ?x a value of S so that the number of good f ’s is > 2n /2. Let the set of good f ’s relative to this choice of S be called G. We now argue that the map φ : G → {0, 1}n?1 is a 1-1 map: Given the value φ (f ) for a particular f ∈ G, we can run the query algorithm for f (x) for all x ∈ F and retrieve a valid answer in at least |F|/3 cases - in the other cases we get the answer “Don’t know”. Since the degree of f is less than |F|/3, the information we retrieve is su?cient to reconstruct f . Thus, we have constructed a 1-1 map from G with |G| > 2n /2 to the set {0, 1}n?1 which has size 2n /2. This violates the pigeonhole principle, and we conclude that our assumption (r +1)t < n/3 was in fact wrong. This completes the proof of Theorem 1. Theorem 1 can be generalized to any problem based on some error correcting code. Consider an arbitrary boolean static data structure problem, given by a map f : {0, 1}n ×{0, 1}m → {0, 1}. Let N = 2n , and M = 2m . Then the problem can be represented by an N × M Boolean matrix Af , with the entry at the row indexed by x and the column indexed by y being equal to f (x, y). Theorem 4. Let Af be the N by M (N = 2n ) matrix of a data structure problem such that the rows of Af have pairwise distance at least δM . If the problem can be solved with redundancy r and query time t, then t(r + 1) ≥ δn/2. The argument can also be extended to problems where the minimum distance may not be large, but instead we require that within any ball of radius ρM there are at most L codewords (i.e., codes with certain list decoding properties). In fact, the even weaker property of having only few codewords in every subcube of dimension ρM is su?cient for our purposes. (Note that this property corresponds to the problem of list decoding from erasures, rather than from errors.) Let αi1 , . . . , αiM ?d be an arbitrary 0/1 assignment to M ? d coordinates. The set S ? {0, 1}M of size |S| = 2d formed by all possible vectors from {0, 1}M agreeing with αi1 , . . . , αiM ?d and arbitrary in the remaining coordinates is called a subcube of dimension d. Theorem 5. Let Af be the N by M (N = 2n ) matrix of a data structure problem such that within any subcube of dimension ρM there are at most L row vectors from Af . If the problem can be solved with redundancy r and query time t, then t(r + 1 + log L) ≥ ρ(n ? log L)/2. The proofs of Theorems 4 and 5 are very similar to the proof of Theorem 1 and appear in the full version of this paper. We next give a general lower bound for any problem whose matrix satis?es certain conditions. Informally, we require that the submatrix formed by any small subset of rows contains a balanced column.

De?nition 1. Let A be a matrix with 0/1 entries. We say that A has balance at least λ for parameter k, if for any k rows of the matrix A there exists a column that contains at least λk 0-s and at least λk 1-s among the entries of the given k rows. Lemma 1. Given a code with N words in {0, 1}l, let A be the N by l matrix formed by the words as rows. If the minimum distance of the code is δl, then A has balance at least δ/8 for every 1 < k ≤ N . Proof. Look at the k by l table formed by k rows of A. Let γ = δ/8. Suppose that each column in the table has either < γk 0-s or < γk 1-s. Let a be the number of mostly 1 columns and b be the number of mostly 0 columns. Then < k/2 rows have > 2γa 0-s on the mostly 1 part. Restrict the table to the other k > k/2 rows. In this table, the b mostly 0 columns still have < 2γk 1-s. So, < k /2 rows have > 4γb 1-s on the mostly 0 part. Thus, > k/4 rows have both < 2γa 0-s on the mostly 1 part and < 4γb 1’s on the mostly 0 part, respectively. The distance of any two of these rows is < 4γa + 8γb < δl, which is a contradiction. The proof of Lemma 1 also extends to codes where the minimum distance may not be large, but instead we require that within any ball of certain radius there are not too many words, i.e., to problems satisfying the condition of Theorem 5. We can, however, construct codes that satisfy the property of having large balance for every k, without the property of having few codewords in every Hamming ball of a given radius, and even without the weaker property of having few codewords in every subcube of a given dimension. Consider the following example of such construction. Let ρ be any constant, and L any integer, such 1 that ρ + L < 1/20. We will construct a set of words in {0, 1}M with at least L words in some subcube of dimension ρM , such that for any set of rows of the 1 corresponding matrix there is a column with balance > ρ + L . Start with any 1 family that has balance at least 5(ρ+ L ). (We know the existence of such families, from the existence of good error correcting codes.) Add L words to this family as follows. Take a code of L words on c log L coordinates for some constant c, with relative minimum distance 1/4. (Such code exists for some constant c.) Let the ?rst c log L coordinates of the extra L words to be words from this code of size L, and let the L words be identical in the remaining M ?c log L coordinates. Unless L is huge (compared to M ), we have c log L < ρM , thus we have L words in a subcube of dimension ρM . It is not hard to see that the corresponding 1 matrix has balance at least ρ + L for any k. Thus, the following theorem has the potential of giving lower bounds for a wider range of problems than the theorems of Section 2. Consider an arbitrary boolean static data structure problem, given by a map f : {0, 1}n × {0, 1}m → {0, 1}. Theorem 6. Let Af be the N by M (N = 2n , M = 2m ) matrix of f . If Af has balance at least λ for every 1 < k ≤ log N . and the problem de?ned by f can be solved with redundancy r and query time t, then t(r + 1)2 ≥ λn. Proof. A solution to the data structure problem is given by a representation φ : {0, 1}n → {0, 1}s and a query algorithm. We consider a matrix B of size

N × s, such that the row of B indexed by x is the vector φ(x). We use the following standard observation. Observation 1. Given a set C of N = 2s?r vectors in {0, 1}s, for every 0 ≤ s w ≤ s there is a vector v ∈ {0, 1}s , such that there are at least w /2r vectors in C at distance w from v. Proof. Let χ(u, v) = 1 if u and v di?er in w coordinates, and χ(u, v) = 0 s . On the other hand, otherwise. We have u∈C v∈{0,1}s χ(u, v) = |C| w s v∈{0,1}s u∈C χ(u, v) ≤ 2 maxv∈{0,1}s |Cv,w | , where Cv,w = {z ∈ C| z and v di?er in w coordinates }. This completes the proof of Observation 1. Let w = r + 1 (note that r + 1 ≥ 1), and let v ∈ {0, 1}s, guaranteed to exist s by the observation, such that there are at least r+1 /2r rows of B at distance r +1 from v. Let Bv be the matrix obtained from B by adding v to each row of B (taking bitwise XOR). With each vector u ∈ {0, 1}s we associate a set U ? [s], such that i ∈ [s] belongs to U if and only if the i-th entry of u is 1. Then the s matrix Bv speci?es a family B of N sets, such that at least r+1 /2r members of B have cardinality r + 1. A family of k sets S1 , . . . , Sk is called a sun?ower with k petals and core T , if Si ∩ Sj = T for all i = j. We also require that the sets Si \ T are nonempty. Lemma 2 (Erd?s and Rado, [10]). Let F be a family of sets each with caro dinality w. If |F| > w!(k ? 1)w , then F contains a sun?ower with k petals.
s Since r+1 /2r > (r + 1)!(s/(r + 1)2 )r+1 , Lemma 2 implies that B contains a sun?ower with k = s/(r + 1)2 petals. Let S1 , . . . , Sk be the sets of the sun?ower, and let T be its core. Then, the sets Si T are pairwise disjoint. (Si T denotes the symmetric di?erence of the sets Si and T .) Let z and u1 , . . . , uk be the vectors obtained by adding the vector v to the characteristic vectors of the set T and S1 , . . . , Sk , respectively. Then the vectors u1 , . . . , uk are rows of the matrix B, and they have the property that the vectors z ⊕ u1 , . . . , z ⊕ uk have no common 1’s, since the set Si T is exactly the set of coordinates where the vectors z and ui di?er from each other. Let x1 , . . . , xk be the data such that ui = φ(xi ), i = 1, . . . , k. Consider now the k rows of Af indexed by x1 , . . . , xk . By our assumption on Af , there is a question y, such that at least λk of the answers f (xi , y) are 0, and at least λk of the answers f (xi , y) are 1. We think of the query algorithm as a decision tree, and show that it has large depth. In particular, we show that the path consistent with the vector z has to be at least λk long. (Note that the vector z may not be a row of the matrix B. However, we can assume that the decision tree has been trimmed, so that there are no long paths that can be cut o? without a?ecting the correctness of the algorithm. This implies that there is at least one path corresponding to a vector φ(x) that the algorithm may actually have to follow, and is at least λk long.) Assume that the query algorithm reads at most t < λk bits on any input when trying to answer the question y, and assume that the bits read are consistent with the vector z. Since the sets of coordinates where z di?ers from ui for i = 1, . . . , k are pairwise

disjoint, after asking at most t questions, the algorithm can rule out at most t of the data x1 , . . . , xk , and the remaining k ? t are still possible. If t < λk, then among the data that are still not ruled out, both the answer 0 and the answer 1 is possible, and the algorithm cannot determine the answer to the given question y. This completes the proof of Theorem 6. It is not hard to ?nd examples of matrices with large balance for k ≤ log N , if we are not worried about the number of rows N being large enough compared to the number of columns M . We should mention that there are well known constructions (e.g. [2, 14, 22, 23, 26]) for the much stronger property requiring that all possible 2k patterns appear in the submatrix formed by arbitrary k rows. However, in such examples, N ≤ M or 2k ≤ M must trivially hold. Error correcting codes provide examples where N can be very large compared to M . Let n(k, λ, M ) denote the largest possible number n, such that 2n by M 0/1 matrices exist with balance at least λ for k. Lower bounds on the largest achievable rate of error-correcting codes or list decodable codes provide lower bounds on n(k, λ, M ). For example, the Gilbert-Varshamov bound (see e.g. [16]) together with Lemma 1 implies n(k, λ, M ) ≥ (1 ? H(8λ))M , for every k > 1. Note that while error correcting codes give large balance for every k > 1, for our purposes matrices that have large balance for only certain values of k may already be useful. It would be interesting to know if n(k, λ, M ) can be signi?cantly larger (for certain values of k) than what is achievable by error-correcting or list decodable codes. If this is the case, then our techniques might help to achieve lower bounds for the Membership problem.

3

Lower bounds for systematic structures

Proof of Theorem 2. Upper bound: For r = 0, the upper bound is obvious. For r ≥ 1, divide the input vector into r equal sized blocks and let yi be the parity of the i’th block. Now store for each j = 1, ..r, the parity of y1 , y2 , . . . , yj . Given a pre?x sum query, it can be answered by reading a non-systematic bit, that gives the parity of a collection of blocks and XORing it with a number of individual input bits, all found in a single block of size n/r. The bit probe complexity is O(n/r). Lower bound: Let a scheme of redundancy r be given and suppose the queries can be answered with t bit probes, i.e., we can ?nd x1 ⊕ · · · ⊕ xj using a decision tree of depth t over the input bits and the index bits. Split the input into r + 1 n blocks of about equal length, each block containing at least r+1 bits. It is possible to determine the parity of one of the blocks by a decision tree of depth 2t over the input bits and the index bits. We now apply a theorem of Nisan, Rudich and Saks [24]: Given l + 1 instances of computing parity of k bits, with l help bits (which can be arbitrary functions of the (l + 1)k input bits), given for free. At least one of the l + 1 parity functions has decision tree complexity ≥ k. We immediately get the desired bound. Proof of Theorem 3. Since we must have r ≥ 0 and t ≥ 1 in a valid scheme, we can assume that 1 ≤ t ≤ 800 n n otherwise there is nothing to prove. We need log

4

Open problems

It is interesting that all our best bounds, both in the non-systematic and in the systematic case, are of the form “(r + 1)t must be linear or almost linear in n.” We don’t see any inherent reason for this and in general do not expect the lower bounds obtained to be tight. Thus, it would be nice to to prove a lower bound of, say, the form, t < n/polylog n ? r > n/polylog n for Polynomial Evaluation in the non-systematic case or Substring Search in the systematic case. For the latter result, it would be su?cient to verify the hypothesis about the game de?ned above. It is also interesting to note that our lower bound for Substring Search and the lower bound of Demaine and Lopez-Ortiz are incomparable. Can the two techniques be combined to yield a better lower bound? We have only been able to prove lower bounds in the non-systematic case for problems satisfying certain coding theoretic properties. It would be very nice to extend the nonsystematic lower bounds to more natural search and retrieval problems, such as Substring Search. A prime example of a problem for which we would like better bounds is Membership as de?ned in the introduction. As the data to be stored has no canonical representation as a bitstring, it only makes sense to consider this problem in the non-systematic model. The lower bound r = O(n) ? t = ?(m) was shown by Buhrman et al [6]. On the other hand, a variety of lowredundancy dictionaries with r = o(n) and t = O(m) has been constructed [5, 25]. We conjecture that any solution for membership with t = O(m) must have some redundancy, i.e., that t = O(m) ? r ≥ 1. It would be very nice to establish this. The main open problem of cell probe complexity remains: Show, for some explicit problem, a tradeo? of the form r = O(n) ? t = ω(m). Clearly, for such tradeo?s the distinction between systematic and non-systematic structures is inconsequential.
Acknowledgements Anna G?l is supported in part by NSF CAREER Award CCRa 9874862 and an Alfred P. Sloan Research Fellowship. Peter Bro Miltersen is supported by BRICS, Basic Research in Computer Science, a centre of the Danish National Research Foundation.

References
1. M. Ajtai. A lower bound for ?nding predecessors in Yao’s cell probe model. Combinatorica, 8:235–247, 1988. 2. N. Alon, O. Goldreich, J. H? astad, R. Peralta: Simple constructions of almost kwise independent random variables. Random Structures and Algorithms 3 (1992), 289–304. 3. O. Barkol and Y. Rabani, Tighter bounds for nearest neighbor search and related problems in the cell probe model. In Proc. 32th Annual ACM Symposium on Theory of Computing (STOC’00), pages 388–396. 4. A. Borodin, R. Ostrovsky, Y. Rabani, Lower bounds for high dimensional nearest neighbor search and related problems. In Proc. 31th Annual ACM Symposium on Theory of Computing (STOC’99), pages 312–321. 5. A. Brodnik and J.I. Munro. Membership in constant time and almost-minimum space. SIAM Journal on Computing, 28:1627–1640, 1999.

6. H. Buhrman, P.B. Miltersen, J. Radhakrishnan, S. Venkatesh. Are bitvectors optimal? In Proc. 32th Annual ACM Symposium on Theory of Computing (STOC’00), pages 449–458. 7. A. Chakrabarti, B. Chazelle, B. Gum, and A. Lvov. A lower bound on the complexity of approximate nearest-neighbor searching on the Hamming Cube. In Proc. 31th Annual ACM Symposium on Theory of Computing (STOC’99), pages 305–311. 8. E.D. Demaine and A. Lopez-Ortiz. A Linear Lower Bound on Index Size for Text Retrieval. In Proc. 12th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’01), pages 289–294. 9. P. Elias and R. A. Flower. The complexity of some simple retrieval problems. Journal of the Association for Computing Machinery, 22:367-379, 1975. 10. P. Erd?s and R. Rado. Intersection theorems for systems of sets. Journal of London o Mathematical Society 35 (1960), pages 85-90. 11. M. L. Fredman, J. Koml?s, and E. Szemer?di. Storing a sparse table with O(1) o e worst case access time. Journal of the Association for Computing Machinery, 31:538–544, 1984. 12. R. Grossi, J.S. Vitter. Compressed su?x arrays and su?x trees with applications to text indexing and string matching. In Proc. 32th Annual ACM Symp. on Theory of Computing (STOC’00), pages 397–406. 13. R. Grossi, A. Gupta, and J.S. Vitter. High-Order Entropy-Compressed Text Indexes. In Proc. 14th Annual ACM-SIAM Symp. on Discrete Algorithms (SODA’03), pages 841–850. 14. D. J. Kleitman and J. Spencer: Families of k-independent sets. Discrete Math.6 (1973), pp. 255-262. 15. D.E. Knuth, The Art of Computer Programming, Vol. II: Seminumerical Algorithms (Addison-Wesley, Reading, MA, 2nd ed., 1980). 16. F. J. MacWilliams and N. J. A. Sloane. The theory of error correcting codes. Elsevier/North-Holland, Amsterdam, 1981. 17. U. Manber, S. Wu. GLIMPSE - A Tool to Search Through Entire Filesystems. White Paper. Available at http://glimpse.cs.arizona.edu/. 18. P.B. Miltersen. The bitprobe complexity measure revisited. In 10th Annual Symposium on Theoretical Aspects of Computer Science (STACS’93), pages 662–671, 1993. 19. P.B. Miltersen, On the cell probe complexity of polynomial evaluation, Theoretical Computer Science, 143:167–174, 1995. 20. P.B. Miltersen, N. Nisan, S. Safra, and A. Wigderson: On data structures and asymmetric communication complexity, Journal of Computer and System Sciences, 57:37–49, 1998. 21. M. Minsky and S. Papert. Perceptrons. MIT Press, Cambridge, Mass., 1969. 22. J. Naor and M. Naor: Small-bias probability spaces: e?cient constructions and applications. SIAM J. Comput., Vol. 22, No. 4, (1993), pp. 838-856. 23. M. Naor, L. Schulman, A. Srinivasan: Splitters and near optimal derandomization. In Proc. of 36th IEEE FOCS, (1995), pp. 182-191. 24. N. Nisan, S. Rudich, and M. Saks. Products and Help Bits in Decision Trees, SIAM J. Comput. 28:1035–1050, 1999. 25. R. Pagh. Low redundancy in static dictionaries with O(1) lookup time. In International Colloquium on Automata Languages and Programming (ICALP’99), Lecture Notes in Computer Science, Volume 1644, pages 595–604, 1999. 26. G. Seroussi and N. Bshouty: Vector sets for exhaustive testing of logic circuits. IEEE Trans. Inform. Theory, 34 (1988), pp. 513-522.

Statistical Encoding of Succinct Data Structures_免....pdf
The cell probe complexit... 暂无评价 13页 免费 Succinct Data Structures 暂无评价 12页 免费 SUCCINCT DATA STRUCTU... 236页 免费 Quick decoding and ...
Succinct Dynamic Data Structures Rajeev Raman 1, Venkatesh ....pdf
Succinct Dynamic Data Structures Rajeev Raman 1, Venkatesh Raman 2_专业资料...Saks, \The Cell Probe Complexity of Dynamic Data Structures", Proceedings ...
6.897 Advanced Data Structures Spring_专业资料。In this lecture, we discuss lower bounds on the cell-probe complexity of the static predecessor problem. ...
The bit probe complexity measure revisited.pdf
elements of D into data structures in the memory of a random access ...Later studies in cell probe complexity have, however, focused on worst case...
Abstract In the Uncapacitated Facility Location (UFL).pdf
to quickly add two such functions and probe them...These functions have succinct representations and ...In section 4, we describe the data structures ...
MIT niu sample statement.pdf
The cell-probe model is a strong nonuniform model of computation, used ... for static data structures: reduction to asymmetric communication complexity. ...
On the Relation Between BDDs and FDDs (Extended Abstract)_....pdf
Data structures for Boolean functions build an ...Finally, we determine the complexity of some ...A succinct multi-level representation is now ...
(Tool paper)Genet_a Tool for the Synthesis and Mining of ....pdf
Moreover, the complexity of the analysis ...Hence data structures and algorithms behind Genet ...It is interesting to note the succinctness of ...
ABSTRACT Time-Space Trade-Offs for Predecessor Sear....pdf
new technique for proving cell-probe lower bounds for static data structures...1.1 The Complexity-Theoretic View / lg a In external memory (b > ), ...
The IO-complexity of ordered binary-decision diagram ....pdf
, or on ?nding alternative succinct representations while maintaining the e?...[2] where a number of external (batched) dynamic data structures are ...
Succinct Representation of General Unlabeled Graphs.pdf
Succinct Representation of General Unlabeled Graphs_...2 ! Thus, if c2 Time Complexity: The most ...Naor, Implicit O(1) Probe Search, twenty- rst...
The complexity of plan existence and evaluation in ....pdf
the complexity of the planning process for a richer set of plan structures...In the succinct representation, straightforward probability matrices would be ...
The Complexity of Some Basic Problems for Dynamic Process ....pdf
The Complexity of Some Basic Problems for Dynamic...Nodes of a data- ow graph represent tasks to ...succinct way using generating circuits or a ...
Scalable and dynamic quorum systems.pdf
probe complexity of quorum systems and their ...1.1 Scalable Dynamic Data Structures - P2P ...? when is the number of neighbors the cell has...
On the Complexity of Relational Problems for Finite State ....pdf
On the Complexity of Relational Problems for ...the speci cation of the data and control ow in...harder for processes with such succinct descriptions...
The Expressive Power and Complexity of Dynamic Process Graphs....pdf
The Expressive Power and Complexity of Dynamic ...this will be equivalent to ordinary data ow ...succinct form using generating circuits or a ...