# L Coding Complexity The Computational Complexity of Succinct Descriptions

ISSN 0918-2802 Technical Report

L

Coding Complexity:
The Computational Complexity of Succinct Descriptions

Jos L. Balczar, Ricard Gavald, e a a and Osamu Watanabe
TR96-0018 October

Department of Computer Science Tokyo Institute of Technology
^ Ookayama 2-12-1 Meguro Tokyo 152, Japan

http://www.cs.titech.ac.jp/

c The author(s) of this report reserves all the rights.

Title:

Coding Complexity: The Computational Complexity of Succinct Descriptions

Authors:

Jos L. Balczar, Ricard Gavald e a a Dept. of Software (LSI), Univ. Politcnica de Catalunya, E-08028 Barcelona, Spain. e E-mail: fbalqui,gavaldag@lsi.upc.es. Osamu Watanabe Dept. of Computer Science, Tokyo Inst. of Tech., Meguro-ku Tokyo 152, Japan. E-mail: watanabe@cs.titech.ac.jp. For a given set of strings, the problem of obtaining a succinct description becomes an important subject of research, related to several areas of theoretical computer science. In structural complexity theory, researchers have developed a reasonable framework for studying the complexity of these problems. In this paper, we survey how such investigation has proceeded, and explain the current status of our knowledge.
Abstract.

1

Introduction

The question \What is structural complexity theory?", or \Why do we study structural complexity theory?" has been sometimes asked, and it has been the source of interesting discussions. Attempting to give an answer to this question, Book and Watanabe wrote an article [BW93] on structural complexity theory. There they illustrated, through examples, that there is the following role in structural complexity theory: When studying some speci c problem, one may establish some intuition regarding the diculty of the problem compared with others. The role of structural complexity theory is to provide a framework and a theory for expressing such intuition formally and investigate it further. Here we present another set of examples supporting this thesis. We survey a structural study on \coding complexity" of languages. The problem of obtaining a succinct description for a given set of data arises in various elds of computer science. More speci cally, we consider in this paper the problem of representing a given language , i.e., a set of strings. Here we regard a machinery, e.g., automaton, circuit, grammar, etc., for recognizing or generating a language as its description. That is, the problem is to compute, e.g., some (small) automaton accepting a given language. In this paper, we use \coding problem" referring to this type of problem, and by \coding complexity", we mean the complexity of these coding problems. Our coding problems are closely related to learning theory. Roughly speaking, this is the problem to be solved for designing a learning algorithm. The problem varies, however, depending on the learning framework chosen. In the PAC learning framework [Val84], a learning algorithm is essentially the same as an algorithm that computes a succinct description explaining a given set of examples. (Actually, the equivalence between learning and nding a compressed description of the examples can be stated and proved formally; see [BEHW87], [BP90].) Notice that a set of examples is given as input data. Hence, the coding problem becomes an optimization 1

problem, and we can discuss its diculty in the standard complexity theory framework; for example, we can argue for the diculty of a given PAC learning problem by showing the corresponding coding problem is NP-hard [PV88]. On the other hand, in the query learning framework [Ang87], information on a target language is obtained on the course of computation by asking queries to a teacher, a person or artifact who knows the target language, or enough information about it. Thus, some new framework is necessary for discussing the diculty of such learning process. We will see below that structural research on relative complexity and nonuniform complexity has provided us with one such framework and several important results, which would help us to investigate query learnability1 .

2

Preliminaries

We use standard notions and notations in formal language theory and complexity theory, and we review some of the basic notions and notations; see, e.g., [BDG88, Pap94] for the others. In this paper, a string is a binary string, i.e., an element of f0; 1g3, and by a language (or, more simply, a set), we mean a set of strings. For any set A and any integer n, we use An and A=n to denote the sets of strings in A of length at most n and exactly n, respectively. A subset of 03 is called a tally set, and a set is called a sparse set if there exists a polynomial p such that kAn k  p(n) for all n  0. For measuring uniform complexity, we use a standard Turing machine model, and de ne complexity classes by using resource bounded Turing machines. For example, P is the class of languages recognized by polynomial-time bounded Turing machines. We also consider Turing machine transducers for discussing the complexity of computing a function. For the sake of simplicity, we use the same notation for language and function classes. For example, by P we also mean the class of functions that are computable by polynomial-time Turing machine transducers, a class frequently denoted as PF or FP. For measuring relative complexity, we use oracle Turing machines, i.e., machines that ask queries to an oracle set. For example, for any set A, let P(A) is the class of languages recognized by polynomial-time bounded oracle Turing machines by using A as an oracle. The boolean circuit model is also considered in order to measure nonuniform complexity. Here we consider combinatorial circuits, i.e., acyclic circuits consisting of AND, OR, NOT, input and output gates. We assume that each circuit has only one output gate, and regard a circuit of n input gates as an acceptor for strings of length n. When necessary, we assume that circuits are encoded in f0; 1g3 in some reasonable way. In structural complexity theory, various polynomial-time reducibilities have been de ned for classifying polynomial-time bounded relativized computation. Among them,
1 The
results obtained so far are not strong enough to yield really important results on query learnability; but based on these results, we hope, a comprehensive theory of query learnability will be established in near future.

2

oracle Turing machine by asking only nonadaptive queries to B : this means simply that the list of queries to be asked is prepared before the corresponding answers are known, so that the queries do not depend of the answers to other queries. The case in which a single query tells the result to be output is known as m-reducibility; it de nes a polynomial-time computable function from the reduced set to the set it is reduced to, and likewise for their complements. If it is bijective and the inverse is also polynomial-time computable, we call it a polynomial-time isomorphism. We use sometimes Kolmogorov complexity. We provide only the very basic de nitions concerning this notion; see the book by Li and Vitnyi for more information [LV93]. Fix a a universal machine U . For a binary string x, any string y such that U (y ) stops and outputs x is considered as a description of x. The Kolmogorov complexity of x, denoted by K(x), is de ned as the length of the shortest description for x. String x is called Kolmogorovrandom if K(x) is at least equal to the length of x (except, maybe, up to some additive constant). It is well known that, for every length n, there are Kolmogorov-random strings of length n. On the other hand, if K(x) is at most logarithmic in the length of x, x is informally called Kolmogorov-easy.

P -reducibility and P -reducibility will be considered in this paper. For any set A and tt T B , we say that A is P -reducible to B if A is accepted by some polynomial-time oracle T Turing machine with B as an oracle. On the other hand, P -reducibility is more restrictt tive. That is, we say that A is P -reducible to B if A is accepted by some polynomial-time tt

3

Developing a Theory

We rst explain a framework for discussing the \descriptional complexity" of languages. As explained above, we consider three similar ways to represent a given language: circuit, tally set, and sparse set representations.

Circuits and Reduction Classes
Let us start with circuits. It has been known (see, e.g., [Sav72]) that circuit size can be used as a computational complexity measure that is closely related to the standard time complexity of Turing machines. But Karp and Lipton [KL80] may be the rst who explicitly use the notion of circuit size as a nonuniform complexity measure. They also introduced the class P=poly for characterizing the class of languages having polynomialsize circuits. Though P=poly is de ned in a di erent way, in this paper, we use the following rather intuitive de nition for P=poly. (See, e.g., [BDG88, Joh90]). if there exist a polynomial p and a family fCn gn0 of circuits such that for each n  0, (i) Cn takes a string x of length n as input and determines whether x is in A, and (ii) the size of Cn is bounded by p(n). Let P=poly be the class of languages having polynomial-size circuits.
circuits

De nition 3.1. A set A has polynomial-size

3

It has been long known (attributed to Meyer in [BH77]) that the class P=poly is characterized as the class of sets that are polynomial-time Turing reducible to sparse sets. Book and Ko [BK88] showed further characterizations by considering the following \reducibility classes". RP (SPARSE)) is the class of sets that are reducible to some tally set (resp., sparse set) r via some reduction of type r . (1) A is in P=poly, (2) A is polynomial-time Turing reducible to some sparse set, (3) A is polynomial-time truth-table reducible to some sparse set, (4) A is polynomial-time Turing reducible to some tally set, and (5) A is polynomial-time truth-table reducible to some tally set. That is, P=poly = RP (SPARSE) = RP (TALLY) = RP (SPARSE) = RP (TALLY). T T tt tt

De nition 3.2. [BK88] For any polynomial-time reduction type r, RP (TALLY) (resp., r

Proposition 3.3. [BK88] For any set A, the following ve statements are equivalent.

They also presented a deep study of more restrictive reduction classes, with precise comparisons among all of them. In this paper, we intuitively regard a family of circuits (resp., a tally or sparse set) as a \code". For example, for a given set A 2 P=poly, consider a family fCn gn0 of circuits recognizing A. Then each Cn can be regarded as a code of the set A=n = f x 2 A : jxj = n g. In this case, a circuit evaluator, a machine computing, from Cn and x, the value of Cn on x, is considered as a \decoder". (Note that we may assume that this decoder runs in polynomial time; it is easy to design a circuit evaluator that runs in polynomial time w.r.t. the size of a given circuit.) Similarly, in the case that A is reducible to a tally or sparse set L via some polynomial-time oracle machine M (i.e., A = M (L)), we can regard L and M as a \code" and its \decoder". That is, we would like to consider language coding systems that have polynomial-time decoders, and the above notions provide us with a framework for discussing \descriptional complexity" or \nonuniform complexity" in such language coding systems. Notice that in the above notions, we do not consider \coding complexity", the complexity of computing a code from a given language, while we assumed that \decoding complexity" is polynomial. This is because coding complexity is not essential for discussing descriptional complexity or nonuniform complexity; the main issue there is whether there exists a succinct representation, and we do not need to worry about a way to obtain the representation. In fact, in structural complexity theory, whereas properties of sets like those of being reducible to tally or sparse sets have been studied rather deeply, it seems that researchers have investigated somewhat less, and more implicitly than explicitly, \reverse complexity"2 : the complexity of computing a circuit, tally set, or sparse set representation from a given language.
2 This
term is attributed to Ron Book.

4

Before explaining such investigations, let us de ne some notions for discussing \coding complexity". For a tally set or sparse set representation, coding complexity is just the complexity of that tally or sparse set relative to the original set. For any set A and any relativizable complexity class C , if there is some tally or sparse set L such that A P L and T L 2 C (A), then we may consider that A has a tally or sparse set representation that is C ( )computable (relative to A). In this case, we say that A has a C ()-self-computable3 tally or 4 sparse set (representation ). On the other hand, we say that A has C ()-self-computable A n circuits if there is some oracle C -transducer M such that M (0 ) computes Cn for every n  0. (This is a generalization of the notion \self-producible circuits" introduced in [Ko85], which is, in our notation, equivalent to P()-self-computable circuits.) Lowness One of the important questions in structural complexity theory is to relate nonuniform complexity to the standard computational complexity. Research studying this question was started from the fundamental paper [KL80] of Karp and Lipton, where they proved some uniform complexity consequences from nonuniform upper bounds. The following theorem stands out among them. [KL80] If NP  P=poly, then the polynomial-time hierarchy collapses P \ 5P . to 62 2 This line of research is closely related to our coding complexity. In fact, though it was quite implicit in [KL80], the key for proving the above theorem is the fact that there is some upper bound on coding complexity5. This upper bound can be stated as follows. (The proof of the following theorem is not explicitly stated in the literature, but it is derived from the proof of Theorem 3.4 and, more directly, from the proofs in [KS85].) Every NP-complete set A in P=poly has a tally set representation that is computable in 1P. Similarly, it has polynomial-size circuits computable in 1P. 3 3 In general, the theorem holds for any \self-reducible" set A in P=poly (see, e.g., [BDG88, KS85, Sch86]). Due to the self-reducible property of A, it is not necessary to use A for computing its tally set (resp., circuit) representation. Thus, the above class 1P is the absolute one, and is not the relativized class 1P(A). 3 3
Theorem 3.4. Theorem 3.5. Remark.

class C .

3 The term

\self" is used because if A P L and L 2 C (A), then L 2 C (L) for any reasonable complexity T

we use \computable" for both sets and circuits, the notion of \computable" di ers slightly in each context. By it means \recognizable" for tally or sparse sets, and \producible" for circuits. speaking, for showing the theorem, it is enough to consider the problem of checking whether a given circuit is correct instead of producing a circuit.
5 Precisely

4 Although

5

By Theorem 3.4, Karp and Lipton demonstrated some sort of qualitative upper bound for self-reducible sets in P=poly. In [Sch83], Schning introduced the notion of \lowness o " into structural complexity theory; this notion provides a formal way to discuss such qualitative upper bounds. Intuitively speaking, a set is low it has low complexity when used as an oracle set. More speci cally, Schning de ned the following lowness notion for NP sets. A set A 2 NP is o P -low if 6P (A) = 6P . That is, the oracle information in A is useless in 6P -computation. 6k k k k (The notion of 1P-lowness is de ned similarly by using 1P classes instead of 6P classes.) k k k Notice that for any NP-complete set, e.g., SAT, 6P (SAT) = 6P+1; hence, SAT cannot be k k 6P-low unless the polynomial-hierarchy collapses to the 6P level. In other words, the role k k of SAT in 6P(SAT) cannot be replaced by any 6P-low set unless the polynomial-hierarchy k k collapses to 6P. That is, the lowness level of a set indicates how much it is away from k the NP-completeness. Since the lowness notion was introduced into complexity theory [Sch83], the lowness of various sets have been investigated. Also the lowness notion itself has been generalized in several ways. (Here we can explain only limited number of results. Thus, the reader is recommended to refer a survey paper [Kob95] on the study of low sets.) By looking through these results, one can see the close relation between showing lowness properties and obtaining succinct representations. For example, as shown above, every NP-complete set A has polynomial-size circuits that are computable in 1P. Then we can show that 3 A is 1P -low6 . That is, 1P (A) = 1P , because any 1P (A)-computation can be simulated 3 3 3 3 by some 1P-computation in the following way: First construct a circuit representation 3 of A (up to necessary length). Then using the obtained circuit in place of A, simulate the 1P(A)-computation without using oracle A. Clearly this computation can be done in 3 1P. 3 Since Schning's lowness is de ned for NP sets, coding complexity is discussed in o terms of nonrelativised complexity classes. Nevertheless, some idea for computing circuit representations, e.g., the one stated in [KS85], can be used for any other sets. Furthermore, Balczar, Book, and Schning introduced the notion of \extended lowness" in order to a o discuss lowness properties of sets outside of NP, and for studying this lowness, we now need to consider relativized complexity classes. For example, the following upper bound is immediate from the arguments in [BBS86, KS85]. Every set A in P=poly has a 1P()-self computable tally set. Similarly, it 3 has 1P()-self computable circuits. 3 As we will see later, this upper bound was improved much later from results in computational learning theory.
Theorem 3.6.

For showing lowness properties, it is enough that we can guess a circuit and check its correctness, and this task can be done in 6P, which is lower than 1P. From this fact, a stronger lowness property is 2 3 provable; i.e., A is 6P-low. 2
6

6

Tally and Sparse Degrees
Now let us move to the second line of research relating to coding complexity. It is about \equivalence classes" introduced by Book and Tang. They correspond to the degrees induced by polynomial-time reducibilities. As shown in Proposition 3.3, from the point of view of descriptional complexity, and modulo polynomial-time computations, there is essentially no di erence between three coding systems: circuits, tally sets, and sparse sets. Also there is no di erence between nonadaptive and adaptive way to access information. It is, however, reasonable to expect some sort of di erence between tally and sparse set representations. This and related issues were discussed at length at a workshop that Ron Book organized at UCSB by 1985, as well as the connections with Kolmogorov complexity and lowness. Shortly after, lowness of certain Kolmogorov-easy sets was established by Balczar a and Book, together with the rst rudiments of the result eventually due to Allender and Rubinstein on the isomorphism degrees of tally sets described below. After, the rst author visited Ker-I Ko, then at Houston7 , where three things happened that are still in his memory: the very warm and friendly hospitality of Ker-I and his family, the pain of a wisdom-tooth, and the sudden realization (at the time of going to bed) of the fact that circuit and tally set representations were so closely related. The precise statement of this last intuition was worked out as follows: [BB86] A set A has P( )-self-computable circuits if and only if there exists a tally set T that is polynomial-time Turing equivalent to A, i.e., A P T and T T P A. T
Theorem 3.7.

Thus, for the topics we discuss here, from now on we focus on the di erences between tally and sparse sets, and between nonadaptive and adaptive access. For clarifying such di erence, Book and Tang considered coding complexity, the complexity of obtaining tally or sparse set descriptions from a given P=poly language. The motivation was in the investigation of nonuniform complexity classes, subclasses of P=poly, in the light of the characterization given above of P( )-self-computable circuits. In [BB86], Balczar and Book studied the classi cation of P=poly, and obtained, among a others, the following result. [BB86] There is a set S in P=poly (in fact, a sparse set) for which there is no tally set T that is equivalent to A, i.e., S P T and T P S . Thus, S has no T T P( )-self-computable tally set.
Theorem 3.8.

That is, as a coding language, tally sets have much simpler structure than sparse sets, and thus it is hard to compute a tally set representation. Motivated by this fact, Book and Tang considered \reverse reducibility" (i.e., coding complexity in our terminology),
7

And his good friend Juan Romo, then at A&M at College Station.
7

and proposed the following concept of \equivalence classes" for studying the di erence between tally and sparse sets, and between various reduction types. EP (SPARSE)) is the class of languages that are equivalent to some tally set (resp., sparse r set) via some reduction of type r . By using these classes, they showed the following di erence.

De nition 3.9. [TB88] For any polynomial-time reduction type r , EP (TALLY) (resp., r

Theorem 3.10. [TB88]

 (1) EP (TALLY) 6= EP (SPARSE). T T  (2) EP (TALLY) 6= EP (SPARSE). tt tt  (3) EP (TALLY) 6= EP (TALLY). tt T

The di erence between EP (SPARSE) and EP (SPARSE) classes was again shown much tt T later when coding complexity was studied more explicitly, as described below.

Isomorphism Degrees
Book and Tang and their followers obtained many other classi cation results, just as the work of Book and Ko studied many other reduction classes. See, e.g., [AH92, AHOW91, AW90, TB88, TB91] for these results. We omit most of them here, but we will brie y mention what happens at the other end of the scale: the strongest degrees, de ned by polynomial-time isomorphisms, applied to tally sets. Indeed, tally strings are the most natural examples of words of low Kolmogorov complexity (speci cally, logarithmic), shortly followed by their images under string homomorphisms (such as 0101010101010101). Conversely, then, sets consisting only of words of logarithmic Kolmogorov complexity are the most natural generalization of tally sets, and easily one wonders which of the complexity-theoretic properties of tally sets are preserved under this generalization. The answer is: essentially, all of them. Building on a technically easier similar result in terms of semi-isomorphisms given in [BB86], Allender and Rubinstein found an interesting detour through the notion of rankability [GS91] and the intermediate step of the tally sets in P, which allowed them to prove:

Theorem 3.11. [AR88] A set contains only words of logarithmic, polynomial timebounded, Kolmogorov complexity if and only if it is polynomially isomorphic to a tally set.
This means that these sets have descriptions that are, on the one hand, strictly restricted regarding syntax, since only tally words can be used, but on the other hand are just a renaming of the encoded set, preserving, thus, all properties that are invariant 8

under polynomial-time isomorphism, including all their behaviors under polynomial-time reducibilities. Between isomorphism and Turing equivalence, a rich structure of reduction concepts exists, and each reduction provides a degree structure that we should consider (from the point of view of coding complexity) restricted now to tally sets. It is easy to see that the isomorphism degree of a tally set T di ers from its mequivalence degree, and we just mentioned the situation at the other end, near Turing equivalence. For many intermediate, ner reducibilities, The distinctness of the respective degrees is in fact a very deep question, equivalent or closely related to several other problems concerning the structure of polynomial and exponential-time classes. Speci cally, consider the following purely complexity-theoretic working hypothesis: Accepting computations for nondeterministic exponential time machines can be constructed deterministically in exponential time. This has been called sometimes \hypothesis Q", and has been studied in depth in [AW90]. It is obviously stronger than the equality of deterministic and nondeterministic exponential time, but weaker than P = NP. More precisely, it is an intermediate step [IT89] of the Sewelson's conjecture; that is, E = NE implies Q, and Q implies E = ENP , and Sewelson conjectured that E = NE implies E = ENP . Besides these, Q is related to interesting questions. For instance, certain one-way functions exist if and only if \Q" fails to hold. A connection with coding complexity is as follows: it turns out to be true if and only if all the many-one and bounded-truth-table degrees of tally sets coincide, and false if and only if all of them di er [AW90].
Logarithmic-Size Descriptions and Kolmogorov Complexity

Most of the results we have indicated, and many of the other related results from the same references that we have omitted, can be translated to the much more stringent condition on descriptions, namely logarithmic size. However, not all results are the strict analogues of those in the polynomial case, and technically there are a number of di erences in the argumentations. A rather complete account can be found in the Ph. D. dissertation of Hermo [Her96]; here we only highlight some aspects. The very concept of P=log, the logarithmic analog of P=poly, is somewhat unclear. Two alternative approaches, that are equivalent for polynomial bounds, are not anymore so: do we want a description for A=n for each individual length n, as in [KL80], or a description for An ? The rst one is more natural in terms of circuits but the resulting class is not closed under reductions (nor even isomorphisms), and thus is not amenable of an interesting complexity-theoretic analysis. Thus, the second one, which was introduced by Ko [Ko87], has to be chosen, and the circuit model adapted accordingly; the circuit expressions introduced in [WG94] turn out to be an adequate concept for this end. A second problem is that logarithmic size circuits are nonsense, because there is no way of devoting gates to more than a ridiculous fraction of the input. In [KL80] the proposed solution was to use circuit with easy descriptions, i.e. a sort of \description of 9

the description" of the original set; their formalization, however, was eventually found to correspond to a class di erent from P=log [HM94]. An alternative formulation from [HM94], based on Kolmogorov complexity, is successful in formalizing the intuition of [KL80], though, and works as well for circuit expressions to characterize the variant of Ko. A curious, paradoxical-looking situation arises from the comparison of these logarithmically easy circuit expressions with similarly bounded circuits. Since each circuit is a circuit expression, one would expect that the class de ned by Kolmogorov-easy circuit expressions be larger than that de ned by Kolmogorov-easy circuits; but it turns out to be properly smaller! The explanation is that the condition on the Kolmogorov complexity bounds circuit expressions more stringently than circuits, due to the fact that circuit expressions are requested to work for sets of the form An whereas circuits are already satisfactory if they work for A=n . See [BBH95] or [Her96] for details. This brand of logarithmic advice has also characterizations in terms of \doubly tally" (here denoted tally2) sets: tally sets whose words are all of length a power of 2. Both in these characterizations and in the one with circuit expressions, there is a marked di erence with the proofs in the polynomial case: the argumentation line is nontrivial after comparison. The main technical ingredient is a way of choosing a subsequence of objects from a given sequence (e.g. a sequence of circuit expressions), properly spaced out (usually by a double exponential), and to prove that the subsequence is short enough but still has the same descriptional power as the original sequence. This gives rise to the study of reduction and equivalence classes of tally2 sets, as well as their closure under isomorphism as in the previous section: yet one more instance of coding complexity; they present many characteristics shared with the polynomial bound case, and some peculiarities of their own. And actually a couple of open problems remain in the comparisons among all those reduction and equivalence classes.

4

Coding Complexity and Computational Learning

Though rather implicitly, coding complexity has been investigated in these two lines of research. Thus, when coding complexity was studied explicitly in relation to query learning, there was a rather rm intuitive background on which formal de nitions and proofs could stand. In order to analyze query learnability in structural complexity theory, Watanabe [Wat90] proposed a framework, which was later amended and extended by Watanabe and Gavald [Wat94, WG94]. They attempted to (i) characterize the power of query learna ing systems in terms of relativized complexity classes, and (ii) analyze the complexity of computing a description of a given target language, and thereby demonstrating nonlearnability of some concept classes. Here we omit explaining notions used in query learning and the Watanabe-Gavald framework. Also we state only one result from [WG94] a and omit others. See [Gav94, WG94] for the explanation and more results; we just hope 10

our small probe is illustrative enough. [WG94] (1) If CIR, the class of concepts represented by circuits, is polynomial-time query learnable, then every set A 2 P=poly has 1P( )-self-computable circuits. 2 (2) For some subclass of CIR, e.g., REPCIR, the converse relation also holds. That is, REPCIR is polynomial-time query learnable with subset and superset queries if and only if every set A 2 P=poly has 1P( )-self-computable circuits. 2 Thus, our coding complexity is quite important here; in particular, its lower bound is important for showing non-learnability results.
Theorem 4.1.

Theorem 4.5.

Motivated by all this, Gavald and Watanabe investigated a lower bound on coding coma plexity. Notice that the result of Balczar and Book (i.e., Theorem 3.8) gives one lower a bound. Gavald and Watanabe [GW93] improved it as follows. a [GW93] There is a set A in P=poly that has no (NP \ co-NP)()-selfcomputable tally set. Thus, A has no NPSV()-self-computable circuits. NPSV is the class of functions that are computable by a polynomial-time nondeterministic single valued transducer, which can be regarded as a function version of the class NP \ co-NP. By using the proof technique introduced for proving this theorem, the following separation, which was left open in [TB88], was also proved.  [GW93] EP (SPARSE) 6= EP (SPARSE). T tt The obtained lower bound NP \ co-NP is far from our goal 1P. On the other hand, 2 there is some evidence that this goal is quite dicult to achieve. After proving the NP \ co-NP lower bound, Gavald tried to improve known upper a P (see Theorem 3.6). By using a \majority vote strategy", he could obtain bound, i.e., 13 the following incomparable bound. (Here and after, we will state results only in terms of circuits. But the same or very similar results hold for tally sets.) [Gav95] Every set in P=poly has P(NP() 8 6P)-self-computable circuits. 3 P In P(NP() 8 63 ), a relativized part is only P(NP()) part, and 6P is an 3 absolute class. This was improved further by Kbler in the following way. o [Kob94] Every set in P=poly has P(NP() 8 6P)-self-computable circuits. 2
Theorem 4.2. Remark. Theorem 4.3. Theorem 4.4. Remark.

Upper and Lower Bounds on Coding Complexity

11

It follows from Theorem 4.4 (and of course, Theorem 4.5) that if NP = co-NP (and thus 6 = NP), then every set in P poly has 1 ()-self-computable circuits. Thus, our goal (i.e., the 1 lower bound) implies NP 6= co-NP. Therefore, it seems very hard to prove the 1 lower bound from no assumption. Here the problem of interest is to prove this goal from some reasonable assumption.
P 3 = P 2 P 2 P 2

The NP \ co-NP lower bound in Theorem 4.2 is a slight improvement over the longknown P lower bound. We present now another improvement that, as far as we know, was previously unpublished, namely, a BPP lower bound. The proof is based on a an easy but useful-looking lemma about Kolmogorov complexity taken from [Gav92]; we are not aware of this lemma having been proved in the literature. Informally speaking, we de ne a popular string to be one that shows up very frequently as the output of some recursive function. The lemma quanti es the intuition that all popular strings must have low Kolmogorov complexity. Lemma 4.6. (Popularity Lemma) Let : f0 1g3 ! f0 1g3 and : IN ! IN be total recursive functions. De ne the popularity of (w.r.t. and ) as Pop ( ) =j f j ( ) = ^ j j  (j j) g j Then, for every , K( )  (j j) 0 log Pop ( ) + (log(j j + (j j))) Proof. Notice rst that, on the hypothesis that and are total and recursive, the popularity function is also total and recursive. Fix a length and consider the list , , ..., n consisting of all words of length sorted in order of decreasing popularity; that is, Pop ( )  Pop ( )  ...  Pop ( n ). Ties are solved, say, by lexicographical order. Let be a program that, on input h i, computes the popularity of every string of length , sorts these strings according to their popularity and prints the th word in the resulting list. Then, our universal machine on input h i outputs precisely . We now estimate jh ij. Since every string of length  ( ) adds at most to the popularity of one string, we have for every X 2  Pop ( )  1 Pop ( )
f ; ; g x f g
f;g

A BPP Lower Bound

x

y

f y

x

y

g

x

:

x

x

g

x

f ;g

x

O

x

g

x

:

f

g

n

x1

x2

x2

n

f ;g

x1

f ;g

x2

f;g

x2 p

n; i

n

i

p; n; i

xi

p; n; i

g n

i

g (n )+1

j

i

f;g

xj

i

f;g

xi :

Hence,  2 K( ) by
i xi

g (n)+1

=

Pop ( ), so j j  ( ) + 1 0 log Pop ( ). Therefore, we can bound
f ;g

xi

i

g n

f ;g

xi

12

ij  j j + j j + j j + (log(j j + j j)) = (1) + log + ( ) 0 log Pop ( ) + (log( = ( ) 0 log Pop ( ) + (log( + ( ))) This proves the lemma. t u
K(x )
i

 jh

p; n; i

p

n

i

O

n

i

O

n

g n

f;g

xi

O

n

+ g (n)))

g n

f ;g

xi

O

n

g n

:

It is easy to give a version of this lemma for resource-bounded Kolmogorov complexity. Simply take into account the resources needed to compute f and g , and estimate those used by program p in the proof. The Popularity Lemma can be applied to fool BPP-type oracle computations in the following way: suppose that membership of a particular string in the oracle is \important" to determine whether a BPP machine accepts or rejects. Taking the string in or out of the oracle causes a large fraction of the randomized computations to change their result. Therefore, the string has to be queried in a large fraction of the computations. This causes the string to be very popular, so by the lemma it must have low Kolmogorov complexity. As there are few Kolmogorov-easy strings, there can be only few such \important" strings. This guarantees ample room to diagonalize.
Theorem 4.7.

tally set T tally set.
Proof.

There is a set S in P=poly (in fact, a sparse set) for which there is no such that S P T and T 2 BPP(S ). Thus, S has no BPP( )-self-computable T
n

Choose a sequence n1, n2, . . . , n , . . . of natural numbers such that n +1  2 i . S For each n , let x be a Kolmogorov-random string of length n . De ne S = f x g. Now assume that there is a tally set T such that S P T and T 2 BPP(S ), and let T machines M and M , respectively, witness these facts. Assume without loss of generality that both M and M run in time exactly p(n). and that M accepts in a BPP way with threshold 3=4. From these assumptions, we will derive a contradiction with the fact that strings x are random. Fix a large n and let S 0 be the nite oracle S  i01 . We show that string x is queried not too often by M (0 ). This is equivalent to showing that it does not appear too often as the result of the following function f (hj; k; yi): \the j th query of M with input 0 and oracle S 0 using random bits y ". The function is de ned only for those hj; k ; yi such that j  p(k), jyj  p(k), k  p(n ). For all these and some constant c, jhj; k; yij  p(k) + c log p(k)). Taking g (n) = p(n) + c log p(n), by the Popularity Lemma, we have
i i i i i i i a b a b b i i n i S b k b k i

log Pop (x )
f;g i

 

p k

( ) 0 K(x ) + O (log p(k )) p (k ) 0 n + O (log k )  p(k ) 0 3
i i

13

because k  p(n ). This means that Pop (x )  2 ( ) =8, or, in other words, at most a fraction 1=8 of the 2 ( ) computations of M (0 ) query x . Note that only this small fraction of computations may change when M (0 ) uses oracle S instead of S : S and S di er only on the string x and on strings too long to be reached by M (0 ) (if n is large enough). Therefore, as a probabilistic machine, M (0 ) gives the same result with both oracles S and S . This gives a recursive way of reconstructing T up to length p(n ), given only n , M , and the list of words in S : it is enough to run M (0 ) for every k  p(n ). Once this part of T is obtained, the reduction given by M can be used to nd the word x , by exhaustive search. All in all, we have a description of x in terms of n , M , M , and S . The bit-length of this description is much less than n for n suciently large. This contradicts the randomness of x . t u
i f ;g i p k p k S
0

k

b

i

k

b

0

0

i

b

k

i

b

k

0

0

i

i

b

S

0

k

b

i

a

i

i

i

a

b

0

i

i

i

As we will see next, it is not possible to extend this BPP lower bound to the presence of an NP oracle.
The Interplay with Learning Theory

Recently, yet further improvement has been obtained; this time from computational learning theory. Bshouty, Cleve, Kannan, and Tamon [BCKT94] studied query learnability of CIR when learners are provided some additional computational power, such as access to an NP oracle. By using a clever implementation of \majority vote strategy", they showed that CIR is randomized polynomial-time query learnable by using equivalence queries and also using some NP oracle. This result on the complexity of learning CIR is interesting for two main reasons. First, circuits are powerful enough to encode eciently any representation of boolean functions that is interesting from the point of view of learning theory. More precisely, equivalence queries (and other types of queries) with any sensible representation scheme can be replaced by queries with boolean circuits. Second, it is known that CIR is not polynomial-time learnable under widely believed cryptographic hypotheses ([Val84], see also [KV94]). The upper bound in [BCKT94] complements, up to some point, this cryptographic lower bound. In our framework, the learning result in [BCKT94] can be interpreted in the following way. [BCKT94] Every set in P=poly has ZPP(NP( ))-self-computable circuits. Remark. ZPP is the class of sets (in this case, functions) that are computable by some randomized machine with no error and within expected polynomial time.
Theorem 4.8.

This result, in turn, gives the following improvements of Theorem 3.5 and Theorem 3.4. [KW95] Every NP-complete set A in P=poly has polynomial-size circuits computable in ZPP(NP).
Theorem 4.9.

14

Theorem 4.10.

to ZPP(NP).

[KW95] If NP  P=poly, then the polynomial-time hierarchy collapses

To nish, let us mention very brie y that also the case of logarithmically long descriptions treated in section 3 has some relationship to learning theory. Consider Kolmogoroveasy circuit expressions that characterize the logarithmic nonuniform class: they are learnable in the presence of an NP oracle, by means of membership queries to the set to be described. Furthermore, the sets whose descriptions in these same terms can be found with queries to the set can be characterized by the polynomial time Turing degrees of tally2 sets (see [BBH95] or [Her96]). And, eventually, it has been proved [BBu96] that Kolmogorov-easy circuit expressions can be learned in polynomial time via membership queries (without the help of any additional oracle) if and only if the purely complexitytheoretic hypothesis Q, described in a previous section, holds: a learning algorithm exists for these expressions if and only if accepting computations for nondeterministic exponential time machines can be constructed deterministically in exponential time. As a by-product of the proof, it is proved there as well that the access to NP can be reduced to nonadaptive in the learning algorithm of [BBH95] if and only if the hypothesis Q is equivalent to the equality E = NE.

Acknowledgments
We would like to thank Ker-I Ko and Ding-Zhu Du for the task of preparing this festschrift volume, and for inviting us to participate in it. Osamu also thanks Josep Daz for inviting  him to Barcelona in 1990 and giving him a chance to start working with excellent researchers of Barcelona on the topics explained in this paper. But everything had started when Ron Book invited Jose and then Osamu to Santa Barbara.

References
[AH92] Allender E, Hemachandra L. Lower bounds for the low hierarchy. Journal ACM 1992;39:234{251.
of the

[AHOW91] Allender E, Hemachandra L, Ogiwara M, Watanabe O. Relating equivalence and reducibility to sparse sets. SIAM Journal of Computing 1992;21:521{539. [AR88] Allender E, Rubinstein R. P-printable sets. 1988;17:1193{1202.
SIAM Journal of Computing

[AW90] Allender E, Watanabe O. Kolmogorov complexity and degrees of tally sets. formation and Computation 1990;86:160{178.

In-

[Ang87] Angluin, D. Leaning regular sets from queries and counterexamples. Information and Computation 1987;75:87{106. 15

[Ang87] Angluin, D. Queries and concept learning. Machine

Learning

1988;2:319{342.
Acta

[BB86] Balczar J, Book R. Sets with small generalized Kolmogorov complexity, a Informatica 1986;23,679{688.

[BBS86] Balczar J, Book R, Schning U. Sparse sets, lowness and highness. SIAM Joura o nal of Computing 1986;15:679{688. [BBu96] Balczar J, Buhrman H. Characterizing the learnability of Kolmogorov-easy cira cuit expressions. In preparation. Preliminary version as report LSI{95{59{R. [BBH95] Balczar J, Buhrman H, Hermo M. Learnability of Kolmogorov-easy circuit a expressions. Proc. Second European Conference on Computational Learning Theory, Lecture Notes in Computer Science, Springer-Verlag 1995;904:112{124. [BDG88] Balczar J, Daz J, Gabarr J. Structural a  o
Complexity I

, Springer-Verlag 1988.

[BH77] Berman L, Hartmanis J. On isomorphisms and density of NP and other complete sets. SIAM Journal of Computing 1977;6:305{322. [BEHW87] Blumer A, Ehrenfeucht A, Haussler D, Warmuth M. Occam's razor. Information Processing Letters 1987;24:377{380. [BP90] Board R, Pitt L. On the necessity of Occam algorithms. posium on Theory of Computing , ACM Press 1990:54{63.
Proc. 22nd ACM Sym-

[BK88] Book R, Ko K. On sets truth-table reducible to sparse sets. Computing 1988;17:903{919.

SIAM Journal of

[BW93] Book R, Watanabe O. A view of structural complexity theory. Current Trends in Theoretical Computer Science (G. Rozenburg and A. Salomaa Eds.), World Scienti c 1993:451{468. [BCKT94] Bshouty N, Cleve R, Kannan S, Tamon C. Oracles and queries that are suf cient for exact learning. Proc. 7th ACM Conference on Computational Learning Theory ACM Press 1994:130{139. See also the journal version [BCGKT95]. [BCGKT95] Bshouty N, Cleve R, Gavald R, Kannan S, Tamon C. Oracles and queries a that are sucient for exact learning. Journal of Computer and System Sciences 1996;52:421{433. [Gav92] Gavald R. a
plexity Theory. Kolmogorov Randomness and its Applications to Structural Com-

Doctoral dissertation, LSI Department, Universitat Politcnica de e Catalunya, Barcelona, april 1992.

[Gav94] Gavald R. The complexity of learning with queries. Proc. 9th Structure in Coma plexity Theory Conference , IEEE 1994;324{337. 16

[Gav95] Gavald R. Bounding the complexity of advice functions. Journal of Computer a and System Sciences 1995;50:468{475. Preliminary version in Proc. 7th Structure in Complexity Theory Conference, IEEE 1992;249{245. [GW93] Gavald R, Watanabe O. On the computational complexity of small descriptions. a SIAM Journal of Computing 1993;22:1257{1275. [GS91] Goldberg A, Sipser M. Compression and ranking. 1991;20:524{536.
SIAM Journal of Computing

[Her96] Hermo M. Nonuniform complexity classes with sub-linear advice functions. Doctoral Dissertation, Universidad del Pa Vasco, Donostia, march 1996. s [HM94] Hermo M, Mayordomo E. A note on polynomial-size circuits with low resourcebounded Kolmogorov complexity. Mathematical Systems Theory 1994;27:347{356. [IT89] Impagliazzo R, Tardos G. Decision versus search problems in super-polynomial time. Proc. 30th IEEE Symposium on Foundations of Computer Science, IEEE 1989;222{227. [Joh90] Johnson D. A catalog of complexity classes. Handbook Science (Van Leeuwen, Ed.), Elsevier 1990:67{161. [Ko87] Ko K. On helping by robust oracle machines. 1987;52;15{36.
of Theoretical Computer

Theoretical Computer Science

[KL80] Karp R, Lipton R. Some connections between nonuniform and uniform complexity classes. Proc. 12th ACM Symposium on Theory of Computing, ACM Press 1980:302{ 309. [KV94] Kearns M, Valiant L. Cryptographic limitations on learning boolean formulae and nite automata. Journal of the ACM 1994;41:67{95. [Ko85] Ko K. Continuous optimization problems and a polynomial hierarchy of real functions. Journal of Complexity 1985;1:210{231. [KS85] Ko K, Schning U. On circuit-size complexity and the low hierarchy. SIAM Journal o of Computing 1985;14:41{51. [Kob94] Kbler J. Locating P/poly optimally in the extended low hierarchy. Theoretical o Computer Science 1994;134:263{285. [Kob95] Kbler J. On the structure of low sets. Proc. 10th Structure in Complexity Theory o Conference, IEEE 1995;246{261.

17

[KW95] Kbler J, Watanabe O. New collapse consequences of NP having small circuits. o Proc. 22nd International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, Springer-Verlag 1995;944:196{207. [LV93] Li M, Vitnyi P. An Introduction to Kolmogorov Complexiy and Its Applications. a Texts and Monographs in Computer Science. Springer-Verlag, 1993. [PV88] Pitt L, Valiant L. Computational limitations on learning from examples. Journal of the ACM 1988;35:965{984. [Sav72] Savage J. Computational work and time on nite machines. Journal of the ACM 1972;19:660{674. [Sch83] Schning U. A low and a high hierarchy in NP. Journal on Computer and System o 1983;27:14{28. [Sch86] Schning U. Complexity and Structure, Lecture Notes in Computer Science, o Springer-Verlag 1986;211. [Pap94] Papadimitriou C. Computational Complexity, Addison-Wesley 1994. [TB88] Tang S, Book R. Separating polynomial-time Turing and truth-table reductions by tally sets. Proc. 15th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, Springer-Verlag 1988;317:591{599. [TB91] Tang S, Book R. Reducibilities on tally and sparse sets. RAIRO Informatique Thorique et Applications 1991;25:293{302. (This is an extended version of [TB88].) e [Val84] Valiant L. A theory of the learnable. Communication of the ACM 1984;27:1134{ 1142. [Wat90] Watanabe O. A formal study of learning via queries. Proc. 17th International Colloquium on Automata, Languages and Programming, Lecture Notes in Computer Science, Springer-Verlag 1990;443:139{152. [Wat94] Watanabe O: A framework for polynomial time query learnability. Mathematical Systems Theory 1994;27:211{229. [WG94] Watanabe O, Gavald R. Structural analysis of polynomial time query learnabila ity. Mathematical Systems Theory 1994;27:231{256.

18

Ïà¹ØÎÄÕÂ:
...Schemes on the Computational Complexity of Genet....pdf
The Influence of Different Coding Schemes on the Computational Complexity of ...Traditionally, GAs work on bit strings of xed total length l. Signi cant...
On the complexity of game isomorphism.pdf
We show that the computational complexity of the game isomorphism problem depends on the level of succinctness of the description of the input games but ...
The computational complexity of decentralized discr....pdf
The computational complexity of decentralized discrete-event control problems_×¨Òµ×ÊÁÏ¡£Computational complexity results are obtained for decentralized discrete-event...
Good policies for partially-observable Markov decis....pdf
We analyze the computational complexity of all of...Using succinct descriptions, the completeness ...representations and get intermediate complexity ...
¿ÆÑ§ÎÄÏ×.pdf
ABSTRACT The computational complexity of nite ...complexity classes; classes range from non...succinct descriptions 19, 37] for POMDPs are ...
Computational Complexity and Lexical.Functional Gra....pdf
Computational Complexity and Lexical.Functional ...Utilising the Grammar Coding System of LDOCE Bogura...Descriptions and an Analysis of its Complexity ...
The computational complexity of probabilistic plann....pdf
The computational complexity of probabilistic planning We examine the computational...Theoretische Informatik, Universitat Trier D-54286 Trier, GERMANY Michael L....
The computational complexity of some problems of li....pdf
The computational complexity of some problems of linear algebra Reproduction of...Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, CANADA...
Reducing the Computational Complexity of a MAP Post....pdf
Reducing the Computational Complexity of a MAP ...Robertson and Robert L. Stevenson Department of ...of using this coding method (at least in terms...
The complexity of plan existence and evaluation in ....pdf
when it is expressed in a succinct representation...the computational complexity of determining whether ...descriptions of N and x, produce a description ...
Descriptive and computational complexity.pdf
Descriptive and computational complexity_×¨Òµ×ÊÁÏ¡£...For a long time, the notion of complexity ...coding a rst-order structure A of vocabulary = ...
On the computational complexity of sequence design ....pdf
On the computational complexity of sequence design problems_×¨Òµ×ÊÁÏ¡£Inverse ...Let L be a 2D or 3D cubic lattice and let G = (V; E ) be a ...
Remarks on the computational complexity of small un....pdf
Remarks on the computational complexity of small universal Turing machines_×¨Òµ×ÊÁÏ¡£This paper surveys some topics in the area of small universal Turing ...
...into the Computational Complexity of Planning in....pdf
An Investigation into the Computational Complexity of Planning in Non...and complexity for planning with rst order state and operator descriptions. ...
Computational complexity of simultaneous elementary....pdf
BP 239, 54506 Vand uvre-l s-Nancy, France. ...We study the computational complexity of ...Complexity A signature is a countable set of ...
Computational Complexity of Multi-way, DataAEow Con....pdf
Some of the local propagation algorithms that Computational Complexity of Multi-way, Data ow Constraint Problems Gilles Trombettoni, Bertrand Neveu Rapport de...
P. Heggernes y The Computational Complexity of the ....pdf
P. Heggernes y The Computational Complexity of ...The lled graph G = (V E ) is obtained by ...descriptions of several MD algorithms all these ...
On the Size of Logical Descriptions.pdf
as succinct as p On the Size of Logical Descriptions J. W. Degen ...then the L-atomic complexity of the structure A is also denoted by L ?...
Computational complexity in two-level morphology.pdf
complexity; a merged dictionary with bit-vectors ...Instead, the external descriptions are converted to...(1985). "The Computational Complexity of Two-...
Journal of Automata, Languages and Combinatorics u ....pdf
succinct descriptions of some context-free languages...grammar, intersection, descriptional complexity, ...its descriptional and computational complexity. Section...