Randomized Algorithms (Spring 2010)/Complexity classes and lower bounds

From TCS Wiki
Revision as of 15:38, 8 January 2010 by imported>WikiSysop

Computational Models

Upper bounds, lower bounds

Bounds are just inequalities. An inequality

[math]\displaystyle{ A\le B }[/math]

is read "[math]\displaystyle{ A }[/math] is a lower bound of [math]\displaystyle{ B }[/math]" or equivalently "[math]\displaystyle{ B }[/math] is an upper bound of [math]\displaystyle{ A }[/math]".

In Computer Science, when talking about upper or lower bounds, people really mean the upper or lower bounds of complexities.

Complexity is measured by the resources consumed by the computation. Our most precious resource is time (life is short!). Besides time complexity, there are other measures of complexity we may care about in different situations, including:

  • space;
  • communication costs;
  • number of random bits;
  • number of queries to the input;
  • amount of information provided by an oracle.

There are two fundamental ways of measuring complexity:

Complexity of algorithms
For an algorithm [math]\displaystyle{ A }[/math], its complexity represents its performance. Let [math]\displaystyle{ T(A,x) }[/math] be the running time of the algorithm [math]\displaystyle{ A }[/math] on input [math]\displaystyle{ x }[/math]. The worst case time complexity is given by [math]\displaystyle{ T(A)=\max_{x}T(A,x) }[/math], where the maximum is taken over the domain of inputs of a fixed length [math]\displaystyle{ n }[/math], and [math]\displaystyle{ T(A) }[/math] is written as a function of [math]\displaystyle{ n }[/math].
Complexity of problems
A computational problem can be formalized as a mapping from inputs to outputs. For a problem [math]\displaystyle{ f }[/math], its time complexity [math]\displaystyle{ T(f) }[/math] is defined as the [math]\displaystyle{ T(A) }[/math] for the best possible algorithm [math]\displaystyle{ A }[/math] that computes [math]\displaystyle{ f }[/math]. That is [math]\displaystyle{ T(f)=\min\{T(A)\mid A\mbox{ computes }f\} }[/math]. Again, [math]\displaystyle{ T(f) }[/math] is represented as a function of the length of the input [math]\displaystyle{ n }[/math].
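The worst-case measure [math]\displaystyle{ T(A)=\max_{x}T(A,x) }[/math] can be made concrete with a small sketch (not from the notes; the toy algorithm and step count are illustrative assumptions). Here a linear scan is run on every binary input of a fixed length [math]\displaystyle{ n }[/math], and the maximum step count over all inputs is taken:

```python
from itertools import product

def linear_search(x, target=1):
    """Toy algorithm A: scan x left to right for `target`; return steps taken.
    This is a stand-in for T(A, x), counting one step per position examined."""
    steps = 0
    for bit in x:
        steps += 1
        if bit == target:
            break
    return steps

def worst_case_steps(n):
    """T(A) = max over all binary inputs x of length n of T(A, x)."""
    return max(linear_search(x) for x in product([0, 1], repeat=n))

print(worst_case_steps(4))  # the all-zeros input forces a full scan: 4
```

The all-zeros input is the worst case here, so [math]\displaystyle{ T(A)=n }[/math] for this toy algorithm.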

The complexity of an algorithm tells how good a solution is, while the complexity of a problem tells how hard the problem is. The former is what we care about most in practice; the latter concerns the fundamental limits of computation.

In Theoretical Computer Science, when talking about upper or lower bounds, people usually refer to the bounds of the complexity of a problem, rather than that of an algorithm.

Specifically, an upper bound (of the time complexity [math]\displaystyle{ T(f) }[/math] of the problem [math]\displaystyle{ f }[/math]) is an inequality of the form [math]\displaystyle{ T(f)\le t }[/math], which, expanded, means

"there exists an algorithm [math]\displaystyle{ A }[/math] that computes [math]\displaystyle{ f }[/math], such that [math]\displaystyle{ T(A)\le t }[/math]".

And a lower bound [math]\displaystyle{ T(f)\ge t }[/math] means

"for any algorithm [math]\displaystyle{ A }[/math] that computes [math]\displaystyle{ f }[/math], its time complexity satisfies [math]\displaystyle{ T(A)\ge t }[/math]".

Sometimes, the two terms "upper bound" and "lower bound" are abused so that an algorithm is called an upper bound and any bad news (impossibility results) can be called a lower bound.

Today's lecture is devoted to lower bounds, i.e. the necessary prices that have to be paid by any algorithm that solves the given problem. Speaking of necessary prices, we have to be specific about the model: the mathematical rules which describe what problems are and what algorithms are.

Decision problems

Turing Machine

Complexity Classes

Problems are organized into classes according to their complexity in respective computational models. There are nearly 500 classes collected by the Complexity Zoo.

P, NP

A deterministic algorithm [math]\displaystyle{ A }[/math] is polynomial time if its running time is within a polynomial of [math]\displaystyle{ n }[/math] on any input of length [math]\displaystyle{ n }[/math].

We first define the P class.

Definition 1 The class P is the class of decision problems that can be computed by polynomial time algorithms.

Then we introduce the infamous NP class.

Definition 2 The class NP consists of all decision problems [math]\displaystyle{ f }[/math] that have a polynomial time algorithm [math]\displaystyle{ A }[/math] such that for any input [math]\displaystyle{ x }[/math],

[math]\displaystyle{ f(x)=1 }[/math] if and only if [math]\displaystyle{ \exists y }[/math], [math]\displaystyle{ A(x,y)=1 }[/math], where the size of [math]\displaystyle{ y }[/math] is within a polynomial of the size of [math]\displaystyle{ x }[/math].

Informally, NP is the class of decision problems whose "yes" answers can be verified in polynomial time. The string [math]\displaystyle{ y }[/math] in the definition is called a certificate or a witness. The algorithm [math]\displaystyle{ A }[/math] verifies (instead of computing) the positive result of [math]\displaystyle{ f(x) }[/math], given a polynomial-size certificate.
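As a concrete illustration (not from the notes), the classic NP problem SAT fits this definition: [math]\displaystyle{ x }[/math] is a CNF formula and the certificate [math]\displaystyle{ y }[/math] is a truth assignment. A sketch of the verifier [math]\displaystyle{ A(x,y) }[/math], with a hypothetical clause encoding (integer [math]\displaystyle{ k }[/math] means variable [math]\displaystyle{ k }[/math], [math]\displaystyle{ -k }[/math] its negation):

```python
def verify_sat(clauses, assignment):
    """A(x, y): runs in time polynomial in |x|; accepts iff the assignment y
    satisfies every clause of the CNF formula x."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

# f(x) = 1 iff there exists a certificate y with A(x, y) = 1.
x = [[1, -2], [2, 3]]                                # (v1 or not v2) and (v2 or v3)
print(verify_sat(x, {1: True, 2: False, 3: True}))   # a satisfying assignment
print(verify_sat(x, {1: False, 2: True, 3: False}))  # not satisfying
```

Note that the verifier is fast; what is (presumably) hard is finding a certificate among the exponentially many candidates.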

This definition is one of the equivalent definitions of the NP class. Another definition (also a classic one) is that NP is the class of decision problems that can be computed by polynomial time nondeterministic algorithms.

Note that unlike P, the definition of NP is asymmetric. It only requires the positive answer [math]\displaystyle{ f(x)=1 }[/math] to be poly-time verifiable, but does not say anything about the negative answers [math]\displaystyle{ f(x)=0 }[/math]. The class for that case is co-NP.

Definition 3 The class co-NP consists of all decision problems [math]\displaystyle{ f }[/math] that have a polynomial time algorithm [math]\displaystyle{ A }[/math] such that for any input [math]\displaystyle{ x }[/math],

[math]\displaystyle{ f(x)=0 }[/math] if and only if [math]\displaystyle{ \exists y }[/math], [math]\displaystyle{ A(x,y)=0 }[/math], where the size of [math]\displaystyle{ y }[/math] is within a polynomial of the size of [math]\displaystyle{ x }[/math].

Clearly, P [math]\displaystyle{ \subseteq }[/math] NP [math]\displaystyle{ \cap }[/math] co-NP. Does P = NP [math]\displaystyle{ \cap }[/math] co-NP? We don't know.

RP, BPP, ZPP

Now we proceed to define complexity classes of the problems that can be solved efficiently by randomized algorithms. There are three types of randomized algorithms: Monte Carlo algorithms with one-sided errors, Monte Carlo algorithms with two-sided errors, and Las Vegas algorithms, which define the following three complexity classes: RP (for Randomized Polynomial time), BPP (for Bounded-error Probabilistic Polynomial time), and ZPP (for Zero-error Probabilistic Polynomial time).

For Monte Carlo algorithms, the running time is fixed but the correctness is random. A Monte Carlo algorithm that errs on only one kind of answer (say, it may return false negatives but never false positives) is said to have one-sided error; a Monte Carlo algorithm that can return both false positives and false negatives is said to have two-sided error.

We first introduce the class RP of the problems which can be solved by polynomial time Monte Carlo algorithms with one-sided error.

Definition 4 The class RP consists of all decision problems [math]\displaystyle{ f }[/math] that have a randomized algorithm [math]\displaystyle{ A }[/math] running in worst-case polynomial time such that for any input [math]\displaystyle{ x }[/math],

  • if [math]\displaystyle{ f(x)=1 }[/math], then [math]\displaystyle{ \Pr[A(x)=1]\ge 1/2 }[/math];
  • if [math]\displaystyle{ f(x)=0 }[/math], then [math]\displaystyle{ \Pr[A(x)=1]=0 }[/math].
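A classic example (not from the notes) of a problem in RP is COMPOSITENESS: [math]\displaystyle{ f(n)=1 }[/math] iff [math]\displaystyle{ n }[/math] is composite. One round of the Miller-Rabin test, sketched below for odd [math]\displaystyle{ n\gt 3 }[/math], accepts a composite [math]\displaystyle{ n }[/math] with probability at least [math]\displaystyle{ 3/4\ge 1/2 }[/math] and never accepts a prime, which is exactly the RP-style guarantee:

```python
import random

def declares_composite(n):
    """One round of Miller-Rabin on an odd integer n > 3.
    If n is composite, returns True with probability >= 3/4;
    if n is prime, never returns True (no false positives)."""
    a = random.randrange(2, n - 1)          # random base
    d, r = n - 1, 0
    while d % 2 == 0:                       # write n - 1 = d * 2^r with d odd
        d, r = d // 2, r + 1
    x = pow(a, d, n)                        # a^d mod n
    if x == 1 or x == n - 1:
        return False                        # a is not a witness; answer 0
    for _ in range(r - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return False
    return True                             # a certifies that n is composite
```

As with any RP algorithm, the one-sided error probability can be driven down to [math]\displaystyle{ 2^{-k} }[/math] by [math]\displaystyle{ k }[/math] independent repetitions, answering 1 if any round does.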

Like NP, the class RP is also asymmetrically defined, which suggests defining its co-class.

Definition 5 The class co-RP consists of all decision problems [math]\displaystyle{ f }[/math] that have a randomized algorithm [math]\displaystyle{ A }[/math] running in worst-case polynomial time such that for any input [math]\displaystyle{ x }[/math],

  • if [math]\displaystyle{ f(x)=1 }[/math], then [math]\displaystyle{ \Pr[A(x)=1]= 1 }[/math];
  • if [math]\displaystyle{ f(x)=0 }[/math], then [math]\displaystyle{ \Pr[A(x)=1]\le 1/2 }[/math].
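A textbook example (not from the notes) fitting the co-RP pattern is Freivalds' algorithm for verifying a matrix product: given [math]\displaystyle{ n\times n }[/math] matrices, decide whether [math]\displaystyle{ AB=C }[/math]. If [math]\displaystyle{ AB=C }[/math] the algorithm always accepts; if [math]\displaystyle{ AB\ne C }[/math] it accepts with probability at most [math]\displaystyle{ 1/2 }[/math], all in [math]\displaystyle{ O(n^2) }[/math] time per round instead of the cost of a full multiplication:

```python
import random

def freivalds(A, B, C, n):
    """One round of Freivalds' check: pick a random 0/1 vector r and
    accept iff A(Br) == Cr.  If AB == C this always holds; if AB != C,
    it fails (i.e. we reject) with probability >= 1/2."""
    r = [random.randint(0, 1) for _ in range(n)]
    Br = [sum(B[i][j] * r[j] for j in range(n)) for i in range(n)]
    ABr = [sum(A[i][j] * Br[j] for j in range(n)) for i in range(n)]
    Cr = [sum(C[i][j] * r[j] for j in range(n)) for i in range(n)]
    return ABr == Cr
```

Each round costs three matrix-vector products; repeating [math]\displaystyle{ k }[/math] rounds shrinks the one-sided error to [math]\displaystyle{ 2^{-k} }[/math].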

We define the class BPP of the problems which can be solved by polynomial time Monte Carlo algorithms with two-sided error.

Definition 6 The class BPP consists of all decision problems [math]\displaystyle{ f }[/math] that have a randomized algorithm [math]\displaystyle{ A }[/math] running in worst-case polynomial time such that for any input [math]\displaystyle{ x }[/math],

  • if [math]\displaystyle{ f(x)=1 }[/math], then [math]\displaystyle{ \Pr[A(x)=1]\ge 3/4 }[/math];
  • if [math]\displaystyle{ f(x)=0 }[/math], then [math]\displaystyle{ \Pr[A(x)=1]\le 1/4 }[/math].
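The constants [math]\displaystyle{ 3/4 }[/math] and [math]\displaystyle{ 1/4 }[/math] are not essential: any gap bounded away from [math]\displaystyle{ 1/2 }[/math] can be amplified by independent repetition and majority vote, with the error dropping exponentially by a Chernoff bound. A sketch (the noisy algorithm and the parity function are hypothetical stand-ins, not from the notes):

```python
import random

def amplify(A, x, k):
    """Run a two-sided-error algorithm A on x for k rounds (k odd)
    and output the majority answer."""
    ones = sum(A(x) for _ in range(k))
    return 1 if ones > k // 2 else 0

# Hypothetical BPP-style algorithm: answers f(x) correctly with prob 3/4,
# where we pretend f(x) is the parity of x.
def noisy(x):
    correct = x % 2
    return correct if random.random() < 0.75 else 1 - correct

print(amplify(noisy, 5, 301))   # majority of 301 noisy votes for f(5) = 1
```

With [math]\displaystyle{ k }[/math] rounds the error is at most [math]\displaystyle{ e^{-2k(1/4)^2} }[/math], so polynomially many repetitions make it exponentially small.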

Finally, we define the class ZPP of the problems which can be solved by Las Vegas algorithms running in expected polynomial time.

Definition 7 The class ZPP consists of all decision problems [math]\displaystyle{ f }[/math] that have a randomized algorithm [math]\displaystyle{ A }[/math] running in expected polynomial time such that for any input [math]\displaystyle{ x }[/math], [math]\displaystyle{ A(x)=f(x) }[/math].
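A minimal Las Vegas sketch (a toy example, not from the notes): in an array where half the entries are 1, repeatedly probing a random position finds a 1 after 2 probes in expectation. The answer is always correct; only the running time is random:

```python
import random

def find_one(bits):
    """Las Vegas search: probe random positions until a 1 is found.
    Always returns a correct index; if half the entries are 1, the
    expected number of probes is 2."""
    while True:
        i = random.randrange(len(bits))
        if bits[i] == 1:
            return i

bits = [0, 1] * 8
print(bits[find_one(bits)])  # always 1
```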


What is known (you can prove these yourself):

  • RP, BPP, ZPP all contain P;
  • ZPP = RP[math]\displaystyle{ \cap }[/math]co-RP;
  • RP [math]\displaystyle{ \subseteq }[/math] BPP;
  • co-RP [math]\displaystyle{ \subseteq }[/math] BPP.
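The direction RP [math]\displaystyle{ \cap }[/math] co-RP [math]\displaystyle{ \subseteq }[/math] ZPP has a short constructive proof that can be sketched in code (the stand-in algorithms R and C below are hypothetical): alternate an RP algorithm, which never accepts a no-instance, with a co-RP algorithm, which never rejects a yes-instance; the first definitive answer is correct, and each round terminates with probability at least [math]\displaystyle{ 1/2 }[/math], so the expected running time stays polynomial.

```python
import random

def las_vegas_from(R, C, x):
    """Combine an RP algorithm R and a co-RP algorithm C for the same f
    into a Las Vegas algorithm: zero error, expected O(1) rounds."""
    while True:
        if R(x) == 1:    # R never accepts a no-instance: answer must be 1
            return 1
        if C(x) == 0:    # C never rejects a yes-instance: answer must be 0
            return 0

# Hypothetical stand-ins for f(x) = x mod 2, with the required one-sided errors.
def R(x):  # RP-style: on yes-instances answers 1 with prob 1/2; never on no-instances
    return x % 2 if random.random() < 0.5 else 0

def C(x):  # co-RP-style: on no-instances answers 0 with prob 1/2; never on yes-instances
    return x % 2 if random.random() < 0.5 else 1

print(las_vegas_from(R, C, 3), las_vegas_from(R, C, 4))
```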

Open problem:

  • BPP vs P (the second most important open problem in complexity theory, after P vs NP).

Yao's minimax principle