Randomized Algorithms (Spring 2010)/Approximate counting, linear programming

From TCS Wiki
Revision as of 07:02, 19 May 2010 by imported>WikiSysop (→‎Approximate Counting)

Counting Problems

Complexity model

FPRAS

Approximate Counting

Let [math]\displaystyle{ U }[/math] be a finite set of known size, and let [math]\displaystyle{ G\subseteq U }[/math]. We want to compute the size of [math]\displaystyle{ G }[/math], [math]\displaystyle{ |G| }[/math].

We assume two devices:

  • A uniform sampler [math]\displaystyle{ \mathcal{U} }[/math], which on each call returns a uniformly random member of [math]\displaystyle{ U }[/math], independent of all previous samples.
  • A membership oracle of [math]\displaystyle{ G }[/math], denoted [math]\displaystyle{ \mathcal{O} }[/math]. Given an input [math]\displaystyle{ x\in U }[/math], [math]\displaystyle{ \mathcal{O}(x) }[/math] indicates whether or not [math]\displaystyle{ x }[/math] is a member of [math]\displaystyle{ G }[/math].

Equipped with [math]\displaystyle{ \mathcal{U} }[/math] and [math]\displaystyle{ \mathcal{O} }[/math], we can run the following Monte Carlo algorithm:

  • Draw [math]\displaystyle{ N }[/math] independent samples from [math]\displaystyle{ U }[/math] using the uniform sampler [math]\displaystyle{ \mathcal{U} }[/math], represented by the random variables [math]\displaystyle{ X_1,X_2,\ldots, X_N }[/math].
  • Let [math]\displaystyle{ Y_i }[/math] be the indicator random variable defined as [math]\displaystyle{ Y_i=\mathcal{O}(X_i) }[/math], namely, [math]\displaystyle{ Y_i }[/math] indicates whether [math]\displaystyle{ X_i\in G }[/math].
  • Define the estimator random variable
[math]\displaystyle{ Z=\frac{|U|}{N}\sum_{i=1}^N Y_i. }[/math]
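The three steps above can be sketched in Python as follows. This is only an illustrative sketch: the function name `monte_carlo_count`, its parameters, and the toy instance (multiples of 7 in a universe of 1000 integers) are assumptions made for the example and do not come from the lecture notes.

```python
import random


def monte_carlo_count(sample, oracle, universe_size, n_samples):
    """Return the estimator Z = (|U| / N) * (Y_1 + ... + Y_N).

    sample: zero-argument function playing the role of the uniform sampler.
    oracle: function x -> bool playing the role of the membership oracle.
    """
    hits = 0
    for _ in range(n_samples):
        x = sample()      # X_i: one call to the uniform sampler
        if oracle(x):     # Y_i = O(X_i)
            hits += 1
    return universe_size * hits / n_samples


# Hypothetical toy instance: U = {0, ..., 999}, G = multiples of 7 in U.
rng = random.Random(0)
universe_size = 1000
estimate = monte_carlo_count(
    sample=lambda: rng.randrange(universe_size),
    oracle=lambda x: x % 7 == 0,
    universe_size=universe_size,
    n_samples=20000,
)
true_size = len(range(0, universe_size, 7))  # exact |G|, for comparison only
print(estimate, true_size)
```

In a real counting problem the exact size is of course unknown; it is computed here only to compare against the estimate.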

It is easy to see that [math]\displaystyle{ \mathbf{E}[Z]=|G| }[/math] and we might hope that with high probability the value of [math]\displaystyle{ Z }[/math] is close to [math]\displaystyle{ |G| }[/math]. Formally, [math]\displaystyle{ Z }[/math] is called an [math]\displaystyle{ \epsilon }[/math]-approximation of [math]\displaystyle{ |G| }[/math] if

[math]\displaystyle{ (1-\epsilon)|G|\le Z\le (1+\epsilon)|G|. }[/math]
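
The claim about the expectation can be checked by linearity of expectation. Each [math]\displaystyle{ X_i }[/math] is uniform over [math]\displaystyle{ U }[/math], so [math]\displaystyle{ \mathbf{E}[Y_i]=\Pr[X_i\in G]=\frac{|G|}{|U|} }[/math], and hence

[math]\displaystyle{ \mathbf{E}[Z]=\frac{|U|}{N}\sum_{i=1}^N\mathbf{E}[Y_i]=\frac{|U|}{N}\cdot N\cdot\frac{|G|}{|U|}=|G|. }[/math]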

The following theorem states that the accuracy of the estimate, and the probability with which it is achieved, depend on the number of samples and on the ratio between [math]\displaystyle{ |G| }[/math] and [math]\displaystyle{ |U| }[/math].

Theorem (estimator theorem)
Let [math]\displaystyle{ \alpha=\frac{|G|}{|U|} }[/math]. Then the Monte Carlo method yields an [math]\displaystyle{ \epsilon }[/math]-approximation to [math]\displaystyle{ |G| }[/math] with probability at least [math]\displaystyle{ 1-\delta }[/math] provided
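[math]\displaystyle{ N\ge\frac{4}{\epsilon^2 \alpha}\ln\frac{2}{\delta} }[/math].

Proof: Use the Chernoff bound. Let [math]\displaystyle{ Y=\sum_{i=1}^N Y_i }[/math], a sum of independent indicator variables with [math]\displaystyle{ \mathbf{E}[Y]=\alpha N }[/math]. Since [math]\displaystyle{ Z=\frac{|U|}{N}Y }[/math], the estimator [math]\displaystyle{ Z }[/math] is an [math]\displaystyle{ \epsilon }[/math]-approximation of [math]\displaystyle{ |G| }[/math] exactly when [math]\displaystyle{ |Y-\alpha N|\le\epsilon\alpha N }[/math]. For [math]\displaystyle{ 0\lt \epsilon\le 1 }[/math], the Chernoff bound gives

[math]\displaystyle{ \Pr\left[|Y-\alpha N|\gt \epsilon\alpha N\right]\le 2\exp\left(-\frac{\epsilon^2\alpha N}{4}\right), }[/math]

which is at most [math]\displaystyle{ \delta }[/math] whenever [math]\displaystyle{ N\ge\frac{4}{\epsilon^2 \alpha}\ln\frac{2}{\delta} }[/math].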

[math]\displaystyle{ \square }[/math]
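
To get a feel for the numbers, the sample bound can be evaluated directly. The sketch below assumes the standard form of the estimator theorem, with [math]\displaystyle{ \epsilon^2 }[/math] in the denominator, and the function name `required_samples` is hypothetical.

```python
import math


def required_samples(epsilon, alpha, delta):
    """Smallest integer N with N >= 4 / (epsilon^2 * alpha) * ln(2 / delta)."""
    if not (0 < epsilon <= 1 and 0 < alpha <= 1 and 0 < delta < 1):
        raise ValueError("need 0 < epsilon <= 1, 0 < alpha <= 1, 0 < delta < 1")
    return math.ceil(4.0 / (epsilon ** 2 * alpha) * math.log(2.0 / delta))


# Example: 10% relative error, 95% confidence, G occupying half of U.
print(required_samples(0.1, 0.5, 0.05))
```

The bound grows linearly in [math]\displaystyle{ 1/\alpha }[/math], so halving [math]\displaystyle{ \alpha }[/math] roughly doubles the number of samples it asks for.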

A counting algorithm for the set [math]\displaystyle{ G }[/math] has to deal with the following three complications:

  • Implement the membership oracle [math]\displaystyle{ \mathcal{O} }[/math]. This is usually straightforward, or assumed by the model.
  • Implement the uniform sampler [math]\displaystyle{ \mathcal{U} }[/math]. As we have seen, this can be solved by relaxing to a near-uniform sampler, which can be simulated by random walks.
  • Deal with exponentially small [math]\displaystyle{ \alpha=\frac{|G|}{|U|} }[/math]. This requires a clever choice of the universe [math]\displaystyle{ U }[/math], which sometimes takes nontrivial ideas.
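
The third complication can be made concrete with a small calculation. When [math]\displaystyle{ \alpha }[/math] is tiny, all [math]\displaystyle{ N }[/math] samples miss [math]\displaystyle{ G }[/math] with probability [math]\displaystyle{ (1-\alpha)^N }[/math], in which case [math]\displaystyle{ Z=0 }[/math]. The sketch below evaluates this probability; the function name `prob_all_miss` and the choice [math]\displaystyle{ \alpha=2^{-40} }[/math] are assumptions made purely for illustration.

```python
import math


def prob_all_miss(alpha, n_samples):
    """Probability (1 - alpha)^N that none of N uniform samples lands in G."""
    if not 0 < alpha < 1:
        raise ValueError("need 0 < alpha < 1")
    # log1p keeps precision when alpha is very small.
    return math.exp(n_samples * math.log1p(-alpha))


# Hypothetical case: alpha = 2^-40 and one million samples.
print(prob_all_miss(2.0 ** -40, 10 ** 6))
```

With these numbers the estimator is almost surely 0, which is why a polynomial number of samples cannot help unless the universe is chosen so that [math]\displaystyle{ \alpha }[/math] is not exponentially small.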

Counting DNFs

Permanents and perfect matchings

Volume estimation of convex bodies

Linear Programming