Randomized Algorithms (Spring 2010)/Approximate counting, linear programming
Counting Problems
Complexity model
FPRAS
Approximate Counting
Let us consider the following formal problem.
Let [math]\displaystyle{ U }[/math] be a finite set of known size, and let [math]\displaystyle{ G\subseteq U }[/math]. We want to compute the size of [math]\displaystyle{ G }[/math], [math]\displaystyle{ |G| }[/math].
We assume two devices:
- A uniform sampler [math]\displaystyle{ \mathcal{U} }[/math], which uniformly and independently samples a member of [math]\displaystyle{ U }[/math] upon each calling.
- A membership oracle of [math]\displaystyle{ G }[/math], denoted [math]\displaystyle{ \mathcal{O} }[/math]. Given as the input an [math]\displaystyle{ x\in U }[/math], [math]\displaystyle{ \mathcal{O}(x) }[/math] indicates whether or not [math]\displaystyle{ x }[/math] is a member of [math]\displaystyle{ G }[/math].
Equipped with [math]\displaystyle{ \mathcal{U} }[/math] and [math]\displaystyle{ \mathcal{O} }[/math], we can run the following Monte Carlo algorithm:
- Choose [math]\displaystyle{ N }[/math] independent samples from [math]\displaystyle{ U }[/math] by the uniform sampler [math]\displaystyle{ \mathcal{U} }[/math], represented by the random variables [math]\displaystyle{ X_1,X_2,\ldots, X_N }[/math].
- Let [math]\displaystyle{ Y_i }[/math] be the indicator random variable defined as [math]\displaystyle{ Y_i=\mathcal{O}(X_i) }[/math], namely, [math]\displaystyle{ Y_i }[/math] indicates whether [math]\displaystyle{ X_i\in G }[/math].
- Define the estimator random variable
- [math]\displaystyle{ Z=\frac{|U|}{N}\sum_{i=1}^N Y_i. }[/math]
It is easy to see that [math]\displaystyle{ \mathbf{E}[Z]=|G| }[/math] and we might hope that with high probability the value of [math]\displaystyle{ Z }[/math] is close to [math]\displaystyle{ |G| }[/math]. Formally, [math]\displaystyle{ Z }[/math] is called an [math]\displaystyle{ \epsilon }[/math]-approximation of [math]\displaystyle{ |G| }[/math] if
- [math]\displaystyle{ (1-\epsilon)|G|\le Z\le (1+\epsilon)|G|. }[/math]
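As a concrete illustration, here is a minimal Python sketch of this estimator. The callables uniform_sampler and oracle stand in for the abstract devices [math]\displaystyle{ \mathcal{U} }[/math] and [math]\displaystyle{ \mathcal{O} }[/math]; their names are placeholders, not part of the model.

```python
def monte_carlo_estimate(universe_size, uniform_sampler, oracle, num_samples):
    """Return Z = (|U|/N) * sum of the indicators Y_i = oracle(X_i)."""
    hits = sum(1 for _ in range(num_samples) if oracle(uniform_sampler()))
    return universe_size * hits / num_samples
```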
The following theorem states that the probabilistic accuracy of the estimation depends on the number of samples and the ratio [math]\displaystyle{ \alpha=\frac{|G|}{|U|} }[/math].
Theorem (estimator theorem)
Let [math]\displaystyle{ \alpha=\frac{|G|}{|U|} }[/math]. Then the Monte Carlo method above yields an [math]\displaystyle{ \epsilon }[/math]-approximation of [math]\displaystyle{ |G| }[/math] with probability at least [math]\displaystyle{ 1-\delta }[/math], provided that [math]\displaystyle{ N\ge\frac{4}{\epsilon^2\alpha}\ln\frac{2}{\delta} }[/math].
Proof: The sum [math]\displaystyle{ \sum_{i=1}^N Y_i }[/math] is a sum of [math]\displaystyle{ N }[/math] independent Bernoulli trials, each succeeding with probability [math]\displaystyle{ \alpha=\frac{|G|}{|U|} }[/math]; applying the Chernoff bound to this sum gives the claimed probability.
[math]\displaystyle{ \square }[/math]
A counting algorithm for the set [math]\displaystyle{ G }[/math] has to deal with the following three complications:
- Implement the membership oracle [math]\displaystyle{ \mathcal{O} }[/math]. This is usually straightforward, or assumed by the model.
- Implement the uniform sampler [math]\displaystyle{ \mathcal{U} }[/math]. As we have seen, this is usually approximated by random walks. How to design the random walk and bound its mixing rate is usually technically challenging, if possible at all.
- Deal with exponentially small [math]\displaystyle{ \alpha=\frac{|G|}{|U|} }[/math]. This requires us to cleverly choose the universe [math]\displaystyle{ U }[/math]. Sometimes this needs some nontrivial ideas.
Counting DNFs
A disjunctive normal form (DNF) formula is a disjunction (OR) of clauses, where each clause is a conjunction (AND) of literals. For example:
- [math]\displaystyle{ (x_1\wedge \overline{x_2}\wedge x_3)\vee(x_2\wedge x_4)\vee(\overline{x_1}\wedge x_3\wedge x_4) }[/math].
Note the difference from the conjunctive normal forms (CNF).
Given a DNF formula [math]\displaystyle{ \phi }[/math] as the input, the problem is to count the number of satisfying assignments of [math]\displaystyle{ \phi }[/math]. This problem is #P-complete.
Naively applying the Monte Carlo method will not give a good answer. Suppose that there are [math]\displaystyle{ n }[/math] variables. Let [math]\displaystyle{ U=\{\mathrm{true},\mathrm{false}\}^n }[/math] be the set of all truth assignments of the [math]\displaystyle{ n }[/math] variables. Let [math]\displaystyle{ G=\{x\in U\mid \phi(x)=\mathrm{true}\} }[/math] be the set of satisfying assignments for [math]\displaystyle{ \phi }[/math]. The straightforward use of the Monte Carlo method samples [math]\displaystyle{ N }[/math] assignments from [math]\displaystyle{ U }[/math] and checks how many of them satisfy [math]\displaystyle{ \phi }[/math]. This algorithm fails when [math]\displaystyle{ |G|/|U| }[/math] is exponentially small, namely, when only an exponentially small fraction of the assignments satisfy the input DNF formula.
- The union of sets problem
We reformulate the DNF counting problem in a more abstract framework, called the union of sets problem.
Let [math]\displaystyle{ V }[/math] be a finite universe. We are given [math]\displaystyle{ m }[/math] subsets [math]\displaystyle{ H_1,H_2,\ldots,H_m\subseteq V }[/math]. The following assumptions hold:
- For all [math]\displaystyle{ i }[/math], [math]\displaystyle{ |H_i| }[/math] is computable in poly-time.
- It is possible to sample uniformly from each individual [math]\displaystyle{ H_i }[/math].
- For any [math]\displaystyle{ x\in V }[/math], it can be determined in poly-time whether [math]\displaystyle{ x\in H_i }[/math].
The goal is to compute the size of [math]\displaystyle{ H=\bigcup_{i=1}^m H_i }[/math].
DNF counting can be interpreted in this general framework as follows. Suppose that the DNF formula [math]\displaystyle{ \phi }[/math] is defined on [math]\displaystyle{ n }[/math] variables, and [math]\displaystyle{ \phi }[/math] contains [math]\displaystyle{ m }[/math] clauses [math]\displaystyle{ C_1,C_2,\ldots,C_m }[/math], where clause [math]\displaystyle{ C_i }[/math] has [math]\displaystyle{ k_i }[/math] literals. Without loss of generality, we assume that in each clause, each variable appears at most once.
- [math]\displaystyle{ V }[/math] is the set of all assignments.
- Each [math]\displaystyle{ H_i }[/math] is the set of satisfying assignments for the [math]\displaystyle{ i }[/math]-th clause [math]\displaystyle{ C_i }[/math] of the DNF formula [math]\displaystyle{ \phi }[/math]. Then the union of sets [math]\displaystyle{ H=\bigcup_i H_i }[/math] gives the set of satisfying assignments for [math]\displaystyle{ \phi }[/math].
- Each clause [math]\displaystyle{ C_i }[/math] is a conjunction (AND) of literals. It is not hard to see that [math]\displaystyle{ |H_i|=2^{n-k_i} }[/math], which is efficiently computable.
- Sampling from an [math]\displaystyle{ H_i }[/math] is simple: we just fix the assignments of the [math]\displaystyle{ k_i }[/math] literals of that clause, and sample the remaining [math]\displaystyle{ (n-k_i) }[/math] variables uniformly and independently (see the sketch after this list).
- For each assignment [math]\displaystyle{ x }[/math], it is easy to check whether it satisfies a clause [math]\displaystyle{ C_i }[/math], thus it is easy to determine whether [math]\displaystyle{ x\in H_i }[/math].
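The three requirements above can be realized directly. Below is a minimal Python sketch, under the assumption that a clause is represented as a dictionary mapping a variable index to the truth value forced by its literal (e.g. [math]\displaystyle{ x_1\wedge\overline{x_2} }[/math] becomes {0: True, 1: False}); the names and representation are for illustration only.

```python
import random

def clause_size(clause, n):
    """|H_i| = 2^(n - k_i): variables not appearing in the clause are unconstrained."""
    return 2 ** (n - len(clause))

def sample_from_clause(clause, n):
    """Uniform satisfying assignment of one clause: fix its k_i literals, flip fair coins for the rest."""
    return tuple(clause[v] if v in clause else random.random() < 0.5 for v in range(n))

def satisfies(clause, assignment):
    """Membership test: does the assignment satisfy clause C_i, i.e. is it in H_i?"""
    return all(assignment[v] == value for v, value in clause.items())
```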
- The coverage algorithm
We now introduce the coverage algorithm for the union of sets problem.
Consider the multiset [math]\displaystyle{ U }[/math] defined by
- [math]\displaystyle{ U=H_1\uplus H_2\uplus\cdots \uplus H_m }[/math],
where [math]\displaystyle{ \uplus }[/math] denotes the multiset union. It is more convenient to define [math]\displaystyle{ U }[/math] as the set
- [math]\displaystyle{ U=\{(x,i)\mid x\in H_i\} }[/math].
For each [math]\displaystyle{ x\in H }[/math], there may be more than one instance [math]\displaystyle{ (x,i)\in U }[/math]. We can choose a unique representative among the multiple instances [math]\displaystyle{ (x,i)\in U }[/math] for the same [math]\displaystyle{ x\in H }[/math] by taking the one with the minimum [math]\displaystyle{ i }[/math], and let these representatives form a set [math]\displaystyle{ G }[/math].
Formally, [math]\displaystyle{ G=\{(x,i)\in U\mid \forall (x,j)\in U, j\ge i\} }[/math]. Every [math]\displaystyle{ x\in H }[/math] corresponds to a unique [math]\displaystyle{ (x,i)\in G }[/math], where [math]\displaystyle{ i }[/math] is the smallest index with [math]\displaystyle{ x\in H_i }[/math].
It is obvious that [math]\displaystyle{ G\subseteq U }[/math] and
- [math]\displaystyle{ |G|=|H| }[/math].
Therefore, estimation of [math]\displaystyle{ |H| }[/math] is reduced to estimation of [math]\displaystyle{ |G| }[/math] with [math]\displaystyle{ G\subseteq U }[/math]. Then [math]\displaystyle{ |G| }[/math] can be [math]\displaystyle{ \epsilon }[/math]-approximated with probability [math]\displaystyle{ (1-\delta) }[/math] in poly-time, provided that we can sample uniformly from [math]\displaystyle{ U }[/math] and the ratio [math]\displaystyle{ |G|/|U| }[/math] is not too small.
A uniform sample from [math]\displaystyle{ U }[/math] can be implemented as follows:
- generate an [math]\displaystyle{ i\in\{1,2,\ldots,m\} }[/math] with probability [math]\displaystyle{ \frac{|H_i|}{\sum_{j=1}^m|H_j|} }[/math];
- uniformly sample an [math]\displaystyle{ x\in H_i }[/math], and return [math]\displaystyle{ (x,i) }[/math].
It is easy to see that this gives a uniform member of [math]\displaystyle{ U }[/math]. The above sampling procedure is poly-time because each [math]\displaystyle{ |H_i| }[/math] can be computed in poly-time, and sampling uniformly from each [math]\displaystyle{ H_i }[/math] is poly-time.
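A Python sketch of this two-step sampler for the DNF instance, reusing the hypothetical clause representation and helpers from the earlier sketch:

```python
import random

def sample_from_U(clauses, n):
    """Uniform sample from the multiset U: pick clause i proportionally to |H_i|, then x uniform in H_i."""
    sizes = [clause_size(c, n) for c in clauses]            # |H_i| = 2^(n - k_i)
    i = random.choices(range(len(clauses)), weights=sizes)[0]
    return sample_from_clause(clauses[i], n), i
```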
We now only need to lower bound the ratio
- [math]\displaystyle{ \alpha=\frac{|G|}{|U|} }[/math].
We claim that
- [math]\displaystyle{ \alpha\ge\frac{1}{m} }[/math].
It is easy to see this, because each [math]\displaystyle{ x\in H }[/math] has at most [math]\displaystyle{ m }[/math] instances of [math]\displaystyle{ (x,i) }[/math] in [math]\displaystyle{ U }[/math], and we already know that [math]\displaystyle{ |G|=|H| }[/math].
Due to the estimator theorem, an [math]\displaystyle{ \epsilon }[/math]-approximation of [math]\displaystyle{ |G| }[/math] with probability [math]\displaystyle{ 1-\delta }[/math] needs [math]\displaystyle{ \frac{4m}{\epsilon^2}\ln\frac{2}{\delta} }[/math] uniform random samples from [math]\displaystyle{ U }[/math].
This gives the coverage algorithm for the abstract problem of the union of sets. The DNF counting is a special case of it.
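Putting the pieces together, the following Python sketch outlines the coverage algorithm for DNF counting. It reuses the hypothetical helpers from the sketches above and takes the sample size from the estimator theorem; it assumes exact uniform sampling.

```python
import math

def coverage_count(clauses, n, eps, delta):
    """Estimate |H|, the number of satisfying assignments of the DNF,
    within a factor (1 +/- eps) with probability at least 1 - delta."""
    m = len(clauses)
    size_U = sum(clause_size(c, n) for c in clauses)           # |U| = sum_i |H_i|
    N = math.ceil(4 * m / eps ** 2 * math.log(2 / delta))      # sample size from the estimator theorem
    hits = 0
    for _ in range(N):
        x, i = sample_from_U(clauses, n)
        # (x, i) is in G iff i is the smallest index of a clause satisfied by x
        if min(j for j in range(m) if satisfies(clauses[j], x)) == i:
            hits += 1
    return size_U * hits / N                                    # Z = |U| * (fraction of samples in G)
```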
Permanents and perfect matchings
Let [math]\displaystyle{ U=\{u_1,u_2,\ldots,u_n\} }[/math] and [math]\displaystyle{ V=\{v_1,v_2,\ldots,v_n\} }[/math]. Consider a bipartite graph [math]\displaystyle{ G(U,V,E) }[/math]. An [math]\displaystyle{ M\subseteq E }[/math] is a perfect matching of [math]\displaystyle{ G }[/math] if every vertex of [math]\displaystyle{ G }[/math] has exactly one edge in [math]\displaystyle{ M }[/math] adjacent to it.
Given a bipartite graph [math]\displaystyle{ G(U,V,E) }[/math] with [math]\displaystyle{ |U|=|V|=n }[/math], we want to count the number of perfect matchings of [math]\displaystyle{ G }[/math]. This problem can be reduced to computing the permanent of a square matrix.
Definition (permanent)
Let [math]\displaystyle{ Q }[/math] be an [math]\displaystyle{ n\times n }[/math] matrix. The permanent of [math]\displaystyle{ Q }[/math] is defined as
- [math]\displaystyle{ \mathrm{per}(Q)=\sum_{\pi\in\mathbb{S}_n}\prod_{i=1}^n Q_{i,\pi(i)} }[/math],
where [math]\displaystyle{ \mathbb{S}_n }[/math] is the set of all permutations of [math]\displaystyle{ \{1,2,\ldots,n\} }[/math].
If we multiply each term of the sum by the sign of the permutation, then it gives us the determinant of the matrix,
- [math]\displaystyle{ \det(Q)=\sum_{\pi\in\mathbb{S}_n}\mathrm{sgn}(\pi)\prod_{i=1}^n Q_{i,\pi(i)} }[/math],
where [math]\displaystyle{ \mathrm{sgn}(\pi) }[/math], the sign of a permutation, is either [math]\displaystyle{ -1 }[/math] or [math]\displaystyle{ +1 }[/math], according to whether the minimum number of pairwise interchanges needed to obtain [math]\displaystyle{ \pi }[/math] from [math]\displaystyle{ (1,2,\ldots,n) }[/math] is odd or even.
Unlike the determinants, which are computable in poly-time, permanents are hard to compute, as permanents can be used to count the number of perfect matchings in a bipartite graph, which is #P-complete.
A bipartite graph [math]\displaystyle{ G(U,V,E) }[/math] with [math]\displaystyle{ |U|=|V|=n }[/math] can be represented by an [math]\displaystyle{ n\times n }[/math] matrix [math]\displaystyle{ Q }[/math] with 0-1 entries as follows:
- Each row of [math]\displaystyle{ Q }[/math] corresponds to a vertex in [math]\displaystyle{ U }[/math] and each column of [math]\displaystyle{ Q }[/math] corresponds to a vertex in [math]\displaystyle{ V }[/math].
- [math]\displaystyle{ Q_{ij}=\begin{cases} 1 & \mbox{if }i\sim j,\\ 0 & \mbox{otherwise}. \end{cases} }[/math]
Note the subtle difference between the definition of [math]\displaystyle{ Q }[/math] and the adjacency matrix.
Each perfect matching corresponds to a bijection [math]\displaystyle{ \pi }[/math] from [math]\displaystyle{ U }[/math] to [math]\displaystyle{ V }[/math] with [math]\displaystyle{ (u,\pi(u))\in E }[/math] for every [math]\displaystyle{ u\in U }[/math], which in turn corresponds to a permutation [math]\displaystyle{ \pi\in\mathbb{S}_n }[/math] with [math]\displaystyle{ \prod_{i=1}^n Q_{i,\pi(i)}=1 }[/math]. It is then easy to see that [math]\displaystyle{ \mathrm{per}(Q) }[/math] gives the number of perfect matchings in the bipartite graph [math]\displaystyle{ G }[/math].
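For small instances this correspondence can be checked directly. The following brute-force Python sketch sums [math]\displaystyle{ \prod_{i=1}^n Q_{i,\pi(i)} }[/math] over all permutations and therefore runs in time [math]\displaystyle{ O(n\cdot n!) }[/math]; it is for illustration only.

```python
from itertools import permutations

def permanent_01(Q):
    """Permanent of a 0-1 matrix Q: the number of permutations pi with Q[i][pi[i]] = 1 for all i,
    i.e. the number of perfect matchings of the bipartite graph with biadjacency matrix Q."""
    n = len(Q)
    return sum(all(Q[i][pi[i]] for i in range(n)) for pi in permutations(range(n)))
```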
It is known that counting the number of perfect matchings in a bipartite graph is #P-hard. Since this problem can be reduced to computing the permanent, computing permanents is also #P-hard.
Now we show that with randomization, we can approximate the number of perfect matchings in a bipartite graph. In particular, we will give an FPRAS for counting the perfect matchings in a dense bipartite graph.
The Jerrum-Sinclair algorithm
Fix a bipartite graph [math]\displaystyle{ G(U,V,E) }[/math] with [math]\displaystyle{ |U|=|V|=n }[/math]. Let [math]\displaystyle{ \mathcal{M}_k }[/math] be the set of matchings of size [math]\displaystyle{ k }[/math] in [math]\displaystyle{ G }[/math], and let [math]\displaystyle{ m_k=|\mathcal{M}_k| }[/math]. Thus, [math]\displaystyle{ \mathcal{M}_n }[/math] is the set of perfect matchings in [math]\displaystyle{ G }[/math], and our goal is to compute [math]\displaystyle{ m_n }[/math].
Let [math]\displaystyle{ r_k=\frac{m_k}{m_{k-1}} }[/math] for [math]\displaystyle{ 1\lt k\le n }[/math]. Then
- [math]\displaystyle{ m_k=m_{k-1}r_k }[/math],
which gives us a recursion for computing [math]\displaystyle{ m_n }[/math], as
- [math]\displaystyle{ m_n=m_{1}\frac{m_2}{m_1}\cdot\frac{m_3}{m_2}\cdots\frac{m_n}{m_{n-1}}=m_1\prod_{k=2}^n r_k }[/math],
where [math]\displaystyle{ m_1=|\mathcal{M}_1| }[/math] is the number of matchings of size 1 in the bipartite graph [math]\displaystyle{ G }[/math], which is just the number of edges in [math]\displaystyle{ G }[/math]. Therefore, [math]\displaystyle{ m_n }[/math] can be computed once we know [math]\displaystyle{ r_k }[/math] for [math]\displaystyle{ 1\lt k\le n }[/math].
Each [math]\displaystyle{ r_k }[/math] can be estimated by sampling uniformly from the set [math]\displaystyle{ \mathcal{M}_k\cup\mathcal{M}_{k-1} }[/math]. The algorithm for estimating [math]\displaystyle{ m_n }[/math] is outlined as:
- 1. For each [math]\displaystyle{ 1\lt k\le n }[/math], estimate [math]\displaystyle{ r_k }[/math] by uniformly sampling sufficiently many members from [math]\displaystyle{ \mathcal{M}_k\cup\mathcal{M}_{k-1} }[/math], as follows (see also the sketch after this outline):
- uniformly sample [math]\displaystyle{ N }[/math] matchings from [math]\displaystyle{ \mathcal{M}_k\cup\mathcal{M}_{k-1} }[/math], for some polynomially large [math]\displaystyle{ N }[/math];
- assuming that there are [math]\displaystyle{ X }[/math] sampled matchings of size [math]\displaystyle{ k }[/math], return [math]\displaystyle{ r_k }[/math] as [math]\displaystyle{ r_k=\frac{X}{N-X} }[/math].
- 2. Compute [math]\displaystyle{ m_n }[/math] as [math]\displaystyle{ m_n=m_1\prod_{k=2}^n r_k }[/math], where [math]\displaystyle{ m_1=|E| }[/math].
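A Python sketch of this outline is given below. The function sample_matching(G, k), assumed to return a (near-)uniform member of [math]\displaystyle{ \mathcal{M}_k\cup\mathcal{M}_{k-1} }[/math], is a hypothetical placeholder for the random-walk sampler discussed below.

```python
def estimate_num_perfect_matchings(G, n, num_edges, sample_matching, N):
    """Estimate m_n = m_1 * prod_{k=2}^{n} r_k, where each r_k is estimated from N samples."""
    estimate = num_edges                        # m_1 = |E|, the number of size-1 matchings
    for k in range(2, n + 1):
        # count how many of the N sampled matchings have size k (rather than k - 1)
        X = sum(1 for _ in range(N) if len(sample_matching(G, k)) == k)
        estimate *= X / (N - X)                 # r_k = X / (N - X)
    return estimate
```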
There are several issues that we have to deal with in order to have a fully functional FPRAS for counting perfect matchings.
- By taking the product of [math]\displaystyle{ r_k }[/math]'s, the errors for individual [math]\displaystyle{ r_k }[/math]'s add up.
- In order to accurately estimate [math]\displaystyle{ r_k }[/math] by sampling from [math]\displaystyle{ \mathcal{M}_k\cup\mathcal{M}_{k-1} }[/math], the ratio [math]\displaystyle{ r_k }[/math] should be within the range [math]\displaystyle{ \left[\frac{1}{\alpha},\alpha\right] }[/math] for some [math]\displaystyle{ \alpha }[/math] bounded by a polynomial in [math]\displaystyle{ n }[/math].
- Implement the uniform sampling from [math]\displaystyle{ \mathcal{M}_k\cup\mathcal{M}_{k-1} }[/math].
- Estimator for each [math]\displaystyle{ r_k }[/math]
- Accumulation of errors
- Near-uniform sampling from [math]\displaystyle{ \mathcal{M}_k\cup\mathcal{M}_{k-1} }[/math]
In the last lecture, we showed that by running a random walk, we can sample a near-uniform member of [math]\displaystyle{ \mathcal{M}_n\cup\mathcal{M}_{n-1} }[/math] in poly-time.
Volume estimation
We consider the problem of computing the volume of a given convex body [math]\displaystyle{ K }[/math] in [math]\displaystyle{ n }[/math] dimensions.
We use [math]\displaystyle{ \Upsilon(K)\, }[/math] to denote the volume of the convex body [math]\displaystyle{ K }[/math]. Abstractly, the problem is: given as input a convex body [math]\displaystyle{ K }[/math] in [math]\displaystyle{ n }[/math] dimensions, return [math]\displaystyle{ \Upsilon(K)\, }[/math].
We should be more specific about the input model. Since we allow an arbitrary convex body as input, it is not even clear how to describe the body. We assume that [math]\displaystyle{ K }[/math] is described by means of a membership oracle [math]\displaystyle{ \mathcal{O} }[/math], such that for a query of an [math]\displaystyle{ n }[/math]-dimensional point [math]\displaystyle{ x }[/math], [math]\displaystyle{ \mathcal{O}(x) }[/math] indicates whether [math]\displaystyle{ x\in K }[/math].
For example, the [math]\displaystyle{ n }[/math]-dimensional convex body defined by the intersection of [math]\displaystyle{ m }[/math] half-spaces, which is the set of feasible solutions to a system of [math]\displaystyle{ m }[/math] linear constraints, can be described as
- [math]\displaystyle{ A x\le \boldsymbol{b} }[/math],
for some [math]\displaystyle{ m\times n }[/math] matrix [math]\displaystyle{ A }[/math] and [math]\displaystyle{ m }[/math]-dimensional vector [math]\displaystyle{ \boldsymbol{b} }[/math]. For a query of an [math]\displaystyle{ x }[/math], the membership oracle [math]\displaystyle{ \mathcal{O} }[/math] just checks whether [math]\displaystyle{ Ax\le \boldsymbol{b} }[/math].
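For this half-space representation, the membership oracle is straightforward to implement; here is a minimal Python sketch (the function name is illustrative).

```python
def halfspace_oracle(A, b):
    """Membership oracle for the convex body {x : A x <= b}, given an m x n matrix A and an m-vector b."""
    def oracle(x):
        return all(sum(A[i][j] * x[j] for j in range(len(x))) <= b[i] for i in range(len(A)))
    return oracle
```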
For deterministic algorithms, there is bad news for this problem.
Theorem (Bárány-Füredi 1987)
Suppose a deterministic algorithm, for every convex body [math]\displaystyle{ K }[/math] in [math]\displaystyle{ n }[/math] dimensions, makes at most a polynomial (in [math]\displaystyle{ n }[/math]) number of calls to the membership oracle and returns two numbers [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] with [math]\displaystyle{ A\le \Upsilon(K)\le B }[/math]. Then there exists a convex body [math]\displaystyle{ K }[/math] for which [math]\displaystyle{ \frac{B}{A}\ge\left(\frac{cn}{\log n}\right)^n }[/math] for some constant [math]\displaystyle{ c\gt 0 }[/math].
That is to say, with deterministic algorithms we cannot approximate the volume even within a wildly loose range.