Randomized Algorithms \ Advanced Algorithms (Fall 2016)/Problem Set 1 and Advanced Algorithms (Fall 2016)/Nonconstructive Proof of Lovász Local Lemma: Difference between pages

Revision as of 09:30, 3 October 2016

Lovász Local Lemma

Given a sequence of events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math], we use the dependency graph to describe the dependencies between these events.

Definition

Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a sequence of events. A graph [math]\displaystyle{ D=(V,E) }[/math] on the set of vertices [math]\displaystyle{ V=\{1,2,\ldots,n\} }[/math] is called a dependency graph for the events [math]\displaystyle{ A_1,\ldots,A_n }[/math] if for each [math]\displaystyle{ i }[/math], [math]\displaystyle{ 1\le i\le n }[/math], the event [math]\displaystyle{ A_i }[/math] is mutually independent of all the events [math]\displaystyle{ \{A_j\mid (i,j)\not\in E\} }[/math].

The notion of mutual independence between an event and a set of events is formally defined as follows.

Definition

An event [math]\displaystyle{ A }[/math] is said to be mutually independent of events [math]\displaystyle{ B_1,B_2,\ldots, B_k }[/math], if for any disjoint [math]\displaystyle{ I^+,I^-\subseteq\{1,2,\ldots,k\} }[/math], it holds that

[math]\displaystyle{ \Pr\left[A\mid \bigwedge_{i\in I^+}B_i\wedge \bigwedge_{i\in I^-}\overline{B_i}\right]=\Pr[A] }[/math].
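The definition can be checked mechanically on a small finite probability space. The following is a minimal sketch and not part of the original notes: the helper names `prob` and `mutually_independent` and the two-coin example are illustrative assumptions, and conditions of probability zero are skipped because the conditional probability is undefined there. It exhibits an event that is independent of each of two events separately but is not mutually independent of the pair.

```python
from fractions import Fraction
from itertools import product

def prob(space, event):
    """Probability of `event` (a predicate on outcomes); `space` maps outcome -> probability."""
    return sum(p for w, p in space.items() if event(w))

def mutually_independent(space, a, bs):
    """Check Pr[A | B_i for i in I+, not B_i for i in I-] = Pr[A] for all disjoint I+, I-."""
    pa = prob(space, a)
    # Each B_i is used positively (+1), negatively (-1), or left out (0).
    for signs in product((1, -1, 0), repeat=len(bs)):
        def cond(w):
            return all(s == 0 or b(w) == (s == 1) for s, b in zip(signs, bs))
        pc = prob(space, cond)
        if pc == 0:
            continue  # conditional probability undefined, nothing to check
        pac = prob(space, lambda w: a(w) and cond(w))
        if pac != pa * pc:
            return False
    return True

# Hypothetical example: two fair independent bits (w[0], w[1]).
space = {w: Fraction(1, 4) for w in product((0, 1), repeat=2)}
A = lambda w: w[0] == 1
B1 = lambda w: w[1] == 1
B2 = lambda w: (w[0] ^ w[1]) == 1   # the two bits differ

print(mutually_independent(space, A, [B1]))      # A vs. B1 alone
print(mutually_independent(space, A, [B2]))      # A vs. B2 alone
print(mutually_independent(space, A, [B1, B2]))  # A vs. the pair
```

Exact rational arithmetic (`Fraction`) is used so that the equality test is not affected by floating-point rounding.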

Example

Let [math]\displaystyle{ X_1,X_2,\ldots,X_m }[/math] be a set of mutually independent random variables. Each event [math]\displaystyle{ A_i }[/math] is a predicate defined on a number of variables among [math]\displaystyle{ X_1,X_2,\ldots,X_m }[/math]. Let [math]\displaystyle{ v(A_i) }[/math] be the unique smallest set of variables which determine [math]\displaystyle{ A_i }[/math]. The dependency graph [math]\displaystyle{ D=(V,E) }[/math] is defined by

[math]\displaystyle{ (i,j)\in E }[/math] iff [math]\displaystyle{ v(A_i)\cap v(A_j)\neq \emptyset }[/math].
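The construction in this example is easy to carry out in code. The sketch below is an illustration under assumptions not found in the notes: events are represented only by their variable sets, indices are 0-based, and the function names `dependency_graph` and `max_degree` and the sample sets are hypothetical.

```python
from itertools import combinations

def dependency_graph(var_sets):
    """Return the edges (i, j), i < j, for which v(A_i) and v(A_j) intersect."""
    return {
        (i, j)
        for (i, vi), (j, vj) in combinations(enumerate(var_sets), 2)
        if vi & vj
    }

def max_degree(n, edges):
    """Maximum degree of the graph on vertices 0..n-1 with the given edges."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return max(deg, default=0)

# Hypothetical variable sets v(A_0), ..., v(A_3), given as sets of variable indices.
var_sets = [{1, 2}, {2, 3}, {3, 4}, {5}]
edges = dependency_graph(var_sets)
print(sorted(edges), max_degree(len(var_sets), edges))
```

The maximum degree computed here is the quantity that plays the role of the parameter d in the symmetric local lemma.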

The following lemma, known as the Lovász local lemma, first proved by Erdős and Lovász in 1975, is an extremely powerful tool, as it supplies a way of dealing with rare events.

Lovász Local Lemma (symmetric case)

Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a set of events, and assume that the following hold:

for all [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ \Pr[A_i]\le p }[/math];
the maximum degree of the dependency graph for the events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] is [math]\displaystyle{ d }[/math], and

[math]\displaystyle{ ep(d+1)\le 1 }[/math].

Then

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\gt 0 }[/math].
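As a quick numerical illustration, the condition of the symmetric case can be evaluated directly. This is a sketch under assumptions: the function names are hypothetical, floating-point arithmetic is used, and the sample value p = 2^(-9) is chosen only for illustration (it is, for instance, the probability that a fixed set of 10 vertices is monochromatic under a uniformly random 2-coloring).

```python
import math

def symmetric_lll_holds(p, d):
    """True if e * p * (d + 1) <= 1, the condition of the symmetric local lemma."""
    return math.e * p * (d + 1) <= 1

def max_degree_allowed(p):
    """Largest integer d with e * p * (d + 1) <= 1 (negative if no d >= 0 works)."""
    return math.floor(1 / (math.e * p)) - 1

p = 2 ** -9                      # illustrative value, not from the notes
d_max = max_degree_allowed(p)
print(d_max, symmetric_lll_holds(p, d_max), symmetric_lll_holds(p, d_max + 1))
```

Because the check uses floating point, values of p and d for which e·p·(d+1) is extremely close to 1 should be examined by hand.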

We will prove a general version of the local lemma, where the events [math]\displaystyle{ A_i }[/math] are not symmetric. This generalization is due to Spencer.

Lovász Local Lemma (general case)

Let [math]\displaystyle{ D=(V,E) }[/math] be the dependency graph of events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math]. Suppose there exist real numbers [math]\displaystyle{ x_1,x_2,\ldots, x_n }[/math] such that [math]\displaystyle{ 0\le x_i\lt 1 }[/math] and for all [math]\displaystyle{ 1\le i\le n }[/math],

[math]\displaystyle{ \Pr[A_i]\le x_i\prod_{(i,j)\in E}(1-x_j) }[/math].

Then

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i) }[/math].
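The hypothesis of the general case is a finite list of inequalities, so for a concrete instance it can be verified directly. The sketch below assumes a hypothetical representation (an adjacency list `adj`, a list `probs` of upper bounds on the event probabilities, and a list `x` of candidate reals) and uses exact rationals; none of these names come from the notes.

```python
from fractions import Fraction
from math import prod

def general_lll_holds(probs, x, adj):
    """Check 0 <= x_i < 1 and Pr[A_i] <= x_i * prod_{j ~ i} (1 - x_j) for every i."""
    return all(
        0 <= x[i] < 1 and probs[i] <= x[i] * prod(1 - x[j] for j in adj[i])
        for i in range(len(probs))
    )

def lll_lower_bound(x):
    """The guaranteed lower bound prod_i (1 - x_i) on Pr[no A_i occurs]."""
    return prod(1 - xi for xi in x)

# Hypothetical instance: three events whose dependency graph is the path 0 - 1 - 2.
adj = [[1], [0, 2], [1]]
x = [Fraction(1, 3)] * 3
probs = [Fraction(1, 5), Fraction(1, 7), Fraction(1, 5)]

print(general_lll_holds(probs, x, adj), lll_lower_bound(x))
```

In this instance the end vertices may have probability up to 2/9 and the middle vertex up to 4/27, which shows how the general form lets vertices of lower degree tolerate larger probabilities.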

To see that the general LLL implies the symmetric LLL, we set [math]\displaystyle{ x_i=\frac{1}{d+1} }[/math] for all [math]\displaystyle{ i=1,2,\ldots,n }[/math]. Since [math]\displaystyle{ \left(1+\frac{1}{d}\right)^d\lt \mathrm{e} }[/math] for every [math]\displaystyle{ d\ge 1 }[/math], we have [math]\displaystyle{ \left(1-\frac{1}{d+1}\right)^d\gt \frac{1}{\mathrm{e}} }[/math].

Assume the conditions of the symmetric LLL:

for all [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ \Pr[A_i]\le p }[/math];
[math]\displaystyle{ ep(d+1)\le 1 }[/math];

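then, since every vertex of the dependency graph has at most [math]\displaystyle{ d }[/math] neighbors and [math]\displaystyle{ 0\lt 1-\frac{1}{d+1}\le 1 }[/math], it is easy to verify that for all [math]\displaystyle{ 1\le i\le n }[/math],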

[math]\displaystyle{ \Pr[A_i]\le p\le\frac{1}{e(d+1)}\lt \frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le x_i\prod_{(i,j)\in E}(1-x_j) }[/math].

Due to the general LLL, we have

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)=\left(1-\frac{1}{d+1}\right)^n\gt 0 }[/math].

This proves the symmetric LLL.
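The key inequality in this reduction can be sanity-checked numerically. The following sketch is only an illustration in floating point over a finite range of d; the function name `reduction_bound` is hypothetical and the check is no substitute for the analytic argument.

```python
import math

def reduction_bound(d):
    """x_i * prod (1 - x_j) with all x = 1/(d+1) and a vertex of degree exactly d."""
    x = 1 / (d + 1)
    return x * (1 - x) ** d

# The symmetric condition gives p <= 1/(e(d+1)); the reduction needs this
# to be strictly below reduction_bound(d).
ok = all(reduction_bound(d) > 1 / (math.e * (d + 1)) for d in range(1, 1001))
print(ok)
```

A vertex of degree smaller than d only makes the product larger, so the check for degree exactly d covers the worst case.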

Now we prove the general LLL by the original induction proof.

Proof.

First, apply the chain rule. We have

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]=\prod_{i=1}^n\left(1-\Pr\left[{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right) }[/math].

Next we prove by induction on [math]\displaystyle{ m }[/math] that for any [math]\displaystyle{ m }[/math] distinct indices [math]\displaystyle{ i_1,\ldots,i_m }[/math],

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]\le x_{i_1} }[/math].

The local lemma follows immediately by the above chain rule.

For [math]\displaystyle{ m=1 }[/math], this is obvious because

[math]\displaystyle{ \Pr[A_{i_1}]\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)\le x_{i_1} }[/math].

For general [math]\displaystyle{ m }[/math], reorder [math]\displaystyle{ i_2,\ldots,i_m }[/math] so that [math]\displaystyle{ i_2,\ldots,i_k }[/math] are exactly the vertices among them that are adjacent to [math]\displaystyle{ i_1 }[/math] in the dependency graph. Then the event [math]\displaystyle{ A_{i_1} }[/math] is mutually independent of [math]\displaystyle{ A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}} }[/math]. By the definition of conditional probability, we have

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right] =\frac{\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]} {\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]} }[/math].

First, we bound the numerator. Since [math]\displaystyle{ A_{i_1} }[/math] is mutually independent of [math]\displaystyle{ A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}} }[/math], we have

[math]\displaystyle{ \begin{align} \Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right] &\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]\\ &=\Pr[A_{i_1}]\\ &\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j). \end{align} }[/math]

Next, we bound the denominator. Applying the chain rule, we have

[math]\displaystyle{ \Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right] =\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{\ell=j+1}^m\overline{A_{i_\ell}}\right] }[/math]

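which, by the induction hypothesis applied to each factor (each factor involves at most [math]\displaystyle{ m-1 }[/math] events), is at least

[math]\displaystyle{ \prod_{j=2}^k(1-x_{i_j})=\prod_{2\le j\le m:\,(i_1,i_j)\in E}(1-x_{i_j}) }[/math]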

where [math]\displaystyle{ E }[/math] is the set of edges in the dependency graph.

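Altogether, this completes the induction step. Every factor [math]\displaystyle{ 1-x_j }[/math] lies in [math]\displaystyle{ (0,1] }[/math], and the product in the denominator ranges over a subset of the neighbors of [math]\displaystyle{ i_1 }[/math], so

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right] \le\frac{x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)}{\prod_{2\le j\le m:\,(i_1,i_j)\in E}(1-x_{i_j})}\le x_{i_1}. }[/math]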
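The claim proved by induction can be checked exhaustively on a toy instance. The sketch below is an assumption-laden illustration, not part of the notes: it takes m independent bits that are 1 with probability q, lets the i-th event be "bits i and i+1 are both 1" (so the dependency graph is a path), and uses x_i = 1/3 for all i, which satisfies the hypothesis of the general lemma when q = 1/3 because 1/9 ≤ (1/3)(2/3)^2. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def weight(bits, q):
    """Probability of the outcome `bits` when each bit is 1 independently with prob. q."""
    ones = sum(bits)
    return q ** ones * (1 - q) ** (len(bits) - ones)

def occurs(bits, i):
    """Event A_i: bits i and i+1 are both 1."""
    return bits[i] == 1 and bits[i + 1] == 1

def worst_conditional(m, q):
    """Max over i and over sets S not containing i of Pr[A_i | no A_j occurs, j in S]."""
    n = m - 1  # number of events
    outcomes = list(product((0, 1), repeat=m))
    worst = Fraction(0)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for cond in combinations(others, r):
                safe = [w for w in outcomes if not any(occurs(w, j) for j in cond)]
                denom = sum(weight(w, q) for w in safe)  # > 0: all-zero outcome is safe
                numer = sum(weight(w, q) for w in safe if occurs(w, i))
                worst = max(worst, numer / denom)
    return worst

worst = worst_conditional(5, Fraction(1, 3))
print(worst, worst <= Fraction(1, 3))
```

The enumeration is exponential in m and is meant only as a check of the statement on a tiny example.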

By the chain rule and the bound just proved, it holds that

[math]\displaystyle{ \begin{align} \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right] &=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\\ &=\prod_{i=1}^n\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)\\ &\ge\prod_{i=1}^n\left(1-x_i\right). \end{align} }[/math]

[math]\displaystyle{ \square }[/math]
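To see the final bound in action, one can compare it with an exact computation on a small instance. The sketch below rests on illustrative assumptions that are not in the notes: m = 6 independent bits, each equal to 1 with probability q = 1/3, the i-th event being "bits i and i+1 are both 1", a path as dependency graph, and x_i = 1/3 for every event. The names `prob_no_bad_event`, `adj`, `exact` and `bound` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import prod

def prob_no_bad_event(m, q):
    """Exact Pr[no two consecutive bits are both 1] for m independent bits with Pr[1] = q."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=m):
        if any(bits[i] and bits[i + 1] for i in range(m - 1)):
            continue  # some event A_i occurs
        ones = sum(bits)
        total += q ** ones * (1 - q) ** (m - ones)
    return total

m, q = 6, Fraction(1, 3)
n_events = m - 1
x = [Fraction(1, 3)] * n_events
# Event i shares a variable only with events i-1 and i+1.
adj = [[j for j in (i - 1, i + 1) if 0 <= j < n_events] for i in range(n_events)]

# Hypothesis of the general lemma: Pr[A_i] = q^2 <= x_i * prod_{j ~ i} (1 - x_j).
hypothesis = all(q * q <= x[i] * prod(1 - x[j] for j in adj[i]) for i in range(n_events))

exact = prob_no_bad_event(m, q)
bound = prod(1 - xi for xi in x)
print(hypothesis, exact, bound, exact >= bound)
```

The exact probability is considerably larger than the guaranteed bound here, which is typical: the lemma guarantees positivity under local conditions rather than a tight estimate.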

@@ Line 1: / Line 1: @@
-== Problem 1==
+== Lovász Local Lemma==
-For any <math>\alpha\ge 1</math>, a cut <math>C</math> in an undirected (multi)graph <math>G(V,E)</math> is called an <math>\alpha</math>-min-cut if <math>|C|\le\alpha|C^*|</math> where <math>C^*</math> is a min-cut in <math>G</math>.
+Given a sequence of events <math>A_1,A_2,\ldots,A_n</math>, we use the '''dependency graph''' to describe the dependencies between these events.
-# Give a lower bound to the probability that Karger's Random Contraction algorithm returns an <math>\alpha</math>-min-cut in a graph <math>G(V,E)</math> of <math>n</math> vertices.
+{{Theorem
-# Use the above bound to estimate the number of distinct <math>\alpha</math>-min cuts in <math>G</math>.
+|Definition|
+:Let <math>A_1,A_2,\ldots,A_n</math> be a sequence of events. A graph <math>D=(V,E)</math> on the set of vertices <math>V=\{1,2,\ldots,n\}</math> is called a '''dependency graph''' for the events <math>A_1,\ldots,A_n</math> if for each <math>i</math>, <math>1\le i\le n</math>, the event <math>A_i</math> is mutually independent of all the events <math>\{A_j\mid (i,j)\not\in E\}</math>.
+}}
-== Problem 2==
+The notion of mutual independence between an event and a set of events is formally defined as follows.
-Let <math>G(V,E)</math> be an undirected graph with positive edge weights <math>w:E\to\mathbb{Z}^+</math>. Given a partition of <math>V</math> into <math>k</math> disjoint subsets <math>S_1,S_2,\ldots,S_k</math>, we define
+{{Theorem|Definition|
-:<math>w(S_1,S_2,\ldots,S_k)=\sum_{uv\in E\atop \exists i\neq j: u\in S_i,v\in S_j}w(uv)</math>
+:An event <math>A</math> is said to be '''mutually independent''' of events <math>B_1,B_2,\ldots, B_k</math>, if for any disjoint <math>I^+,I^-\subseteq\{1,2,\ldots,k\}</math>, it holds that
-as the cost of the '''<math>k</math>-cut''' <math>\{S_1,S_2,\ldots,S_k\}</math>. Our goal is to find a <math>k</math>-cut with maximum cost.
+::<math>\Pr\left[A\mid \bigwedge_{i\in I^+}B_i\wedge \bigwedge_{i\in I^-}\overline{B_i}\right]=\Pr[A]</math>.
-# Give a poly-time greedy algorithm for finding the weighted max <math>k</math>-cut. Prove that the approximation ratio is <math>(1-1/k)</math>.
+}}
-# Consider the following local search algorithm for the weighted max cut (max 2-cut).
- start with an arbitrary bipartition of <math>V</math> into disjoint <math>S_0,S_1</math>;
- while (true) do
-    if <math>\exists i\in\{0,1\}</math> and <math>v\in S_i</math> such that <font color=red>(______________)</font>
-       then <math>v</math> leaves <math>S_i</math> and joins <math>S_{1-i}</math>;
-       continue;
-    end if
-    break;
- end
-:Fill in the blank parenthesis. Give an analysis of the running time of the algorithm. And prove that the approximation ratio is 0.5.
-== Problem 3==
+;Example
-Given <math>m</math> subsets <math>S_1,S_2,\ldots, S_m\subseteq U</math> of a universe <math>U</math> of size <math>n</math>, we want to find a <math>C\subseteq\{1,2,\ldots, m\}</math> of fixed size <math>k=|C|</math> with the maximum '''coverage''' <math>\left|\bigcup_{i\in C}S_i\right|</math>.
+:Let <math>X_1,X_2,\ldots,X_m</math> be a set of ''mutually independent'' random variables. Each event <math>A_i</math> is a predicate defined on a number of variables among <math>X_1,X_2,\ldots,X_m</math>. Let <math>v(A_i)</math> be the unique smallest set of variables which determine <math>A_i</math>. The dependency graph <math>D=(V,E)</math> is defined by
+:::<math>(i,j)\in E</math> iff <math>v(A_i)\cap v(A_j)\neq \emptyset</math>.
-* Give a poly-time greedy algorithm for the problem. Prove that the approximation ratio is <math>1-(1-1/k)^k>1-1/e</math>.
+The following lemma, known as the Lovász local lemma, first proved by Erdős and Lovász in 1975, is an extremely powerful tool, as it supplies a way of dealing with rare events.
-== Problem 4==
+{{Theorem
-We consider minimum makespan scheduling on parallel identical machines when jobs are subject to '''precedence constraints'''.
+|Lovász Local Lemma (symmetric case)|
+:Let <math>A_1,A_2,\ldots,A_n</math> be a set of events, and assume that the following hold:
+:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
+:#the maximum degree of the dependency graph for the events <math>A_1,A_2,\ldots,A_n</math> is <math>d</math>, and
+:::<math>ep(d+1)\le 1</math>.
+:Then
+::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]>0</math>.
+}}
-We still want to schedule <math>n</math> jobs <math>j=1,2,\ldots, n</math> on <math>m</math> identical machines, where job <math>j</math> has  processing time <math>p_j</math>. But now a partial order <math>\preceq</math> is defined on jobs, so that if <math>j\prec k</math> then job <math>j</math> must be completely finished before job <math>k</math> begins. The following is a variant of the ''List'' algorithm for this problem.
+We will prove a general version of the local lemma, where the events <math>A_i</math> are not symmetric. This generalization is due to Spencer.
- Input: a list of <math>n</math> jobs with processing times <math>p_1,p_2,\ldots, p_n</math>;
+{{Theorem
+|Lovász Local Lemma (general case)|
+:Let <math>D=(V,E)</math> be the dependency graph of events <math>A_1,A_2,\ldots,A_n</math>. Suppose there exist real numbers <math>x_1,x_2,\ldots, x_n</math> such that <math>0\le x_i<1</math> and for all <math>1\le i\le n</math>,
+::<math>\Pr[A_i]\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.
+:Then
+::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)</math>.
+}}
- whenever a machine becomes idle
+To see that the general LLL implies the symmetric LLL, we set <math>x_i=\frac{1}{d+1}</math> for all <math>i=1,2,\ldots,n</math>. Since <math>\left(1+\frac{1}{d}\right)^d<\mathrm{e}</math> for every <math>d\ge 1</math>, we have <math>\left(1-\frac{1}{d+1}\right)^d>\frac{1}{\mathrm{e}}</math>.
-     assign the next ''available'' job on the list to the machine;
-Here a job <math>k</math> is available if all jobs <math>j\prec k</math> have already been completely processed.
+Assume the conditions of the symmetric LLL:
+:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
+:#<math>ep(d+1)\le 1</math>;
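+then, since every vertex of the dependency graph has at most <math>d</math> neighbors and <math>0<1-\frac{1}{d+1}\le 1</math>, it is easy to verify that for all <math>1\le i\le n</math>,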
+:<math>\Pr[A_i]\le p\le\frac{1}{e(d+1)}<\frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.
+Due to the general LLL, we have
+:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)=\left(1-\frac{1}{d+1}\right)^n>0</math>.
+This proves the symmetric LLL.
-* Prove that the approximation ratio is 2.
+Now we prove the general LLL by the original induction proof.
+{{Proof|
+First, apply the chain rule. We have
+:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]=\prod_{i=1}^n\left(1-\Pr\left[{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)</math>.
-== Problem 5 ==
+Next we prove by induction on <math>m</math> that for any <math>m</math> distinct indices <math>i_1,\ldots,i_m</math>,
-For a '''hypergraph''' <math>H(V,E)</math> with vertex set <math>V</math>, every '''hyperedge''' <math>e\in E</math> is a subset <math>e\subset V</math> of vertices, not necessarily of size 2. A hypergraph <math>H(V,E)</math> is '''<math>k</math>-uniform''' if every hyperedge <math>e\in E</math> is of size <math>k=|e|</math>.
+:<math>\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]\le x_{i_1}</math>.
+The local lemma follows immediately by the above chain rule.
-A hypergraph <math>H(V,E)</math> is said to have '''property B''' (named after Bernstein) if <math>H</math> is 2-colorable; that is, if there is a '''proper 2-coloring''' <math>f:V\to\{{\color{red}R},{\color{blue}B}\}</math> which assigns each vertex one of the two colors <font color=red>Red</font> or <font color=blue>Blue</font>, such that no hyperedge is ''monochromatic''.
+For <math>m=1</math>, this is obvious because
+:<math>\Pr[A_{i_1}]\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)\le x_{i_1}</math>.
-# Let <math>H(V,E)</math> be a <math>k</math>-uniform hypergraph in which every hyperedge <math>e\in E</math> shares vertices with at most <math>d</math> other hyperedges.
+For general <math>m</math>, reorder <math>i_2,\ldots,i_m</math> so that <math>i_2,\ldots,i_k</math> are exactly the vertices among them that are adjacent to <math>i_1</math> in the dependency graph. Then the event <math>A_{i_1}</math> is mutually independent of <math>A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}}</math>.
-#*Show that if <math>2\mathrm{e}\cdot (d+1)\le 2^{k}</math>, then <math>H</math> has property B.
+By the definition of conditional probability, we have
-#*Describe how to use Moser's recursive Fix algorithm to find a proper 2-coloring of <math>H</math>. Give the pseudocode. Prove the condition in terms of <math>d</math> and <math>k</math> under which the algorithm can find a 2-coloring of <math>H</math> with high probability.
+:<math>
-#*Describe how to use Moser-Tardos random solver to find a proper 2-coloring of <math>H</math>. Give the pseudocode. Prove the condition in terms of <math>d</math> and <math>k</math> under which the algorithm can find a 2-coloring of <math>H</math> within bounded expected time.
+\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
-# Let <math>H(V,E)</math> be a hypergraph (not necessarily uniform) with at least <math>n\ge 2</math> vertices satisfying that
+=\frac{\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
-#:<math>\forall v\in V, \sum_{e\ni v}(1-1/k)^{-|e|}2^{-|e|+1}\le \frac{1}{n}</math>.
+{\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
-#*Show that <math>H</math> has property B.
+</math>.
-#*Describe how to use Moser-Tardos random solver to find a proper 2-coloring of <math>H</math>. Give an upper bound on the expected running time.
+First, we bound the numerator. Since <math>A_{i_1}</math> is mutually independent of <math>A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}}</math>, we have
+:<math>
+\begin{align}
+\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
+&\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]\\
+&=\Pr[A_{i_1}]\\
+&\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j).
+\end{align}
+</math>
+Next, we bound the denominator. Applying the chain rule, we have
+:<math>
+\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
+=\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{\ell=j+1}^m\overline{A_{i_\ell}}\right]
+</math>
+which, by the induction hypothesis applied to each factor (each factor involves at most <math>m-1</math> events), is at least
+:<math>
+\prod_{j=2}^k(1-x_{i_j})=\prod_{2\le j\le m:\,(i_1,i_j)\in E}(1-x_{i_j})
+</math>
+where <math>E</math> is the set of edges in the dependency graph.
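+Altogether, this completes the induction step. Every factor <math>1-x_j</math> lies in <math>(0,1]</math>, and the product in the denominator ranges over a subset of the neighbors of <math>i_1</math>, so
+:<math>
+\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
+\le\frac{x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)}{\prod_{2\le j\le m:\,(i_1,i_j)\in E}(1-x_{i_j})}\le x_{i_1}.
+</math>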
+By the chain rule and the bound just proved, it holds that
+:<math>
+\begin{align}
+\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]
+&=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\\
+&=\prod_{i=1}^n\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)\\
+&\ge\prod_{i=1}^n\left(1-x_i\right).
+\end{align}
+</math>
+}}
