高级算法 (Fall 2016)/Min-Cut and Max-Cut and 高级算法 (Fall 2016)/Nonconstructive Proof of Lovász Local Lemma: Difference between pages

From TCS Wiki
(Difference between pages)
Jump to navigation Jump to search
imported>Etone
 
imported>Etone
No edit summary
 
Line 1: Line 1:
<font color=red size=3> under construction</font>[[File:Under_construction.png‎|30px]]
Given a sequence of events <math>A_1,A_2,\ldots,A_n</math>, we use the '''dependency graph''' to describe the dependencies between these events.


= Graph Cut =
{{Theorem
Let <math>G(V, E)</math> be an undirected graph. A subset <math>C\subseteq E</math> of edges is a '''cut''' of graph <math>G</math> if <math>G</math> becomes ''disconnected'' after deleting all edges in <math>C</math>.
|Definition (dependency graph)|
 
:Let <math>A_1,A_2,\ldots,A_n</math> be a sequence of events. A graph <math>D=(V,E)</math> on the set of vertices <math>V=\{1,2,\ldots,n\}</math> is called a '''dependency graph''' for the events <math>A_1,\ldots,A_n</math> if for each <math>i</math>, <math>1\le i\le n</math>, the event <math>A_i</math> is mutually independent of all the events <math>\{A_j\mid (i,j)\not\in E\}</math>.
More formally, a pair of ''disjoint'' subsets <math>S,T\subseteq V</math> of vertices is called a '''bipartition''' of <math>V</math> if <math>S</math> and <math>T</math> are not empty and <math>S\cup T=V</math>.
Given a bipartition <math>\{S,T\}</math> of <math>V</math>, we denote by
:<math>E(S,T)=\{uv\in E\mid u\in S, v\in T\}</math>
the set of "crossing edges" with one endpoint in each of <math>S</math> and <math>T</math>.
 
Then every cut <math>C\subseteq E</math> in <math>G</math> corresponds to a
:<math>C=E(S,T),\quad \{S,T\}\mbox{ is a bipartition of }V</math>.
 
Given a graph <math>G</math>, there might be many cuts in <math>G</math>, and we are interested in finding its '''minimum''' or '''maximum''' cut.
 
= Min-Cut =
The '''min-cut problem''', also called the '''global minimum cut problem''', is defined as follows.
{{Theorem|Min-cut problem|
*'''Input''': an undirected graph <math>G(V,E)</math>;
*'''Output''': a cut <math>C</math> in <math>G</math> with the smallest size <math>|C|</math>.
}}
}}


Equivalently, the problem asks to find a bipartition of <math>V</math> into disjoint non-empty subsets <math>S</math> and <math>T</math> that minimizes <math>|E(S,T)|</math>.
The notion of mutual independence between an event and a set of events is formally defined as follows.
 
{{Theorem|Definition (mutual independence)|
We consider the problem in a slightly more generalized setting, where the input graphs <math>G</math> can be '''multi-graphs''', meaning that there could be multiple edges between two vertices <math>u</math> and <math>v</math>. We call such edges the '''parallel edges'''. The cuts in multi-graphs are defined in the same way as before, and the cost of a cut <math>C</math> is given by the total number of edges (including parallel edges) in <math>C</math>. Equivalently, one may think of a multi-graph as a graph with integer edge weights, and the cost of a cut <math>C</math> is the total weights of all edges in <math>C</math>.
:An event <math>A</math> is said to be '''mutually independent''' of events <math>B_1,B_2,\ldots, B_k</math>, if for any disjoint <math>I^+,I^-\subseteq\{1,2,\ldots,k\}</math>, it holds that
 
::<math>\Pr\left[A \mid \left(\bigwedge_{i\in I^+}B_i\right) \wedge \left(\bigwedge_{i\in I^-}\overline{B_i}\right)\right]=\Pr[A]</math>.
A canonical deterministic algorithm for this problem is through the [http://en.wikipedia.org/wiki/Max-flow_min-cut_theorem max-flow min-cut theorem]. The max-flow algorithm finds us a minimum '''<math>s</math>-<math>t</math> cut''', which disconnects a '''source''' <math>s\in V</math> from a '''sink''' <math>t\in V</math>, both specified as part of the input. A global min cut can be found by exhaustively finding the minimum <math>s</math>-<math>t</math> cut for an arbitrarily fixed source <math>s</math> and all possible sink <math>t\neq s</math>. This takes <math>(n-1)\times</math>max-flow time where <math>n=|V|</math> is the number of vertices.
 
The fastest known deterministic algorithm for the minimum cut problem on multi-graphs is the [https://en.wikipedia.org/wiki/Stoer–Wagner_algorithm Stoer–Wagner algorithm], which achieves an <math>O(mn+n^2\log n)</math> time complexity where <math>m=|E|</math> is the total number of edges (counting the parallel edges).
 
If we restrict the input to be '''simple graphs''' (meaning there is no parallel edges) with no edge weight, there are better algorithms. The [http://delivery.acm.org/10.1145/2750000/2746588/p665-kawarabayashi.pdf?ip=114.212.86.114&id=2746588&acc=ACTIVE%20SERVICE&key=BF85BBA5741FDC6E%2E180A41DAF8736F97%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&CFID=839435129&CFTOKEN=67928165&__acm__=1474187635_eafe662feeb838ca5ece2f6b56715177 most recent one] was published in STOC 2015, achieving a near-linear (in the number of edges) time complexity.
 
== Karger's ''Contraction'' algorithm ==
We will describe a simple and elegant randomized algorithm for the min-cut problem. The algorithm is due to [http://people.csail.mit.edu/karger/ David Karger].
 
Let <math>G(V, E)</math> be a '''multi-graph''', which allows more than one '''parallel edges''' between two distinct vertices <math>u</math> and <math>v</math> but does not allow any '''self-loops''': the edges that adjoin a vertex to itself. A multi-graph <math>G</math> can be represented by an adjacency matrix <math>A</math>, in the way that each non-diagonal entry <math>A(u,v)</math> takes nonnegative integer values instead of just 0 or 1, representing the number of parallel edges between <math>u</math> and <math>v</math> in <math>G</math>, and all diagonal entries <math>A(v,v)=0</math> (since there is no self-loop).
 
Given a multi-graph <math>G(V,E)</math> and an edge <math>e\in E</math>, we define the following '''contraction''' operator Contract(<math>G</math>, <math>e</math>), which transform <math>G</math> to a new multi-graph.
{{Theorem|The contraction operator ''Contract''(<math>G</math>, <math>e</math>)|
:say <math>e=uv</math>:
:*replace <math>\{u,v\}</math> by a new vertex <math>x</math>;
:*for every edge (no matter parallel or not) in the form of <math>uw</math> or <math>vw</math> that connects one of <math>\{u,v\}</math> to a vertex <math>w\in V\setminus\{u,v\}</math> in the graph other than <math>u,v</math>, replace it by a new edge <math>xw</math>;
:*the reset of the graph does not change.
}}
}}


In other words, the <math>Contract(G,uv)</math> merges the two vertices <math>u</math> and <math>v</math> into a new vertex <math>x</math> whose incident edges preserves the edges incident to <math>u</math> or <math>v</math> in the original graph <math>G</math> except for the parallel edges between them. Now you should realize why we consider multi-graphs instead of simple graphs, because even if we start with a simple graph without parallel edges, the contraction operator may create parallel edges.
;Example
:Let <math>X_1,X_2,\ldots,X_m</math> be a set of ''mutually independent'' random variables. Each event <math>A_i</math> is a predicate defined on a number of variables among <math>X_1,X_2,\ldots,X_m</math>. Let <math>v(A_i)</math> be the unique smallest set of variables which determine <math>A_i</math>. The dependency graph <math>D=(V,E)</math> is defined by
:::<math>(i,j)\in E</math> iff <math>v(A_i)\cap v(A_j)\neq \emptyset</math>.


The contraction operator is illustrated by the following picture:
The following lemma, known as the Lovász local lemma, first proved by Erdős and Lovász in 1975, is an extremely powerful tool, as it supplies a way for dealing with rare events.
[[Image:Contract.png|600px|center]]


Karger's algorithm uses a simple idea:
{{Theorem
*At each step we randomly select an edge in the current multi-graph to contract until there are only two vertices left.
|Lovász Local Lemma (symmetric case)|
*The parallel edges between these two remaining vertices must be a cut of the original graph.
:Let <math>A_1,A_2,\ldots,A_n</math> be a set of events, and assume that there is a <math>p\in[0,1)</math> such that the followings are satisfied:
*We return this cut and hope that with good chance this gives us a minimum cut.
:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
The following is the pseudocode for Karger's algorithm.
:#the maximum degree of the dependency graph for the events <math>A_1,A_2,\ldots,A_n</math> is <math>d</math>, and
{{Theorem|''RandomContract'' (Karger 1993)|
:::<math>\mathrm{e}p\cdot (d+1)\le 1</math>.
:'''Input:''' multi-graph <math>G(V,E)</math>;
:Then
:
::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]>0</math>.
:while <math>|V|>2</math> do
:* choose an edge <math>uv\in E</math> uniformly at random;
:* <math>G=Contract(G,uv)</math>;
:return <math>C=E</math> (the parallel edges between the only two vertices in <math>V</math>);
}}
}}


Another way of looking at the contraction operator Contract(<math>G</math>,<math>e</math>) is that we are dealing with classes of vertices. Let <math>V=\{v_1,v_2,\ldots,v_n\}</math> be the set of all vertices. We start with <math>n</math> vertex classes <math>S_1,S_2,\ldots, S_n</math> with each class <math>S_i=\{v_i\}</math> contains one vertex. By calling <math>Contract(G,uv)</math>, where <math>u\in S_i</math> and <math>v\in S_j</math> for distinct <math>i\neq j</math>, we take union of <math>S_i</math> and <math>S_j</math>. The edges in the contracted multi-graph are the edges that cross between different vertex classes.
We will prove a general version of the local lemma, where the events <math>A_i</math> are not symmetric. This generalization is due to Spencer.
 
{{Theorem
This view of contraction is illustrated by the following picture:
|Lovász Local Lemma (general case)|
[[Image:Contract_class.png|600px|center]]
:Let <math>D=(V,E)</math> be the dependency graph of events <math>A_1,A_2,\ldots,A_n</math>. Suppose there exist real numbers <math>x_1,x_2,\ldots, x_n</math> such that <math>0\le x_i<1</math> and for all <math>1\le i\le n</math>,
 
::<math>\Pr[A_i]\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.
The following claim is left as an exercise for the class:
:Then
:{|border="2" width="100%" cellspacing="4" cellpadding="3" rules="all" style="margin:1em 1em 1em 0; border:solid 1px #AAAAAA; border-collapse:collapse;empty-cells:show;"
::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)</math>.
|
*With suitable choice of data structures, each operation <math>Contract(G,e)</math> can be implemented within running time <math>O(n)</math> where <math>n=|V|</math> is the number of vertices.
|}
 
In the above '''''RandomContract''''' algorithm, there are precisely <math>n-2</math> contractions. Therefore, we have the following time upper bound.
{{Theorem|''Theorem''|
:The time of a single running of the '''''RandomContract''''' algorithm is bounded by <math>O(n^2)</math>.
}}
}}
The reason that here we emphasize it's a "single running" will be clear later: because we may run this algorithm for many times to guarantee a desirable accuracy.
== Analysis of accuracy ==
We now analyze the performance of the above algorithm. Since the algorithm is '''''randomized''''', its output cut is a random variable even when the input is fixed, so ''the output may not always be correct''. We want to give a theoretical guarantee of the chance that the algorithm returns a correct answer on an arbitrary input.
More precisely, on an arbitrarily fixed input multi-graph <math>G</math>, we want to answer the following question rigorously:
:<math>p_{\text{correct}}=\Pr[\,\text{a minimum cut is returned by }RandomContract\,]\ge ?</math>


To answer this question, we prove a stronger statement: for arbitrarily fixed input multi-graph <math>G</math> and a particular minimum cut <math>C</math> in <math>G</math>,  
To see that the general LLL implies symmetric LLL, we set <math>x_i=\frac{1}{d+1}</math> for all <math>i=1,2,\ldots,n</math>. Then we have <math>\left(1-\frac{1}{d+1}\right)^d>\frac{1}{\mathrm{e}}</math>.
:<math>p_{C}=\Pr[\,C\mbox{ is returned by }RandomContract\,]\ge ?</math>
Obviously this will imply the previous lower bound for <math>p_{\text{correct}}</math> because the event in <math>p_{C}</math> implies the event in <math>p_{\text{correct}}</math>.
:{|border="2" width="100%" cellspacing="4" cellpadding="3" rules="all" style="margin:1em 1em 1em 0; border:solid 1px #AAAAAA; border-collapse:collapse;empty-cells:show;"
|
*In above argument we use the simple law in probability that <math>\Pr[A]\le \Pr[B]</math> if <math>A\subseteq B</math>, i.e. event <math>A</math> implies event <math>B</math>.
|}


We introduce the following notations:
Assume the condition in the symmetric LLL:
*Let <math>e_1,e_2,\ldots,e_{n-2}</math> denote the sequence of random edges chosen to contract in a running of ''RandomContract'' algorithm.
:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
*Let <math>G_1=G</math> denote the original input multi-graph. And for <math>i=1,2,\ldots,n-2</math>, let <math>G_{i+1}=Contract(G_{i},e_i)</math> be the multigraph after <math>i</math>th contraction.
:#<math>\mathrm{e}p\cdot(d+1)\le 1</math>;
Obviously <math>e_1,e_2,\ldots,e_{n-2}</math> are random variables, and they are the ''only'' random choices used in the algorithm: meaning that they along with the input <math>G</math>, uniquely determine the sequence of multi-graphs <math>G_1,G_2,\ldots,G_{n-2}</math> in every iteration as well as the final output.  
then it is easy to verify that for all <math>1\le i\le n</math>,
:<math>\Pr[A_i]\le p\le\frac{1}{e(d+1)}<\frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.
Due to the general LLL, we have
:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)=\left(1-\frac{1}{d+1}\right)^n>0</math>.
This proves the symmetric LLL.


We now want to compute the probability <math>p_C</math> by decompose it into more elementary events involving <math>e_1,e_2,\ldots,e_{n-2}</math>. This is due to the following proposition.
Now we prove the general LLL by the original induction proof.
{{Theorem
|Proposition 1|
:If <math>C</math> is a minimum cut in a multi-graph <math>G</math> and <math>e\not\in C</math>, then <math>C</math> is still a minimum cut in the contracted graph <math>G'=contract(G,e)</math>.
}}
{{Proof|
{{Proof|
We first observe that contraction will never create new cuts: every cut in the contracted graph <math>G'</math> must also be a cut in the original graph <math>G</math>.
First, apply the chain rule. We have
:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]=\prod_{i=1}^n\left(1-\Pr\left[{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)</math>.


We then observe that a cut <math>C</math> in <math>G</math> "survives" in the contracted graph <math>G'</math> if and only if the contracted edge <math>e\not\in C</math>.
Next we prove by induction on <math>m</math> that for any set of <math>m</math> events <math>i_1,\ldots,i_m</math>,
:<math>\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]\le x_{i_1}</math>.
The local lemma follows immediately by the above chain rule.


Both observations are easy to verify by the definition of contraction operator (in particular, easier to verify if we take the vertex class interpretation). The detailed proofs are left as an exercise.
For <math>m=1</math>, this is obvious because
}}
:<math>\Pr[A_{i_1}]\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)\le x_{i_1}</math>.


Recall that <math>e_1,e_2,\ldots,e_{n-2}</math> denote the sequence of random edges chosen to contract in a running of ''RandomContract'' algorithm.
For general <math>m</math>, let <math>i_2,\ldots,i_k</math> be the set of vertices adjacent to <math>i_1</math> in the dependency graph, i.e. event <math>A_{i_1}</math> is mutually independent of <math>A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}}</math>.


By Proposition 1, the event <math>\mbox{``}C\mbox{ is returned by }RandomContract\mbox{''}\,</math> is equivalent to the event <math>\mbox{``}e_i\not\in C\mbox{ for all }i=1,2,\ldots,n-2\mbox{''}</math>. Therefore:
By conditional probability, we have
:<math>
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
=\frac{\Pr\left[ A_i\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
{\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
</math>.
First, we bound the numerator. Due to that <math>A_{i_1}</math> is mutually independent of <math>A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}}</math>, we have
:<math>
:<math>
\begin{align}
\begin{align}
p_C
\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
&=
&\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]\\
\Pr[\,C\mbox{ is returned by }{RandomContract}\,]\\
&=\Pr[A_{i_1}]\\
&=
&\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j).
\Pr[\,e_i\not\in C\mbox{ for all }i=1,2,\ldots,n-2\,]\\
&=
\prod_{i=1}^{n-2}\Pr[e_i\not\in C\mid \forall j<i, e_j\not\in C].
\end{align}
\end{align}
</math>
</math>
The last equation is due to the so called '''chain rule''' in probability.
:{|border="2" width="100%" cellspacing="4" cellpadding="3" rules="all" style="margin:1em 1em 1em 0; border:solid 1px #AAAAAA; border-collapse:collapse;empty-cells:show;"
|
*The '''chain rule''', also known as the '''law of progressive conditioning''', is the following proposition: for a sequence of events (not necessarily independent) <math>A_1,A_2,\ldots,A_n</math>,
::<math>\Pr[\forall i, A_i]=\prod_{i=1}^n\Pr[A_i\mid \forall j<i, A_j]</math>.
:It is a simple consequence of the definition of conditional probability. By definition of conditional probability,
::<math>\Pr[A_n\mid \forall j<n]=\frac{\Pr[\forall i, A_i]}{\Pr[\forall j<n, A_j]}</math>,
:and equivalently we have
::<math>\Pr[\forall i, A_i]=\Pr[\forall j<n, A_j]\Pr[A_n\mid \forall j<n]</math>.
:Recursively apply this to <math>\Pr[\forall j<n, A_j]</math> we obtain the chain rule.
|}
Back to the analysis of probability <math>p_C</math>.
Now our task is to give lower bound to each <math>p_i=\Pr[e_i\not\in C\mid \forall j<i, e_j\not\in C]</math>. The condition <math>\mbox{``}\forall j<i, e_j\not\in C\mbox{''}</math> means the min-cut <math>C</math> survives all first <math>i-1</math> contractions <math>e_1,e_2,\ldots,e_{i-1}</math>. Apply Proposition 1 again. The min-cut <math>C</math> survives all first <math>i-1</math> contractions means that <math>C</math> is also a min-cut in <math>G_i</math>. Recall that <math>G_{i}</math> denotes the multi-graph after first <math>i-1</math> contractions.
Then the conditional probability <math>p_i=\Pr[e_i\not\in C\mid \forall j<i, e_j\not\in C]</math> is the probability that no edges in <math>C</math> is chosen (when a uniform random edge in the current multi-graph <math>G_i</math> is chosen) assuming that <math>C</math> minimum cut in the current multi-graph <math>G_i</math>. Intuitively this probability should be bounded from below, because as a min-cut <math>C</math> should be sparse among the edge set <math>E</math>. This intuition is justified by the following proposition.
{{Theorem
|Proposition 2|
:If <math>C</math> is a min-cut in a multi-graph <math>G(V,E)</math>, then <math>|E|\ge \frac{|V||C|}{2}</math>.
}}
{{Proof|
:It must hold that the degree of each vertex <math>v\in V</math> is at least <math>|C|</math>, or otherwise the set of edges incident to <math>v</math> forms a cut of size smaller than <math>|C|</math> which separates <math>\{v\}</math> from the rest of the graph, contradicting that <math>C</math> is a min-cut. And the bound <math>|E|\ge \frac{|V||C|}{2}</math> follows directly from applying the [https://en.wikipedia.org/wiki/Handshaking_lemma handshaking lemma] to the fact that every vertex in <math>G</math> has degree at least <math>|C|</math>.
}}
Let <math>V_i</math> and <math>E_i</math> denote the vertex set and edge set of the multi-graph <math>G_i</math> respectively, and recall that <math>G_i</math> is the multi-graph obtained from applying first <math>(i-1)</math> contractions. Obviously <math>|V_{i}|=n-i+1</math>. And due to Proposition 2, <math>|E_i|\ge \frac{|V_i||C|}{2}</math> if <math>C</math> is still a min-cut in <math>G_i</math>.


The probability <math>p_i=\Pr[e_i\not\in C\mid \forall j<i, e_j\not\in C]</math> can be computed as
Next, we bound the denominator. Applying the chain rule, we have
:<math>
:<math>
\begin{align}
\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
p_i
=\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{\ell=j+1}^m\overline{A_{i_\ell}}\right]
&=1-\frac{|C|}{|E_i|}\\
</math>
&\ge1-\frac{2}{|V_i|}\\
which by the induction hypothesis, is at least
&=1-\frac{2}{n-i+1}
\end{align},</math>
where the inequality is due to Proposition 2.
 
We now can put everything together. We arbitrarily fix the input multi-graph <math>G</math> and any particular minimum cut <math>C</math> in <math>G</math>.
:<math>\begin{align}
p_{\text{correct}}
&=\Pr[\,\text{a minimum cut is returned by }RandomContract\,]\\
&\ge
\Pr[\,C\mbox{ is returned by }{RandomContract}\,]\\
&=
\Pr[\,e_i\not\in C\mbox{ for all }i=1,2,\ldots,n-2\,]\\
&=
\prod_{i=1}^{n-2}\Pr[e_i\not\in C\mid \forall j<i, e_j\not\in C]\\
&\ge
\prod_{i=1}^{n-2}\left(1-\frac{2}{n-i+1}\right)\\
&=
\prod_{k=3}^{n}\frac{k-2}{k}\\
&= \frac{2}{n(n-1)}.
\end{align}</math>
 
This gives us the following theorem.
{{Theorem
|Theorem|
: For any multigraph with <math>n</math> vertices, the ''RandomContract'' algorithm returns a minimum cut with probability at least <math>\frac{2}{n(n-1)}</math>.
}}
At first glance this seems to be a miserable chance of success. However, notice that there may be exponential many cuts in a graph (because potentially every nonempty subset <math>S\subset V</math> corresponds to a cut <math>C=E(S,\overline{S})</math>), and Karger's algorithm effectively reduce this exponential-sized space of feasible solutions to a quadratic size one, an exponential improvement!
 
We can run ''RandomContract'' independently for <math>t=\frac{n(n-1)\ln n}{2}</math> times and return the smallest cut ever returned. The probability that a minimum cut is found is at least:
 
:<math>\begin{align}
&\quad 1-\Pr[\,\mbox{all }t\mbox{ independent runnings of } RandomContract\mbox{ fails to find a min-cut}\,] \\
&= 1-\Pr[\,\mbox{a single running of }{RandomContract}\mbox{ fails}\,]^{t} \\
&\ge 1- \left(1-\frac{2}{n(n-1)}\right)^{\frac{n(n-1)\ln n}{2}} \\
&\ge 1-\frac{1}{n}.
\end{align}</math>
 
Recall that a running of ''RandomContract'' algorithm takes <math>O(n^2)</math> time. Altogether this gives us a randomized algorithm running in time <math>O(n^4\log n)</math> and find a minimum cut [https://en.wikipedia.org/wiki/With_high_probability '''with high probability'''].
 
== A Corollary by the Probabilistic Method ==
Karger's algorithm and its analysis implies the following combinatorial theorem regarding the number of distinct minimum cuts in a graph.
{{Theorem|Corollary|
:For any graph <math>G(V,E)</math> of <math>n</math> vertices, the number of distinct minimum cuts in <math>G</math> is at most <math>\frac{n(n-2)}{2}</math>.
}}
{{Proof|
For each minimum cut <math>C</math> in <math>G</math>, we define <math>\mathcal{E}_C</math> to be the event that ''RandomContract'' returns <math>C</math>. Due to the analysis of RandomContract, <math>\Pr[\mathcal{E}_C]\ge \frac{2}{n(n-1)}</math>. The events <math>\mathcal{E}_C</math> are mutually disjoint for distinct <math>C</math> and the event that ''RandomContract'' returns a min-cut is the disjoint union of <math>\mathcal{E}_C</math> over all min-cut <math>C</math>. Therefore,
:<math>
:<math>
\begin{align}
\prod_{j=2}^k(1-x_{i_j})=\prod_{\{i_1,i_j\}\in E}(1-x_j)
&\Pr[\mbox{ RandomContract returns a min-cut}]\\
=
&\sum_{\mbox{min-cut }C\mbox{ in }G}\Pr[\mathcal{E}_C]\\
\ge
&\sum_{\mbox{min-cut }C\mbox{ in }G}\frac{2}{n(n-1)},
\end{align}
</math>
</math>
which must be no greater than 1 for a well-defined probability space. This means the total number of min-cut in <math>G</math> must be no greater than <math>\frac{n(n-1)}{2}</math>.
where <math>E</math> is the set of edges in the dependency graph.
}}
Note that the statement of this theorem has no randomness at all, however the proof involves a randomized algorithm. This is an example of [http://en.wikipedia.org/wiki/Probabilistic_method the probabilistic method].
 
== Fast Min-Cut ==
In the analysis of ''RandomContract'', we have the following observation:
* The probability of success is only getting worse when the graph becomes small.
This motivates us to consider the following alternation to the algorithm: first using random contractions to reduce the number of vertices to a moderately small number, and then recursively finding a min-cut in this smaller instance. This seems just a restatement of exactly what we have been doing. Inspired by the idea of boosting the accuracy via independent repetition, here we apply the recursion on ''two'' smaller instances generated independently.
 
The algorithm obtained in this way is called ''FastCut''. We first define a procedure to randomly contract edges until there are <math>t</math> number of vertices left.
 
{{Theorem|''RandomContract''<math>(G, t)</math>|
:while <math>|V|>t</math> do
:* choose an edge <math>uv\in E</math> uniformly at random;
:* <math>G=contract(G,uv)</math>;
:return <math>G</math>;
}}
 
The ''FastCut'' algorithm is recursively defined as follows.
{{Theorem|''FastCut''<math>(G)</math>|
:if <math>|V|\le 6</math> then return a mincut by brute force;
:else let <math>t=\left\lceil1+|V|/\sqrt{2}\right\rceil</math>;
:: <math>G_1=RandomContract(G,t)</math>;
:: <math>G_2=RandomContract(G,t)</math>;
::return the smaller one of <math>FastCut(G_1)</math> and <math>FastCut(G_2)</math>;
}}
 
As before, all <math>G</math> are multigraphs.


Let <math>C</math> be a min-cut in the original multigraph <math>G</math>. By the same analysis as in the case of ''RandomContract'', we have
Altogether, we prove the induction hypothesis
:<math>
:<math>
\begin{align}
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
&\Pr[C\text{ survives all contractions in }RandomContract(G,t)]\\
\le\frac{x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)}{\prod_{\{i_1,i_j\}\in E}(1-x_j)}\le x_{i_1}.
=
&\prod_{i=1}^{n-t}\Pr[C\text{ survives the }i\text{-th contraction}\mid C\text{ survives the first }(i-1)\text{-th contractions}]\\
\ge
&\prod_{i=1}^{n-t}\left(1-\frac{2}{n-i+1}\right)\\
=
&\prod_{k=t+1}^{n}\frac{k-2}{k}\\
=
&\frac{t(t-1)}{n(n-1)}.
\end{align}
</math>
</math>
When <math>t=\left\lceil1+n/\sqrt{2}\right\rceil</math>, this probability is at least <math>1/2</math>.


We use <math>p(n)</math> to denote the probability that <math>C</math> is returned by <math>FastCut(G)</math>, where <math>G</math> is a multigraph of <math>n</math> vertices. We then have the following recursion for <math>p(n)</math>.
Due to the chain rule, it holds that
:<math>
:<math>
\begin{align}
\begin{align}
p(n)
\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]
&=
&=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\\
\Pr[C\text{ is returned by }\textit{FastCut}(G)]\\
&=\prod_{i=1}^n\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)\\
&=
&\ge\prod_{i=1}^n\left(1-x_i\right).
1-\left(1-\Pr[C\text{ survives in }G_1\wedge C=\textit{FastCut}(G_1)]\right)^2\\
&=
1-\left(1-\Pr[C\text{ survives in }G_1]\Pr[C=\textit{FastCut}(G_1)\mid C\text{ survives in }G_1]\right)^2\\
&\ge
1-\left(1-\frac{1}{2}p\left(\left\lceil1+n/\sqrt{2}\right\rceil\right)\right)^2,
\end{align}
\end{align}
</math>
</math>
where the last inequality is due to the fact that <math>\Pr[C\text{ survives all contractions in }RandomContract(G,t)]\ge1/2</math> and our previous discussions in the analysis of ''RandomContract'' that if the min-cut <math>C</math> survives all first <math>(n-t)</math> contractions then <math>C</math> must be a min-cut in the remaining multigraph.
}}
 
The base case is that  <math>p(n)=1</math> for <math>n\le 6</math>. Solving this recursion of <math>p(n)</math> (or proving by induction) gives us that
:<math>
p(n)=\Omega\left(\frac{1}{\log n}\right).
</math>
 
Recall that we can implement an edge contraction in <math>O(n)</math> time, thus it is easy to verify the following recursion of time complexity:
:<math>
T(n)=2T\left(\left\lceil1+n/\sqrt{2}\right\rceil\right)+O(n^2),
</math>
where <math>T(n)</math> denotes the running time of <math>FastCut(G)</math> on a multigraph <math>G</math> of <math>n</math> vertices.
 
Solving the recursion of <math>T(n)</math> with the base case <math>T(n)=O(1)</math> for <math>n\le 6</math>, we have <math>T(n)=O(n^2\log n)</math>.
 
Therefore, for a multigraph <math>G</math> of <math>n</math> vertices, the algorithm <math>FastCut(G)</math> returns a min-cut in <math>G</math> with probability <math>\Omega\left(\frac{1}{\log n}\right)</math> in time <math>O(n^2\log n)</math>. Repeat this independently for <math>O(log n)</math> times, we have an algorithm which runs in time <math>O(n^2\log^2n)</math> and returns a min-cut with probability <math>1-O(1/n)</math>, a high probability.

Latest revision as of 09:48, 3 October 2016

Given a sequence of events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math], we use the dependency graph to describe the dependencies between these events.

Definition (dependency graph)
Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a sequence of events. A graph [math]\displaystyle{ D=(V,E) }[/math] on the set of vertices [math]\displaystyle{ V=\{1,2,\ldots,n\} }[/math] is called a dependency graph for the events [math]\displaystyle{ A_1,\ldots,A_n }[/math] if for each [math]\displaystyle{ i }[/math], [math]\displaystyle{ 1\le i\le n }[/math], the event [math]\displaystyle{ A_i }[/math] is mutually independent of all the events [math]\displaystyle{ \{A_j\mid (i,j)\not\in E\} }[/math].

The notion of mutual independence between an event and a set of events is formally defined as follows.

Definition (mutual independence)
An event [math]\displaystyle{ A }[/math] is said to be mutually independent of events [math]\displaystyle{ B_1,B_2,\ldots, B_k }[/math], if for any disjoint [math]\displaystyle{ I^+,I^-\subseteq\{1,2,\ldots,k\} }[/math], it holds that
[math]\displaystyle{ \Pr\left[A \mid \left(\bigwedge_{i\in I^+}B_i\right) \wedge \left(\bigwedge_{i\in I^-}\overline{B_i}\right)\right]=\Pr[A] }[/math].
Example
Let [math]\displaystyle{ X_1,X_2,\ldots,X_m }[/math] be a set of mutually independent random variables. Each event [math]\displaystyle{ A_i }[/math] is a predicate defined on a number of variables among [math]\displaystyle{ X_1,X_2,\ldots,X_m }[/math]. Let [math]\displaystyle{ v(A_i) }[/math] be the unique smallest set of variables which determine [math]\displaystyle{ A_i }[/math]. The dependency graph [math]\displaystyle{ D=(V,E) }[/math] is defined by
[math]\displaystyle{ (i,j)\in E }[/math] iff [math]\displaystyle{ v(A_i)\cap v(A_j)\neq \emptyset }[/math].

The following lemma, known as the Lovász local lemma, first proved by Erdős and Lovász in 1975, is an extremely powerful tool, as it supplies a way for dealing with rare events.

Lovász Local Lemma (symmetric case)
Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a set of events, and assume that there is a [math]\displaystyle{ p\in[0,1) }[/math] such that the followings are satisfied:
  1. for all [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ \Pr[A_i]\le p }[/math];
  2. the maximum degree of the dependency graph for the events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] is [math]\displaystyle{ d }[/math], and
[math]\displaystyle{ \mathrm{e}p\cdot (d+1)\le 1 }[/math].
Then
[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\gt 0 }[/math].

We will prove a general version of the local lemma, where the events [math]\displaystyle{ A_i }[/math] are not symmetric. This generalization is due to Spencer.

Lovász Local Lemma (general case)
Let [math]\displaystyle{ D=(V,E) }[/math] be the dependency graph of events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math]. Suppose there exist real numbers [math]\displaystyle{ x_1,x_2,\ldots, x_n }[/math] such that [math]\displaystyle{ 0\le x_i\lt 1 }[/math] and for all [math]\displaystyle{ 1\le i\le n }[/math],
[math]\displaystyle{ \Pr[A_i]\le x_i\prod_{(i,j)\in E}(1-x_j) }[/math].
Then
[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i) }[/math].

To see that the general LLL implies symmetric LLL, we set [math]\displaystyle{ x_i=\frac{1}{d+1} }[/math] for all [math]\displaystyle{ i=1,2,\ldots,n }[/math]. Then we have [math]\displaystyle{ \left(1-\frac{1}{d+1}\right)^d\gt \frac{1}{\mathrm{e}} }[/math].

Assume the condition in the symmetric LLL:

  1. for all [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ \Pr[A_i]\le p }[/math];
  2. [math]\displaystyle{ \mathrm{e}p\cdot(d+1)\le 1 }[/math];

then it is easy to verify that for all [math]\displaystyle{ 1\le i\le n }[/math],

[math]\displaystyle{ \Pr[A_i]\le p\le\frac{1}{e(d+1)}\lt \frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le x_i\prod_{(i,j)\in E}(1-x_j) }[/math].

Due to the general LLL, we have

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)=\left(1-\frac{1}{d+1}\right)^n\gt 0 }[/math].

This proves the symmetric LLL.

Now we prove the general LLL by the original induction proof.

Proof.

First, apply the chain rule. We have

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]=\prod_{i=1}^n\left(1-\Pr\left[{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right) }[/math].

Next we prove by induction on [math]\displaystyle{ m }[/math] that for any set of [math]\displaystyle{ m }[/math] events [math]\displaystyle{ i_1,\ldots,i_m }[/math],

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]\le x_{i_1} }[/math].

The local lemma follows immediately by the above chain rule.

For [math]\displaystyle{ m=1 }[/math], this is obvious because

[math]\displaystyle{ \Pr[A_{i_1}]\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)\le x_{i_1} }[/math].

For general [math]\displaystyle{ m }[/math], let [math]\displaystyle{ i_2,\ldots,i_k }[/math] be the set of vertices adjacent to [math]\displaystyle{ i_1 }[/math] in the dependency graph, i.e. event [math]\displaystyle{ A_{i_1} }[/math] is mutually independent of [math]\displaystyle{ A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}} }[/math].

By conditional probability, we have

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right] =\frac{\Pr\left[ A_i\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]} {\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]} }[/math].

First, we bound the numerator. Due to that [math]\displaystyle{ A_{i_1} }[/math] is mutually independent of [math]\displaystyle{ A_{i_{k+1}},A_{i_{k+2}},\ldots, A_{i_{m}} }[/math], we have

[math]\displaystyle{ \begin{align} \Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right] &\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]\\ &=\Pr[A_{i_1}]\\ &\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j). \end{align} }[/math]

Next, we bound the denominator. Applying the chain rule, we have

[math]\displaystyle{ \Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right] =\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{\ell=j+1}^m\overline{A_{i_\ell}}\right] }[/math]

which by the induction hypothesis, is at least

[math]\displaystyle{ \prod_{j=2}^k(1-x_{i_j})=\prod_{\{i_1,i_j\}\in E}(1-x_j) }[/math]

where [math]\displaystyle{ E }[/math] is the set of edges in the dependency graph.

Altogether, we prove the induction hypothesis

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right] \le\frac{x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)}{\prod_{\{i_1,i_j\}\in E}(1-x_j)}\le x_{i_1}. }[/math]

Due to the chain rule, it holds that

[math]\displaystyle{ \begin{align} \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right] &=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\\ &=\prod_{i=1}^n\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)\\ &\ge\prod_{i=1}^n\left(1-x_i\right). \end{align} }[/math]
[math]\displaystyle{ \square }[/math]