Randomized Algorithms (Fall 2015)/Lovász Local Lemma and Advanced Algorithms (Fall 2016)/Min-Cut and Max-Cut: Difference between pages

= Lovász Local Lemma=
<font color=red size=3> under construction</font>[[File:Under_construction.png‎|30px]]
Suppose that we are given a set of "bad" events <math>A_1,A_2,\ldots,A_m</math>. We want to know whether it is possible that none of them occurs, that is:
:<math>
\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]>0.
</math>
Obviously, a ''necessary'' condition for this is that none of the bad events occurs with certainty, i.e. <math>\Pr[A_i]<1</math> for all <math>i</math>. We are interested in ''sufficient'' conditions for the above. There are two easy cases:
;Case 1<nowiki>: mutual independence.</nowiki>
If all the bad events <math>A_1,A_2,\ldots,A_m</math> are mutually independent, then
:<math>
\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]=\prod_{i=1}^m(1-\Pr[A_i])
</math>
and hence this probability is positive if <math>\Pr[A_i]<1</math> for all <math>i</math>.


;Case 2<nowiki>: arbitrary dependency.</nowiki>
On the other extreme, if we know nothing about the dependencies between these bad events, the best we can do is to apply the union bound:
:<math>
\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]\ge 1-\sum_{i=1}^m\Pr\left[A_i\right],
</math>
which is positive if <math>\sum_{i=1}^m\Pr\left[A_i\right]<1</math>. This is a very loose bound; however, it cannot be further improved if no further information regarding the dependencies between the events is assumed.


== Lovász Local Lemma (symmetric case) ==
In most situations, the dependencies between events are somewhere between these two extremal cases: the events are not independent of each other, but on the other hand the dependencies between them are not totally out of control. For these more general cases, we would like to exploit the tradeoff between the probabilities of bad events and the dependencies between them.


The Lovász local lemma is such a powerful tool for showing the possibility of rare events under ''limited dependencies''. The structure of dependencies between a set of events is described by a '''dependency graph''': a graph with the events as vertices, in which each event is adjacent to the events that are dependent with it.


{{Theorem
|Definition (dependency graph)|
:Let <math>A_1,A_2,\ldots,A_m</math> be a set of events. A graph <math>D=(V,E)</math> with set of vertices <math>V=\{A_1,A_2,\ldots,A_m\}</math> is called a '''dependency graph''' for the events <math>A_1,\ldots,A_m</math> if every event <math>A_i</math> is mutually independent of all the events in <math>\{A_j\mid (A_i,A_j)\not\in E\}</math>.
}}
The maximum degree <math>d</math> of the dependency graph <math>D</math> is a very useful piece of information, as it bounds the number of other events with which any event <math>A_i</math> among <math>A_1,A_2,\ldots,A_m</math> can be dependent.
;Remark on the mutual independence
:In probability theory, an event <math>A</math> is said to be independent of events <math>B_1,B_2,\ldots,B_k</math> if for any ''disjoint'' <math>I,J\subseteq\{1,2,\ldots,k\}</math>, we have
:::<math>\Pr\left[A\mid \left(\bigwedge_{i\in I}B_i \right)\wedge \left(\bigwedge_{i\in J}\overline{B}_i\right) \right]=\Pr[A]</math>,
:that is, occurrences of events among <math>B_1,B_2,\ldots,B_k</math> have no influence on the occurrence of <math>A</math>.
;Example
:Let <math>X_1,X_2,\ldots,X_n</math> be a set of ''mutually independent'' random variables. Each event <math>A_i</math> is a predicate defined on a number of variables among <math>X_1,X_2,\ldots,X_n</math>. Let <math>\mathsf{vbl}(A_i)</math> be the unique smallest set of variables which determine <math>A_i</math>. The dependency graph <math>D=(V,E)</math> is defined such that any two events <math>A_i,A_j</math> are adjacent in <math>D</math> if and only if they share variables, i.e. <math>\mathsf{vbl}(A_i)\cap\mathsf{vbl}(A_j)\neq\emptyset</math>.
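To make the example concrete, here is a small Python sketch (illustrative only; the event and variable names are made up) that builds this dependency graph from the sets <math>\mathsf{vbl}(A_i)</math> and reads off its maximum degree <math>d</math>.
<pre>
from itertools import combinations

def dependency_graph(vbl):
    """Build the dependency graph of events from their variable sets.

    `vbl` maps each event name to the set of variables it is determined by;
    two events are adjacent iff their variable sets intersect.
    """
    adj = {A: set() for A in vbl}
    for A, B in combinations(vbl, 2):
        if vbl[A] & vbl[B]:          # shared variables => dependent
            adj[A].add(B)
            adj[B].add(A)
    return adj

# toy example: three events over four mutually independent variables
vbl = {"A1": {"x1", "x2"}, "A2": {"x2", "x3"}, "A3": {"x4"}}
adj = dependency_graph(vbl)
d = max(len(N) for N in adj.values())   # maximum degree of the dependency graph (1 here)
print(adj, d)
</pre>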


The following theorem was proved by Erdős and Lovász in 1975 and later improved by Lovász in 1977. It is now commonly referred to as the '''Lovász local lemma'''. It is a very powerful tool, especially when used with the probabilistic method, as it supplies a way of dealing with rare events.


{{Theorem
|Lovász Local Lemma (symmetric case)|
:Let <math>A_1,A_2,\ldots,A_m</math> be a set of events, and assume that the following hold:
#<math>\Pr[A_i]\le p</math> for every event <math>A_i</math>;
#every event <math>A_i</math> is mutually independent of all other events except at most <math>d</math> of them, and
:::<math>\mathrm{e}p(d+1)\le 1</math>.
:Then
::<math>\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]>0</math>.
}}
Here <math>d</math> is the maximum degree of the dependency graph <math>D</math> for the events <math>A_1,\ldots,A_m</math>.


Intuitively, the Lovász Local Lemma says that if a rare (but hopefully possible) event is formulated as avoiding a series of bad events simultaneously, then the rare event is indeed possible if:
* none of these bad events is too probable;
* none of these bad events is dependent with too many other bad events.
Here the tradeoff between "too probable" and "too many" is characterized by the <math>\mathrm{e}p(d+1)\le 1</math> condition.


According to a result of Shearer in 1985, the condition of the Lovász Local Lemma cannot be substantially improved if only the bounds on <math>p</math> and <math>d</math> are known.


==Lovász Local Lemma (asymmetric case)==
Sometimes when applying the local lemma, a few bad events are much more probable than others or are dependent with more other bad events. In this case, using the same upper bound <math>p</math> on the probability of all bad events or the same <math>d</math> on the number of dependent events would be too wasteful. To deal with such general cases more accurately, we need a more refined way to characterize the tradeoff between local dependencies and probabilities of bad events.


We now introduce a few notations that will be used frequently in what follows.
Let <math>\mathcal{A}=\{A_1,A_2,\ldots,A_m\}</math> be a set of events. For every event <math>A_i\in\mathcal{A}</math>, we define its neighborhood and inclusive neighborhood as follows:
*'''inclusive neighborhood''': <math>\Gamma^+(A_i)\,</math> denotes the set of events in <math>\mathcal{A}</math>, including <math>A_i</math> itself, that are dependent with <math>A_i</math>. More precisely, <math>A_i</math> is mutually independent of all events in <math>\mathcal{A}\setminus\Gamma^+(A_i)</math>.
*'''neighborhood''': <math>\Gamma(A_i)=\Gamma^+(A_i)\setminus \{A_i\}</math>, that is, <math>\Gamma(A_i)</math> contains the events in <math>\mathcal{A}</math> that are dependent with <math>A_i</math>, not including <math>A_i</math> itself.


The following is the asymmetric version of the Lovász Local Lemma. This generalization is due to Spencer.


{{Theorem
|Lovász Local Lemma (general case)|
:Let <math>\mathcal{A}=\{A_1,A_2,\ldots,A_m\}</math> be a set of events, where every event <math>A_i\in\mathcal{A}</math> is mutually independent of all other events except those in its neighborhood <math>\Gamma(A_i)\,</math> in the dependency graph. Suppose there exist real numbers <math>\alpha_1,\alpha_2,\ldots, \alpha_m\in[0,1)</math> such that for every <math>A_i\in\mathcal{A}</math>,
::<math>\Pr[A_i]\le \alpha_i\prod_{A_j\in\Gamma(A_i)}(1-\alpha_j)</math>.
:Then
::<math>\Pr\left[\bigwedge_{A_i\in\mathcal{A}}\overline{A_i}\right]\ge\prod_{i=1}^m(1-\alpha_i)</math>.
}}
This generalized version of the local lemma immediately implies the symmetric version of the lemma. Namely, <math>\Pr\left[\bigwedge_{i}\overline{A_i}\right]>0</math> if the following are satisfied:
# <math>\Pr[A_i]\le p</math> for all <math>A_i\in\mathcal{A}</math>;
# <math>\mathrm{e}p(d+1)\le 1</math>, where <math>d=\max_{A_i\in\mathcal{A}}|\Gamma(A_i)|</math> is the maximum degree of the dependency graph.
To see this, for every <math>A_i\in\mathcal{A} </math> let <math>\alpha_i=\frac{1}{d+1}</math>. Note that <math>\prod_{A_j\in\Gamma(A_i)}(1-\alpha_j)\ge \left(1-\frac{1}{d+1}\right)^d>\frac{1}{\mathrm{e}}</math>.


With the above two conditions satisfied, for all <math>A_i\in\mathcal{A}</math>, it is easy to verify:
:<math>\Pr[A_i]\le p\le\frac{1}{\mathrm{e}(d+1)}<\frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le \alpha_i\prod_{A_j\in\Gamma(A_i)}(1-\alpha_j)</math>,
which, according to the Lovász Local Lemma (general case), implies that
:<math>\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]\ge\prod_{i=1}^m(1-\alpha_i)=\left(1-\frac{1}{d+1}\right)^m>0</math>.
This gives the symmetric version of the local lemma.


== A non-constructive proof of LLL ==
We then give the proof of the generalized Lovász Local Lemma. In particular, this proof is '''''non-constructive''''', in contrast to the '''''constructive''''' proofs that we are going to introduce later, which are basically algorithms.


Apply the chain rule. The probability that none of the bad events occurs can be expressed as:
:<math>
\begin{align}
\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]
=\prod_{i=1}^m\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]
=\prod_{i=1}^m\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right).
\end{align}
</math>


It is then sufficient to show that
:<math>\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\le\alpha_i</math>,
which will prove the lemma.


We then prove a slightly more general statement:
:(induction hypothesis) <math>\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^\ell\overline{A_{i_j}}\right]\le \alpha_{i_1}</math> for any distinct events <math>A_{i_1},A_{i_2},\ldots,A_{i_\ell}\in\mathcal{A}</math>.


The proof is by induction on <math>\ell</math>. For <math>\ell=1</math>, due to the assumption in the Lovász Local Lemma,
:<math>\Pr[A_{i_1}]\le\alpha_{i_1}\prod_{A_j\in\Gamma(A_{i_1})}(1-\alpha_{j})\le\alpha_{i_1}</math>.


For general <math>\ell</math>, assume the hypothesis is true for all smaller <math>\ell</math>.
Without loss of generality, assume that <math>A_{i_2},\ldots,A_{i_k}</math> are the events among <math>A_{i_2},\ldots,A_{i_\ell}</math> that are dependent with <math>A_{i_1}</math>, and <math>A_{i_1}</math> is mutually independent of the rest <math>A_{i_{k+1}},\ldots,A_{i_\ell}</math>.
Then applying the following basic conditional probability identity
:<math>\Pr[A\mid BC]=\frac{\Pr[AB\mid C]}{\Pr[B\mid C]}</math>,
we have
:<math>
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^\ell\overline{A_{i_j}}\right]
=\frac{\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^\ell\overline{A_{i_j}}\right]}
{\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^\ell\overline{A_{i_j}}\right]}
=\frac{\text{Numerator}}{\text{Denominator}}.
</math>
Due to the mutual independence between <math>A_{i_1}</math> and <math>A_{i_{k+1}},\ldots,A_{i_\ell}</math>, the <math>\text{Numerator}</math> becomes
:<math>
\text{Numerator}
\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^\ell\overline{A_{i_j}}\right]=\Pr[A_{i_1}],
</math>
which according to the assumption in the Lovász Local Lemma, is bounded as
:<math>
\text{Numerator}\le \alpha_{i_1}\prod_{A_j\in \Gamma(A_{i_1})}(1-\alpha_j).
</math>
Applying the chain rule to the <math>\text{Denominator}</math> we have
:<math>
\text{Denominator}=\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{r=j+1}^\ell\overline{A_{i_r}}\right].
</math>
Note that there are always less than <math>\ell</math> events involved, so we can apply the induction hypothesis and have
:<math>
\text{Denominator}\ge \prod_{j=2}^k(1-\alpha_{i_j})\ge \prod_{A_j\in\Gamma(A_{i_1})}(1-\alpha_j),
</math>
where the last inequality is due to the fact that <math>A_{i_2},\ldots,A_{i_k}\in\Gamma(A_{i_1})</math>.
 
Combining everything together, we have
:<math>
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^\ell\overline{A_{i_j}}\right]
\le\alpha_{i_1}.
</math>
As we argued in the beginning, this proves the general Lovász Local Lemma.
 
=Random Search for Exact-<math>k</math>-SAT=
We start by giving the definition of <math>k</math>-CNF and <math>k</math>-SAT.
{{Theorem|Definition (exact-<math>k</math>-CNF)|
:A logic expression <math>\phi</math> defined on <math>n</math> Boolean variables <math>x_1,x_2,\ldots,x_n\in\{\mathrm{true},\mathrm{false}\}</math> is said to be in '''conjunctive normal form (CNF)''' if <math>\phi</math> can be written as a conjunction (AND) of '''clauses''' as <math>\phi=C_1\wedge C_2\wedge\cdots\wedge C_m</math>, where each clause <math>C_i=\ell_{i_1}\vee \ell_{i_2}\vee\cdots\vee\ell_{i_k}</math> is a disjunction (OR) of '''literals''', and every literal <math>\ell_j</math> is either a variable <math>x_i</math> or the negation <math>\neg x_i</math> of a variable.
:*We call a CNF formula an '''exact-<math>k</math>-CNF''' if every clause consists of ''exactly'' <math>k</math> ''distinct'' literals.
}}
For example:
:<math>
(x_1\vee \neg x_2 \vee \neg x_3)\wedge (\neg x_1\vee \neg x_3\vee x_4)\wedge (x_1\vee x_2\vee x_4)\wedge (x_2\vee x_3\vee \neg x_4)
</math>
is an exact-<math>3</math>-CNF formula.
 
;Remark
:The notion of exact-<math>k</math>-CNF is slightly more restrictive than the <math>k</math>-CNF, where each clause consists of ''at most'' <math>k</math> variables. The discussion of the subtle differences between these two definitions can be found [https://en.wikipedia.org/wiki/Boolean_satisfiability_problem#3-satisfiability here].
 
A logic expression <math>\phi</math> is said to be '''satisfiable''' if there is an assignment of values true or false to the variables <math>\boldsymbol{x}=(x_1,x_2,\ldots,x_n)</math> so that <math>\phi(\boldsymbol{x})</math> is true. For a CNF <math>\phi</math>, this means that there is a truth assignment that satisfies all clauses in <math>\phi</math> simultaneously.
 
The '''exact-<math>k</math>-satisfiability (exact-<math>k</math>-SAT)''' problem is the following: given as input an exact-<math>k</math>-CNF formula <math>\phi</math>, decide whether <math>\phi</math> is satisfiable.
{{Theorem|exact-<math>k</math>-SAT|
:'''Input:''' an exact-<math>k</math>-CNF formula <math>\phi</math>.
:'''Output:''' whether <math>\phi</math> is satisfiable.
}}
It is well known that <math>k</math>-SAT is '''NP-complete''' for any <math>k\ge 3</math>. The same also holds for exact-<math>k</math>-SAT.


== Satisfiability of exact-<math>k</math>-CNF==
Inspired by the Lovász local lemma, we now consider the dependencies between clauses in a CNF formula.

Given a CNF formula <math>\phi</math> defined over Boolean variables <math>\mathcal{X}=\{x_1,x_2,\ldots,x_n\}</math> and a clause <math>C</math> in <math>\phi</math>, we use <math>\mathsf{vbl}(C)\subseteq\mathcal{X}</math> to denote the set of variables that appear in <math>C</math>.
For a clause <math>C</math> in a CNF formula <math>\phi</math>, its '''degree''' <math>d(C)=|\{D\neq C\mid \mathsf{vbl}(D)\cap\mathsf{vbl}(C)\neq\emptyset\}|</math> is the number of other clauses in <math>\phi</math> that share variables with <math>C</math>. The '''maximum degree''' <math>d</math> of a CNF formula <math>\phi</math> is <math>d=\max_{C\text{ in }\phi}d(C)</math>.

By the Lovász local lemma, we almost immediately have the following theorem for the satisfiability of exact-<math>k</math>-CNF with bounded degree.
{{Theorem|Theorem|
:Let <math>\phi</math> be an exact-<math>k</math>-CNF formula with maximum degree <math>d</math>. If <math>d\le 2^{k}/\mathrm{e}-1</math> then <math>\phi</math> is always satisfiable.
}}
{{Proof|
Let <math>X_1,X_2,\ldots,X_n</math> be Boolean random variables sampled uniformly and independently from <math>\{\text{true},\text{false}\}</math>. We are going to show that <math>\phi</math> is satisfied by this random assignment with positive probability. By the probabilistic method, this proves the existence of a satisfying assignment for <math>\phi</math>.
 
Suppose there are <math>m</math> clauses <math>C_1,C_2,\ldots,C_m</math> in <math>\phi</math>. Let <math>A_i</math> denote the bad event that <math>C_i</math> is not satisfied by the random assignment <math>X_1,X_2,\ldots,X_n</math>. Clearly, each <math>A_i</math> is dependent with at most <math>d</math> other <math>A_j</math>'s. And our goal is to show that
:<math>\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]>0</math>.
 
Recall that in an exact-<math>k</math>-CNF, every clause <math>C_i</math> consists of exactly <math>k</math> variables, and is violated by precisely one local assignment among all <math>2^k</math> possibilities. Thus the probability that <math>C_i</math> is not satisfied is <math>\Pr[A_i]=2^{-k}</math>.
 
Assuming that <math>d\le 2^{k}/\mathrm{e}-1</math>, i.e. <math>\mathrm{e}(d+1)2^{-k}\le 1</math>, by the Lovasz local lemma (symmetric case), we have
:<math>\Pr\left[\bigwedge_{i=1}^m\overline{A_i}\right]>0</math>.
}}
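As a quick illustration of the theorem (a toy check, not part of the proof), the following Python sketch computes the maximum degree <math>d</math> of a CNF given as a list of integer-literal clauses and tests the sufficient condition <math>\mathrm{e}(d+1)2^{-k}\le 1</math>. For the exact-<math>3</math>-CNF example above the condition fails, since every clause shares variables with every other one; this only means that this particular sufficient condition does not apply.
<pre>
import math
from itertools import combinations

# A CNF is a list of clauses; a clause is a tuple of non-zero integers in the
# usual DIMACS convention: literal i stands for x_i and -i for NOT x_i.
def max_degree(cnf):
    """d = max over clauses C of the number of other clauses sharing a variable with C."""
    vbl = [frozenset(abs(l) for l in clause) for clause in cnf]
    deg = [0] * len(cnf)
    for i, j in combinations(range(len(cnf)), 2):
        if vbl[i] & vbl[j]:
            deg[i] += 1
            deg[j] += 1
    return max(deg, default=0)

def lll_condition(cnf, k):
    """The symmetric-LLL sufficient condition e*(d+1)*2^(-k) <= 1 for an exact-k-CNF."""
    return math.e * (max_degree(cnf) + 1) * 2 ** (-k) <= 1

phi = [(1, -2, -3), (-1, -3, 4), (1, 2, 4), (2, 3, -4)]   # the exact-3-CNF example above
print(max_degree(phi), lll_condition(phi, k=3))           # 3 False: condition not met
</pre>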
== The random search algorithm ==
The above theorem basically says that for a CNF, if every individual clause is easy to satisfy and is dependent with few other clauses, then the CNF is always satisfiable. However, the theorem only states the existence of a satisfying solution; it does not specify a way to find this solution. Next we give a simple randomized algorithm and prove that it can find the satisfying solution efficiently under a slightly stronger assumption than that of the Lovász local lemma.
Given as input a CNF formula <math>\phi</math> defined on Boolean variables <math>\mathcal{X}=\{x_1,x_2,\ldots,x_n\}</math>, recall that for a clause <math>C</math> in a CNF <math>\phi</math>, we use <math>\mathsf{vbl}(C)\subseteq\mathcal{X}</math> to denote the set of variables on which <math>C</math> is defined.
The following algorithm is due to Moser in 2009. The algorithm consists of two components: the main function ''Solve''() and a sub-routine ''Fix''().


{{Theorem
|Solve(CNF <math>\phi</math>)|
:Pick values of <math>x_1,x_2,\ldots,x_n</math> uniformly and independently at random;
:While there is an unsatisfied clause <math>C</math> in <math>\phi</math>
:: '''Fix'''(<math>C</math>);
}}


The sub-routine '''Fix()''' is a recursive procedure:
{{Theorem
|Fix(Clause <math>C</math>)|
:Replace the values of variables in <math>\mathsf{vbl}(C)</math> with new uniform and independent random values;
:While there is an ''unsatisfied'' clause <math>D</math> (including <math>C</math> itself) such that <math>\mathsf{vbl}(C)\cap \mathsf{vbl}(D)\neq \emptyset</math>
:: '''Fix'''(<math>D</math>);
}}
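The pseudocode above translates almost literally into Python. The sketch below is only an illustration (the clause representation and helper names are our own); as discussed next, its termination is guaranteed with high probability only when the degree of <math>\phi</math> is small enough.
<pre>
import random

# phi: a list of clauses; a clause is a tuple of literals (+i for x_i, -i for NOT x_i).
def clause_satisfied(clause, x):
    return any((lit > 0) == x[abs(lit)] for lit in clause)

def solve(phi, n, seed=0):
    """Moser-style random search: resample the variables of violated clauses until
    no clause is violated.  Fast termination is guaranteed with high probability
    only under the degree condition discussed in the text (roughly d < 2^k / e).
    """
    rng = random.Random(seed)
    x = {i: rng.random() < 0.5 for i in range(1, n + 1)}   # random initial assignment

    def fix(C):
        for lit in C:                                       # resample vbl(C)
            x[abs(lit)] = rng.random() < 0.5
        while True:                                         # fix unsatisfied neighbours
            D = next((D for D in phi
                      if set(map(abs, D)) & set(map(abs, C))
                      and not clause_satisfied(D, x)), None)
            if D is None:
                return
            fix(D)

    while True:
        C = next((C for C in phi if not clause_satisfied(C, x)), None)
        if C is None:
            return x                                        # a satisfying assignment
        fix(C)

phi = [(1, -2, -3), (-1, -3, 4), (1, 2, 4), (2, 3, -4)]
print(solve(phi, n=4))
</pre>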


It is an amazing discovery that this simple algorithm works well as long as the condition of the Lovász local lemma is satisfied. Here we prove a slightly weaker statement for the convenience of analysis.


{{Theorem|Theorem|
:Let <math>\phi</math> be an exact-<math>k</math>-CNF formula with maximum degree <math>d</math>.
:There is a <math>d_0=\Theta(2^{k})</math> such that if <math>d< d_0</math> then the algorithm ''Solve''(<math>\phi</math>) finds a satisfying assignment for <math>\phi</math> in time <math>O(n+km\log m)</math> with high probability.
}}
Note that in the Lovász local lemma, the above <math>d_0</math> is <math>d_0=2^{k}/\mathrm{e}-1</math>. So this theorem achieves asymptotically the same bound as the Lovász local lemma.


The analysis is based on a technique called '''''entropy compression'''''. This is a very clever idea and may be very different from what you have seen so far about algorithm analysis.


We first give a high-level description of the ideas behind the analysis of this brilliant algorithm:
* We use <math>\mathsf{Alg}(r, \phi)</math> to abstractly denote an algorithm <math>\mathsf{Alg}</math> running on an input <math>\phi</math> with random bits <math>r\in\{0,1\}^*</math>. For an algorithm with no access to the random bits <math>r</math>, once the input <math>\phi</math> is fixed, the behavior of the algorithm as well as its output is ''deterministic''. But for ''randomized'' algorithms, the behavior of <math>\mathsf{Alg}(r, \phi)</math> is a random variable even when the input <math>\phi</math> is fixed.
* Fix an arbitrary (worst-case) input <math>\phi</math>. We try to construct a ''succinct representation'' <math>c</math> of the behavior of <math>\mathsf{Alg}(r, \phi)</math> in such a manner that the random bits <math>r</math> can always be fully recovered from this succinct representation <math>c</math>. In other words, <math>\mathsf{Alg}(r, \phi)</math> gives an encoding (a 1-1 mapping) of the random bits <math>r</math> to a succinct representation <math>c</math>.
* It is a fundamental law that random bits cannot be compressed significantly by any encoding. Therefore, if a longer running time of <math>\mathsf{Alg}(r, \phi)</math> would imply that the random bits <math>r</math> can be encoded into a succinct representation <math>c</math> which is much shorter than <math>r</math>, then the running time of the algorithm <math>\mathsf{Alg}(r, \phi)</math> cannot be too long.
:* A natural way to reach this last contradiction is to have the following situation: As the running time of <math>\mathsf{Alg}(r, \phi)</math> grows, naturally both lengths of random bits <math>r</math> and the succinct representation <math>c</math> of the behavior of <math>\mathsf{Alg}(r, \phi)</math> grow. So if the former grows much faster than the latter as the running time grows, then a large running time may cause the length of <math>r</math> significantly greater than the length of <math>c</math>.


---------
We now proceed to the analysis of <math>\text{Solve}(\phi)</math> and prove the above theorem.
Fix any exact-<math>k</math>-CNF formula <math>\phi</math> with maximum degree <math>d</math>. Let <math>T</math> denote the total number of times the function <math>\text{Fix}()</math> is called (including both the calls in <math>\text{Solve}(\phi)</math> and the recursive calls). Note that <math>T</math> is a random variable that depends solely on the random bits used by the algorithm (once the input <math>\phi</math> is fixed). We then show that <math>T=O(m\log m)</math> with high probability.


Potentially, <math>T</math> can be either finite or infinite.


----------
We consider a restrictive case.


Let <math>X_1,X_2,\ldots,X_m\in\{\mathrm{true},\mathrm{false}\}</math> be a set of ''mutually independent'' random variables which assume boolean values. Each event <math>A_i</math> is an AND of at most <math>k</math> literals (<math>X_i</math> or <math>\neg X_i</math>). Let <math>v(A_i)</math> be the set of the <math>k</math> variables that <math>A_i</math> depends on. The probability that none of the bad events occurs is
:<math>
\Pr\left[\bigwedge_{i=1}^n \overline{A_i}\right].
</math>
In this particular model, the dependency graph <math>D=(V,E)</math> is defined such that <math>(i,j)\in E</math> if and only if <math>v(A_i)\cap v(A_j)\neq \emptyset</math>.


Observe that <math>\overline{A_i}</math> is a clause (OR of literals). Thus, <math>\bigwedge_{i=1}^n \overline{A_i}</math> is a '''<math>k</math>-CNF''', i.e. a CNF in which each clause depends on <math>k</math> variables.
The probability
:<math>
\Pr\left[\bigwedge_{i=1}^n \overline{A_i}\right]>0
</math>
means that the <math>k</math>-CNF <math>\bigwedge_{i=1}^n \overline{A_i}</math> is satisfiable.
The satisfiability of <math>k</math>-CNF is a hard problem. In particular, 3SAT (the satisfiability of 3-CNF) is the first '''NP-complete''' problem (the Cook-Levin theorem). Given the current suspicion that '''NP''' is not '''P''', we do not expect to solve this problem efficiently in general.
However, the condition of the Lovász local lemma has an extra assumption on the degree of the dependency graph. In our model, this means that each clause shares variables with at most <math>d</math> other clauses. We call a <math>k</math>-CNF with this property a <math>k</math>-CNF with bounded degree <math>d</math>.


Therefore, proving the Lovász local lemma for the restricted form of events described above can be reduced to the following problem:
;Problem
:Find a condition on <math>k</math> and <math>d</math> such that any <math>k</math>-CNF with bounded degree <math>d</math> is satisfiable.


In 2009, Moser came up with the following procedure solving the problem. He later generalized the procedure to general forms of events. This not only gives a beautiful constructive proof of the Lovász local lemma, but also provides an efficient randomized algorithm for finding a satisfying assignment for a number of events with bounded dependencies.


Let <math>\phi</math> be a <math>k</math>-CNF of <math>n</math> clauses with bounded degree <math>d</math>, defined on variables <math>X_1,\ldots,X_m</math>. The following procedure finds a satisfying assignment for <math>\phi</math>.

{{Theorem
|Solve(<math>\phi</math>)|
:Pick a random assignment of <math>X_1,\ldots,X_m</math>.
:While there is an unsatisfied clause <math>C</math> in <math>\phi</math>
:: '''Fix'''(<math>C</math>).
}}


The sub-routine '''Fix''' is defined as follows:
{{Theorem
|Fix(<math>C</math>)|
:Replace the variables in <math>v(C)</math> with new random values.
:While there is an unsatisfied clause <math>D</math> such that <math>v(C)\cap v(D)\neq \emptyset</math>
:: '''Fix'''(<math>D</math>).
}}


The procedure looks very simple. It just recursively fixes the unsatisfied clauses by randomly replacing the assignment to the variables.


We then prove that it works.
===Number of top-level callings of Fix ===
In '''Solve'''(<math>\phi</math>), the subroutine '''Fix'''(<math>C</math>) is called. We now upper bound the number of times it is called (not including the recursive calls).

Assume '''Fix'''(<math>C</math>) always terminates.
:;Observation
::Every clause that was satisfied before '''Fix'''(<math>C</math>) was called will still remain satisfied, and <math>C</math> will also be satisfied, after '''Fix'''(<math>C</math>) returns.

The observation can be proved by induction on the structure of the recursion. Since there are <math>n</math> clauses, '''Solve'''(<math>\phi</math>) makes at most <math>n</math> calls to '''Fix'''.

We then prove that '''Fix'''(<math>C</math>) terminates.

=== Termination of Fix ===
The idea of the proof is to '''reconstruct''' a random string.
 
Suppose that during the running of '''Solve'''(<math>\phi</math>), the '''Fix''' subroutine is called <math>t</math> times (including all the recursive calls).

Let <math>s</math> be the sequence of the random bits used by '''Solve'''(<math>\phi</math>). It is easy to see that the length of <math>s</math> is <math>|s|=m+tk</math>, because the initial random assignment of the <math>m</math> variables takes <math>m</math> bits, and each call of '''Fix''' takes <math>k</math> bits.
 
We then reconstruct <math>s</math> in an alternative way.
 
Recall that '''Solve'''(<math>\phi</math>) calls '''Fix'''(<math>C</math>) at the top level at most <math>n</math> times. Each call of '''Fix'''(<math>C</math>) defines a recursion tree, rooted at clause <math>C</math>, in which each node corresponds to a clause (not necessarily distinct, since a clause might be fixed several times). Therefore, the entire running history of '''Solve'''(<math>\phi</math>) can be described by at most <math>n</math> recursion trees.
 
:;Observation 1
::Fix a <math>\phi</math>. The <math>n</math> recursion trees which capture the total running history of '''Solve'''(<math>\phi</math>) can be encoded in <math>n\log n+t(\log d+O(1))</math> bits.
Each root node corresponds to a clause. There are <math>n</math> clauses in <math>\phi</math>. The <math>n</math> root nodes can be represented in <math>n\log n</math> bits.
 
The clever part is how to encode the branches of the tree. Note that '''Fix'''(<math>C</math>) will call '''Fix'''(<math>D</math>) only for a <math>D</math> that shares variables with <math>C</math>. For a <math>k</math>-CNF with bounded degree <math>d</math>, each clause <math>C</math> can share variables with at most <math>d</math> other clauses. Thus, each branch in the recursion tree can be represented in <math>\log d</math> bits. An extra <math>O(1)</math> bits are needed to denote whether the recursion ends. So in total <math>n\log n+t(\log d+O(1))</math> bits are sufficient to encode all <math>n</math> recursion trees.
 
:;Observation 2
::The random sequence <math>s</math> can be encoded in <math>m+n\log n+t(\log d+O(1))</math> bits.
 
With <math>n\log n+t(\log d+O(1))</math> bits, the structure of all the recursion trees can be encoded. With extra <math>m</math> bits, the final assignment of the <math>m</math>
variables is stored.
 
We then observe that with this information, the sequence of the random bits <math>s</math> can be reconstructed backwards from the final assignment.
 
The key step is that a clause <math>C</math> is only fixed when it is unsatisfied (obvious), and an unsatisfied clause <math>C</math> determines the values of its variables uniquely (a clause is an OR of literals, and thus has exactly one unsatisfying assignment). Thus, each node in the recursion tree tells us the <math>k</math> random bits of the random sequence <math>s</math> used in the call of '''Fix''' corresponding to that node. Therefore, <math>s</math> can be reconstructed from the final assignment plus at most <math>n</math> recursion trees, which can be encoded in at most <math>m+n\log n+t(\log d+O(1))</math> bits.
 
The following theorem lies at the heart of '''Kolmogorov complexity'''. The theorem states that a random sequence is '''incompressible'''.
{{Theorem
|Theorem (Kolmogorov)|
:For any encoding scheme, with high probability, a random sequence <math>s</math> is encoded in at least <math>|s|</math> bits.
}}


Applying the theorem, we have that with high probability,
:<math>m+n\log n+t(\log d+O(1))\ge |s|=m+tk</math>.
Therefore,
:<math>
t(k-O(1)-\log d)\le n\log n.
</math>
In order to bound <math>t</math>, we need
:<math>k-O(1)-\log d>0</math>,
which holds for <math>d< 2^{k-\alpha}</math> for some constant <math>\alpha>0</math>. In fact, in this case <math>t=O(n\log n)</math>, so the running time of the procedure is bounded by a polynomial!
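The final counting step is plain arithmetic, and a toy calculator makes it concrete (purely illustrative; the constant <math>c</math> below is a stand-in for the <math>O(1)</math> term): it compares the <math>m+tk</math> random bits consumed with the <math>m+n\log n+t(\log d+O(1))</math> bits of the encoding.
<pre>
import math

def bits_used_vs_encoded(n, m, k, d, t, c=1.0):
    """Compare the random bits consumed by Solve, m + t*k, with the size of the
    succinct encoding, m + n*log2(n) + t*(log2(d) + c), after t calls of Fix.
    If k - c - log2(d) > 0, the consumed bits overtake the encoding once
    t exceeds roughly n*log2(n) / (k - c - log2(d)).
    """
    consumed = m + t * k
    encoded = m + n * math.log2(n) + t * (math.log2(d) + c)
    return consumed, encoded

# k = 8, d = 32: each extra call of Fix consumes 8 random bits but adds only about
# 6 bits to the encoding, so t much larger than n*log(n) would contradict the
# incompressibility of the random sequence.
print(bits_used_vs_encoded(n=100, m=300, k=8, d=32, t=500))
</pre>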


=== Back to the local lemma ===
We showed that for <math>d<2^{k-O(1)}</math>, any <math>k</math>-CNF with bounded degree <math>d</math> is satisfiable, and a satisfying assignment can be found in polynomial time with high probability. Now we interpret this in the language of the local lemma.
 
Recall the symmetric version of the local lemma:
{{Theorem
|Theorem (The local lemma: symmetric case)|
:Let <math>A_1,A_2,\ldots,A_n</math> be a set of events, and assume that the following hold:
:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
:#the maximum degree of the dependency graph for the events <math>A_1,A_2,\ldots,A_n</math> is <math>d</math>, and
:::<math>ep(d+1)\le 1</math>.
:Then
::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]>0</math>.
}}
Suppose the underlying probability space is a number of mutually independent uniform random boolean variables, and the events <math>\overline{A_i}</math> are clauses defined on <math>k</math> variables. Then,
:<math>
p=2^{-k},
</math>
thus, the condition <math>ep(d+1)\le 1</math> means that
:<math>
d<2^{k}/e,
</math>
which means that Moser's procedure is asymptotically optimal with respect to the degree of dependency.


= Algorithmic Lovász Local Lemma =

= Graph Cut =
Let <math>G(V, E)</math> be an undirected graph. A subset <math>C\subseteq E</math> of edges is a '''cut''' of graph <math>G</math> if <math>G</math> becomes ''disconnected'' after deleting all edges in <math>C</math>.

More formally, a pair of ''disjoint'' subsets <math>S,T\subseteq V</math> of vertices is called a '''bipartition''' of <math>V</math> if <math>S</math> and <math>T</math> are not empty and <math>S\cup T=V</math>.
Given a bipartition <math>\{S,T\}</math> of <math>V</math>, we denote by
:<math>E(S,T)=\{uv\in E\mid u\in S, v\in T\}</math>
the set of "crossing edges" with one endpoint in each of <math>S</math> and <math>T</math>.

Then every cut <math>C\subseteq E</math> in <math>G</math> corresponds to an edge set
:<math>C=E(S,T),\quad \{S,T\}\mbox{ is a bipartition of }V</math>.

Given a graph <math>G</math>, there might be many cuts in <math>G</math>, and we are interested in finding its '''minimum''' or '''maximum''' cut.

= Min-Cut =
The '''min-cut problem''', also called the '''global minimum cut problem''', is defined as follows.
{{Theorem|Min-cut problem|
*'''Input''': an undirected graph <math>G(V,E)</math>;
*'''Output''': a cut <math>C</math> in <math>G</math> with the smallest size <math>|C|</math>.
}}
Equivalently, the problem asks to find a bipartition of <math>V</math> into disjoint non-empty subsets <math>S</math> and <math>T</math> that minimizes <math>|E(S,T)|</math>.

We consider the problem in a slightly more generalized setting, where the input graphs <math>G</math> can be '''multi-graphs''', meaning that there can be multiple edges between two vertices <math>u</math> and <math>v</math>. We call such edges '''parallel edges'''. Cuts in multi-graphs are defined in the same way as before, and the cost of a cut <math>C</math> is given by the total number of edges (including parallel edges) in <math>C</math>. Equivalently, one may think of a multi-graph as a graph with integer edge weights, and the cost of a cut <math>C</math> is the total weight of all edges in <math>C</math>.

A canonical deterministic algorithm for this problem is through the [http://en.wikipedia.org/wiki/Max-flow_min-cut_theorem max-flow min-cut theorem]. The max-flow algorithm finds us a minimum '''<math>s</math>-<math>t</math> cut''', which disconnects a '''source''' <math>s\in V</math> from a '''sink''' <math>t\in V</math>, both specified as part of the input. A global min-cut can be found by exhaustively finding the minimum <math>s</math>-<math>t</math> cut for an arbitrarily fixed source <math>s</math> and all possible sinks <math>t\neq s</math>. This takes <math>(n-1)\times</math>max-flow time.
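As a minimal illustration of this reduction (not the algorithm studied below), the following Python sketch computes a global min-cut value by taking the best of the <math>n-1</math> minimum <math>s</math>-<math>t</math> cuts. It assumes the third-party networkx library for the max-flow computations and models parallel edges as integer capacities.
<pre>
import networkx as nx   # assumed available; any max-flow routine would do

def global_min_cut_value(edges):
    """Global min-cut of an undirected multigraph via (n-1) s-t min-cut computations.

    `edges` is a list of (u, v) pairs; parallel edges become integer capacities.
    """
    D = nx.DiGraph()
    for u, v in edges:
        for a, b in ((u, v), (v, u)):        # an undirected edge = two opposite arcs
            if D.has_edge(a, b):
                D[a][b]["capacity"] += 1
            else:
                D.add_edge(a, b, capacity=1)
    nodes = list(D.nodes)
    s = nodes[0]                              # arbitrarily fixed source
    return min(nx.minimum_cut_value(D, s, t) for t in nodes[1:])

print(global_min_cut_value([(1, 2), (1, 2), (2, 3), (3, 4), (4, 1)]))  # -> 2
</pre>
Any max-flow routine could be substituted here; the point is only the <math>(n-1)</math>-fold repetition over all sinks.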

The fastest known deterministic algorithm for the minimum cut problem on multi-graphs is the [https://en.wikipedia.org/wiki/Stoer–Wagner_algorithm Stoer–Wagner algorithm], which achieves an <math>O(mn+n^2\log n)</math> time complexity.

If we restrict the input to '''simple graphs''' (meaning that there are no parallel edges) with no edge weights, there are better algorithms. The [http://delivery.acm.org/10.1145/2750000/2746588/p665-kawarabayashi.pdf?ip=114.212.86.114&id=2746588&acc=ACTIVE%20SERVICE&key=BF85BBA5741FDC6E%2E180A41DAF8736F97%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&CFID=839435129&CFTOKEN=67928165&__acm__=1474187635_eafe662feeb838ca5ece2f6b56715177 most recent one] was published in STOC 2015, achieving a near-linear (in the number of edges) time complexity.

== Karger's ''Contraction'' algorithm ==
We will describe a simple and elegant randomized algorithm for the min-cut problem. The algorithm is due to [http://people.csail.mit.edu/karger/ David Karger].

Let <math>G(V, E)</math> be a multi-graph, which allows more than one parallel edge between two distinct vertices <math>u</math> and <math>v</math> but does not allow any self-loops: edges that connect a vertex to itself. A multi-graph can be represented by an adjacency matrix <math>A</math> where <math>A(u,v)</math> takes nonnegative integer values instead of just 0 or 1, representing the number of parallel edges between <math>u</math> and <math>v</math>, and <math>A(v,v)=0</math> for all <math>v</math> (since there are no self-loops).

Given a multi-graph <math>G(V,E)</math> and an edge <math>e\in E</math>, we define the following '''contraction''' operator Contract(<math>G</math>, <math>e</math>), which returns a new multi-graph.
{{Theorem|The contraction operator ''Contract''(<math>G</math>, <math>e</math>)|
:say <math>e=uv</math>:
:*merge <math>u</math> and <math>v</math> into one new vertex <math>x</math>;
:*for every edge (parallel or not) of the form <math>uw</math> or <math>vw</math> that connects one of <math>\{u,v\}</math> to a vertex <math>w\in V\setminus\{u,v\}</math>, replace it by a new edge <math>xw</math>;
:*the rest of the graph does not change.
}}
In other words, <math>Contract(G,uv)</math> merges the two vertices <math>u</math> and <math>v</math> into a new vertex <math>x</math> whose incident edges preserve the edges incident to <math>u</math> or <math>v</math> in the original graph <math>G</math>, except for the parallel edges between <math>u</math> and <math>v</math>, which are deleted.

Perhaps a better way to look at contraction is to interpret it as a union of equivalence classes of vertices. Initially every vertex is in a distinct equivalence class. Upon a call to <math>Contract(G,uv)</math>, the two equivalence classes corresponding to <math>u</math> and <math>v</math> are merged, and only those edges crossing between different equivalence classes are counted as valid edges in the graph.
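The following Python sketch (an illustration with our own representation, not reference code) implements the operator Contract(<math>G</math>, <math>e</math>) on a multigraph stored simply as a list of edges.
<pre>
def contract(edges, e):
    """The operator Contract(G, e) on a multigraph stored as a list of edges.

    The merged vertex keeps the name of one endpoint (standing for the new
    vertex x); parallel edges are preserved, and the edges between the two
    merged endpoints are dropped, since they would become self-loops.
    """
    u, v = e
    merged = []
    for a, b in edges:
        a = u if a == v else a        # redirect endpoints of v to the merged vertex
        b = u if b == v else b
        if a != b:                    # drop what would become a self-loop
            merged.append((a, b))
    return merged

# contracting the edge (1, 2); note that the parallel edge 1-2 disappears as well
print(contract([(1, 2), (1, 2), (2, 3), (3, 1)], (1, 2)))   # [(1, 3), (3, 1)]
</pre>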

{{Theorem|''RandomContract'' (Karger 1993)|
:while <math>|V|>2</math> do
:* choose an edge <math>uv\in E</math> uniformly at random;
:* <math>G=Contract(G,uv)</math>;
:return <math>C=E</math> (the parallel edges between the only two remaining vertices in <math>V</math>);
}}

A multi-graph can be maintained by appropriate data structures such that each contraction takes <math>O(n)</math> time, where <math>n</math> is the number of vertices, so the algorithm terminates in time <math>O(n^2)</math>. We leave this as an exercise.
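Putting the contraction operator to work, here is a self-contained Python sketch of ''RandomContract''. Note that this naive edge-list version spends <math>O(m)</math> time per contraction rather than the <math>O(n)</math> mentioned above, which is fine for illustration.
<pre>
import random

def random_contract(edges, seed=None):
    """Karger's RandomContract on an edge-list multigraph (assumed connected):
    repeatedly contract a uniformly random edge until two vertices remain.
    The returned list of surviving parallel edges is the cut found,
    which is a minimum cut with probability at least 2/(n(n-1)).
    """
    rng = random.Random(seed)
    vertices = {x for e in edges for x in e}
    while len(vertices) > 2:
        u, v = rng.choice(edges)                     # uniform over parallel edges too
        # Contract(G, uv): relabel v as u and drop the (parallel) u-v edges
        edges = [((u if a == v else a), (u if b == v else b))
                 for a, b in edges if {a, b} != {u, v}]
        vertices.discard(v)
    return edges

G = [(1, 2), (1, 2), (2, 3), (3, 4), (4, 1)]
print(len(random_contract(G, seed=2)))   # size of the cut found in one run
</pre>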

Analysis of accuracy

For convenience, we assume that each edge has a unique "identity" [math]\displaystyle{ e }[/math]. And when an edge [math]\displaystyle{ uv\in E }[/math] is contracted to new vertex [math]\displaystyle{ x }[/math], and each adjacent edge [math]\displaystyle{ uw }[/math] of [math]\displaystyle{ u }[/math] (or adjacent edge [math]\displaystyle{ vw }[/math] of [math]\displaystyle{ v }[/math]) is replaced by [math]\displaystyle{ xw }[/math], the identity [math]\displaystyle{ e }[/math] of the edge [math]\displaystyle{ uw }[/math] (or [math]\displaystyle{ vw }[/math]) is transfered to the new edge [math]\displaystyle{ xw }[/math] replacing it. When referring a cut [math]\displaystyle{ C }[/math], we consider [math]\displaystyle{ C }[/math] as a set of edge identities [math]\displaystyle{ e }[/math], so that a cut [math]\displaystyle{ C }[/math] is changed by the algorithm only if some of its edges are removed during contraction.

We first prove some lemma.

Lemma 1
If [math]\displaystyle{ C }[/math] is a cut in a multi-graph [math]\displaystyle{ G }[/math] and [math]\displaystyle{ e\not\in C }[/math], then [math]\displaystyle{ C }[/math] is still a cut in [math]\displaystyle{ G'=contract(G,e) }[/math].
Proof.

It is easy to verify that [math]\displaystyle{ C }[/math] is a cut in [math]\displaystyle{ G'=contract(G,e) }[/math] if none of its edges is lost during the contraction. Since [math]\displaystyle{ C }[/math] is a cut in [math]\displaystyle{ G(V,E) }[/math], there exists a nonempty vertex set [math]\displaystyle{ S\subset V }[/math] and its complement [math]\displaystyle{ \bar{S}=V\setminus S }[/math] such that [math]\displaystyle{ C=\{uv\mid u\in S, v\in\bar{S}\} }[/math]. And if [math]\displaystyle{ e\not\in C }[/math], it must hold that either [math]\displaystyle{ e\in G[S] }[/math] or [math]\displaystyle{ e\in G[\bar{S}] }[/math] where [math]\displaystyle{ G[S] }[/math] and [math]\displaystyle{ G[\bar{S}] }[/math] are the subgraphs induced by [math]\displaystyle{ S }[/math] and [math]\displaystyle{ \bar{S} }[/math] respectively. In both cases none of edges in [math]\displaystyle{ C }[/math] is removed in [math]\displaystyle{ G'=contract(G,e) }[/math].

[math]\displaystyle{ \square }[/math]
Lemma 2
The size of min-cut in [math]\displaystyle{ G'=contract(G,e) }[/math] is at least as large as the size of min-cut in [math]\displaystyle{ G }[/math], i.e. contraction never reduces the size of min-cut.
Proof.
Note that every cut in the contracted graph [math]\displaystyle{ G' }[/math] is also a cut in the original graph [math]\displaystyle{ G }[/math]. The minimum over a subset of cuts can only be larger, so the min-cut of [math]\displaystyle{ G' }[/math] is at least as large as the min-cut of [math]\displaystyle{ G }[/math].
[math]\displaystyle{ \square }[/math]
Lemma 3
If [math]\displaystyle{ C }[/math] is a min-cut in a multi-graph [math]\displaystyle{ G(V,E) }[/math], then [math]\displaystyle{ |E|\ge \frac{|V||C|}{2} }[/math].
Proof.
It must hold that the degree of each vertex [math]\displaystyle{ v\in V }[/math] is at least [math]\displaystyle{ |C| }[/math], for otherwise the set of edges incident to [math]\displaystyle{ v }[/math] forms a cut which separates [math]\displaystyle{ v }[/math] from the rest of [math]\displaystyle{ V }[/math] and has size less than [math]\displaystyle{ |C| }[/math], contradicting the assumption that [math]\displaystyle{ C }[/math] is a min-cut. The bound [math]\displaystyle{ |E|\ge \frac{|V||C|}{2} }[/math] then follows directly from the fact that every vertex in [math]\displaystyle{ G }[/math] has degree at least [math]\displaystyle{ |C| }[/math].
[math]\displaystyle{ \square }[/math]
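Written out, the last step is just the handshake identity
[math]\displaystyle{ 2|E|=\sum_{v\in V}\deg(v)\ge |V|\cdot|C|, }[/math]
which rearranges to [math]\displaystyle{ |E|\ge\frac{|V||C|}{2} }[/math].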

For a multigraph [math]\displaystyle{ G(V, E) }[/math], fix a minimum cut [math]\displaystyle{ C }[/math] (there might be more than one minimum cut); we analyze the probability that [math]\displaystyle{ C }[/math] is returned by the above algorithm.

Initially [math]\displaystyle{ |V|=n }[/math]. We say that the min-cut [math]\displaystyle{ C }[/math] "survives" a random contraction if none of the edges in [math]\displaystyle{ C }[/math] is chosen to be contracted. After [math]\displaystyle{ (i-1) }[/math] contractions, denote the current multigraph as [math]\displaystyle{ G_i(V_i, E_i) }[/math]. Suppose that [math]\displaystyle{ C }[/math] survives the first [math]\displaystyle{ (i-1) }[/math] contractions; according to Lemmas 1 and 2, [math]\displaystyle{ C }[/math] must be a minimum cut in the current multigraph [math]\displaystyle{ G_i }[/math]. Then due to Lemma 3, the current number of edges is [math]\displaystyle{ |E_i|\ge |V_i||C|/2 }[/math]. Uniformly choosing an edge [math]\displaystyle{ e\in E_i }[/math] to contract, the probability that the [math]\displaystyle{ i }[/math]-th contraction contracts an edge in [math]\displaystyle{ C }[/math] is given by:

[math]\displaystyle{ \Pr_{e\in E_i}[e\in C] = \frac{|C|}{|E_i|} \le |C|\cdot\frac{2}{|V_i||C|} = \frac{2}{|V_i|}. }[/math]

Therefore, conditioning on the event that [math]\displaystyle{ C }[/math] survives the first [math]\displaystyle{ (i-1) }[/math] contractions, the probability that [math]\displaystyle{ C }[/math] survives the [math]\displaystyle{ i }[/math]-th contraction is at least [math]\displaystyle{ 1-2/|V_i| }[/math]. Note that [math]\displaystyle{ |V_i|=n-i+1 }[/math], because each contraction decreases the number of vertices by 1.

The probability that no edge in the minimum cut [math]\displaystyle{ C }[/math] is ever contracted is:

[math]\displaystyle{ \begin{align} &\quad\,\Pr[\,C\mbox{ survives all }(n-2)\mbox{ contractions}\,]\\ &= \prod_{i=1}^{n-2}\Pr[\,C\mbox{ survives the }i\mbox{-th contraction}\mid C\mbox{ survives the first }(i-1)\mbox{ contractions}]\\ &\ge \prod_{i=1}^{n-2}\left(1-\frac{2}{|V_i|}\right) \\ &= \prod_{i=1}^{n-2}\left(1-\frac{2}{n-i+1}\right)\\ &= \prod_{k=3}^{n}\frac{k-2}{k}\\ &= \frac{2}{n(n-1)}. \end{align} }[/math]

This gives the following theorem.

Theorem
For any multigraph with [math]\displaystyle{ n }[/math] vertices, the RandomContract algorithm returns a minimum cut with probability at least [math]\displaystyle{ \frac{2}{n(n-1)} }[/math].

Run RandomContract independently [math]\displaystyle{ n(n-1)/2 }[/math] times and return the smallest of the cuts returned. The probability that a minimum cut is found is at least:

[math]\displaystyle{ \begin{align} 1-\Pr[\mbox{failed every time}] &= 1-\Pr[{RandomContract}\mbox{ fails}]^{n(n-1)/2} \\ &\ge 1- \left(1-\frac{2}{n(n-1)}\right)^{n(n-1)/2} \\ &\ge 1-\frac{1}{e}. \end{align} }[/math]

A constant probability!
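As a quick illustration (assuming the random_contract sketch given above; the driver and its name are ours), the repetition scheme is simply:

import copy, math

def min_cut_by_repetition(adj):
    """Run RandomContract n(n-1)/2 times independently and keep the smallest cut;
    by the theorem above this finds a minimum cut with probability at least 1-1/e."""
    n = len(adj)
    best_size, best_cut = math.inf, None
    for _ in range(n * (n - 1) // 2):
        size, side_a, side_b = random_contract(copy.deepcopy(adj))  # fresh copy per run
        if size < best_size:
            best_size, best_cut = size, (side_a, side_b)
    return best_size, best_cut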

A Corollary by the Probabilistic Method

Karger's algorithm and its analysis imply the following combinatorial theorem regarding the number of distinct minimum cuts in a graph.

Corollary
For any graph [math]\displaystyle{ G(V,E) }[/math] of [math]\displaystyle{ n }[/math] vertices, the number of distinct minimum cuts in [math]\displaystyle{ G }[/math] is at most [math]\displaystyle{ \frac{n(n-1)}{2} }[/math].
Proof.

For each minimum cut [math]\displaystyle{ C }[/math] in [math]\displaystyle{ G }[/math], we define [math]\displaystyle{ \mathcal{E}_C }[/math] to be the event that RandomContract returns [math]\displaystyle{ C }[/math]. Due to the analysis of RandomContract, [math]\displaystyle{ \Pr[\mathcal{E}_C]\ge \frac{2}{n(n-1)} }[/math]. The events [math]\displaystyle{ \mathcal{E}_C }[/math] are mutually disjoint for distinct [math]\displaystyle{ C }[/math], and the event that RandomContract returns a min-cut is the disjoint union of [math]\displaystyle{ \mathcal{E}_C }[/math] over all min-cuts [math]\displaystyle{ C }[/math]. Therefore,

[math]\displaystyle{ \begin{align} &\Pr[\mbox{ RandomContract returns a min-cut}]\\ = &\sum_{\mbox{min-cut }C\mbox{ in }G}\Pr[\mathcal{E}_C]\\ \ge &\sum_{\mbox{min-cut }C\mbox{ in }G}\frac{2}{n(n-1)}\\ = &\,(\mbox{number of min-cuts in }G)\cdot\frac{2}{n(n-1)}, \end{align} }[/math]

which must be no greater than 1 for a well-defined probability space. This means the total number of min-cuts in [math]\displaystyle{ G }[/math] must be no greater than [math]\displaystyle{ \frac{n(n-1)}{2} }[/math].

[math]\displaystyle{ \square }[/math]

Note that the statement of this theorem involves no randomness at all; however, the proof uses a randomized algorithm. This is an example of the probabilistic method. The bound is tight: in the [math]\displaystyle{ n }[/math]-cycle, every pair of edges forms a minimum cut, giving exactly [math]\displaystyle{ \binom{n}{2}=\frac{n(n-1)}{2} }[/math] distinct min-cuts.

Fast Min-Cut

In the analysis of RandomContract, we have the following observation:

  • The per-contraction survival probability [math]\displaystyle{ 1-2/|V_i| }[/math] only gets worse as the graph becomes smaller; most of the risk of destroying the min-cut comes from the last few contractions.

This motivates us to consider the following modification to the algorithm: first use random contractions to reduce the number of vertices to a moderately small number, and then recursively find a min-cut in this smaller instance. By itself this is just a restatement of what we have been doing. Inspired by the idea of boosting the accuracy via independent repetition, we instead apply the recursion to two smaller instances generated independently.

The algorithm obtained in this way is called FastCut. We first define a procedure which randomly contracts edges until only [math]\displaystyle{ t }[/math] vertices are left.

RandomContract[math]\displaystyle{ (G, t) }[/math]
while [math]\displaystyle{ |V|\gt t }[/math] do
  • choose an edge [math]\displaystyle{ uv\in E }[/math] uniformly at random;
  • [math]\displaystyle{ G=contract(G,uv) }[/math];
return [math]\displaystyle{ G }[/math];

The FastCut algorithm is recursively defined as follows.

FastCut[math]\displaystyle{ (G) }[/math]
if [math]\displaystyle{ |V|\le 6 }[/math] then return a mincut by brute force;
else let [math]\displaystyle{ t=\left\lceil1+|V|/\sqrt{2}\right\rceil }[/math];
[math]\displaystyle{ G_1=RandomContract(G,t) }[/math];
[math]\displaystyle{ G_2=RandomContract(G,t) }[/math];
return the smaller one of [math]\displaystyle{ FastCut(G_1) }[/math] and [math]\displaystyle{ FastCut(G_2) }[/math];

As before, all [math]\displaystyle{ G }[/math] are multigraphs.
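Here is a self-contained Python sketch of FastCut (an illustration only; all names are ours). For clarity it stores the multigraph as a plain list of parallel edges, so one contraction costs [math]\displaystyle{ O(m) }[/math] rather than the [math]\displaystyle{ O(n) }[/math] achievable with the data structure discussed earlier:

import math, random
from itertools import combinations

def contract_once(edges, u, v):
    """contract(G, uv): merge v into u and drop the parallel edges between them."""
    merged = []
    for (a, b) in edges:
        a = u if a == v else a
        b = u if b == v else b
        if a != b:                            # parallel u-v edges would become self-loops
            merged.append((a, b))
    return merged

def random_contract_to(edges, t):
    """RandomContract(G, t): contract uniformly random edges until t vertices remain."""
    vertices = {x for e in edges for x in e}
    while len(vertices) > t:
        u, v = random.choice(edges)           # uniform over the remaining parallel edges
        edges = contract_once(edges, u, v)
        vertices.discard(v)
    return edges

def brute_force_min_cut(edges):
    """Try every bipartition of the (at most 6) remaining vertices."""
    vs = sorted({x for e in edges for x in e})
    best = math.inf
    for r in range(1, len(vs)):
        for side in combinations(vs, r):
            s = set(side)
            best = min(best, sum(1 for (a, b) in edges if (a in s) != (b in s)))
    return best

def fast_cut(edges):
    """FastCut on a connected multigraph given as a list of edges; returns a cut size."""
    n = len({x for e in edges for x in e})
    if n <= 6:
        return brute_force_min_cut(edges)
    t = math.ceil(1 + n / math.sqrt(2))
    g1 = random_contract_to(list(edges), t)
    g2 = random_contract_to(list(edges), t)
    return min(fast_cut(g1), fast_cut(g2))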

Let [math]\displaystyle{ C }[/math] be a min-cut in the original multigraph [math]\displaystyle{ G }[/math]. By the same analysis as in the case of RandomContract, we have

[math]\displaystyle{ \begin{align} &\Pr[C\text{ survives all contractions in }RandomContract(G,t)]\\ = &\prod_{i=1}^{n-t}\Pr[C\text{ survives the }i\text{-th contraction}\mid C\text{ survives the first }(i-1)\text{ contractions}]\\ \ge &\prod_{i=1}^{n-t}\left(1-\frac{2}{n-i+1}\right)\\ = &\prod_{k=t+1}^{n}\frac{k-2}{k}\\ = &\frac{t(t-1)}{n(n-1)}. \end{align} }[/math]

When [math]\displaystyle{ t=\left\lceil1+n/\sqrt{2}\right\rceil }[/math], this probability is at least [math]\displaystyle{ 1/2 }[/math].
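To see this, note that [math]\displaystyle{ t\ge 1+n/\sqrt{2} }[/math] and [math]\displaystyle{ n(n-1)\le n^2 }[/math], so
[math]\displaystyle{ \frac{t(t-1)}{n(n-1)}\ge\frac{\left(1+\frac{n}{\sqrt{2}}\right)\cdot\frac{n}{\sqrt{2}}}{n(n-1)}=\frac{\frac{n^2}{2}+\frac{n}{\sqrt{2}}}{n(n-1)}\ge\frac{n^2/2}{n^2}=\frac{1}{2}. }[/math]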

We use [math]\displaystyle{ p(n) }[/math] to denote the probability that [math]\displaystyle{ FastCut(G) }[/math] returns a minimum cut of its input [math]\displaystyle{ G }[/math], where [math]\displaystyle{ G }[/math] is a multigraph of [math]\displaystyle{ n }[/math] vertices. We then have the following recursion for [math]\displaystyle{ p(n) }[/math].

[math]\displaystyle{ \begin{align} p(n) &= \Pr[\text{a min-cut of }G\text{ is returned by }\textit{FastCut}(G)]\\ &\ge \Pr[\exists i\in\{1,2\}:\ C\text{ survives in }G_i\wedge \textit{FastCut}(G_i)\text{ returns a min-cut of }G_i]\\ &= 1-\left(1-\Pr[C\text{ survives in }G_1]\Pr[\textit{FastCut}(G_1)\text{ returns a min-cut of }G_1\mid C\text{ survives in }G_1]\right)^2\\ &\ge 1-\left(1-\frac{1}{2}p\left(\left\lceil1+n/\sqrt{2}\right\rceil\right)\right)^2, \end{align} }[/math]

where the second line holds because if [math]\displaystyle{ C }[/math] survives all contractions in [math]\displaystyle{ RandomContract(G,t) }[/math] then, by our previous discussion in the analysis of RandomContract, [math]\displaystyle{ C }[/math] is still a min-cut in [math]\displaystyle{ G_i }[/math], so any min-cut of [math]\displaystyle{ G_i }[/math] returned by the recursive call is a cut of size [math]\displaystyle{ |C| }[/math] in [math]\displaystyle{ G }[/math], i.e. a min-cut of [math]\displaystyle{ G }[/math], and then so is the smaller of the two cuts returned by the two branches; the third line uses the independence of [math]\displaystyle{ G_1 }[/math] and [math]\displaystyle{ G_2 }[/math]; and the last inequality is due to the fact that [math]\displaystyle{ \Pr[C\text{ survives all contractions in }RandomContract(G,t)]\ge1/2 }[/math] and that [math]\displaystyle{ G_1 }[/math] is a multigraph of [math]\displaystyle{ t=\left\lceil1+n/\sqrt{2}\right\rceil }[/math] vertices, so the recursive call returns one of its min-cuts with probability at least [math]\displaystyle{ p\left(\left\lceil1+n/\sqrt{2}\right\rceil\right) }[/math].

The base case is that [math]\displaystyle{ p(n)=1 }[/math] for [math]\displaystyle{ n\le 6 }[/math]. Solving this recursion of [math]\displaystyle{ p(n) }[/math] (or proving by induction) gives us that

[math]\displaystyle{ p(n)=\Omega\left(\frac{1}{\log n}\right). }[/math]
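One way to verify this bound (a sketch of a standard argument, not taken verbatim from these notes): write [math]\displaystyle{ f(x)=1-\left(1-\frac{x}{2}\right)^2=x-\frac{x^2}{4} }[/math], so the recursion says [math]\displaystyle{ p(n)\ge f(p(t)) }[/math] with [math]\displaystyle{ t=\left\lceil1+n/\sqrt{2}\right\rceil }[/math]. Define [math]\displaystyle{ x_0=1 }[/math] (the base case) and [math]\displaystyle{ x_{k+1}=f(x_k) }[/math]. Since the number of vertices shrinks by a factor of about [math]\displaystyle{ \sqrt{2} }[/math] per level, the recursion depth is [math]\displaystyle{ d=O(\log n) }[/math], and since [math]\displaystyle{ f }[/math] is increasing on [math]\displaystyle{ [0,1] }[/math], we get [math]\displaystyle{ p(n)\ge x_d }[/math]. By induction [math]\displaystyle{ x_k\ge\frac{1}{k+1} }[/math], because
[math]\displaystyle{ x_{k+1}=f(x_k)\ge f\left(\frac{1}{k+1}\right)=\frac{1}{k+1}-\frac{1}{4(k+1)^2}\ge\frac{1}{k+2}, }[/math]
where the last inequality is equivalent to [math]\displaystyle{ 4(k+1)\ge k+2 }[/math]. Hence [math]\displaystyle{ p(n)\ge x_d\ge\frac{1}{d+1}=\Omega\left(\frac{1}{\log n}\right) }[/math].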

Recall that we can implement an edge contraction in [math]\displaystyle{ O(n) }[/math] time, so each call to [math]\displaystyle{ RandomContract(G,t) }[/math] takes [math]\displaystyle{ O(n^2) }[/math] time, and it is easy to verify the following recursion for the time complexity:

[math]\displaystyle{ T(n)=2T\left(\left\lceil1+n/\sqrt{2}\right\rceil\right)+O(n^2), }[/math]

where [math]\displaystyle{ T(n) }[/math] denotes the running time of [math]\displaystyle{ FastCut(G) }[/math] on a multigraph [math]\displaystyle{ G }[/math] of [math]\displaystyle{ n }[/math] vertices.

Solving the recursion of [math]\displaystyle{ T(n) }[/math] with the base case [math]\displaystyle{ T(n)=O(1) }[/math] for [math]\displaystyle{ n\le 6 }[/math], we have [math]\displaystyle{ T(n)=O(n^2\log n) }[/math].
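One way to see this: the recursion tree has depth [math]\displaystyle{ O(\log n) }[/math], since the number of vertices shrinks by a factor of about [math]\displaystyle{ \sqrt{2} }[/math] per level, and the total work per level stays [math]\displaystyle{ O(n^2) }[/math] because the number of subproblems doubles while the quadratic cost of each one halves:
[math]\displaystyle{ 2\cdot O\left(\left(\frac{n}{\sqrt{2}}\right)^2\right)=O(n^2). }[/math]
Summing [math]\displaystyle{ O(n^2) }[/math] over the [math]\displaystyle{ O(\log n) }[/math] levels gives [math]\displaystyle{ T(n)=O(n^2\log n) }[/math]; equivalently, this is the balanced case of the master theorem with [math]\displaystyle{ a=2 }[/math], [math]\displaystyle{ b=\sqrt{2} }[/math], and [math]\displaystyle{ n^{\log_b a}=n^2 }[/math].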

Therefore, for a multigraph [math]\displaystyle{ G }[/math] of [math]\displaystyle{ n }[/math] vertices, the algorithm [math]\displaystyle{ FastCut(G) }[/math] returns a min-cut in [math]\displaystyle{ G }[/math] with probability [math]\displaystyle{ \Omega\left(\frac{1}{\log n}\right) }[/math] in time [math]\displaystyle{ O(n^2\log n) }[/math]. Repeating this independently for [math]\displaystyle{ O(\log^2 n) }[/math] times, we obtain an algorithm which runs in time [math]\displaystyle{ O(n^2\log^3 n) }[/math] and returns a min-cut with probability [math]\displaystyle{ 1-O(1/n) }[/math], i.e. with high probability.
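To check the failure probability of the repeated algorithm: if a single run of FastCut succeeds with probability at least [math]\displaystyle{ c/\log n }[/math] for some constant [math]\displaystyle{ c\gt 0 }[/math], then after [math]\displaystyle{ t=\Theta(\log^2 n) }[/math] independent runs,
[math]\displaystyle{ \Pr[\text{all }t\text{ runs fail}]\le\left(1-\frac{c}{\log n}\right)^{t}\le e^{-ct/\log n}=e^{-\Theta(\log n)}=O(1/n) }[/math]
for a suitable choice of the constant in [math]\displaystyle{ t }[/math].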