# Graph Cut

Let ${\displaystyle G(V,E)}$ be an undirected graph. A subset ${\displaystyle C\subseteq E}$ of edges is a cut of graph ${\displaystyle G}$ if ${\displaystyle G}$ becomes disconnected after deleting all edges in ${\displaystyle C}$.

More restrictively, we consider the cuts which disconnect a subset ${\displaystyle S}$ of vertices from the rest of the graph.

A pair of disjoint subsets ${\displaystyle S,T\subseteq V}$ of vertices is called a bipartition of ${\displaystyle V}$ if ${\displaystyle S}$ and ${\displaystyle T}$ are not empty and ${\displaystyle S\cup T=V}$.

Given a bipartition ${\displaystyle \{S,T\}}$ of ${\displaystyle V}$, a cut ${\displaystyle C}$ is given by

${\displaystyle C=E(S,T)\,}$,

where ${\displaystyle E(S,T)}$ is defined as

${\displaystyle E(S,T)=\{uv\in E\mid u\in S,v\in T\}}$

which represents the set of "crossing edges" with one endpoint in each of ${\displaystyle S}$ and ${\displaystyle T}$.
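The definition of $E(S,T)$ translates directly into code. The following is a minimal Python sketch; the function name `crossing_edges` and the edge-list representation are our own choices for illustration:

```python
def crossing_edges(edges, S, T):
    """Return E(S, T): the edges with one endpoint in S and the other in T.

    edges: iterable of pairs (u, v); S, T: disjoint vertex sets."""
    S, T = set(S), set(T)
    return [(u, v) for (u, v) in edges
            if (u in S and v in T) or (u in T and v in S)]
```

For example, on the 4-cycle with edges `[(0,1), (1,2), (2,3), (3,0)]`, the bipartition $S=\{0,1\}$, $T=\{2,3\}$ yields the cut $\{(1,2),(3,0)\}$ of size 2.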

Given a graph ${\displaystyle G}$, there might be many cuts in ${\displaystyle G}$, and we are interested in finding the minimum or maximum cut.

# Min-Cut

The min-cut problem, also called the global minimum cut problem, is defined as follows.

 Min-cut problem
 Input: an undirected graph ${\displaystyle G(V,E)}$;
 Output: a cut ${\displaystyle C}$ in ${\displaystyle G}$ with the smallest size ${\displaystyle |C|}$.

Equivalently, the problem asks to find a bipartition of ${\displaystyle V}$ into disjoint non-empty subsets ${\displaystyle S}$ and ${\displaystyle T}$ that minimizes ${\displaystyle |E(S,T)|}$.

We consider the problem in a slightly more generalized setting, where the input graph ${\displaystyle G}$ can be a multi-graph, meaning that there may be multiple edges between two vertices ${\displaystyle u}$ and ${\displaystyle v}$. We call such edges parallel edges. Cuts in multi-graphs are defined in the same way as before, and the cost of a cut ${\displaystyle C}$ is given by the total number of edges (including parallel edges) in ${\displaystyle C}$. Equivalently, one may think of a multi-graph as a graph with integer edge weights, where the cost of a cut ${\displaystyle C}$ is the total weight of all edges in ${\displaystyle C}$.

A canonical deterministic algorithm for this problem goes through the max-flow min-cut theorem. A max-flow algorithm finds a minimum ${\displaystyle s}$-${\displaystyle t}$ cut, which disconnects a source ${\displaystyle s\in V}$ from a sink ${\displaystyle t\in V}$, both specified as part of the input. A global min-cut can be found by exhaustively computing the minimum ${\displaystyle s}$-${\displaystyle t}$ cut for an arbitrarily fixed source ${\displaystyle s}$ and all possible sinks ${\displaystyle t\neq s}$. This takes ${\displaystyle (n-1)\times }$ max-flow time, where ${\displaystyle n=|V|}$ is the number of vertices.
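This reduction can be sketched as follows, here using the Edmonds-Karp max-flow algorithm on a capacity matrix; an undirected unit-capacity edge is modeled as two opposite directed arcs. The function names and the matrix representation are our own choices, not a prescribed implementation:

```python
from collections import deque

def _path(parent, s, t):
    """Vertices on the BFS path from s to t, excluding s itself."""
    v, path = t, []
    while v != s:
        path.append(v)
        v = parent[v]
    return path

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on a capacity matrix cap[u][v]."""
    n = len(cap)
    flow = [[0] * n for _ in range(n)]
    total = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:          # no augmenting path: flow is maximum
            return total
        # bottleneck residual capacity along the path, then augment
        bottleneck = min(cap[parent[v]][v] - flow[parent[v]][v]
                         for v in _path(parent, s, t))
        for v in _path(parent, s, t):
            flow[parent[v]][v] += bottleneck
            flow[v][parent[v]] -= bottleneck
        total += bottleneck

def global_min_cut(cap):
    """Global min cut size: fix source 0, try every sink t != 0."""
    return min(max_flow(cap, 0, t) for t in range(1, len(cap)))
```

On the 4-cycle (every max-flow value is 2) this returns 2, the size of the global min-cut.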

The fastest known deterministic algorithm for the minimum cut problem on multi-graphs is the Stoer–Wagner algorithm, which achieves an ${\displaystyle O(mn+n^{2}\log n)}$ time complexity where ${\displaystyle m=|E|}$ is the total number of edges (counting the parallel edges).

If we restrict the input to simple graphs (with no parallel edges) and no edge weights, there are better algorithms. The most recent one, published in STOC 2015, achieves a near-linear (in the number of edges) time complexity.

## Karger's Contraction algorithm

We will describe a simple and elegant randomized algorithm for the min-cut problem. The algorithm is due to David Karger.

Let ${\displaystyle G(V,E)}$ be a multi-graph, which may have multiple parallel edges between two distinct vertices ${\displaystyle u}$ and ${\displaystyle v}$ but no self-loops, i.e. edges adjoining a vertex to itself. A multi-graph ${\displaystyle G}$ can be represented by an adjacency matrix ${\displaystyle A}$ in which each non-diagonal entry ${\displaystyle A(u,v)}$ takes nonnegative integer values instead of just 0 or 1, representing the number of parallel edges between ${\displaystyle u}$ and ${\displaystyle v}$ in ${\displaystyle G}$, and all diagonal entries ${\displaystyle A(v,v)=0}$ (since there are no self-loops).

Given a multi-graph ${\displaystyle G(V,E)}$ and an edge ${\displaystyle e\in E}$, we define the following contraction operator Contract(${\displaystyle G}$, ${\displaystyle e}$), which transforms ${\displaystyle G}$ into a new multi-graph.

 The contraction operator Contract(${\displaystyle G}$, ${\displaystyle e}$)
 say ${\displaystyle e=uv}$:
 replace ${\displaystyle \{u,v\}}$ by a new vertex ${\displaystyle x}$;
 for every edge (parallel or not) of the form ${\displaystyle uw}$ or ${\displaystyle vw}$ that connects one of ${\displaystyle \{u,v\}}$ to a vertex ${\displaystyle w\in V\setminus \{u,v\}}$, replace it by a new edge ${\displaystyle xw}$;
 the rest of the graph does not change.

In other words, ${\displaystyle Contract(G,uv)}$ merges the two vertices ${\displaystyle u}$ and ${\displaystyle v}$ into a new vertex ${\displaystyle x}$, which inherits all edges incident to ${\displaystyle u}$ or ${\displaystyle v}$ in the original graph ${\displaystyle G}$ except the parallel edges between ${\displaystyle u}$ and ${\displaystyle v}$ themselves. Now you can see why we consider multi-graphs instead of simple graphs: even if we start with a simple graph without parallel edges, the contraction operator may create parallel edges.
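On the adjacency-matrix representation described above, the contraction operator amounts to folding one row/column into another. A minimal Python sketch (the in-place list-of-lists representation is our own choice):

```python
def contract(A, u, v):
    """Contract(G, uv) on an adjacency matrix A of a multigraph.

    A[u][v] is the number of parallel edges between u and v (diagonal 0).
    Vertex v is merged into u: every edge vw becomes a parallel edge uw,
    the parallel edges between u and v disappear (no self-loop is created),
    and row/column v is removed, so indices above v shift down by one.
    Modifies A in place and returns it."""
    n = len(A)
    for w in range(n):
        if w != u and w != v:
            A[u][w] += A[v][w]      # vw becomes a parallel edge uw
            A[w][u] += A[w][v]
    A[u][v] = A[v][u] = 0           # drop the uv parallel edges
    A.pop(v)                        # remove row v
    for row in A:
        row.pop(v)                  # remove column v
    return A
```

For instance, contracting any edge of the triangle $K_3$ leaves two vertices joined by two parallel edges, illustrating how contraction creates parallel edges from a simple graph.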

The contraction operator is illustrated by the following picture:

Karger's algorithm uses a simple idea:

• At each step we randomly select an edge in the current multi-graph to contract until there are only two vertices left.
• The parallel edges between these two remaining vertices must be a cut of the original graph.
• We return this cut and hope that with good chance this gives us a minimum cut.

The following is the pseudocode for Karger's algorithm.

 RandomContract (Karger 1993)
 Input: multi-graph ${\displaystyle G(V,E)}$;
 
 while ${\displaystyle |V|>2}$ do
   choose an edge ${\displaystyle uv\in E}$ uniformly at random;
   ${\displaystyle G=Contract(G,uv)}$;
 return ${\displaystyle C=E}$ (the parallel edges between the only two vertices in ${\displaystyle V}$);

Another way of looking at the contraction operator Contract(${\displaystyle G}$,${\displaystyle e}$) is that we are dealing with classes of vertices. Let ${\displaystyle V=\{v_{1},v_{2},\ldots ,v_{n}\}}$ be the set of all vertices. We start with ${\displaystyle n}$ vertex classes ${\displaystyle S_{1},S_{2},\ldots ,S_{n}}$, where each class ${\displaystyle S_{i}=\{v_{i}\}}$ contains a single vertex. By calling ${\displaystyle Contract(G,uv)}$, where ${\displaystyle u\in S_{i}}$ and ${\displaystyle v\in S_{j}}$ for distinct ${\displaystyle i\neq j}$, we take the union of ${\displaystyle S_{i}}$ and ${\displaystyle S_{j}}$. The edges in the contracted multi-graph are exactly the edges of the original graph that cross between different vertex classes.

This view of contraction is illustrated by the following picture:
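The vertex-class view also suggests an implementation with a union-find data structure: contracting an edge unions two classes, and surviving edges are sampled by rejection (a drawn edge whose endpoints fall in the same class no longer exists and is skipped, which keeps the sample uniform over surviving edges). This is a sketch of our own, assuming a connected graph with vertices labeled ${0,\ldots,n-1}$:

```python
import random

def karger_by_classes(n, edges, rng):
    """RandomContract in the vertex-class view: maintain a partition of
    {0, ..., n-1}; contracting uv unions the classes of u and v, and the
    surviving edges are exactly those crossing two distinct classes."""
    parent = list(range(n))

    def find(x):                        # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    classes = n
    while classes > 2:
        u, v = edges[rng.randrange(len(edges))]
        ru, rv = find(u), find(v)
        if ru == rv:
            continue        # edge lies inside a class: it no longer exists
        parent[ru] = rv     # Contract(G, uv): union the two classes
        classes -= 1
    # the returned cut: original edges crossing the two remaining classes
    return sum(1 for u, v in edges if find(u) != find(v))
```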

The following claim is left as an exercise for the class:

 With suitable choice of data structures, each operation ${\displaystyle Contract(G,e)}$ can be implemented within running time ${\displaystyle O(n)}$ where ${\displaystyle n=|V|}$ is the number of vertices.

In the above RandomContract algorithm, there are precisely ${\displaystyle n-2}$ contractions. Therefore, we have the following time upper bound.

 Theorem For any multigraph with ${\displaystyle n}$ vertices, the running time of the RandomContract algorithm is ${\displaystyle O(n^{2})}$.

We emphasize that this is the time complexity of a single run of the algorithm: later we will see that we may need to run the algorithm many times to guarantee a desirable accuracy.

## Analysis of accuracy

We now analyze the performance of the above algorithm. Since the algorithm is randomized, its output cut is a random variable even when the input is fixed, so the output may not always be correct. We want to give a theoretical guarantee of the chance that the algorithm returns a correct answer on an arbitrary input.

More precisely, on an arbitrarily fixed input multi-graph ${\displaystyle G}$, we want to answer the following question rigorously:

${\displaystyle p_{\text{correct}}=\Pr[\,{\text{a minimum cut is returned by }}RandomContract\,]\geq ?}$

To answer this question, we prove a stronger statement: for arbitrarily fixed input multi-graph ${\displaystyle G}$ and a particular minimum cut ${\displaystyle C}$ in ${\displaystyle G}$,

${\displaystyle p_{C}=\Pr[\,C{\mbox{ is returned by }}RandomContract\,]\geq ?}$

Obviously this will imply the previous lower bound for ${\displaystyle p_{\text{correct}}}$ because the event in ${\displaystyle p_{C}}$ implies the event in ${\displaystyle p_{\text{correct}}}$.

 In the above argument we use the simple law of probability that ${\displaystyle \Pr[A]\leq \Pr[B]}$ if ${\displaystyle A\subseteq B}$, i.e. event ${\displaystyle A}$ implies event ${\displaystyle B}$.

We introduce the following notations:

• Let ${\displaystyle e_{1},e_{2},\ldots ,e_{n-2}}$ denote the sequence of random edges chosen to contract in a running of RandomContract algorithm.
• Let ${\displaystyle G_{1}=G}$ denote the original input multi-graph. And for ${\displaystyle i=1,2,\ldots ,n-2}$, let ${\displaystyle G_{i+1}=Contract(G_{i},e_{i})}$ be the multi-graph after the ${\displaystyle i}$-th contraction.

Obviously ${\displaystyle e_{1},e_{2},\ldots ,e_{n-2}}$ are random variables, and they are the only random choices used in the algorithm: they, along with the input ${\displaystyle G}$, uniquely determine the sequence of multi-graphs ${\displaystyle G_{1},G_{2},\ldots ,G_{n-1}}$ in every iteration as well as the final output.

We now compute the probability ${\displaystyle p_{C}}$ by decomposing it into more elementary events involving ${\displaystyle e_{1},e_{2},\ldots ,e_{n-2}}$. This is due to the following proposition.

 Proposition 1 If ${\displaystyle C}$ is a minimum cut in a multi-graph ${\displaystyle G}$ and ${\displaystyle e\not \in C}$, then ${\displaystyle C}$ is still a minimum cut in the contracted graph ${\displaystyle G'=Contract(G,e)}$.
Proof.
 We first observe that contraction will never create new cuts: every cut in the contracted graph ${\displaystyle G'}$ must also be a cut in the original graph ${\displaystyle G}$. We then observe that a cut ${\displaystyle C}$ in ${\displaystyle G}$ "survives" in the contracted graph ${\displaystyle G'}$ if and only if the contracted edge ${\displaystyle e\not \in C}$. Both observations are easy to verify by the definition of contraction operator (in particular, easier to verify if we take the vertex class interpretation). The detailed proofs are left as an exercise.
${\displaystyle \square }$

Recall that ${\displaystyle e_{1},e_{2},\ldots ,e_{n-2}}$ denote the sequence of random edges chosen to contract in a running of RandomContract algorithm.

By Proposition 1, the event "${\displaystyle C}$ is returned by ${\displaystyle RandomContract}$" is equivalent to the event "${\displaystyle e_{i}\not \in C}$ for all ${\displaystyle i=1,2,\ldots ,n-2}$". Therefore:

{\displaystyle {\begin{aligned}p_{C}&=\Pr[\,C{\mbox{ is returned by }}{RandomContract}\,]\\&=\Pr[\,e_{i}\not \in C{\mbox{ for all }}i=1,2,\ldots ,n-2\,]\\&=\prod _{i=1}^{n-2}\Pr[\,e_{i}\not \in C\mid \forall j<i,e_{j}\not \in C\,].\end{aligned}}}

The last equation is due to the so-called chain rule in probability.

 The chain rule, also known as the law of progressive conditioning, is the following proposition: for a sequence of events (not necessarily independent) ${\displaystyle A_{1},A_{2},\ldots ,A_{n}}$,
 ${\displaystyle \Pr[\forall i,A_{i}]=\prod _{i=1}^{n}\Pr[A_{i}\mid \forall j<i,A_{j}]}$.
 It is a simple consequence of the definition of conditional probability. By definition of conditional probability, ${\displaystyle \Pr[A_{n}\mid \forall j<n,A_{j}]={\frac {\Pr[\forall i,A_{i}]}{\Pr[\forall j<n,A_{j}]}}}$, and equivalently we have ${\displaystyle \Pr[\forall i,A_{i}]=\Pr[\forall j<n,A_{j}]\cdot \Pr[A_{n}\mid \forall j<n,A_{j}]}$. Recursively applying this to ${\displaystyle \Pr[\forall j<n,A_{j}]}$, we obtain the chain rule.

Back to the analysis of probability ${\displaystyle p_{C}}$.

Now our task is to lower bound each ${\displaystyle p_{i}=\Pr[\,e_{i}\not \in C\mid \forall j<i,e_{j}\not \in C\,]}$. The condition "${\displaystyle \forall j<i,e_{j}\not \in C}$" means that the min-cut ${\displaystyle C}$ survives the first ${\displaystyle i-1}$ contractions ${\displaystyle e_{1},e_{2},\ldots ,e_{i-1}}$, which due to Proposition 1 means that ${\displaystyle C}$ is also a min-cut in the multi-graph ${\displaystyle G_{i}}$ obtained by applying the first ${\displaystyle (i-1)}$ contractions.

Then the conditional probability ${\displaystyle p_{i}=\Pr[\,e_{i}\not \in C\mid \forall j<i,e_{j}\not \in C\,]}$ is the probability that no edge in ${\displaystyle C}$ is hit when a uniformly random edge of the current multi-graph is chosen, given that ${\displaystyle C}$ is a minimum cut in the current multi-graph. Intuitively this probability should be bounded away from zero, because a min-cut ${\displaystyle C}$ should be sparse among all edges. This intuition is justified by the following proposition.

 Proposition 2 If ${\displaystyle C}$ is a min-cut in a multi-graph ${\displaystyle G(V,E)}$, then ${\displaystyle |E|\geq {\frac {|V||C|}{2}}}$.
Proof.
 It must hold that the degree of each vertex ${\displaystyle v\in V}$ is at least ${\displaystyle |C|}$, or otherwise the set of edges incident to ${\displaystyle v}$ forms a cut of size smaller than ${\displaystyle |C|}$ which separates ${\displaystyle \{v\}}$ from the rest of the graph, contradicting that ${\displaystyle C}$ is a min-cut. And the bound ${\displaystyle |E|\geq {\frac {|V||C|}{2}}}$ follows directly from applying the handshaking lemma to the fact that every vertex in ${\displaystyle G}$ has degree at least ${\displaystyle |C|}$.
${\displaystyle \square }$

Let ${\displaystyle V_{i}}$ and ${\displaystyle E_{i}}$ denote the vertex set and edge set of the multi-graph ${\displaystyle G_{i}}$ respectively, and recall that ${\displaystyle G_{i}}$ is the multi-graph obtained by applying the first ${\displaystyle (i-1)}$ contractions. Obviously ${\displaystyle |V_{i}|=n-i+1}$. And due to Proposition 2, ${\displaystyle |E_{i}|\geq {\frac {|V_{i}||C|}{2}}}$ if ${\displaystyle C}$ is still a min-cut in ${\displaystyle G_{i}}$.

The probability ${\displaystyle p_{i}=\Pr[\,e_{i}\not \in C\mid \forall j<i,e_{j}\not \in C\,]}$ can be computed as

{\displaystyle {\begin{aligned}p_{i}&=1-{\frac {|C|}{|E_{i}|}}\\&\geq 1-{\frac {2}{|V_{i}|}}\\&=1-{\frac {2}{n-i+1}},\end{aligned}}}

where the inequality is due to Proposition 2.

We now can put everything together. We arbitrarily fix the input multi-graph ${\displaystyle G}$ and any particular minimum cut ${\displaystyle C}$ in ${\displaystyle G}$.

{\displaystyle {\begin{aligned}p_{\text{correct}}&=\Pr[\,{\text{a minimum cut is returned by }}RandomContract\,]\\&\geq \Pr[\,C{\mbox{ is returned by }}{RandomContract}\,]\\&=\Pr[\,e_{i}\not \in C{\mbox{ for all }}i=1,2,\ldots ,n-2\,]\\&=\prod _{i=1}^{n-2}\Pr[\,e_{i}\not \in C\mid \forall j<i,e_{j}\not \in C\,]\\&\geq \prod _{i=1}^{n-2}\left(1-{\frac {2}{n-i+1}}\right)\\&=\prod _{k=3}^{n}{\frac {k-2}{k}}\\&={\frac {2}{n(n-1)}}.\end{aligned}}}

This gives us the following theorem.

 Theorem For any multigraph with ${\displaystyle n}$ vertices, the RandomContract algorithm returns a minimum cut with probability at least ${\displaystyle {\frac {2}{n(n-1)}}}$.
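The telescoping product behind the theorem can be verified numerically; this quick sanity check of our own confirms that $\prod_{i=1}^{n-2}\left(1-\frac{2}{n-i+1}\right)$ indeed equals $\frac{2}{n(n-1)}$:

```python
from math import prod  # math.prod is available since Python 3.8

# prod_{i=1}^{n-2} (1 - 2/(n-i+1)) telescopes to 2/(n(n-1))
for n in range(3, 50):
    lower = prod(1 - 2 / (n - i + 1) for i in range(1, n - 1))
    assert abs(lower - 2 / (n * (n - 1))) < 1e-12
```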

At first glance this seems a miserable chance of success. However, notice that there may be exponentially many cuts in a graph (potentially every nonempty proper subset ${\displaystyle S\subset V}$ corresponds to a cut ${\displaystyle C=E(S,{\overline {S}})}$), and Karger's algorithm effectively reduces this exponential-sized space of feasible solutions to one of quadratic size, an exponential improvement!

We can run RandomContract independently ${\displaystyle t={\frac {n(n-1)\ln n}{2}}}$ times and return the smallest cut ever found. The probability that a minimum cut is found is at least:

{\displaystyle {\begin{aligned}&\quad 1-\Pr[\,{\mbox{all }}t{\mbox{ independent runs of }}RandomContract{\mbox{ fail to find a min-cut}}\,]\\&=1-\Pr[\,{\mbox{a single run of }}{RandomContract}{\mbox{ fails}}\,]^{t}\\&\geq 1-\left(1-{\frac {2}{n(n-1)}}\right)^{\frac {n(n-1)\ln n}{2}}\\&\geq 1-{\frac {1}{n}}.\end{aligned}}}

Recall that a single run of the RandomContract algorithm takes ${\displaystyle O(n^{2})}$ time. Altogether this gives us a randomized algorithm that runs in time ${\displaystyle O(n^{4}\log n)}$ and finds a minimum cut with high probability.

## A Corollary by the Probabilistic Method

The analysis of Karger's algorithm implies the following combinatorial proposition for the number of distinct minimum cuts in a graph.

 Corollary For any graph ${\displaystyle G(V,E)}$ of ${\displaystyle n}$ vertices, the number of distinct minimum cuts in ${\displaystyle G}$ is at most ${\displaystyle {\frac {n(n-1)}{2}}}$.
Proof.
 Let ${\displaystyle {\mathcal {C}}}$ denote the set of all minimum cuts in ${\displaystyle G}$. For each min-cut ${\displaystyle C\in {\mathcal {C}}}$, let ${\displaystyle A_{C}}$ denote the event "${\displaystyle C}$ is returned by RandomContract", whose probability is given by ${\displaystyle p_{C}=\Pr[A_{C}]\,}$. Clearly we have: for any distinct ${\displaystyle C,D\in {\mathcal {C}}}$, ${\displaystyle A_{C}\,}$ and ${\displaystyle A_{D}\,}$ are disjoint events; and the union ${\displaystyle \bigcup _{C\in {\mathcal {C}}}A_{C}}$ is precisely the event "a minimum cut is returned by RandomContract", whose probability is given by ${\displaystyle p_{\text{correct}}=\Pr[\,{\text{a minimum cut is returned by }}RandomContract\,]}$. Due to the additivity of probability, it holds that ${\displaystyle p_{\text{correct}}=\sum _{C\in {\mathcal {C}}}\Pr[A_{C}]=\sum _{C\in {\mathcal {C}}}p_{C}.}$ By the analysis of Karger's algorithm, we know ${\displaystyle p_{C}\geq {\frac {2}{n(n-1)}}}$. And since ${\displaystyle p_{\text{correct}}}$ is a well defined probability, due to the unitarity of probability, it must hold that ${\displaystyle p_{\text{correct}}\leq 1}$. Therefore, ${\displaystyle 1\geq p_{\text{correct}}=\sum _{C\in {\mathcal {C}}}p_{C}\geq |{\mathcal {C}}|{\frac {2}{n(n-1)}}}$, which means ${\displaystyle |{\mathcal {C}}|\leq {\frac {n(n-1)}{2}}}$.
${\displaystyle \square }$

Note that the statement of this theorem has no randomness at all, while the proof consists of a randomized procedure. This is an example of the probabilistic method.

## Fast Min-Cut

In the analysis of RandomContract algorithm, recall that we lower bound the probability ${\displaystyle p_{C}}$ that a min-cut ${\displaystyle C}$ is returned by RandomContract by the following telescopic product:

${\displaystyle p_{C}\geq \prod _{i=1}^{n-2}\left(1-{\frac {2}{n-i+1}}\right)}$.

Here the index ${\displaystyle i}$ corresponds to the ${\displaystyle i}$th contraction. The factor ${\displaystyle \left(1-{\frac {2}{n-i+1}}\right)}$ is decreasing in ${\displaystyle i}$, which means:

• The probability of success only degrades when the graph becomes "too contracted", that is, when the number of remaining vertices is small.

This motivates us to consider the following alteration of the algorithm: first use random contractions to reduce the number of vertices to a moderately small number, and then recursively find a min-cut in this smaller instance. On its own this is just a restatement of exactly what we have been doing. But inspired by the idea of boosting the accuracy via independent repetition, here we apply the recursion to two smaller instances generated independently.

The algorithm obtained in this way is called FastCut. We first define a procedure that randomly contracts edges until only ${\displaystyle t}$ vertices are left.

 RandomContract${\displaystyle (G,t)}$
 Input: multi-graph ${\displaystyle G(V,E)}$, and integer ${\displaystyle t\geq 2}$;
 
 while ${\displaystyle |V|>t}$ do
   choose an edge ${\displaystyle uv\in E}$ uniformly at random;
   ${\displaystyle G=Contract(G,uv)}$;
 return ${\displaystyle G}$;

The FastCut algorithm is recursively defined as follows.

 FastCut${\displaystyle (G)}$
 Input: multi-graph ${\displaystyle G(V,E)}$;
 
 if ${\displaystyle |V|\leq 6}$ then return a min-cut by brute force;
 else
   let ${\displaystyle t=\left\lceil 1+|V|/{\sqrt {2}}\right\rceil }$;
   ${\displaystyle G_{1}=RandomContract(G,t)}$;
   ${\displaystyle G_{2}=RandomContract(G,t)}$;
   return the smaller one of ${\displaystyle FastCut(G_{1})}$ and ${\displaystyle FastCut(G_{2})}$;

As before, all ${\displaystyle G}$ are multigraphs.
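The recursion can be sketched directly from the pseudocode. The following is a Python sketch of our own, assuming a connected multigraph in a dict-of-dicts representation with comparable vertex labels; the brute-force base case enumerates all bipartitions of the at most 6 remaining vertices:

```python
import math
import random
from itertools import combinations

def contract_to(adj, t, rng):
    """RandomContract(G, t): contract uniform random edges until t vertices remain.

    adj: dict {u: {v: multiplicity}}, symmetric, no self-loops."""
    adj = {u: dict(nbrs) for u, nbrs in adj.items()}
    while len(adj) > t:
        edges = [(u, v, m) for u in adj for v, m in adj[u].items() if u < v]
        r = rng.randrange(sum(m for _, _, m in edges))
        for u, v, m in edges:
            r -= m
            if r < 0:
                break
        for w, m in adj.pop(v).items():     # Contract(G, uv)
            if w != u:
                adj[u][w] = adj[u].get(w, 0) + m
                adj[w][u] = adj[u][w]
            del adj[w][v]
    return adj

def brute_force_min_cut(adj):
    """Minimum cut of a small multigraph by enumerating bipartitions."""
    vs = list(adj)
    best = math.inf
    for k in range(1, len(vs) // 2 + 1):   # each bipartition has a side <= n/2
        for side in combinations(vs, k):
            S = set(side)
            cut = sum(m for u in S for v, m in adj[u].items() if v not in S)
            best = min(best, cut)
    return best

def fast_cut(adj, rng):
    """FastCut(G): recurse on two independently contracted copies."""
    n = len(adj)
    if n <= 6:
        return brute_force_min_cut(adj)
    t = math.ceil(1 + n / math.sqrt(2))
    return min(fast_cut(contract_to(adj, t, rng), rng),
               fast_cut(contract_to(adj, t, rng), rng))
```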

Fix a min-cut ${\displaystyle C}$ in the original multigraph ${\displaystyle G}$. By the same analysis as in the case of RandomContract, we have

{\displaystyle {\begin{aligned}&\Pr[C{\text{ survives all contractions in }}RandomContract(G,t)]\\=&\prod _{i=1}^{n-t}\Pr[C{\text{ survives the }}i{\text{-th contraction}}\mid C{\text{ survives the first }}(i-1){\text{ contractions}}]\\\geq &\prod _{i=1}^{n-t}\left(1-{\frac {2}{n-i+1}}\right)\\=&\prod _{k=t+1}^{n}{\frac {k-2}{k}}\\=&{\frac {t(t-1)}{n(n-1)}}.\end{aligned}}}

When ${\displaystyle t=\left\lceil 1+n/{\sqrt {2}}\right\rceil }$, this probability is at least ${\displaystyle 1/2}$. Indeed, ${\displaystyle t}$ is chosen precisely to make this probability at least ${\displaystyle 1/2}$, which will be crucial in the following analysis of accuracy.
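That this choice of $t$ works for every $n$ can be confirmed numerically; a quick check of our own (the claim also follows from $t\geq 1+n/\sqrt{2}$, which gives $t(t-1)\geq n^{2}/2+n/\sqrt{2}\geq n(n-1)/2$):

```python
import math

# t = ceil(1 + n/sqrt(2)) makes t(t-1)/(n(n-1)) >= 1/2 for every n >= 7
for n in range(7, 10000):
    t = math.ceil(1 + n / math.sqrt(2))
    assert t * (t - 1) >= n * (n - 1) / 2
```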

We denote by ${\displaystyle A}$ and ${\displaystyle B}$ the following events:

{\displaystyle {\begin{aligned}A:&\quad C{\text{ survives all contractions in }}RandomContract(G,t);\\B:&\quad {\text{size of min-cut is unchanged after }}RandomContract(G,t);\end{aligned}}}

Clearly, ${\displaystyle A}$ implies ${\displaystyle B}$ and by above analysis ${\displaystyle \Pr[B]\geq \Pr[A]\geq {\frac {1}{2}}}$.

We denote by ${\displaystyle p(n)}$ the worst-case probability that ${\displaystyle FastCut(G)}$ succeeds on a multigraph of ${\displaystyle n}$ vertices, that is

${\displaystyle p(n)=\min _{G:|V|=n}\Pr[\,FastCut(G){\text{ returns a min-cut in }}G\,].}$

Suppose that ${\displaystyle G}$ is the multigraph achieving the minimum in the above definition. The following recurrence holds for ${\displaystyle p(n)}$.

{\displaystyle {\begin{aligned}p(n)&=\Pr[\,FastCut(G){\text{ returns a min-cut in }}G\,]\\&=\Pr[\,{\text{ a min-cut of }}G{\text{ is returned by }}FastCut(G_{1}){\text{ or }}FastCut(G_{2})\,]\\&\geq 1-\left(1-\Pr[B\wedge FastCut(G_{1}){\text{ returns a min-cut in }}G_{1}\,]\right)^{2}\\&\geq 1-\left(1-\Pr[A\wedge FastCut(G_{1}){\text{ returns a min-cut in }}G_{1}\,]\right)^{2}\\&=1-\left(1-\Pr[A]\Pr[FastCut(G_{1}){\text{ returns a min-cut in }}G_{1}\mid A]\right)^{2}\\&\geq 1-\left(1-{\frac {1}{2}}p\left(\left\lceil 1+n/{\sqrt {2}}\right\rceil \right)\right)^{2},\end{aligned}}}

where ${\displaystyle A}$ and ${\displaystyle B}$ are defined as above such that ${\displaystyle \Pr[A]\geq {\frac {1}{2}}}$.

The base case is that ${\displaystyle p(n)=1}$ for ${\displaystyle n\leq 6}$. By induction it is easy to prove that

${\displaystyle p(n)=\Omega \left({\frac {1}{\log n}}\right).}$

Recall that we can implement an edge contraction in ${\displaystyle O(n)}$ time, so it is easy to verify the following recurrence for the time complexity:

${\displaystyle T(n)=2T\left(\left\lceil 1+n/{\sqrt {2}}\right\rceil \right)+O(n^{2}),}$

where ${\displaystyle T(n)}$ denotes the running time of ${\displaystyle FastCut(G)}$ on a multigraph ${\displaystyle G}$ of ${\displaystyle n}$ vertices.

By induction with the base case ${\displaystyle T(n)=O(1)}$ for ${\displaystyle n\leq 6}$, it is easy to verify that ${\displaystyle T(n)=O(n^{2}\log n)}$.

 Theorem For any multigraph with ${\displaystyle n}$ vertices, the FastCut algorithm returns a minimum cut with probability ${\displaystyle \Omega \left({\frac {1}{\log n}}\right)}$ in time ${\displaystyle O(n^{2}\log n)}$.

At this point we see that the name FastCut is misleading: it is actually slower than the original RandomContract algorithm. However, the chance of successfully finding a min-cut is much better (improved from ${\displaystyle \Omega (1/n^{2})}$ to ${\displaystyle \Omega (1/\log n)}$).

Given any input multi-graph, running the FastCut algorithm independently ${\displaystyle O((\log n)^{2})}$ times and returning the smallest cut ever found gives an algorithm which runs in time ${\displaystyle O(n^{2}\log ^{3}n)}$ and returns a min-cut with probability ${\displaystyle 1-O(1/n)}$, i.e. with high probability.

Recall that the running time of the best known deterministic algorithm for min-cut on multi-graphs is ${\displaystyle O(mn+n^{2}\log n)}$. On dense graphs, the randomized algorithm greatly outperforms the best known deterministic algorithm.

# Max-Cut

The maximum cut problem, in short the max-cut problem, is defined as follows.

 Max-cut problem
 Input: an undirected graph ${\displaystyle G(V,E)}$;
 Output: a bipartition of ${\displaystyle V}$ into disjoint subsets ${\displaystyle S}$ and ${\displaystyle T}$ that maximizes ${\displaystyle |E(S,T)|}$.

The problem is a typical MAX-CSP, an optimization version of the constraint satisfaction problem. An instance of CSP consists of:

• a set of variables ${\displaystyle x_{1},x_{2},\ldots ,x_{n}}$ usually taking values from some finite domain;
• a sequence of constraints (predicates) ${\displaystyle C_{1},C_{2},\ldots ,C_{m}}$ defined on those variables.

The MAX-CSP asks to find an assignment of values to variables ${\displaystyle x_{1},x_{2},\ldots ,x_{n}}$ which maximizes the number of satisfied constraints.

In particular, when the variables ${\displaystyle x_{1},x_{2},\ldots ,x_{n}}$ take Boolean values in ${\displaystyle \{0,1\}}$ and every constraint is a binary inequality constraint ${\displaystyle \cdot \neq \cdot }$ of the form ${\displaystyle x_{i}\neq x_{j}}$, the MAX-CSP is precisely the max-cut problem.

Unlike the min-cut problem, which can be solved in polynomial time, the max-cut is known to be NP-hard. Its decision version is among the 21 NP-complete problems found by Karp. This means we should not hope for a polynomial-time algorithm for solving the problem if a famous conjecture in computational complexity is correct. And due to another less famous conjecture in computational complexity, randomization alone probably cannot help this situation either.

We may compromise our goal and allow the algorithm to not always find the optimal solution. However, we still want to guarantee that the algorithm always returns a relatively good solution on every possible instance. This notion is formally captured by approximation algorithms and the approximation ratio.

## Greedy algorithm

A natural heuristic for solving the max-cut problem is to sequentially assign the vertices to one of the two disjoint subsets ${\displaystyle S}$ and ${\displaystyle T}$, greedily maximizing the current number of edges crossing between ${\displaystyle S}$ and ${\displaystyle T}$.

To state the algorithm, we overload the definition ${\displaystyle E(S,T)}$. Given an undirected graph ${\displaystyle G(V,E)}$, for any disjoint subsets ${\displaystyle S,T\subseteq V}$ of vertices, we define

${\displaystyle E(S,T)=\{uv\in E\mid u\in S,v\in T\}}$.

We also assume that the vertices are ordered arbitrarily as ${\displaystyle V=\{v_{1},v_{2},\ldots ,v_{n}\}}$.

The greedy heuristic is then described as follows.

 GreedyMaxCut
 Input: undirected graph ${\displaystyle G(V,E)}$, with an arbitrary order of vertices ${\displaystyle V=\{v_{1},v_{2},\ldots ,v_{n}\}}$;
 
 initially ${\displaystyle S=T=\emptyset }$;
 for ${\displaystyle i=1,2,\ldots ,n}$
   ${\displaystyle v_{i}}$ joins one of ${\displaystyle S,T}$ to maximize the current ${\displaystyle |E(S,T)|}$ (breaking ties arbitrarily);
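A minimal Python sketch of the greedy heuristic (the function name `greedy_max_cut` and the edge-list interface are our own choices; ties are broken in favor of $S$, one valid instance of "breaking ties arbitrarily"):

```python
def greedy_max_cut(n, edges):
    """GreedyMaxCut: place vertices 0, ..., n-1 in order, each joining the
    side (S or T) that adds more edges to the current cut: a vertex joining
    S gains the edges to its already-placed neighbours in T, and vice
    versa.  Returns (S, T, size of the resulting cut)."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    S, T = set(), set()
    for v in range(n):
        gain_S = len(adj[v] & T)   # crossing edges gained if v joins S
        gain_T = len(adj[v] & S)   # crossing edges gained if v joins T
        (S if gain_S >= gain_T else T).add(v)
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    return S, T, cut
```

On the triangle (where $OPT=2$) the heuristic returns a cut of size 2; in general, as shown below, the returned cut always contains at least half of all edges.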

The algorithm certainly runs in polynomial time.

Without any guarantee on how well the solution returned by the algorithm approximates the optimal solution, the algorithm is only a heuristic, not an approximation algorithm.

### Approximation ratio

For now we restrict ourselves to the max-cut problem, although the notion applies more generally.

Let ${\displaystyle G}$ be an arbitrary instance of the max-cut problem. Let ${\displaystyle OPT_{G}}$ denote the size of the max-cut in graph ${\displaystyle G}$. More precisely,

${\displaystyle OPT_{G}=\max _{S\subseteq V}|E(S,{\overline {S}})|}$.

Let ${\displaystyle SOL_{G}}$ be the size of the cut ${\displaystyle |E(S,T)|}$ returned by the GreedyMaxCut algorithm on the input graph ${\displaystyle G}$.

As a maximization problem it is trivial that ${\displaystyle SOL_{G}\leq OPT_{G}}$ for all ${\displaystyle G}$. To guarantee that the GreedyMaxCut gives good approximation of optimal solution, we need the other direction:

 Approximation ratio We say that the approximation ratio of the GreedyMaxCut algorithm is ${\displaystyle \alpha }$, or GreedyMaxCut is an ${\displaystyle \alpha }$-approximation algorithm, for some ${\displaystyle 0<\alpha \leq 1}$, if ${\displaystyle {\frac {SOL_{G}}{OPT_{G}}}\geq \alpha }$ for every possible instance ${\displaystyle G}$ of max-cut.

With this notion, we now try to analyze the approximation ratio of the GreedyMaxCut algorithm.

A difficulty in applying this notion in our analysis is that the definition of approximation ratio compares the solution returned by the algorithm with the optimal solution, while in the analysis we can hardly carry out such a comparison directly: computing the optimal solution is NP-hard, so there is no easy way (e.g. a closed form) to describe it.

A standard first step in analyzing an approximation ratio is to avoid this difficulty by comparing not to the optimal solution itself but to an upper bound on it (for a minimization problem, a lower bound), that is, to something at least as good as the optimal solution, which may not be realized by any feasible solution.

For the max-cut problem, a simple upper bound to ${\displaystyle OPT_{G}}$ is ${\displaystyle |E|}$, the number of all edges. This is a trivial upper bound of max-cut since any cut is a subset of edges.

Let ${\displaystyle G(V,E)}$ be the input graph and ${\displaystyle V=\{v_{1},v_{2},\ldots ,v_{n}\}}$. Initially ${\displaystyle S_{1}=T_{1}=\emptyset }$. And for ${\displaystyle i=1,2,\ldots ,n}$, we let ${\displaystyle S_{i+1}}$ and ${\displaystyle T_{i+1}}$ be the respective ${\displaystyle S}$ and ${\displaystyle T}$ after ${\displaystyle v_{i}}$ joins one of ${\displaystyle S,T}$. More precisely,

• ${\displaystyle S_{i+1}=S_{i}\cup \{v_{i}\}}$ and ${\displaystyle T_{i+1}=T_{i}\,}$ if ${\displaystyle |E(S_{i}\cup \{v_{i}\},T_{i})|>|E(S_{i},T_{i}\cup \{v_{i}\})|}$;
• ${\displaystyle S_{i+1}=S_{i}\,}$ and ${\displaystyle T_{i+1}=T_{i}\cup \{v_{i}\}}$ if otherwise.

Finally, the cut returned by the algorithm is given by

${\displaystyle SOL_{G}=|E(S_{n+1},T_{n+1})|.}$

We first observe that the number of edges ${\displaystyle |E|}$ can be counted by summing up the contributions of the individual ${\displaystyle v_{i}}$'s.

 Proposition 1 ${\displaystyle |E|=\sum _{i=1}^{n}\left(|E(S_{i},\{v_{i}\})|+|E(T_{i},\{v_{i}\})|\right)}$.
Proof.
 Note that ${\displaystyle S_{i}\cup T_{i}=\{v_{1},v_{2},\ldots ,v_{i-1}\}}$, i.e. ${\displaystyle S_{i}}$ and ${\displaystyle T_{i}}$ together contain precisely the vertices preceding ${\displaystyle v_{i}}$. Therefore, by taking the sum ${\displaystyle \sum _{i=1}^{n}\left(|E(S_{i},\{v_{i}\})|+|E(T_{i},\{v_{i}\})|\right)}$, we effectively enumerate all pairs ${\displaystyle (v_{j},v_{i})}$ with ${\displaystyle v_{j}v_{i}\in E}$ and ${\displaystyle j<i}$. The total number is precisely ${\displaystyle |E|}$.
${\displaystyle \square }$

We then observe that ${\displaystyle SOL_{G}}$ can be decomposed into contributions of the individual ${\displaystyle v_{i}}$'s in the same way.

 Proposition 2 ${\displaystyle SOL_{G}=\sum _{i=1}^{n}\max \left(|E(S_{i},\{v_{i}\})|,|E(T_{i},\{v_{i}\})|\right)}$.
Proof.
 It is easy to observe that ${\displaystyle E(S_{i},T_{i})\subseteq E(S_{i+1},T_{i+1})}$, i.e. once an edge joins the cut between the current ${\displaystyle S,T}$ it never drops from the cut in the future. We then define ${\displaystyle \Delta _{i}=|E(S_{i+1},T_{i+1})|-|E(S_{i},T_{i})|=|E(S_{i+1},T_{i+1})\setminus E(S_{i},T_{i})|}$ to be the contribution of ${\displaystyle v_{i}}$ to the final cut. It holds that ${\displaystyle \sum _{i=1}^{n}\Delta _{i}=|E(S_{n+1},T_{n+1})|-|E(S_{1},T_{1})|=|E(S_{n+1},T_{n+1})|=SOL_{G}}$. On the other hand, due to the greedy rule (${\displaystyle S_{i+1}=S_{i}\cup \{v_{i}\}}$ and ${\displaystyle T_{i+1}=T_{i}\,}$ if ${\displaystyle |E(S_{i}\cup \{v_{i}\},T_{i})|>|E(S_{i},T_{i}\cup \{v_{i}\})|}$, and ${\displaystyle S_{i+1}=S_{i}\,}$ and ${\displaystyle T_{i+1}=T_{i}\cup \{v_{i}\}}$ otherwise), it holds that ${\displaystyle \Delta _{i}=|E(S_{i+1},T_{i+1})\setminus E(S_{i},T_{i})|=\max \left(|E(S_{i},\{v_{i}\})|,|E(T_{i},\{v_{i}\})|\right)}$. Together the proposition follows.
${\displaystyle \square }$

Combining the above Proposition 1 and Proposition 2, we have

{\displaystyle {\begin{aligned}SOL_{G}&=\sum _{i=1}^{n}\max \left(|E(S_{i},\{v_{i}\})|,|E(T_{i},\{v_{i}\})|\right)\\&\geq {\frac {1}{2}}\sum _{i=1}^{n}\left(|E(S_{i},\{v_{i}\})|+|E(T_{i},\{v_{i}\})|\right)\\&={\frac {1}{2}}|E|\\&\geq {\frac {1}{2}}OPT_{G}.\end{aligned}}}
 Theorem The GreedyMaxCut is a ${\displaystyle 0.5}$-approximation algorithm for the max-cut problem.

However, this is not the best approximation ratio achievable by polynomial-time algorithms for max-cut.

• The best known approximation ratio achieved by any polynomial-time algorithm is that of the Goemans-Williamson algorithm, which rounds an SDP (semidefinite programming) relaxation of max-cut and achieves an approximation ratio ${\displaystyle \alpha ^{*}\approx 0.878}$, where ${\displaystyle \alpha ^{*}}$ is an irrational number whose precise value is given by ${\displaystyle \alpha ^{*}={\frac {2}{\pi }}\inf _{x\in [-1,1]}{\frac {\arccos(x)}{1-x}}}$.
• Assuming the unique games conjecture, no polynomial-time algorithm for max-cut can achieve an approximation ratio ${\displaystyle \alpha >\alpha ^{*}}$.

## Derandomization by conditional expectation

There is a probabilistic interpretation of the greedy algorithm, which may explain why the greedy scheme is used for max-cut and why it succeeds in finding an approximate max-cut.

Given an undirected graph ${\displaystyle G(V,E)}$, let us calculate the average size of cuts in ${\displaystyle G}$. For every vertex ${\displaystyle v\in V}$ let ${\displaystyle X_{v}\in \{0,1\}}$ be a uniform and independent random bit which indicates whether ${\displaystyle v}$ joins ${\displaystyle S}$ or ${\displaystyle T}$. This gives us a uniform random bipartition of ${\displaystyle V}$ into ${\displaystyle S}$ and ${\displaystyle T}$.

The size of the random cut ${\displaystyle |E(S,T)|}$ is given by

${\displaystyle |E(S,T)|=\sum _{uv\in E}I[X_{u}\neq X_{v}],}$

where ${\displaystyle I[X_{u}\neq X_{v}]}$ is the Boolean indicator random variable that indicates whether event ${\displaystyle X_{u}\neq X_{v}}$ occurs.

Due to linearity of expectation,

${\displaystyle \mathbb {E} [|E(S,T)|]=\sum _{uv\in E}\mathbb {E} [I[X_{u}\neq X_{v}]]=\sum _{uv\in E}\Pr[X_{u}\neq X_{v}]={\frac {|E|}{2}}.}$

Recall that ${\displaystyle |E|}$ is a trivial upper bound for the max-cut ${\displaystyle OPT_{G}}$. Due to the above argument, we have

${\displaystyle \mathbb {E} [|E(S,T)|]\geq {\frac {OPT_{G}}{2}}.}$
 In the above argument we use a few probability propositions.
• Linearity of expectation: let ${\displaystyle {\boldsymbol {X}}=(X_{1},X_{2},\ldots ,X_{n})}$ be a random vector. Then ${\displaystyle \mathbb {E} \left[\sum _{i=1}^{n}c_{i}X_{i}\right]=\sum _{i=1}^{n}c_{i}\mathbb {E} [X_{i}]}$, where ${\displaystyle c_{1},c_{2},\ldots ,c_{n}}$ are scalars. That is, expectation commutes with linear (affine) functions of a random vector. Note that this property holds regardless of any dependency between the random variables, which makes it very useful.
• Expectation of an indicator random variable: we use the notation ${\displaystyle I[A]}$ for the Boolean indicator random variable of the event ${\displaystyle A}$: ${\displaystyle I[A]=1}$ if ${\displaystyle A}$ occurs and ${\displaystyle I[A]=0}$ otherwise. It is easy to see that ${\displaystyle \mathbb {E} [I[A]]=\Pr[A]}$: the expectation of an indicator random variable equals the probability of the event it indicates.
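As a quick empirical sanity check of the identity ${\displaystyle \mathbb {E} [|E(S,T)|]={\frac {|E|}{2}}}$, one can sample uniform random bipartitions and average the cut sizes. The 5-cycle below is our own illustrative example, not from the text.

```python
import random

def random_cut_size(n, edges, rng):
    """Sample X_v uniformly and independently; return |E(S,T)|."""
    x = [rng.randrange(2) for _ in range(n)]  # X_v for v = 0, ..., n-1
    # an edge uv is in the cut iff X_u != X_v
    return sum(x[u] != x[v] for u, v in edges)

# 5-cycle on vertices 0..4: |E| = 5, so the average cut size should be 2.5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
rng = random.Random(0)
avg = sum(random_cut_size(5, edges, rng) for _ in range(100_000)) / 100_000
```

With a fixed seed and 100,000 samples, the empirical average lands very close to ${\displaystyle {\frac {|E|}{2}}=2.5}$.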

By the above analysis, the average (under the uniform distribution) size of all cuts in any graph ${\displaystyle G}$ must be at least ${\displaystyle {\frac {OPT_{G}}{2}}}$. Due to the probabilistic method, in particular the averaging principle, there must exist a bipartition of ${\displaystyle V}$ into ${\displaystyle S}$ and ${\displaystyle T}$ whose cut ${\displaystyle E(S,T)}$ is of size at least ${\displaystyle {\frac {OPT_{G}}{2}}}$. The next question is how to find such a bipartition ${\displaystyle \{S,T\}}$ algorithmically.

We still fix an arbitrary order of all vertices as ${\displaystyle V=\{v_{1},v_{2},\ldots ,v_{n}\}}$. Recall that each vertex ${\displaystyle v_{i}}$ is associated with a uniform and independent random bit ${\displaystyle X_{v_{i}}}$ to indicate whether ${\displaystyle v_{i}}$ joins ${\displaystyle S}$ or ${\displaystyle T}$. We want to fix the value of ${\displaystyle X_{v_{i}}}$ one after another to construct a bipartition ${\displaystyle \{{\hat {S}},{\hat {T}}\}}$ of ${\displaystyle V}$ such that

${\displaystyle |E({\hat {S}},{\hat {T}})|\geq \mathbb {E} [|E(S,T)|]\geq {\frac {OPT_{G}}{2}}}$.

We start with the first vertex ${\displaystyle v_{1}}$ and its random variable ${\displaystyle X_{v_{1}}}$. By the law of total expectation,

${\displaystyle \mathbb {E} [|E(S,T)|]={\frac {1}{2}}\mathbb {E} [|E(S,T)|\mid X_{v_{1}}=0]+{\frac {1}{2}}\mathbb {E} [|E(S,T)|\mid X_{v_{1}}=1].}$

There must exist an assignment ${\displaystyle x_{1}\in \{0,1\}}$ of ${\displaystyle X_{v_{1}}}$ such that

${\displaystyle \mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1}]\geq \mathbb {E} [|E(S,T)|]}$.

We can apply this argument repeatedly. In general, for any ${\displaystyle i\leq n}$ and any particular partial assignment ${\displaystyle x_{1},x_{2},\ldots ,x_{i-1}\in \{0,1\}}$ of ${\displaystyle X_{v_{1}},X_{v_{2}},\ldots ,X_{v_{i-1}}}$, by the law of total expectation

{\displaystyle {\begin{aligned}\mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{i-1}}=x_{i-1}]=&{\frac {1}{2}}\mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{i-1}}=x_{i-1},X_{v_{i}}=0]\\&+{\frac {1}{2}}\mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{i-1}}=x_{i-1},X_{v_{i}}=1].\end{aligned}}}

There must exist an assignment ${\displaystyle x_{i}\in \{0,1\}}$ of ${\displaystyle X_{v_{i}}}$ such that

${\displaystyle \mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{i}}=x_{i}]\geq \mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{i-1}}=x_{i-1}].}$

By this argument, we can find a sequence ${\displaystyle x_{1},x_{2},\ldots ,x_{n}\in \{0,1\}}$ of bits which forms a monotone path:

${\displaystyle \mathbb {E} [|E(S,T)|]\leq \cdots \leq \mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{i-1}}=x_{i-1}]\leq \mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{i}}=x_{i}]\leq \cdots \leq \mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{n}}=x_{n}].}$

We already know that the first step of this monotone path satisfies ${\displaystyle \mathbb {E} [|E(S,T)|]\geq {\frac {OPT_{G}}{2}}}$. As for the last step ${\displaystyle \mathbb {E} [|E(S,T)|\mid X_{v_{1}}=x_{1},\ldots ,X_{v_{n}}=x_{n}]}$, since all random bits have been fixed, a bipartition ${\displaystyle ({\hat {S}},{\hat {T}})}$ is determined by the assignment ${\displaystyle x_{1},\ldots ,x_{n}}$, so the expectation does nothing but return the size of that cut ${\displaystyle |E({\hat {S}},{\hat {T}})|}$. We have thus found a cut ${\displaystyle E({\hat {S}},{\hat {T}})}$ such that ${\displaystyle |E({\hat {S}},{\hat {T}})|\geq {\frac {OPT_{G}}{2}}}$.

We translate the procedure of constructing this monotone path of conditional expectation to the following algorithm.

 MonotonePath Input: undirected graph ${\displaystyle G(V,E)}$, with an arbitrary order of vertices ${\displaystyle V=\{v_{1},v_{2},\ldots ,v_{n}\}}$; initially ${\displaystyle S=T=\emptyset }$; for ${\displaystyle i=1,2,\ldots ,n}$ ${\displaystyle v_{i}}$ joins one of ${\displaystyle S,T}$ to maximize the average size of cut conditioning on the choices made so far by the vertices ${\displaystyle v_{1},v_{2},\ldots ,v_{i}}$;
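The conditional expectation maximized in each step has a simple closed form: an edge contributes 1 if both endpoints are already fixed on opposite sides, 0 if fixed on the same side, and 1/2 if any endpoint is still random. The following Python sketch (function names are our own) implements MonotonePath using this formula:

```python
def cond_exp_cut(edges, fixed):
    """E[|E(S,T)|] conditioned on the bits fixed so far.

    fixed: dict mapping an already-placed vertex to its bit (0 for S, 1 for T).
    """
    total = 0.0
    for u, v in edges:
        if u in fixed and v in fixed:
            total += 1.0 if fixed[u] != fixed[v] else 0.0
        else:
            total += 0.5  # at least one endpoint is still a fair random bit
    return total

def monotone_path_cut(vertices, edges):
    """Fix the bits one by one, never decreasing the conditional expectation."""
    fixed = {}
    for v in vertices:
        fixed[v] = 0
        e0 = cond_exp_cut(edges, fixed)
        fixed[v] = 1
        e1 = cond_exp_cut(edges, fixed)
        fixed[v] = 0 if e0 >= e1 else 1
    S = {v for v in vertices if fixed[v] == 0}
    T = {v for v in vertices if fixed[v] == 1}
    return S, T
```

Since only the edges incident to the current vertex change their contribution between the two candidate assignments, each step reduces to comparing the number of cut edges gained, which is exactly the greedy rule.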

We leave it as an exercise to verify that the choice made by each ${\displaystyle v_{i}}$ in the MonotonePath algorithm (joining whichever of ${\displaystyle S,T}$ maximizes the average cut size conditioning on the choices made so far by the vertices ${\displaystyle v_{1},v_{2},\ldots ,v_{i}}$) must coincide with the choice made by ${\displaystyle v_{i}}$ in the GreedyMaxCut algorithm (joining whichever of ${\displaystyle S,T}$ maximizes the current ${\displaystyle |E(S,T)|}$).

Therefore, the greedy algorithm for max-cut is in fact a derandomization, by the method of conditional expectations, of the average-case argument above.

## Derandomization by pairwise independence

We still construct a random bipartition of ${\displaystyle V}$ into ${\displaystyle S}$ and ${\displaystyle T}$. But this time the random choices have bounded independence.

For each vertex ${\displaystyle v\in V}$, we use a Boolean random variable ${\displaystyle Y_{v}\in \{0,1\}}$ to indicate whether ${\displaystyle v}$ joins ${\displaystyle S}$ or ${\displaystyle T}$. The dependencies between the ${\displaystyle Y_{v}}$'s are to be specified later.

By linearity of expectation, regardless of the dependencies between ${\displaystyle Y_{v}}$'s, it holds that:

${\displaystyle \mathbb {E} [|E(S,T)|]=\sum _{uv\in E}\Pr[Y_{u}\neq Y_{v}].}$

In order to have the same average cut ${\displaystyle \mathbb {E} [|E(S,T)|]={\frac {|E|}{2}}}$ as in the fully random case, we only need ${\displaystyle \Pr[Y_{u}\neq Y_{v}]={\frac {1}{2}}}$ for every edge ${\displaystyle uv\in E}$. This requires only that the Boolean random variables ${\displaystyle Y_{v}}$'s are uniform and pairwise independent, rather than mutually independent.

The ${\displaystyle n}$ pairwise independent random bits ${\displaystyle \{Y_{v}\}_{v\in V}}$ can be constructed by at most ${\displaystyle k=\lceil \log(n+1)\rceil }$ mutually independent random bits ${\displaystyle X_{1},X_{2},\ldots ,X_{k}\in \{0,1\}}$ by the following standard routine.

 Theorem Let ${\displaystyle X_{1},X_{2},\ldots ,X_{k}\in \{0,1\}}$ be mutually independent uniform random bits. Let ${\displaystyle S_{1},S_{2},\ldots ,S_{2^{k}-1}\subseteq \{1,2,\ldots ,k\}}$ enumerate the ${\displaystyle 2^{k}-1}$ nonempty subsets of ${\displaystyle \{1,2,\ldots ,k\}}$. For each ${\displaystyle 1\leq i\leq 2^{k}-1}$, let ${\displaystyle Y_{i}=\bigoplus _{j\in S_{i}}X_{j}=\left(\sum _{j\in S_{i}}X_{j}\right){\bmod {2}}.}$ Then ${\displaystyle Y_{1},Y_{2},\ldots ,Y_{2^{k}-1}}$ are pairwise independent uniform random bits.
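This construction is easy to carry out in code. In the sketch below (our own indexing convention), the ${\displaystyle i}$-th nonempty subset ${\displaystyle S_{i}}$ is read off from the binary representation of ${\displaystyle i}$, and the small exhaustive check confirms pairwise independence for ${\displaystyle k=3}$:

```python
from itertools import product

def expand_bits(x):
    """Expand k independent bits into 2^k - 1 pairwise independent bits.

    Y_i (for i = 1, ..., 2^k - 1) is the XOR of the bits x_j with j in
    the i-th nonempty subset, encoded by the binary representation of i.
    """
    k = len(x)
    return [
        sum(x[j] for j in range(k) if (i >> j) & 1) % 2
        for i in range(1, 2 ** k)
    ]

# Pairwise independence check for k = 3: over all 2^k seeds, every pair
# (Y_i, Y_j) with i != j hits each value in {0,1}^2 equally often.
rows = [expand_bits(list(x)) for x in product([0, 1], repeat=3)]
```

Each pair of distinct output bits takes every value in ${\displaystyle \{0,1\}^{2}}$ exactly ${\displaystyle 2^{k}/4}$ times over the ${\displaystyle 2^{k}}$ seeds, which is precisely uniformity plus pairwise independence.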

If ${\displaystyle Y_{v}}$ for each vertex ${\displaystyle v\in V}$ is constructed in this way from at most ${\displaystyle k=\lceil \log(n+1)\rceil }$ mutually independent random bits ${\displaystyle X_{1},X_{2},\ldots ,X_{k}\in \{0,1\}}$, then the ${\displaystyle Y_{v}}$'s are uniform and pairwise independent, and by the above calculation the corresponding bipartition ${\displaystyle \{S,T\}}$ of ${\displaystyle V}$ satisfies

${\displaystyle \mathbb {E} [|E(S,T)|]=\sum _{uv\in E}\Pr[Y_{u}\neq Y_{v}]={\frac {|E|}{2}}.}$

Note that the average is taken over the random choices of ${\displaystyle X_{1},X_{2},\ldots ,X_{k}\in \{0,1\}}$ (because they are the only random choices used to construct the bipartition ${\displaystyle \{S,T\}}$). By the probabilistic method, there must exist an assignment of ${\displaystyle X_{1},X_{2},\ldots ,X_{k}\in \{0,1\}}$ such that the corresponding ${\displaystyle Y_{v}}$'s and the bipartition ${\displaystyle \{S,T\}}$ of ${\displaystyle V}$ indicated by the ${\displaystyle Y_{v}}$'s have that

${\displaystyle |E(S,T)|\geq {\frac {|E|}{2}}\geq {\frac {OPT_{G}}{2}}}$.

This gives us the following algorithm, which exhaustively searches a solution space of size only ${\displaystyle 2^{k}=O(n)}$.

 Algorithm Enumerate vertices as ${\displaystyle V=\{v_{1},v_{2},\ldots ,v_{n}\}}$; let ${\displaystyle k=\lceil \log(n+1)\rceil }$; for all ${\displaystyle {\vec {x}}\in \{0,1\}^{k}}$ initialize ${\displaystyle S_{\vec {x}}=T_{\vec {x}}=\emptyset }$; for ${\displaystyle i=1,2,\ldots ,n}$ if ${\displaystyle \bigoplus _{j:\lfloor i/2^{j}\rfloor {\bmod {2}}=1}x_{j}=1}$ then ${\displaystyle v_{i}}$ joins ${\displaystyle S_{\vec {x}}}$; else ${\displaystyle v_{i}}$ joins ${\displaystyle T_{\vec {x}}}$; return the ${\displaystyle \{S_{\vec {x}},T_{\vec {x}}\}}$ with the largest ${\displaystyle |E(S_{\vec {x}},T_{\vec {x}})|}$;

The algorithm has approximation ratio 1/2 and runs in polynomial time.
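Putting it together, the exhaustive search over seeds can be sketched in Python as follows (the function name and the 1-indexed vertex convention are our own; the 5-cycle is an illustrative example):

```python
import math
from itertools import product

def derandomized_max_cut(n, edges):
    """0.5-approximate max-cut by trying all 2^k seeds, k = ceil(log2(n+1)).

    Vertices are 1, ..., n; edges is a list of (u, v) pairs.
    Vertex i joins S iff the XOR of the seed bits x_j, over positions j
    set in the binary representation of i, equals 1.
    Returns the best (S, T, cut_size) found.
    """
    k = max(1, math.ceil(math.log2(n + 1)))
    best = (set(), set(), -1)
    for x in product([0, 1], repeat=k):
        S, T = set(), set()
        for i in range(1, n + 1):
            y = sum(x[j] for j in range(k) if (i >> j) & 1) % 2
            (S if y == 1 else T).add(i)
        cut = sum((u in S) != (v in S) for u, v in edges)
        if cut > best[2]:
            best = (S, T, cut)
    return best

# On the 5-cycle, the average cut over seeds is |E|/2 = 2.5, so the best
# seed must give a cut of size at least 3 (cut sizes are integers).
S, T, cut = derandomized_max_cut(5, [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)])
```

The outer loop ranges over ${\displaystyle 2^{k}=O(n)}$ seeds and each inner pass costs ${\displaystyle O(nk+|E|)}$ time, so the whole search runs in polynomial time as claimed.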