Combinatorics (Spring 2015)/Existence Problems and Randomized Algorithms (Fall 2015)/Randomized Rounding

== Existence by Counting ==
=== Shannon's circuit lower bound ===
This is a fundamental problem in Computer Science.

A '''boolean function''' is a function in the form <math>f:\{0,1\}^n\rightarrow \{0,1\}</math>.

[http://en.wikipedia.org/wiki/Boolean_circuit Boolean circuit] is a mathematical model of computation.
Formally, a boolean circuit is a directed acyclic graph. Nodes with indegree zero are input nodes, labeled <math>x_1, x_2, \ldots , x_n</math>. A circuit has a unique node with outdegree zero, called the output node. Every other node is a gate. There are three types of gates: AND, OR (both with indegree two), and NOT (with indegree one).

Computations in Turing machines can be simulated by circuits, and any boolean function in '''P''' can be computed by a circuit with polynomially many gates. Thus, if we can find a function in '''NP''' that cannot be computed by any circuit with polynomially many gates, then '''NP'''<math>\neq</math>'''P'''.

The following theorem, due to Shannon, says that functions with exponentially large circuit complexity do exist.


{{Theorem
|Theorem (Shannon 1949)|
:There is a boolean function <math>f:\{0,1\}^n\rightarrow \{0,1\}</math> with circuit complexity greater than <math>\frac{2^n}{3n}</math>.
}}
{{Proof|
We first count the number of boolean functions <math>f:\{0,1\}^n\rightarrow \{0,1\}</math>. There are <math>2^{2^n}</math> such functions.

Then we count the number of boolean circuits with a fixed number of gates. Fix an integer <math>t</math>; we count the number of circuits with <math>t</math> gates. By [http://en.wikipedia.org/wiki/De_Morgan's_laws De Morgan's laws], we can assume that all NOTs are pushed back to the inputs. Each gate has one of the two types (AND or OR), and has two inputs. Each of the inputs to a gate is either a constant 0 or 1, an input variable <math>x_i</math>, an inverted input variable <math>\neg x_i</math>, or the output of another gate; thus, there are at most <math>2+2n+t-1</math> possible gate inputs. It follows that the number of circuits with <math>t</math> gates is at most <math>2^t(t+2n+1)^{2t}</math>.

If <math>t=2^n/3n</math>, then
:<math>\frac{2^t(t+2n+1)^{2t}}{2^{2^n}}=o(1)<1,</math>      thus, <math>2^t(t+2n+1)^{2t} < 2^{2^n}.</math>

Each boolean circuit computes one boolean function. Therefore, there must exist a boolean function <math>f</math> which cannot be computed by any circuit with <math>2^n/3n</math> gates.
}}


Note that by Shannon's theorem, not only does there exist a boolean function with exponentially large circuit complexity, but in fact ''almost all'' boolean functions have exponentially large circuit complexity.
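The two counts in Shannon's proof can be checked numerically. A minimal sketch in Python, using exactly the bounds from the proof (the chosen values of <math>n</math> are arbitrary examples):

```python
# Shannon's counting argument: for t = 2^n/(3n), the number of circuits
# with t gates is far below the number of boolean functions on n inputs.

def num_functions(n):
    # number of boolean functions f: {0,1}^n -> {0,1}
    return 2 ** (2 ** n)

def max_num_circuits(n, t):
    # each of the t gates has 2 types and at most (t + 2n + 1)^2 input pairs
    return 2 ** t * (t + 2 * n + 1) ** (2 * t)

for n in (8, 12, 16):
    t = 2 ** n // (3 * n)
    assert max_num_circuits(n, t) < num_functions(n)
```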


=== Double counting ===
The double counting principle states the following obvious fact: if the elements of a set are counted in two different ways, the answers are the same.

==== Handshaking lemma ====
The following lemma is a standard demonstration of double counting.
{{Theorem|Handshaking Lemma|
:At a party, the number of guests who shake hands an odd number of times is even.
}}

We model this scenario as an undirected graph <math>G(V,E)</math> with <math>|V|=n</math> standing for the <math>n</math> guests. There is an edge <math>uv\in E</math> if <math>u</math> and <math>v</math> shake hands. Let <math>d(v)</math> be the degree of vertex <math>v</math>, which represents the number of times that <math>v</math> shakes hands. The handshaking lemma states that in any undirected graph, the number of vertices whose degrees are odd is even. It is sufficient to show that the sum of odd degrees is even.
The handshaking lemma is a direct consequence of the following lemma, which was proved by Euler in his 1736 paper on the [http://en.wikipedia.org/wiki/Seven_Bridges_of_K%C3%B6nigsberg Seven Bridges of Königsberg] that began the study of graph theory.

{{Theorem|Lemma (Euler 1736)|
:<math>\sum_{v\in V}d(v)=2|E|</math>
}}
{{Proof|
We count the number of '''directed''' edges. A directed edge is an ordered pair <math>(u,v)</math> such that <math>\{u,v\}\in E</math>. There are two ways to count the directed edges.

First, we can enumerate by edges. Pick every edge <math>uv\in E</math> and apply two directions <math>(u,v)</math> and <math>(v,u)</math> to the edge. This gives us <math>2|E|</math> directed edges.

On the other hand, we can enumerate by vertices. Pick every vertex <math>v\in V</math> and for each of its <math>d(v)</math> neighbors, say <math>u</math>, generate a directed edge <math>(v,u)</math>. This gives us <math>\sum_{v\in V}d(v)</math> directed edges.

It is obvious that the two terms are equal, since we just count the same thing twice with different methods. The lemma follows.
}}

The handshaking lemma is implied directly by the above lemma: the total sum of degrees <math>2|E|</math> and the sum of even degrees are both even, hence the sum of odd degrees is even, which forces the number of odd-degree vertices to be even.
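Both counts can be verified on a concrete graph. A minimal sketch in Python (the random graph is an arbitrary example):

```python
# Double counting directed edges: sum of degrees equals 2|E| (Euler's lemma),
# which forces the number of odd-degree vertices to be even (handshaking).
import random

n = 20
edges = [(u, v) for u in range(n) for v in range(u + 1, n) if random.random() < 0.3]
deg = [0] * n
for u, v in edges:
    deg[u] += 1
    deg[v] += 1

assert sum(deg) == 2 * len(edges)              # Euler's lemma
assert sum(1 for d in deg if d % 2) % 2 == 0   # handshaking lemma
```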
==== Sperner's lemma ====
A '''triangulation''' of a triangle <math>abc</math> is a decomposition of <math>abc</math> into small triangles (called ''cells''), such that any two different cells are either disjoint, or share an edge, or share a vertex.

A '''proper coloring''' of a triangulation of triangle <math>abc</math> is a coloring of all vertices in the triangulation with three colors: <font color=red>red</font>, <font color=blue>blue</font>, and <font color=green>green</font>, such that the following constraints are satisfied:
* The three vertices <math>a,b</math>, and <math>c</math> of the big triangle receive all three colors.
* The vertices on each of the three sides <math>ab</math>, <math>bc</math>, and <math>ac</math> receive only the two colors of the side's endpoints.

The following figure is an example of a properly colored triangulation.
[[Image:sperner-triangle.png|260px|center]]

In 1928 the young Emanuel Sperner gave a combinatorial proof of the famous Brouwer's fixed point theorem by proving the following lemma (now called Sperner's lemma), with an extremely elegant proof.
{{Theorem|Sperner's Lemma (1928)|
:For any properly colored triangulation, there exists a cell receiving all three colors.
}}
{{Proof|
The proof is done by appropriately constructing a dual graph of the triangulation.

The dual graph is defined as follows:
* Each cell in the triangulation corresponds to a distinct vertex in the dual graph.
* The outer space corresponds to a distinct vertex in the dual graph.
* An edge is added between two vertices in the dual graph if the corresponding cells share a <math>{\color{Red}\mbox{red}}\mbox{--}{\color{Blue}\mbox{blue}}</math> edge.

The following is an example of the dual graph of a properly colored triangulation:
[[Image:sperner-dual.png|260px|center]]

For vertices in the dual graph:
* If a cell receives all three colors, the corresponding vertex in the dual graph has degree 1;
* if a cell receives only <math>{\color{Red}\mbox{red}}</math> and <math>{\color{Blue}\mbox{blue}}</math>, the corresponding vertex has degree 2;
* for all other cases (the cell is monochromatic, or does not have blue or red), the corresponding vertex has degree 0.

Besides, the unique vertex corresponding to the outer space must have odd degree, since along the side of the big triangle colored with only <math>{\color{Red}\mbox{red}}</math> and <math>{\color{Blue}\mbox{blue}}</math>, the number of transitions between a <math>{\color{Red}\mbox{red}}</math> endpoint and a <math>{\color{Blue}\mbox{blue}}</math> endpoint must be odd.

By the handshaking lemma, the number of odd-degree vertices in the dual graph is even, thus the number of cells receiving all three colors must be odd, which cannot be zero.
}}
 
== The Pigeonhole Principle ==
The '''pigeonhole principle''' states the following "obvious" fact:
:''<math>n+1</math> pigeons cannot sit in <math>n</math> holes so that every pigeon is alone in its hole.''
This is one of the oldest '''non-constructive''' principles: it states only the ''existence'' of a pigeonhole with more than one pigeon and says nothing about how to ''find'' such a pigeonhole.
 
The general form of pigeonhole principle, also known as the '''averaging principle''', is stated as follows.
{{Theorem|Generalized pigeonhole principle|
:If a set consisting of more than <math>mn</math> objects is partitioned into <math>n</math> classes, then some class receives more than <math>m</math> objects.
}}
 
=== Inevitable divisors ===
The following is one of Erdős' favorite initiation questions to mathematics. The proof uses the Pigeonhole Principle.
 
{{Theorem|Theorem|
:For any subset <math>S\subseteq\{1,2,\ldots,2n\}</math> of size <math>|S|>n\,</math>, there are two numbers <math>a,b\in S</math> such that <math>a|b\,</math>.
}}
{{Proof|
For every odd number <math>m\in\{1,2,\ldots,2n\}</math>, let
:<math>C_m=\{2^km\mid k\ge 0, 2^km\le 2n\}</math>.
It is easy to see that for any <math>b<a</math> from the same <math>C_m</math>, it holds that <math>b|a</math>.
 
Every number <math>a\in S</math> can be uniquely represented as <math>a=2^km</math> for some odd number <math>m</math>, thus belongs to exactly one <math>C_m</math>, for odd <math>m\in\{1,2,\ldots, 2n\}</math>.  There are <math>n</math> odd numbers in <math>\{1,2,\ldots,2n\}</math>, thus <math>n</math> different <math>C_m</math>, but <math>|S|>n</math>, so there must exist distinct <math>a,b\in S</math>, say with <math>b<a</math>, belonging to the same <math>C_m</math>, which implies that <math>b|a</math>.
}}
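The partition into chains <math>C_m</math> is easy to carry out explicitly. A minimal sketch in Python (the choice <math>n=10</math> and the set <math>S=\{10,\ldots,20\}</math> of size <math>n+1</math> are arbitrary examples):

```python
# Partition {1,...,2n} into chains C_m = {m, 2m, 4m, ...} for odd m;
# any set S with |S| > n has two elements in one chain, the smaller
# dividing the larger.

def odd_part(a):
    # strip factors of 2: a = 2^k * m with m odd, so a belongs to C_m
    while a % 2 == 0:
        a //= 2
    return a

n = 10
S = range(n, 2 * n + 1)  # n + 1 numbers from {1, ..., 2n}
chains = {}
for a in S:
    chains.setdefault(odd_part(a), []).append(a)

g = next(sorted(c) for c in chains.values() if len(c) >= 2)
b, a = g[0], g[1]
assert b < a and a % b == 0   # b divides a
```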
 
=== Monotonic subsequences ===
Let <math>(a_1,a_2,\ldots,a_n)</math> be a sequence of <math>n</math> distinct real numbers. A '''subsequence''' is a sequence of distinct terms of <math>(a_1,a_2,\ldots,a_n)</math> appearing in the same order in which they appear in <math>(a_1,a_2,\ldots,a_n)</math>. Formally, a subsequence of <math>(a_1,a_2,\ldots,a_n)</math> is a sequence <math>(a_{i_1},a_{i_2},\ldots,a_{i_k})</math>, with <math>i_1<i_2<\cdots<i_k</math>.
 
A sequence <math>(a_1,a_2,\ldots,a_n)</math> is '''increasing''' if <math>a_1<a_2<\cdots<a_n</math>, and '''decreasing''' if <math>a_1>a_2>\cdots>a_n</math>.
 
We are interested in the ''longest'' increasing and decreasing subsequences of a sequence <math>(a_1,a_2,\ldots,a_n)</math>. It is intuitive that the length of both the longest increasing subsequence and the longest decreasing subsequence cannot be small simultaneously. A famous result of Erdős and Szekeres formally justifies this intuition. This is one of the first results in extremal combinatorics, published in the influential 1935 paper of Erdős and Szekeres.
 
{{Theorem|Theorem (Erdős-Szekeres 1935)|
:A sequence of more than <math>mn</math> different real numbers must contain either an increasing subsequence of length <math>m+1</math>, or a decreasing subsequence of length <math>n+1</math>.
}}
{{Proof|(due to Seidenberg 1959)
Let <math>(a_1,a_2,\ldots,a_{N})</math> be the original sequence of <math>N>mn</math> distinct real numbers. Associate each <math>a_i</math> a pair <math>(x_i,y_i)</math>, defined as:
*<math>x_i</math>: the length of the longest ''increasing'' subsequence ''ending'' at <math>a_i</math>;
*<math>y_i</math>: the length of the longest ''decreasing'' subsequence ''starting'' at <math>a_i</math>.
A key observation is that <math>(x_i,y_i)\neq (x_j,y_j)</math> whenever <math>i\neq j</math>. This is proved as follows, assuming without loss of generality that <math>i<j</math>:
: '''Case 1:''' If <math>a_i<a_j</math>, then the longest increasing subsequence ending at <math>a_i</math> can be extended by adding on <math>a_j</math>, so <math>x_i<x_j</math>.
: '''Case 2:'''  If <math>a_i>a_j</math>, then the longest decreasing subsequence starting at <math>a_j</math> can be preceded by <math>a_i</math>, so <math>y_i>y_j</math>.
Now we put <math>N</math> "pigeons" <math>a_1,a_2,\ldots,a_N</math> into "pigeonholes" <math>\{1,2,\ldots,N\}\times\{1,2,\ldots,N\}</math>, such that <math>a_i</math> is put into hole <math>(x_i,y_i)</math>, with at most one pigeon per hole (since different <math>a_i</math> have different <math>(x_i,y_i)</math>).
 
The number of pigeons is <math>N>mn</math>. Due to pigeonhole principle, there must be a pigeon which is outside the region <math>\{1,2,\ldots,m\}\times\{1,2,\ldots,n\}</math>, which implies that there exists an <math>a_i</math> with either <math>x_i>m</math> or <math>y_i>n</math>. Due to our definition of <math>(x_i,y_i)</math>, there must be either an increasing subsequence of length <math>m+1</math>, or a decreasing subsequence of length <math>n+1</math>.
}}
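Seidenberg's pairs can be computed directly. A minimal sketch in Python (the sample sequence is an arbitrary choice of <math>7>2\cdot 3</math> distinct reals):

```python
# For each a_i compute x_i (longest increasing subsequence ending at a_i)
# and y_i (longest decreasing subsequence starting at a_i); the pairs are
# pairwise distinct, so N > m*n pigeons escape the m-by-n grid of holes.

def seidenberg_pairs(seq):
    N = len(seq)
    x = [1] * N
    for i in range(N):
        for j in range(i):
            if seq[j] < seq[i]:
                x[i] = max(x[i], x[j] + 1)
    y = [1] * N
    for i in reversed(range(N)):
        for j in range(i + 1, N):
            if seq[j] < seq[i]:
                y[i] = max(y[i], y[j] + 1)
    return list(zip(x, y))

pairs = seidenberg_pairs([3, 1, 4, 1.5, 9, 2.6, 5])  # 7 > 2*3 distinct reals
assert len(set(pairs)) == len(pairs)                 # the key observation
assert any(xi > 2 or yi > 3 for xi, yi in pairs)     # escapes the 2-by-3 grid
```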
 
=== Dirichlet's approximation ===
Let <math>x</math> be an irrational number. We now want to approximate <math>x</math> by a rational number (a fraction).

Since every real interval <math>[a,b]</math> with <math>a<b</math> contains infinitely many rational numbers, there must exist rational numbers arbitrarily close to <math>x</math>. The trick is to let the denominator of the fraction be sufficiently large.

Suppose however that we restrict the rationals we may select to have denominators bounded by <math>n</math>. How closely can we approximate <math>x</math> now?

The following important theorem is due to Dirichlet and his ''Schubfachprinzip'' ("drawer principle"). The theorem is fundamental in number theory and real analysis, but the proof is combinatorial.
 
{{Theorem|Theorem (Dirichlet 1842)|
:Let <math>x</math> be an irrational number. For any natural number <math>n</math>, there is a rational number <math>\frac{p}{q}</math> such that <math>1\le q\le n</math> and
::<math>\left|x-\frac{p}{q}\right|<\frac{1}{nq}</math>.
}}
{{Proof|
Let <math>\{x\}=x-\lfloor x\rfloor</math> denote the '''fractional part''' of the real number <math>x</math>. It is obvious that <math>\{x\}\in[0,1)</math> for any real number <math>x</math>.

Consider the <math>n+1</math> numbers <math>\{kx\}</math>, <math>k=1,2,\ldots,n+1</math>. These <math>n+1</math> numbers (pigeons) belong to the following <math>n</math> intervals (pigeonholes):
:<math>\left(0,\frac{1}{n}\right),\left(\frac{1}{n},\frac{2}{n}\right),\ldots,\left(\frac{n-1}{n},1\right)</math>.
Since <math>x</math> is irrational, <math>\{kx\}</math> cannot coincide with any endpoint of the above intervals.

By the pigeonhole principle, there exist <math>1\le a<b\le n+1</math> such that <math>\{ax\},\{bx\}</math> are in the same interval, thus
:<math>|\{bx\}-\{ax\}|<\frac{1}{n}</math>.
Therefore,
:<math>|(b-a)x-\left(\lfloor bx\rfloor-\lfloor ax\rfloor\right)|<\frac{1}{n}</math>.
Let <math>q=b-a</math> and <math>p=\lfloor bx\rfloor-\lfloor ax\rfloor</math>. We have <math>|qx-p|<\frac{1}{n}</math> and <math>1\le q\le n</math>. Dividing both sides by <math>q</math>, the theorem is proved.
}}
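The proof is effective: scanning <math>\{kx\}</math> for <math>k=1,\ldots,n+1</math> and watching for a repeated interval yields the promised fraction. A minimal sketch in Python (<math>x=\sqrt 2</math> and <math>n=100</math> are arbitrary choices):

```python
# Dirichlet approximation by pigeonhole: among {kx}, k = 1..n+1, two values
# fall into the same interval of width 1/n; their difference yields p/q.
import math

def dirichlet(x, n):
    seen = {}
    for k in range(1, n + 2):
        h = int((k * x - math.floor(k * x)) * n)  # interval index in {0,...,n-1}
        if h in seen:                             # pigeonhole collision
            a, b = seen[h], k
            q = b - a
            p = math.floor(b * x) - math.floor(a * x)
            return p, q
        seen[h] = k

p, q = dirichlet(math.sqrt(2), 100)
assert 1 <= q <= 100
assert abs(math.sqrt(2) - p / q) < 1 / (100 * q)
```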

Latest revision as of 05:44, 13 November 2015

= MAX-SAT =
Suppose that we have a number of boolean variables <math>x_1,x_2,\ldots\in\{\mathrm{true},\mathrm{false}\}</math>. A '''literal''' is either a variable <math>x_i</math> itself or its negation <math>\neg x_i</math>. A logic expression is a '''conjunctive normal form (CNF)''' if it is written as the conjunction (AND) of a set of '''clauses''', where each clause is a disjunction (OR) of literals. For example:
:<math>(x_1\vee \neg x_2 \vee \neg x_3)\wedge (\neg x_1\vee \neg x_3)\wedge (x_1\vee x_2\vee x_4)\wedge (x_4\vee \neg x_3)\wedge (x_4\vee \neg x_1).</math>

The '''satisfiability (SAT)''' problem asks, given a CNF formula as input, to decide whether the formula is satisfiable, i.e. whether there exists an assignment of the variables to the values true and false so that all clauses are true. SAT is the first problem known to be '''NP-complete''' (the Cook-Levin theorem).

We consider the optimization version of SAT, which asks for an assignment maximizing the number of satisfied clauses.

{{Theorem
|Problem (MAX-SAT)|
:Given a conjunctive normal form (CNF) formula of <math>m</math> clauses defined on <math>n</math> boolean variables <math>x_1,x_2,\ldots,x_n</math>, find a truth assignment to the boolean variables that maximizes the number of satisfied clauses.
}}

== The Probabilistic Method ==
A straightforward way to approximate MAX-SAT is to uniformly and independently assign each variable a random truth value. The following theorem is proved by the probabilistic method.

{{Theorem
|Theorem|
:For any set of <math>m</math> clauses, there is a truth assignment that satisfies at least <math>\frac{m}{2}</math> clauses.
}}
{{Proof|
For each variable, independently assign a random value in <math>\{\mathrm{true},\mathrm{false}\}</math> with equal probability. For the <math>i</math>th clause, let <math>X_i</math> be the random variable which indicates whether the <math>i</math>th clause is satisfied. Suppose that there are <math>k</math> literals in the clause. The clause is unsatisfied only if every one of its literals evaluates to false, which happens with probability at most <math>2^{-k}</math>, so the probability that the clause is satisfied is
:<math>\Pr[X_i=1]\ge 1-2^{-k}\ge\frac{1}{2}</math>.

Let <math>X=\sum_{i=1}^m X_i</math> be the number of satisfied clauses. By the linearity of expectation,
:<math>\mathbf{E}[X]=\sum_{i=1}^{m}\mathbf{E}[X_i]\ge \frac{m}{2}.</math>
Therefore, there exists an assignment such that at least <math>\frac{m}{2}</math> clauses are satisfied.
}}

Note that this gives a randomized algorithm which returns a truth assignment satisfying at least <math>\frac{m}{2}</math> clauses in expectation. There are <math>m</math> clauses in total, so the optimal solution is at most <math>m</math>, which means that this simple randomized algorithm is a <math>\frac{1}{2}</math>-approximation algorithm for the MAX-SAT problem.
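This randomized algorithm is a few lines of code. A minimal sketch in Python, using the CNF example from the beginning of these notes; the encoding convention (integer <math>i</math> stands for <math>x_i</math>, <math>-i</math> for <math>\neg x_i</math>) is a common one assumed here, not something from the notes:

```python
# Uniform random assignment for MAX-SAT: each clause with k literals is
# satisfied with probability at least 1 - 2^(-k) >= 1/2.
import random

def random_assignment(n):
    return {i: random.random() < 0.5 for i in range(1, n + 1)}

def num_satisfied(clauses, assign):
    # a literal l is true iff the sign of l matches the assigned value
    return sum(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

# (x1 v ~x2 v ~x3) ^ (~x1 v ~x3) ^ (x1 v x2 v x4) ^ (x4 v ~x3) ^ (x4 v ~x1)
clauses = [[1, -2, -3], [-1, -3], [1, 2, 4], [4, -3], [4, -1]]
best = max(num_satisfied(clauses, random_assignment(4)) for _ in range(100))
assert best >= len(clauses) / 2   # some run satisfies at least m/2 clauses
```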

== LP Relaxation + Randomized Rounding ==
For a clause <math>C_j</math>, let <math>C_j^+</math> be the set of indices of the variables that appear in uncomplemented form in clause <math>C_j</math>, and let <math>C_j^-</math> be the set of indices of the variables that appear in complemented form in clause <math>C_j</math>. The MAX-SAT problem can be formulated as the following integer linear program:
:<math>
\begin{align}
\mbox{maximize} &\quad \sum_{j=1}^m z_j\\
\mbox{subject to} &\quad \sum_{i\in C_j^+}y_i+\sum_{i\in C_j^-}(1-y_i) \ge z_j, &&\forall 1\le j\le m\\
&\qquad\qquad y_i \in\{0,1\}, &&\forall 1\le i\le n \\
&\qquad\qquad z_j \in\{0,1\}, &&\forall 1\le j\le m
\end{align}
</math>
Each <math>y_i</math> in the program indicates the truth assignment to the variable <math>x_i</math>, and each <math>z_j</math> indicates whether the clause <math>C_j</math> is satisfied. The inequalities ensure that a clause is deemed to be true only if at least one of the literals in the clause is assigned the value 1.

The integer linear program is relaxed to the following linear program:
:<math>
\begin{align}
\mbox{maximize} &\quad \sum_{j=1}^m z_j\\
\mbox{subject to} &\quad \sum_{i\in C_j^+}y_i+\sum_{i\in C_j^-}(1-y_i) \ge z_j, &&\forall 1\le j\le m\\
&\qquad\qquad 0\le y_i\le 1, &&\forall 1\le i\le n \\
&\qquad\qquad 0\le z_j\le 1, &&\forall 1\le j\le m
\end{align}
</math>

Let <math>y_i^*</math> and <math>z_j^*</math> be the optimal fractional solutions to the above linear program. Clearly, <math>\sum_{j=1}^mz_j^*</math> is an upper bound on the optimal number of satisfied clauses, i.e. we have
:<math>\mathrm{OPT}\le\sum_{j=1}^mz_j^*</math>.

We apply a very natural randomized rounding scheme. For each <math>1\le i\le n</math>, independently set
:<math>y_i
=\begin{cases}
1 & \mbox{with probability }y_i^*,\\
0 & \mbox{with probability }1-y_i^*.
\end{cases}
</math>
Correspondingly, each <math>x_i</math> is assigned <tt>TRUE</tt> independently with probability <math>y_i^*</math>.
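The rounding step itself, given a fractional solution, is immediate. A minimal sketch in Python (the fractional values <math>y_i^*</math> below are hypothetical stand-ins; in practice they would come from an LP solver):

```python
# Randomized rounding: set x_i to TRUE independently with probability y_i^*.
import random

def round_lp(y_star):
    # y_star maps variable index -> fractional value in [0, 1]
    return {i: random.random() < y for i, y in y_star.items()}

y_star = {1: 0.5, 2: 0.5, 3: 0.0, 4: 1.0}  # hypothetical LP values
assign = round_lp(y_star)
assert assign[4] is True    # probability-1 variables always round to TRUE
assert assign[3] is False   # probability-0 variables always round to FALSE
```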

{{Theorem
|Lemma|
:Let <math>C_j</math> be a clause with <math>k</math> literals. The probability that it is satisfied by randomized rounding is at least
::<math>(1-(1-1/k)^k)z_j^*</math>.
}}
{{Proof|
Without loss of generality, we assume that all <math>k</math> variables appear in <math>C_j</math> in the uncomplemented form, and we assume that
:<math>C_j=x_1\vee x_2\vee\cdots\vee x_k</math>.
The complemented cases are symmetric.

Clause <math>C_j</math> remains unsatisfied by randomized rounding only if every one of <math>x_i</math>, <math>1\le i\le k</math>, is assigned <tt>FALSE</tt>, which corresponds to every one of <math>y_i</math>, <math>1\le i\le k</math>, being rounded to 0. This event occurs with probability <math>\prod_{i=1}^k(1-y_i^*)</math>. Therefore, the clause <math>C_j</math> is satisfied by the randomized rounding with probability
:<math>1-\prod_{i=1}^k(1-y_i^*)</math>.

By the linear programming constraints,
:<math>y_1^*+y_2^*+\cdots+y_k^*\ge z_j^*</math>.
Subject to this constraint, the value of <math>1-\prod_{i=1}^k(1-y_i^*)</math> is minimized when all <math>y_i^*</math> are equal, i.e. <math>y_i^*=\frac{z_j^*}{k}</math> (by the AM-GM inequality). Thus, the probability that <math>C_j</math> is satisfied is
:<math>1-\prod_{i=1}^k(1-y_i^*)\ge 1-(1-z_j^*/k)^k\ge (1-(1-1/k)^k)z_j^*</math>,
where the last inequality is due to the concavity of the function <math>1-(1-z_j^*/k)^k</math> of the variable <math>z_j^*</math>: on <math>[0,1]</math> it lies above the chord through its values at <math>z_j^*=0</math> and <math>z_j^*=1</math>.
}}

For any <math>k\ge 1</math>, it holds that <math>1-(1-1/k)^k>1-1/e</math>. Therefore, by the linearity of expectation, the expected number of clauses satisfied by the randomized rounding is at least
:<math>(1-1/e)\sum_{j=1}^m z_j^*\ge (1-1/e)\cdot\mathrm{OPT}</math>.
The inequality is due to the fact that the <math>z_j^*</math> are the optimal fractional solutions to the relaxed LP, so <math>\sum_{j=1}^m z_j^*</math> is no less than the optimal value <math>\mathrm{OPT}</math> of the integer program.

== Choose a better solution ==
For any instance of MAX-SAT, let <math>m_1</math> be the expected number of satisfied clauses when each variable is independently set to <tt>TRUE</tt> with probability <math>\frac{1}{2}</math>; and let <math>m_2</math> be the expected number of satisfied clauses when we use the linear programming relaxation followed by randomized rounding.

We will show that on any instance of MAX-SAT, one of the two algorithms is a <math>\frac{3}{4}</math>-approximation algorithm.

{{Theorem
|Theorem|
:<math>\max\{m_1,m_2\}\ge\frac{3}{4}\cdot\mathrm{OPT}.</math>
}}
{{Proof|
It suffices to show that <math>\frac{m_1+m_2}{2}\ge\frac{3}{4}\sum_{j=1}^m z_j^*</math>. Letting <math>S_k</math> denote the set of clauses that contain <math>k</math> literals, we know that
:<math>m_1=\sum_{k=1}^n\sum_{C_j\in S_k}(1-2^{-k})\ge\sum_{k=1}^n\sum_{C_j\in S_k}(1-2^{-k}) z_j^*,</math>
since <math>z_j^*\le 1</math>. By the analysis of randomized rounding,
:<math>m_2\ge\sum_{k=1}^n\sum_{C_j\in S_k}(1-(1-1/k)^k) z_j^*.</math>
Thus
:<math>\frac{m_1+m_2}{2}\ge \sum_{k=1}^n\sum_{C_j\in S_k}\frac{1-2^{-k}+1-(1-1/k)^k}{2} z_j^*.</math>
An easy calculation shows that <math>\frac{1-2^{-k}+1-(1-1/k)^k}{2}\ge\frac{3}{4}</math> for any <math>k</math>, so that we have
:<math>\frac{m_1+m_2}{2}\ge \frac{3}{4}\sum_{k=1}^n\sum_{C_j\in S_k}z_j^*=\frac{3}{4}\sum_{j=1}^m z_j^*\ge \frac{3}{4}\cdot\mathrm{OPT}.</math>
}}
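The combined algorithm simply runs both procedures and keeps the better assignment. A minimal sketch in Python, using the common encoding where integer <math>i</math> stands for <math>x_i</math> and <math>-i</math> for <math>\neg x_i</math>; the fractional values <math>y_i^*</math> are hypothetical stand-ins for an LP solver's output:

```python
# Choose the better of the two solutions: uniform random assignment vs.
# LP-based randomized rounding; in expectation the better of the two is a
# 3/4-approximation.
import random

def num_satisfied(clauses, assign):
    return sum(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

def best_of_two(clauses, n, y_star):
    a1 = {i: random.random() < 0.5 for i in range(1, n + 1)}        # m_1 side
    a2 = {i: random.random() < y_star[i] for i in range(1, n + 1)}  # m_2 side
    return max(a1, a2, key=lambda a: num_satisfied(clauses, a))

# (x1 v ~x2 v ~x3) ^ (~x1 v ~x3) ^ (x1 v x2 v x4) ^ (x4 v ~x3) ^ (x4 v ~x1)
clauses = [[1, -2, -3], [-1, -3], [1, 2, 4], [4, -3], [4, -1]]
y_star = {1: 0.5, 2: 0.5, 3: 0.0, 4: 1.0}  # hypothetical fractional optimum
a = best_of_two(clauses, 4, y_star)
assert num_satisfied(clauses, a) == 5   # this instance is fully satisfiable
```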