Combinatorics (Fall 2010)/Duality, Matroid and Randomized Algorithms (Fall 2011)/The Probabilistic Method: Difference between pages

== Duality ==

Consider the following LP:
:<math>
\begin{align}
\text{minimize} && 7x_1+x_2+5x_3\\
\text{subject to}  &&
x_1-x_2+3x_3 &\ge 10\\
&&
5x_1-2x_2-x_3 &\ge 6\\
&& x_1,x_2,x_3 &\ge 0
\end{align}
</math>

Let <math>OPT</math> be the value of the optimal solution. We want to estimate upper and lower bounds for <math>OPT</math>.

Since <math>OPT</math> is the minimum over the feasible set, the objective value of every feasible solution is an upper bound for <math>OPT</math>. For example, <math>\boldsymbol{x}=(2,1,3)</math> is a feasible solution, thus <math>OPT\le 7\cdot 2+1+5\cdot 3=30</math>.

For a lower bound, note that the optimal solution must satisfy the two constraints:
:<math>
\begin{align}
x_1-x_2+3x_3 &\ge 10,\\
5x_1-2x_2-x_3 &\ge 6.
\end{align}
</math>
Since the <math>x_i</math>'s are restricted to be nonnegative, term-by-term comparison of coefficients shows that
:<math>7x_1+x_2+5x_3\ge(x_1-x_2+3x_3)+(5x_1-2x_2-x_3)\ge 16.</math>
The idea behind this lower bound is to find suitable nonnegative multipliers (in the above case the multipliers are all 1s) for the constraints so that, when we take their weighted sum, the coefficient of each <math>x_i</math> in the sum is dominated by the corresponding coefficient in the objective function. It is important that the multipliers are nonnegative, so they do not reverse the direction of the constraint inequalities.

To find the best lower bound, we need to choose the multipliers so that the resulting bound is as large as possible. Interestingly, the problem of finding the best lower bound can be formulated as another LP:
::<math>
\begin{align}
\text{maximize} && 10y_1+6y_2\\
\text{subject to} &&
y_1+5y_2 &\le 7\\
&&
-y_1+2y_2 &\le 1\\
&&3y_1-y_2 &\le 5\\
&& y_1,y_2&\ge 0
\end{align}
</math>

Here <math>y_1</math> and <math>y_2</math> are the nonnegative multipliers for the first and the second constraint, respectively. We call the first LP the '''primal program''' and the second LP the '''dual program'''. By definition, every feasible solution to the dual program gives a lower bound for the optimum of the primal program.
 
=== LP duality ===
Given an LP in canonical form, called the '''primal''' LP:
:<math>
\begin{align}
\text{minimize} && \boldsymbol{c}^T\boldsymbol{x}\\
\text{subject to} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>

the '''dual''' LP is defined as follows:
:<math>
\begin{align}
\text{maximize} && \boldsymbol{b}^T\boldsymbol{y}\\
\text{subject to} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
</math>

We then give some examples.
 
;Surviving problem (diet problem)
Let us consider the surviving problem. Suppose we have <math>n</math> types of natural food, each containing up to <math>m</math> types of vitamins. The <math>j</math>th food contains <math>a_{ij}</math> amount of vitamin <math>i</math>, and the price of the <math>j</math>th food is <math>c_j</math>. We need to consume at least <math>b_i</math> amount of vitamin <math>i</math> for each <math>1\le i\le m</math> to stay healthy. We want to minimize the total cost of food while staying healthy. The problem can be formalized as the following LP:
:<math>
\begin{align}
\text{minimize} \quad& c_1x_1+c_2x_2+\cdots+c_nx_n\\
\begin{align}
\text{subject to} \\
\\
\end{align}
\quad &
\begin{align} a_{i1}x_{1}+a_{i2}x_{2}+\cdots+a_{in}x_{n} &\ge b_{i} &\quad& \forall 1\le i\le m\\
x_{j}&\ge 0 &\quad& \forall 1\le j\le n
\end{align}
\end{align}
</math>
The dual LP is
:<math>
\begin{align}
\text{maximize} \quad& b_1y_1+b_2y_2+\cdots+b_my_m\\
\begin{align}
\text{subject to} \\
\\
\end{align}
\quad &
\begin{align} a_{1j}y_{1}+a_{2j}y_{2}+\cdots+a_{mj}y_{m} &\le c_{j} &\quad& \forall 1\le j\le n\\
y_{i}&\ge 0 &\quad& \forall 1\le i\le m
\end{align}
\end{align}
</math>
The dual can be interpreted as follows: A food company produces <math>m</math> types of vitamin pills. The company wants to design a pricing system such that
* The vitamin <math>i</math> has a nonnegative price <math>y_i</math>.
* The price system should be competitive with any natural food: a customer cannot replace the vitamins in any natural food by pills at a cheaper price, that is, <math>\sum_{i=1}^m y_ia_{ij}\le c_j</math> for any <math>1\le j\le n</math>.
* The company wants to maximize its profit, assuming that the customer buys exactly the necessary amount of vitamins (<math>b_i</math> of vitamin <math>i</math>).


;Maximum flow problem
In the last lecture, we defined the maximum flow problem, whose LP is
:<math>
\begin{align}
\text{maximize} \quad& \sum_{v:(s,v)\in E}f_{sv}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\
\sum_{u:(u,v)\in E}f_{uv}-\sum_{w:(v,w)\in E}f_{vw} &=0 &\quad& \forall v\in V\setminus\{s,t\}\\
f_{uv}&\ge 0 &\quad& \forall (u,v)\in E
\end{align}
\end{align}
</math>
where directed graph <math>G(V,E)</math> is the flow network, <math>s\in V</math> is the source, <math>t\in V</math> is the sink, and <math>c_{uv}</math> is the capacity of directed edge <math>(u,v)\in E</math>.


We add a new edge from <math>t</math> to <math>s</math> to <math>E</math>, and let the capacity be <math>c_{ts}=\infty</math>. Let <math>E'</math> be the new edge set. The LP for the max-flow problem can be rewritten as:
:<math>
\begin{align}
\text{maximize} \quad& f_{ts}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\
\sum_{u:(u,v)\in E'}f_{uv}-\sum_{w:(v,w)\in E'}f_{vw} &\le0 &\quad& \forall v\in V\\
f_{uv}&\ge 0 &\quad& \forall (u,v)\in E'
\end{align}
\end{align}
</math>
The second set of inequalities seems weaker than the original flow-conservation constraints. However, if these inequalities hold at every node, then they must in fact hold with equality at every node, since in <math>E'</math> every edge contributes once as inflow and once as outflow, so the node excesses sum to zero; this implies flow conservation.


To obtain the dual program we introduce variables <math>d_{uv}</math> and <math>p_v</math> corresponding to the two types of inequalities in the primal. The dual LP is:
:<math>
\begin{align}
\text{minimize} \quad& \sum_{(u,v)\in E}c_{uv}d_{uv}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} d_{uv}-p_u+p_v &\ge 0 &\quad& \forall (u,v)\in E\\
p_s-p_t &\ge1 \\
d_{uv} &\ge 0 &\quad& \forall (u,v)\in E\\
p_v&\ge 0 &\quad& \forall v\in V
\end{align}
\end{align}
</math>
It is more helpful to consider its integer version:
:<math>
\begin{align}
\text{minimize} \quad& \sum_{(u,v)\in E}c_{uv}d_{uv}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} d_{uv}-p_u+p_v &\ge 0 &\quad& \forall (u,v)\in E\\
p_s-p_t &\ge1 \\
d_{uv} &\in\{0,1\} &\quad& \forall (u,v)\in E\\
p_v&\in\{0,1\} &\quad& \forall v\in V
\end{align}
\end{align}
</math>
From the last lecture, we know that the LP for max-flow is totally unimodular, and so is this dual LP; therefore, the optimal solutions to the integer program are also optimal solutions to the LP.


The variables <math>p_v</math> define a bipartition of the vertex set <math>V</math>: let <math>S=\{v\in V\mid p_v=1\}</math> and let the complement be <math>\bar{S}=\{v\in V\mid p_v=0\}</math>.


For 0/1-valued variables, the only way to satisfy <math>p_s-p_t\ge1</math> is to have <math>p_s=1</math> and <math>p_t=0</math>. Therefore, <math>(S,\bar{S})</math> is an <math>s</math>-<math>t</math> cut.


In an optimal solution, <math>d_{uv}=1</math> if and only if <math>u\in S,v\in\bar{S}</math> and <math>(u,v)\in E</math>. Therefore, the objective function of an optimal solution <math>\sum_{u\in S,v\not\in S\atop (u,v)\in E}c_{uv}</math> is the capacity of the minimum <math>s</math>-<math>t</math> cut <math>(S,\bar{S})</math>.
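This duality between max-flow and min-cut can be observed on a toy instance. The following is a sketch (an Edmonds–Karp style augmenting-path routine of our own; the example network and its capacities are made up for illustration) that computes a maximum flow and the capacity of the cut given by the final residual reachability, and checks that the two coincide:

```python
from collections import deque

def max_flow_min_cut(cap, s, t):
    """Augment along BFS paths; when none remains, the vertices reachable
    from s in the residual graph form a minimum s-t cut."""
    nodes = set(cap) | {v for vs in cap.values() for v in vs}
    res = {u: dict(cap.get(u, {})) for u in nodes}
    for u in nodes:
        for v in list(res[u]):
            res[v].setdefault(u, 0)      # reverse residual edges
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue:                     # BFS in the residual graph
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            S = set(parent)              # side of the cut containing s
            cut = sum(c for u in S for v, c in cap.get(u, {}).items() if v not in S)
            return flow, cut
        path = []                        # recover the augmenting path
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[u][v] for u, v in path)
        for u, v in path:                # push the bottleneck amount
            res[u][v] -= aug
            res[v][u] += aug
        flow += aug

graph = {'s': {'a': 3, 'b': 2}, 'a': {'b': 1, 't': 2}, 'b': {'t': 3}}
f, cut = max_flow_min_cut(graph, 's', 't')
print(f, cut)   # 5 5
```

Here the max flow (5) equals the min-cut capacity (5), as LP duality predicts.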

=== Duality theorems ===
Let the primal LP be:
:<math>
\begin{align}
\text{minimize} && \boldsymbol{c}^T\boldsymbol{x}\\
\text{subject to} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>

Its dual LP is:
:<math>
\begin{align}
\text{maximize} && \boldsymbol{b}^T\boldsymbol{y}\\
\text{subject to} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
</math>

{{Theorem|Theorem|
: The dual of a dual is the primal.
}}
{{Proof|
The dual program can be written as the following minimization in canonical form:
:<math>
\begin{align}
\min && -\boldsymbol{b}^T\boldsymbol{y}\\
\text{s.t.} &&
-A^T\boldsymbol{y} &\ge-\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
</math>
Its dual is:
:<math>
\begin{align}
\max && -\boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
-A\boldsymbol{x} &\le-\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
which is equivalent to the primal:
:<math>
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
}}


We have shown that feasible solutions of a dual program can be used to lower bound the optimum of the primal program. This is formalized by the following important theorem.


{{Theorem|Theorem (Weak duality theorem)|
:If there exists an optimal solution to the primal LP:
::<math>
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
:then,
::<math>
\begin{align}
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
&\begin{align}
\ge\\
\\
\\
\end{align}&\quad
\begin{align}
\max && \boldsymbol{b}^T\boldsymbol{y}\\
\text{s.t.} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
\end{align}
</math>
}}
{{Proof|
Let <math>\boldsymbol{x}</math> be an arbitrary feasible solution to the primal LP, and <math>\boldsymbol{y}</math> be an arbitrary feasible solution to the dual LP.

We estimate <math>\boldsymbol{y}^TA\boldsymbol{x}</math> in two ways. Recall that <math>A\boldsymbol{x} \ge\boldsymbol{b}</math> and <math>A^T\boldsymbol{y} \le\boldsymbol{c}</math>, and that <math>\boldsymbol{x},\boldsymbol{y}\ge\boldsymbol{0}</math>, thus
:<math>\boldsymbol{y}^T\boldsymbol{b}\le\boldsymbol{y}^TA\boldsymbol{x}\le\boldsymbol{c}^T\boldsymbol{x}</math>.

Since this holds for every pair of feasible solutions, it holds in particular for the optimal solutions.
}}

A harmonically beautiful result is that the optima of the primal LP and its dual are always equal. This is called the strong duality theorem of linear programming.

{{Theorem|Theorem (Strong duality theorem)|
:If there exists an optimal solution to the primal LP:
::<math>
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
:then,
::<math>
\begin{align}
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
&\begin{align}
=\\
\\
\\
\end{align}&\quad
\begin{align}
\max && \boldsymbol{b}^T\boldsymbol{y}\\
\text{s.t.} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
\end{align}
</math>
}}
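Strong duality can be seen concretely on the example LP from the beginning of the lecture. The specific points <math>\boldsymbol{x}=(7/4,0,11/4)</math> and <math>\boldsymbol{y}=(2,1)</math> below are our own computation (not stated in the text); the sketch verifies that both are feasible and that their objective values coincide, so by weak duality both must be optimal with <math>OPT=26</math>:

```python
from fractions import Fraction as Fr

# the example LP: minimize 7x1 + x2 + 5x3 s.t. Ax >= b, x >= 0
A = [[1, -1, 3], [5, -2, -1]]
b = [10, 6]
c = [7, 1, 5]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

x = (Fr(7, 4), Fr(0), Fr(11, 4))   # candidate primal optimum (our computation)
y = (Fr(2), Fr(1))                 # candidate dual optimum (our computation)

# primal feasibility: x >= 0 and Ax >= b
assert all(xi >= 0 for xi in x)
assert all(dot(row, x) >= bi for row, bi in zip(A, b))

# dual feasibility: y >= 0 and A^T y <= c
At = list(zip(*A))
assert all(yi >= 0 for yi in y)
assert all(dot(col, y) <= cj for col, cj in zip(At, c))

# equal objectives certify optimality of both points
assert dot(c, x) == dot(b, y) == 26
print(dot(c, x))   # 26
```

Exact rational arithmetic (`fractions`) avoids any floating-point doubt in the certificate.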


== Matroid ==
 
=== Kruskal's greedy algorithm for MST ===
 
=== Matroids ===
Let <math>X</math> be a finite set and <math>\mathcal{F}\subseteq 2^X</math> be a family of subsets of <math>X</math>.  A member set <math>S\in\mathcal{F}</math> is called '''maximal''' if <math>S\cup\{x\}\not\in\mathcal{F}</math> for any <math>x\in X\setminus S</math>.
 
For <math>Y\subseteq X</math>, denote <math>\mathcal{F}_Y=\{S\in\mathcal{F}\mid S\subseteq Y\}</math>. Obviously, <math>\mathcal{F}_Y=\mathcal{F}\cap 2^Y\,</math>.


{{Theorem|Definition|
:A set system <math>\mathcal{F}\subseteq 2^X</math> is a '''matroid''' if it satisfies:
:*(hereditary) if <math>T\subseteq S\in\mathcal{F}</math> then <math>T\in\mathcal{F}</math>;
:*(matroid property) for every <math>Y\subseteq X</math>, all maximal <math>S\in\mathcal{F}_Y</math> have the same <math>|S|</math>.
}}

Suppose <math>\mathcal{F}</math> is a matroid. Some matroid terminology:
* Each member set <math>S\in\mathcal{F}</math> is called an '''independent set'''.
* A maximal independent subset of a set <math>Y\subset X</math>, i.e., a maximal <math>S\in\mathcal{F}_Y</math>, is called a '''basis''' of <math>Y</math>.
* The size of the maximal <math>S\in\mathcal{F}_Y</math> is called the '''rank''' of <math>Y</math>, denoted <math>r(Y)</math>.
 
==== Graph matroids ====
Let <math>G(V,E)</math> be a graph. Define a set system with ground set <math>E</math> as
:<math>\mathcal{F}=\{S\subseteq E\mid \text{there is no cycle in }S\}.</math>
That is, <math>\mathcal{F}</math> is the set of all forests in <math>G</math>.
 
We claim that <math>\mathcal{F}</math> is a matroid.
 
First, <math>\mathcal{F}</math> is hereditary since any subgraph of a forest must also be a forest.
 
We then verify the matroid property of <math>\mathcal{F}</math>. Let <math>Y\subseteq E</math> be an arbitrary subgraph of <math>G</math>, and suppose the subgraph <math>(V,Y)</math> has <math>k</math> connected components. For any maximal forest <math>S</math> in <math>Y</math> (i.e., <math>S</math> is a spanning forest of <math>(V,Y)</math>), it holds that <math>|S|=n-k</math>, where <math>n=|V|</math>. In other words, for any <math>Y\subseteq E</math>, all maximal members of <math>\mathcal{F}_Y</math> have the same cardinality.
 
Therefore, <math>\mathcal{F}</math> is a matroid. Each independent set (of matroid) is a forest in <math>G</math>. For any subgraph <math>Y\subseteq G</math>, the rank of <math>Y</math> is the size of a spanning forest of <math>Y</math>.
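For a small graph, both matroid axioms can be verified exhaustively. The sketch below (our own brute-force check, using a union-find cycle test) enumerates all forests of a four-vertex graph and confirms the hereditary and matroid properties:

```python
from itertools import combinations

def is_forest(edges):
    """Acyclicity test via union-find."""
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False                    # edge closes a cycle
        parent[ru] = rv
    return True

E = [(1, 2), (2, 3), (1, 3), (3, 4)]        # a triangle plus a pendant edge
F = [S for r in range(len(E) + 1) for S in combinations(E, r) if is_forest(S)]

# hereditary: every subset of a forest is a forest
assert all(is_forest(T) for S in F for r in range(len(S)) for T in combinations(S, r))

# matroid property: for every Y, all maximal forests inside Y have the same size
for r in range(len(E) + 1):
    for Y in combinations(E, r):
        maximal = [S for S in F
                   if set(S) <= set(Y)
                   and all(not is_forest(S + (e,)) for e in Y if e not in S)]
        assert len({len(S) for S in maximal}) <= 1
print(len(F))   # 14 forests
```

The same brute-force pattern works for linear matroids, with linear independence of columns replacing acyclicity.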
 
==== Linear matroids ====
Let <math>A</math> be an <math>m\times n</math> matrix. Define a set system <math>\mathcal{F}\subseteq 2^{[n]}</math> as
:<math>\mathcal{F}=\{S\subseteq [n]\mid S\text{ is a set of linearly independent columns in }A\}.</math>
 
<math>\mathcal{F}</math> is hereditary since any subset of a set of linearly independent vectors is still linearly independent.
 
For any subset <math>Y\subseteq [n]</math> of columns of <math>A</math>, let <math>B</math> be the submatrix composed of these columns. Then <math>\mathcal{F}_Y</math> contains all sets of linearly independent columns of <math>B</math>. Clearly, all maximal such sets have the same size, namely the column-rank of <math>B</math>.
 
Therefore, <math>\mathcal{F}</math> is a matroid. Each independent set (of matroid) is a linearly independent set of columns of matrix <math>A</math>. For any set <math>Y\subseteq[n]</math> of columns of matrix <math>A</math>, the rank of <math>Y</math> is the column-rank of the submatrix defined by the columns in <math>Y</math>.
 
=== Weighted matroid maximization ===
 
=== Matroid intersections ===

Revision as of 03:57, 24 July 2011

= Probabilistic Method =

=== Ramsey number ===

Recall the Ramsey theorem which states that in a meeting of at least six people, there are either three people knowing each other or three people not knowing each other. In graph theoretical terms, this means that no matter how we color the edges of <math>K_6</math> (the complete graph on six vertices), there must be a '''monochromatic''' <math>K_3</math> (a triangle whose edges have the same color).

Generally, the '''Ramsey number''' <math>R(k,\ell)</math> is the smallest integer <math>n</math> such that in any two-coloring of the edges of a complete graph on <math>n</math> vertices <math>K_n</math> by red and blue, either there is a red <math>K_k</math> or there is a blue <math>K_\ell</math>.

Ramsey showed in 1929 that <math>R(k,\ell)</math> is finite for any <math>k</math> and <math>\ell</math>. It is extremely hard to compute the exact value of <math>R(k,\ell)</math>. Here we give a lower bound of <math>R(k,k)</math> by the probabilistic method.

{{Theorem
|Theorem (Erdős 1947)|
:If <math>{n\choose k}\cdot 2^{1-{k\choose 2}}<1</math> then it is possible to color the edges of <math>K_n</math> with two colors so that there is no monochromatic <math>K_k</math> subgraph.
}}
{{Proof| Consider a random two-coloring of the edges of <math>K_n</math> obtained as follows:
* For each edge of <math>K_n</math>, independently flip a fair coin to decide the color of the edge.

For any fixed set <math>S</math> of <math>k</math> vertices, let <math>\mathcal{E}_S</math> be the event that the <math>K_k</math> subgraph induced by <math>S</math> is monochromatic. There are <math>{k\choose 2}</math> edges in <math>K_k</math>, therefore
:<math>\Pr[\mathcal{E}_S]=2\cdot 2^{-{k\choose 2}}=2^{1-{k\choose 2}}.</math>

Since there are <math>{n\choose k}</math> possible choices of <math>S</math>, by the union bound
:<math>\Pr[\exists S, \mathcal{E}_S]\le {n\choose k}\cdot\Pr[\mathcal{E}_S]={n\choose k}\cdot 2^{1-{k\choose 2}}.</math>

By the assumption <math>{n\choose k}\cdot 2^{1-{k\choose 2}}<1</math>, there exists a two-coloring for which none of the events <math>\mathcal{E}_S</math> occurs, which means there is no monochromatic <math>K_k</math> subgraph.
}}

For <math>k\ge 3</math>, take <math>n=\lfloor2^{k/2}\rfloor</math>. Then
:<math>
\begin{align}
{n\choose k}\cdot 2^{1-{k\choose 2}}
&<
\frac{n^k}{k!}\cdot\frac{2^{1+\frac{k}{2}}}{2^{k^2/2}}\\
&\le
\frac{2^{k^2/2}}{k!}\cdot\frac{2^{1+\frac{k}{2}}}{2^{k^2/2}}\\
&=
\frac{2^{1+\frac{k}{2}}}{k!}\\
&<1.
\end{align}
</math>

By the above theorem, there exists a two-coloring of <math>K_n</math> such that there is no monochromatic <math>K_k</math>. Therefore, the Ramsey number <math>R(k,k)>\lfloor2^{k/2}\rfloor</math> for all <math>k\ge 3</math>.
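The inequality can also be sanity-checked numerically. A small sketch using exact rational arithmetic (our own check, not part of the proof):

```python
from fractions import Fraction
from math import comb, floor

for k in range(3, 16):
    n = floor(2 ** (k / 2))
    # {n choose k} * 2^{1 - C(k,2)} as an exact fraction
    p = Fraction(2 * comb(n, k), 2 ** (k * (k - 1) // 2))
    assert p < 1          # union-bound probability is below 1
    # hence R(k, k) > floor(2^{k/2}) for this k
print("bound verified for k = 3..15")
```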

Note that for sufficiently large <math>k</math>, if <math>n= \lfloor 2^{k/2}\rfloor</math>, then the probability that there exists a monochromatic <math>K_k</math> is bounded by
:<math>
{n\choose k}\cdot 2^{1-{k\choose 2}}
<
\frac{2^{1+\frac{k}{2}}}{k!}
\ll 1,
</math>

which means that a random two-coloring of <math>K_n</math> is very likely not to contain a monochromatic <math>K_{2\log n}</math>. This gives us a very simple randomized algorithm for finding a two-coloring of <math>K_n</math> without a monochromatic <math>K_{2\log n}</math>.
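This randomized algorithm can be sketched directly. Note that the verification step below checks all <math>{n\choose k}</math> subsets, which is exponential in general; for the small parameters chosen here it is cheap (the function names are ours):

```python
import random
from itertools import combinations

def random_two_coloring(n):
    """Flip an independent fair coin for each edge of K_n."""
    return {frozenset(e): random.randrange(2) for e in combinations(range(n), 2)}

def has_monochromatic_clique(coloring, n, k):
    """Exhaustively test every k-subset of vertices for a monochromatic K_k."""
    return any(len({coloring[frozenset(e)] for e in combinations(S, 2)}) == 1
               for S in combinations(range(n), k))

# n = 2^{k/2} with k = 8: by the union bound, the failure probability
# is at most C(16,8) * 2^{1-28}, roughly 1e-4
n, k = 16, 8
coloring = random_two_coloring(n)
print(has_monochromatic_clique(coloring, n, k))
```

With overwhelming probability the printed answer is `False`, i.e. the very first sample already avoids monochromatic <math>K_8</math>.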

= Averaging Principle =

===Maximum cut===

Given an undirected graph <math>G(V,E)</math>, a set <math>C</math> of edges of <math>G</math> is called a '''cut''' if <math>G</math> is disconnected after removing the edges in <math>C</math>. We can represent a cut by <math>c(S,T)</math> where <math>(S,T)</math> is a bipartition of the vertex set <math>V</math>, and <math>c(S,T)=\{uv\in E\mid u\in S,v\in T\}</math> is the set of edges crossing between <math>S</math> and <math>T</math>.

We have seen how to compute a min-cut: either by a deterministic max-flow algorithm, or by Karger's randomized algorithm. On the other hand, max-cut is hard to compute because it is '''NP-complete'''. Actually, the weighted version of max-cut is among [http://en.wikipedia.org/wiki/Karp's_21_NP-complete_problems Karp's 21 NP-complete problems].

We now show by the probabilistic method that a max-cut always contains at least half of the edges.

{{Theorem
|Theorem|
:Given an undirected graph <math>G</math> with <math>n</math> vertices and <math>m</math> edges, there is a cut of size at least <math>\frac{m}{2}</math>.
}}
{{Proof| Enumerate the vertices in an arbitrary order. Partition the vertex set <math>V</math> into two disjoint sets <math>S</math> and <math>T</math> as follows.
:For each vertex <math>v\in V</math>,
:* independently choose one of <math>S</math> and <math>T</math> with equal probability, and let <math>v</math> join the chosen set.

For each vertex <math>v\in V</math>, let <math>X_v\in\{S,T\}</math> be the random variable which represents the set that <math>v</math> joins. For each edge <math>uv\in E</math>, let <math>Y_{uv}</math> be the 0-1 random variable which indicates whether <math>uv</math> crosses between <math>S</math> and <math>T</math>. Clearly,
:<math>\Pr[Y_{uv}=1]=\Pr[X_u\neq X_v]=\frac{1}{2}.</math>

The size of <math>c(S,T)</math> is given by <math>Y=\sum_{uv\in E}Y_{uv}</math>. By linearity of expectation,
:<math>\mathbf{E}[Y]=\sum_{uv\in E}\mathbf{E}[Y_{uv}]=\sum_{uv\in E}\Pr[Y_{uv}=1]=\frac{m}{2}.</math>

Therefore, there exists a bipartition <math>(S,T)</math> of <math>V</math> such that <math>|c(S,T)|\ge\frac{m}{2}</math>, i.e. there exists a cut of <math>G</math> which contains at least <math>\frac{m}{2}</math> edges.
}}
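The random bipartition in the proof is itself an algorithm: since the expected cut size is <math>m/2</math> and a cut can never exceed <math>m</math> edges, resampling until the cut has at least <math>m/2</math> edges terminates with probability 1. A minimal sketch (function names ours; the example graph is made up):

```python
import random

def random_cut_size(n, edges):
    """Each vertex joins S or T by an independent fair coin; return |c(S,T)|."""
    side = [random.randrange(2) for _ in range(n)]
    return sum(1 for u, v in edges if side[u] != side[v])

def find_large_cut(n, edges):
    """Resample until the cut has at least m/2 edges; since E[Y] = m/2
    and Y <= m, such a cut is sampled with positive probability."""
    while True:
        size = random_cut_size(n, edges)
        if 2 * size >= len(edges):
            return size

edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)]   # example graph with m = 5
print(find_large_cut(5, edges))   # at least 3
```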

=Alternations=

===Independent sets===

An independent set of a graph is a set of vertices with no edges between them. The following theorem gives a lower bound on the size of the largest independent set.

{{Theorem
|Theorem|
:Let <math>G(V,E)</math> be a graph on <math>n</math> vertices with <math>m</math> edges. Then <math>G</math> has an independent set with at least <math>\frac{n^2}{4m}</math> vertices.
}}
{{Proof| Let <math>S</math> be a random set of vertices constructed as follows:
:For each vertex <math>v\in V</math>:
:* <math>v</math> is included in <math>S</math> independently with probability <math>p</math>,
where <math>p</math> is to be determined.

Let <math>X=|S|</math>. It is obvious that <math>\mathbf{E}[X]=np</math>.

For each edge <math>e=uv\in E</math>, let <math>Y_{e}</math> be the 0-1 random variable which indicates whether both endpoints of <math>e</math> are in <math>S</math>. Then
:<math>\mathbf{E}[Y_{e}]=\Pr[u\in S\wedge v\in S]=p^2.</math>

Let <math>Y</math> be the number of edges in the subgraph of <math>G</math> induced by <math>S</math>. It holds that <math>Y=\sum_{e\in E}Y_e</math>. By linearity of expectation,
:<math>\mathbf{E}[Y]=\sum_{e\in E}\mathbf{E}[Y_e]=mp^2.</math>

Note that although <math>S</math> is not necessarily an independent set, it can be modified into one: for each edge <math>e</math> of the induced subgraph <math>G(S)</math>, delete one of the endpoints of <math>e</math> from <math>S</math>. Let <math>S^*</math> be the resulting set. Then <math>S^*</math> is an independent set, since no edge is left in the induced subgraph <math>G(S^*)</math>.

Since there are <math>Y</math> edges in <math>G(S)</math>, at most <math>Y</math> vertices are deleted from <math>S</math> to obtain <math>S^*</math>. Therefore, <math>|S^*|\ge X-Y</math>. By linearity of expectation,
:<math>\mathbf{E}[|S^*|]\ge\mathbf{E}[X-Y]=\mathbf{E}[X]-\mathbf{E}[Y]=np-mp^2.</math>

This lower bound is maximized when <math>p=\frac{n}{2m}</math>, thus
:<math>\mathbf{E}[|S^*|]\ge n\cdot\frac{n}{2m}-m\left(\frac{n}{2m}\right)^2=\frac{n^2}{4m}.</math>

Therefore, there exists an independent set which contains at least <math>\frac{n^2}{4m}</math> vertices.
}}

The proof actually proposes a randomized algorithm for constructing a large independent set:

{{Theorem
|Algorithm|
Given a graph on <math>n</math> vertices with <math>m</math> edges, let <math>d=\frac{2m}{n}</math> be the average degree.
#For each vertex <math>v\in V</math>, <math>v</math> is included in <math>S</math> independently with probability <math>\frac{1}{d}</math>.
#For each remaining edge in the induced subgraph <math>G(S)</math>, remove one of the endpoints from <math>S</math>.
}}

Let <math>S^*</math> be the resulting set. We have shown that <math>S^*</math> is an independent set and <math>\mathbf{E}[|S^*|]\ge\frac{n^2}{4m}</math>.
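The two steps above translate directly into code. A minimal sketch, assuming vertices are labeled <math>0,\dots,n-1</math> (the function name is ours):

```python
import random

def random_independent_set(n, edges):
    """Sample-and-modify: keep each vertex with probability 1/d, then
    delete one endpoint of every edge that survives."""
    d = 2 * len(edges) / n            # average degree
    p = min(1.0, 1 / d)               # clamp in case d < 1
    S = {v for v in range(n) if random.random() < p}
    for u, v in edges:
        if u in S and v in S:
            S.discard(v)              # remove one endpoint of the edge
    return S

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
S = random_independent_set(6, edges)
assert all(not (u in S and v in S) for u, v in edges)   # S is independent
print(sorted(S))
```

Each edge is examined after the sampling step, and vertices are only ever removed, so no edge of the original graph can have both endpoints left in the returned set.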