Combinatorics (Fall 2010)/Ramsey theory: Difference between revisions

From TCS Wiki
imported>WikiSysop
imported>WikiSysop
 
(17 intermediate revisions by the same user not shown)
Line 35: Line 35:


=== Ramsey number ===
=== Ramsey number ===
The smallest number <math>R(k,\ell)</math> satisfying the condition in the Ramsey theorem is called the '''Ramsey number'''.
Alternatively, we can define <math>R(k,\ell)</math> as the smallest <math>N</math> such that if <math>n\ge N</math>, for any 2-coloring of <math>K_n</math> in red and blue, there is either a red <math>K_k</math> or a blue <math>K_\ell</math>. The Ramsey theorem is stated as:
:"''<math>R(k,\ell)</math> is finite for any positive integers <math>k</math> and <math>\ell</math>.''"
The core of the inductive proof of the Ramsey theorem is the following recursion
The core of the inductive proof of the Ramsey theorem is the following recursion
:<math>\begin{align}
:<math>\begin{align}
Line 48: Line 53:
}}
}}


=== Lovász local lemma===
Consider a set of "bad" events <math>A_1,A_2,\ldots,A_n</math>. Suppose that <math>\Pr[A_i]\le p</math> for all <math>1\le i\le n</math>. We want to show that there is an outcome in which none of the bad events occurs. By the probabilistic method, it suffices to prove that
:<math>
\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]>0.
</math>
;Case 1<nowiki>: mutually independent events.</nowiki>
If all the bad events <math>A_1,A_2,\ldots,A_n</math> are mutually independent, then
:<math>
\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge(1-p)^n>0,
</math>
for any <math>p<1</math>.
;Case 2<nowiki>: arbitrarily dependent events.</nowiki>
On the other hand, if we put no assumption on the dependencies between the events, then by the union bound (which holds unconditionally),
:<math>
\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=1-\Pr\left[\bigvee_{i=1}^n A_i\right]\ge 1-np,
</math>
which is not an interesting bound for <math>p\ge\frac{1}{n}</math>. We cannot improve this bound without further information regarding the dependencies between the events.
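To make the gap between the two cases concrete, the following minimal sketch evaluates both lower bounds for a few values of <math>n</math>. The function names and the sample values of <math>n</math> and <math>p</math> are illustrative assumptions, not part of the original notes.

```python
def independent_bound(n: int, p: float) -> float:
    """Lower bound (1 - p)^n, valid when the n bad events are mutually independent."""
    return (1.0 - p) ** n


def union_bound(n: int, p: float) -> float:
    """Lower bound 1 - n*p from the union bound; it can be zero or negative."""
    return 1.0 - n * p


if __name__ == "__main__":
    # Illustrative values only: p is fixed and n grows.
    for n, p in [(10, 0.01), (100, 0.01), (1000, 0.01)]:
        print(n, p, independent_bound(n, p), union_bound(n, p))
```

With <math>p</math> fixed, the independent bound stays positive for every <math>n</math>, while the union bound becomes vacuous as soon as <math>np\ge 1</math>.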
----
We would like to know what is going on between the two extreme cases: mutually independent events, and arbitrarily dependent events. The Lovász local lemma provides such a tool.
The local lemma is a powerful tool for showing the possibility of rare events under ''limited dependencies''. The structure of dependencies between a set of events is described by a '''dependency graph'''.
{{Theorem
|Definition|
:Let <math>A_1,A_2,\ldots,A_n</math> be a set of events. A graph <math>D=(V,E)</math> on the set of vertices <math>V=\{1,2,\ldots,n\}</math> is called a '''dependency graph''' for the events <math>A_1,\ldots,A_n</math> if for each <math>i</math>, <math>1\le i\le n</math>, the event <math>A_i</math> is mutually independent of all the events <math>\{A_j\mid (i,j)\not\in E\}</math>.
}}
;Example
:Let <math>X_1,X_2,\ldots,X_m</math> be a set of ''mutually independent'' random variables. Each event <math>A_i</math> is a predicate defined on a number of variables among <math>X_1,X_2,\ldots,X_m</math>. Let <math>v(A_i)</math> be the unique smallest set of variables which determine <math>A_i</math>. The dependency graph <math>D=(V,E)</math> is defined by
:::<math>(i,j)\in E</math> iff <math>v(A_i)\cap v(A_j)\neq \emptyset</math>.
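The construction in this example can be sketched in a few lines. The code below is only an illustration under stated assumptions: events are indexed from 0, each event is represented solely by its variable set <math>v(A_i)</math>, and the helper names and the sample variable sets are hypothetical.

```python
from itertools import combinations


def dependency_graph(var_sets):
    """Return the edge set {(i, j) : i < j and v(A_i), v(A_j) share a variable}."""
    edges = set()
    for i, j in combinations(range(len(var_sets)), 2):
        if var_sets[i] & var_sets[j]:  # non-empty intersection
            edges.add((i, j))
    return edges


def max_degree(n, edges):
    """Maximum degree of the graph on vertices 0..n-1 with the given edges."""
    degree = [0] * n
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return max(degree, default=0)


# Hypothetical example: four events over the variables X_1, ..., X_5.
example_var_sets = [{1, 2}, {2, 3}, {4}, {4, 5}]
example_edges = dependency_graph(example_var_sets)
example_d = max_degree(len(example_var_sets), example_edges)
```

The maximum degree computed here plays the role of <math>d</math> in the symmetric version of the local lemma.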
The following lemma, known as the Lovász local lemma, first proved by Erdős and Lovász in 1975, is an extremely powerful tool, as it supplies a way for dealing with rare events.
{{Theorem
|Lovász Local Lemma (symmetric case)|
:Let <math>A_1,A_2,\ldots,A_n</math> be a set of events, and assume that the following hold:
:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
:#the maximum degree of the dependency graph for the events <math>A_1,A_2,\ldots,A_n</math> is <math>d</math>, and
:::<math>ep(d+1)\le 1</math>.
:Then
::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]>0</math>.
}}
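As a rough illustration of how the symmetric condition is used, the sketch below checks whether <math>ep(d+1)\le 1</math> holds for given <math>p</math> and <math>d</math>. The function name and the sample values are assumptions made for this example.

```python
import math


def symmetric_lll_applies(p: float, d: int) -> bool:
    """Check the hypothesis e * p * (d + 1) <= 1 of the symmetric local lemma."""
    return math.e * p * (d + 1) <= 1.0


if __name__ == "__main__":
    # Illustrative values: the same p passes for d = 6 and fails for d = 7.
    print(symmetric_lll_applies(0.05, 6))
    print(symmetric_lll_applies(0.05, 7))
```

Note that the number of events <math>n</math> does not appear in the condition, in contrast to the union bound.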
We will prove a general version of the local lemma, where the events <math>A_i</math> are not symmetric. This generalization is due to Spencer.
{{Theorem
|Lovász Local Lemma (general case)|
:Let <math>D=(V,E)</math> be the dependency graph of events <math>A_1,A_2,\ldots,A_n</math>. Suppose there exist real numbers <math>x_1,x_2,\ldots, x_n</math> such that <math>0\le x_i<1</math> and for all <math>1\le i\le n</math>,
::<math>\Pr[A_i]\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.
:Then
::<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)</math>.
}}
{{Proof|
We can use the following probability identity to compute the probability of the intersection of events:
{{Theorem|Lemma 1|
:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]</math>.
}}
{{Proof|
By definition of conditional probability,
:<math>
\Pr\left[\overline{A_n}\mid\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right]
=\frac{\Pr\left[\bigwedge_{i=1}^n\overline{A_{i}}\right]}
{\Pr\left[\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right]}</math>,
so we have
:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_{i}}\right]=\Pr\left[\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right]\Pr\left[\overline{A_n}\mid\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right]</math>.
The lemma is proved by recursively applying this equation.
}}
Next we prove by induction on <math>m</math> that for any set of <math>m</math> events <math>i_1,\ldots,i_m</math>,
:<math>\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]\le x_{i_1}</math>.
The local lemma is a direct consequence of this by applying Lemma 1.
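For <math>m=1</math>, the claim is <math>\Pr[A_{i_1}]\le x_{i_1}</math>, which follows directly from the assumption of the lemma. For general <math>m</math>, reorder the indices so that <math>i_2,\ldots,i_k</math> are exactly those among <math>i_2,\ldots,i_m</math> that are adjacent to <math>i_1</math> in the dependency graph. (If the dependency graph has maximum degree <math>d</math>, then <math>k-1\le d</math>.) It holds that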
:<math>
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
=\frac{\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
{\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]}
</math>,
which is due to the basic conditional probability identity
:<math>\Pr[A\mid BC]=\frac{\Pr[AB\mid C]}{\Pr[B\mid C]}</math>.
We bound the numerator
:<math>
\begin{align}
\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
&\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]\\
&=\Pr[A_{i_1}]\\
&\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j).
\end{align}
</math>
The equality holds because <math>A_{i_1}</math> is mutually independent of <math>A_{i_{k+1}},\ldots,A_{i_m}</math>, none of which is adjacent to <math>i_1</math> in the dependency graph.
The denominator can be expanded using Lemma 1 as
:<math>
\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]
=\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{\ell=j+1}^m\overline{A_{i_\ell}}\right]
</math>
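which, by the induction hypothesis (each factor involves fewer than <math>m</math> events), is at least
:<math>
\prod_{j=2}^k(1-x_{i_j})=\prod_{\{i_1,i_j\}\in E}(1-x_{i_j})
</math>
where <math>E</math> is the edge set of the dependency graph.
Therefore,
:<math>
\Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]
\le\frac{x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)}{\prod_{\{i_1,i_j\}\in E}(1-x_{i_j})}\le x_{i_1}.
</math>
The last inequality holds because every factor of the denominator also appears in the product in the numerator, and the remaining factors of the numerator are at most 1.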
Applying Lemma 1,
:<math>
\begin{align}
\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]
&=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\\
&=\prod_{i=1}^n\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)\\
&\ge\prod_{i=1}^n\left(1-x_i\right).
\end{align}
</math>
}}
To prove the symmetric case, let <math>x_i=\frac{1}{d+1}</math> for all <math>i=1,2,\ldots,n</math>. Note that <math>\left(1-\frac{1}{d+1}\right)^d>\frac{1}{\mathrm{e}}</math>.
If the following conditions are satisfied:
:#for all <math>1\le i\le n</math>, <math>\Pr[A_i]\le p</math>;
:#<math>ep(d+1)\le 1</math>;
then for all <math>1\le i\le n</math>,
:<math>\Pr[A_i]\le p\le\frac{1}{e(d+1)}<\frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le x_i\prod_{(i,j)\in E}(1-x_j)</math>.
Due to the local lemma for general cases, this implies that
:<math>\Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)=\left(1-\frac{1}{d+1}\right)^n>0</math>.
This gives the symmetric version of the local lemma.
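The key inequality in this reduction can be checked numerically. The sketch below is an illustration only: it assumes every event has exactly <math>d</math> neighbours (the worst case for the product), and the function names are hypothetical.

```python
import math


def general_lll_rhs(d: int) -> float:
    """x * (1 - x)^d with x = 1/(d+1): the right-hand side of the general
    condition when every x_j equals 1/(d+1) and event i has d neighbours."""
    x = 1.0 / (d + 1)
    return x * (1.0 - x) ** d


def symmetric_threshold(d: int) -> float:
    """Largest p allowed by the symmetric condition e * p * (d + 1) <= 1."""
    return 1.0 / (math.e * (d + 1))


def choice_is_valid(d: int) -> bool:
    """True if 1/(e(d+1)) < (1/(d+1)) * (1 - 1/(d+1))^d for this d."""
    return symmetric_threshold(d) < general_lll_rhs(d)
```

For every <math>d\ge 1</math> tried, the threshold of the symmetric case lies strictly below the right-hand side of the general condition, which is what the choice <math>x_i=\frac{1}{d+1}</math> requires.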
=== Ramsey number (continued)===
We can use the local lemma to prove a lower bound for the diagonal Ramsey number.
{{Theorem|Theorem|
{{Theorem|Theorem|
:<math>R(k,k)\ge Ck2^{k/2}</math> for some constant <math>C>0</math>.
:<math>R(k,k)\ge Ck2^{k/2}</math> for some constant <math>C>0</math>.
Line 73: Line 211:
}}
}}


 
{{Theorem|Theorem|
:<math>\Omega\left(k2^{k/2}\right)\le R(k,k)\le{2k-2\choose k-1}=O\left(k^{-1/2}4^{k}\right)</math>.
}}


{| class="wikitable"
{| class="wikitable"
Line 208: Line 348:
| 798–23556
| 798–23556
|}
|}


=== Ramsey's theorem for hypergraph ===
=== Ramsey's theorem for hypergraph ===
Line 354: Line 387:


{{Theorem|Theorem (Erdős-Szekeres 1935)|
{{Theorem|Theorem (Erdős-Szekeres 1935)|
:For any positive integer <math>n\ge 3</math>, there is an <math>N(n)</math> such that any set of at least <math>N(n)</math> points in general position in the plane (i.e., no three of the points are on a line) contains <math>n</math> points that are the vertices of a convex <math>n</math>-gon.
:For any positive integer <math>m\ge 3</math>, there is an <math>N(m)</math> such that any set of at least <math>N(m)</math> points in general position in the plane (i.e., no three of the points are on a line) contains <math>m</math> points that are the vertices of a convex <math>m</math>-gon.
}}
{{Proof|
Let <math>N(m)=R_3(m,m)</math>. For <math>n\ge N(m)</math>, let <math>X</math> be an arbitrary set of <math>n</math> points in the plane, no three of which are on a line. Define a 2-coloring of the 3-subsets of points <math>f:{X\choose 3}\rightarrow\{0,1\}</math> as follows: for any <math>\{a,b,c\}\in{X\choose 3}</math>, let <math>\triangle_{abc}\subset X</math> be the set of points covered by the triangle <math>abc</math>; and <math>f(\{a,b,c\})=|\triangle_{abc}|\bmod 2</math>, that is, <math>f(\{a,b,c\})</math> is the parity of the number of points covered by the triangle <math>abc</math>.
 
Since <math>|X|\ge R_3(m,m)</math>, there exists a <math>Y\subseteq X</math> such that <math>|Y|=m</math> and all members of <math>{Y\choose 3}</math> are colored with the same value by <math>f</math>.
 
We claim that the <math>m</math> points in <math>Y</math> are the vertices of a convex <math>m</math>-gon. Otherwise, by the definition of convexity, there exist <math>\{a,b,c,d\}\subseteq Y</math> such that <math>d\in\triangle_{abc}</math>. Since no three points are on the same line,
:<math>\triangle_{abc}=\triangle_{abd}\cup\triangle_{acd}\cup\triangle_{bcd}\cup\{d\}</math>,
where all unions are disjoint. Then <math>|\triangle_{abc}|=|\triangle_{abd}|+|\triangle_{acd}|+|\triangle_{bcd}|+1</math>, which implies that <math>f(\{a,b,c\}), f(\{a,b,d\}), f(\{a,c,d\}), f(\{b,c,d\})\,</math> cannot be equal, contradicting that all members of <math>{Y\choose 3}</math> have the same color.
}}
}}
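The parity coloring used in the proof can be illustrated on small point sets. The sketch below is a minimal example under stated assumptions: points are integer coordinate pairs in general position, a point is "covered" by a triangle when it lies strictly inside it, and all helper names and sample point sets are hypothetical.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle abc (nonzero in general position)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def strictly_inside(p, a, b, c):
    """True if p lies strictly inside triangle abc."""
    s1, s2, s3 = orient(a, b, p), orient(b, c, p), orient(c, a, p)
    return (s1 > 0 and s2 > 0 and s3 > 0) or (s1 < 0 and s2 < 0 and s3 < 0)


def color(triple, points):
    """Parity of the number of points of the set covered by the triangle."""
    a, b, c = triple
    return sum(strictly_inside(p, a, b, c) for p in points if p not in triple) % 2


def colors_of_all_triples(points):
    """Set of colors that f assigns to the 3-subsets of the point set."""
    return {color(t, points) for t in combinations(points, 3)}


# One point inside the triangle of the other three: not in convex position.
non_convex = [(0, 0), (10, 0), (0, 10), (3, 3)]
# Corners of a square: in convex position.
convex = [(0, 0), (10, 0), (10, 10), (0, 10)]
```

On the non-convex set both colors occur, as the counting argument in the proof predicts, while on the four corners of the square every triple receives the same color.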


=== Yao's lower bound on implicit data structures ===
=== Yao's lower bound on implicit data structures ===
{{Theorem|Lemma|
{{Theorem|Lemma|
:Let <math>n\ge 2</math> and <math>N\ge 2n-1</math>. Suppose the universe is <math>[N]</math> and the size of the data set is <math>n</math>.
:Let <math>n\ge 2</math> be a power of 2 and <math>N\ge 2n</math>. Suppose the universe is <math>[N]</math> and the size of the data set is <math>n</math>.
:If the data structure is a sorted table, any search algorithm requires at least <math>\lceil\log(n+1)\rceil</math> accesses to the data structure in the worst case.
:If the data structure is a sorted table, any search algorithm requires at least <math>\log n</math> accesses to the data structure in the worst case.
}}
}}
{{Proof|
{{Proof|
We will show by an adversarial argument that <math>\lceil\log(n+1)\rceil</math> accesses are required to search for the key value <math>x=n</math> from the universe <math>[N]=\{1,2,\ldots,N\}</math>. The construction of the adversarial data set <math>S</math> is by induction on <math>n</math>.
We will show by an adversarial argument that <math>\log n</math> accesses are required to search for the key value <math>x=n</math> from the universe <math>[N]=\{1,2,\ldots,N\}</math>. The construction of the adversarial data set <math>S</math> is by induction on <math>n</math>.


For <math>n=2</math> and <math>N\ge 2n-1=3</math> it is easy to see that two accesses are necessary.
For <math>n=2</math> and <math>N\ge 2n-1=3</math> it is easy to see that two accesses are necessary.


Let <math>n_0>2</math>. Assume the induction hypothesis to be true for all <math>n<n_0</math>; we will prove it for <math>n=n_0, m\ge 2n_0-1</math> and <math>x=n_0</math>.  
Let <math>n>2</math>. Assume the induction hypothesis to be true for all smaller <math>n</math>; we will prove it for the size of data set <math>n</math>, size of universe <math>N\ge 2n</math> and the search key <math>x=n</math>.


By symmetry, assume that the first access position <math>\ell</math> satisfies <math>\ell\le\lceil n_0/2\rceil</math>. The adversary answers <math>T[\ell]=\ell</math>. Then the key <math>x=n_0</math> may be in any position <math>i</math>, where <math>\lceil n_0/2\rceil+1\le i\le n_0</math>. In fact, <math>T[\lceil n_0/2\rceil+1]</math> through <math>T[n_0]</math> is a sorted table of size <math>n'=\lfloor n_0/2\rfloor</math> which may contain any <math>n'</math>-subset of <math>\{\lceil n_0/2\rceil+1,\lceil n_0/2\rceil+2,\ldots,N\}</math>, and hence, in particular, any subset of the universe
Suppose that the first access position is <math>k</math>. The adversary chooses the table content <math>T[k]</math>. The adversary's strategy is:
:<math>U'=\{\lceil n_0/2\rceil+1,\lceil n_0/2\rceil+2,\ldots,N-\lceil n_0/2\rceil\}</math>.
:<math>
\begin{align}
T[k]=
\begin{cases}
k & k\le \frac{n}{2},\\
N-(n-k) & k> \frac{n}{2}.
\end{cases}
\end{align}
</math>
By symmetry, suppose it is the first case that <math>k\le \frac{n}{2}</math>. Then the key <math>x=n</math> may be in any position <math>i</math>, where <math>n/2+1\le i\le n</math>. In fact, <math>T[ n/2+1]</math> through <math>T[n]</math> is a sorted table of size <math>n'=n/2</math> which may contain any <math>n'</math>-subset of <math>\{n/2+1, n/2+2,\ldots,N\}</math>, and hence, in particular, any <math>n'</math>-subset of the universe
:<math>U'=\{n/2+1, n/2+2,\ldots,N-n/2\}</math>.
The size <math>N'</math> of <math>U'</math> satisfies
The size <math>N'</math> of <math>U'</math> satisfies
:<math>N'=N-2\lceil n_0/2\rceil\ge (2n_0-1)-2\lceil n_0/2\rceil\ge 2\lfloor n_0/2\rfloor-1=2n'-1</math>,
:<math>N'=N-n/2-n/2\ge 2n-n\ge 2n'</math>,
and the desired key <math>n_0</math> has the relative value <math>x'=n_0-\lceil n_0/2\rceil=n'</math> in the universe <math>U'</math>.  
and the desired key <math>n</math> has the relative value <math>x'=n- n/2=n'</math> in the universe <math>U'</math>.  


By the induction hypothesis, <math>\lceil\log(n'+1)\rceil</math> more accesses will be required. Hence the total number of accesses is at least  
By the induction hypothesis, <math>\log n'=-1+\log n</math> more accesses will be required. Hence the total number of accesses is at least <math>1+\log n'=\log n</math>.
:<math>1+\lceil\log(n'+1)\rceil=1+\lceil\log(\lfloor n_0/2\rfloor+1)\rceil\ge\lceil\log(n_0+1)\rceil</math>.
 
If the first access is <math>k> \frac{n}{2}</math>, we symmetrically get that <math>T[1]</math> through <math>T[n/2]</math> is a sorted table of size <math>n'=n/2</math> which may contain any <math>n'</math>-subset of the universe
:<math>U'=\{n/2+1, n/2+2,\ldots,N-n/2\}</math>.
The rest is the same.
}}
}}
We have seen that on a sorted table, there is no search algorithm outperforming the binary search in the worst case.
Our question is:
:''Is there any other order than the increasing order, on which there is a better search algorithm?''
An '''implicit data structure''' uses no extra space in addition to the original data set; thus a data structure can only be represented ''implicitly'' by the order of the data items in the table. That is, each data set is stored as a permutation of the set. Formally, an implicit data structure is a function
:<math>f:{U\choose n}\rightarrow[n!]</math>,
where each <math>\pi\in[n!]</math> specifies a permutation of the sorted table. Thus, the sorted table is the simplest implicit data structure, in which <math>f(S)</math> is the identity for all <math>S\in{U\choose n}</math>.


== Ramsey-like Theorems ==
== Ramsey-like Theorems ==

Latest revision as of 12:28, 1 December 2010

Ramsey's Theorem

Ramsey's theorem for graph

Ramsey's Theorem
Let [math]\displaystyle{ k,\ell }[/math] be positive integers. Then there exists an integer [math]\displaystyle{ R(k,\ell) }[/math] satisfying:
If [math]\displaystyle{ n\ge R(k,\ell) }[/math], for any coloring of edges of [math]\displaystyle{ K_n }[/math] with two colors red and blue, there exists a red [math]\displaystyle{ K_k }[/math] or a blue [math]\displaystyle{ K_\ell }[/math].
Proof.

We show that [math]\displaystyle{ R(k,\ell) }[/math] is finite by induction on [math]\displaystyle{ k+\ell }[/math]. For the base case, it is easy to verify that

[math]\displaystyle{ R(k,1)=R(1,\ell)=1 }[/math].

For general [math]\displaystyle{ k }[/math] and [math]\displaystyle{ \ell }[/math], we will show that

[math]\displaystyle{ R(k,\ell)\le R(k,\ell-1)+R(k-1,\ell) }[/math].

Suppose we have a 2-coloring of [math]\displaystyle{ K_n }[/math], where [math]\displaystyle{ n=R(k,\ell-1)+R(k-1,\ell) }[/math]. Take an arbitrary vertex [math]\displaystyle{ v }[/math], and split [math]\displaystyle{ V\setminus\{v\} }[/math] into two subsets [math]\displaystyle{ S }[/math] and [math]\displaystyle{ T }[/math], where

[math]\displaystyle{ \begin{align} S&=\{u\in V\setminus\{v\}\mid uv \text{ is blue }\}\\ T&=\{u\in V\setminus\{v\}\mid uv \text{ is red }\} \end{align} }[/math]

Since

[math]\displaystyle{ |S|+|T|+1=n=R(k,\ell-1)+R(k-1,\ell) }[/math],

we have either [math]\displaystyle{ |S|\ge R(k,\ell-1) }[/math] or [math]\displaystyle{ |T|\ge R(k-1,\ell) }[/math]. By symmetry, suppose [math]\displaystyle{ |S|\ge R(k,\ell-1) }[/math]. By the induction hypothesis, the complete subgraph defined on [math]\displaystyle{ S }[/math] has either a red [math]\displaystyle{ K_k }[/math], in which case we are done; or a blue [math]\displaystyle{ K_{\ell-1} }[/math], in which case the complete subgraph defined on [math]\displaystyle{ S\cup\{v\} }[/math] must have a blue [math]\displaystyle{ K_\ell }[/math] since all edges from [math]\displaystyle{ v }[/math] to vertices in [math]\displaystyle{ S }[/math] are blue.

[math]\displaystyle{ \square }[/math]
Ramsey's Theorem (graph, multicolor)
Let [math]\displaystyle{ r, k_1,k_2,\ldots,k_r }[/math] be positive integers. Then there exists an integer [math]\displaystyle{ R(r;k_1,k_2,\ldots,k_r) }[/math] satisfying:
For any [math]\displaystyle{ r }[/math]-coloring of a complete graph of [math]\displaystyle{ n\ge R(r;k_1,k_2,\ldots,k_r) }[/math] vertices, there exists a monochromatic [math]\displaystyle{ k_i }[/math]-clique with the [math]\displaystyle{ i }[/math]th color for some [math]\displaystyle{ i\in\{1,2,\ldots,r\} }[/math].
Lemma (the "mixing color" trick)
[math]\displaystyle{ R(r;k_1,k_2,\ldots,k_r)\le R(r-1;k_1,k_2,\ldots,k_{r-2},R(2;k_{r-1},k_r)) }[/math]
Proof.

We transform the [math]\displaystyle{ r }[/math]-coloring into an [math]\displaystyle{ (r-1) }[/math]-coloring by identifying the [math]\displaystyle{ (r-1) }[/math]th and the [math]\displaystyle{ r }[/math]th colors.

If [math]\displaystyle{ n\ge R(r-1;k_1,k_2,\ldots,k_{r-2},R(2;k_{r-1},k_r)) }[/math], then for any [math]\displaystyle{ r }[/math]-coloring of [math]\displaystyle{ K_n }[/math], either there exist an [math]\displaystyle{ i\in\{1,2,\ldots,r-2\} }[/math] and a [math]\displaystyle{ k_i }[/math]-clique which is monochromatically colored with the [math]\displaystyle{ i }[/math]th color, or there exists a clique of [math]\displaystyle{ R(2;k_{r-1},k_r) }[/math] vertices which is monochromatically colored with the mixed color of the original [math]\displaystyle{ (r-1) }[/math]th and [math]\displaystyle{ r }[/math]th colors. In the latter case, there exists either a [math]\displaystyle{ k_{r-1} }[/math]-clique which is monochromatically colored with the original [math]\displaystyle{ (r-1) }[/math]th color, or a [math]\displaystyle{ k_r }[/math]-clique which is monochromatically colored with the original [math]\displaystyle{ r }[/math]th color. This implies the recursion.

[math]\displaystyle{ \square }[/math]

Ramsey number

The smallest number [math]\displaystyle{ R(k,\ell) }[/math] satisfying the condition in the Ramsey theorem is called the Ramsey number.

Alternatively, we can define [math]\displaystyle{ R(k,\ell) }[/math] as the smallest [math]\displaystyle{ N }[/math] such that if [math]\displaystyle{ n\ge N }[/math], for any 2-coloring of [math]\displaystyle{ K_n }[/math] in red and blue, there is either a red [math]\displaystyle{ K_k }[/math] or a blue [math]\displaystyle{ K_\ell }[/math]. The Ramsey theorem is stated as:

"[math]\displaystyle{ R(k,\ell) }[/math] is finite for any positive integers [math]\displaystyle{ k }[/math] and [math]\displaystyle{ \ell }[/math]."

The core of the inductive proof of the Ramsey theorem is the following recursion

[math]\displaystyle{ \begin{align} R(k,1) &=R(1,\ell)=1\\ R(k,\ell) &\le R(k,\ell-1)+R(k-1,\ell). \end{align} }[/math]

From this recursion, we can deduce an upper bound for the Ramsey number.

Theorem
[math]\displaystyle{ R(k,\ell)\le{k+\ell-2\choose k-1} }[/math].
Proof.
It is easy to verify the bound by induction.
[math]\displaystyle{ \square }[/math]
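The induction can be checked mechanically: if the recursion is taken with equality, the resulting values satisfy Pascal's rule and coincide with the binomial coefficient. The sketch below is an illustration; the function name is hypothetical, and the values it returns are upper bounds on the Ramsey numbers, not the Ramsey numbers themselves.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def recursion_bound(k: int, l: int) -> int:
    """Value obtained by unrolling R(k,l) <= R(k,l-1) + R(k-1,l) with equality,
    starting from R(k,1) = R(1,l) = 1."""
    if k == 1 or l == 1:
        return 1
    return recursion_bound(k, l - 1) + recursion_bound(k - 1, l)


if __name__ == "__main__":
    # Compare the unrolled recursion with the closed form for small k, l.
    for k in range(1, 6):
        print([(recursion_bound(k, l), comb(k + l - 2, k - 1)) for l in range(1, 6)])
```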

Lovász local lemma

Consider a set of "bad" events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math]. Suppose that [math]\displaystyle{ \Pr[A_i]\le p }[/math] for all [math]\displaystyle{ 1\le i\le n }[/math]. We want to show that there is an outcome in which none of the bad events occurs. By the probabilistic method, it suffices to prove that

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\gt 0. }[/math]
Case 1: mutually independent events.

If all the bad events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] are mutually independent, then

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge(1-p)^n\gt 0, }[/math]

for any [math]\displaystyle{ p\lt 1 }[/math].

Case 2: arbitrarily dependent events.

On the other hand, if we put no assumption on the dependencies between the events, then by the union bound (which holds unconditionally),

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=1-\Pr\left[\bigvee_{i=1}^n A_i\right]\ge 1-np, }[/math]

which is not an interesting bound for [math]\displaystyle{ p\ge\frac{1}{n} }[/math]. We cannot improve this bound without further information regarding the dependencies between the events.


We would like to know what is going on between the two extreme cases: mutually independent events, and arbitrarily dependent events. The Lovász local lemma provides such a tool.

The local lemma is a powerful tool for showing the possibility of rare events under limited dependencies. The structure of dependencies between a set of events is described by a dependency graph.

Definition
Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a set of events. A graph [math]\displaystyle{ D=(V,E) }[/math] on the set of vertices [math]\displaystyle{ V=\{1,2,\ldots,n\} }[/math] is called a dependency graph for the events [math]\displaystyle{ A_1,\ldots,A_n }[/math] if for each [math]\displaystyle{ i }[/math], [math]\displaystyle{ 1\le i\le n }[/math], the event [math]\displaystyle{ A_i }[/math] is mutually independent of all the events [math]\displaystyle{ \{A_j\mid (i,j)\not\in E\} }[/math].
Example
Let [math]\displaystyle{ X_1,X_2,\ldots,X_m }[/math] be a set of mutually independent random variables. Each event [math]\displaystyle{ A_i }[/math] is a predicate defined on a number of variables among [math]\displaystyle{ X_1,X_2,\ldots,X_m }[/math]. Let [math]\displaystyle{ v(A_i) }[/math] be the unique smallest set of variables which determine [math]\displaystyle{ A_i }[/math]. The dependency graph [math]\displaystyle{ D=(V,E) }[/math] is defined by
[math]\displaystyle{ (i,j)\in E }[/math] iff [math]\displaystyle{ v(A_i)\cap v(A_j)\neq \emptyset }[/math].

The following lemma, known as the Lovász local lemma, first proved by Erdős and Lovász in 1975, is an extremely powerful tool, as it supplies a way for dealing with rare events.

Lovász Local Lemma (symmetric case)
Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a set of events, and assume that the following hold:
  1. for all [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ \Pr[A_i]\le p }[/math];
  2. the maximum degree of the dependency graph for the events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] is [math]\displaystyle{ d }[/math], and
[math]\displaystyle{ ep(d+1)\le 1 }[/math].
Then
[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\gt 0 }[/math].

We will prove a general version of the local lemma, where the events [math]\displaystyle{ A_i }[/math] are not symmetric. This generalization is due to Spencer.

Lovász Local Lemma (general case)
Let [math]\displaystyle{ D=(V,E) }[/math] be the dependency graph of events [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math]. Suppose there exist real numbers [math]\displaystyle{ x_1,x_2,\ldots, x_n }[/math] such that [math]\displaystyle{ 0\le x_i\lt 1 }[/math] and for all [math]\displaystyle{ 1\le i\le n }[/math],
[math]\displaystyle{ \Pr[A_i]\le x_i\prod_{(i,j)\in E}(1-x_j) }[/math].
Then
[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i) }[/math].
Proof.

We can use the following probability identity to compute the probability of the intersection of events:

Lemma 1
[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right] }[/math].
Proof.

By definition of conditional probability,

[math]\displaystyle{ \Pr\left[\overline{A_n}\mid\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right] =\frac{\Pr\left[\bigwedge_{i=1}^n\overline{A_{i}}\right]} {\Pr\left[\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right]} }[/math],

so we have

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_{i}}\right]=\Pr\left[\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right]\Pr\left[\overline{A_n}\mid\bigwedge_{i=1}^{n-1}\overline{A_{i}}\right] }[/math].

The lemma is proved by recursively applying this equation.

[math]\displaystyle{ \square }[/math]
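As a sanity check of Lemma 1, the following sketch evaluates both sides exactly on a tiny probability space. The sample space (three fair bits) and the three "bad" events are hypothetical choices made only for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space: three independent fair bits (an illustrative choice).
omega = list(product([0, 1], repeat=3))

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

# Hypothetical bad events A_1, A_2, A_3.
bad = [
    lambda w: w[0] == 1 and w[1] == 1,
    lambda w: w[1] == 1 and w[2] == 1,
    lambda w: w[0] == 0 and w[2] == 0,
]

def none_of(k):
    """Event that none of the first k bad events occurs."""
    return lambda w: not any(a(w) for a in bad[:k])

# Left-hand side: probability that no bad event occurs.
lhs = prob(none_of(len(bad)))

# Right-hand side: product of the conditional probabilities
# Pr[not A_i | not A_1, ..., not A_{i-1}], each written as a ratio.
rhs = Fraction(1)
for i in range(len(bad)):
    rhs *= prob(none_of(i + 1)) / prob(none_of(i))
```

The product telescopes, which is exactly the recursive application of the conditional probability identity used in the proof.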

Next we prove by induction on [math]\displaystyle{ m }[/math] that for any set of [math]\displaystyle{ m }[/math] events [math]\displaystyle{ i_1,\ldots,i_m }[/math],

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right]\le x_{i_1} }[/math].

The local lemma is a direct consequence of this by applying Lemma 1.

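For [math]\displaystyle{ m=1 }[/math], the claim reads [math]\displaystyle{ \Pr[A_{i_1}]\le x_{i_1} }[/math], which follows from the assumption of the lemma because every factor [math]\displaystyle{ 1-x_j }[/math] is at most 1. For general [math]\displaystyle{ m }[/math], relabel the events so that [math]\displaystyle{ i_2,\ldots,i_k }[/math] are exactly those among [math]\displaystyle{ i_2,\ldots,i_m }[/math] that are adjacent to [math]\displaystyle{ i_1 }[/math] in the dependency graph, so that [math]\displaystyle{ k-1 }[/math] is at most the degree of [math]\displaystyle{ i_1 }[/math]. Then it holds that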

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right] =\frac{\Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]} {\Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]} }[/math],

which is due to the basic conditional probability identity

[math]\displaystyle{ \Pr[A\mid BC]=\frac{\Pr[AB\mid C]}{\Pr[B\mid C]} }[/math].

We bound the numerator

[math]\displaystyle{ \begin{align} \Pr\left[ A_{i_1}\wedge \bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right] &\le\Pr\left[ A_{i_1}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right]\\ &=\Pr[A_{i_1}]\\ &\le x_{i_1}\prod_{(i_1,j)\in E}(1-x_j). \end{align} }[/math]

The equality holds because [math]\displaystyle{ A_{i_1} }[/math] is independent of the events [math]\displaystyle{ A_{i_{k+1}},\ldots,A_{i_m} }[/math], none of which is adjacent to [math]\displaystyle{ i_1 }[/math] in the dependency graph.

The denominator can be expanded using Lemma 1 as

[math]\displaystyle{ \Pr\left[\bigwedge_{j=2}^k\overline{A_{i_j}}\mid \bigwedge_{j=k+1}^m\overline{A_{i_j}}\right] =\prod_{j=2}^k\Pr\left[\overline{A_{i_j}}\mid \bigwedge_{\ell=j+1}^m\overline{A_{i_\ell}}\right] }[/math]

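which, by the induction hypothesis applied to each factor, is at least

[math]\displaystyle{ \prod_{j=2}^k(1-x_{i_j})\ge\prod_{(i_1,j)\in E}(1-x_j) }[/math],

where [math]\displaystyle{ E }[/math] is the edge set of the dependency graph. The inequality holds because [math]\displaystyle{ i_2,\ldots,i_k }[/math] are distinct neighbors of [math]\displaystyle{ i_1 }[/math] and every factor [math]\displaystyle{ 1-x_j }[/math] lies in [math]\displaystyle{ (0,1] }[/math].

Therefore,

[math]\displaystyle{ \Pr\left[A_{i_1}\mid \bigwedge_{j=2}^m\overline{A_{i_j}}\right] \le\frac{x_{i_1}\prod_{(i_1,j)\in E}(1-x_j)}{\prod_{j=2}^k(1-x_{i_j})}\le x_{i_1}. }[/math]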

Applying Lemma 1,

[math]\displaystyle{ \begin{align} \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right] &=\prod_{i=1}^n\Pr\left[\overline{A_i}\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\\ &=\prod_{i=1}^n\left(1-\Pr\left[A_i\mid \bigwedge_{j=1}^{i-1}\overline{A_{j}}\right]\right)\\ &\ge\prod_{i=1}^n\left(1-x_i\right). \end{align} }[/math]
[math]\displaystyle{ \square }[/math]

To prove the symmetric case, let [math]\displaystyle{ x_i=\frac{1}{d+1} }[/math] for all [math]\displaystyle{ i=1,2,\ldots,n }[/math]. Note that [math]\displaystyle{ \left(1-\frac{1}{d+1}\right)^d\gt \frac{1}{\mathrm{e}} }[/math].
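
One way to see the last inequality, assuming [math]\displaystyle{ d\ge 1 }[/math], is to rewrite the left-hand side and use the standard bound [math]\displaystyle{ \left(1+\frac{1}{d}\right)^d\lt \mathrm{e} }[/math], which follows from [math]\displaystyle{ 1+x\lt \mathrm{e}^x }[/math] with [math]\displaystyle{ x=\frac{1}{d} }[/math]:

[math]\displaystyle{ \left(1-\frac{1}{d+1}\right)^d=\left(\frac{d}{d+1}\right)^d=\frac{1}{\left(1+\frac{1}{d}\right)^d}\gt \frac{1}{\mathrm{e}} }[/math].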

If the following conditions are satisfied:

  1. for all [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ \Pr[A_i]\le p }[/math];
  2. [math]\displaystyle{ ep(d+1)\le 1 }[/math];

then for all [math]\displaystyle{ 1\le i\le n }[/math],

[math]\displaystyle{ \Pr[A_i]\le p\le\frac{1}{e(d+1)}\lt \frac{1}{d+1}\left(1-\frac{1}{d+1}\right)^d\le x_i\prod_{(i,j)\in E}(1-x_j) }[/math].

Due to the local lemma for general cases, this implies that

[math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\ge\prod_{i=1}^n(1-x_i)=\left(1-\frac{1}{d+1}\right)^n\gt 0 }[/math].

This gives the symmetric version of the local lemma.

Ramsey number (continued)

We can use the local lemma to prove a lower bound for the diagonal Ramsey number.

Theorem
[math]\displaystyle{ R(k,k)\ge Ck2^{k/2} }[/math] for some constant [math]\displaystyle{ C\gt 0 }[/math].
Proof.

To prove a lower bound [math]\displaystyle{ R(k,k)\gt n }[/math], it is sufficient to show that there exists a 2-coloring of [math]\displaystyle{ K_n }[/math] without a monochromatic [math]\displaystyle{ K_k }[/math]. We prove this by the probabilistic method.

Pick a random 2-coloring of [math]\displaystyle{ K_n }[/math] by coloring each edge uniformly and independently with one of the two colors. For any set [math]\displaystyle{ S }[/math] of [math]\displaystyle{ k }[/math] vertices, let [math]\displaystyle{ A_S }[/math] denote the event that [math]\displaystyle{ S }[/math] forms a monochromatic [math]\displaystyle{ K_k }[/math]. It is easy to see that [math]\displaystyle{ \Pr[A_S]=2^{1-{k\choose 2}}=p }[/math].

For any [math]\displaystyle{ k }[/math]-subset [math]\displaystyle{ T }[/math] of vertices, [math]\displaystyle{ A_S }[/math] and [math]\displaystyle{ A_T }[/math] are dependent if and only if [math]\displaystyle{ |S\cap T|\ge 2 }[/math], that is, if and only if they share an edge. For each [math]\displaystyle{ S }[/math], the number of [math]\displaystyle{ T }[/math] such that [math]\displaystyle{ |S\cap T|\ge 2 }[/math] is at most [math]\displaystyle{ {k\choose 2}{n\choose k-2} }[/math], so the maximum degree of the dependency graph is [math]\displaystyle{ d\le{k\choose 2}{n\choose k-2} }[/math].

Take [math]\displaystyle{ n=Ck2^{k/2} }[/math] for some appropriate constant [math]\displaystyle{ C\gt 0 }[/math].

[math]\displaystyle{ \begin{align} \mathrm{e}p(d+1) &\le \mathrm{e}2^{1-{k\choose 2}}\left({k\choose 2}{n\choose k-2}+1\right)\\ &\le 2^{3-{k\choose 2}}{k\choose 2}{n\choose k-2}\\ &\le 1 \end{align} }[/math]

Applying the local lemma, the probability that there is no monochromatic [math]\displaystyle{ K_k }[/math] is

[math]\displaystyle{ \Pr\left[\bigwedge_{S\in{[n]\choose k}}\overline{A_S}\right]\gt 0 }[/math].

Therefore, there exists a 2-coloring of [math]\displaystyle{ K_n }[/math] which has no monochromatic [math]\displaystyle{ K_k }[/math], which means

[math]\displaystyle{ R(k,k)\gt n=Ck2^{k/2} }[/math].
[math]\displaystyle{ \square }[/math]
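The condition used in this proof can be evaluated numerically. The sketch below, for [math]\displaystyle{ k\ge 3 }[/math], searches for the largest [math]\displaystyle{ n }[/math] satisfying [math]\displaystyle{ \mathrm{e}p(d+1)\le 1 }[/math] with the degree bound from the proof. The names `lll_condition_holds` and `best_n` are hypothetical, and a rational upper bound on e is used as an assumption so that all arithmetic is exact.

```python
from fractions import Fraction
from math import comb

# Rational upper bound on e; passing the test with this value implies
# the real condition e*p*(d+1) <= 1.
E_UPPER = Fraction(2718282, 1000000)

def lll_condition_holds(n, k):
    """Check e*p*(d+1) <= 1 with p = 2^(1-C(k,2)) and d = C(k,2)*C(n,k-2)."""
    p = Fraction(2, 2 ** comb(k, 2))
    d = comb(k, 2) * comb(n, k - 2)
    return E_UPPER * p * (d + 1) <= 1

def best_n(k):
    """Largest n >= k for which the condition holds, or None if none does.

    Intended for k >= 3, where d grows with n and the search terminates.
    """
    if not lll_condition_holds(k, k):
        return None
    n = k
    while lll_condition_holds(n + 1, k):
        n += 1
    return n
```

For example, `best_n(6)` evaluates to 11, which only certifies [math]\displaystyle{ R(6,6)\gt 11 }[/math]; the strength of the bound lies in its growth rate for large [math]\displaystyle{ k }[/math], not in small cases.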
Theorem
[math]\displaystyle{ \Omega\left(k2^{k/2}\right)\le R(k,k)\le{2k-2\choose k-1}=O\left(k^{-1/2}4^{k}\right) }[/math].
[math]\displaystyle{ k }[/math],[math]\displaystyle{ l }[/math] 1 2 3 4 5 6 7 8 9 10
1 1 1 1 1 1 1 1 1 1 1
2 1 2 3 4 5 6 7 8 9 10
3 1 3 6 9 14 18 23 28 36 40–43
4 1 4 9 18 25 35–41 49–61 56–84 73–115 92–149
5 1 5 14 25 43–49 58–87 80–143 101–216 125–316 143–442
6 1 6 18 35–41 58–87 102–165 113–298 127–495 169–780 179–1171
7 1 7 23 49–61 80–143 113–298 205–540 216–1031 233–1713 289–2826
8 1 8 28 56–84 101–216 127–495 216–1031 282–1870 317–3583 317–6090
9 1 9 36 73–115 125–316 169–780 233–1713 317–3583 565–6588 580–12677
10 1 10 40–43 92–149 143–442 179–1171 289–2826 317–6090 580–12677 798–23556
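
The smallest nontrivial entry of the table, [math]\displaystyle{ R(3,3)=6 }[/math], is small enough to confirm by exhaustive search. The following is a brute-force sketch with hypothetical function names; it is feasible only for very small [math]\displaystyle{ n }[/math], since the number of colorings is [math]\displaystyle{ 2^{n\choose 2} }[/math].

```python
from itertools import combinations, product

def has_mono_triangle(n, color):
    """color maps each edge (u, v) with u < v to 0 or 1."""
    for a, b, c in combinations(range(n), 3):
        if color[(a, b)] == color[(a, c)] == color[(b, c)]:
            return True
    return False

def every_coloring_has_mono_triangle(n):
    """Exhaustively test all 2-colorings of the edges of K_n."""
    edges = list(combinations(range(n), 2))
    for bits in product([0, 1], repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, bits))):
            return False
    return True
```

A call with `n = 5` finds a coloring without a monochromatic triangle, while `n = 6` finds none, matching the table entry.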

Ramsey's theorem for hypergraphs

Ramsey's Theorem (hypergraph, multicolor)
Let [math]\displaystyle{ r, t, k_1,k_2,\ldots,k_r }[/math] be positive integers. Then there exists an integer [math]\displaystyle{ R_t(r;k_1,k_2,\ldots,k_r) }[/math] satisfying:
For any [math]\displaystyle{ r }[/math]-coloring of [math]\displaystyle{ {[n]\choose t} }[/math] with [math]\displaystyle{ n\ge R_t(r;k_1,k_2,\ldots,k_r) }[/math], there exist an [math]\displaystyle{ i\in\{1,2,\ldots,r\} }[/math] and a subset [math]\displaystyle{ X\subseteq [n] }[/math] with [math]\displaystyle{ |X|\ge k_i }[/math] such that all members of [math]\displaystyle{ {X\choose t} }[/math] are colored with the [math]\displaystyle{ i }[/math]th color.

In arrow notation, the conclusion above is written as [math]\displaystyle{ n\rightarrow(k_1,k_2,\ldots,k_r)^t }[/math].

Lemma (the "mixing color" trick)
[math]\displaystyle{ R_t(r;k_1,k_2,\ldots,k_r)\le R_t(r-1;k_1,k_2,\ldots,k_{r-2},R_t(2;k_{r-1},k_r)) }[/math]

It is then sufficient to prove Ramsey's theorem for the two-coloring of a hypergraph, that is, to prove that [math]\displaystyle{ R_t(k,\ell)=R_t(2;k,\ell) }[/math] is finite.

Lemma
[math]\displaystyle{ R_t(k,\ell)\le R_{t-1}(R_t(k-1,\ell),R_t(k,\ell-1))+1 }[/math]
Proof.

Let [math]\displaystyle{ n=R_{t-1}(R_t(k-1,\ell),R_t(k,\ell-1))+1 }[/math]. Denote [math]\displaystyle{ [n]=\{1,2,\ldots,n\} }[/math].

Let [math]\displaystyle{ f:{[n]\choose t}\rightarrow\{{\color{red}\text{red}},{\color{blue}\text{blue}}\} }[/math] be an arbitrary 2-coloring of [math]\displaystyle{ {[n]\choose t} }[/math]. It is then sufficient to show that either there exists an [math]\displaystyle{ X\subseteq[n] }[/math] with [math]\displaystyle{ |X|=k }[/math] such that all members of [math]\displaystyle{ {X\choose t} }[/math] are colored red by [math]\displaystyle{ f }[/math], or there exists an [math]\displaystyle{ X\subseteq[n] }[/math] with [math]\displaystyle{ |X|=\ell }[/math] such that all members of [math]\displaystyle{ {X\choose t} }[/math] are colored blue by [math]\displaystyle{ f }[/math].

We remove [math]\displaystyle{ n }[/math] from [math]\displaystyle{ [n] }[/math] and define a new coloring [math]\displaystyle{ f' }[/math] of [math]\displaystyle{ {[n-1]\choose t-1} }[/math] by

[math]\displaystyle{ f'(A)=f(A\cup\{n\}) }[/math] for any [math]\displaystyle{ A\in{[n-1]\choose t-1} }[/math].

By the choice of [math]\displaystyle{ n }[/math], there exists a subset of [math]\displaystyle{ [n-1] }[/math] that either has size [math]\displaystyle{ R_t(k-1,\ell) }[/math] with all its [math]\displaystyle{ (t-1) }[/math]-subsets colored red by [math]\displaystyle{ f' }[/math], or has size [math]\displaystyle{ R_t(k,\ell-1) }[/math] with all its [math]\displaystyle{ (t-1) }[/math]-subsets colored blue by [math]\displaystyle{ f' }[/math]. By symmetry, assume the former: there exists a subset [math]\displaystyle{ S\subseteq[n-1] }[/math] with [math]\displaystyle{ |S|=R_t(k-1,\ell) }[/math] such that all members of [math]\displaystyle{ {S\choose t-1} }[/math] are colored red by [math]\displaystyle{ f' }[/math]. Then either there exists an [math]\displaystyle{ X\subseteq S }[/math] with [math]\displaystyle{ |X|=\ell }[/math] such that [math]\displaystyle{ {X\choose t} }[/math] is colored all blue by [math]\displaystyle{ f }[/math], in which case we are done; or there exists an [math]\displaystyle{ X\subseteq S }[/math] with [math]\displaystyle{ |X|=k-1 }[/math] such that [math]\displaystyle{ {X\choose t} }[/math] is colored all red by [math]\displaystyle{ f }[/math]. Next we prove that in the latter case [math]\displaystyle{ {X\cup\{n\}\choose t} }[/math] is all red, which will complete the proof. Since all [math]\displaystyle{ A\in{S\choose t-1} }[/math] are colored red by [math]\displaystyle{ f' }[/math], by the definition of [math]\displaystyle{ f' }[/math] we have [math]\displaystyle{ f(A\cup\{n\})={\color{red}\text{red}} }[/math] for all [math]\displaystyle{ A\in {X\choose t-1}\subseteq{S\choose t-1} }[/math]. Every member of [math]\displaystyle{ {X\cup\{n\}\choose t} }[/math] either is contained in [math]\displaystyle{ X }[/math] or has the form [math]\displaystyle{ A\cup\{n\} }[/math] with [math]\displaystyle{ A\in{X\choose t-1} }[/math]. Recalling that [math]\displaystyle{ {X\choose t} }[/math] is colored all red by [math]\displaystyle{ f }[/math], we conclude that [math]\displaystyle{ {X\cup\{n\}\choose t} }[/math] is colored all red by [math]\displaystyle{ f }[/math], and since [math]\displaystyle{ |X\cup\{n\}|=k }[/math] we are done.

[math]\displaystyle{ \square }[/math]

Applications of Ramsey Theorem

The "Happy Ending" problem

The happy ending problem
Any set of 5 points in the plane, no three on a line, has a subset of 4 points that form the vertices of a convex quadrilateral.

See the article [1] for the proof.

We say that a set of points in the plane is in general position if no three of the points are on the same line.

Theorem (Erdős-Szekeres 1935)
For any positive integer [math]\displaystyle{ m\ge 3 }[/math], there is an [math]\displaystyle{ N(m) }[/math] such that any set of at least [math]\displaystyle{ N(m) }[/math] points in general position in the plane (i.e., no three of the points are on a line) contains [math]\displaystyle{ m }[/math] points that are the vertices of a convex [math]\displaystyle{ m }[/math]-gon.
Proof.

Let [math]\displaystyle{ N(m)=R_3(m,m) }[/math]. For [math]\displaystyle{ n\ge N(m) }[/math], let [math]\displaystyle{ X }[/math] be an arbitrary set of [math]\displaystyle{ n }[/math] points in the plane, no three of which are on a line. Define a 2-coloring of the 3-subsets of points [math]\displaystyle{ f:{X\choose 3}\rightarrow\{0,1\} }[/math] as follows: for any [math]\displaystyle{ \{a,b,c\}\in{X\choose 3} }[/math], let [math]\displaystyle{ \triangle_{abc}\subset X }[/math] be the set of points of [math]\displaystyle{ X }[/math] lying in the interior of the triangle [math]\displaystyle{ abc }[/math]; and [math]\displaystyle{ f(\{a,b,c\})=|\triangle_{abc}|\bmod 2 }[/math], that is, [math]\displaystyle{ f(\{a,b,c\}) }[/math] indicates the parity of the number of points covered by the triangle [math]\displaystyle{ abc }[/math].

Since [math]\displaystyle{ |X|\ge R_3(m,m) }[/math], there exists a [math]\displaystyle{ Y\subseteq X }[/math] such that [math]\displaystyle{ |Y|=m }[/math] and all members of [math]\displaystyle{ {Y\choose 3} }[/math] are colored with the same value by [math]\displaystyle{ f }[/math].

We claim that the [math]\displaystyle{ m }[/math] points in [math]\displaystyle{ Y }[/math] are the vertices of a convex [math]\displaystyle{ m }[/math]-gon. Otherwise, by the definition of convexity, there exist [math]\displaystyle{ \{a,b,c,d\}\subseteq Y }[/math] such that [math]\displaystyle{ d\in\triangle_{abc} }[/math]. Since no three points are on the same line,

[math]\displaystyle{ \triangle_{abc}=\triangle_{abd}\cup\triangle_{acd}\cup\triangle_{bcd}\cup\{d\} }[/math],

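where all unions are disjoint. Then [math]\displaystyle{ |\triangle_{abc}|=|\triangle_{abd}|+|\triangle_{acd}|+|\triangle_{bcd}|+1 }[/math], which implies that [math]\displaystyle{ f(\{a,b,c\}), f(\{a,b,d\}), f(\{a,c,d\}), f(\{b,c,d\})\, }[/math] cannot all be equal: if they shared a common value [math]\displaystyle{ c\in\{0,1\} }[/math], then [math]\displaystyle{ c\equiv 3c+1\pmod 2 }[/math], i.e. [math]\displaystyle{ 2c+1\equiv 0\pmod 2 }[/math], which is impossible. This contradicts that all members of [math]\displaystyle{ {Y\choose 3} }[/math] have the same color.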

[math]\displaystyle{ \square }[/math]
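The coloring used in this proof is easy to compute. The sketch below implements it with a standard orientation test and checks the counting identity on a small point set. All function names and the five example points are hypothetical; the points are assumed to be in general position.

```python
from itertools import combinations

def orient(a, b, c):
    """Cross product of (b-a) and (c-a): > 0 if a, b, c turn counterclockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def inside(p, a, b, c):
    """True if p lies strictly inside triangle abc (general position assumed)."""
    s1, s2, s3 = orient(a, b, p), orient(b, c, p), orient(c, a, p)
    return (s1 > 0 and s2 > 0 and s3 > 0) or (s1 < 0 and s2 < 0 and s3 < 0)

def covered(X, a, b, c):
    """Number of points of X strictly inside triangle abc."""
    return sum(1 for p in X if p not in (a, b, c) and inside(p, a, b, c))

def f(X, a, b, c):
    """The 2-coloring from the proof: parity of the number of covered points."""
    return covered(X, a, b, c) % 2

def in_convex_position(Y):
    """True if no point of Y lies inside a triangle spanned by three others."""
    return not any(
        inside(p, a, b, c)
        for a, b, c in combinations(Y, 3)
        for p in Y if p not in (a, b, c)
    )

# Hypothetical point set in general position; d and e lie inside triangle abc.
a, b, c, d, e = (0, 0), (4, 0), (0, 4), (1, 1), (1, 2)
X = [a, b, c, d, e]
```

Here `d` lies inside the triangle `abc`, so the four triples drawn from `a, b, c, d` do not all receive the same color, while the four points `a, c, d, e` are in convex position, in line with the happy ending problem.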

Yao's lower bound on implicit data structures

Lemma
Let [math]\displaystyle{ n\ge 2 }[/math] be a power of 2 and [math]\displaystyle{ N\ge 2n }[/math]. Suppose the universe is [math]\displaystyle{ [N] }[/math] and the size of the data set is [math]\displaystyle{ n }[/math].
If the data structure is a sorted table, any search algorithm requires at least [math]\displaystyle{ \log n }[/math] accesses to the data structure in the worst case.
Proof.

We will show by an adversarial argument that [math]\displaystyle{ \log n }[/math] accesses are required to search for the key value [math]\displaystyle{ x=n }[/math] from the universe [math]\displaystyle{ [N]=\{1,2,\ldots,N\} }[/math]. The construction of the adversarial data set [math]\displaystyle{ S }[/math] is by induction on [math]\displaystyle{ n }[/math].

For [math]\displaystyle{ n=2 }[/math] and [math]\displaystyle{ N\ge 2n=4 }[/math], at least [math]\displaystyle{ \log n=1 }[/math] access is needed; in fact it is easy to see that two accesses are necessary.

Let [math]\displaystyle{ n\gt 2 }[/math]. Assume the induction hypothesis to be true for all smaller [math]\displaystyle{ n }[/math]; we will prove it for the size of data set [math]\displaystyle{ n }[/math], size of universe [math]\displaystyle{ N\ge 2n }[/math] and the search key [math]\displaystyle{ x=n }[/math].

Suppose that the first access position is [math]\displaystyle{ k }[/math]. The adversary chooses the table content [math]\displaystyle{ T[k] }[/math]. The adversary's strategy is:

[math]\displaystyle{ \begin{align} T[k]= \begin{cases} k & k\le \frac{n}{2},\\ N-(n-k) & k\gt \frac{n}{2}. \end{cases} \end{align} }[/math]

By symmetry, suppose it is the first case that [math]\displaystyle{ k\le \frac{n}{2} }[/math]. Then the key [math]\displaystyle{ x=n }[/math] may be in any position [math]\displaystyle{ i }[/math], where [math]\displaystyle{ n/2+1\le i\le n }[/math]. In fact, [math]\displaystyle{ T[ n/2+1] }[/math] through [math]\displaystyle{ T[n] }[/math] is a sorted table of size [math]\displaystyle{ n'=n/2 }[/math] which may contain any [math]\displaystyle{ n' }[/math]-subset of [math]\displaystyle{ \{n/2+1, n/2+2,\ldots,N\} }[/math], and hence, in particular, any [math]\displaystyle{ n' }[/math]-subset of the universe

[math]\displaystyle{ U'=\{n/2+1, n/2+2,\ldots,N-n/2\} }[/math].

The size [math]\displaystyle{ N' }[/math] of [math]\displaystyle{ U' }[/math] satisfies

[math]\displaystyle{ N'=N-n/2-n/2\ge 2n-n\ge 2n' }[/math],

and the desired key [math]\displaystyle{ n }[/math] has the relative value [math]\displaystyle{ x'=n- n/2=n' }[/math] in the universe [math]\displaystyle{ U' }[/math].

By the induction hypothesis, [math]\displaystyle{ \log n'=-1+\log n }[/math] more accesses will be required. Hence the total number of accesses is at least [math]\displaystyle{ 1+\log n'=\log n }[/math].

If the first access is [math]\displaystyle{ k\gt \frac{n}{2} }[/math], we symmetrically get that [math]\displaystyle{ T[1] }[/math] through [math]\displaystyle{ T[n/2] }[/math] is a sorted table of size [math]\displaystyle{ n'=n/2 }[/math] which may contain any [math]\displaystyle{ n' }[/math]-subset of the universe

[math]\displaystyle{ U'=\{n/2+1, n/2+2,\ldots,N-n/2\} }[/math].

The rest is the same.

[math]\displaystyle{ \square }[/math]
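
The adversary in this proof can be simulated. The sketch below is one possible reading of the strategy, applied recursively to the remaining open sub-table, and it is played against ordinary binary search. The class `Adversary`, the function `binary_search_probes`, and the handling of the last open cell are assumptions made for illustration.

```python
class Adversary:
    """Answers probes to a sorted table of size n over universe {1..N}.

    Follows the strategy in the proof for the search key x = n, where n is
    a power of 2 and N >= 2n. Positions are 1-indexed.
    """

    def __init__(self, n, N):
        self.plo, self.phi = 1, n   # positions whose content is still open
        self.ulo, self.uhi = 1, N   # universe available to those positions
        self.fixed = {}             # contents already committed

    def probe(self, k):
        if k in self.fixed:
            return self.fixed[k]
        m = self.phi - self.plo + 1  # size of the open sub-table
        j = k - self.plo + 1         # position of k inside it
        if m == 1:
            # Last open cell: answer with a value other than the key.
            self.fixed[k] = self.uhi
            self.plo = self.phi + 1
            return self.fixed[k]
        half = m // 2
        if j <= half:
            # Lower half holds the smallest values; the key must be higher up.
            for t in range(half):
                self.fixed[self.plo + t] = self.ulo + t
            self.plo += half
        else:
            # Upper half holds the largest values; the key must be lower down.
            for t in range(half):
                self.fixed[self.phi - t] = self.uhi - t
            self.phi -= half
        # The open sub-table now draws from the middle of the universe.
        self.ulo += half
        self.uhi -= half
        return self.fixed[k]

def binary_search_probes(n, N):
    """Count the probes binary search makes for x = n against the adversary."""
    adv, x = Adversary(n, N), n
    lo, hi, probes = 1, n, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        value = adv.probe(mid)
        probes += 1
        if value == x:
            break
        if value < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return probes
```

Each probe inside the open sub-table can only halve it, so the number of probes is never below [math]\displaystyle{ \log n }[/math]; for instance `binary_search_probes(8, 16)` uses 4 probes.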


We have seen that on a sorted table, there is no search algorithm outperforming the binary search in the worst case. Our question is:

Is there any other order than the increasing order, on which there is a better search algorithm?

An implicit data structure uses no extra space in addition to the original data set, so the data structure can only be represented implicitly by the order of the data items in the table. That is, each data set is stored as a permutation of the set. Formally, an implicit data structure is a function

[math]\displaystyle{ f:{U\choose n}\rightarrow[n!] }[/math],

where each [math]\displaystyle{ \pi\in[n!] }[/math] specifies a permutation of the sorted table. Thus, the sorted table is the simplest implicit data structure, in which [math]\displaystyle{ f(S) }[/math] is the identity for all [math]\displaystyle{ S\in{U\choose n} }[/math].

Ramsey-like Theorems

Van der Waerden's Theorem

Theorem (Van der Waerden 1927)
For every choice of positive integers [math]\displaystyle{ r }[/math] and [math]\displaystyle{ t }[/math], there exists an integer [math]\displaystyle{ W(r,t) }[/math] such that for every [math]\displaystyle{ r }[/math]-coloring of [math]\displaystyle{ [n] }[/math] where [math]\displaystyle{ n\ge W(r,t) }[/math], there exists a monochromatic arithmetic progression of length [math]\displaystyle{ t }[/math].
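
For very small parameters the theorem can be checked by exhaustive search. The sketch below, with hypothetical function names, tests all [math]\displaystyle{ r }[/math]-colorings of [math]\displaystyle{ [n] }[/math] and confirms the known value [math]\displaystyle{ W(2,3)=9 }[/math]; the search is exponential in [math]\displaystyle{ n }[/math] and is meant only as an illustration.

```python
from itertools import product

def has_mono_ap(coloring, t):
    """coloring[i] is the color of i+1; look for a monochromatic t-term AP."""
    n = len(coloring)
    for start in range(n):
        step = 1
        while start + (t - 1) * step < n:
            if len({coloring[start + i * step] for i in range(t)}) == 1:
                return True
            step += 1
    return False

def every_coloring_has_mono_ap(n, r, t):
    """Exhaustively test all r-colorings of [n]."""
    return all(has_mono_ap(c, t) for c in product(range(r), repeat=n))
```

With two colors and progressions of length 3, the check fails for `n = 8` and succeeds for `n = 9`.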

Hales–Jewett Theorem

Theorem (Hales-Jewett 1963)
Let [math]\displaystyle{ A }[/math] be a finite alphabet of [math]\displaystyle{ t }[/math] symbols and let [math]\displaystyle{ r }[/math] be a positive integer. Then there exists an integer [math]\displaystyle{ \mathrm{HJ}(r,t) }[/math] such that for every [math]\displaystyle{ r }[/math]-coloring of the cube [math]\displaystyle{ A^n }[/math] where [math]\displaystyle{ n\ge \mathrm{HJ}(r,t) }[/math], there exists a monochromatic combinatorial line.
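
The notes do not define a combinatorial line; the sketch below assumes the standard definition: fix a template in [math]\displaystyle{ (A\cup\{*\})^n }[/math] containing at least one wildcard, and take the [math]\displaystyle{ t }[/math] points obtained by substituting each symbol of [math]\displaystyle{ A }[/math] for all wildcards at once. The function names are hypothetical, and the alphabet is assumed not to contain the wildcard symbol.

```python
from itertools import product

WILDCARD = "*"

def combinatorial_lines(alphabet, n):
    """Yield every combinatorial line of alphabet^n as a tuple of points."""
    symbols = list(alphabet) + [WILDCARD]
    for template in product(symbols, repeat=n):
        if WILDCARD not in template:
            continue  # a template needs at least one wildcard
        yield tuple(
            tuple(a if s == WILDCARD else s for s in template)
            for a in alphabet
        )

def every_coloring_has_mono_line(alphabet, n, r):
    """Exhaustively test all r-colorings of the cube alphabet^n."""
    points = list(product(alphabet, repeat=n))
    lines = list(combinatorial_lines(alphabet, n))
    for colors in product(range(r), repeat=len(points)):
        col = dict(zip(points, colors))
        if not any(len({col[p] for p in line}) == 1 for line in lines):
            return False
    return True
```

The number of templates with at least one wildcard is [math]\displaystyle{ (t+1)^n-t^n }[/math], so for the alphabet `(0, 1, 2)` and `n = 2` the generator yields 7 lines. For the two-letter alphabet `(0, 1)` with two colors, the exhaustive check fails for `n = 1` and succeeds for `n = 2`.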
Theorem (Hales-Jewett 1963)
Let [math]\displaystyle{ A }[/math] be a finite alphabet of [math]\displaystyle{ t }[/math] symbols and let [math]\displaystyle{ m,r }[/math] be positive integers. Then there exists an integer [math]\displaystyle{ \mathrm{HJ}(m,r,t) }[/math] such that for every [math]\displaystyle{ r }[/math]-coloring of the cube [math]\displaystyle{ A^n }[/math] where [math]\displaystyle{ n\ge \mathrm{HJ}(m,r,t) }[/math], there exists a monochromatic combinatorial [math]\displaystyle{ m }[/math]-space.