组合数学 (Fall 2023)/Ramsey theory: Difference between revisions
(8 intermediate revisions by the same user not shown) | |||
Line 358: | Line 358: | ||
=== Linial's lower bound for local computation=== | === Linial's lower bound for local computation=== | ||
In the studies of '''local computation''' (initiated by [https://www.cs.huji.ac.il/~nati/PAPERS/locality_dist_graph_algs.pdf Linial] and [https://www.wisdom.weizmann.ac.il/~naor/PAPERS/lcl.pdf Naor and Stockmeyer]), people wants to answer questions like: | |||
::''Can locally defined problems be computed locally?'' | |||
In general, the answer is no to the above question. A famous example is Linial's lower bound for '''maximal independent set''' ('''MIS''') in a ring. | |||
Consider a very simple distributed network, a ring that contains <math>n</math> nodes, where each node is assigned a unique ID from <math>[n]=\{1,2,\ldots, n\}</math>. The labeled network is then described by a tuple <math>(a_1,a_2,\ldots,a_n)</math> of IDs, which is a permutation of <math>[n]</math>, where <math>a_i\in [n]</math> gives the ID of the <math>i</math>th node in the ring. | |||
In a distributed algorithm, in each round, every node communicates with its 2 neighbors in the ring, and when the algorithm terminates, each node returns its local output. For example, in the MIS problem, the goal of the algorithm is to construct a maximal independent set: upon termination, each node returns a bit to indicate whether the node is in the constructed independent set. And the output gives a correct MIS as long as it satisfies both the followings: | |||
* there are no two consecutive nodes in the ring both outputting 1; | |||
* there are no three consecutive nodes in the ring all outputting 0. | |||
This is clearly a locally defined problem. In fact, it is a constraint satisfaction problem (CSP) where each constraint only involves 1-local or 2-local neighborhood. | |||
We are interested in the distributed algorithms that can always produce the correct answer, and want to prove a lower bound for the number of rounds required in the worst case by such distributed algorithms. | |||
As a local distributed algorithm, each node <math>i</math> initially does not know anything beyond its local information, which is just its own ID <math>a_i\in [n]</math>. | |||
And after <math>t</math> rounds, information-theoretically, each node <math>i</math> can at best know all information within its <math>t</math>-local neighborhood, which is represented by the <math>(2t-1)</math>-tuple <math>(a_{i-t},\ldots,a_{i-1},a_i,a_{i+1},\ldots a_{i+t})</math>, with the addition/subtraction modulo <math>n</math> along the ring. | |||
This suggests us to define such a computational model for local distributed algorithms: any <math>t</math>-round local algorithm is described by a function | |||
:<math>\mathcal{L}:[n]^{2t+1}\to\{0,1\}</math>. | |||
For the <math>i</math>th node in the ring, its output is given by <math>\mathcal{L}(a_{i-t},\ldots,a_{i-1},a_i,a_{i+1},\ldots a_{i+t})</math>, where <math>a_j</math> represents the ID of the <math>j</math>th node in the ring, with the addition/subtraction modulo <math>n</math>. | |||
As a correct algorithm for constructing MIS, it must hold that any three consecutive nodes can never output the same value. We then have the following lower bound for <math>t</math> for such algorithms. | |||
{{Theorem|Theorem (Linial 1992)| | |||
:For any <math>t</math>-round local algorithm for maximal independent set (MIS) on a ring of <math>n</math> nodes, it holds that | |||
:: <math>t=\Omega(\log^*n)</math> | |||
:where <math>\log^*n</math> represents the [https://en.wikipedia.org/wiki/Iterated_logarithm iterated logarithm], which is the number of times the logarithm function must be iteratively applied to <math>n</math> before the result is less than or equal to 1. | |||
}} | |||
This lower bound shows that even on very simple network like ring, some very basic locally defined problem (MIS) cannot be computed locally (within constant locality). | |||
The original proof of Linial relies on chromatic number of so-called neighborhood graphs. Here we give an alternative proof based on Ramsey theorem found by Baruch Awerbuch. | |||
{{Proof| | |||
As we discussed earlier, any <math>t</math>-round local algorithm can be represented by a mapping | |||
:<math>\mathcal{L}:[n]^{2t+1}\to\{0,1\}</math>. | |||
It can naturally defines a 2-coloring: | |||
:<math>f:{[n]\choose {2t+1}}\to\{0,1\}</math> | |||
by the following construction: for any <math>\{a_1,a_2,\ldots,a_{2t+1}\}\subseteq [n]</math>, where <math>a_1<a_2<\cdots<a_{2t+1}</math>, we define | |||
:<math>f(\{a_1,a_2,\ldots,a_{2t+1}\})=\mathcal{L}(a_1,a_2,\ldots,a_{2t+1})</math>. | |||
By Ramsey theorem, for <math>n\ge R_{2t+1}(2;2t+3,2t+3)</math>, there exists a subset <math>\{a_1,a_2,\ldots,a_{2t+3}\}\subseteq [n]</math>, where <math>a_1<a_2<\cdots<a_{2t+3}</math>, such that | |||
:<math>\left|f\left({S\choose {2t+1}}\right)\right|=1</math>. | |||
By our construction of <math>f</math>, this means | |||
:<math>\mathcal{L}(a_1,a_2,\ldots,a_{2t+1})=\mathcal{L}(a_2,a_3,\ldots,a_{2t+2})=\mathcal{L}(a_3,a_4,\ldots,a_{2t+3})</math>, | |||
which contradicts to that the output of <math>\mathcal{L}</math> should indicate an MIS, on any ring with <math>2t+3</math> consecutive nodes labeled as <math>(a_1,a_2,\ldots,a_{2t+3})</math>, because on such rings, there would be 3 consecutive nodes with the same output bit. | |||
Therefore, any <math>t</math>-round local algorithm that can always correctly produce an MIS on a ring, must satisfies that | |||
:<math>n<R_{2t+1}(2;2t+3,2t+3)\le \underbrace{2^{2^{\unicode{x22F0}^{2}}}}_{ct}</math>, | |||
for some constant <math>c>0</math>, whose inverse function gives the lower bound | |||
:<math>t=\Omega(\log^*n)</math>. | |||
}} |
Latest revision as of 12:14, 26 May 2023
Ramsey's Theorem
Ramsey's theorem for graph
Ramsey's Theorem - Let [math]\displaystyle{ k,\ell }[/math] be positive integers. Then there exists an integer [math]\displaystyle{ R(k,\ell) }[/math] satisfying:
- If [math]\displaystyle{ n\ge R(k,\ell) }[/math], for any coloring of edges of [math]\displaystyle{ K_n }[/math] with two colors red and blue, there exists a red [math]\displaystyle{ K_k }[/math] or a blue [math]\displaystyle{ K_\ell }[/math].
Proof. We show that [math]\displaystyle{ R(k,\ell) }[/math] is finite by induction on [math]\displaystyle{ k+\ell }[/math]. For the base case, it is easy to verify that
- [math]\displaystyle{ R(k,1)=R(1,\ell)=1 }[/math].
For general [math]\displaystyle{ k }[/math] and [math]\displaystyle{ \ell }[/math], we will show that
- [math]\displaystyle{ R(k,\ell)\le R(k,\ell-1)+R(k-1,\ell) }[/math].
Suppose we have a two coloring of [math]\displaystyle{ K_n }[/math], where [math]\displaystyle{ n=R(k,\ell-1)+R(k-1,\ell) }[/math]. Take an arbitrary vertex [math]\displaystyle{ v }[/math], and split [math]\displaystyle{ V\setminus\{v\} }[/math] into two subsets [math]\displaystyle{ S }[/math] and [math]\displaystyle{ T }[/math], where
- [math]\displaystyle{ \begin{align} S&=\{u\in V\setminus\{v\}\mid uv \text{ is blue }\}\\ T&=\{u\in V\setminus\{v\}\mid uv \text{ is red }\} \end{align} }[/math]
Since
- [math]\displaystyle{ |S|+|T|+1=n=R(k,\ell-1)+R(k-1,\ell) }[/math],
we have either [math]\displaystyle{ |S|\ge R(k,\ell-1) }[/math] or [math]\displaystyle{ |T|\ge R(k-1,\ell) }[/math]. By symmetry, suppose [math]\displaystyle{ |S|\ge R(k,\ell-1) }[/math]. By induction hypothesis, the complete subgraph defined on [math]\displaystyle{ S }[/math] has either a red [math]\displaystyle{ K_k }[/math], in which case we are done; or a blue [math]\displaystyle{ K_{\ell-1} }[/math], in which case the complete subgraph defined on [math]\displaystyle{ S\cup{v} }[/math] must have a blue [math]\displaystyle{ K_\ell }[/math] since all edges from [math]\displaystyle{ v }[/math] to vertices in [math]\displaystyle{ S }[/math] are blue.
- [math]\displaystyle{ \square }[/math]
Ramsey's Theorem (graph, multicolor) - Let [math]\displaystyle{ r, k_1,k_2,\ldots,k_r }[/math] be positive integers. Then there exists an integer [math]\displaystyle{ R(r;k_1,k_2,\ldots,k_r) }[/math] satisfying:
- For any [math]\displaystyle{ r }[/math]-coloring of a complete graph of [math]\displaystyle{ n\ge R(r;k_1,k_2,\ldots,k_r) }[/math] vertices, there exists a monochromatic [math]\displaystyle{ k_i }[/math]-clique with the [math]\displaystyle{ i }[/math]th color for some [math]\displaystyle{ i\in\{1,2,\ldots,r\} }[/math].
Lemma (the "mixing color" trick) - [math]\displaystyle{ R(r;k_1,k_2,\ldots,k_r)\le R(r-1;k_1,k_2,\ldots,k_{r-2},R(2;k_{r-1},k_r)) }[/math]
Proof. We transfer the [math]\displaystyle{ r }[/math]-coloring to [math]\displaystyle{ (r-1) }[/math]-coloring by identifying the [math]\displaystyle{ (r-1) }[/math]th and the [math]\displaystyle{ r }[/math]th colors.
If [math]\displaystyle{ n\ge R(r-1;k_1,k_2,\ldots,k_{r-2},R(2;k_{r-1},k_r)) }[/math], then for any [math]\displaystyle{ r }[/math]-coloring of [math]\displaystyle{ K_n }[/math], there either exist an [math]\displaystyle{ i\in\{1,2,\ldots,r-2\} }[/math] and a [math]\displaystyle{ k_i }[/math]-clique which is monochromatically colored with the [math]\displaystyle{ i }[/math]th color; or exists clique of [math]\displaystyle{ R(2;k_{r-1},k_r) }[/math] vertices which is monochromatically colored with the mixed color of the original [math]\displaystyle{ (r-1) }[/math]th and [math]\displaystyle{ r }[/math]th colors, which again implies that there exists either a [math]\displaystyle{ k }[/math]-clique which is monochromatically colored with the original [math]\displaystyle{ (r-1) }[/math]th color, or a [math]\displaystyle{ \ell }[/math]-clique which is monochromatically colored with the original [math]\displaystyle{ r }[/math]th color. This implies the recursion.
- [math]\displaystyle{ \square }[/math]
Ramsey number
The smallest number [math]\displaystyle{ R(k,\ell) }[/math] satisfying the condition in the Ramsey theory is called the Ramsey number.
Alternatively, we can define [math]\displaystyle{ R(k,\ell) }[/math] as the smallest [math]\displaystyle{ N }[/math] such that if [math]\displaystyle{ n\ge N }[/math], for any 2-coloring of [math]\displaystyle{ K_n }[/math] in red and blue, there is either a red [math]\displaystyle{ K_k }[/math] or a blue [math]\displaystyle{ K_\ell }[/math]. The Ramsey theorem is stated as:
- "[math]\displaystyle{ R(k,\ell) }[/math] is finite for any positive integers [math]\displaystyle{ k }[/math] and [math]\displaystyle{ \ell }[/math]."
The core of the inductive proof of the Ramsey theorem is the following recursion
- [math]\displaystyle{ \begin{align} R(k,1) &=R(1,\ell)=1\\ R(k,\ell) &\le R(k,\ell-1)+R(k-1,\ell). \end{align} }[/math]
From this recursion, we can deduce an upper bound for the Ramsey number.
Theorem - [math]\displaystyle{ R(k,\ell)\le{k+\ell-2\choose k-1} }[/math].
Proof. It is easy to verify the bound by induction.
- [math]\displaystyle{ \square }[/math]
The following theorem is due to Spencer in 1975, which is the best known lower bound for diagonal Ramsey number.
Theorem (Spencer 1975) - [math]\displaystyle{ R(k,k)\ge Ck2^{k/2} }[/math] for some constant [math]\displaystyle{ C\gt 0 }[/math].
Its proof uses the Lovász local lemma in the probabilistic method.
Lovász Local Lemma (symmetric case) - Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a set of events, and assume that the following hold:
- for all [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ \Pr[A_i]\le p }[/math];
- each event [math]\displaystyle{ A_i }[/math] is independent of all but at most [math]\displaystyle{ d }[/math] other events, and
- [math]\displaystyle{ ep(d+1)\le 1 }[/math].
- Then
- [math]\displaystyle{ \Pr\left[\bigwedge_{i=1}^n\overline{A_i}\right]\gt 0 }[/math].
- Let [math]\displaystyle{ A_1,A_2,\ldots,A_n }[/math] be a set of events, and assume that the following hold:
We can use the local lemma to prove a lower bound for the diagonal Ramsey number.
Proof. To prove a lower bound [math]\displaystyle{ R(k,k)\gt n }[/math], it is sufficient to show that there exists a 2-coloring of [math]\displaystyle{ K_n }[/math] without a monochromatic [math]\displaystyle{ K_k }[/math]. We prove this by the probabilistic method.
Pick a random 2-coloring of [math]\displaystyle{ K_n }[/math] by coloring each edge uniformly and independently with one of the two colors. For any set [math]\displaystyle{ S }[/math] of [math]\displaystyle{ k }[/math] vertices, let [math]\displaystyle{ A_S }[/math] denote the event that [math]\displaystyle{ S }[/math] forms a monochromatic [math]\displaystyle{ K_k }[/math]. It is easy to see that [math]\displaystyle{ \Pr[A_s]=2^{1-{k\choose 2}}=p }[/math].
For any [math]\displaystyle{ k }[/math]-subset [math]\displaystyle{ T }[/math] of vertices, [math]\displaystyle{ A_S }[/math] and [math]\displaystyle{ A_T }[/math] are dependent if and only if [math]\displaystyle{ |S\cap T|\ge 2 }[/math]. For each [math]\displaystyle{ S }[/math], the number of [math]\displaystyle{ T }[/math] that [math]\displaystyle{ |S\cap T|\ge 2 }[/math] is at most [math]\displaystyle{ {k\choose 2}{n\choose k-2} }[/math], so the max degree of the dependency graph is [math]\displaystyle{ d\le{k\choose 2}{n\choose k-2} }[/math].
Take [math]\displaystyle{ n=Ck2^{k/2} }[/math] for some appropriate constant [math]\displaystyle{ C\gt 0 }[/math].
- [math]\displaystyle{ \begin{align} \mathrm{e}p(d+1) &\le \mathrm{e}2^{1-{k\choose 2}}\left({k\choose 2}{n\choose k-2}+1\right)\\ &\le 2^{3-{k\choose 2}}{k\choose 2}{n\choose k-2}\\ &\le 1 \end{align} }[/math]
Applying the local lemma, the probability that there is no monochromatic [math]\displaystyle{ K_k }[/math] is
- [math]\displaystyle{ \Pr\left[\bigwedge_{S\in{[n]\choose k}}\overline{A_S}\right]\gt 0 }[/math].
Therefore, there exists a 2-coloring of [math]\displaystyle{ K_n }[/math] which has no monochromatic [math]\displaystyle{ K_k }[/math], which means
- [math]\displaystyle{ R(k,k)\gt n=Ck2^{k/2} }[/math].
- [math]\displaystyle{ \square }[/math]
Theorem - [math]\displaystyle{ \Omega\left(k2^{k/2}\right)\le R(k,k)\le{2k-2\choose k-1}=O\left(k^{-1/2}4^{k}\right) }[/math].
[math]\displaystyle{ k }[/math],[math]\displaystyle{ l }[/math] | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
2 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
3 | 1 | 3 | 6 | 9 | 14 | 18 | 23 | 28 | 36 | 40–43 |
4 | 1 | 4 | 9 | 18 | 25 | 35–41 | 49–61 | 56–84 | 73–115 | 92–149 |
5 | 1 | 5 | 14 | 25 | 43–49 | 58–87 | 80–143 | 101–216 | 125–316 | 143–442 |
6 | 1 | 6 | 18 | 35–41 | 58–87 | 102–165 | 113–298 | 127–495 | 169–780 | 179–1171 |
7 | 1 | 7 | 23 | 49–61 | 80–143 | 113–298 | 205–540 | 216–1031 | 233–1713 | 289–2826 |
8 | 1 | 8 | 28 | 56–84 | 101–216 | 127–495 | 216–1031 | 282–1870 | 317–3583 | 317-6090 |
9 | 1 | 9 | 36 | 73–115 | 125–316 | 169–780 | 233–1713 | 317–3583 | 565–6588 | 580–12677 |
10 | 1 | 10 | 40–43 | 92–149 | 143–442 | 179–1171 | 289–2826 | 317-6090 | 580–12677 | 798–23556 |
Ramsey's theorem for hypergraph
Ramsey's Theorem (hypergraph, multicolor) - Let [math]\displaystyle{ r, t, k_1,k_2,\ldots,k_r }[/math] be positive integers. Then there exists an integer [math]\displaystyle{ R_t(r;k_1,k_2,\ldots,k_r) }[/math] satisfying:
- For any [math]\displaystyle{ r }[/math]-coloring of [math]\displaystyle{ {[n]\choose t} }[/math] with [math]\displaystyle{ n\ge R_t(r;k_1,k_2,\ldots,k_r) }[/math], there exist an [math]\displaystyle{ i\in\{1,2,\ldots,r\} }[/math] and a subset [math]\displaystyle{ X\subseteq [n] }[/math] with [math]\displaystyle{ |X|\ge k_i }[/math] such that all members of [math]\displaystyle{ {X\choose t} }[/math] are colored with the [math]\displaystyle{ i }[/math]th color.
[math]\displaystyle{ n\rightarrow(k_1,k_2,\ldots,k_r)^t }[/math]
Lemma (the "mixing color" trick) - [math]\displaystyle{ R_t(r;k_1,k_2,\ldots,k_r)\le R_t(r-1;k_1,k_2,\ldots,k_{r-2},R_t(2;k_{r-1},k_r)) }[/math]
It is then sufficient to prove the Ramsey's theorem for the two-coloring of a hypergraph, that is, to prove [math]\displaystyle{ R_t(k,\ell)=R_t(2;k,\ell) }[/math] is finite.
Lemma - [math]\displaystyle{ R_t(k,\ell)\le R_{t-1}(R_t(k-1,\ell),R_t(k,\ell-1))+1 }[/math]
Proof. Let [math]\displaystyle{ n=R_{t-1}(R_t(k-1,\ell),R_t(k,\ell-1))+1 }[/math]. Denote [math]\displaystyle{ [n]=\{1,2,\ldots,n\} }[/math].
Let [math]\displaystyle{ f:{[n]\choose t}\rightarrow\{{\color{red}\text{red}},{\color{blue}\text{blue}}\} }[/math] be an arbitrary 2-coloring of [math]\displaystyle{ {[n]\choose t} }[/math]. It is then sufficient to show that there either exists an [math]\displaystyle{ X\subseteq[n] }[/math] with [math]\displaystyle{ |X|=k }[/math] such that all members of [math]\displaystyle{ {X\choose t} }[/math] are colored red by [math]\displaystyle{ f }[/math]; or exists an [math]\displaystyle{ X\subseteq[n] }[/math] with [math]\displaystyle{ |X|=\ell }[/math] such that all members of [math]\displaystyle{ {X\choose t} }[/math] are colored blue by [math]\displaystyle{ f }[/math].
We remove [math]\displaystyle{ n }[/math] from [math]\displaystyle{ [n] }[/math] and define a new coloring [math]\displaystyle{ f' }[/math] of [math]\displaystyle{ {[n-1]\choose t-1} }[/math] by
- [math]\displaystyle{ f'(A)=f(A\cup\{n\}) }[/math] for any [math]\displaystyle{ A\in{[n-1]\choose t-1} }[/math].
By the choice of [math]\displaystyle{ n }[/math] and by symmetry, there exists a subset [math]\displaystyle{ S\subseteq[n-1] }[/math] with [math]\displaystyle{ |X|=R_t(k-1,\ell) }[/math] such that all members of [math]\displaystyle{ {S\choose t-1} }[/math] are colored with red by [math]\displaystyle{ f' }[/math]. Then there either exists an [math]\displaystyle{ X\subseteq S }[/math] with [math]\displaystyle{ |X|=\ell }[/math] such that [math]\displaystyle{ {X\choose t} }[/math] is colored all blue by [math]\displaystyle{ f }[/math], in which case we are done; or exists an [math]\displaystyle{ X\subseteq S }[/math] with [math]\displaystyle{ |X|=k-1 }[/math] such that [math]\displaystyle{ {X\choose t} }[/math] is colored all red by [math]\displaystyle{ f }[/math]. Next we prove that in the later case [math]\displaystyle{ {X\cup{n}\choose t} }[/math] is all red, which will close our proof. Since all [math]\displaystyle{ A\in{S\choose t-1} }[/math] are colored with red by [math]\displaystyle{ f' }[/math], then by our definition of [math]\displaystyle{ f' }[/math], [math]\displaystyle{ f(A\cup\{n\})={\color{red}\text{red}} }[/math] for all [math]\displaystyle{ A\in {X\choose t-1}\subseteq{S\choose t-1} }[/math]. Recalling that [math]\displaystyle{ {X\choose t} }[/math] is colored all red by [math]\displaystyle{ f }[/math], [math]\displaystyle{ {X\cup\{n\}\choose t} }[/math] is colored all red by [math]\displaystyle{ f }[/math] and we are done.
- [math]\displaystyle{ \square }[/math]
Applications of Ramsey Theorem
The "Happy Ending" problem
The happy ending problem - Any set of 5 points in the plane, no three on a line, has a subset of 4 points that form the vertices of a convex quadrilateral.
See the article [1] for the proof.
We say a set of points in the plane in general positions if no three of the points are on the same line.
Theorem (Erdős-Szekeres 1935) - For any positive integer [math]\displaystyle{ m\ge 3 }[/math], there is an [math]\displaystyle{ N(m) }[/math] such that any set of at least [math]\displaystyle{ N(m) }[/math] points in general position in the plane (i.e., no three of the points are on a line) contains [math]\displaystyle{ m }[/math] points that are the vertices of a convex [math]\displaystyle{ m }[/math]-gon.
Proof. Let [math]\displaystyle{ N(m)=R_3(m,m) }[/math]. For [math]\displaystyle{ n\ge N(m) }[/math], let [math]\displaystyle{ X }[/math] be an arbitrary set of [math]\displaystyle{ n }[/math] points in the plane, no three of which are on a line. Define a 2-coloring of the 3-subsets of points [math]\displaystyle{ f:{X\choose 3}\rightarrow\{0,1\} }[/math] as follows: for any [math]\displaystyle{ \{a,b,c\}\in{X\choose 3} }[/math], let [math]\displaystyle{ \triangle_{abc}\subset X }[/math] be the set of points covered by the triangle [math]\displaystyle{ abc }[/math]; and [math]\displaystyle{ f(\{a,b,c\})=|\triangle_{abc}|\bmod 2 }[/math], that is, [math]\displaystyle{ f(\{a,b,c\}) }[/math] indicates the oddness of the number of points covered by the triangle [math]\displaystyle{ abc }[/math].
Since [math]\displaystyle{ |X|\ge R_3(m,m) }[/math], there exists a [math]\displaystyle{ Y\subseteq X }[/math] such that [math]\displaystyle{ |Y|=m }[/math] and all members of [math]\displaystyle{ {Y\choose 3} }[/math] are colored with the same value by [math]\displaystyle{ f }[/math].
We claim that the [math]\displaystyle{ m }[/math] points in [math]\displaystyle{ Y }[/math] are the vertices of a convex [math]\displaystyle{ m }[/math]-gon. If otherwise, by the definition of convexity, there exist [math]\displaystyle{ \{a,b,c,d\}\subseteq Y }[/math] such that [math]\displaystyle{ d\in\triangle_{abc} }[/math]. Since no three points are in the same line,
- [math]\displaystyle{ \triangle_{abc}=\triangle_{abd}\cup\triangle_{acd}\cup\triangle_{bcd}\cup\{d\} }[/math],
where all unions are disjoint. Then [math]\displaystyle{ |\triangle_{abc}|=|\triangle_{abd}|+|\triangle_{acd}|+|\triangle_{bcd}|+1 }[/math], which implies that [math]\displaystyle{ f(\{a,b,c\}), f(\{a,b,d\}), f(\{a,c,d\}), f(\{b,c,d\})\, }[/math] cannot be equal, contradicting that all members of [math]\displaystyle{ {Y\choose 3} }[/math] have the same color.
- [math]\displaystyle{ \square }[/math]
Yao's lower bound for implicit data structures
We consider the following fundamental problem of membership query.
Membership Query - Input: A data set [math]\displaystyle{ S\subset U }[/math] of size [math]\displaystyle{ n }[/math], where [math]\displaystyle{ U }[/math] is a data universe of size [math]\displaystyle{ N }[/math].
- Query: a data item (also called a key) [math]\displaystyle{ x\in U }[/math].
- Answer: Whether [math]\displaystyle{ x\in S }[/math].
This is a basic problem for data structures. People want to design efficient data structures to store the data set [math]\displaystyle{ S }[/math] so that the query "Is [math]\displaystyle{ x\in S }[/math]?" can be efficiently answered by accessing the data structure as little as possible in the worst case.
A sorted table for a data set [math]\displaystyle{ S\subset [N] }[/math] is a natural data structure in which the elements of [math]\displaystyle{ S=\{x_1,x_2,\ldots,x_n\} }[/math], where [math]\displaystyle{ x_1\lt x_2\lt \cdots\lt x_n }[/math], are stored in an array, one element [math]\displaystyle{ x_i }[/math] in each entry, in the increasing order. For a sorted table, the membership query problem can be solved via binary search within [math]\displaystyle{ \Omega(\log_2 n) }[/math] memory accesses in the worst case. The following fundamental result of Andrew Chi-Chih Yao (姚期智) shows that this is the best possible for sorted tables. The proof is an elegant application of the adversarial argument(对手论证).
Lemma (Yao 1981) - Suppose that [math]\displaystyle{ n\ge 2 }[/math], [math]\displaystyle{ N\ge 2n-1 }[/math], the data universe is [math]\displaystyle{ U=[N] }[/math], and the size of the data set is [math]\displaystyle{ n }[/math].
- If the data structure is a sorted table, any search algorithm requires at least [math]\displaystyle{ \lceil\log_2 (n+1)\rceil }[/math] accesses to the data structure in the worst case.
Proof. We will show by an adversarial argument that [math]\displaystyle{ \lceil\log_2 (n+1)\rceil }[/math] accesses are required to search for the key value [math]\displaystyle{ x=n }[/math] in the universe [math]\displaystyle{ [N]=\{1,2,\ldots,N\} }[/math]. The construction of the adversarial data set [math]\displaystyle{ S }[/math] is by induction on [math]\displaystyle{ n }[/math].
For [math]\displaystyle{ n=2 }[/math] and [math]\displaystyle{ N\ge 2n-1=3 }[/math] it is easy to see that 2 memory accesses are required to make sure whether the key value [math]\displaystyle{ n=2 }[/math] presents in a sorted table containing 2 keys out of a data universe of size 3, in the worst case.
Let [math]\displaystyle{ n\gt 2 }[/math]. Assume the induction hypothesis for all smaller [math]\displaystyle{ n }[/math]. We will prove it for the size of data set [math]\displaystyle{ n }[/math], size of universe [math]\displaystyle{ N\ge 2n-1 }[/math] and the search key [math]\displaystyle{ x=n }[/math].
Suppose that the first access position is [math]\displaystyle{ k }[/math]. The adversary chooses the table content [math]\displaystyle{ T[k] }[/math]. The adversary's strategy is:
- [math]\displaystyle{ \begin{align} T[k]= \begin{cases} k & k\le \frac{n}{2},\\ N-(n-k) & k\gt \frac{n}{2}. \end{cases} \end{align} }[/math]
By symmetry, suppose it is the first case that [math]\displaystyle{ k\le \frac{n}{2} }[/math]. Then the key [math]\displaystyle{ x=n }[/math] may be in any position [math]\displaystyle{ i }[/math], where [math]\displaystyle{ \frac{n}{2}+1\le i\le n }[/math]. In fact, [math]\displaystyle{ T\left[ \left\lceil \frac{n}{2}\right\rceil +1\right] }[/math] through [math]\displaystyle{ T[n] }[/math] is a sorted table of size [math]\displaystyle{ n'=\left\lfloor \frac{n}{2}\right\rfloor }[/math] which may contain any [math]\displaystyle{ n' }[/math]-subset of [math]\displaystyle{ \left\{\left\lceil \frac{n}{2}\right\rceil+1, \left\lceil \frac{n}{2}\right\rceil+2,\ldots,N\right\} }[/math], and hence, in particular, any [math]\displaystyle{ n' }[/math]-subset of the new universe
- [math]\displaystyle{ U'=\left\{\left\lceil \frac{n}{2}\right\rceil+1, \left\lceil \frac{n}{2}\right\rceil+2,\ldots,N-\left\lceil \frac{n}{2}\right\rceil\right\} }[/math].
The size [math]\displaystyle{ N' }[/math] of [math]\displaystyle{ U' }[/math] satisfies
- [math]\displaystyle{ N'=N-2\left\lceil \frac{n}{2}\right\rceil\ge 2(n-1)-2\left\lceil \frac{n}{2}\right\rceil \ge 2\left\lfloor \frac{n}{2}\right\rfloor-1= 2n'-1 }[/math],
and the desired key [math]\displaystyle{ n }[/math] has the relative value [math]\displaystyle{ x'=n- \left\lceil \frac{n}{2}\right\rceil=\left\lfloor \frac{n}{2}\right\rfloor=n' }[/math] in the new universe [math]\displaystyle{ U' }[/math].
By the induction hypothesis, [math]\displaystyle{ \lceil\log_2 (n'+1)\rceil }[/math] more memory accesses will be required. Hence the total number of memory accesses is at least
- [math]\displaystyle{ 1+\lceil\log_2 (n'+1)\rceil=1+\left\lceil\log_2 \left(\left\lfloor \frac{n}{2}\right\rfloor+1\right)\right\rceil\ge \lceil\log_2 (n+1)\rceil }[/math].
If the first access is [math]\displaystyle{ k\gt \frac{n}{2} }[/math], we symmetrically get that [math]\displaystyle{ T[1] }[/math] through [math]\displaystyle{ T\left[\left\lfloor \frac{n}{2}\right\rfloor\right] }[/math] is a sorted table of size [math]\displaystyle{ n'=\left\lfloor \frac{n}{2}\right\rfloor }[/math] which may contain any [math]\displaystyle{ n' }[/math]-subset of the universe
- [math]\displaystyle{ U'=\left\{\left\lceil \frac{n}{2}\right\rceil+1, \left\lceil \frac{n}{2}\right\rceil+2,\ldots,N-\left\lceil \frac{n}{2}\right\rceil\right\} }[/math].
The rest is the same as before. This completes the induction.
- [math]\displaystyle{ \square }[/math]
We have seen that on a sorted table, there is no search algorithm outperforming the binary search in the worst case. Our question is:
- Is there any other order than the increasing order, on which there is a better search algorithm?
An implicit data structure use no extra space in addition to the original data set, thus a data structure can only be represented implicitly by the order of the data items in the table. That is, each data set is stored as a permutation of the set. Formally, an implicit data structure is described by a function
- [math]\displaystyle{ f:{U\choose n}\rightarrow[n!] }[/math],
where each [math]\displaystyle{ \pi\in[n!] }[/math] specify a permutation of the sorted table, and a data set [math]\displaystyle{ S=\{x_1,x_2,\ldots,x_n\} }[/math] where [math]\displaystyle{ x_1\lt x_2\lt \cdots\lt x_n }[/math] is stored as an array [math]\displaystyle{ (x_{\pi(1)},x_{\pi(2)},\ldots,x_{\pi(n)}\} }[/math] where [math]\displaystyle{ \pi=f(S) }[/math].
Thus, the sorted table is the simplest implicit data structure, in which [math]\displaystyle{ f(S) }[/math] always gives the identity permutation for all [math]\displaystyle{ S\in{U\choose n} }[/math]. We observe that if [math]\displaystyle{ f }[/math] maps all data sets [math]\displaystyle{ S\in{U\choose n} }[/math] to the same permutation [math]\displaystyle{ \pi }[/math], then the data structure is equivalent to the sorted table, under the bijection that the [math]\displaystyle{ i }[/math]th entry in the sorted table corresponds to the [math]\displaystyle{ \pi(i) }[/math]th entry of the actual array, where the same [math]\displaystyle{ \Omega(\log_2 n) }[/math] lower bound applies.
This observation can be generalized and made formal as follows.
Observation - If there is a sub-universe [math]\displaystyle{ X\subseteq U }[/math] such that for every data set [math]\displaystyle{ S\in {X\choose n} }[/math], the implicit data structure [math]\displaystyle{ f(S)=\pi }[/math] stores the data set using the the same permutation [math]\displaystyle{ \pi }[/math], i.e.
- [math]\displaystyle{ f\left({X\choose n}\right)=\{\pi\} }[/math]
- then this implicit data structure is equivalent to the sorted table for all data sets from the new universe [math]\displaystyle{ X }[/math], under the bijection that the [math]\displaystyle{ i }[/math]th entry in the sorted table corresponds to the [math]\displaystyle{ \pi(i) }[/math]th entry of the array.
- Therefore, if [math]\displaystyle{ |X|\ge 2n }[/math], then the same [math]\displaystyle{ \Omega(\log_2 n) }[/math] lower bound for searching in a sorted table applies.
- If there is a sub-universe [math]\displaystyle{ X\subseteq U }[/math] such that for every data set [math]\displaystyle{ S\in {X\choose n} }[/math], the implicit data structure [math]\displaystyle{ f(S)=\pi }[/math] stores the data set using the the same permutation [math]\displaystyle{ \pi }[/math], i.e.
Due to Ramsey theorem, for sufficiently large [math]\displaystyle{ N }[/math] which satisfies [math]\displaystyle{ N\ge R_{n}(n!;2n) }[/math], for any [math]\displaystyle{ f:{U\choose n}\rightarrow[n!] }[/math], there is an [math]\displaystyle{ X\subseteq U }[/math] of size [math]\displaystyle{ |X|\ge 2n }[/math], such that [math]\displaystyle{ \left|f\left({X\choose n}\right)\right|=1 }[/math], which guarantees the existence of the sub-universe [math]\displaystyle{ X\subseteq U }[/math] required in the above observation for (wildly) large universe sizes [math]\displaystyle{ N }[/math], which implies the following lower bound for implicit data structures.
Theorem (Yao 1981) - Suppose that [math]\displaystyle{ n\ge 2 }[/math], and [math]\displaystyle{ N\ge 2n }[/math], the data universe is [math]\displaystyle{ U=[N] }[/math], and the size of the data set is [math]\displaystyle{ n }[/math].
- For any implicit data structure, if [math]\displaystyle{ N }[/math] is sufficiently large, then any search algorithm requires at least [math]\displaystyle{ \lfloor\log_2 n\rfloor }[/math] accesses to the data structure in the worst case.
Linial's lower bound for local computation
In the studies of local computation (initiated by Linial and Naor and Stockmeyer), people wants to answer questions like:
- Can locally defined problems be computed locally?
In general, the answer is no to the above question. A famous example is Linial's lower bound for maximal independent set (MIS) in a ring.
Consider a very simple distributed network, a ring that contains [math]\displaystyle{ n }[/math] nodes, where each node is assigned a unique ID from [math]\displaystyle{ [n]=\{1,2,\ldots, n\} }[/math]. The labeled network is then described by a tuple [math]\displaystyle{ (a_1,a_2,\ldots,a_n) }[/math] of IDs, which is a permutation of [math]\displaystyle{ [n] }[/math], where [math]\displaystyle{ a_i\in [n] }[/math] gives the ID of the [math]\displaystyle{ i }[/math]th node in the ring.
In a distributed algorithm, in each round, every node communicates with its 2 neighbors in the ring, and when the algorithm terminates, each node returns its local output. For example, in the MIS problem, the goal of the algorithm is to construct a maximal independent set: upon termination, each node returns a bit to indicate whether the node is in the constructed independent set. And the output gives a correct MIS as long as it satisfies both the followings:
- there are no two consecutive nodes in the ring both outputting 1;
- there are no three consecutive nodes in the ring all outputting 0.
This is clearly a locally defined problem. In fact, it is a constraint satisfaction problem (CSP) where each constraint only involves 1-local or 2-local neighborhood.
We are interested in the distributed algorithms that can always produce the correct answer, and want to prove a lower bound for the number of rounds required in the worst case by such distributed algorithms.
As a local distributed algorithm, each node [math]\displaystyle{ i }[/math] initially does not know anything beyond its local information, which is just its own ID [math]\displaystyle{ a_i\in [n] }[/math]. And after [math]\displaystyle{ t }[/math] rounds, information-theoretically, each node [math]\displaystyle{ i }[/math] can at best know all information within its [math]\displaystyle{ t }[/math]-local neighborhood, which is represented by the [math]\displaystyle{ (2t-1) }[/math]-tuple [math]\displaystyle{ (a_{i-t},\ldots,a_{i-1},a_i,a_{i+1},\ldots a_{i+t}) }[/math], with the addition/subtraction modulo [math]\displaystyle{ n }[/math] along the ring.
This suggests us to define such a computational model for local distributed algorithms: any [math]\displaystyle{ t }[/math]-round local algorithm is described by a function
- [math]\displaystyle{ \mathcal{L}:[n]^{2t+1}\to\{0,1\} }[/math].
For the [math]\displaystyle{ i }[/math]th node in the ring, its output is given by [math]\displaystyle{ \mathcal{L}(a_{i-t},\ldots,a_{i-1},a_i,a_{i+1},\ldots a_{i+t}) }[/math], where [math]\displaystyle{ a_j }[/math] represents the ID of the [math]\displaystyle{ j }[/math]th node in the ring, with the addition/subtraction modulo [math]\displaystyle{ n }[/math].
As a correct algorithm for constructing MIS, it must hold that any three consecutive nodes can never output the same value. We then have the following lower bound for [math]\displaystyle{ t }[/math] for such algorithms.
Theorem (Linial 1992) - For any [math]\displaystyle{ t }[/math]-round local algorithm for maximal independent set (MIS) on a ring of [math]\displaystyle{ n }[/math] nodes, it holds that
- [math]\displaystyle{ t=\Omega(\log^*n) }[/math]
- where [math]\displaystyle{ \log^*n }[/math] represents the iterated logarithm, which is the number of times the logarithm function must be iteratively applied to [math]\displaystyle{ n }[/math] before the result is less than or equal to 1.
- For any [math]\displaystyle{ t }[/math]-round local algorithm for maximal independent set (MIS) on a ring of [math]\displaystyle{ n }[/math] nodes, it holds that
This lower bound shows that even on very simple network like ring, some very basic locally defined problem (MIS) cannot be computed locally (within constant locality).
The original proof of Linial relies on chromatic number of so-called neighborhood graphs. Here we give an alternative proof based on Ramsey theorem found by Baruch Awerbuch.
Proof. As we discussed earlier, any [math]\displaystyle{ t }[/math]-round local algorithm can be represented by a mapping
- [math]\displaystyle{ \mathcal{L}:[n]^{2t+1}\to\{0,1\} }[/math].
It can naturally defines a 2-coloring:
- [math]\displaystyle{ f:{[n]\choose {2t+1}}\to\{0,1\} }[/math]
by the following construction: for any [math]\displaystyle{ \{a_1,a_2,\ldots,a_{2t+1}\}\subseteq [n] }[/math], where [math]\displaystyle{ a_1\lt a_2\lt \cdots\lt a_{2t+1} }[/math], we define
- [math]\displaystyle{ f(\{a_1,a_2,\ldots,a_{2t+1}\})=\mathcal{L}(a_1,a_2,\ldots,a_{2t+1}) }[/math].
By Ramsey theorem, for [math]\displaystyle{ n\ge R_{2t+1}(2;2t+3,2t+3) }[/math], there exists a subset [math]\displaystyle{ \{a_1,a_2,\ldots,a_{2t+3}\}\subseteq [n] }[/math], where [math]\displaystyle{ a_1\lt a_2\lt \cdots\lt a_{2t+3} }[/math], such that
- [math]\displaystyle{ \left|f\left({S\choose {2t+1}}\right)\right|=1 }[/math].
By our construction of [math]\displaystyle{ f }[/math], this means
- [math]\displaystyle{ \mathcal{L}(a_1,a_2,\ldots,a_{2t+1})=\mathcal{L}(a_2,a_3,\ldots,a_{2t+2})=\mathcal{L}(a_3,a_4,\ldots,a_{2t+3}) }[/math],
which contradicts to that the output of [math]\displaystyle{ \mathcal{L} }[/math] should indicate an MIS, on any ring with [math]\displaystyle{ 2t+3 }[/math] consecutive nodes labeled as [math]\displaystyle{ (a_1,a_2,\ldots,a_{2t+3}) }[/math], because on such rings, there would be 3 consecutive nodes with the same output bit.
Therefore, any [math]\displaystyle{ t }[/math]-round local algorithm that can always correctly produce an MIS on a ring, must satisfies that
- [math]\displaystyle{ n\lt R_{2t+1}(2;2t+3,2t+3)\le \underbrace{2^{2^{\unicode{x22F0}^{2}}}}_{ct} }[/math],
for some constant [math]\displaystyle{ c\gt 0 }[/math], whose inverse function gives the lower bound
- [math]\displaystyle{ t=\Omega(\log^*n) }[/math].
- [math]\displaystyle{ \square }[/math]