imported>WikiSysop |
imported>Etone |
Line 1: |
Line 1: |
| == Sauer's lemma and VC-dimension == | | == Groups == |
|
| |
|
| === Shattering and the VC-dimension === | | == Burnside's Lemma == |
| {{Theorem|Definition (shatter)| | | {{Theorem|Burnside's Lemma| |
| :Let <math>\mathcal{F}\subseteq 2^X</math> be set family and let <math>R\subseteq X</math> be a subset. The '''trace''' of <math>\mathcal{F}</math> on <math>R</math>, denoted <math>\mathcal{F}|_R</math> is defined as | | :Let <math>G</math> be a permutation group acting on a set <math>X</math>. For each <math>\pi\in G</math>, let <math>X_\pi=\{x\in X\mid \pi\circ x=x\}</math> be the set of elements invariant under group action by <math>\pi</math>. The number of orbits, denoted <math>|X/G|</math>, is |
| ::<math>\mathcal{F}|_R=\{S\cap R\mid S\in\mathcal{F}\}</math>.
| | ::<math>|X/G|=\frac{1}{|G|}\sum_{\pi\in G}|X_{\pi}|.</math> |
| :We say that <math>\mathcal{F}</math> '''shatters''' <math>R</math> if <math>\mathcal{F}|_R=2^R</math>, i.e. for all <math>T\subseteq R</math>, there exists an <math>S\in\mathcal{F}</math> such that <math>T=S\cap R</math>.
| |
| }} | | }} |
|
| |
|
| The [http://en.wikipedia.org/wiki/VC_dimension '''VC dimension'''] is defined by the power of a family to shatter a set.
| | ==Pólya's Theory of Counting == |
| | |
| {{Theorem|Definition (VC-dimension)|
| |
| :The '''Vapnik–Chervonenkis dimension''' ('''VC-dimension''') of a set family <math>\mathcal{F}\subseteq 2^X</math>, denoted <math>\text{VC-dim}(\mathcal{F})</math>, is the size of the largest <math>R\subseteq X</math> shattered by <math>\mathcal{F}</math>.
| |
| }}
| |
| | |
| It is a core concept in [http://en.wikipedia.org/wiki/Computational_learning_theory computational learning theory].
| |
| | |
| Each subset <math>S\subseteq X</math> can be equivalently represented by its characteristic function <math>f_S:X\rightarrow\{0,1\}</math>, such that for each <math>x\in X</math>,
| |
| :<math>f_S(x)=\begin{cases}
| |
| 1 & x\in S\\
| |
| 0 & x\not\in S.
| |
| \end{cases}</math>
| |
| | |
| Then a set family <math>\mathcal{F}\subseteq2^X</math> corresponds to a collection of boolean functions <math>\{f_S\mid S\in\mathcal{F}\}</math>, which is a subset of all Boolean functions in the form <math>f:X\rightarrow\{0,1\}</math>. We wonder on how large a subdomain <math>Y\subseteq X</math>, <math>\mathcal{F}</math> includes all the <math>2^{|Y|}</math> mappings <math>Y\rightarrow\{0,1\}</math>. The largest size of such subdomain is the VC-dimension. It measures how complicated a collection of boolean functions (or equivalently a set family) is.
| |
| | |
| === Sauer's Lemma ===
| |
| The definition of the VC-dimension involves enumerating all subsets, thus is difficult to analyze in general. The following famous result state a very simple sufficient condition to lower bound the VC-dimension, regarding only the size of the family. The lemma is due to Sauer, and independently due to Shelah and Perles. A slightly weaker version is found by Vapnik and Chervonenkis, who use the framework to develop a theory of classifications.
| |
| | |
| {{Theorem|Sauer's Lemma (Sauer; Shelah-Perles; Vapnik-Chervonenkis)|
| |
| :Let <math>\mathcal{F}\subseteq 2^X</math> where <math>|X|=n</math>. If <math>|\mathcal{F}|>\sum_{1\le i<k}{n\choose i}</math>, then there exists an <math>R\in{X\choose k}</math> such that <math>\mathcal{F}</math> shatters <math>R</math>.
| |
| }}
| |
| In other words, for any set family <math>\mathcal{F}</math> with <math>|\mathcal{F}|>\sum_{1\le i<k}{n\choose i}</math>, its VC-dimension <math>\text{VC-dim}(\mathcal{F})\ge k</math>.
| |
| | |
| === Hereditary family ===
| |
| We note the Sauer's lemma is especially easy to prove for a special type of set families, called the '''hereditary''' families.
| |
| {{Theorem|Definition (hereditary family)|
| |
| :A set system <math>\mathcal{F}\subseteq 2^X</math> is said to be '''hereditary''' (also called an '''ideal''' or an '''abstract simplicial complex'''), if
| |
| ::<math>S\subseteq T\in\mathcal{F}</math> implies <math>S\in\mathcal{F}</math>.
| |
| }}
| |
| In other words, for a hereditary family <math>\mathcal{F}</math>, if <math>R\in\mathcal{F}</math>, then all subsets of <math>R</math> are also in <math>\mathcal{F}</math>. An immediate consequence is the following proposition.
| |
| {{Theorem|Proposition|
| |
| :Let <math>\mathcal{F}</math> be a hereditary family. If <math>R\in\mathcal{F}</math> then <math>\mathcal{F}</math> shatters <math>R</math>.
| |
| }}
| |
| | |
| Therefore, it is very easy to prove the Sauer's lemma for hereditary families:
| |
| {{Theorem|Lemma|
| |
| :For <math>\mathcal{F}\subseteq 2^X</math>, <math>|X|=n</math>, if <math>\mathcal{F}</math> is hereditary and <math>|\mathcal{F}|>\sum_{1\le i<k}{n\choose i}</math> then there exists an <math>R\in{X\choose k}</math> such that <math>\mathcal{F}</math> shatters <math>R</math>.
| |
| }}
| |
| {{Proof|
| |
| Since <math>\mathcal{F}</math> is hereditary, we only need to show that there exists an <math>R\in\mathcal{F}</math> of size <math>|R|\ge k</math>, which must be true, because if all members of <math>\mathcal{F}</math> are of sizes <math><k</math>, then <math>|\mathcal{F}|\le\left|\bigcup_{1\le i<k}{X\choose k}\right|=\sum_{1\le i<k}{n\choose i}</math>, a contradiction.
| |
| }}
| |
| | |
| To prove the Sauer's lemma for general non-hereditary families, we can use some way to reduce arbitrary families to hereditary families. Here we apply the shifting technique to achieve this.
| |
| | |
| === Down-shifts ===
| |
| Note that we work on <math>\mathcal{F}\subseteq2^X</math>, instead of <math>\mathcal{F}\subseteq{X\choose k}</math> like in the Erdős–Ko–Rado theorem, so we do not need to preserve the size of member sets. Instead, we need to reduce an arbitrary family to a hereditary one, thus we use a shift operator which replaces a member set by a subset of it.
| |
| {{Theorem|Definition (down-shifts)|
| |
| : Assume <math>\mathcal{F}\subseteq 2^{[n]}</math>, and <math>i\in[n]</math>. Define the '''down-shift''' operator <math>S_{i}</math> as follows:
| |
| :* for each <math>T\in\mathcal{F}</math>, let
| |
| ::<math>S_{i}(T)=
| |
| \begin{cases}
| |
| T\setminus\{i\} & \mbox{if }i\in T \mbox{ and }T\setminus\{i\} \not\in\mathcal{F},\\
| |
| T & \mbox{otherwise;}
| |
| \end{cases}</math>
| |
| :* let <math>S_{i}(\mathcal{F})=\{S_{i}(T)\mid T\in \mathcal{F}\}</math>.
| |
| }}
| |
| | |
| Repeatedly applying <math>S_i</math> to <math>\mathcal{F}</math> for all <math>i\in[n]</math>, due to the finiteness, eventually <math>\mathcal{F}</math> is not changed by any <math>S_i</math>. We call such a family '''down-shifted'''. A family <math>\mathcal{F}</math> is down-shifted if and only if <math>S_i(\mathcal{F})=\mathcal{F}</math> for all <math>i\in[n]</math>. It is then easy to see that a down-shifted <math>\mathcal{F}</math> must be hereditary.
| |
| | |
| {{Theorem|Theorem|
| |
| :If <math>\mathcal{F}\subseteq2^X</math> is down-shifted, then <math>\mathcal{F}</math> is hereditary.
| |
| }}
| |
| | |
| In order to use down-shift to prove the Sauer's lemma, we need to make sure that down-shift does not decrease <math>|\mathcal{F}|</math> and does not increase the VC-dimension <math>\text{VC-dim}(\mathcal{F})</math>.
| |
| | |
| {{Theorem|Proposition|
| |
| # <math>|S_{i}(\mathcal{F})|=\mathcal{F}</math>;
| |
| # <math>|S_i(\mathcal{F})|_R|\le |\mathcal{F}|_R|</math>, thus if <math>S_{i}(\mathcal{F})</math> shatters an <math>R</math>, so does <math>\mathcal{F}</math>.
| |
| }}
| |
| {{Proof|
| |
| (1) is immediate. To prove (2), it is sufficient to prove that <math>S_i(\mathcal{F})|_R\subseteq S_i(\mathcal{F}|_R)</math> and then apply (1).
| |
| | |
| [[file:Down-shift-property.png|600px]]
| |
| }}
| |
| | |
| ;Proof of Sauer's lemma
| |
| Now we can prove the Sauer's lemma for arbitrary <math>\mathcal{F}\subseteq 2^{[n]}</math>.
| |
| | |
| For any <math>\mathcal{F}\subseteq 2^{[n]}</math>, repeatedly apply <math>S_i</math> for all <math>i\in</math> till the family is down-shifted, which is denoted by <math>\mathcal{F}'</math>. We have proved that <math>|\mathcal{F}'|=|\mathcal{F}|>\sum_{1\le i<k}{n\choose i}</math> and <math>\mathcal{F}'</math> is hereditary, thus as argued before, here exists an <math>R</math> of size <math>k</math> shattered by <math>\mathcal{F}'</math>. By the above proposition, <math>|\mathcal{F}'|_R|\le |\mathcal{F}|_R|</math>, thus <math>\mathcal{F}</math> also shatters <math>R</math>. The lemma is proved.
| |
| | |
| <math>\square</math>
| |
| | |
| == The Kruskal–Katona Theorem ==
| |
| The '''shadow''' of a set system <math>\mathcal{F}</math>, denoted <math>\Delta\mathcal{F}</math>, consists of all sets which can be obtained by removing an element from a set in <math>\mathcal{F}</math>.
| |
| | |
| {{Theorem|Definition|
| |
| :Let <math>\mathcal{F}\subseteq {X\choose k}</math>. The '''shadow''' of <math>\mathcal{F}</math> is defined to be
| |
| ::<math>\Delta\mathcal{F}=\left\{T\in {X\choose k-1}\,\,\bigg|\,\, \exists S\in\mathcal{F}\mbox{ such that } T\subset S\right\}</math>.
| |
| }}
| |
| | |
| The shadow contains rich information about the set system. An extremal question is: for a system of fixed number of <math>k</math>-sets, how small can its shadow be? The Kruskal–Katona theorem gives an answer to this question.
| |
| | |
| To state the result of the Kruskal–Katona theorem, we need to introduce the concepts of the <math>k</math>-cascade representation of numbers and the colex order of sets.
| |
| | |
| === <math>k</math>-cascade representation of a number ===
| |
| {{Theorem|Theorem|
| |
| :Given positive integers <math>m</math> and <math>k</math>, there exists a unique representation of <math>m</math> in the form
| |
| ::<math>m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t}=\sum_{\ell=t}^k{m_\ell\choose \ell}</math>,
| |
| :where <math>m_k>m_{k-1}>\cdots>m_t\ge t\ge 1</math>.
| |
| :This representation of <math>m</math> is called a '''<math>k</math>-cascade''' (or a '''<math>k</math>-binomial''') representation of <math>m</math>.
| |
| }}
| |
| {{Proof|
| |
| In fact, the <math>k</math>-cascade representation <math>(m_k,m_{k-1},\ldots,m_t)</math> of an <math>m</math> can be found by the following simple greedy algorithm:
| |
| <math>\ell=k;</math>
| |
| while (<math>m>0</math>) do
| |
| let <math>m_\ell</math> be the largest integer for which <math>{m_\ell\choose \ell}\le m;</math>
| |
| <math>m=m-m_\ell;</math>
| |
| <math>\ell=\ell-1;</math>
| |
| end
| |
| | |
| We then show that the above algorithm constructs a sequence <math>(m_k,m_{k-1},\ldots,m_t)</math> with <math>m_k>m_{k-1}>\cdots>m_t\ge t\ge 1</math>.
| |
| | |
| Suppose the current <math>\ell</math> and the current <math>m</math>. To see that <math>m_{\ell-1}< m_\ell</math>, we suppose otherwise <math>m_{\ell-1}\ge m_\ell</math>. Then
| |
| :<math>m\ge {m_\ell\choose \ell}+{m_{\ell-1}\choose \ell-1}\ge{m_\ell\choose \ell}+{m_{\ell}\choose \ell-1}={1+m_{\ell}\choose \ell}</math>
| |
| contradicting the maximality of <math>m_\ell</math>. Therefore, <math>m_\ell>m_{\ell-1}</math>.
| |
| | |
| The algorithm continues reducing <math>m</math> to smaller nonnegative values, and eventually reaches a stage where the choice of <math>m_t</math> for some <math>t\ge 2</math> where <math>{m_t\choose t}</math> equals the current <math>m</math>; or gets right down to choosing <math>m_1</math> as the integer such that <math>m_1={m_1\choose 1}</math> equals the current <math>m</math>.
| |
| | |
| Therefore, <math>m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t}</math> and <math>m_k>m_{k-1}>\cdots>m_t\ge t\ge 1</math>.
| |
| | |
| The uniqueness of the <math>k</math>-cascade representation follows by the induction on <math>k</math>.
| |
| | |
| When <math>k=1</math>, <math>m</math> has a unique <math>k</math>-cascade representation <math>m={m\choose 1}</math>.
| |
| | |
| For general <math>k>1</math>, suppose that every nonnegative integer <math>m</math> has a unique <math>(k-1)</math>-cascade representation.
| |
| | |
| Suppose then that <math>m</math> has two <math>k</math>-cascade representations:
| |
| :<math>m={a_k\choose k}+{a_{k-1}\choose k-1}+\cdots+{a_t\choose t}={b_k\choose k}+{b_{k-1}\choose k-1}+\cdots+{b_r\choose r}</math>.
| |
| We then show that it must hold that <math>a_k=b_k</math>.
| |
| If <math>a_k\neq b_k</math>, WLOG, suppose that <math>a_k<b_k</math>. We obtain
| |
| :<math>
| |
| \begin{align}
| |
| m&
| |
| ={a_k\choose k}+{a_{k-1}\choose k-1}+\cdots+{a_t\choose t}\\
| |
| &\le {a_k\choose k}+{a_{k}-1\choose k-1}+\cdots+{a_k-(k-t)\choose t}+\cdots+{a_k-k+1\choose 1}\\
| |
| &={a_k+1\choose k}-1,
| |
| \end{align}</math>
| |
| where the last equation is got by repeatedly applying the identity
| |
| :<math>{n\choose k}+{n\choose k-1}={n+1\choose k}</math>.
| |
| We then obtain
| |
| :<math>m<{a_k+1\choose k}\le{b_k\choose k}\le m</math>,
| |
| which is a contradiction. Therefore, <math>a_k=b_k</math>, and by the induction hypothesis, the remaining value <math>m-a_k=m-b_k</math> has a unique <math>(k-1)</math>-cascade representation, so <math>a_i=b_i</math> for all <math>i</math>.
| |
| }}
| |
| | |
| === Co-lexicographic order of subsets ===
| |
| The co-lexicographic order of sets plays a particularly important role in the investigation of the size of the shadow of a system of <math>k</math>-sets.
| |
| {{Theorem|Definition|
| |
| : The '''co-lexicographic (colex) order''' (also called the '''reverse lexicographic order''') of sets is defined as follows: for any <math>A,B\subseteq [n]</math>, <math>A\neq B</math>,
| |
| ::<math>A<B</math> if <math>\max A\setminus B < \max B\setminus A</math>.
| |
| }}
| |
| | |
| We can sort sets in colex order by first writing each set as a tuple, whose elements are in decreasing order, and then sorting the tuples in the lexicographic order of tuples.
| |
| | |
| For example, <math>{[5]\choose 3}</math> in colex order is
| |
| {3,2,1}
| |
| {4,2,1}
| |
| {4,3,1}
| |
| {4,3,2}
| |
| {5,2,1}
| |
| {5,3,1}
| |
| {5,3,2}
| |
| {5,4,1}
| |
| {5,4,2}
| |
| {5,4,3}
| |
| | |
| We find that the first <math>{n\choose 3}</math> sets in this order for <math>n=3,4,5</math>, form precisely <math>{[n]\choose 3}</math>. And if we write the colex order of <math>{[6]\choose 3}</math>, the above colex order of <math>{[5]\choose 3}</math> appears as a prefix of that order. Elaborating on this, we have:
| |
| {{Theorem|Proposition|
| |
| :Let <math>\mathcal{R}(m,k)</math> be the first <math>m</math> sets in the colex order of <math>{\mathbb{N}\choose k}</math>. Then
| |
| ::<math>\mathcal{R}\left({n\choose k},k\right)={[n]\choose k}</math>,
| |
| :that is, the first <math>{n\choose k}</math> sets in the colex order of all <math>k</math>-sets of natural numbers is precisely <math>{[n]\choose k}</math>.
| |
| }}
| |
| | |
| This proposition says that the sets in <math>\mathcal{R}(m,k)</math> is highly overlapped, which suggests that <math>\mathcal{R}(m,k)</math> may have small shadow. The size of the shadow of <math>\mathcal{R}(m,k)</math> is closely related to the <math>k</math>-cascade representation of <math>m</math>.
| |
| | |
| {{Theorem|Theorem|
| |
| :Suppose the <math>k</math>-cascade representation of <math>m</math> is
| |
| ::<math>m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t}</math>.
| |
| :Then
| |
| ::<math>|\Delta\mathcal{R}(m,k)|={m_k\choose k-1}+{m_{k-1}\choose k-2}+\cdots+{m_t\choose t-1}</math>.
| |
| }}
| |
| {{Proof|
| |
| Given <math>m</math> and its <math>k</math>-cascade representation <math>m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t}</math>, the <math>\mathcal{R}(m,k)</math> is constructed as:
| |
| * all sets in <math>{[m_k]\choose k}</math>;
| |
| * all sets in <math>{[m_{k-1}]\choose k-1}</math>, unioned with <math>\{1+m_k\}\,</math>;
| |
| ::<math>\vdots</math>
| |
| * all sets in <math>{[m_{t}]\choose t}</math>, unioned with <math>\{1+m_k,1+m_{k-1},\ldots,1+m_{t+1}\}</math>.
| |
| The shadow <math>\Delta\mathcal{R}(m,k)</math> is the collection of all <math>(k-1)</math>-sets contained by the above sets, which are
| |
| * all sets in <math>{[m_k]\choose k-1}</math>;
| |
| * all sets in <math>{[m_{k-1}]\choose k-2}</math>, unioned with <math>\{1+m_k\}\,</math>;
| |
| ::<math>\vdots</math>
| |
| * all sets in <math>{[m_{t}]\choose t-1}</math>, unioned with <math>\{1+m_k,1+m_{k-1},\ldots,1+m_{t+1}\}</math>.
| |
| Thus,
| |
| :<math>|\Delta\mathcal{R}(m,k)|={m_k\choose k-1}+{m_{k-1}\choose k-2}+\cdots+{m_t\choose t-1}</math>.
| |
| }}
| |
| | |
| === The Kruskal–Katona theorem ===
| |
| The Kruskal–Katona theorem states that among all systems of <math>m</math> <math>k</math>-sets, <math>\mathcal{R}(m,k)</math>, i.e., the first <math>m</math> <math>k</math>-sets in the colex order, has the smallest shadow.
| |
| | |
| The theorem is proved independently by Joseph Kruskal in 1963 and G.O.H. Katona in 1966, and is a fundamental result in finite set theory and combinatorial topology.
| |
| {{Theorem|Theorem (Kruskal 1963, Katona 1966)|
| |
| :Let <math>\mathcal{F}\subseteq {X\choose k}</math> with <math>|\mathcal{F}|=m</math>, and suppose that
| |
| ::<math>m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t}</math>
| |
| :where <math>m_k>m_{k-1}>\cdots>m_t\ge t\ge 1</math>. Then
| |
| ::<math>|\Delta\mathcal{F}|\ge {m_k\choose k-1}+{m_{k-1}\choose k-2}+\cdots+{m_t\choose t-1}</math>.
| |
| }}
| |
| | |
| The <math>f</math>-vector of a set system <math>\mathcal{F}\subseteq 2^X</math> is a vector <math>(|\mathcal{F}_0|,|\mathcal{F}_1|,\ldots,|\mathcal{F}_n|)</math> where
| |
| :<math>\mathcal{F}_k=\{S\mid S\in\mathcal{F}, |S|=k\}</math>,
| |
| i.e., the <math>f</math>-vector gives the number of member sets of each size.
| |
| | |
| In a hereditary family (also called an [http://en.wikipedia.org/wiki/Abstract_simplicial_complex abstract simplicial complex]), <math>\mathcal{F}_k</math> is formed by the shadow of <math>\mathcal{F}_{k+1}</math> as well as some additional <math>k</math>-sets introduced in this level. The Kruskal-Katona theorem gives a lower bound on <math>|\mathcal{F}_i|</math>, given the <math>|\mathcal{F}_{i-1}|</math>.
| |
| | |
| The original proof of the theorem is rather complicated. In the following years, several different proofs were discovered. Here we present a proof dueto Frankl by the shifting technique.
| |
| | |
| ;Frankl's shifting proof of Kruskal-Katonal theorem (Frankl 1984)
| |
| | |
| We take the classic <math>(i,j)</math>-shift operator <math>S_{ij}</math> defined in the original proof of the Erdős-Ko-Rado theorem.
| |
| {{Theorem|Definition (<math>(i,j)</math>-shift)|
| |
| : Assume <math>\mathcal{F}\subseteq 2^{[n]}</math>, and <math>0\le i<j\le n-1</math>. Define the '''<math>(i,j)</math>-shift''' <math>S_{ij}</math> as an operator on <math>\mathcal{F}</math> as follows:
| |
| :*for each <math>T\in\mathcal{F}</math>, write <math>T_{ij}=(T\setminus\{j\})\cup\{i\} </math>, and let
| |
| ::<math>S_{ij}(T)=
| |
| \begin{cases}
| |
| T_{ij} & \mbox{if }j\in T, i\not\in T, \mbox{ and }T_{ij} \not\in\mathcal{F},\\
| |
| T & \mbox{otherwise;}
| |
| \end{cases}</math>
| |
| :* let <math>S_{ij}(\mathcal{F})=\{S_{ij}(T)\mid T\in \mathcal{F}\}</math>.
| |
| }}
| |
| | |
| It is immediate that shifting does not change the size of the set or the size of the system, i.e., <math>|S_{ij}(T)|=|T|\,</math> and <math>|S_{ij}(\mathcal{F})|=\mathcal{F}</math>. And due to the finiteness, repeatedly applying <math>(i,j)</math>-shifts for any <math>1\le i<j\le n</math>, eventually <math>\mathcal{F}</math> does not changing any more. We called such an <math>\mathcal{F}</math> with <math>\mathcal{F}=S_{ij}(\mathcal{F})</math> for any <math>1\le i<j\le n</math> '''shifted'''.
| |
| | |
| In order to make the shifting technique work for shadows, we have to prove that shifting does not increase the size of the shadow.
| |
| | |
| {{Theorem|Proposition|
| |
| :<math>\Delta S_{ij}(\mathcal{F})\subseteq S_{ij}(\Delta\mathcal{F})</math>.
| |
| }}
| |
| {{Proof|
| |
| We abuse the notation <math>\Delta</math> and let <math>\Delta A=\Delta\{A\}</math> if <math>A</math> is a set instead of a set system.
| |
| | |
| It is sufficient to show that for any <math>A\in\mathcal{F}</math>, <math>\Delta S_{ij}(A)\subseteq S_{ij}(\Delta\mathcal{F})</math>.
| |
| | |
| [[file:Property-shift-shadow.png|600px]]
| |
| }}
| |
| | |
| An immediate corollary of the above proposition is that the <math>(i,j)</math>-shifts <math>S_{ij}</math> for any <math>1\le i<j\le n</math> do not increase the size of the shadow.
| |
| | |
| {{Theorem|Corollary|
| |
| :<math>|\Delta S_{ij}(\mathcal{F})|\le |\Delta\mathcal{F}|</math>.
| |
| }}
| |
| {{Proof|
| |
| By the above proposition, <math>|\Delta S_{ij}(\mathcal{F})|\le|S_{ij}(\Delta\mathcal{F})|</math>, and we know that <math>S_{ij}</math> does not change the cardinality of a set family, that is, <math>|S_{ij}(\Delta\mathcal{F})|=|\Delta\mathcal{F}|</math>, therefore <math>\Delta S_{ij}(\mathcal{F})|\le|\Delta\mathcal{F}|</math>.
| |
| }}
| |
| | |
| ;Proof of Kruskal-Katona theorem
| |
| We know that shifts never enlarge the shadow, thus it is sufficient to prove the theorem for shifted <math>\mathcal{F}</math>. We then assume <math>\mathcal{F}</math> is shifted.
| |
| | |
| Apply induction on <math>m</math> and for given <math>m</math> on <math>k</math>. The theorem holds trivially for the case that <math>k=1</math> and <math>m</math> is arbitrary.
| |
| | |
| Define
| |
| :<math>\mathcal{F}_0=\{A\in\mathcal{F}\mid 1\not\in A\}</math>,
| |
| :<math>\mathcal{F}_1=\{A\in\mathcal{F}\mid 1\in A\}</math>.
| |
| And let <math>\mathcal{F}_1'=\{A\setminus\{1\}\mid A\in\mathcal{F}_1\}</math>.
| |
| | |
| Clearly <math>\mathcal{F}_0,\mathcal{F}_1\subseteq{X\choose k}</math>, <math>\mathcal{F}_1'\subseteq{X\choose k-1}</math>, and
| |
| :<math>|\mathcal{F}|=|\mathcal{F}_0|+|\mathcal{F}_1|=|\mathcal{F}_0|+|\mathcal{F}_1'|</math>.
| |
| | |
| Our induction is based on the following observation regarding the size of the shadow.
| |
| {{Theorem|Lemma 1|
| |
| :<math>|\Delta\mathcal{F}|\ge|\Delta\mathcal{F}_1'|+|\mathcal{F}_1'|</math>.
| |
| }}
| |
| {{Proof|
| |
| Obviously <math>\mathcal{F}\supseteq\mathcal{F}_1</math> and <math>\Delta\mathcal{F}\supseteq\Delta\mathcal{F}_1</math>.
| |
| :<math>\begin{align}
| |
| \Delta\mathcal{F}_1
| |
| &=\left\{A\in{X\choose k-1}\,\,\bigg|\,\, \exists B\in\mathcal{F}_1, A\subset B\right\}\\
| |
| &=\left\{A\in{X\choose k-1}\,\,\bigg|\,\, 1\in A, \exists B\in\mathcal{F}_1, A\subset B\right\}\\
| |
| &\quad\, \cup \left\{A\in{X\choose k-1}\,\,\bigg|\,\, 1\not\in A, \exists B\in\mathcal{F}_1, A\subset B\right\}\\
| |
| &=\{S\cup\{1\}\mid S\in\Delta\mathcal{F}_1'\}\cup\mathcal{F}_1'.
| |
| \end{align}</math>
| |
| The union is taken over two disjoint families. Therefore,
| |
| :<math>|\Delta\mathcal{F}|\ge|\Delta\mathcal{F}_1|=|\Delta\mathcal{F}_1'|+|\mathcal{F}_1'|</math>.
| |
| }}
| |
| | |
| The following property of shifted <math>\mathcal{F}</math> is essential for our proof.
| |
| {{Theorem|Lemma 2|
| |
| :For shifted <math>\mathcal{F}</math>, it holds that <math>\Delta\mathcal{F}_0\subseteq \mathcal{F}_1'</math>.
| |
| }}
| |
| {{Proof|
| |
| If <math>A\in\Delta\mathcal{F}_0</math> then <math>A\cup\{j\}\in\mathcal{F}_0\subseteq\mathcal{F}</math> for some <math>j>1</math> so that, since <math>\mathcal{F}</math> is shifted, applying the <math>(1,j)</math>-shift <math>S_{1j}</math>, <math>A\cup\{1\}\in\mathcal{F}</math>, thus, <math>A\in\mathcal{F}_1'</math>.
| |
| }}
| |
| | |
| We then bound the size of <math>\mathcal{F}_1'</math> as:
| |
| | |
| {{Theorem|Lemma 3|
| |
| :If <math>\mathcal{F}</math> is shifted, then
| |
| ::<math>|\mathcal{F}_1'|\ge{m_k-1\choose k-1}+{m_{k-1}-1\choose k-2}+\cdots+{m_t-1\choose t-1}</math>.
| |
| }}
| |
| {{Proof|
| |
| By contradiction, assume that
| |
| :<math>|\mathcal{F}_1'|<{m_k-1\choose k-1}+{m_{k-1}-1\choose k-2}+\cdots+{m_t-1\choose t-1}</math>.
| |
| Then by <math>|\mathcal{F}|=|\mathcal{F}_0|+|\mathcal{F}_1'|</math>, it holds that
| |
| :<math>
| |
| \begin{align}
| |
| |\mathcal{F}_0| &=m-|\mathcal{F}_1'|\\
| |
| &>\left\{{m_k\choose k}- {m_k-1\choose k-1}\right\}+\cdots+\left\{{m_t\choose t}- {m_t-1\choose t-1}\right\}\\
| |
| &={m_k-1\choose k}+\cdots+{m_t-1\choose t},
| |
| \end{align}</math>
| |
| so that, by the induction hypothesis,
| |
| :<math>|\Delta\mathcal{F}_0|\ge{m_k-1\choose k-1}+{m_{k-1}-1\choose k-1}+\cdots+{m_t-1\choose t-1}>|\mathcal{F}_1'|</math>.
| |
| On the other hand, by Lemma 2, <math>|\mathcal{F}_1'|\ge|\Delta\mathcal{F}_0|</math>. Thus <math>|\mathcal{F}_1'|\ge|\Delta\mathcal{F}_0|>|\mathcal{F}_1'|</math>, a contradiction.
| |
| }}
| |
| | |
| Now we officially apply the induction. By Lemma 1,
| |
| :<math>|\Delta\mathcal{F}|\ge|\Delta\mathcal{F}_1'|+|\mathcal{F}_1'|</math>.
| |
| Note that <math>\mathcal{F}_1'\subseteq{X\choose k-1}</math>, and due to Lemma 3,
| |
| :<math>|\mathcal{F}_1'|\ge{m_k-1\choose k-1}+{m_{k-1}-1\choose k-2}+\cdots+{m_t-1\choose t-1}</math>,
| |
| thus by the induction hypothesis,
| |
| :<math>|\Delta\mathcal{F}_1'|\ge{m_k-1\choose k-2}+{m_{k-1}-1\choose k-3}+\cdots+{m_t-1\choose t-2}</math>.
| |
| Combining them together, we have
| |
| :<math>
| |
| \begin{align}
| |
| |\Delta\mathcal{F}|
| |
| &\ge |\Delta\mathcal{F}_1'|+|\mathcal{F}_1'|\\
| |
| &\ge {m_k-1\choose k-2}+\cdots+{m_t-1\choose t-2}+{m_k-1\choose k-1}+\cdots+{m_t-1\choose t-1}\\
| |
| &= {m_k\choose k-1}+{m_{k-1}\choose k-2}+\cdots+{m_t\choose t-1}.
| |
| \end{align}</math>
| |
| | |
| :<math>\square</math>
| |
| | |
| === Shadows of specific sizes ===
| |
| The definition of shadow can be generalized to the subsets of any size.
| |
| {{Theorem|Definition|
| |
| :The '''<math>r</math>-shadow''' of <math>\mathcal{F}</math> is defined as
| |
| ::<math>\Delta_r\mathcal{F}=\left\{S\mid |S|=r\text{ and }\exists T\in\mathcal{F}, S\subseteq T\right\}</math>.
| |
| }}
| |
| And the general version of the Kruskal-Katona theorem can be deduced.
| |
| {{Theorem|Kruskal-Katona Theorem (general version)|
| |
| :Let <math>\mathcal{F}\subseteq {X\choose k}</math> with <math>|\mathcal{F}|=m</math>, and suppose that
| |
| ::<math>m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t}</math>
| |
| :where <math>m_k>m_{k-1}>\cdots>m_t\ge t\ge 1</math>. Then for all <math>r</math>, <math>1\le r\le k</math>,
| |
| ::<math>\left|\Delta_r\mathcal{F}\right|\ge {m_k\choose r}+{m_{k-1}\choose r-1}+\cdots+{m_t\choose t-k+r}</math>.
| |
| }}
| |
| {{Proof|
| |
| Note that for <math>\mathcal{F}\subseteq {X\choose k}</math>,
| |
| :<math>\Delta_r\mathcal{F}=\underbrace{\Delta\cdots\Delta}_{k-r}\mathcal{F}</math>.
| |
| The theorem follows by repeatedly applying the Kruskal-Katona theorem for <math>\Delta\mathcal{F}</math>.
| |
| }}
| |
| | |
| === The Erdős–Ko–Rado theorem, revisited===
| |
| To demonstrate the power of the Krulskal-Katona theorem, we show that it actually includes the Erdős–Ko–Rado theorem as a special case. The following elegant proof of the Erdős–Ko–Rado theorem is due to Daykin and Clements independently.
| |
| {{Theorem|Erdős–Ko–Rado Theorem|
| |
| :Let <math>\mathcal{F}\subseteq {X\choose k}</math> where <math>|X|=n</math> and <math>n\ge 2k</math>. If <math>S\cap T\neq\emptyset</math> for any <math>S,T\in\mathcal{F}</math>, then
| |
| ::<math>|\mathcal{F}|\le{n-1\choose k-1}</math>.
| |
| }}
| |
| | |
| {{Prooftitle|Proof by the Kruskal-Katona theorem|(Daykin 1974, Clements 1976)
| |
| By contradiction, suppose that <math>|\mathcal{F}|>{n-1\choose k-1}</math>.
| |
| | |
| We define the dual system
| |
| :<math>\mathcal{G}=\{\bar{S}\mid S\in\mathcal{F}\}</math>, where <math>\bar{S}=X\setminus S</math>.
| |
| For any <math>S,T\in\mathcal{F}</math>, the condition <math>S\cap T\neq \emptyset</math> is equivalent to <math>S\not\subseteq \bar{T}</math>, so <math>\mathcal{F}</math> and <math>\Delta_k\mathcal{G}</math> are disjoint, thus
| |
| :<math>|\mathcal{F}|+|\Delta_k\mathcal{G}|\le{n\choose k}</math>.
| |
| Clearly, <math>\mathcal{G}\subseteq {X\choose n-k}</math> and
| |
| :<math>|\mathcal{G}|=|\mathcal{F}|>{n-1\choose k-1}={n-1\choose n-k}</math>.
| |
| By the Kruskal-Katona theorem, <math>|\Delta_k\mathcal{G}|\ge{n-1\choose k}</math>.
| |
| :<math>|\mathcal{F}|+|\Delta_k\mathcal{G}|>{n-1\choose k-1}+{n-1\choose k}={n\choose k}</math>,
| |
| a contradiction.
| |
| }}
| |