Combinatorics (Fall 2010)/Extremal set theory II

Sauer's lemma and VC-dimension

Shattering

Definition

Let [math]\displaystyle{ \mathcal{F}\subseteq 2^X }[/math] be set family and let [math]\displaystyle{ R\subseteq X }[/math] be a subset. The trace of [math]\displaystyle{ \mathcal{F} }[/math] on [math]\displaystyle{ R }[/math], denoted [math]\displaystyle{ \mathcal{F}|_R }[/math] is defined as

[math]\displaystyle{ \mathcal{F}|_R=\{S\cap R\mid S\in\mathcal{F}\} }[/math].

We say that [math]\displaystyle{ \mathcal{F} }[/math] shatters [math]\displaystyle{ R }[/math] if [math]\displaystyle{ \mathcal{F}|_R=2^R }[/math], i.e. for all [math]\displaystyle{ T\subseteq R }[/math], there exists an [math]\displaystyle{ S\in\mathcal{F} }[/math] such that [math]\displaystyle{ T=S\cap R }[/math].

Sauer's Lemma (Sauer; Shelah-Perles; Vapnik-Chervonenkis)

Let [math]\displaystyle{ \mathcal{F}\subseteq 2^X }[/math] where [math]\displaystyle{ |X|=n }[/math]. If [math]\displaystyle{ |\mathcal{F}|\gt \sum_{1\le i\lt k}{n\choose i} }[/math], then there exists an [math]\displaystyle{ R\in{X\choose k} }[/math] such that [math]\displaystyle{ \mathcal{F} }[/math] shatters [math]\displaystyle{ R }[/math].

Down-shifts

Definition (down-shifts)

Assume [math]\displaystyle{ \mathcal{F}\subseteq 2^{[n]} }[/math], and [math]\displaystyle{ i\in[n] }[/math]. Define the down-shift operator [math]\displaystyle{ S_{i} }[/math] as follows:

for each [math]\displaystyle{ T\in\mathcal{F} }[/math], let

[math]\displaystyle{ S_{i}(T)= \begin{cases} T\setminus\{i\} & \mbox{if }i\in T \mbox{ and }T\setminus\{i\} \not\in\mathcal{F},\\ T & \mbox{otherwise;} \end{cases} }[/math]

let [math]\displaystyle{ S_{i}(\mathcal{F})=\{S_{i}(T)\mid T\in \mathcal{F}\} }[/math].

Proposition

[math]\displaystyle{ |S_{i}(\mathcal{F})|=\mathcal{F} }[/math];
[math]\displaystyle{ S_i(\mathcal{F})|_R\subseteq S_i(\mathcal{F}|_R) }[/math], thus [math]\displaystyle{ |S_i(\mathcal{F})|_R|\le |\mathcal{F}|_R| }[/math].

Proof.

1 is immediate. We prove 2, that [math]\displaystyle{ S_i(\mathcal{F})|_R\subseteq S_i(\mathcal{F}|_R) }[/math].

File:Down-shift-property.png

[math]\displaystyle{ \square }[/math]

The Vapnik-Chervonenkis dimenison

Definition (VC-dimension)

The Vapnik–Chervonenkis dimension (VC-dimension) of a set family [math]\displaystyle{ \mathcal{F}\subseteq 2^X }[/math], denoted [math]\displaystyle{ \text{VC-dim}(\mathcal{F}) }[/math], is the size of the largest [math]\displaystyle{ R\subseteq X }[/math] shattered by [math]\displaystyle{ \mathcal{F} }[/math].

The Kruskal–Katona Theorem

[math]\displaystyle{ k }[/math]-cascade representation of a number

Theorem

Given positive integers [math]\displaystyle{ m }[/math] and [math]\displaystyle{ k }[/math], there exists a unique representation of [math]\displaystyle{ m }[/math] in the form

[math]\displaystyle{ m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t}=\sum_{\ell=t}^k{m_\ell\choose \ell} }[/math],

where [math]\displaystyle{ m_k\gt m_{k-1}\gt \cdots\gt m_t\ge t\ge 1 }[/math].

This representation of [math]\displaystyle{ m }[/math] is called a [math]\displaystyle{ k }[/math]-cascade (or a [math]\displaystyle{ k }[/math]-binomial) representation of [math]\displaystyle{ m }[/math].

Proof.

In fact, the [math]\displaystyle{ k }[/math]-cascade representation [math]\displaystyle{ (m_k,m_{k-1},\ldots,m_t) }[/math] of an [math]\displaystyle{ m }[/math] can be found by the following simple greedy algorithm:

[math]\displaystyle{ \ell=k; }[/math]
while ([math]\displaystyle{ m\gt 0 }[/math]) do
    let [math]\displaystyle{ m_\ell }[/math] be the largest integer for which [math]\displaystyle{ {m_\ell\choose \ell}\le m; }[/math]
    [math]\displaystyle{ m=m-m_\ell; }[/math]
    [math]\displaystyle{ \ell=\ell-1; }[/math]
end

We then show that the above algorithm constructs a sequence [math]\displaystyle{ (m_k,m_{k-1},\ldots,m_t) }[/math] with [math]\displaystyle{ m_k\gt m_{k-1}\gt \cdots\gt m_t\ge t\ge 1 }[/math].

Suppose the current [math]\displaystyle{ \ell }[/math] and the current [math]\displaystyle{ m }[/math]. To see that [math]\displaystyle{ m_{\ell-1}\lt m_\ell }[/math], we suppose otherwise [math]\displaystyle{ m_{\ell-1}\ge m_\ell }[/math]. Then

[math]\displaystyle{ m\ge {m_\ell\choose \ell}+{m_{\ell-1}\choose \ell-1}\ge{m_\ell\choose \ell}+{m_{\ell}\choose \ell-1}={1+m_{\ell}\choose \ell} }[/math]

contradicting the maximality of [math]\displaystyle{ m_\ell }[/math]. Therefore, [math]\displaystyle{ m_\ell\gt m_{\ell-1} }[/math].

The algorithm continues reducing [math]\displaystyle{ m }[/math] to smaller nonnegative values, and eventually reaches a stage where the choice of [math]\displaystyle{ m_t }[/math] for some [math]\displaystyle{ t\ge 2 }[/math] where [math]\displaystyle{ {m_t\choose t} }[/math] equals the current [math]\displaystyle{ m }[/math]; or gets right down to choosing [math]\displaystyle{ m_1 }[/math] as the integer such that [math]\displaystyle{ m_1={m_1\choose 1} }[/math] equals the current [math]\displaystyle{ m }[/math].

Therefore, [math]\displaystyle{ m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t} }[/math] and [math]\displaystyle{ m_k\gt m_{k-1}\gt \cdots\gt m_t\ge t\ge 1 }[/math].

The uniqueness of the [math]\displaystyle{ k }[/math]-cascade representation follows by the induction on [math]\displaystyle{ k }[/math].

When [math]\displaystyle{ k=1 }[/math], [math]\displaystyle{ m }[/math] has a unique [math]\displaystyle{ k }[/math]-cascade representation [math]\displaystyle{ m={m\choose 1} }[/math].

For general [math]\displaystyle{ k\gt 1 }[/math], suppose that every nonnegative integer [math]\displaystyle{ m }[/math] has a unique [math]\displaystyle{ (k-1) }[/math]-cascade representation.

Suppose then that [math]\displaystyle{ m }[/math] has two [math]\displaystyle{ k }[/math]-cascade representations:

[math]\displaystyle{ m={a_k\choose k}+{a_{k-1}\choose k-1}+\cdots+{a_t\choose t}={b_k\choose k}+{b_{k-1}\choose k-1}+\cdots+{b_r\choose r} }[/math].

We then show that it must hold that [math]\displaystyle{ a_k=b_k }[/math]. If [math]\displaystyle{ a_k\neq b_k }[/math], WLOG, suppose that [math]\displaystyle{ a_k\lt b_k }[/math]. We obtain

[math]\displaystyle{ \begin{align} m& ={a_k\choose k}+{a_{k-1}\choose k-1}+\cdots+{a_t\choose t}\\ &\le {a_k\choose k}+{a_{k}-1\choose k-1}+\cdots+{a_k-(k-t)\choose t}+\cdots+{a_k-k+1\choose 1}\\ &={a_k+1\choose k}-1, \end{align} }[/math]

by repeatedly applying the identity

[math]\displaystyle{ {n\choose k}+{n\choose k-1}={n+1\choose k} }[/math].

We then obtain

[math]\displaystyle{ m\lt {a_k+1\choose k}\le{b_k\choose k}\le m }[/math],

which is a contradiction. Therefore, [math]\displaystyle{ a_k=b_k }[/math], and by the induction hypothesis, the remaining value [math]\displaystyle{ m-a_k=m-b_k }[/math] has a unique [math]\displaystyle{ (k-1) }[/math]-cascade representation, so [math]\displaystyle{ a_i=b_i }[/math] for all [math]\displaystyle{ i }[/math].

[math]\displaystyle{ \square }[/math]

Co-lexicographic order of subsets

The co-lexicographic order defined on the sets plays a particularly important role in the investigation of the size of the shadow of a system of [math]\displaystyle{ k }[/math]-sets.

Definition

The co-lexicographic (colex) order (also called the reverse lexicographic order) of sets is defined as follows: for any [math]\displaystyle{ A,B\subseteq [n] }[/math], [math]\displaystyle{ A\neq B }[/math],

[math]\displaystyle{ A\lt B }[/math] if [math]\displaystyle{ \max A\setminus B \lt \max B\setminus A }[/math].

We can sort sets in colex order by first writing each set as a tuple, whose elements are in decreasing order, and then sorting the tuples in the lexicographic order of tuples.

For example, [math]\displaystyle{ {[5]\choose 3} }[/math] in colex order is

{3,2,1}
{4,2,1}
{4,3,1}
{4,3,2}
{5,2,1}
{5,3,1}
{5,3,2}
{5,4,1}
{5,4,2}
{5,4,3}

We find that the first [math]\displaystyle{ {n\choose 3} }[/math] sets in this order for [math]\displaystyle{ n=3,4,5 }[/math], form precisely [math]\displaystyle{ {[n]\choose 3} }[/math]. And if we write the colex order of [math]\displaystyle{ {[6]\choose 3} }[/math], the above colex order of [math]\displaystyle{ {[5]\choose 3} }[/math] appears as a prefix of that order. Elaborating on this, we have:

Proposition

Let [math]\displaystyle{ \mathcal{R}(m,k) }[/math] be the first [math]\displaystyle{ m }[/math] sets in the colex order of [math]\displaystyle{ {\mathbb{N}\choose k} }[/math]. Then

[math]\displaystyle{ \mathcal{R}\left({n\choose k},k\right)={[n]\choose k} }[/math],

that is, the first [math]\displaystyle{ {n\choose k} }[/math] sets in the colex order of all [math]\displaystyle{ k }[/math]-sets of natural numbers is precisely [math]\displaystyle{ {[n]\choose k} }[/math].

This proposition says that the sets in [math]\displaystyle{ \mathcal{R}(m,k) }[/math] is highly overlapped, which suggests that [math]\displaystyle{ \mathcal{R}(m,k) }[/math] may have small shadow. The size of the shadow of [math]\displaystyle{ \mathcal{R}(m,k) }[/math] is related to the [math]\displaystyle{ k }[/math]-cascade representation of [math]\displaystyle{ m }[/math].

Theorem

Suppose the [math]\displaystyle{ k }[/math]-cascade representation of [math]\displaystyle{ m }[/math] is

[math]\displaystyle{ m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t} }[/math].

Then

[math]\displaystyle{ |\Delta\mathcal{R}(m,k)|={m_k\choose k-1}+{m_{k-1}\choose k-2}+\cdots+{m_t\choose t-1} }[/math].

The Kruskal–Katona theorem

Theorem (Kruskal 1963, Katona 1966)

Let [math]\displaystyle{ \mathcal{F}\subseteq {X\choose k} }[/math] with [math]\displaystyle{ |\mathcal{F}|=m }[/math], and suppose that

[math]\displaystyle{ m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t} }[/math]

where [math]\displaystyle{ m_k\gt m_{k-1}\gt \cdots\gt m_t\ge t\ge 1 }[/math]. Then

[math]\displaystyle{ |\Delta\mathcal{F}|\ge {m_k\choose k-1}+{m_{k-1}\choose k-2}+\cdots+{m_t\choose t-1} }[/math].

Frankl's shifting proof of Kruskal-Katonal theorem (Frankl 1984)

Proposition

[math]\displaystyle{ \Delta S_{ij}(\mathcal{F})\subseteq S_{ij}(\Delta\mathcal{F}) }[/math].

Proof.

We abuse the notation [math]\displaystyle{ \Delta }[/math] and let [math]\displaystyle{ \Delta A=\Delta\{A\} }[/math] if [math]\displaystyle{ A }[/math] is a set instead of a set system.

It is sufficient to show that for any [math]\displaystyle{ A\in\mathcal{F} }[/math], [math]\displaystyle{ \Delta S_{ij}(A)\subseteq S_{ij}(\Delta\mathcal{F}) }[/math].

File:Property-shift-shadow.png

[math]\displaystyle{ \square }[/math]

An immediate corollary of the above proposition is that the [math]\displaystyle{ (i,j) }[/math]-shifts [math]\displaystyle{ S_{ij} }[/math] for any [math]\displaystyle{ 1\le i\lt j\le n }[/math] do not increase the size of the shadow.

Corollary

[math]\displaystyle{ |\Delta S_{ij}(\mathcal{F})|\le |\Delta\mathcal{F}| }[/math].

Proof.

By the above proposition, [math]\displaystyle{ |\Delta S_{ij}(\mathcal{F})|\le|S_{ij}(\Delta\mathcal{F})| }[/math], and we know that [math]\displaystyle{ S_{ij} }[/math] does not change the cardinality of a set family, that is, [math]\displaystyle{ |S_{ij}(\Delta\mathcal{F})|=|\Delta\mathcal{F}| }[/math], therefore [math]\displaystyle{ \Delta S_{ij}(\mathcal{F})|\le|\Delta\mathcal{F}| }[/math].

[math]\displaystyle{ \square }[/math]

Proof of Kruskal-Katona theorem

We know that shifts never enlarge the shadow, thus it is sufficient to prove the theorem for shifted [math]\displaystyle{ \mathcal{F} }[/math]. We then assume [math]\displaystyle{ \mathcal{F} }[/math] is shifted.

Apply induction on [math]\displaystyle{ m }[/math] and for given [math]\displaystyle{ m }[/math] on [math]\displaystyle{ k }[/math]. The theorem holds trivially for the case that [math]\displaystyle{ k=1 }[/math] and [math]\displaystyle{ m }[/math] is arbitrary.

Define

[math]\displaystyle{ \mathcal{F}_0=\{A\in\mathcal{F}\mid 1\not\in A\} }[/math],

[math]\displaystyle{ \mathcal{F}_1=\{A\in\mathcal{F}\mid 1\in A\} }[/math].

And let [math]\displaystyle{ \mathcal{F}_1'=\{A\setminus\{1\}\mid A\in\mathcal{F}_1\} }[/math].

Clearly [math]\displaystyle{ \mathcal{F}_0,\mathcal{F}_1\subseteq{X\choose k} }[/math], [math]\displaystyle{ \mathcal{F}_1'\subseteq{X\choose k-1} }[/math], and

[math]\displaystyle{ |\mathcal{F}|=|\mathcal{F}_0|+|\mathcal{F}_1|=|\mathcal{F}_0|+|\mathcal{F}_1'| }[/math].

Our induction is based on the following observation regarding the size of the shadow.

Lemma 1

[math]\displaystyle{ |\Delta\mathcal{F}|\ge|\Delta\mathcal{F}_1'|+|\mathcal{F}_1'| }[/math].

Proof.

Obviously [math]\displaystyle{ \mathcal{F}\supseteq\mathcal{F}_1 }[/math] and [math]\displaystyle{ \Delta\mathcal{F}\supseteq\Delta\mathcal{F}_1 }[/math].

[math]\displaystyle{ \begin{align} \Delta\mathcal{F}_1 &=\left\{A\in{X\choose k-1}\,\,\bigg|\,\, \exists B\in\mathcal{F}_1, A\subset B\right\}\\ &=\left\{A\in{X\choose k-1}\,\,\bigg|\,\, 1\in A, \exists B\in\mathcal{F}_1, A\subset B\right\}\\ &\quad\, \cup \left\{A\in{X\choose k-1}\,\,\bigg|\,\, 1\not\in A, \exists B\in\mathcal{F}_1, A\subset B\right\}\\ &=\{S\cup\{1\}\mid S\in\Delta\mathcal{F}_1'\}\cup\mathcal{F}_1'. \end{align} }[/math]

The union is taken over two disjoint families. Therefore,

[math]\displaystyle{ |\Delta\mathcal{F}|\ge|\Delta\mathcal{F}_1|=|\Delta\mathcal{F}_1'|+|\mathcal{F}_1'| }[/math].

[math]\displaystyle{ \square }[/math]

The following property of shifted [math]\displaystyle{ \mathcal{F} }[/math] is essential for our proof.

Lemma 2

For shifted [math]\displaystyle{ \mathcal{F} }[/math], it holds that [math]\displaystyle{ \Delta\mathcal{F}_0\subseteq \mathcal{F}_1' }[/math].

Proof.

If [math]\displaystyle{ A\in\Delta\mathcal{F}_0 }[/math] then [math]\displaystyle{ A\cup\{j\}\in\mathcal{F}_0\subseteq\mathcal{F} }[/math] for some [math]\displaystyle{ j\gt 1 }[/math] so that, since [math]\displaystyle{ \mathcal{F} }[/math] is shifted, applying the [math]\displaystyle{ (1,j) }[/math]-shift [math]\displaystyle{ S_{1j} }[/math], [math]\displaystyle{ A\cup\{1\}\in\mathcal{F} }[/math], thus, [math]\displaystyle{ A\in\mathcal{F}_1' }[/math].

[math]\displaystyle{ \square }[/math]

We then bound the size of [math]\displaystyle{ \mathcal{F}_1' }[/math] as:

Lemma 3

If [math]\displaystyle{ \mathcal{F} }[/math] is shifted, then

[math]\displaystyle{ |\mathcal{F}_1'|\ge{m_k-1\choose k-1}+{m_{k-1}-1\choose k-2}+\cdots+{m_t-1\choose t-1} }[/math].

Proof.

By contradiction, assume that

[math]\displaystyle{ |\mathcal{F}_1'|\lt {m_k-1\choose k-1}+{m_{k-1}-1\choose k-2}+\cdots+{m_t-1\choose t-1} }[/math].

Then by [math]\displaystyle{ |\mathcal{F}|=|\mathcal{F}_0|+|\mathcal{F}_1'| }[/math], it holds that

[math]\displaystyle{ \begin{align} |\mathcal{F}_0| &=m-|\mathcal{F}_1'|\\ &\gt \left\{{m_k\choose k}- {m_k-1\choose k-1}\right\}+\cdots+\left\{{m_t\choose t}- {m_t-1\choose t-1}\right\}\\ &={m_k-1\choose k}+\cdots+{m_t-1\choose t}, \end{align} }[/math]

so that, by the induction hypothesis,

[math]\displaystyle{ |\Delta\mathcal{F}_0|\ge{m_k-1\choose k-1}+{m_{k-1}-1\choose k-1}+\cdots+{m_t-1\choose t-1}\gt |\mathcal{F}_1'| }[/math].

On the other hand, by Lemma 2, [math]\displaystyle{ |\mathcal{F}_1'|\ge|\Delta\mathcal{F}_0| }[/math]. Thus [math]\displaystyle{ |\mathcal{F}_1'|\ge|\Delta\mathcal{F}_0|\gt |\mathcal{F}_1'| }[/math], a contradiction.

[math]\displaystyle{ \square }[/math]

Now we officially apply the induction. By Lemma 1,

[math]\displaystyle{ |\Delta\mathcal{F}|\ge|\Delta\mathcal{F}_1'|+|\mathcal{F}_1'| }[/math].

Note that [math]\displaystyle{ \mathcal{F}_1'\subseteq{X\choose k-1} }[/math], and due to Lemma 3,

[math]\displaystyle{ |\mathcal{F}_1'|\ge{m_k-1\choose k-1}+{m_{k-1}-1\choose k-2}+\cdots+{m_t-1\choose t-1} }[/math],

thus by the induction hypothesis,

[math]\displaystyle{ |\Delta\mathcal{F}_1'|\ge{m_k-1\choose k-2}+{m_{k-1}-1\choose k-3}+\cdots+{m_t-1\choose t-2} }[/math].

Combining them together, we have

[math]\displaystyle{ \begin{align} |\Delta\mathcal{F}| &\ge |\Delta\mathcal{F}_1'|+|\mathcal{F}_1'|\\ &\ge {m_k-1\choose k-2}+\cdots+{m_t-1\choose t-2}+{m_k-1\choose k-1}+\cdots+{m_t-1\choose t-1}\\ &= {m_k\choose k-1}+{m_{k-1}\choose k-2}+\cdots+{m_t\choose t-1}. \end{align} }[/math]

[math]\displaystyle{ \square }[/math]

Shadows of specific sizes

Kruskal-Katona Theorem (general version)

Let [math]\displaystyle{ \mathcal{F}\subseteq {X\choose k} }[/math] with [math]\displaystyle{ |\mathcal{F}|=m }[/math], and suppose that

[math]\displaystyle{ m={m_k\choose k}+{m_{k-1}\choose k-1}+\cdots+{m_t\choose t} }[/math]

where [math]\displaystyle{ m_k\gt m_{k-1}\gt \cdots\gt m_t\ge t\ge 1 }[/math]. Then for all [math]\displaystyle{ r }[/math], [math]\displaystyle{ 1\le r\le k }[/math],

[math]\displaystyle{ \left|\Delta_r\mathcal{F}\right|\ge {m_k\choose r}+{m_{k-1}\choose r-1}+\cdots+{m_t\choose t-k+r} }[/math].

Erdős–Ko–Rado theorem

Erdős–Ko–Rado theorem (proved in 1938, published in 1961)

Let [math]\displaystyle{ \mathcal{F}\subseteq {X\choose k} }[/math] where [math]\displaystyle{ |X|=n }[/math] and [math]\displaystyle{ n\ge 2k }[/math]. If [math]\displaystyle{ \mathcal{F} }[/math] is intersecting, then

[math]\displaystyle{ |\mathcal{F}|\le{n-1\choose k-1} }[/math].

Proof of Erdős–Ko–Rado theorem by Kruskal-Katona theorem

(Daykin 1974, Clements 1976)

[math]\displaystyle{ \square }[/math]

Combinatorics (Fall 2010)/Extremal set theory II

Contents

Sauer's lemma and VC-dimension

Shattering

Down-shifts

The Vapnik-Chervonenkis dimenison

The Kruskal–Katona Theorem

[math]\displaystyle{ k }[/math]-cascade representation of a number

Co-lexicographic order of subsets

The Kruskal–Katona theorem

Shadows of specific sizes

Erdős–Ko–Rado theorem

Navigation menu

Combinatorics (Fall 2010)/Extremal set theory II

Sauer's lemma and VC-dimension

Shattering

Down-shifts

The Vapnik-Chervonenkis dimenison

The Kruskal–Katona Theorem

[math]\displaystyle{ k }[/math]-cascade representation of a number

Co-lexicographic order of subsets

The Kruskal–Katona theorem

Shadows of specific sizes

Erdős–Ko–Rado theorem

Navigation menu

Search