组合数学 (Fall 2017)/Extremal set theory

From EtoneWiki
Jump to: navigation, search

Sunflowers

An set system is a sunflower if all its member sets intersect at the same set of elements.

Definition (sunflower)
A set family is a sunflower of size with a core if
that , .

Note that we do not require the core to be nonempty, thus a family of disjoint sets is also a sunflower (with the core ).

The next result due to Erdős and Rado, called the sunflower lemma, is a famous result in extremal set theory, and has some important applications in Boolean circuit complexity.

Sunflower Lemma (Erdős-Rado)
Let . If , then contains a sunflower of size .
Proof.

We proceed by induction on . For , , thus all sets in are disjoint. And since , we can choose of these sets and form a sunflower.

Now let and assume the lemma holds for all smaller . Take a maximal family whose members are disjoint, i.e. for any that , .

If , then is a sunflower of size at least and we are done.

Assume that , and let . Then (since all members of ) are disjoint). We claim that intersets all members of , since if otherwise, there exists an such that , then we can enlarge by adding into and still have all members of disjoint, which contradicts the assumption that is the maximum of such families.

By the pigeonhole principle, some elements must contained in at least

members of . We delete this from these sets and consider the family

.

We have and , thus by the induction hypothesis, contains a sunflower of size . Adding to the members of this sunflower, we get the desired sunflower in the original family .

The Erdős–Ko–Rado Theorem

A set family is called intersecting, if for any , . A natural question of extremal favor is: "how large can an intersecting family be?"

Assume . When , every pair of -subsets of intersects. So the non-trivial case is when . The famous Erdős–Ko–Rado theorem gives the largest possible cardinality of a nontrivially intersecting family.

According to Erdős, the theorem itself was proved in 1938, but was not published until 23 years later.

Erdős–Ko–Rado theorem (proved in 1938, published in 1961)
Let where and . If is intersecting, then
.

Katona's proof

We first introduce a proof discovered by Katona in 1972. The proof uses double counting.

Let be a cyclic permutation of , that is, we think of assigning in a circle and ignore the rotations of the circle. It is easy to see that there are cyclic permutations of an -set (each cyclic permutation corresponds to permutations). Let

.

The next lemma states the following observation: in a circle of points, supposed , there can be at most arcs, each consisting of points, such that every pair of arcs share at least one point.

Lemma
Let where and . If is intersecting, then for any cyclic permutation of , it holds that .
Proof.

Fix a cyclic permutation of . Let . Then can be written as .

Suppose that . Since is intersecting, the only sets that can be in other than itself are the sets with . We partition these sets into pairs , where .

Note that for , it holds that . Since is intersecting, can contain at most one set of each such pair. The lemma follows.

The Katona's proof of Erdős–Ko–Rado theorem is done by counting in two ways the pairs of member of and cyclic permutation which contain as a continuous path on the circle (i.e., an arc).

Katona's proof of Erdős–Ko–Rado theorem
(double counting)

Let

.

We count in two ways.

First, due to the lemma, for any cyclic permutation . There are cyclic permutations in total. Thus,

.

Next, for each , the number of cyclic permutations in which is continuous is . Thus,

.

Altogether, we have

.

Erdős' shifting technique

We now introduce the original proof of the Erdős–Ko–Rado theorem, which uses a technique called shifting (originally called compression).

Without loss of generality, we assume , and restate the Erdős–Ko–Rado theorem as follows.

Erdős–Ko–Rado theorem
Let and . If is intersecting, then .

We define a shift operator for the set family.

Definition (shift operator)
Assume , and . Define the -shift as an operator on as follows:
  • for each , write , and let
  • let .

It is easy to verify the following propositions of shifts.

Proposition
  1. and ;
  2. if is intersecting, then so is .
Proof.

(1) is quite obvious. Now we prove (2).

Consider any . If includes any element other than then and are still intersecting after -shift. Thus without loss of generality, we may consider the only unsafe case where , and is successfully shifted to but fails to shift to .

Since is successfully shifted to , we know that it must hold that and . And the only two reasons for which may fail to shift to are: (1) and (2) but .

Case.1: . In this case, since , we have , i.e. is still intersecting with after shifting.

Case.2: but . In this case, it is easy to verify that . Recall that we assume . This contradicts to that is intersecting.

In conclusion, in all cases, remains intersecting.

Repeatedly applying for any , since we only replace elements by smaller elements, eventually will stop changing, that is, for all . We call such an shifted.

The idea behind the shifting technique is very natural: by applying shifting, all intersecting families are transformed to some special forms, and we only need to prove the theorem for these special form of intersecting families.

Proof of Erdős-Ko-Rado theorem
(The original proof of Erdős-Ko-Rado by shifting)

By the above lemma, it is sufficient to prove the Erdős-Ko-Rado theorem holds for shifted . We assume that is shifted.

First, it is trivial to see that the theorem holds for (no matter whether shifted).

Next, we show that the theorem holds when (no matter whether shifted). For any , both and are in , but at most one of them can be in . Thus,

.

We then apply the induction on . For , the induction hypothesis is stated as:

  • the Erdős-Ko-Rado theorem holds for any smaller .

Define

and .

Clearly, and is intersecting. Due to the induction hypothesis, .

In order to apply the induction, we let

.

Clearly, . If only it is also intersecting, we can apply the induction hypothesis, and indeed it is. To see this, by contradiction we assume that is not intersecting. Then there must exist such that , which means that . Thus, there is some such that . Since is shifted, . On the other hand it can be verified that , which contradicts that is intersecting.

Thus, and is intersecting. Due to the induction hypothesis, .

Combining these together,

.

Sperner system

A set family with the relation define a poset. Thus, a chain is a sequence .

A set family is an antichain (also called a Sperner system) if for all that , we have .

The -uniform is an antichain. Let . The size of is maximized when . We wonder whether this is also the largest possible size of any antichain .

In 1928, Emanuel Sperner proved a theorem saying that it is indeed the largest possible antichain. This result, called Sperner's theorem today, initiated the studies of extremal set theory.

Theorem (Sperner 1928)
Let where . If is an antichain, then
.

First proof (shadows)

We first introduce the original proof by Sperner, which uses concepts called shadows and shades of set systems.

Definition
Let and , .
The shade of is defined to be
.
Thus the shade of consists of all subsets of which can be obtained by adding an element to a set in .
Similarly, the shadow of is defined to be
.
Thus the shadow of consists of all subsets of which can be obtained by removing an element from a set in .

Next lemma bounds the effects of shadows and shades on the sizes of set systems.

Lemma (Sperner)
Let and . Then
Proof.

The lemma is proved by double counting. We prove the inequality of . Assume that .

Define

.

We estimate in two ways.

For each , there are different that .

.

For each , there are ways to choose an with , some of which may not be in .

.

Altogether, we show that .

The inequality of can be proved in the same way.

An immediate corollary of the previous lemma is as follows.

Proposition 1
If , then .
If , then .

The idea of Sperner's proof is pretty clear:

  • we "push up" all the sets in of size replacing them by their shades;
  • and also "push down" all the sets in of size replacing them by their shadows.

Repeat this process we end up with a set system . We need to show that this process does not decrease the size of .

Proposition 2
Suppose that where . Let . Let be the smallest that , and let
Similarly, let be the largest that , and let
If is an antichain, and are antichains, and we have and .
Proof.

We show that is an antichain and .

First, observe that , otherwise cannot be an antichain, and due to Proposition 1, when , so .

Now we prove that is an antichain . By contradiction, assume that there are , such that . One of the must be in , or otherwise cannot be an antichain. Recall that is the smallest that , thus it must be , and . This implies that there is an such that , which contradicts that is an antichain.

The statement for can be proved in the same way.

Applying the above process, we prove the Sperner's theorem.

Proof of Sperner's theorem
(original proof of Sperner)

Let , where .

We change as follows:

  • for the smallest that , if , replace by .

Due to Proposition 2, this procedure preserves as an antichain and does not decrease . Repeat this procedure, until for all , that is, there is no member set of has size less than .

We then define another symmetric procedure:

  • for the largest that , if , replace by .

Also due to Proposition 2, this procedure preserves as an antichain and does not decrease . After repeatedly applying this procedure, for all .

The resulting has , and since is never decreased, for the original , we have

.

Second proof (counting)

We now introduce an elegant proof due to Lubell. The proof uses a counting argument, and tells more information than just the size of the set system.

Proof of Sperner's theorem
(Lubell 1966)

Let be a permutation of . We say that an prefixes , if , that is, is precisely the set of the first elements in the permutation .

Fix an . It is easy to see that the number of permutations of prefixed by is . Also, since is an antichain, no permutation of can be prefixed by more than one members of , otherwise one of the member sets must contain the other, which contradicts that is an antichain. Thus, the number of permutations prefixed by some is

,

which cannot be larger than the total number of permutations, , therefore,

.

Dividing both sides by , we have

,

where , so

.

Combining this with the above inequality, we prove the Sperner's theorem.

The LYM inequality

Lubell's proof proves the following inequality:

which is actually stronger than Sperner's original statement that .

This inequality is independently discovered by Lubell-Yamamoto, Meschalkin, and Bollobás, and is called the LYM inequality today.

Theorem (Lubell, Yamamoto 1954; Meschalkin 1963)
Let where . If is an antichain, then
.

In Lubell's counting argument proves the LYM inequality, which implies the Sperner's theorem. Here we give another proof of the LYM inequality by the probabilistic method, due to Noga Alon.

Third proof (the probabilistic method)
(Due to Alon.)

Let be a uniformly random permutation of . Define a random maximal chain by

.

For any , let be the 0-1 random variable which indicates whether , that is

Note that for a uniformly random , has exact one member set of size , uniformly distributed over , thus

.

Let . Note that . By the linearity of expectation,

.

On the other hand, since is an antichain, it can never intersect a chain at more than one elements, thus we always have . Therefore,

.

The Sperner's theorem is an immediate consequence of the LYM inequality.

Proposition
implies that .
Proof.

It holds that for any . Thus,

,

which implies that .

Sauer's lemma and VC-dimension

Shattering and the VC-dimension

Definition (shatter)
Let be set family and let be a subset. The trace of on , denoted is defined as
.
We say that shatters if , i.e. for all , there exists an such that .

The VC dimension is defined by the power of a family to shatter a set.

Definition (VC-dimension)
The Vapnik–Chervonenkis dimension (VC-dimension) of a set family , denoted , is the size of the largest shattered by .

It is a core concept in computational learning theory.

Each subset can be equivalently represented by its characteristic function , such that for each ,

Then a set family corresponds to a collection of boolean functions , which is a subset of all Boolean functions in the form . We wonder on how large a subdomain , includes all the mappings . The largest size of such subdomain is the VC-dimension. It measures how complicated a collection of boolean functions (or equivalently a set family) is.

Sauer's Lemma

The definition of the VC-dimension involves enumerating all subsets, thus is difficult to analyze in general. The following famous result state a very simple sufficient condition to lower bound the VC-dimension, regarding only the size of the family. The lemma is due to Sauer, and independently due to Shelah and Perles. A slightly weaker version is found by Vapnik and Chervonenkis, who use the framework to develop a theory of classifications.

Sauer's Lemma (Sauer; Shelah-Perles; Vapnik-Chervonenkis)
Let where . If , then there exists an such that shatters .

In other words, for any set family with , its VC-dimension .

Hereditary family

We note the Sauer's lemma is especially easy to prove for a special type of set families, called the hereditary families.

Definition (hereditary family)
A set system is said to be hereditary (also called an ideal or an abstract simplicial complex), if
implies .

In other words, for a hereditary family , if , then all subsets of are also in . An immediate consequence is the following proposition.

Proposition
Let be a hereditary family. If then shatters .

Therefore, it is very easy to prove the Sauer's lemma for hereditary families:

Lemma
For , , if is hereditary and then there exists an such that shatters .
Proof.

Since is hereditary, we only need to show that there exists an of size , which must be true, because if all members of are of sizes , then , a contradiction.

To prove the Sauer's lemma for general non-hereditary families, we can use some way to reduce arbitrary families to hereditary families. Here we apply the shifting technique to achieve this.

Down-shifts

Note that we work on , instead of like in the Erdős–Ko–Rado theorem, so we do not need to preserve the size of member sets. Instead, we need to reduce an arbitrary family to a hereditary one, thus we use a shift operator which replaces a member set by a subset of it.

Definition (down-shifts)
Assume , and . Define the down-shift operator as follows:
  • for each , let
  • let .

Repeatedly applying to for all , due to the finiteness, eventually is not changed by any . We call such a family down-shifted. A family is down-shifted if and only if for all . It is then easy to see that a down-shifted must be hereditary.

Theorem
If is down-shifted, then is hereditary.

In order to use down-shift to prove the Sauer's lemma, we need to make sure that down-shift does not decrease and does not increase the VC-dimension .

Proposition
  1. ;
  2. , thus if shatters an , so does .

(1) is immediate. (2) is proved by case analysis. We omit the proof.

Proof of Sauer's lemma

Now we can prove the Sauer's lemma for arbitrary .

For any , repeatedly apply for all till the family is down-shifted, which is denoted by . We have proved that and is hereditary, thus as argued before, here exists an of size shattered by . By the above proposition, , thus also shatters . The lemma is proved.