Difference between revisions of "组合数学 (Fall 2019)/Extremal set theory"
(Created page with "== Sunflowers == An set system is a '''sunflower''' if all its member sets intersect at the same set of elements. {{TheoremDefinition (sunflower) : A set family <math>\mathc...") 
(No difference)

Latest revision as of 07:54, 28 November 2019
Sunflowers
An set system is a sunflower if all its member sets intersect at the same set of elements.
Definition (sunflower)  A set family is a sunflower of size with a core if
 that , .
 A set family is a sunflower of size with a core if
Note that we do not require the core to be nonempty, thus a family of disjoint sets is also a sunflower (with the core ).
The next result due to Erdős and Rado, called the sunflower lemma, is a famous result in extremal set theory, and has some important applications in Boolean circuit complexity.
Sunflower Lemma (ErdősRado)  Let . If , then contains a sunflower of size .
Proof. We proceed by induction on . For , , thus all sets in are disjoint. And since , we can choose of these sets and form a sunflower.
Now let and assume the lemma holds for all smaller . Take a maximal family whose members are disjoint, i.e. for any that , .
If , then is a sunflower of size at least and we are done.
Assume that , and let . Then (since all members of ) are disjoint). We claim that intersets all members of , since if otherwise, there exists an such that , then we can enlarge by adding into and still have all members of disjoint, which contradicts the assumption that is the maximum of such families.
By the pigeonhole principle, some elements must contained in at least
members of . We delete this from these sets and consider the family
 .
We have and , thus by the induction hypothesis, contains a sunflower of size . Adding to the members of this sunflower, we get the desired sunflower in the original family .
The Erdős–Ko–Rado Theorem
A set family is called intersecting, if for any , . A natural question of extremal favor is: "how large can an intersecting family be?"
Assume . When , every pair of subsets of intersects. So the nontrivial case is when . The famous Erdős–Ko–Rado theorem gives the largest possible cardinality of a nontrivially intersecting family.
According to Erdős, the theorem itself was proved in 1938, but was not published until 23 years later.
Erdős–Ko–Rado theorem (proved in 1938, published in 1961)  Let where and . If is intersecting, then
 .
 Let where and . If is intersecting, then
Katona's proof
We first introduce a proof discovered by Katona in 1972. The proof uses double counting.
Let be a cyclic permutation of , that is, we think of assigning in a circle and ignore the rotations of the circle. It is easy to see that there are cyclic permutations of an set (each cyclic permutation corresponds to permutations). Let
 .
The next lemma states the following observation: in a circle of points, supposed , there can be at most arcs, each consisting of points, such that every pair of arcs share at least one point.
Lemma  Let where and . If is intersecting, then for any cyclic permutation of , it holds that .
Proof. Fix a cyclic permutation of . Let . Then can be written as .
Suppose that . Since is intersecting, the only sets that can be in other than itself are the sets with . We partition these sets into pairs , where .
Note that for , it holds that . Since is intersecting, can contain at most one set of each such pair. The lemma follows.
The Katona's proof of Erdős–Ko–Rado theorem is done by counting in two ways the pairs of member of and cyclic permutation which contain as a continuous path on the circle (i.e., an arc).
Katona's proof of Erdős–Ko–Rado theorem (double counting) Let
 .
We count in two ways.
First, due to the lemma, for any cyclic permutation . There are cyclic permutations in total. Thus,
 .
Next, for each , the number of cyclic permutations in which is continuous is . Thus,
 .
Altogether, we have
 .
Erdős' shifting technique
We now introduce the original proof of the Erdős–Ko–Rado theorem, which uses a technique called shifting (originally called compression).
Without loss of generality, we assume , and restate the Erdős–Ko–Rado theorem as follows.
Erdős–Ko–Rado theorem  Let and . If is intersecting, then .
We define a shift operator for the set family.
Definition (shift operator)  Assume , and . Define the shift as an operator on as follows:
 for each , write , and let
 let .
 Assume , and . Define the shift as an operator on as follows:
It is easy to verify the following propositions of shifts.
Proposition  and ;
 if is intersecting, then so is .
Proof. (1) is quite obvious. Now we prove (2).
Consider any . If includes any element other than then and are still intersecting after shift. Thus without loss of generality, we may consider the only unsafe case where , and is successfully shifted to but fails to shift to .
Since is successfully shifted to , we know that it must hold that and . And the only two reasons for which may fail to shift to are: (1) and (2) but .
Case.1: . In this case, since , we have , i.e. is still intersecting with after shifting.
Case.2: but . In this case, it is easy to verify that . Recall that we assume . This contradicts to that is intersecting.
In conclusion, in all cases, remains intersecting.
Repeatedly applying for any , since we only replace elements by smaller elements, eventually will stop changing, that is, for all . We call such an shifted.
The idea behind the shifting technique is very natural: by applying shifting, all intersecting families are transformed to some special forms, and we only need to prove the theorem for these special form of intersecting families.
Proof of ErdősKoRado theorem (The original proof of ErdősKoRado by shifting) By the above lemma, it is sufficient to prove the ErdősKoRado theorem holds for shifted . We assume that is shifted.
First, it is trivial to see that the theorem holds for (no matter whether shifted).
Next, we show that the theorem holds when (no matter whether shifted). For any , both and are in , but at most one of them can be in . Thus,
 .
We then apply the induction on . For , the induction hypothesis is stated as:
 the ErdősKoRado theorem holds for any smaller .
Define
 and .
Clearly, and is intersecting. Due to the induction hypothesis, .
In order to apply the induction, we let
 .
Clearly, . If only it is also intersecting, we can apply the induction hypothesis, and indeed it is. To see this, by contradiction we assume that is not intersecting. Then there must exist such that , which means that . Thus, there is some such that . Since is shifted, . On the other hand it can be verified that , which contradicts that is intersecting.
Thus, and is intersecting. Due to the induction hypothesis, .
Combining these together,
 .
Sperner system
A set family with the relation define a poset. Thus, a chain is a sequence .
A set family is an antichain (also called a Sperner system) if for all that , we have .
The uniform is an antichain. Let . The size of is maximized when . We wonder whether this is also the largest possible size of any antichain .
In 1928, Emanuel Sperner proved a theorem saying that it is indeed the largest possible antichain. This result, called Sperner's theorem today, initiated the studies of extremal set theory.
Theorem (Sperner 1928)  Let where . If is an antichain, then
 .
 Let where . If is an antichain, then
First proof (shadows)
We first introduce the original proof by Sperner, which uses concepts called shadows and shades of set systems.
Definition  Let and , .
 The shade of is defined to be
 .
 Thus the shade of consists of all subsets of which can be obtained by adding an element to a set in .
 Similarly, the shadow of is defined to be
 .
 Thus the shadow of consists of all subsets of which can be obtained by removing an element from a set in .
Next lemma bounds the effects of shadows and shades on the sizes of set systems.
Lemma (Sperner)  Let and . Then
 Let and . Then
Proof. The lemma is proved by double counting. We prove the inequality of . Assume that .
Define
 .
We estimate in two ways.
For each , there are different that .
 .
For each , there are ways to choose an with , some of which may not be in .
 .
Altogether, we show that .
The inequality of can be proved in the same way.
An immediate corollary of the previous lemma is as follows.
Proposition 1  If , then .
 If , then .
The idea of Sperner's proof is pretty clear:
 we "push up" all the sets in of size replacing them by their shades;
 and also "push down" all the sets in of size replacing them by their shadows.
Repeat this process we end up with a set system . We need to show that this process does not decrease the size of .
Proposition 2  Suppose that where . Let . Let be the smallest that , and let
 Similarly, let be the largest that , and let
 If is an antichain, and are antichains, and we have and .
 Suppose that where . Let . Let be the smallest that , and let
Proof. We show that is an antichain and .
First, observe that , otherwise cannot be an antichain, and due to Proposition 1, when , so .
Now we prove that is an antichain . By contradiction, assume that there are , such that . One of the must be in , or otherwise cannot be an antichain. Recall that is the smallest that , thus it must be , and . This implies that there is an such that , which contradicts that is an antichain.
The statement for can be proved in the same way.
Applying the above process, we prove the Sperner's theorem.
Proof of Sperner's theorem (original proof of Sperner) Let , where .
We change as follows:
 for the smallest that , if , replace by .
Due to Proposition 2, this procedure preserves as an antichain and does not decrease . Repeat this procedure, until for all , that is, there is no member set of has size less than .
We then define another symmetric procedure:
 for the largest that , if , replace by .
Also due to Proposition 2, this procedure preserves as an antichain and does not decrease . After repeatedly applying this procedure, for all .
The resulting has , and since is never decreased, for the original , we have
 .
Second proof (counting)
We now introduce an elegant proof due to Lubell. The proof uses a counting argument, and tells more information than just the size of the set system.
Proof of Sperner's theorem (Lubell 1966) Let be a permutation of . We say that an prefixes , if , that is, is precisely the set of the first elements in the permutation .
Fix an . It is easy to see that the number of permutations of prefixed by is . Also, since is an antichain, no permutation of can be prefixed by more than one members of , otherwise one of the member sets must contain the other, which contradicts that is an antichain. Thus, the number of permutations prefixed by some is
 ,
which cannot be larger than the total number of permutations, , therefore,
 .
Dividing both sides by , we have
 ,
where , so
 .
Combining this with the above inequality, we prove the Sperner's theorem.
The LYM inequality
Lubell's proof proves the following inequality:
which is actually stronger than Sperner's original statement that .
This inequality is independently discovered by LubellYamamoto, Meschalkin, and Bollobás, and is called the LYM inequality today.
Theorem (Lubell, Yamamoto 1954; Meschalkin 1963)  Let where . If is an antichain, then
 .
 Let where . If is an antichain, then
In Lubell's counting argument proves the LYM inequality, which implies the Sperner's theorem. Here we give another proof of the LYM inequality by the probabilistic method, due to Noga Alon.
Third proof (the probabilistic method) (Due to Alon.) Let be a uniformly random permutation of . Define a random maximal chain by
 .
For any , let be the 01 random variable which indicates whether , that is
Note that for a uniformly random , has exact one member set of size , uniformly distributed over , thus
 .
Let . Note that . By the linearity of expectation,
 .
On the other hand, since is an antichain, it can never intersect a chain at more than one elements, thus we always have . Therefore,
 .
The Sperner's theorem is an immediate consequence of the LYM inequality.
Proposition  implies that .
Proof. It holds that for any . Thus,
 ,
which implies that .
Sauer's lemma and VCdimension
Shattering and the VCdimension
Definition (shatter)  Let be set family and let be a subset. The trace of on , denoted is defined as
 .
 We say that shatters if , i.e. for all , there exists an such that .
 Let be set family and let be a subset. The trace of on , denoted is defined as
The VC dimension is defined by the power of a family to shatter a set.
Definition (VCdimension)  The Vapnik–Chervonenkis dimension (VCdimension) of a set family , denoted , is the size of the largest shattered by .
It is a core concept in computational learning theory.
Each subset can be equivalently represented by its characteristic function , such that for each ,
Then a set family corresponds to a collection of boolean functions , which is a subset of all Boolean functions in the form . We wonder on how large a subdomain , includes all the mappings . The largest size of such subdomain is the VCdimension. It measures how complicated a collection of boolean functions (or equivalently a set family) is.
Sauer's Lemma
The definition of the VCdimension involves enumerating all subsets, thus is difficult to analyze in general. The following famous result state a very simple sufficient condition to lower bound the VCdimension, regarding only the size of the family. The lemma is due to Sauer, and independently due to Shelah and Perles. A slightly weaker version is found by Vapnik and Chervonenkis, who use the framework to develop a theory of classifications.
Sauer's Lemma (Sauer; ShelahPerles; VapnikChervonenkis)  Let where . If , then there exists an such that shatters .
In other words, for any set family with , its VCdimension .
Hereditary family
We note the Sauer's lemma is especially easy to prove for a special type of set families, called the hereditary families.
Definition (hereditary family)  A set system is said to be hereditary (also called an ideal or an abstract simplicial complex), if
 implies .
 A set system is said to be hereditary (also called an ideal or an abstract simplicial complex), if
In other words, for a hereditary family , if , then all subsets of are also in . An immediate consequence is the following proposition.
Proposition  Let be a hereditary family. If then shatters .
Therefore, it is very easy to prove the Sauer's lemma for hereditary families:
Lemma  For , , if is hereditary and then there exists an such that shatters .
Proof. Since is hereditary, we only need to show that there exists an of size , which must be true, because if all members of are of sizes , then , a contradiction.
To prove the Sauer's lemma for general nonhereditary families, we can use some way to reduce arbitrary families to hereditary families. Here we apply the shifting technique to achieve this.
Downshifts
Note that we work on , instead of like in the Erdős–Ko–Rado theorem, so we do not need to preserve the size of member sets. Instead, we need to reduce an arbitrary family to a hereditary one, thus we use a shift operator which replaces a member set by a subset of it.
Definition (downshifts)  Assume , and . Define the downshift operator as follows:
 for each , let
 let .
 Assume , and . Define the downshift operator as follows:
Repeatedly applying to for all , due to the finiteness, eventually is not changed by any . We call such a family downshifted. A family is downshifted if and only if for all . It is then easy to see that a downshifted must be hereditary.
Theorem  If is downshifted, then is hereditary.
In order to use downshift to prove the Sauer's lemma, we need to make sure that downshift does not decrease and does not increase the VCdimension .
Proposition  ;
 , thus if shatters an , so does .
(1) is immediate. (2) is proved by case analysis. We omit the proof.
 Proof of Sauer's lemma
Now we can prove the Sauer's lemma for arbitrary .
For any , repeatedly apply for all till the family is downshifted, which is denoted by . We have proved that and is hereditary, thus as argued before, here exists an of size shattered by . By the above proposition, , thus also shatters . The lemma is proved.