Combinatorics (Fall 2010)/Finite set systems

Revision as of 02:42, 26 October 2010

Systems of Distinct Representatives (SDR)

A system of distinct representatives (SDR) (also called a transversal) for a sequence of (not necessarily distinct) sets [math]\displaystyle{ S_1,S_2,\ldots,S_m }[/math] is a sequence of distinct elements [math]\displaystyle{ x_1,x_2,\ldots,x_m }[/math] such that [math]\displaystyle{ x_i\in S_i }[/math] for all [math]\displaystyle{ i=1,2,\ldots,m }[/math].

Hall's marriage theorem

If the sets [math]\displaystyle{ S_1,S_2,\ldots,S_m }[/math] have a system of distinct representatives [math]\displaystyle{ x_1\in S_1,x_2\in S_2,\ldots,x_m\in S_m }[/math], then it is obvious that [math]\displaystyle{ \left|S_1\cup S_2\cup\cdots\cup S_m\right|\ge |\{x_1,x_2,\ldots,x_m\}|=m }[/math]. Moreover, for any subset [math]\displaystyle{ I\subseteq\{1,2,\ldots,m\} }[/math],

[math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\ge |\{x_i\mid i\in I\}|=|I| }[/math]

because [math]\displaystyle{ x_1,x_2,\ldots,x_m }[/math] are distinct.

Surprisingly, this obvious necessary condition for the existence of an SDR is also sufficient. This is Hall's theorem, also known as the marriage theorem.

Hall's Theorem
The sets [math]\displaystyle{ S_1,S_2,\ldots,S_m }[/math] have a system of distinct representatives (SDR) if and only if
[math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\ge |I| }[/math] for all [math]\displaystyle{ I\subseteq\{1,2,\ldots,m\} }[/math].

The condition that [math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\ge |I| }[/math] for all [math]\displaystyle{ I\subseteq\{1,2,\ldots,m\} }[/math] is called Hall's condition.
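On small instances, both Hall's condition and the existence of an SDR can be checked by brute force; the following Python sketch (exponential time, function names my own) illustrates the two sides of the theorem.

```python
from itertools import combinations

def hall_condition(sets):
    """Check Hall's condition: |union of S_i, i in I| >= |I| for every
    nonempty index set I."""
    m = len(sets)
    for k in range(1, m + 1):
        for I in combinations(range(m), k):
            union = set().union(*(sets[i] for i in I))
            if len(union) < len(I):
                return False
    return True

def find_sdr(sets):
    """Brute-force search for a system of distinct representatives.
    Returns a list x_1, ..., x_m with x_i in sets[i], or None."""
    def extend(i, used):
        if i == len(sets):
            return []
        for x in sets[i]:
            if x not in used:
                rest = extend(i + 1, used | {x})
                if rest is not None:
                    return [x] + rest
        return None
    return extend(0, set())
```

By Hall's theorem, `find_sdr` succeeds exactly on those inputs where `hall_condition` holds.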

Proof.

We only need to prove the sufficiency of Hall's condition for the existence of an SDR. We do so by induction on [math]\displaystyle{ m }[/math]. When [math]\displaystyle{ m=1 }[/math], the theorem trivially holds. Now assume that the theorem holds for all integers smaller than [math]\displaystyle{ m }[/math].

A subcollection of sets [math]\displaystyle{ \{S_i\mid i\in I\} }[/math], [math]\displaystyle{ |I|\lt m\, }[/math], is called a critical family if [math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|=|I| }[/math].

Case 1: There is no critical family, i.e. for each [math]\displaystyle{ I\subset\{1,2,\ldots,m\} }[/math], [math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\gt |I| }[/math].

Take [math]\displaystyle{ S_m }[/math] and choose an arbitrary [math]\displaystyle{ x\in S_m }[/math] as its representative. Remove [math]\displaystyle{ x }[/math] from all other sets by letting [math]\displaystyle{ S'_i=S_i\setminus \{x\} }[/math] for [math]\displaystyle{ 1\le i\le m-1 }[/math]. Then for all [math]\displaystyle{ I\subseteq\{1,2,\ldots,m-1\} }[/math],

[math]\displaystyle{ \left|\bigcup_{i\in I}S_i'\right|\ge \left|\bigcup_{i\in I}S_i\right|-1\ge |I| }[/math].

Due to the induction hypothesis, [math]\displaystyle{ S'_1,\ldots,S'_{m-1} }[/math] have an SDR, say [math]\displaystyle{ x_1\in S'_1,\ldots,x_{m-1}\in S'_{m-1} }[/math]. None of them equals [math]\displaystyle{ x }[/math], because [math]\displaystyle{ x }[/math] has been removed from every [math]\displaystyle{ S'_i }[/math]. Thus, [math]\displaystyle{ x_1,\ldots,x_{m-1} }[/math] and [math]\displaystyle{ x\, }[/math] form an SDR for [math]\displaystyle{ S_1,S_2,\ldots,S_m }[/math].

Case 2: There is a critical family, i.e. [math]\displaystyle{ \exists I\subset\{1,2,\ldots,m\}, |I|\lt m }[/math], such that [math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|=|I| }[/math].

Suppose [math]\displaystyle{ S_{m-k+1},\ldots, S_m }[/math] are such a collection of [math]\displaystyle{ k\lt m }[/math] sets. Hall's condition certainly holds for these sets. Since [math]\displaystyle{ k\lt m }[/math], due to the induction hypothesis, there is an SDR for the sets, say [math]\displaystyle{ x_{m-k+1}\in S_{m-k+1},\ldots,x_m\in S_m }[/math].

Again, remove these [math]\displaystyle{ k }[/math] elements from the remaining sets by letting [math]\displaystyle{ S'_i=S_i\setminus\{x_{m-k+1},\ldots,x_m\} }[/math] for [math]\displaystyle{ 1\le i\le m-k }[/math]. By Hall's condition, for any [math]\displaystyle{ I\subseteq\{1,2,\ldots,m-k\} }[/math], writing [math]\displaystyle{ S=\bigcup_{i\in I}S_i\cup\bigcup_{i=m-k+1}^m S_i }[/math],

[math]\displaystyle{ \left|S\right|\ge |I|+k }[/math],

thus

[math]\displaystyle{ \left|\bigcup_{i\in I}S_i'\right|\ge |S|-\left|\bigcup_{i=m-k+1}^m S_i\right|\ge |I| }[/math].

Due to the induction hypothesis, there is an SDR for [math]\displaystyle{ S_1,\ldots, S_{m-k} }[/math], say [math]\displaystyle{ x_1\in S_1,\ldots, x_{m-k}\in S_{m-k} }[/math]. Combining it with the SDR [math]\displaystyle{ x_{m-k+1},\ldots, x_{m} }[/math] for [math]\displaystyle{ S_{m-k+1},\ldots, S_{m} }[/math], we have an SDR for [math]\displaystyle{ S_{1},\ldots, S_{m} }[/math].

[math]\displaystyle{ \square }[/math]

Hall's theorem is usually stated as a theorem about the existence of a matching in a bipartite graph.

In a graph [math]\displaystyle{ G(V,E) }[/math], a matching [math]\displaystyle{ M\subseteq E }[/math] is an independent set of edges, that is, no two distinct edges [math]\displaystyle{ e_1,e_2\in M }[/math] share a common vertex.

In a bipartite graph [math]\displaystyle{ G(U,V,E) }[/math], we say [math]\displaystyle{ M }[/math] is a matching of [math]\displaystyle{ U }[/math] (or a matching of [math]\displaystyle{ V }[/math]), if every vertex in [math]\displaystyle{ U }[/math] (or [math]\displaystyle{ V }[/math]) is adjacent to some edge in [math]\displaystyle{ M }[/math], i.e., all vertices in [math]\displaystyle{ U }[/math] (or [math]\displaystyle{ V }[/math]) are matched.

In a graph [math]\displaystyle{ G(V,E) }[/math], for any vertex [math]\displaystyle{ v\in V }[/math], let [math]\displaystyle{ N(v) }[/math] denote the set of neighbors of [math]\displaystyle{ v }[/math] in [math]\displaystyle{ G }[/math]; and for any vertex set [math]\displaystyle{ S\subseteq V }[/math], we extend the notation to [math]\displaystyle{ N(S)=\bigcup_{v\in S}N(v) }[/math], i.e. the set of vertices adjacent to at least one vertex in [math]\displaystyle{ S }[/math].

Hall's Theorem (graph theory form)
A bipartite graph [math]\displaystyle{ G(U,V,E) }[/math] has a matching of [math]\displaystyle{ U }[/math] if and only if
[math]\displaystyle{ \left|N(S)\right|\ge |S| }[/math] for all [math]\displaystyle{ S\subseteq U }[/math].

Consider the collection of sets [math]\displaystyle{ N(u), u\in U }[/math]. A matching of [math]\displaystyle{ U }[/math] is precisely an SDR for these sets, so this form is equivalent to Hall's theorem.
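Algorithmically, the graph form is the useful one: instead of checking Hall's condition on all subsets of [math]\displaystyle{ U }[/math], a matching of [math]\displaystyle{ U }[/math] can be found (or shown not to exist) by augmenting paths. A minimal Python sketch of the standard augmenting-path algorithm (often called Kuhn's algorithm; the adjacency-dict interface is my own choice):

```python
def matching_of_U(adj):
    """adj[u] = iterable of u's neighbors in V. Returns a dict u -> v
    giving a matching of U if one exists (i.e. Hall's condition holds),
    else None. Standard augmenting-path (Kuhn's) algorithm."""
    match_v = {}  # v -> u currently matched to v

    def try_augment(u, seen):
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or the vertex matched to v can be rematched
            if v not in match_v or try_augment(match_v[v], seen):
                match_v[v] = u
                return True
        return False

    for u in adj:
        if not try_augment(u, set()):
            return None  # Hall's condition fails for some subset containing u
    return {u: v for v, u in match_v.items()}
```

This runs in polynomial time, in contrast with the exponentially many subsets in Hall's condition.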

Doubly stochastic matrices

An [math]\displaystyle{ n\times n }[/math] nonnegative matrix [math]\displaystyle{ A=(a_{ij}) }[/math] is called a doubly stochastic matrix if [math]\displaystyle{ \sum_{j}a_{ij}=1 }[/math] for every row [math]\displaystyle{ i }[/math] and [math]\displaystyle{ \sum_{i}a_{ij}=1 }[/math] for every column [math]\displaystyle{ j }[/math].

Doubly stochastic matrices are significant in the study of Markov chains.

A convex combination is a special type of linear combination: [math]\displaystyle{ \lambda_1x_1+\lambda_2x_2+\cdots+\lambda_nx_n }[/math] is said to be a convex combination of [math]\displaystyle{ x_1,x_2,\ldots,x_n }[/math] if [math]\displaystyle{ \lambda_i\ge 0 }[/math] for all [math]\displaystyle{ i=1,2,\ldots,n }[/math] and [math]\displaystyle{ \sum_{i=1}^n\lambda_i=1 }[/math].

A permutation matrix is a 0-1 matrix such that every row and every column has exactly one 1-entry.

A fundamental result regarding doubly stochastic matrices is the Birkhoff-von Neumann theorem, which states the remarkable fact that every doubly stochastic matrix can be expressed as a convex combination of a finite number of permutation matrices.
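One direction of the theorem below is immediate: any convex combination of permutation matrices has all row and column sums equal to 1, hence is doubly stochastic. A quick Python check of this easy direction (function names my own):

```python
def is_doubly_stochastic(A, tol=1e-9):
    """Check that all row sums and column sums equal 1 (within tolerance)
    and all entries are nonnegative."""
    n = len(A)
    rows_ok = all(abs(sum(A[i]) - 1) <= tol for i in range(n))
    cols_ok = all(abs(sum(A[i][j] for i in range(n)) - 1) <= tol
                  for j in range(n))
    nonneg = all(A[i][j] >= -tol for i in range(n) for j in range(n))
    return rows_ok and cols_ok and nonneg

def convex_combination(coeffs, perms):
    """Build sum_i coeffs[i] * P_i, where perms[i][r] is the column of the
    1-entry in row r of the permutation matrix P_i."""
    n = len(perms[0])
    A = [[0.0] * n for _ in range(n)]
    for lam, perm in zip(coeffs, perms):
        for r in range(n):
            A[r][perm[r]] += lam
    return A
```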

Theorem (Birkhoff 1949; von Neumann 1953)
Every doubly stochastic matrix is a convex combination of permutation matrices.
Proof.

We prove a more general result:

Every [math]\displaystyle{ n\times n }[/math] nonnegative matrix [math]\displaystyle{ A=(a_{ij}) }[/math] with
[math]\displaystyle{ \forall i, \sum_{j=1}^na_{ij}=\gamma }[/math] and [math]\displaystyle{ \forall j, \sum_{i=1}^na_{ij}=\gamma }[/math], for some [math]\displaystyle{ \gamma\gt 0 }[/math],
can be expressed as a linear combination [math]\displaystyle{ A=\sum_{i=1}^s\lambda_iP_i }[/math] of permutation matrices [math]\displaystyle{ P_1,\ldots,P_s }[/math], where [math]\displaystyle{ \lambda_1,\ldots,\lambda_{s} }[/math] are nonnegative reals such that [math]\displaystyle{ \sum_{i=1}^s\lambda_i=\gamma }[/math].

The Birkhoff-von Neumann theorem is the special case of the above statement with [math]\displaystyle{ \gamma=1 }[/math].

We prove this by induction on the number of non-zero entries in [math]\displaystyle{ A }[/math], denoted by [math]\displaystyle{ m }[/math]. Since all the row and column sums equal some [math]\displaystyle{ \gamma\gt 0 }[/math], there are at least [math]\displaystyle{ n }[/math] such entries. If there are exactly [math]\displaystyle{ n }[/math] non-zero entries, then the only possibility is [math]\displaystyle{ A=\gamma P }[/math] for some permutation matrix [math]\displaystyle{ P }[/math], and the claim holds.

Now suppose [math]\displaystyle{ A }[/math] has [math]\displaystyle{ m\gt n }[/math] non-zero entries and the theorem holds for matrices with fewer non-zero entries. Define

[math]\displaystyle{ S_i=\{j\mid a_{ij}\gt 0\},\quad i=1,2,\ldots,n, }[/math]

and observe that [math]\displaystyle{ S_1, S_2,\ldots, S_n }[/math] satisfy Hall's condition. Otherwise, if there existed an [math]\displaystyle{ I\subseteq\{1,2,\ldots,n\} }[/math] such that [math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\lt |I| }[/math], then all the non-zero entries of the rows in [math]\displaystyle{ I }[/math] would occupy fewer than [math]\displaystyle{ |I| }[/math] columns; hence the sum of these entries

[math]\displaystyle{ \sum_{i\in I}\sum_{j\in S_i}a_{ij}\le\sum_{j\in\bigcup_{i\in I}S_i}\sum_{i=1}^na_{ij}\lt |I|\gamma }[/math],

which contradicts that

[math]\displaystyle{ \sum_{i\in I}\sum_{j\in S_i}a_{ij}=\sum_{i\in I}\sum_{j=1}^na_{ij}=|I|\gamma }[/math].

By Hall's theorem, there is an SDR

[math]\displaystyle{ j_1\in S_1,\ldots,j_n\in S_n }[/math].

Take the permutation matrix [math]\displaystyle{ P=(p_{ij}) }[/math] with entries [math]\displaystyle{ p_{ij}=1 }[/math] if and only if [math]\displaystyle{ j=j_i }[/math]. Let [math]\displaystyle{ \lambda=\min\{a_{ij}\mid j=j_i, i=1,2,\ldots,n\} }[/math], and consider the matrix [math]\displaystyle{ A'=A-\lambda P }[/math]. Then [math]\displaystyle{ A' }[/math] has fewer non-zero entries than [math]\displaystyle{ A }[/math], since the entry attaining the minimum becomes zero. Moreover, [math]\displaystyle{ A' }[/math] satisfies

[math]\displaystyle{ \forall i, \sum_{j=1}^na'_{ij}=\gamma-\lambda }[/math] and [math]\displaystyle{ \forall j, \sum_{i=1}^na'_{ij}=\gamma-\lambda }[/math].

Applying the induction hypothesis, [math]\displaystyle{ A'=\sum_{i=1}^s\lambda_iP_i }[/math] for permutation matrices [math]\displaystyle{ P_1,\ldots,P_s }[/math] and [math]\displaystyle{ \lambda_1,\ldots,\lambda_s\gt 0 }[/math] with [math]\displaystyle{ \sum_{i=1}^s\lambda_i=\gamma-\lambda }[/math]. Therefore,

[math]\displaystyle{ A=\lambda P+A'=\lambda P+\sum_{i=1}^s\lambda_iP_i }[/math],

where [math]\displaystyle{ \lambda+\sum_{i=1}^s\lambda_i=\lambda+(\gamma-\lambda)=\gamma }[/math].

[math]\displaystyle{ \square }[/math]
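The proof above is constructive and yields an algorithm: repeatedly find an SDR of the row supports (a perfect matching, which exists by Hall's theorem), subtract the corresponding term, and recurse on the remainder. A Python sketch using exact rational arithmetic (function and variable names are my own choices):

```python
from fractions import Fraction

def birkhoff(A):
    """Decompose a square nonnegative matrix with equal row and column sums
    into a sum of lam * P over permutation matrices, following the inductive
    proof: find an SDR of the row supports, subtract, repeat.
    Returns a list of (lam, perm) pairs, where perm[i] is the column of the
    1-entry in row i of the permutation matrix."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    result = []
    while any(A[i][j] > 0 for i in range(n) for j in range(n)):
        # Perfect matching of rows to columns within the support of A,
        # found by augmenting paths; it exists because the supports
        # satisfy Hall's condition, as shown in the proof.
        match = [-1] * n  # column j -> row matched to j

        def augment(i, seen):
            for j in range(n):
                if A[i][j] > 0 and j not in seen:
                    seen.add(j)
                    if match[j] == -1 or augment(match[j], seen):
                        match[j] = i
                        return True
            return False

        for i in range(n):
            augment(i, set())
        perm = [0] * n
        for j in range(n):
            perm[match[j]] = j
        lam = min(A[i][perm[i]] for i in range(n))
        for i in range(n):
            A[i][perm[i]] -= lam  # at least one entry becomes zero
        result.append((lam, perm))
    return result
```

Each round zeroes at least one entry, so the loop mirrors the induction and terminates after at most [math]\displaystyle{ m-n+1 }[/math] rounds.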

Min-max theorems

  • König-Egerváry theorem (König 1931; Egerváry 1931): in a bipartite graph, the maximum number of edges in a matching equals the minimum number of vertices in a vertex cover.
  • Menger's theorem (Menger 1927): the minimum number of vertices separating two given vertices in a graph equals the maximum number of vertex-disjoint paths between the two vertices.
  • Dilworth's theorem (Dilworth 1950): the minimum number of chains which cover a partially ordered set equals the maximum number of elements in an antichain.
König-Egerváry Theorem (graph theory form)
In any bipartite graph, the size of a maximum matching equals the size of a minimum vertex cover.
König-Egerváry Theorem (matrix form)
Let [math]\displaystyle{ A }[/math] be an [math]\displaystyle{ m\times n }[/math] 0-1 matrix. The maximum number of independent 1's is equal to the minimum number of rows and columns required to cover all the 1's in [math]\displaystyle{ A }[/math].
Proof.

Let [math]\displaystyle{ r }[/math] denote the maximum number of independent 1's in [math]\displaystyle{ A }[/math] and [math]\displaystyle{ s }[/math] the minimum number of rows and columns needed to cover all 1's in [math]\displaystyle{ A }[/math]. Clearly, [math]\displaystyle{ r\le s }[/math], since covering any set of [math]\displaystyle{ r }[/math] independent 1's requires [math]\displaystyle{ r }[/math] rows and columns in total.

We now prove [math]\displaystyle{ r\ge s }[/math]. Assume that some [math]\displaystyle{ a }[/math] rows and [math]\displaystyle{ b }[/math] columns cover all the 1's in [math]\displaystyle{ A }[/math], and [math]\displaystyle{ s=a+b }[/math], i.e. the covering is minimum. Because permuting the rows or the columns changes neither [math]\displaystyle{ r }[/math] nor [math]\displaystyle{ s }[/math] (reordering the vertices on either side of a bipartite graph changes nothing about the sizes of matchings and vertex covers), we may assume that the first [math]\displaystyle{ a }[/math] rows and the first [math]\displaystyle{ b }[/math] columns cover all the 1's. Write [math]\displaystyle{ A }[/math] in the form

[math]\displaystyle{ A=\begin{bmatrix} B_{a\times b} &C_{a\times (n-b)}\\ D_{(m-a)\times b} &E_{(m-a)\times (n-b)} \end{bmatrix} }[/math],

where the submatrix [math]\displaystyle{ E }[/math] has only zero entries. We will show that there are [math]\displaystyle{ a }[/math] independent 1's in [math]\displaystyle{ C }[/math] and [math]\displaystyle{ b }[/math] independent 1's in [math]\displaystyle{ D }[/math], thus together [math]\displaystyle{ A }[/math] has [math]\displaystyle{ a+b=s }[/math] independent 1's, which will imply that [math]\displaystyle{ r\ge s }[/math], as desired.

Define

[math]\displaystyle{ S_i=\{j\mid c_{ij}=1\},\quad i=1,2,\ldots,a. }[/math]

It is obvious that [math]\displaystyle{ S_i\subseteq\{1,2,\ldots,n-b\} }[/math]. We claim that the sequence [math]\displaystyle{ S_1,S_2,\ldots, S_a }[/math] has a system of distinct representatives, i.e., we can choose a 1 from each row, no two in the same column. Otherwise, Hall's theorem tells us that there exists some [math]\displaystyle{ I\subseteq\{1,2,\ldots,a\} }[/math] such that [math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\lt |I| }[/math], that is, the 1's in the rows indexed by [math]\displaystyle{ I }[/math] can be covered by fewer than [math]\displaystyle{ |I| }[/math] columns. Thus, the 1's in [math]\displaystyle{ C }[/math] can be covered by [math]\displaystyle{ a-|I| }[/math] rows and fewer than [math]\displaystyle{ |I| }[/math] columns, altogether fewer than [math]\displaystyle{ a }[/math] rows and columns. Therefore, the 1's in [math]\displaystyle{ A }[/math] could be covered by fewer than [math]\displaystyle{ a+b }[/math] rows and columns, which contradicts the assumption that the minimum covering of all 1's in [math]\displaystyle{ A }[/math] has size [math]\displaystyle{ a+b }[/math]. This shows that [math]\displaystyle{ C }[/math] has [math]\displaystyle{ a }[/math] independent 1's.

By the same argument, we can show that [math]\displaystyle{ D }[/math] has [math]\displaystyle{ b }[/math] independent 1's. Since [math]\displaystyle{ C }[/math] and [math]\displaystyle{ D }[/math] share no rows or columns, the number of independent 1's in [math]\displaystyle{ A }[/math] is [math]\displaystyle{ r\ge a+b=s }[/math].

[math]\displaystyle{ \square }[/math]
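On small matrices, both quantities in the matrix form can be computed by brute force, which makes the theorem easy to sanity-check (an exponential-time Python sketch; function names are my own):

```python
from itertools import combinations, product

def max_independent_ones(A):
    """Maximum number of 1's with no two in the same row or column."""
    m, n = len(A), len(A[0])
    ones = [(i, j) for i in range(m) for j in range(n) if A[i][j] == 1]
    for k in range(len(ones), 0, -1):
        for S in combinations(ones, k):
            rows = {i for i, _ in S}
            cols = {j for _, j in S}
            if len(rows) == k and len(cols) == k:  # all distinct
                return k
    return 0

def min_line_cover(A):
    """Minimum number of rows plus columns covering all 1's."""
    m, n = len(A), len(A[0])
    ones = [(i, j) for i in range(m) for j in range(n) if A[i][j] == 1]
    best = m + n
    for rmask in product([0, 1], repeat=m):
        for cmask in product([0, 1], repeat=n):
            if all(rmask[i] or cmask[j] for i, j in ones):
                best = min(best, sum(rmask) + sum(cmask))
    return best
```

By the theorem, `max_independent_ones(A) == min_line_cover(A)` for every 0-1 matrix `A`.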

Chains and antichains

Dilworth's theorem

Dilworth's Theorem
Suppose that the largest antichain in the poset [math]\displaystyle{ P }[/math] has size [math]\displaystyle{ r }[/math]. Then [math]\displaystyle{ P }[/math] can be partitioned into [math]\displaystyle{ r }[/math] chains.
Proof.

Suppose [math]\displaystyle{ P }[/math] has an antichain [math]\displaystyle{ A }[/math], and [math]\displaystyle{ P }[/math] can be partitioned into disjoint chains [math]\displaystyle{ C_1,C_2,\ldots,C_s }[/math]. Then [math]\displaystyle{ |A|\le s }[/math], since every chain can pass through an antichain at most once, that is, [math]\displaystyle{ |C_i\cap A|\le 1 }[/math] for all [math]\displaystyle{ i=1,2,\ldots,s }[/math].

Therefore, we only need to prove that there exists an antichain [math]\displaystyle{ A\subseteq P }[/math] and a partition of [math]\displaystyle{ P }[/math] into at most [math]\displaystyle{ |A| }[/math] chains.

Define a bipartite graph [math]\displaystyle{ G(U,V,E) }[/math] where [math]\displaystyle{ U=V=P }[/math], and for any [math]\displaystyle{ u\in U }[/math] and [math]\displaystyle{ v\in V }[/math], [math]\displaystyle{ uv\in E }[/math] if and only if [math]\displaystyle{ u\lt v }[/math] in the poset [math]\displaystyle{ P }[/math]. By the König-Egerváry theorem, there is a matching [math]\displaystyle{ M\subseteq E }[/math] and a vertex set [math]\displaystyle{ C }[/math] such that every edge in [math]\displaystyle{ E }[/math] is incident to at least one vertex in [math]\displaystyle{ C }[/math], and [math]\displaystyle{ |M|=|C| }[/math].

Let [math]\displaystyle{ n=|P| }[/math] and denote [math]\displaystyle{ |M|=|C|=m }[/math]. Let [math]\displaystyle{ A }[/math] be the set of uncovered elements in the poset [math]\displaystyle{ P }[/math], i.e., the elements of [math]\displaystyle{ P }[/math] that do not correspond to any vertex in [math]\displaystyle{ C }[/math]. Clearly, [math]\displaystyle{ |A|\ge n-m }[/math]. We claim that [math]\displaystyle{ A\subseteq P }[/math] is an antichain. By contradiction, assume there exist [math]\displaystyle{ x,y\in A }[/math] such that [math]\displaystyle{ x\lt y }[/math]. Then, by the definition of [math]\displaystyle{ G }[/math], there are vertices [math]\displaystyle{ u_x\in U,v_x\in V }[/math] corresponding to [math]\displaystyle{ x }[/math], and [math]\displaystyle{ u_y\in U,v_y\in V }[/math] corresponding to [math]\displaystyle{ y }[/math], such that [math]\displaystyle{ u_xv_y\in E }[/math]. But since [math]\displaystyle{ A }[/math] includes only those elements whose corresponding vertices are not in [math]\displaystyle{ C }[/math], none of [math]\displaystyle{ u_x,v_x,u_y,v_y }[/math] can be in [math]\displaystyle{ C }[/math], which contradicts the fact that [math]\displaystyle{ C }[/math] is a vertex cover of [math]\displaystyle{ G }[/math], i.e. that every edge of [math]\displaystyle{ G }[/math] is incident to at least one vertex in [math]\displaystyle{ C }[/math].

Let [math]\displaystyle{ B }[/math] be the family of chains formed by putting [math]\displaystyle{ u }[/math] and [math]\displaystyle{ v }[/math] in the same chain whenever [math]\displaystyle{ uv\in M }[/math]. A moment's thought shows that the number of chains in [math]\displaystyle{ B }[/math] equals the number of unmatched vertices in [math]\displaystyle{ U }[/math] (or [math]\displaystyle{ V }[/math]). Thus, [math]\displaystyle{ |B|=n-m }[/math].

Altogether, we have constructed an antichain of size [math]\displaystyle{ |A|\ge n-m }[/math] and a partition of the poset [math]\displaystyle{ P }[/math] into [math]\displaystyle{ |B|=n-m }[/math] disjoint chains. The theorem is proved.

[math]\displaystyle{ \square }[/math]
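The proof is constructive: a maximum matching in the comparability graph yields a minimum chain partition, with the chains read off by following matched edges. A Python sketch along those lines (augmenting-path matching as before; the `less` callback and all names are my own choices):

```python
def min_chain_partition(elements, less):
    """Partition a finite poset into a minimum number of chains, following
    the proof: maximum matching in the bipartite 'u < v' graph; matched
    edges link consecutive elements of a chain. `less(u, v)` is the strict
    partial order. Returns a list of chains (lists of elements)."""
    n = len(elements)
    match = {}  # index of v -> index of u matched to v

    def augment(u, seen):
        for v in range(n):
            if less(elements[u], elements[v]) and v not in seen:
                seen.add(v)
                if v not in match or augment(match[v], seen):
                    match[v] = u
                    return True
        return False

    for u in range(n):
        augment(u, set())
    succ = {u: v for v, u in match.items()}  # chain successor of u
    starts = set(range(n)) - set(match.keys())  # elements with no predecessor
    chains = []
    for s in starts:
        chain, cur = [], s
        while cur is not None:
            chain.append(elements[cur])
            cur = succ.get(cur)
        chains.append(chain)
    return chains
```

The number of chains is [math]\displaystyle{ n-|M| }[/math], exactly as in the proof.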

Application: Erdős-Szekeres Theorem

Erdős-Szekeres Theorem
A sequence of more than [math]\displaystyle{ mn }[/math] different real numbers must contain either an increasing subsequence of length [math]\displaystyle{ m+1 }[/math], or a decreasing subsequence of length [math]\displaystyle{ n+1 }[/math].
Proof by Dilworth's theorem
(Original proof of Erdős-Szekeres)

Let [math]\displaystyle{ (a_1,a_2,\ldots,a_N) }[/math] be the sequence of [math]\displaystyle{ N\gt mn }[/math] distinct real numbers. Define the poset as

[math]\displaystyle{ P=\{(i,a_i)\mid i=1,2,\ldots,N\} }[/math]

and [math]\displaystyle{ (i,a_i)\le (j,a_j) }[/math] if and only if [math]\displaystyle{ i\le j }[/math] and [math]\displaystyle{ a_i\le a_j }[/math].

A chain [math]\displaystyle{ (i_1,a_{i_1})\lt (i_2,a_{i_2})\lt \cdots\lt (i_k,a_{i_k}) }[/math] must have [math]\displaystyle{ i_1\lt i_2\lt \cdots\lt i_k }[/math] and [math]\displaystyle{ a_{i_1}\lt a_{i_2}\lt \cdots\lt a_{i_k} }[/math]. Thus, each chain corresponds to an increasing subsequence.

Let [math]\displaystyle{ (j_1,a_{j_1}),(j_2,a_{j_2}),\cdots,(j_k,a_{j_k}) }[/math] be an antichain. Without loss of generality, we can assume that [math]\displaystyle{ j_1\lt j_2\lt \cdots\lt j_k }[/math]. The only way these elements can be pairwise incomparable is that [math]\displaystyle{ a_{j_1}\gt a_{j_2}\gt \cdots\gt a_{j_k} }[/math]: otherwise, if [math]\displaystyle{ a_{j_s}\lt a_{j_t} }[/math] for some [math]\displaystyle{ s\lt t }[/math], then [math]\displaystyle{ (j_s,a_{j_s})\lt (j_t,a_{j_t}) }[/math], contradicting the fact that it is an antichain. Thus, each antichain corresponds to a decreasing subsequence.

If [math]\displaystyle{ P }[/math] has an antichain of size [math]\displaystyle{ n+1 }[/math], then [math]\displaystyle{ (a_1,a_2,\ldots,a_N) }[/math] has a decreasing subsequence of size [math]\displaystyle{ n+1 }[/math], and we are done.

Alternatively, if the largest antichain in [math]\displaystyle{ P }[/math] has size at most [math]\displaystyle{ n }[/math], then by Dilworth's theorem, [math]\displaystyle{ P }[/math] can be partitioned into at most [math]\displaystyle{ n }[/math] disjoint chains. Since [math]\displaystyle{ N\gt mn }[/math], by the pigeonhole principle one of them must have length at least [math]\displaystyle{ m+1 }[/math], which means [math]\displaystyle{ (a_1,a_2,\ldots,a_N) }[/math] has an increasing subsequence of length [math]\displaystyle{ m+1 }[/math].

[math]\displaystyle{ \square }[/math]
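The statement is easy to test numerically: compute the longest increasing and the longest decreasing subsequence by dynamic programming and compare with the bounds (a quadratic-time Python sketch; names are my own):

```python
def longest_monotone(seq, precedes):
    """Length of the longest subsequence in which every earlier element
    satisfies precedes(earlier, later) with every later one chosen after it.
    Classic O(N^2) dynamic program: best[i] = longest such subsequence
    ending at position i."""
    best = [1] * len(seq)
    for i in range(len(seq)):
        for j in range(i):
            if precedes(seq[j], seq[i]):
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)
```

With [math]\displaystyle{ m=n=2 }[/math], any 5 distinct reals contain an increasing or a decreasing subsequence of length 3, while the sequence 2, 1, 4, 3 shows the bound [math]\displaystyle{ mn }[/math] is tight.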

Application: Hall's Theorem

Hall's Theorem
The sets [math]\displaystyle{ S_1,S_2,\ldots,S_m }[/math] have a system of distinct representatives (SDR) if and only if
[math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\ge |I| }[/math] for all [math]\displaystyle{ I\subseteq\{1,2,\ldots,m\} }[/math].
Proof by Dilworth's theorem

As we discussed before, the necessity of Hall's condition for the existence of SDR is easy. We prove its sufficiency by Dilworth's theorem.

Denote [math]\displaystyle{ X=\bigcup_{i=1}^mS_i }[/math]. Construct a poset [math]\displaystyle{ P }[/math] by letting [math]\displaystyle{ P=X\cup\{S_1,S_2,\ldots,S_m\} }[/math] and [math]\displaystyle{ x\lt S_{i} }[/math] for all [math]\displaystyle{ x\in S_{i} }[/math]. There are no other comparabilities.

It is obvious that [math]\displaystyle{ X }[/math] is an antichain. We claim it is also the largest one. To prove this, let [math]\displaystyle{ A }[/math] be an arbitrary antichain, and let [math]\displaystyle{ I=\{i\mid S_i\in A\} }[/math]. Then [math]\displaystyle{ A }[/math] contains no elements of [math]\displaystyle{ \bigcup_{i\in I}S_i }[/math]: if [math]\displaystyle{ x\in A }[/math] and [math]\displaystyle{ x\in S_i }[/math] for some [math]\displaystyle{ S_i\in A }[/math], then [math]\displaystyle{ x\lt S_i }[/math], and [math]\displaystyle{ A }[/math] could not be an antichain. Thus,

[math]\displaystyle{ |A|\le |I|+|X|-\left|\bigcup_{i\in I}S_i\right| }[/math]

and by Hall's condition [math]\displaystyle{ \left|\bigcup_{i\in I}S_i\right|\ge|I| }[/math], thus [math]\displaystyle{ |A|\le |X| }[/math], as claimed.

Now, Dilworth's theorem implies that [math]\displaystyle{ P }[/math] can be partitioned into [math]\displaystyle{ |X| }[/math] chains. Since [math]\displaystyle{ X }[/math] is an antichain and each chain can pass through an antichain on at most one element, each of the [math]\displaystyle{ |X| }[/math] chains contains precisely one [math]\displaystyle{ x\in X }[/math]. And since [math]\displaystyle{ \{S_1,\ldots,S_m\} }[/math] is also an antichain, each of these [math]\displaystyle{ |X| }[/math] chains contains at most one [math]\displaystyle{ S_i }[/math]. Altogether, the [math]\displaystyle{ |X| }[/math] chains are of the form (after renumbering):

[math]\displaystyle{ \{x_1,S_1\},\{x_2,S_2\},\ldots,\{x_m,S_m\},\{x_{m+1}\}\ldots,\{x_{|X|}\} }[/math].

Since the only comparabilities in our poset are [math]\displaystyle{ x\lt S_i }[/math] for [math]\displaystyle{ x\in S_i }[/math], and the above chains are disjoint, [math]\displaystyle{ x_1\in S_1, x_2\in S_2,\ldots,x_m\in S_m }[/math] is an SDR.

[math]\displaystyle{ \square }[/math]