Difference between revisions of "组合数学 (Fall 2019)/Pólya's theory of counting"

From EtoneWiki
Jump to: navigation, search
(Created page with "== Principle of Inclusion-Exclusion == Let <math>A</math> and <math>B</math> be two finite sets. The cardinality of their union is :<math>|A\cup B|=|A|+|B|-{\color{Blue}|A\cap...")
(No difference)

Revision as of 06:15, 8 October 2019

Principle of Inclusion-Exclusion

Let [math]A[/math] and [math]B[/math] be two finite sets. The cardinality of their union is

[math]|A\cup B|=|A|+|B|-{\color{Blue}|A\cap B|}[/math].

For three sets [math]A[/math], [math]B[/math], and [math]C[/math], the cardinality of the union of these three sets is computed as

[math]|A\cup B\cup C|=|A|+|B|+|C|-{\color{Blue}|A\cap B|}-{\color{Blue}|A\cap C|}-{\color{Blue}|B\cap C|}+{\color{Red}|A\cap B\cap C|}[/math].

This is illustrated by the following figure.

Inclusion-exclusion.png

Generally, the Principle of Inclusion-Exclusion states the rule for computing the union of [math]n[/math] finite sets [math]A_1,A_2,\ldots,A_n[/math], such that

[math] \begin{align} \left|\bigcup_{i=1}^nA_i\right| &= \sum_{I\subseteq\{1,\ldots,n\}}(-1)^{|I|-1}\left|\bigcap_{i\in I}A_i\right|. \end{align} [/math]


In combinatorial enumeration, the Principle of Inclusion-Exclusion is usually applied in its complement form.

Let [math]A_1,A_2,\ldots,A_n\subseteq U[/math] be subsets of some finite set [math]U[/math]. Here [math]U[/math] is some universe of combinatorial objects, whose cardinality is easy to calculate (e.g. all strings, tuples, permutations), and each [math]A_i[/math] contains the objects with some specific property (e.g. a "pattern") which we want to avoid. The problem is to count the number of objects without any of the [math]n[/math] properties. We write [math]\bar{A_i}=U-A_i[/math]. The number of objects without any of the properties [math]A_1,A_2,\ldots,A_n[/math] is

[math] \begin{align} \left|\bar{A_1}\cap\bar{A_2}\cap\cdots\cap\bar{A_n}\right|=\left|U-\bigcup_{i=1}^nA_i\right| &= |U|+\sum_{I\subseteq\{1,\ldots,n\}}(-1)^{|I|}\left|\bigcap_{i\in I}A_i\right|. \end{align} [/math]

For an [math]I\subseteq\{1,2,\ldots,n\}[/math], we denote

[math]A_I=\bigcap_{i\in I}A_i[/math]

with the convention that [math]A_\emptyset=U[/math]. The above equation is stated as:

Principle of Inclusion-Exclusion
Let [math]A_1,A_2,\ldots,A_n[/math] be a family of subsets of [math]U[/math]. Then the number of elements of [math]U[/math] which lie in none of the subsets [math]A_i[/math] is
[math]\sum_{I\subseteq\{1,\ldots, n\}}(-1)^{|I|}|A_I|[/math].

Let [math]S_k=\sum_{|I|=k}|A_I|\,[/math]. Conventionally, [math]S_0=|A_\emptyset|=|U|[/math]. The principle of inclusion-exclusion can be expressed as

[math] \left|\bar{A_1}\cap\bar{A_2}\cap\cdots\cap\bar{A_n}\right|= S_0-S_1+S_2+\cdots+(-1)^nS_n. [/math]

Surjections

In the twelvefold way, we discuss the counting problems incurred by the mappings [math]f:N\rightarrow M[/math]. The basic case is that elements from both [math]N[/math] and [math]M[/math] are distinguishable. In this case, it is easy to count the number of arbitrary mappings (which is [math]m^n[/math]) and the number of injective (one-to-one) mappings (which is [math](m)_n[/math]), but the number of surjective is difficult. Here we apply the principle of inclusion-exclusion to count the number of surjective (onto) mappings.

Theorem
The number of surjective mappings from an [math]n[/math]-set to an [math]m[/math]-set is given by
[math]\sum_{k=1}^m(-1)^{m-k}{m\choose k}k^n[/math].
Proof.

Let [math]U=\{f:[n]\rightarrow[m]\}[/math] be the set of mappings from [math][n][/math] to [math][m][/math]. Then [math]|U|=m^n[/math].

For [math]i\in[m][/math], let [math]A_i[/math] be the set of mappings [math]f:[n]\rightarrow[m][/math] that none of [math]j\in[n][/math] is mapped to [math]i[/math], i.e. [math]A_i=\{f:[n]\rightarrow[m]\setminus\{i\}\}[/math], thus [math]|A_i|=(m-1)^n[/math].

More generally, for [math]I\subseteq [m][/math], [math]A_I=\bigcap_{i\in I}A_i[/math] contains the mappings [math]f:[n]\rightarrow[m]\setminus I[/math]. And [math]|A_I|=(m-|I|)^n\,[/math].

A mapping [math]f:[n]\rightarrow[m][/math] is surjective if [math]f[/math] lies in none of [math]A_i[/math]. By the principle of inclusion-exclusion, the number of surjective [math]f:[n]\rightarrow[m][/math] is

[math]\sum_{I\subseteq[m]}(-1)^{|I|}\left|A_I\right|=\sum_{I\subseteq[m]}(-1)^{|I|}(m-|I|)^n=\sum_{j=0}^m(-1)^j{m\choose j}(m-j)^n[/math].

Let [math]k=m-j[/math]. The theorem is proved.

[math]\square[/math]

Recall that, in the twelvefold way, we establish a relation between surjections and partitions.

  • Surjection to ordered partition:
For a surjective [math]f:[n]\rightarrow[m][/math], [math](f^{-1}(0),f^{-1}(1),\ldots,f^{-1}(m-1))[/math] is an ordered partition of [math][n][/math].
  • Ordered partition to surjection:
For an ordered [math]m[/math]-partition [math](B_0,B_1,\ldots, B_{m-1})[/math] of [math][n][/math], we can define a function [math]f:[n]\rightarrow[m][/math] by letting [math]f(i)=j[/math] if and only if [math]i\in B_j[/math]. [math]f[/math] is surjective since as a partition, none of [math]B_i[/math] is empty.

Therefore, we have a one-to-one correspondence between surjective mappings from an [math]n[/math]-set to an [math]m[/math]-set and the ordered [math]m[/math]-partitions of an [math]n[/math]-set.

The Stirling number of the second kind [math]\left\{{n\atop m}\right\}[/math] is the number of [math]m[/math]-partitions of an [math]n[/math]-set. There are [math]m![/math] ways to order an [math]m[/math]-partition, thus the number of surjective mappings [math]f:[n]\rightarrow[m][/math] is [math]m! \left\{{n\atop m}\right\}[/math]. Combining with what we have proved for surjections, we give the following result for the Stirling number of the second kind.

Proposition
[math]\left\{{n\atop m}\right\}=\frac{1}{m!}\sum_{k=1}^m(-1)^{m-k}{m\choose k}k^n[/math].

Derangements

We now count the number of bijections from a set to itself with no fixed points. This is the derangement problem.

For a permutation [math]\pi[/math] of [math]\{1,2,\ldots,n\}[/math], a fixed point is such an [math]i\in\{1,2,\ldots,n\}[/math] that [math]\pi(i)=i[/math]. A derangement of [math]\{1,2,\ldots,n\}[/math] is a permutation of [math]\{1,2,\ldots,n\}[/math] that has no fixed points.

Theorem
The number of derangements of [math]\{1,2,\ldots,n\}[/math] given by
[math]n!\sum_{k=0}^n\frac{(-1)^k}{k!}\approx \frac{n!}{\mathrm{e}}[/math].
Proof.

Let [math]U[/math] be the set of all permutations of [math]\{1,2,\ldots,n\}[/math]. So [math]|U|=n![/math].

Let [math]A_i[/math] be the set of permutations with fixed point [math]i[/math]; so [math]|A_i|=(n-1)![/math]. More generally, for any [math]I\subseteq \{1,2,\ldots,n\}[/math], [math]A_I=\bigcap_{i\in I}A_i[/math], and [math]|A_I|=(n-|I|)![/math], since permutations in [math]A_I[/math] fix every point in [math]I[/math] and permute the remaining points arbitrarily. A permutation is a derangement if and only if it lies in none of the sets [math]A_i[/math]. So the number of derangements is

[math]\sum_{I\subseteq\{1,2,\ldots,n\}}(-1)^{|I|}(n-|I|)!=\sum_{k=0}^n(-1)^k{n\choose k}(n-k)!=n!\sum_{k=0}^n\frac{(-1)^k}{k!}.[/math]

By Taylor's series,

[math]\frac{1}{\mathrm{e}}=\sum_{k=0}^\infty\frac{(-1)^k}{k!}=\sum_{k=0}^n\frac{(-1)^k}{k!}\pm o\left(\frac{1}{n!}\right)[/math].

It is not hard to see that [math]n!\sum_{k=0}^n\frac{(-1)^k}{k!}[/math] is the closest integer to [math]\frac{n!}{\mathrm{e}}[/math].

[math]\square[/math]

Therefore, there are about [math]\frac{1}{\mathrm{e}}[/math] fraction of all permutations with no fixed points.

Permutations with restricted positions

We introduce a general theory of counting permutations with restricted positions. In the derangement problem, we count the number of permutations that [math]\pi(i)\neq i[/math]. We now generalize to the problem of counting permutations which avoid a set of arbitrarily specified positions.

It is traditionally described using terminology from the game of chess. Let [math]B\subseteq \{1,\ldots,n\}\times \{1,\ldots,n\}[/math], called a board. As illustrated below, we can think of [math]B[/math] as a chess board, with the positions in [math]B[/math] marked by "[math]\times[/math]".

Solid white.svg a b c d e f g h Solid white.svg
8 a8 __ b8 cross c8 cross d8 __ e8 cross f8 __ g8 __ h8 cross 8
7 a7 cross b7 __ c7 __ d7 cross e7 __ f7 __ g7 cross h7 __ 7
6 a6 cross b6 __ c6 cross d6 cross e6 __ f6 cross g6 cross h6 __ 6
5 a5 __ b5 cross c5 __ d5 __ e5 cross f5 __ g5 cross h5 __ 5
4 a4 cross b4 __ c4 __ d4 __ e4 cross f4 cross g4 cross h4 __ 4
3 a3 __ b3 cross c3 __ d3 cross e3 __ f3 __ g3 __ h3 cross 3
2 a2 __ b2 __ c2 cross d2 __ e2 cross f2 __ g2 __ h2 cross 2
1 a1 cross b1 __ c1 __ d1 cross e1 __ f1 cross g1 __ h1 __ 1
Solid white.svg a b c d e f g h Solid white.svg

For a permutation [math]\pi[/math] of [math]\{1,\ldots,n\}[/math], define the graph [math]G_\pi(V,E)[/math] as

[math] \begin{align} G_\pi &= \{(i,\pi(i))\mid i\in \{1,2,\ldots,n\}\}. \end{align} [/math]

This can also be viewed as a set of marked positions on a chess board. Each row and each column has only one marked position, because [math]\pi[/math] is a permutation. Thus, we can identify each [math]G_\pi[/math] as a placement of [math]n[/math] rooks (“城堡”,规则同中国象棋里的“车”) without attacking each other.

For example, the following is the [math]G_\pi[/math] of such [math]\pi[/math] that [math]\pi(i)=i[/math].

Solid white.svg a b c d e f g h Solid white.svg
8 a8 white rook b8 __ c8 __ d8 __ e8 __ f8 __ g8 __ h8 __ 8
7 a7 __ b7 white rook c7 __ d7 __ e7 __ f7 __ g7 __ h7 __ 7
6 a6 __ b6 __ c6 white rook d6 __ e6 __ f6 __ g6 __ h6 __ 6
5 a5 __ b5 __ c5 __ d5 white rook e5 __ f5 __ g5 __ h5 __ 5
4 a4 __ b4 __ c4 __ d4 __ e4 white rook f4 __ g4 __ h4 __ 4
3 a3 __ b3 __ c3 __ d3 __ e3 __ f3 white rook g3 __ h3 __ 3
2 a2 __ b2 __ c2 __ d2 __ e2 __ f2 __ g2 white rook h2 __ 2
1 a1 __ b1 __ c1 __ d1 __ e1 __ f1 __ g1 __ h1 white rook 1
Solid white.svg a b c d e f g h Solid white.svg

Now define

[math]\begin{align} N_0 &= \left|\left\{\pi\mid B\cap G_\pi=\emptyset\right\}\right|\\ r_k &= \mbox{number of }k\mbox{-subsets of }B\mbox{ such that no two elements have a common coordinate}\\ &=\left|\left\{S\in{B\choose k} \,\bigg|\, \forall (i_1,j_1),(i_2,j_2)\in S, i_1\neq i_2, j_1\neq j_2 \right\}\right| \end{align} [/math]

Interpreted in chess game,

  • [math]B[/math]: a set of marked positions in an [math][n]\times [n][/math] chess board.
  • [math]N_0[/math]: the number of ways of placing [math]n[/math] non-attacking rooks on the chess board such that none of these rooks lie in [math]B[/math].
  • [math]r_k[/math]: number of ways of placing [math]k[/math] non-attacking rooks on [math]B[/math].

Our goal is to count [math]N_0[/math] in terms of [math]r_k[/math]. This gives the number of permutations avoid all positions in a [math]B[/math].

Theorem
[math]N_0=\sum_{k=0}^n(-1)^kr_k(n-k)![/math].
Proof.

For each [math]i\in[n][/math], let [math]A_i=\{\pi\mid (i,\pi(i))\in B\}[/math] be the set of permutations [math]\pi[/math] whose [math]i[/math]-th position is in [math]B[/math].

[math]N_0[/math] is the number of permutations avoid all positions in [math]B[/math]. Thus, our goal is to count the number of permutations [math]\pi[/math] in none of [math]A_i[/math] for [math]i\in [n][/math].

For each [math]I\subseteq [n][/math], let [math]A_I=\bigcap_{i\in I}A_i[/math], which is the set of permutations [math]\pi[/math] such that [math](i,\pi(i))\in B[/math] for all [math]i\in I[/math]. Due to the principle of inclusion-exclusion,

[math]N_0=\sum_{I\subseteq [n]} (-1)^{|I|}|A_I|=\sum_{k=0}^n(-1)^k\sum_{I\in{[n]\choose k}}|A_I|[/math].

The next observation is that

[math]\sum_{I\in{[n]\choose k}}|A_I|=r_k(n-k)![/math],

because we can count both sides by first placing [math]k[/math] non-attacking rooks on [math]B[/math] and placing [math]n-k[/math] additional non-attacking rooks on [math][n]\times [n][/math] in [math](n-k)![/math] ways.

Therefore,

[math]N_0=\sum_{k=0}^n(-1)^kr_k(n-k)![/math].
[math]\square[/math]

Derangement problem

We use the above general method to solve the derange problem again.

Take [math]B=\{(1,1),(2,2),\ldots,(n,n)\}[/math] as the chess board. A derangement [math]\pi[/math] is a placement of [math]n[/math] non-attacking rooks such that none of them is in [math]B[/math].

Solid white.svg a b c d e f g h Solid white.svg
8 a8 cross b8 __ c8 __ d8 __ e8 __ f8 __ g8 __ h8 __ 8
7 a7 __ b7 cross c7 __ d7 __ e7 __ f7 __ g7 __ h7 __ 7
6 a6 __ b6 __ c6 cross d6 __ e6 __ f6 __ g6 __ h6 __ 6
5 a5 __ b5 __ c5 __ d5 cross e5 __ f5 __ g5 __ h5 __ 5
4 a4 __ b4 __ c4 __ d4 __ e4 cross f4 __ g4 __ h4 __ 4
3 a3 __ b3 __ c3 __ d3 __ e3 __ f3 cross g3 __ h3 __ 3
2 a2 __ b2 __ c2 __ d2 __ e2 __ f2 __ g2 cross h2 __ 2
1 a1 __ b1 __ c1 __ d1 __ e1 __ f1 __ g1 __ h1 cross 1
Solid white.svg a b c d e f g h Solid white.svg

Clearly, the number of ways of placing [math]k[/math] non-attacking rooks on [math]B[/math] is [math]r_k={n\choose k}[/math]. We want to count [math]N_0[/math], which gives the number of ways of placing [math]n[/math] non-attacking rooks such that none of these rooks lie in [math]B[/math].

By the above theorem

[math] N_0=\sum_{k=0}^n(-1)^kr_k(n-k)!=\sum_{k=0}^n(-1)^k{n\choose k}(n-k)!=\sum_{k=0}^n(-1)^k\frac{n!}{k!}=n!\sum_{k=0}^n(-1)^k\frac{1}{k!}\approx\frac{n!}{e}. [/math]

Problème des ménages

Suppose that in a banquet, we want to seat [math]n[/math] couples at a circular table, satisfying the following constraints:

  • Men and women are in alternate places.
  • No one sits next to his/her spouse.

In how many ways can this be done?

(For convenience, we assume that every seat at the table marked differently so that rotating the seats clockwise or anti-clockwise will end up with a different solution.)

First, let the [math]n[/math] ladies find their seats. They may either sit at the odd numbered seats or even numbered seats, in either case, there are [math]n![/math] different orders. Thus, there are [math]2(n!)[/math] ways to seat the [math]n[/math] ladies.

After sitting the wives, we label the remaining [math]n[/math] places clockwise as [math]0,1,\ldots, n-1[/math]. And a seating of the [math]n[/math] husbands is given by a permutation [math]\pi[/math] of [math][n][/math] defined as follows. Let [math]\pi(i)[/math] be the seat of the husband of he lady sitting at the [math]i[/math]-th place.

It is easy to see that [math]\pi[/math] satisfies that [math]\pi(i)\neq i[/math] and [math]\pi(i)\not\equiv i+1\pmod n[/math], and every permutation [math]\pi[/math] with these properties gives a feasible seating of the [math]n[/math] husbands. Thus, we only need to count the number of permutations [math]\pi[/math] such that [math]\pi(i)\not\equiv i, i+1\pmod n[/math].

Take [math]B=\{(0,0),(1,1),\ldots,(n-1,n-1), (0,1),(1,2),\ldots,(n-2,n-1),(n-1,0)\}[/math] as the chess board. A permutation [math]\pi[/math] which defines a way of seating the husbands, is a placement of [math]n[/math] non-attacking rooks such that none of them is in [math]B[/math].

Solid white.svg a b c d e f g h Solid white.svg
8 a8 cross b8 cross c8 __ d8 __ e8 __ f8 __ g8 __ h8 __ 8
7 a7 __ b7 cross c7 cross d7 __ e7 __ f7 __ g7 __ h7 __ 7
6 a6 __ b6 __ c6 cross d6 cross e6 __ f6 __ g6 __ h6 __ 6
5 a5 __ b5 __ c5 __ d5 cross e5 cross f5 __ g5 __ h5 __ 5
4 a4 __ b4 __ c4 __ d4 __ e4 cross f4 cross g4 __ h4 __ 4
3 a3 __ b3 __ c3 __ d3 __ e3 __ f3 cross g3 cross h3 __ 3
2 a2 __ b2 __ c2 __ d2 __ e2 __ f2 __ g2 cross h2 cross 2
1 a1 cross b1 __ c1 __ d1 __ e1 __ f1 __ g1 __ h1 cross 1
Solid white.svg a b c d e f g h Solid white.svg

We need to compute [math]r_k[/math], the number of ways of placing [math]k[/math] non-attacking rooks on [math]B[/math]. For our choice of [math]B[/math], [math]r_k[/math] is the number of ways of choosing [math]k[/math] points, no two consecutive, from a collection of [math]2n[/math] points arranged in a circle.

We first see how to do this in a line.

Lemma
The number of ways of choosing [math]k[/math] non-consecutive objects from a collection of [math]m[/math] objects arranged in a line, is [math]{m-k+1\choose k}[/math].
Proof.

We draw a line of [math]m-k[/math] black points, and then insert [math]k[/math] red points into the [math]m-k+1[/math] spaces between the black points (including the beginning and end).

[math] \begin{align} &\sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \\ &\qquad\qquad\qquad\quad\Downarrow\\ &\sqcup \, \bullet \,\, {\color{Red}\bullet} \, \bullet \,\, {\color{Red}\bullet} \, \bullet \, \sqcup \, \bullet \,\, {\color{Red}\bullet}\, \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \,\, {\color{Red}\bullet} \end{align} [/math]

This gives us a line of [math]m[/math] points, and the red points specifies the chosen objects, which are non-consecutive. The mapping is 1-1 correspondence. There are [math]{m-k+1\choose k}[/math] ways of placing [math]k[/math] red points into [math]m-k+1[/math] spaces.

[math]\square[/math]

The problem of choosing non-consecutive objects in a circle can be reduced to the case that the objects are in a line.

Lemma
The number of ways of choosing [math]k[/math] non-consecutive objects from a collection of [math]m[/math] objects arranged in a circle, is [math]\frac{m}{m-k}{m-k\choose k}[/math].
Proof.

Let [math]f(m,k)[/math] be the desired number; and let [math]g(m,k)[/math] be the number of ways of choosing [math]k[/math] non-consecutive points from [math]m[/math] points arranged in a circle, next coloring the [math]k[/math] points red, and then coloring one of the uncolored point blue.

Clearly, [math]g(m,k)=(m-k)f(m,k)[/math].

But we can also compute [math]g(m,k)[/math] as follows:

  • Choose one of the [math]m[/math] points and color it blue. This gives us [math]m[/math] ways.
  • Cut the circle to make a line of [math]m-1[/math] points by removing the blue point.
  • Choose [math]k[/math] non-consecutive points from the line of [math]m-1[/math] points and color them red. This gives [math]{m-k\choose k}[/math] ways due to the previous lemma.

Thus, [math]g(m,k)=m{m-k\choose k}[/math]. Therefore we have the desired number [math]f(m,k)=\frac{m}{m-k}{m-k\choose k}[/math].

[math]\square[/math]

By the above lemma, we have that [math]r_k=\frac{2n}{2n-k}{2n-k\choose k}[/math]. Then apply the theorem of counting permutations with restricted positions,

[math] N_0=\sum_{k=0}^n(-1)^kr_k(n-k)!=\sum_{k=0}^n(-1)^k\frac{2n}{2n-k}{2n-k\choose k}(n-k)!. [/math]

This gives the number of ways of seating the [math]n[/math] husbands after the ladies are seated. Recall that there are [math]2n![/math] ways of seating the [math]n[/math] ladies. Thus, the total number of ways of seating [math]n[/math] couples as required by problème des ménages is

[math] 2n!\sum_{k=0}^n(-1)^k\frac{2n}{2n-k}{2n-k\choose k}(n-k)!. [/math]

Inversion

Posets

A partially ordered set or poset for short is a set [math]P[/math] together with a binary relation denoted [math]\le_P[/math] (or just [math]\le[/math] if no confusion is caused), satisfying

  • (reflexivity) For all [math]x\in P, x\le x[/math].
  • (antisymmetry) If [math]x\le y[/math] and [math]y\le x[/math], then [math]x=y[/math].
  • (transitivity) If [math]x\le y[/math] and [math]y\le z[/math], then [math]x\le z[/math].

We say two elements [math]x[/math] and [math]y[/math] are comparable if [math]x\le y[/math] or [math]y\le x[/math]; otherwise [math]x[/math] and [math]y[/math] are incomparable.

Notation
  • [math]x\ge y[/math] means [math]y\le x[/math].
  • [math]x\lt y[/math] means [math]x\le y[/math] and [math]x\neq y[/math].
  • [math]x\gt y[/math] means [math]y\lt x[/math].

The Möbius function

Let [math]P[/math] be a finite poset. Consider functions in form of [math]\alpha:P\times P\rightarrow\mathbb{R}[/math] defined over domain [math]P\times P[/math]. It is convenient to treat such functions as matrices whose rows and columns are indexed by [math]P[/math].

Incidence algebra of poset
Let
[math]I(P)=\{\alpha:P\times P\rightarrow\mathbb{R}\mid \alpha(x,y)=0\text{ for all }x\not\le_P y\}[/math]
be the class of [math]\alpha[/math] such that [math]\alpha(x,y)[/math] is non-zero only for [math]x\le_P y[/math].
Treating [math]\alpha[/math] as matrix, it is trivial to see that [math]I(P)[/math] is closed under addition and scalar multiplication, that is,
  • if [math]\alpha,\beta\in I(P)[/math] then [math]\alpha+\beta\in I(P)[/math];
  • if [math]\alpha\in I(P)[/math] then [math]c\alpha\in I(P)[/math] for any [math]c\in\mathbb{R}[/math];
where [math]\alpha,\beta[/math] are treated as matrices.
With this spirit, it is natural to define the matrix multiplication in [math]I(P)[/math]. For [math]\alpha,\beta\in I(P)[/math],
[math](\alpha\beta)(x,y)=\sum_{z\in P}\alpha(x,z)\beta(z,y)=\sum_{x\le z\le y}\alpha(x,z)\beta(z,y)[/math].
The second equation is due to that for [math]\alpha,\beta\in I(P)[/math], for all [math]z[/math] other than [math]x\le z\le y[/math], [math]\alpha(x,z)\beta(z,y)[/math] is zero.
By the transitivity of relation [math]\le_P[/math], it is also easy to prove that [math]I(P)[/math] is closed under matrix multiplication (the detailed proof is left as an exercise). Therefore, [math]I(P)[/math] is closed under addition, scalar multiplication and matrix multiplication, so we have an algebra [math]I(P)[/math], called incidence algebra, over functions on [math]P\times P[/math].
Zeta function and Möbius function
A special function in [math]I(P)[/math] is the so-called zeta function [math]\zeta[/math], defined as
[math]\zeta(x,y)=\begin{cases}1&\text{if }x\le_P y,\\0 &\text{otherwise.}\end{cases}[/math]
As a matrix (or more accurately, as an element of the incidence algebra), [math]\zeta[/math] is invertible and its inversion, denoted by [math]\mu[/math], is called the Möbius function. More precisely, [math]\mu[/math] is also in the incidence algebra [math]I(P)[/math], and [math]\mu\zeta=I[/math] where [math]I[/math] is the identity matrix (the identity of the incidence algebra [math]I(P)[/math]).

There is an equivalent explicit definition of Möbius function.

Definition (Möbius function)
[math]\mu(x,y)=\begin{cases} -\sum_{x\le z\lt y}\mu(x,z)&\text{if }x\lt y,\\ 1&\text{if }x=y,\\ 0&\text{if }x\not\le y. \end{cases} [/math]

To see the equivalence between this definition and the inversion of zeta function, we may have the following proposition, which is proved by directly evaluating [math]\mu\zeta[/math].

Proposition
For any [math]x,y\in P[/math],
[math]\sum_{x\le z\le y}\mu(x,z)=\begin{cases}1 &\text{if }x=y,\\ 0 &\text{otherwise.}\end{cases}[/math]
Proof.

It holds that

[math](\mu\zeta)(x,y)=\sum_{x\le z\le y}\mu(x,z)\zeta(z,y)=\sum_{x\le z\le y}\mu(x,z)[/math].

On the other hand, [math]\mu\zeta=I[/math], i.e.

[math](\mu\zeta)(x,y)=\begin{cases}1 &\text{if }x=y,\\ 0 &\text{otherwise.}\end{cases}[/math]

The proposition follows.

[math]\square[/math]

Note that [math]\mu(x,y)=\sum_{x\le z\le y}\mu(x,z)-\sum_{x\le z\lt y}\mu(x,z)[/math], which gives the above inductive definition of Möbius function.

Computing Möbius functions

We consider the simple poset [math]P=[n][/math], where [math]\le[/math] is the total order. It follows directly from the recursive definition of Möbius function that

[math]\mu(i,j)=\begin{cases}1 & \text{if }i=j,\\ -1 & \text{if }i+1=j,\\ 0 & \text{otherwise.} \end{cases} [/math]

Usually for general posets, it is difficult to directly compute the Möbius function from its definition. We introduce a rule helping us compute the Möbius function by decomposing the poset into posets with simple structures.

Theorem (the product rule)
Let [math]P[/math] and [math]Q[/math] be two finite posets, and [math]P\times Q[/math] be the poset resulted from Cartesian product of [math]P[/math] and [math]Q[/math], where for all [math](x,y), (x',y')\in P\times Q[/math], [math](x,y)\le (x',y')[/math] if and only if [math]x\le x'[/math] and [math]y\le y'[/math]. Then
[math]\mu_{P\times Q}((x,y),(x',y'))=\mu_P(x,x')\mu_Q(y,y')[/math].
Proof.

We use the recursive definition

[math]\mu(x,y)=\begin{cases} -\sum_{x\le z\lt y}\mu(x,z)&\text{if }x\lt y,\\ 1&\text{if }x=y,\\ 0&\text{if }x\not\le y. \end{cases} [/math]

to prove the equation in the theorem.

If [math](x,y)=(x',y')[/math], then [math]x=x'[/math] and [math]y=y'[/math]. It is easy to see that both sides of the equation are 1. If [math](x,y)\not\le(x',y')[/math], then either [math]x\not\le x'[/math] or [math]y\not\le y'[/math]. It is also easy to see that both sides are 0.

The only remaining case is that [math](x,y)\lt (x',y')[/math], in which case either [math]x\lt x'[/math] or [math]y\lt y'[/math].

[math] \begin{align} \sum_{(x,y)\le (u,v)\le (x',y')}\mu_P(x,u)\mu_Q(y,v) &=\left(\sum_{x\le u\le x'}\mu_P(x,u)\right)\left(\sum_{y\le v\le y'}\mu_Q(y,v)\right)=I(x,x')I(y,y')=0, \end{align} [/math]

where the last two equations are due to the proposition for [math]\mu[/math]. Thus

[math]\mu_P(x,x')\mu_Q(y,y')=-\sum_{(x,y)\le (u,v)\lt (x',y')}\mu_P(x,u)\mu_Q(y,v)[/math].

By induction, assume that the equation [math]\mu_{P\times Q}((x,y),(u,v))=\mu_P(x,u)\mu_Q(y,v)[/math] is true for all [math](u,v)\lt (x',y')[/math]. Then

[math] \begin{align} \mu_{P\times Q}((x,y),(x',y')) &=-\sum_{(x,y)\le (u,v)\lt (x',y')}\mu_{P\times Q}((x,y),(u,v))\\ &=-\sum_{(x,y)\le (u,v)\lt (x',y')}\mu_P(x,u)\mu_Q(y,v)\\ &=\mu_P(x,x')\mu_Q(y,y'), \end{align} [/math]

which complete the proof.

[math]\square[/math]
Poset of subsets
Consider the poset defined by all subsets of a finite universe [math]U[/math], that is [math]P=2^U[/math], and for [math]S,T\subseteq U[/math], [math]S\le_P T[/math] if and only if [math]S\subseteq T[/math].
Möbius function for subsets
The Möbius function for the above defined poset [math]P[/math] is that for [math]S,T\subseteq U[/math],
[math]\mu(S,T)= \begin{cases} (-1)^{|T|-|S|} & \text{if }S\subseteq T,\\ 0 &\text{otherwise.} \end{cases} [/math]
Proof.

We can equivalently represent each [math]S\subseteq U[/math] by a boolean string [math]S\in\{0,1\}^U[/math], where [math]S(x)=1[/math] if and only if [math]x\in S[/math].

For each element [math]x\in U[/math], we can define a poset [math]P_x=\{0, 1\}[/math] with [math]0\le 1[/math]. By definition of Möbius function, the Möbius function of this elementary poset is given by [math]\mu_x(0,0)=\mu_x(1,1)=1[/math], [math]\mu_x(0,1)=-1[/math] and [math]\mu(1,0)=0[/math].

The poset [math]P[/math] of all subsets of [math]U[/math] is the Cartesian product of all [math]P_x[/math], [math]x\in U[/math]. By the product rule,

[math]\mu(S,T)=\prod_{x\in U}\mu_x(S(x), T(x))=\prod_{x\in S\atop x\in T}1\prod_{x\not\in S\atop x\not\in T}1\prod_{x\in S\atop x\not\in T}0\prod_{x\not\in S\atop x\in T}(-1)=\begin{cases} (-1)^{|T|-|S|} & \text{if }S\subseteq T,\\ 0 &\text{otherwise.} \end{cases}[/math]
[math]\square[/math]
Note that the poset [math]P[/math] is actually the Boolean algebra of rank [math]|U|[/math]. The proof relies only on that the fact that the poset is a Boolean algebra, thus the theorem holds for Boolean algebra posets.
Posets of divisors
Consider the poset defined by all devisors of a positive integer [math]n[/math], that is [math]P=\{a\gt 0\mid a|n\}[/math], and for [math]a,b\in P[/math], [math]a\le_P b[/math] if and only if [math]a|b\,[/math].
Möbius function for divisors
The Möbius function for the above defined poset [math]P[/math] is that for [math]a,b\gt 0[/math] that [math]a|n[/math] and [math]b|n[/math],
[math]\mu(a,b)= \begin{cases} (-1)^{r} & \text{if }\frac{b}{a}\text{ is the product of }r\text{ distinct primes},\\ 0 &\text{otherwise, i.e. if }a\not|b\text{ or }\frac{b}{a}\text{ is not squarefree.} \end{cases} [/math]
Proof.

Denote [math]n=p_1^{n_1}p_2^{n_2}\cdots p_k^{n_k}[/math]. Represent [math]n[/math] by a tuple [math](n_1,n_2,\ldots,n_k)[/math]. Every [math]a\in P[/math] corresponds in this way to a tuple [math](a_1,a_2,\ldots,a_k)[/math] with [math]a_i\le n_i[/math] for all [math]1\le i\le k[/math].

Let [math]P_i=\{1,2,\ldots,n_i\}[/math] be the poset with [math]\le[/math] being the total order. The poset [math]P[/math] of divisors of [math]n[/math] is thus isomorphic to the poset constructed by the Cartesian product of all [math]P_i[/math], [math]1\le i\le k[/math]. Then

[math] \begin{align} \mu(a,b) &=\prod_{1\le i\le k}\mu(a_i,b_i)=\prod_{1\le i\le k\atop a_i=b_i}1\prod_{1\le i\le k\atop b_i-a_i=1}(-1)\prod_{1\le i\le k\atop b_i-a_i\not\in\{0,1\}}0 =\begin{cases} (-1)^{\sum_{i}(b_i-a_i)} & \text{if all }b_i-a_i\in\{0,1\},\\ 0 &\text{otherwise.} \end{cases}\\ &=\begin{cases} (-1)^{r} & \text{if }\frac{b}{a}\text{ is the product of }r\text{ distinct primes},\\ 0 &\text{otherwise.} \end{cases} \end{align} [/math]
[math]\square[/math]

Principle of Möbius inversion

We now introduce the the famous Möbius inversion formula.

Möbius inversion formula
Let [math]P[/math] be a finite poset and [math]\mu[/math] its Möbius function. Let [math]f,g:P\rightarrow \mathbb{R}[/math]. Then
[math]\forall x\in P,\,\, g(x)=\sum_{y\le x} f(y)[/math],
if and only if
[math]\forall x\in P,\,\, f(x)=\sum_{y\le x}g(y)\mu(y,x)[/math].

The functions [math]f,g:P\rightarrow\mathbb{R}[/math] are vectors. Evaluate the matrix multiplications [math]f\zeta[/math] and [math]g\mu[/math] as follows:

[math](f\zeta)(x)=\sum_{y\in P}f(y)\zeta(y,x)=\sum_{y\le x}f(y)[/math],

and

[math](g\mu)(x)=\sum_{y\in P}g(y)\mu(y,x)=\sum_{y\le x}g(y)\mu(y,x)[/math].

The Möbius inversion formula is nothing but the following statement

[math]f\zeta=g\Leftrightarrow f=g\mu[/math],

which is trivially true due to [math]\mu\zeta=I[/math] by basic linear algebra.

The following dual form of the inversion formula is also useful.

Möbius inversion formula, dual form
Let [math]P[/math] be a finite poset and [math]\mu[/math] its Möbius function. Let [math]f,g:P\rightarrow \mathbb{R}[/math]. Then
[math]\forall x\in P, \,\, g(x)=\sum_{y{\color{red}\ge} x} f(y)[/math],
if and only if
[math]\forall x\in P, \,\, f(x)=\sum_{y{\color{red}\ge} x}\mu(x,y)g(y)[/math].

To prove the dual form, we only need to evaluate the matrix multiplications on left:

[math]\zeta f=g\Leftrightarrow f=\mu g[/math].
Principle of Inclusion-Exclusion
Let [math]A_1,A_2,\ldots,A_n\subseteq U[/math]. For any [math]J\subseteq\{1,2,\ldots,n\}[/math],
  • let [math]f(J)[/math] be the number of elements that belongs to exactly the sets [math]A_i, i\in J[/math] and to no others, i.e.
[math]f(J)=\left|\left(\bigcap_{i\in J}A_i\right)\setminus\left(\bigcup_{i\not\in J}A_i\right)\right|[/math];
  • let [math]g(J)=\left|\bigcap_{i\in J}A_i\right|[/math].
For any [math]J\subseteq\{1,2,\ldots,n\}[/math], the following relation holds for the above defined [math]f[/math] and [math]g[/math]:
[math]g(J)=\sum_{I\supseteq J}f(I)[/math].
Applying the dual form of the Möbius inversion formula, we have that for any [math]J\subseteq\{1,2,\ldots,n\}[/math],
[math]f(J)=\sum_{I\supseteq J}\mu(J,I)g(I)=\sum_{I\supseteq J}\mu(J,I)\left|\bigcap_{i\in I}A_i\right|[/math],
where the Möbius function is for the poset of all subsets of [math]\{1,2,\ldots,n\}[/math], ordered by [math]\subseteq[/math], thus it holds that [math]\mu(J,I)=(-1)^{|I|-|J|}\,[/math] for [math]J\subseteq I[/math]. Therefore,
[math]f(J)=\sum_{I\supseteq J}(-1)^{|I|-|J|}\left|\bigcap_{i\in I}A_i\right|[/math].
We have a formula for the number of elements with exactly those properties [math]A_i, i\in J[/math] for any [math]J\subseteq\{1,2,\ldots,n\}[/math]. For the special case that [math]J=\emptyset[/math], [math]f(\emptyset)[/math] is the number of elements satisfying no property of [math]A_1,A_2,\ldots,A_n[/math], and
[math]f(\emptyset)=\left|U\setminus\bigcup_iA_i\right|=\sum_{I\subseteq \{1,\ldots,n\}}(-1)^{|I|}\left|\bigcap_{i\in I}A_i\right|[/math]
which gives precisely the Principle of Inclusion-Exclusion.
Möbius inversion formula for number theory
The number-theoretical Möbius inversion formula is stated as such: Let [math]N[/math] be a positive integer,
[math]g(n)=\sum_{d|n}f(d)\,[/math] for all [math]n|N[/math]
if and only if
[math]f(n)=\sum_{d|n}g(d)\mu\left(\frac{n}{d}\right)\,[/math] for all [math]n|N[/math],
where [math]\mu[/math] is the number-theoretical Möbius function, defined as
[math]\mu(n)=\begin{cases}1 & \text{if }n\text{ is product of an even number of distinct primes,}\\ -1 &\text{if }n\text{ is product of an odd number of distinct primes,}\\ 0 &\text{otherwise.}\end{cases}[/math]
The number-theoretical Möbius inversion formula is just a special case of the Möbius inversion formula for posets, when the poset is the set of divisors of [math]N[/math], and for any [math]a,b\in P[/math], [math]a\le_P b[/math] if [math]a|b[/math].

Sieve Method in Number Theory

The Euler totient function

Two integers [math]m, n[/math] are said to be relatively prime if their greatest common diviser [math]\mathrm{gcd}(m,n)=1[/math]. For a positive integer [math]n[/math], let [math]\phi(n)[/math] be the number of positive integers from [math]\{1,2,\ldots,n\}[/math] that are relative prime to [math]n[/math]. This function, called the Euler [math]\phi[/math] function or the Euler totient function, is fundamental in number theory.

We now derive a formula for this function by using the principle of inclusion-exclusion.

Theorem (The Euler totient function)

Suppose [math]n[/math] is divisible by precisely [math]r[/math] different primes, denoted [math]p_1,\ldots,p_r[/math]. Then

[math]\phi(n)=n\prod_{i=1}^r\left(1-\frac{1}{p_i}\right)[/math].
Proof.

Let [math]U=\{1,2,\ldots,n\}[/math] be the universe. The number of positive integers from [math]U[/math] which is divisible by some [math]p_{i_1},p_{i_2},\ldots,p_{i_s}\in\{p_1,\ldots,p_r\}[/math], is [math]\frac{n}{p_{i_1}p_{i_2}\cdots p_{i_s}}[/math].

[math]\phi(n)[/math] is the number of integers from [math]U[/math] which is not divisible by any [math]p_1,\ldots,p_r[/math]. By principle of inclusion-exclusion,

[math] \begin{align} \phi(n) &=n+\sum_{k=1}^r(-1)^k\sum_{1\le i_1\lt i_2\lt \cdots \lt i_k\le n}\frac{n}{p_{i_1}p_{i_2}\cdots p_{i_k}}\\ &=n-\sum_{1\le i\le n}\frac{n}{p_i}+\sum_{1\le i\lt j\le n}\frac{n}{p_i p_j}-\sum_{1\le i\lt j\lt k\le n}\frac{n}{p_{i} p_{j} p_{k}}+\cdots + (-1)^r\frac{n}{p_{1}p_{2}\cdots p_{r}}\\ &=n\left(1-\sum_{1\le i\le n}\frac{1}{p_i}+\sum_{1\le i\lt j\le n}\frac{1}{p_i p_j}-\sum_{1\le i\lt j\lt k\le n}\frac{1}{p_{i} p_{j} p_{k}}+\cdots + (-1)^r\frac{1}{p_{1}p_{2}\cdots p_{r}}\right)\\ &=n\prod_{i=1}^r\left(1-\frac{1}{p_i}\right). \end{align} [/math]
[math]\square[/math]


Reference

  • Stanley, Enumerative Combinatorics, Volume 1, Chapter 2.
  • van Lin and Wilson, A course in combinatorics, Chapter 10, 25.