组合数学 (Fall 2011)/Sieve methods

Principle of Inclusion-Exclusion

Let ${\displaystyle A}$ and ${\displaystyle B}$ be two finite sets. The cardinality of their union is

${\displaystyle |A\cup B|=|A|+|B|-{\color {Blue}|A\cap B|}}$.

For three sets ${\displaystyle A}$, ${\displaystyle B}$, and ${\displaystyle C}$, the cardinality of the union of these three sets is computed as

${\displaystyle |A\cup B\cup C|=|A|+|B|+|C|-{\color {Blue}|A\cap B|}-{\color {Blue}|A\cap C|}-{\color {Blue}|B\cap C|}+{\color {Red}|A\cap B\cap C|}}$.

This is illustrated by the following figure.

Generally, the Principle of Inclusion-Exclusion states the rule for computing the union of ${\displaystyle n}$ finite sets ${\displaystyle A_{1},A_{2},\ldots ,A_{n}}$, such that

{\displaystyle {\begin{aligned}\left|\bigcup _{i=1}^{n}A_{i}\right|&=\sum _{I\subseteq \{1,\ldots ,n\}}(-1)^{|I|-1}\left|\bigcap _{i\in I}A_{i}\right|.\end{aligned}}}

In combinatorial enumeration, the Principle of Inclusion-Exclusion is usually applied in its complement form.

Let ${\displaystyle A_{1},A_{2},\ldots ,A_{n}\subseteq U}$ be subsets of some finite set ${\displaystyle U}$. Here ${\displaystyle U}$ is some universe of combinatorial objects, whose cardinality is easy to calculate (e.g. all strings, tuples, permutations), and each ${\displaystyle A_{i}}$ contains the objects with some specific property (e.g. a "pattern") which we want to avoid. The problem is to count the number of objects without any of the ${\displaystyle n}$ properties. We write ${\displaystyle {\bar {A_{i}}}=U-A_{i}}$. The number of objects without any of the properties ${\displaystyle A_{1},A_{2},\ldots ,A_{n}}$ is

{\displaystyle {\begin{aligned}\left|{\bar {A_{1}}}\cap {\bar {A_{2}}}\cap \cdots \cap {\bar {A_{n}}}\right|=\left|U-\bigcup _{i=1}^{n}A_{i}\right|&=|U|+\sum _{I\subseteq \{1,\ldots ,n\}}(-1)^{|I|}\left|\bigcap _{i\in I}A_{i}\right|.\end{aligned}}}

For an ${\displaystyle I\subseteq \{1,2,\ldots ,n\}}$, we denote

${\displaystyle A_{I}=\bigcap _{i\in I}A_{i}}$

with the convention that ${\displaystyle A_{\emptyset }=U}$. The above equation is stated as:

 Principle of Inclusion-Exclusion Let ${\displaystyle A_{1},A_{2},\ldots ,A_{n}}$ be a family of subsets of ${\displaystyle U}$. Then the number of elements of ${\displaystyle U}$ which lie in none of the subsets ${\displaystyle A_{i}}$ is ${\displaystyle \sum _{I\subseteq \{1,\ldots ,n\}}(-1)^{|I|}|A_{I}|}$.

Let ${\displaystyle S_{k}=\sum _{|I|=k}|A_{I}|\,}$. Conventionally, ${\displaystyle S_{0}=|A_{\emptyset }|=|U|}$. The principle of inclusion-exclusion can be expressed as

${\displaystyle S_{0}-S_{1}+S_{2}+\cdots +(-1)^{n}S_{n}.}$

Surjections

In the twelvefold way, we discuss the counting problems incurred by the mappings ${\displaystyle f:N\rightarrow M}$. The basic case is that elements from both ${\displaystyle N}$ and ${\displaystyle M}$ are distinguishable. In this case, it is easy to count the number of arbitrary mappings (which is ${\displaystyle m^{n}}$) and the number of injective (one-to-one) mappings (which is ${\displaystyle (m)_{n}}$), but the number of surjective is difficult. Here we apply the principle of inclusion-exclusion to count the number of surjective (onto) mappings.

 Theorem The number of surjective mappings from an ${\displaystyle n}$-set to an ${\displaystyle m}$-set is given by ${\displaystyle \sum _{k=1}^{m}(-1)^{m-k}{m \choose k}k^{n}}$.
Proof.
 Let ${\displaystyle U=\{f:[n]\rightarrow [m]\}}$ be the set of mappings from ${\displaystyle [n]}$ to ${\displaystyle [m]}$. Then ${\displaystyle |U|=m^{n}}$. For ${\displaystyle i\in [m]}$, let ${\displaystyle A_{i}}$ be the set of mappings ${\displaystyle f:[n]\rightarrow [m]}$ that none of ${\displaystyle j\in [n]}$ is mapped to ${\displaystyle i}$, i.e. ${\displaystyle A_{i}=\{f:[n]\rightarrow [m]\setminus \{i\}\}}$, thus ${\displaystyle |A_{i}|=(m-1)^{n}}$. More generally, for ${\displaystyle I\subseteq [m]}$, ${\displaystyle A_{I}=\bigcap _{i\in I}A_{i}}$ contains the mappings ${\displaystyle f:[n]\rightarrow [m]\setminus I}$. And ${\displaystyle |A_{I}|=(m-|I|)^{n}\,}$. A mapping ${\displaystyle f:[n]\rightarrow [m]}$ is surjective if ${\displaystyle f}$ lies in none of ${\displaystyle A_{i}}$. By the principle of inclusion-exclusion, the number of surjective ${\displaystyle f:[n]\rightarrow [m]}$ is ${\displaystyle \sum _{I\subseteq [m]}(-1)^{|I|}\left|A_{I}\right|=\sum _{I\subseteq [m]}(-1)^{|I|}(m-|I|)^{n}=\sum _{j=0}^{m}(-1)^{j}{m \choose j}(m-j)^{n}}$. Let ${\displaystyle k=m-j}$. The theorem is proved.
${\displaystyle \square }$

Recall that, in the twelvefold way, we establish a relation between surjections and partitions.

• Surjection to ordered partition:
For a surjective ${\displaystyle f:[n]\rightarrow [m]}$, ${\displaystyle (f^{-1}(0),f^{-1}(1),\ldots ,f^{-1}(m-1))}$ is an ordered partition of ${\displaystyle [n]}$.
• Ordered partition to surjection:
For an ordered ${\displaystyle m}$-partition ${\displaystyle (B_{0},B_{1},\ldots ,B_{m-1})}$ of ${\displaystyle [n]}$, we can define a function ${\displaystyle f:[n]\rightarrow [m]}$ by letting ${\displaystyle f(i)=j}$ if and only if ${\displaystyle i\in B_{j}}$. ${\displaystyle f}$ is surjective since as a partition, none of ${\displaystyle B_{i}}$ is empty.

Therefore, we have a one-to-one correspondence between surjective mappings from an ${\displaystyle n}$-set to an ${\displaystyle m}$-set and the ordered ${\displaystyle m}$-partitions of an ${\displaystyle n}$-set.

The Stirling number of the second kind ${\displaystyle S(n,m)}$ is the number of ${\displaystyle m}$-partitions of an ${\displaystyle n}$-set. There are ${\displaystyle m!}$ ways to order an ${\displaystyle m}$-partition, thus the number of surjective mappings ${\displaystyle f:[n]\rightarrow [m]}$ is ${\displaystyle m!S(n,m)}$. Combining with what we have proved for surjections, we give the following result for the Stirling number of the second kind.

 Proposition ${\displaystyle S(n,m)={\frac {1}{m!}}\sum _{k=1}^{m}(-1)^{m-k}{m \choose k}k^{n}}$.

Derangements

We now count the number of bijections from a set to itself with no fixed points. This is the derangement problem.

For a permutation ${\displaystyle \pi }$ of ${\displaystyle \{1,2,\ldots ,n\}}$, a fixed point is such an ${\displaystyle i\in \{1,2,\ldots ,n\}}$ that ${\displaystyle \pi (i)=i}$. A derangement of ${\displaystyle \{1,2,\ldots ,n\}}$ is a permutation of ${\displaystyle \{1,2,\ldots ,n\}}$ that has no fixed points.

 Theorem The number of derangements of ${\displaystyle \{1,2,\ldots ,n\}}$ given by ${\displaystyle n!\sum _{k=0}^{n}{\frac {(-1)^{k}}{k!}}\approx {\frac {n!}{\mathrm {e} }}}$.
Proof.
 Let ${\displaystyle U}$ be the set of all permutations of ${\displaystyle \{1,2,\ldots ,n\}}$. So ${\displaystyle |U|=n!}$. Let ${\displaystyle A_{i}}$ be the set of permutations with fixed point ${\displaystyle i}$; so ${\displaystyle |A_{i}|=(n-1)!}$. More generally, for any ${\displaystyle I\subseteq \{1,2,\ldots ,n\}}$, ${\displaystyle A_{I}=\bigcap _{i\in I}A_{i}}$, and ${\displaystyle |A_{I}|=(n-|I|)!}$, since permutations in ${\displaystyle A_{I}}$ fix every point in ${\displaystyle I}$ and permute the remaining points arbitrarily. A permutation is a derangement if and only if it lies in none of the sets ${\displaystyle A_{i}}$. So the number of derangements is ${\displaystyle \sum _{I\subseteq \{1,2,\ldots ,n\}}(-1)^{|I|}(n-|I|)!=\sum _{k=0}^{n}(-1)^{k}{n \choose k}(n-k)!=n!\sum _{k=0}^{n}{\frac {(-1)^{k}}{k!}}.}$ By Taylor's series, ${\displaystyle {\frac {1}{\mathrm {e} }}=\sum _{k=0}^{\infty }{\frac {(-1)^{k}}{k!}}=\sum _{k=0}^{n}{\frac {(-1)^{k}}{k!}}\pm o\left({\frac {1}{n!}}\right)}$. It is not hard to see that ${\displaystyle n!\sum _{k=0}^{n}{\frac {(-1)^{k}}{k!}}}$ is the closest integer to ${\displaystyle {\frac {n!}{\mathrm {e} }}}$.
${\displaystyle \square }$

Therefore, there are about ${\displaystyle {\frac {1}{\mathrm {e} }}}$ fraction of all permutations with no fixed points.

Permutations with restricted positions

We introduce a general theory of counting permutations with restricted positions. In the derangement problem, we count the number of permutations that ${\displaystyle \pi (i)\neq i}$. We now generalize to the problem of counting permutations which avoid a set of arbitrarily specified positions.

It is traditionally described using terminology from the game of chess. Let ${\displaystyle B\subseteq \{1,\ldots ,n\}\times \{1,\ldots ,n\}}$, called a board. As illustrated below, we can think of ${\displaystyle B}$ as a chess board, with the positions in ${\displaystyle B}$ marked by "${\displaystyle \times }$".

 a b c d e f g h 8 8 7 7 6 6 5 5 4 4 3 3 2 2 1 1 a b c d e f g h

For a permutation ${\displaystyle \pi }$ of ${\displaystyle \{1,\ldots ,n\}}$, define the graph ${\displaystyle G_{\pi }(V,E)}$ as

{\displaystyle {\begin{aligned}G_{\pi }&=\{(i,\pi (i))\mid i\in \{1,2,\ldots ,n\}\}.\end{aligned}}}

This can also be viewed as a set of marked positions on a chess board. Each row and each column has only one marked position, because ${\displaystyle \pi }$ is a permutation. Thus, we can identify each ${\displaystyle G_{\pi }}$ as a placement of ${\displaystyle n}$ rooks (“城堡”，规则同中国象棋里的“车”) without attacking each other.

For example, the following is the ${\displaystyle G_{\pi }}$ of such ${\displaystyle \pi }$ that ${\displaystyle \pi (i)=i}$.

 a b c d e f g h 8 8 7 7 6 6 5 5 4 4 3 3 2 2 1 1 a b c d e f g h

Now define

{\displaystyle {\begin{aligned}N_{0}&=\left|\left\{\pi \mid B\cap G_{\pi }=\emptyset \right\}\right|\\r_{k}&={\mbox{number of }}k{\mbox{-subsets of }}B{\mbox{ such that no two elements have a common coordinate}}\\&=\left|\left\{S\in {B \choose k}\,{\bigg |}\,\forall (i_{1},j_{1}),(i_{2},j_{2})\in S,i_{1}\neq i_{2},j_{1}\neq j_{2}\right\}\right|\end{aligned}}}

Interpreted in chess game,

• ${\displaystyle B}$: a set of marked positions in an ${\displaystyle [n]\times [n]}$ chess board.
• ${\displaystyle N_{0}}$: the number of ways of placing ${\displaystyle n}$ non-attacking rooks on the chess board such that none of these rooks lie in ${\displaystyle B}$.
• ${\displaystyle r_{k}}$: number of ways of placing ${\displaystyle k}$ non-attacking rooks on ${\displaystyle B}$.

Our goal is to count ${\displaystyle N_{0}}$ in terms of ${\displaystyle r_{k}}$. This gives the number of permutations avoid all positions in a ${\displaystyle B}$.

 Theorem ${\displaystyle N_{0}=\sum _{k=0}^{n}(-1)^{k}r_{k}(n-k)!}$.
Proof.
 For each ${\displaystyle i\in [n]}$, let ${\displaystyle A_{i}=\{\pi \mid (i,\pi (i))\in B\}}$ be the set of permutations ${\displaystyle \pi }$ whose ${\displaystyle i}$-th position is in ${\displaystyle B}$. ${\displaystyle N_{0}}$ is the number of permutations avoid all positions in ${\displaystyle B}$. Thus, our goal is to count the number of permutations ${\displaystyle \pi }$ in none of ${\displaystyle A_{i}}$ for ${\displaystyle i\in [n]}$. For each ${\displaystyle I\subseteq [n]}$, let ${\displaystyle A_{I}=\bigcap _{i\in I}A_{i}}$, which is the set of permutations ${\displaystyle \pi }$ such that ${\displaystyle (i,\pi (i))\in B}$ for all ${\displaystyle i\in I}$. Due to the principle of inclusion-exclusion, ${\displaystyle N_{0}=\sum _{I\subseteq [n]}(-1)^{|I|}|A_{I}|=\sum _{k=0}^{n}(-1)^{k}\sum _{I\in {[n] \choose k}}|A_{I}|}$. The next observation is that ${\displaystyle \sum _{I\in {[n] \choose k}}|A_{I}|=r_{k}(n-k)!}$, because we can count both sides by first placing ${\displaystyle k}$ non-attacking rooks on ${\displaystyle B}$ and placing ${\displaystyle n-k}$ additional non-attacking rooks on ${\displaystyle [n]\times [n]}$ in ${\displaystyle (n-k)!}$ ways. Therefore, ${\displaystyle N_{0}=\sum _{k=0}^{n}(-1)^{k}r_{k}(n-k)!}$.
${\displaystyle \square }$

Derangement problem

We use the above general method to solve the derange problem again.

Take ${\displaystyle B=\{(1,1),(2,2),\ldots ,(n,n)\}}$ as the chess board. A derangement ${\displaystyle \pi }$ is a placement of ${\displaystyle n}$ non-attacking rooks such that none of them is in ${\displaystyle B}$.

 a b c d e f g h 8 8 7 7 6 6 5 5 4 4 3 3 2 2 1 1 a b c d e f g h

Clearly, the number of ways of placing ${\displaystyle k}$ non-attacking rooks on ${\displaystyle B}$ is ${\displaystyle r_{k}={n \choose k}}$. We want to count ${\displaystyle N_{0}}$, which gives the number of ways of placing ${\displaystyle n}$ non-attacking rooks such that none of these rooks lie in ${\displaystyle B}$.

By the above theorem

${\displaystyle N_{0}=\sum _{k=0}^{n}(-1)^{k}r_{k}(n-k)!=\sum _{k=0}^{n}(-1)^{k}{n \choose k}(n-k)!=\sum _{k=0}^{n}(-1)^{k}{\frac {n!}{k!}}=n!\sum _{k=0}^{n}(-1)^{k}{\frac {1}{k!}}\approx {\frac {n!}{e}}.}$

Problème des ménages

Suppose that in a banquet, we want to seat ${\displaystyle n}$ couples at a circular table, satisfying the following constraints:

• Men and women are in alternate places.
• No one sits next to his/her spouse.

In how many ways can this be done?

(For convenience, we assume that every seat at the table marked differently so that rotating the seats clockwise or anti-clockwise will end up with a different solution.)

First, let the ${\displaystyle n}$ ladies find their seats. They may either sit at the odd numbered seats or even numbered seats, in either case, there are ${\displaystyle n!}$ different orders. Thus, there are ${\displaystyle 2(n!)}$ ways to seat the ${\displaystyle n}$ ladies.

After sitting the wives, we label the remaining ${\displaystyle n}$ places clockwise as ${\displaystyle 0,1,\ldots ,n-1}$. And a seating of the ${\displaystyle n}$ husbands is given by a permutation ${\displaystyle \pi }$ of ${\displaystyle [n]}$ defined as follows. Let ${\displaystyle \pi (i)}$ be the seat of the husband of he lady sitting at the ${\displaystyle i}$-th place.

It is easy to see that ${\displaystyle \pi }$ satisfies that ${\displaystyle \pi (i)\neq i}$ and ${\displaystyle \pi (i)\not \equiv i+1{\pmod {n}}}$, and every permutation ${\displaystyle \pi }$ with these properties gives a feasible seating of the ${\displaystyle n}$ husbands. Thus, we only need to count the number of permutations ${\displaystyle \pi }$ such that ${\displaystyle \pi (i)\not \equiv i,i+1{\pmod {n}}}$.

Take ${\displaystyle B=\{(0,0),(1,1),\ldots ,(n-1,n-1),(0,1),(1,2),\ldots ,(n-2,n-1),(n-1,0)\}}$ as the chess board. A permutation ${\displaystyle \pi }$ which defines a way of seating the husbands, is a placement of ${\displaystyle n}$ non-attacking rooks such that none of them is in ${\displaystyle B}$.

 a b c d e f g h 8 8 7 7 6 6 5 5 4 4 3 3 2 2 1 1 a b c d e f g h

We need to compute ${\displaystyle r_{k}}$, the number of ways of placing ${\displaystyle k}$ non-attacking rooks on ${\displaystyle B}$. For our choice of ${\displaystyle B}$, ${\displaystyle r_{k}}$ is the number of ways of choosing ${\displaystyle k}$ points, no two consecutive, from a collection of ${\displaystyle 2n}$ points arranged in a circle.

We first see how to do this in a line.

 Lemma The number of ways of choosing ${\displaystyle k}$ non-consecutive objects from a collection of ${\displaystyle m}$ objects arranged in a line, is ${\displaystyle {m-k+1 \choose k}}$.
Proof.
 We draw a line of ${\displaystyle m-k}$ black points, and then insert ${\displaystyle k}$ red points into the ${\displaystyle m-k+1}$ spaces between the black points (including the beginning and end). {\displaystyle {\begin{aligned}&\sqcup \,\bullet \,\sqcup \,\bullet \,\sqcup \,\bullet \,\sqcup \,\bullet \,\sqcup \,\bullet \,\sqcup \,\bullet \,\sqcup \,\bullet \,\sqcup \\&\qquad \qquad \qquad \quad \Downarrow \\&\sqcup \,\bullet \,\,{\color {Red}\bullet }\,\bullet \,\,{\color {Red}\bullet }\,\bullet \,\sqcup \,\bullet \,\,{\color {Red}\bullet }\,\,\bullet \,\sqcup \,\bullet \,\sqcup \,\bullet \,\,{\color {Red}\bullet }\end{aligned}}} This gives us a line of ${\displaystyle m}$ points, and the red points specifies the chosen objects, which are non-consecutive. The mapping is 1-1 correspondence. There are ${\displaystyle {m-k+1 \choose k}}$ ways of placing ${\displaystyle k}$ red points into ${\displaystyle m-k+1}$ spaces.
${\displaystyle \square }$

The problem of choosing non-consecutive objects in a circle can be reduced to the case that the objects are in a line.

 Lemma The number of ways of choosing ${\displaystyle k}$ non-consecutive objects from a collection of ${\displaystyle m}$ objects arranged in a circle, is ${\displaystyle {\frac {m}{m-k}}{m-k \choose k}}$.
Proof.
 Let ${\displaystyle f(m,k)}$ be the desired number; and let ${\displaystyle g(m,k)}$ be the number of ways of choosing ${\displaystyle k}$ non-consecutive points from ${\displaystyle m}$ points arranged in a circle, next coloring the ${\displaystyle k}$ points red, and then coloring one of the uncolored point blue. Clearly, ${\displaystyle g(m,k)=(m-k)f(m,k)}$. But we can also compute ${\displaystyle g(m,k)}$ as follows: Choose one of the ${\displaystyle m}$ points and color it blue. This gives us ${\displaystyle m}$ ways. Cut the circle to make a line of ${\displaystyle m-1}$ points by removing the blue point. Choose ${\displaystyle k}$ non-consecutive points from the line of ${\displaystyle m-1}$ points and color them red. This gives ${\displaystyle {m-k \choose k}}$ ways due to the previous lemma. Thus, ${\displaystyle g(m,k)=m{m-k \choose k}}$. Therefore we have the desired number ${\displaystyle f(m,k)={\frac {m}{m-k}}{m-k \choose k}}$.
${\displaystyle \square }$

By the above lemma, we have that ${\displaystyle r_{k}={\frac {2n}{2n-k}}{2n-k \choose k}}$. Then apply the theorem of counting permutations with restricted positions,

${\displaystyle N_{0}=\sum _{k=0}^{n}(-1)^{k}r_{k}(n-k)!=\sum _{k=0}^{n}(-1)^{k}{\frac {2n}{2n-k}}{2n-k \choose k}(n-k)!.}$

This gives the number of ways of seating the ${\displaystyle n}$ husbands after the ladies are seated. Recall that there are ${\displaystyle 2n!}$ ways of seating the ${\displaystyle n}$ ladies. Thus, the total number of ways of seating ${\displaystyle n}$ couples as required by problème des ménages is

${\displaystyle 2n!\sum _{k=0}^{n}(-1)^{k}{\frac {2n}{2n-k}}{2n-k \choose k}(n-k)!.}$

The Euler totient function

Two integers ${\displaystyle m,n}$ are said to be relatively prime if their greatest common diviser ${\displaystyle \mathrm {gcd} (m,n)=1}$. For a positive integer ${\displaystyle n}$, let ${\displaystyle \phi (n)}$ be the number of positive integers from ${\displaystyle \{1,2,\ldots ,n\}}$ that are relative prime to ${\displaystyle n}$. This function, called the Euler ${\displaystyle \phi }$ function or the Euler totient function, is fundamental in number theory.

We now derive a formula for this function by using the principle of inclusion-exclusion.

 Theorem (The Euler totient function) Suppose ${\displaystyle n}$ is divisible by precisely ${\displaystyle r}$ different primes, denoted ${\displaystyle p_{1},\ldots ,p_{r}}$. Then ${\displaystyle \phi (n)=n\prod _{i=1}^{r}\left(1-{\frac {1}{p_{i}}}\right)}$.
Proof.
 Let ${\displaystyle U=\{1,2,\ldots ,n\}}$ be the universe. The number of positive integers from ${\displaystyle U}$ which is divisible by some ${\displaystyle p_{i_{1}},p_{i_{2}},\ldots ,p_{i_{s}}\in \{p_{1},\ldots ,p_{r}\}}$, is ${\displaystyle {\frac {n}{p_{i_{1}}p_{i_{2}}\cdots p_{i_{s}}}}}$. ${\displaystyle \phi (n)}$ is the number of integers from ${\displaystyle U}$ which is not divisible by any ${\displaystyle p_{1},\ldots ,p_{r}}$. By principle of inclusion-exclusion, {\displaystyle {\begin{aligned}\phi (n)&=n+\sum _{k=1}^{r}(-1)^{k}\sum _{1\leq i_{1}
${\displaystyle \square }$

Möbius inversion

Posets

A partially ordered set or poset for short is a set ${\displaystyle P}$ together with a binary relation denoted ${\displaystyle \leq _{P}}$ (or just ${\displaystyle \leq }$ if no confusion is caused), satisfying

• (reflexivity) For all ${\displaystyle x\in P,x\leq x}$.
• (antisymmetry) If ${\displaystyle x\leq y}$ and ${\displaystyle y\leq x}$, then ${\displaystyle x=y}$.
• (transitivity) If ${\displaystyle x\leq y}$ and ${\displaystyle y\leq z}$, then ${\displaystyle x\leq z}$.

We say two elements ${\displaystyle x}$ and ${\displaystyle y}$ are comparable if ${\displaystyle x\leq y}$ or ${\displaystyle y\leq x}$; otherwise ${\displaystyle x}$ and ${\displaystyle y}$ are incomparable.

Notation
• ${\displaystyle x\geq y}$ means ${\displaystyle y\leq x}$.
• ${\displaystyle x means ${\displaystyle x\leq y}$ and ${\displaystyle x\neq y}$.
• ${\displaystyle x>y}$ means ${\displaystyle y.

The Möbius function

Let ${\displaystyle P}$ be a finite poset. Consider functions in form of ${\displaystyle \alpha :P\times P\rightarrow \mathbb {R} }$ defined over domain ${\displaystyle P\times P}$. It is convenient to treat such functions as matrices whose rows and columns are indexed by ${\displaystyle P}$.

Incidence algebra of poset
Let
${\displaystyle I(P)=\{\alpha :P\times P\rightarrow \mathbb {R} \mid \alpha (x,y)=0{\text{ for all }}x\not \leq _{P}y\}}$
be the class of ${\displaystyle \alpha }$ such that ${\displaystyle \alpha (x,y)}$ is non-zero only for ${\displaystyle x\leq _{P}y}$.
Treating ${\displaystyle \alpha }$ as matrix, it is trivial to see that ${\displaystyle I(P)}$ is closed under addition and scalar multiplication, that is,
• if ${\displaystyle \alpha ,\beta \in I(P)}$ then ${\displaystyle \alpha +\beta \in I(P)}$;
• if ${\displaystyle \alpha \in I(P)}$ then ${\displaystyle c\alpha \in I(P)}$ for any ${\displaystyle c\in \mathbb {R} }$;
where ${\displaystyle \alpha ,\beta }$ are treated as matrices.
With this spirit, it is natural to define the matrix multiplication in ${\displaystyle I(P)}$. For ${\displaystyle \alpha ,\beta \in I(P)}$,
${\displaystyle (\alpha \beta )(x,y)=\sum _{z\in P}\alpha (x,z)\beta (z,y)=\sum _{x\leq z\leq y}\alpha (x,z)\beta (z,y)}$.
The second equation is due to that for ${\displaystyle \alpha ,\beta \in I(P)}$, for all ${\displaystyle z}$ other than ${\displaystyle x\leq z\leq y}$, ${\displaystyle \alpha (x,z)\beta (z,y)}$ is zero.
By the transitivity of relation ${\displaystyle \leq _{P}}$, it is also easy to prove that ${\displaystyle I(P)}$ is closed under matrix multiplication (the detailed proof is left as an exercise). Therefore, ${\displaystyle I(P)}$ is closed under addition, scalar multiplication and matrix multiplication, so we have an algebra ${\displaystyle I(P)}$, called incidence algebra, over functions on ${\displaystyle P\times P}$.
Zeta function and Möbius function
A special function in ${\displaystyle I(P)}$ is the so-called zeta function ${\displaystyle \zeta }$, defined as
${\displaystyle \zeta (x,y)={\begin{cases}1&{\text{if }}x\leq _{P}y,\\0&{\text{otherwise.}}\end{cases}}}$
As a matrix (or more accurately, as an element of the incidence algebra), ${\displaystyle \zeta }$ is invertible and its inversion, denoted by ${\displaystyle \mu }$, is called the Möbius function. More precisely, ${\displaystyle \mu }$ is also in the incidence algebra ${\displaystyle I(P)}$, and ${\displaystyle \mu \zeta =I}$ where ${\displaystyle I}$ is the identity matrix (the identity of the incidence algebra ${\displaystyle I(P)}$).

There is an equivalent explicit definition of Möbius function.

 Definition (Möbius function) ${\displaystyle \mu (x,y)={\begin{cases}-\sum _{x\leq z

To see the equivalence between this definition and the inversion of zeta function, we may have the following proposition, which is proved by directly evaluating ${\displaystyle \mu \zeta }$.

 Proposition For any ${\displaystyle x,y\in P}$, ${\displaystyle \sum _{x\leq z\leq y}\mu (x,z)={\begin{cases}1&{\text{if }}x=y,\\0&{\text{otherwise.}}\end{cases}}}$
Proof.
 It holds that ${\displaystyle (\mu \zeta )(x,y)=\sum _{x\leq z\leq y}\mu (x,z)\zeta (z,y)=\sum _{x\leq z\leq y}\mu (x,z)}$. On the other hand, ${\displaystyle \mu \zeta =I}$, i.e. ${\displaystyle (\mu \zeta )(x,y)={\begin{cases}1&{\text{if }}x=y,\\0&{\text{otherwise.}}\end{cases}}}$ The proposition follows.
${\displaystyle \square }$

Note that ${\displaystyle \mu (x,y)=\sum _{x\leq z\leq y}\mu (x,z)-\sum _{x\leq z, which gives the above inductive definition of Möbius function.

Computing Möbius functions

We consider the simple poset ${\displaystyle P=[n]}$, where ${\displaystyle \leq }$ is the total order. It follows directly from the recursive definition of Möbius function that

${\displaystyle \mu (i,j)={\begin{cases}1&{\text{if }}i=j,\\-1&{\text{if }}i+1=j,\\0&{\text{otherwise.}}\end{cases}}}$

Usually for general posets, it is difficult to directly compute the Möbius function from its definition. We introduce a rule helping us compute the Möbius function by decomposing the poset into posets with simple structures.

 Theorem (the product rule) Let ${\displaystyle P}$ and ${\displaystyle Q}$ be two finite posets, and ${\displaystyle P\times Q}$ be the poset resulted from Cartesian product of ${\displaystyle P}$ and ${\displaystyle Q}$, where for all ${\displaystyle (x,y),(x',y')\in P\times Q}$, ${\displaystyle (x,y)\leq (x',y')}$ if and only if ${\displaystyle x\leq x'}$ and ${\displaystyle y\leq y'}$. Then ${\displaystyle \mu _{P\times Q}((x,y),(x',y'))=\mu _{P}(x,x')\mu _{Q}(y,y')}$.
Proof.
 We use the recursive definition ${\displaystyle \mu (x,y)={\begin{cases}-\sum _{x\leq z to prove the equation in the theorem. If ${\displaystyle (x,y)=(x',y')}$, then ${\displaystyle x=x'}$ and ${\displaystyle y=y'}$. It is easy to see that both sides of the equation are 1. If ${\displaystyle (x,y)\not \leq (x',y')}$, then either ${\displaystyle x\not \leq x'}$ or ${\displaystyle y\not \leq y'}$. It is also easy to see that both sides are 0. The only remaining case is that ${\displaystyle (x,y)<(x',y')}$, in which case either ${\displaystyle x or ${\displaystyle y. {\displaystyle {\begin{aligned}\sum _{(x,y)\leq (u,v)\leq (x',y')}\mu _{P}(x,u)\mu _{Q}(y,v)&=\left(\sum _{x\leq u\leq x'}\mu _{P}(x,u)\right)\left(\sum _{y\leq v\leq y'}\mu _{Q}(y,v)\right)=I(x,x')I(y,y')=0,\end{aligned}}} where the last two equations are due to the proposition for ${\displaystyle \mu }$. Thus ${\displaystyle \mu _{P}(x,x')\mu _{Q}(y,y')=-\sum _{(x,y)\leq (u,v)<(x',y')}\mu _{P}(x,u)\mu _{Q}(y,v)}$. By induction, assume that the equation ${\displaystyle \mu _{P\times Q}((x,y),(u,v))=\mu _{P}(x,u)\mu _{Q}(y,v)}$ is true for all ${\displaystyle (u,v)<(x',y')}$. Then {\displaystyle {\begin{aligned}\mu _{P\times Q}((x,y),(x',y'))&=-\sum _{(x,y)\leq (u,v)<(x',y')}\mu _{P\times Q}((x,y),(u,v))\\&=-\sum _{(x,y)\leq (u,v)<(x',y')}\mu _{P}(x,u)\mu _{Q}(y,v)\\&=\mu _{P}(x,x')\mu _{Q}(y,y'),\end{aligned}}} which complete the proof.
${\displaystyle \square }$
Poset of subsets
Consider the poset defined by all subsets of a finite universe ${\displaystyle U}$, that is ${\displaystyle P=2^{U}}$, and for ${\displaystyle S,T\subseteq U}$, ${\displaystyle S\leq _{P}T}$ if and only if ${\displaystyle S\subseteq T}$.
 Möbius function for subsets The Möbius function for the above defined poset ${\displaystyle P}$ is that for ${\displaystyle S,T\subseteq U}$, ${\displaystyle \mu (S,T)={\begin{cases}(-1)^{|T|-|S|}&{\text{if }}S\subseteq T,\\0&{\text{otherwise.}}\end{cases}}}$
Proof.
 We can equivalently represent each ${\displaystyle S\subseteq U}$ by a boolean string ${\displaystyle S\in \{0,1\}^{U}}$, where ${\displaystyle S(x)=1}$ if and only if ${\displaystyle x\in S}$. For each element ${\displaystyle x\in U}$, we can define a poset ${\displaystyle P_{x}=\{0,1\}}$ with ${\displaystyle 0\leq 1}$. By definition of Möbius function, the Möbius function of this elementary poset is given by ${\displaystyle \mu _{x}(0,0)=\mu _{x}(1,1)=1}$, ${\displaystyle \mu _{x}(0,1)=-1}$ and ${\displaystyle \mu (1,0)=0}$. The poset ${\displaystyle P}$ of all subsets of ${\displaystyle U}$ is the Cartesian product of all ${\displaystyle P_{x}}$, ${\displaystyle x\in U}$. By the product rule, ${\displaystyle \mu (S,T)=\prod _{x\in U}\mu _{x}(S(x),T(x))=\prod _{x\in S \atop x\in T}1\prod _{x\not \in S \atop x\not \in T}1\prod _{x\in S \atop x\not \in T}0\prod _{x\not \in S \atop x\in T}(-1)={\begin{cases}(-1)^{|T|-|S|}&{\text{if }}S\subseteq T,\\0&{\text{otherwise.}}\end{cases}}}$
${\displaystyle \square }$
Note that the poset ${\displaystyle P}$ is actually the Boolean algebra of rank ${\displaystyle |U|}$. The proof relies only on that the fact that the poset is a Boolean algebra, thus the theorem holds for Boolean algebra posets.
Posets of divisors
Consider the poset defined by all devisors of a positive integer ${\displaystyle n}$, that is ${\displaystyle P=\{a>0\mid a|n\}}$, and for ${\displaystyle a,b\in P}$, ${\displaystyle a\leq _{P}b}$ if and only if ${\displaystyle a|b\,}$.
 Möbius function for divisors The Möbius function for the above defined poset ${\displaystyle P}$ is that for ${\displaystyle a,b>0}$ that ${\displaystyle a|n}$ and ${\displaystyle b|n}$, ${\displaystyle \mu (a,b)={\begin{cases}(-1)^{r}&{\text{if }}{\frac {b}{a}}{\text{ is the product of }}r{\text{ distinct primes}},\\0&{\text{otherwise, i.e. if }}a\not |b{\text{ or }}{\frac {b}{a}}{\text{ is not squarefree.}}\end{cases}}}$
Proof.
 Denote ${\displaystyle n=p_{1}^{n_{1}}p_{2}^{n_{2}}\cdots p_{k}^{n_{k}}}$. Represent ${\displaystyle n}$ by a tuple ${\displaystyle (n_{1},n_{2},\ldots ,n_{k})}$. Every ${\displaystyle a\in P}$ corresponds in this way to a tuple ${\displaystyle (a_{1},a_{2},\ldots ,a_{k})}$ with ${\displaystyle a_{i}\leq n_{i}}$ for all ${\displaystyle 1\leq i\leq k}$. Let ${\displaystyle P_{i}=\{1,2,\ldots ,n_{i}\}}$ be the poset with ${\displaystyle \leq }$ being the total order. The poset ${\displaystyle P}$ of divisors of ${\displaystyle n}$ is thus isomorphic to the poset constructed by the Cartesian product of all ${\displaystyle P_{i}}$, ${\displaystyle 1\leq i\leq k}$. Then {\displaystyle {\begin{aligned}\mu (a,b)&=\prod _{1\leq i\leq k}\mu (a_{i},b_{i})=\prod _{1\leq i\leq k \atop a_{i}=b_{i}}1\prod _{1\leq i\leq k \atop b_{i}-a_{i}=1}(-1)\prod _{1\leq i\leq k \atop b_{i}-a_{i}\not \in \{0,1\}}0={\begin{cases}(-1)^{\sum _{i}(b_{i}-a_{i})}&{\text{if all }}b_{i}-a_{i}\in \{0,1\},\\0&{\text{otherwise.}}\end{cases}}\\&={\begin{cases}(-1)^{r}&{\text{if }}{\frac {b}{a}}{\text{ is the product of }}r{\text{ distinct primes}},\\0&{\text{otherwise.}}\end{cases}}\end{aligned}}}
${\displaystyle \square }$

Principle of Möbius inversion

We now introduce the the famous Möbius inversion formula.

 Möbius inversion formula Let ${\displaystyle P}$ be a finite poset and ${\displaystyle \mu }$ its Möbius function. Let ${\displaystyle f,g:P\rightarrow \mathbb {R} }$. Then ${\displaystyle \forall x\in P,\,\,g(x)=\sum _{y\leq x}f(y)}$, if and only if ${\displaystyle \forall x\in P,\,\,f(x)=\sum _{y\leq x}g(y)\mu (y,x)}$.

The functions ${\displaystyle f,g:P\rightarrow \mathbb {R} }$ are vectors. Evaluate the matrix multiplications ${\displaystyle f\zeta }$ and ${\displaystyle g\mu }$ as follows:

${\displaystyle (f\zeta )(x)=\sum _{y\in P}f(y)\zeta (y,x)=\sum _{y\leq x}f(y)}$,

and

${\displaystyle (g\mu )(x)=\sum _{y\in P}g(y)\mu (y,x)=\sum _{y\leq x}g(y)\mu (y,x)}$.

The Möbius inversion formula is nothing but the following statement

${\displaystyle f\zeta =g\Leftrightarrow f=g\mu }$,

which is trivially true due to ${\displaystyle \mu \zeta =I}$ by basic linear algebra.

The following dual form of the inversion formula is also useful.

 Möbius inversion formula, dual form Let ${\displaystyle P}$ be a finite poset and ${\displaystyle \mu }$ its Möbius function. Let ${\displaystyle f,g:P\rightarrow \mathbb {R} }$. Then ${\displaystyle \forall x\in P,\,\,g(x)=\sum _{y{\color {red}\geq }x}f(y)}$, if and only if ${\displaystyle \forall x\in P,\,\,f(x)=\sum _{y{\color {red}\geq }x}\mu (x,y)g(y)}$.

To prove the dual form, we only need to evaluate the matrix multiplications on left:

${\displaystyle \zeta f=g\Leftrightarrow f=\mu g}$.
Principle of Inclusion-Exclusion
Let ${\displaystyle A_{1},A_{2},\ldots ,A_{n}\subseteq U}$. For any ${\displaystyle J\subseteq \{1,2,\ldots ,n\}}$,
• let ${\displaystyle f(J)}$ be the number of elements that belongs to exactly the sets ${\displaystyle A_{i},i\in J}$ and to no others, i.e.
${\displaystyle f(J)=\left|\left(\bigcap _{i\in J}A_{i}\right)\setminus \left(\bigcup _{i\not \in J}A_{i}\right)\right|}$;
• let ${\displaystyle g(J)=\left|\bigcap _{i\in J}A_{i}\right|}$.
For any ${\displaystyle J\subseteq \{1,2,\ldots ,n\}}$, the following relation holds for the above defined ${\displaystyle f}$ and ${\displaystyle g}$:
${\displaystyle g(J)=\sum _{I\supseteq J}f(I)}$.
Applying the dual form of the Möbius inversion formula, we have that for any ${\displaystyle J\subseteq \{1,2,\ldots ,n\}}$,
${\displaystyle f(J)=\sum _{I\supseteq J}\mu (J,I)g(I)=\sum _{I\supseteq J}\mu (J,I)\left|\bigcap _{i\in I}A_{i}\right|}$,
where the Möbius function is for the poset of all subsets of ${\displaystyle \{1,2,\ldots ,n\}}$, ordered by ${\displaystyle \subseteq }$, thus it holds that ${\displaystyle \mu (J,I)=(-1)^{|I|-|J|}\,}$ for ${\displaystyle J\subseteq I}$. Therefore,
${\displaystyle f(J)=\sum _{I\supseteq J}(-1)^{|I|-|J|}\left|\bigcap _{i\in I}A_{i}\right|}$.
We have a formula for the number of elements with exactly those properties ${\displaystyle A_{i},i\in J}$ for any ${\displaystyle J\subseteq \{1,2,\ldots ,n\}}$. For the special case that ${\displaystyle J=\emptyset }$, ${\displaystyle f(\emptyset )}$ is the number of elements satisfying no property of ${\displaystyle A_{1},A_{2},\ldots ,A_{n}}$, and
${\displaystyle f(\emptyset )=\left|U\setminus \bigcup _{i}A_{i}\right|=\sum _{I\subseteq \{1,\ldots ,n\}}(-1)^{|I|}\left|\bigcap _{i\in I}A_{i}\right|}$
which gives precisely the Principle of Inclusion-Exclusion.
Möbius inversion formula for number theory
The number-theoretical Möbius inversion formula is stated as such: Let ${\displaystyle N}$ be a positive integer,
${\displaystyle g(n)=\sum _{d|n}f(d)\,}$ for all ${\displaystyle n|N}$
if and only if
${\displaystyle f(n)=\sum _{d|n}g(d)\mu \left({\frac {n}{d}}\right)\,}$ for all ${\displaystyle n|N}$,
where ${\displaystyle \mu }$ is the number-theoretical Möbius function, defined as
${\displaystyle \mu (n)={\begin{cases}1&{\text{if }}n{\text{ is product of an even number of distinct primes,}}\\-1&{\text{if }}n{\text{ is product of an odd number of distinct primes,}}\\0&{\text{otherwise.}}\end{cases}}}$
The number-theoretical Möbius inversion formula is just a special case of the Möbius inversion formula for posets, when the poset is the set of divisors of ${\displaystyle N}$, and for any ${\displaystyle a,b\in P}$, ${\displaystyle a\leq _{P}b}$ if ${\displaystyle a|b}$.

Reference

• Stanley, Enumerative Combinatorics, Volume 1, Chapter 2.
• van Lin and Wilson, A course in combinatorics, Chapter 10, 25.