组合数学 (Fall 2019)/Pólya's theory of counting: Difference between revisions

From TCS Wiki
Jump to navigation Jump to search
imported>Etone
(Created page with "== Principle of Inclusion-Exclusion == Let <math>A</math> and <math>B</math> be two finite sets. The cardinality of their union is :<math>|A\cup B|=|A|+|B|-{\color{Blue}|A\cap...")
 
imported>Etone
No edit summary
 
Line 1: Line 1:
== Principle of Inclusion-Exclusion ==
== Groups ==
Let <math>A</math> and <math>B</math> be two finite sets. The cardinality of their union is
A group <math>(G,\cdot)</math> is set <math>G</math> along with a binary operator <math>\cdot</math> which satisfies the following axioms:
:<math>|A\cup B|=|A|+|B|-{\color{Blue}|A\cap B|}</math>.
* ''closure'': <math>\forall g,h\in G, g\cdot h \in G</math>;
For three sets <math>A</math>, <math>B</math>, and <math>C</math>, the cardinality of the union of these three sets is computed as
* ''associativity'': <math>\forall f,g,h\in G, f\cdot(g\cdot h)=(f\cdot g)\cdot h</math>;
:<math>|A\cup B\cup C|=|A|+|B|+|C|-{\color{Blue}|A\cap B|}-{\color{Blue}|A\cap C|}-{\color{Blue}|B\cap C|}+{\color{Red}|A\cap B\cap C|}</math>.
* ''identity'': there exists a special element <math>e\in G</math>, called the '''identity''', such that <math>e\cdot g=g</math> for any <math>g\in G</math>;
This is illustrated by the following figure.
* ''inverse'': <math>\forall g\in G</math>, there exists an <math>h\in G</math> such that <math>g\cdot h=e</math>, and we denote that <math>h=g^{-1}</math>.
::[[Image:Inclusion-exclusion.png|200px|border|center]]


Generally, the '''Principle of Inclusion-Exclusion''' states the rule for computing the union of <math>n</math> finite sets <math>A_1,A_2,\ldots,A_n</math>, such that
=== Permutation groups===
{{Equation|
A permutation is a bijection <math>\pi:[n]\xrightarrow[\text{onto}]{\text{1-1}}[n]</math>. We can define a natural operator "<math>\cdot</math>" between permutations by function composition, i.e. for any <math>\pi,\sigma\in S_n</math>, <math>(\pi\cdot\sigma)(i)=\pi(\sigma(i))</math> for all <math>i\in[n]</math>.
<math>
\begin{align}
\left|\bigcup_{i=1}^nA_i\right|
&=
\sum_{I\subseteq\{1,\ldots,n\}}(-1)^{|I|-1}\left|\bigcap_{i\in I}A_i\right|.
\end{align}
</math>
}}
-----


In combinatorial enumeration, the Principle of Inclusion-Exclusion is usually applied in its complement form.
;Symmetric group <math>S_n</math>
 
:Let <math>S_n</math> denote the set of all permutations of <math>[n]</math>. It is easy to verify that <math>S_n</math> together with function composition <math>\cdot</math> form a group. This group is called the '''symmetric group''' on <math>n</math> elements.
Let <math>A_1,A_2,\ldots,A_n\subseteq U</math> be subsets of some finite set <math>U</math>. Here <math>U</math> is some universe of combinatorial objects, whose cardinality is easy to calculate (e.g. all strings, tuples, permutations), and each <math>A_i</math> contains the objects with some specific property (e.g. a "pattern") which we want to avoid. The problem is to count the number of objects without any of the <math>n</math> properties. We write <math>\bar{A_i}=U-A_i</math>. The number of objects without any of the properties <math>A_1,A_2,\ldots,A_n</math> is
{{Equation|
<math>
\begin{align}
\left|\bar{A_1}\cap\bar{A_2}\cap\cdots\cap\bar{A_n}\right|=\left|U-\bigcup_{i=1}^nA_i\right|
&=
|U|+\sum_{I\subseteq\{1,\ldots,n\}}(-1)^{|I|}\left|\bigcap_{i\in I}A_i\right|.
\end{align}
</math>
}}
For an <math>I\subseteq\{1,2,\ldots,n\}</math>, we denote
:<math>A_I=\bigcap_{i\in I}A_i</math>
with the convention that <math>A_\emptyset=U</math>. The above equation is stated as:
{{Theorem|Principle of Inclusion-Exclusion|
:Let <math>A_1,A_2,\ldots,A_n</math> be a family of subsets of <math>U</math>. Then the number of elements of <math>U</math> which lie in none of the subsets <math>A_i</math> is
::<math>\sum_{I\subseteq\{1,\ldots, n\}}(-1)^{|I|}|A_I|</math>.
}}


Let <math>S_k=\sum_{|I|=k}|A_I|\,</math>. Conventionally, <math>S_0=|A_\emptyset|=|U|</math>. The principle of inclusion-exclusion can be expressed as
Subgroups of <math>S_n</math> are called '''permutation groups'''. The most basic permutation group is <math>S_n</math> itself (since a group is a subgroup of itself). Besides it, there are several typical permutation groups related to natural symmetric operations.
{{Equation|<math>
\left|\bar{A_1}\cap\bar{A_2}\cap\cdots\cap\bar{A_n}\right|=
S_0-S_1+S_2+\cdots+(-1)^nS_n.
</math>
}}


=== Surjections ===
;Cyclic group <math>C_n</math>
In the twelvefold way, we discuss the counting problems incurred by the mappings <math>f:N\rightarrow M</math>. The basic case is that elements from both <math>N</math> and <math>M</math> are distinguishable. In this case, it is easy to count the number of arbitrary mappings (which is <math>m^n</math>) and the number of injective (one-to-one) mappings (which is <math>(m)_n</math>), but the number of surjective is difficult. Here we apply the principle of inclusion-exclusion to count the number of surjective (onto) mappings.
:Given a permutation <math>\pi\in S_n</math>, denote that  
{{Theorem|Theorem|
::<math>\pi^t=\underbrace{\pi\cdot\pi\cdots\pi}_t</math>,
:The number of surjective mappings from an <math>n</math>-set to an <math>m</math>-set is given by
:and <math>\pi^0=e</math> is the identity.
::<math>\sum_{k=1}^m(-1)^{m-k}{m\choose k}k^n</math>.
:We define a special permutation <math>\sigma\,</math> as that <math>\sigma(i)=(i+1)\bmod n\,</math> for any <math>i</math>, and let <math>C_n=\{\sigma^t\mid t\ge 0\}</math>. We say that <math>C_n</math> is ''generated'' by the permutation <math>\sigma</math>. It is easy to verify that <math>C_n</math> is a subgroup of <math>S_n</math>. The group <math>C_n</math> is called the '''cyclic group''' of order <math>n</math>, which is the group formed by all '''rotational symmetries''' of a regular polygon of <math>n</math> sides.
}}
:It is easy to see that <math>|C_n|=n</math>, that is, there are <math>n</math> rotations of a regular <math>n</math>-gon.
{{Proof|
Let <math>U=\{f:[n]\rightarrow[m]\}</math> be the set of mappings from <math>[n]</math> to <math>[m]</math>. Then <math>|U|=m^n</math>.  


For <math>i\in[m]</math>, let <math>A_i</math> be the set of mappings <math>f:[n]\rightarrow[m]</math> that none of <math>j\in[n]</math> is mapped to <math>i</math>, i.e. <math>A_i=\{f:[n]\rightarrow[m]\setminus\{i\}\}</math>, thus <math>|A_i|=(m-1)^n</math>.
;Dihedral group <math>D_n</math>
:Define another special permutation <math>\rho\,</math> as that <math>\rho(i)=n-i-1\,</math> for all <math>i\in[n]</math>. The '''Dihedral group''' <math>D_n</math> is obtained by adding <math>\rho</math> into <math>C_n</math> and then take a ''closure'' (under operations of "<math>\cdot</math>"). This group is the group of symmetries, including '''rotations''' as well as '''reflections''', of a regular polygon of <math>n</math> sides.
:It is an exercise to check that <math>|D_n|=2n</math>.


More generally, for <math>I\subseteq [m]</math>, <math>A_I=\bigcap_{i\in I}A_i</math> contains the mappings <math>f:[n]\rightarrow[m]\setminus I</math>. And <math>|A_I|=(m-|I|)^n\,</math>.
==== Cycle decomposition ====
A permutation <math>\pi:[n]\xrightarrow[\text{onto}]{\text{1-1}}[n]</math> can be written in the following form:
:<math>\begin{pmatrix}
1 & 2 & \cdots & n \\
\pi(1) & \pi(2) & \cdots & \pi(n)
\end{pmatrix}</math>.
For example,
:<math>\begin{pmatrix}
1 & 2 & 3 & 4 & 5 \\
3 & 5 & 1 & 2 & 4
\end{pmatrix}</math>.
The permutation can be equivalently described as a composition of a number of '''cycles'''. For example, in the above permutation, we have two cycles:
:<math>1\rightarrow 3\rightarrow 1</math> and <math>2\rightarrow5\rightarrow4\rightarrow2</math>.
We can denote the permutation by
:<math>(13)(254)</math>.


A mapping <math>f:[n]\rightarrow[m]</math> is surjective if <math>f</math> lies in none of <math>A_i</math>. By the principle of inclusion-exclusion, the number of surjective <math>f:[n]\rightarrow[m]</math> is
=== Group action ===
:<math>\sum_{I\subseteq[m]}(-1)^{|I|}\left|A_I\right|=\sum_{I\subseteq[m]}(-1)^{|I|}(m-|I|)^n=\sum_{j=0}^m(-1)^j{m\choose j}(m-j)^n</math>.
{{Theorem|Definition (group action)|
Let <math>k=m-j</math>. The theorem is proved.
:A '''group action''' of a group <math>G</math> on a set <math>X</math> is a binary operator:
::<math>\circ:G\times X\rightarrow X</math>
:satisfying:
:* Associativity: <math>(g\cdot h)\circ x=g\circ (h\circ x)</math> for all <math>g,h\in G</math> and <math>x\in X</math>;
:* Identity: <math>e\circ x=x</math> for all <math>x\in X</math>.
}}
}}


Recall that, in the twelvefold way, we establish a relation between surjections and partitions.
==== Example: coloring a necklace ====
 
Suppose a necklace is made of <math>n</math> beads, each with one of the <math>m</math> colors. Formally, a necklace is an assignment <math>x:[n]\rightarrow[m]</math> of <math>m</math> colors to <math>n</math> positions. Let <math>X=\{x:[n]\rightarrow[m]\}</math> be the set of all such assignments.
* Surjection to ordered partition:
:For a surjective <math>f:[n]\rightarrow[m]</math>, <math>(f^{-1}(0),f^{-1}(1),\ldots,f^{-1}(m-1))</math> is an '''ordered partition''' of <math>[n]</math>.
* Ordered partition to surjection:
:For an ordered <math>m</math>-partition <math>(B_0,B_1,\ldots, B_{m-1})</math> of <math>[n]</math>, we can define a function <math>f:[n]\rightarrow[m]</math> by letting <math>f(i)=j</math> if and only if <math>i\in B_j</math>. <math>f</math> is surjective since as a partition, none of <math>B_i</math> is empty.
 
Therefore, we have a one-to-one correspondence between surjective mappings from an <math>n</math>-set to an <math>m</math>-set and the ordered <math>m</math>-partitions of an <math>n</math>-set.


The Stirling number of the second kind <math>\left\{{n\atop m}\right\}</math> is the number of <math>m</math>-partitions of an <math>n</math>-set. There are <math>m!</math> ways to order an <math>m</math>-partition, thus the number of surjective mappings <math>f:[n]\rightarrow[m]</math> is <math>m! \left\{{n\atop m}\right\}</math>. Combining with what we have proved for surjections, we give the following result for the Stirling number of the second kind.
For example, when <math>n=4</math> and <math>m=2</math>, <math>X</math> contains all possible 2-colorings (say red and blue) of 4 positions.
:<math>X=\{{\color{blue}bbbb}, {\color{blue}bbb}{\color{red}r}, {\color{blue}bb}{\color{red}r}{\color{blue}b}, {\color{blue}bb}{\color{red}rr}, {\color{blue}b}{\color{red}r}{\color{blue}b}{\color{red}r}, {\color{blue}b}{\color{red}rr}{\color{blue}b}, {\color{blue}b}{\color{red}rrr},  {\color{red}r}{\color{blue}bbb}, {\color{red}r}{\color{blue}bb}{\color{red}r}, {\color{red}r}{\color{blue}b}{\color{red}r}{\color{blue}b}, {\color{red}r}{\color{blue}b}{\color{red}rr}, {\color{red}rr}{\color{blue}bb}, {\color{red}rr}{\color{blue}b}{\color{red}r}, {\color{red}rrr}{\color{blue}b}, {\color{red}rrrr}\}</math>


{{Theorem|Proposition|
We consider two kinds of symmetric operations on necklaces:
:<math>\left\{{n\atop m}\right\}=\frac{1}{m!}\sum_{k=1}^m(-1)^{m-k}{m\choose k}k^n</math>.
* Rotation: the corresponding group is the cyclic group <math>C_n</math>.
}}
* Rotation and reflection: the corresponding group is the dihedral group <math>D_n</math>.


=== Derangements ===
Mathematically, these operations on necklaces are described as group actions on <math>X</math>. Recall that each member <math>x\in X</math> is an <math>m</math>-coloring <math>x:[n]\rightarrow[m]</math> of <math>n</math> positions. Suppose the permutation group is <math>G</math>, for any <math>\pi\in G</math> and any <math>x\in X</math>, the group action <math>\pi\circ x</math> is naturally defined as such:
We now count the number of bijections from a set to itself with no fixed points. This is the '''derangement problem'''.
:<math>(\pi\circ x)(i)=x(\pi(i))</math>.


For a permutation <math>\pi</math> of <math>\{1,2,\ldots,n\}</math>, a '''fixed point''' is such an <math>i\in\{1,2,\ldots,n\}</math> that <math>\pi(i)=i</math>.
== Burnside's Lemma ==
A [http://en.wikipedia.org/wiki/Derangement '''derangement'''] of <math>\{1,2,\ldots,n\}</math> is a permutation of <math>\{1,2,\ldots,n\}</math> that has no fixed points.


{{Theorem|Theorem|
=== Orbits ===
:The number of derangements of <math>\{1,2,\ldots,n\}</math> given by
{{Theorem|Definition|
::<math>n!\sum_{k=0}^n\frac{(-1)^k}{k!}\approx \frac{n!}{\mathrm{e}}</math>.
:Let <math>G</math> be a permutation group acting on a set <math>X</math>. For any <math>x\in X</math>. The '''orbit''' of <math>x</math> under action of <math>G</math>, denoted <math>Gx</math>, is defined as
::<math>Gx=\{\pi\circ x\mid \pi\in G\}</math>.
}}
}}
{{Proof|
Let <math>U</math> be the set of all permutations of <math>\{1,2,\ldots,n\}</math>. So <math>|U|=n!</math>.


Let <math>A_i</math> be the set of permutations with fixed point <math>i</math>; so <math>|A_i|=(n-1)!</math>. More generally, for any <math>I\subseteq \{1,2,\ldots,n\}</math>, <math>A_I=\bigcap_{i\in I}A_i</math>, and <math>|A_I|=(n-|I|)!</math>, since permutations in <math>A_I</math> fix every point in <math>I</math> and permute the remaining points arbitrarily. A permutation is a derangement if and only if it lies in none of the sets <math>A_i</math>. So the number of derangements is
For any <math>x,y,z\in X</math>, the followings can be verified for orbits
:<math>\sum_{I\subseteq\{1,2,\ldots,n\}}(-1)^{|I|}(n-|I|)!=\sum_{k=0}^n(-1)^k{n\choose k}(n-k)!=n!\sum_{k=0}^n\frac{(-1)^k}{k!}.</math>
* ''reflexivity'': <math>x\in Gx</math> since <math>e\circ x=x</math>;
By Taylor's series,
* ''symmetric'':  if <math>y\in Gx</math> then there is a <math>\pi\in G</math> such that <math>\pi\circ x=y</math>, thus <math>\pi^{-1}\circ y=\pi^{-1}\circ(\pi\circ x) =(\pi^{-1}\cdot \pi)\circ x=e\circ x=x</math>, hence <math>x\in Gy</math>;
:<math>\frac{1}{\mathrm{e}}=\sum_{k=0}^\infty\frac{(-1)^k}{k!}=\sum_{k=0}^n\frac{(-1)^k}{k!}\pm o\left(\frac{1}{n!}\right)</math>.
* ''transitivity'': if <math>y\in Gx</math> and <math>z\in Gy</math>, then there exist <math>\pi,\sigma\in G</math> such that <math>\pi\circ x=y</math> and <math>\sigma\circ y=z</math>, thus <math>(\sigma\cdot \pi)\circ x=\sigma\circ(\pi\circ x)=\sigma\circ y=z</math>, hence <math>z\in Gx</math>.
It is not hard to see that <math>n!\sum_{k=0}^n\frac{(-1)^k}{k!}</math> is the closest integer to <math>\frac{n!}{\mathrm{e}}</math>.
}}


Therefore, there are about <math>\frac{1}{\mathrm{e}}</math> fraction of all permutations with no fixed points.
Therefore, <math>x\in Gy</math> defines an equivalent relation between <math>x</math> and <math>y</math>, and orbits partition the set <math>X</math>.


=== Permutations with restricted positions ===
;Example
We introduce a general theory of counting permutations with restricted positions. In the derangement problem, we count the number of permutations that <math>\pi(i)\neq i</math>. We now generalize to the problem of counting permutations which avoid a set of arbitrarily specified positions.  
:Back to the example of coloring necklaces.


It is traditionally described using terminology from the game of chess. Let <math>B\subseteq \{1,\ldots,n\}\times \{1,\ldots,n\}</math>, called a '''board'''.  As illustrated below, we can think of <math>B</math> as a chess board, with the positions in <math>B</math> marked by "<math>\times</math>".
:<math>X</math> contains all possible 2-colorings (say red and blue) of 4 positions.
{{Chess diagram small
::<math>X=\{{\color{blue}bbbb}, {\color{blue}bbb}{\color{red}r}, {\color{blue}bb}{\color{red}r}{\color{blue}b}, {\color{blue}bb}{\color{red}rr}, {\color{blue}b}{\color{red}r}{\color{blue}bb}, {\color{blue}b}{\color{red}r}{\color{blue}b}{\color{red}r}, {\color{blue}b}{\color{red}rr}{\color{blue}b}, {\color{blue}b}{\color{red}rrr}, {\color{red}r}{\color{blue}bbb}, {\color{red}r}{\color{blue}bb}{\color{red}r}, {\color{red}r}{\color{blue}b}{\color{red}r}{\color{blue}b}, {\color{red}r}{\color{blue}b}{\color{red}rr}, {\color{red}rr}{\color{blue}bb}, {\color{red}rr}{\color{blue}b}{\color{red}r}, {\color{red}rrr}{\color{blue}b}, {\color{red}rrrr}\}</math>
|
|
|=
8 |__|xx|xx|__|xx|__|__|xx|=
7 |xx|__|__|xx|__|__|xx|__|=
6 |xx|__|xx|xx|__|xx|xx|__|=
5 |__|xx|__|__|xx|__|xx|__|=
4 |xx|__|__|__|xx|xx|xx|__|=
3 |__|xx|__|xx|__|__|__|xx|=
2 |__|__|xx|__|xx|__|__|xx|=
1 |xx|__|__|xx|__|xx|__|__|=
a b c d e f g h
|
}}
For a permutation <math>\pi</math> of <math>\{1,\ldots,n\}</math>, define the '''graph''' <math>G_\pi(V,E)</math> as
:<math>
\begin{align}
G_\pi &= \{(i,\pi(i))\mid i\in \{1,2,\ldots,n\}\}.
\end{align}
</math>
This can also be viewed as a set of marked positions on a chess board. Each row and each column has only one marked position, because <math>\pi</math> is a permutation. Thus, we can identify each <math>G_\pi</math> as a placement of <math>n</math> rooks (“城堡”,规则同中国象棋里的“车”) without attacking each other.


For example, the following is the <math>G_\pi</math> of such <math>\pi</math> that <math>\pi(i)=i</math>.
:We consider the rotations of necklaces. The cyclic group <math>C_4</math> acting on <math>X</math> partitions the <math>X</math> into the following equivalent classes:
{{Chess diagram small
::<math>\begin{align}
|
&\{{\color{blue}bbbb}\},\\
|
&\{ {\color{blue}bbb}{\color{red}r}, {\color{blue}bb}{\color{red}r}{\color{blue}b}, {\color{blue}b}{\color{red}r}{\color{blue}bb}, {\color{red}r}{\color{blue}bbb}\},\\
|=
&\{ {\color{blue}bb}{\color{red}rr}, {\color{blue}b}{\color{red}rr}{\color{blue}b}, {\color{red}r}{\color{blue}bb}{\color{red}r}, {\color{red}rr}{\color{blue}bb}, \},\\
8 |rl|__|__|__|__|__|__|__|=
&\{ {\color{blue}b}{\color{red}r}{\color{blue}b}{\color{red}r}, {\color{red}r}{\color{blue}b}{\color{red}r}{\color{blue}b}\},\\
7 |__|rl|__|__|__|__|__|__|=
&\{ {\color{blue}b}{\color{red}rrr}, {\color{red}r}{\color{blue}b}{\color{red}rr}, {\color{red}rr}{\color{blue}b}{\color{red}r}, {\color{red}rrr}{\color{blue}b}\},\\
6 |__|__|rl|__|__|__|__|__|=
&\{ {\color{red}rrrr}\}
5 |__|__|__|rl|__|__|__|__|=
4 |__|__|__|__|rl|__|__|__|=
3 |__|__|__|__|__|rl|__|__|=
2 |__|__|__|__|__|__|rl|__|=
1 |__|__|__|__|__|__|__|rl|=
a b c d e f g h
|
}}
Now define
:<math>\begin{align}
N_0 &= \left|\left\{\pi\mid B\cap G_\pi=\emptyset\right\}\right|\\
r_k &= \mbox{number of }k\mbox{-subsets of }B\mbox{ such that no two elements have a common coordinate}\\
&=\left|\left\{S\in{B\choose k} \,\bigg|\, \forall (i_1,j_1),(i_2,j_2)\in S, i_1\neq i_2, j_1\neq j_2 \right\}\right|
\end{align}
</math>
Interpreted in chess game,
* <math>B</math>: a set of marked positions in an <math>[n]\times [n]</math> chess board.
* <math>N_0</math>: the number of ways of placing <math>n</math> non-attacking rooks on the chess board such that none of these rooks lie in <math>B</math>.
* <math>r_k</math>: number of ways of placing <math>k</math> non-attacking rooks on <math>B</math>.
 
Our goal is to count <math>N_0</math> in terms of <math>r_k</math>. This gives the number of permutations avoid all positions in a <math>B</math>.
 
{{Theorem|Theorem|
:<math>N_0=\sum_{k=0}^n(-1)^kr_k(n-k)!</math>.
}}
{{Proof|
For each <math>i\in[n]</math>, let <math>A_i=\{\pi\mid (i,\pi(i))\in B\}</math> be the set of permutations <math>\pi</math> whose <math>i</math>-th position is in <math>B</math>.
 
<math>N_0</math> is the number of permutations avoid all positions in <math>B</math>. Thus, our goal is to count the number of permutations <math>\pi</math> in none of <math>A_i</math> for <math>i\in [n]</math>.
 
For each <math>I\subseteq [n]</math>, let <math>A_I=\bigcap_{i\in I}A_i</math>, which is the set of permutations <math>\pi</math> such that <math>(i,\pi(i))\in B</math> for all <math>i\in I</math>. Due to the principle of inclusion-exclusion,
:<math>N_0=\sum_{I\subseteq [n]} (-1)^{|I|}|A_I|=\sum_{k=0}^n(-1)^k\sum_{I\in{[n]\choose k}}|A_I|</math>.
 
The next observation is that
:<math>\sum_{I\in{[n]\choose k}}|A_I|=r_k(n-k)!</math>,
because we can count both sides by first placing <math>k</math> non-attacking rooks on <math>B</math> and placing <math>n-k</math> additional non-attacking rooks on <math>[n]\times [n]</math> in <math>(n-k)!</math> ways.
 
Therefore,
:<math>N_0=\sum_{k=0}^n(-1)^kr_k(n-k)!</math>.
}}
 
====Derangement problem====
We use the above general method to solve the derange problem again.
 
Take <math>B=\{(1,1),(2,2),\ldots,(n,n)\}</math> as the chess board.  A derangement <math>\pi</math> is a placement of <math>n</math> non-attacking rooks such that none of them is in <math>B</math>.
{{Chess diagram small
|
|
|=
8 |xx|__|__|__|__|__|__|__|=
7 |__|xx|__|__|__|__|__|__|=
6 |__|__|xx|__|__|__|__|__|=
5 |__|__|__|xx|__|__|__|__|=
4 |__|__|__|__|xx|__|__|__|=
3 |__|__|__|__|__|xx|__|__|=
2 |__|__|__|__|__|__|xx|__|=
1 |__|__|__|__|__|__|__|xx|=
a b c d e f g h
|
}}
Clearly, the number of ways of placing <math>k</math> non-attacking rooks on <math>B</math> is <math>r_k={n\choose k}</math>. We want to count <math>N_0</math>, which gives the number of ways of placing <math>n</math> non-attacking rooks such that none of these rooks lie in <math>B</math>.
 
By the above theorem
:<math>
N_0=\sum_{k=0}^n(-1)^kr_k(n-k)!=\sum_{k=0}^n(-1)^k{n\choose k}(n-k)!=\sum_{k=0}^n(-1)^k\frac{n!}{k!}=n!\sum_{k=0}^n(-1)^k\frac{1}{k!}\approx\frac{n!}{e}.
</math>
 
====Problème des ménages====
Suppose that in a banquet, we want to seat <math>n</math> couples at a circular table, satisfying the following constraints:
* Men and women are in alternate places.
* No one sits next to his/her spouse.
 
In how many ways can this be done?
 
(For convenience, we assume that every seat at the table marked differently so that rotating the seats clockwise or anti-clockwise will end up with a '''different''' solution.)
 
First, let the <math>n</math> ladies find their seats. They may either sit at the odd numbered seats or even numbered seats, in either case, there are <math>n!</math> different orders. Thus, there are <math>2(n!)</math> ways to seat the <math>n</math> ladies.
 
After sitting the wives, we label the remaining <math>n</math> places clockwise as <math>0,1,\ldots, n-1</math>. And a seating of the <math>n</math> husbands is given by a permutation <math>\pi</math> of <math>[n]</math> defined as follows. Let <math>\pi(i)</math> be the seat of the husband of he lady sitting at the <math>i</math>-th place.
 
It is easy to see that <math>\pi</math> satisfies that <math>\pi(i)\neq i</math> and <math>\pi(i)\not\equiv i+1\pmod n</math>, and every permutation <math>\pi</math> with these properties gives a feasible seating of the <math>n</math> husbands. Thus, we only need to count the number of permutations <math>\pi</math> such that <math>\pi(i)\not\equiv i, i+1\pmod n</math>.
 
Take <math>B=\{(0,0),(1,1),\ldots,(n-1,n-1), (0,1),(1,2),\ldots,(n-2,n-1),(n-1,0)\}</math> as the chess board.  A permutation <math>\pi</math> which defines a way of seating the husbands, is a placement of <math>n</math> non-attacking rooks such that none of them is in <math>B</math>.
{{Chess diagram small
|
|
|=
8 |xx|xx|__|__|__|__|__|__|=
7 |__|xx|xx|__|__|__|__|__|=
6 |__|__|xx|xx|__|__|__|__|=
5 |__|__|__|xx|xx|__|__|__|=
4 |__|__|__|__|xx|xx|__|__|=
3 |__|__|__|__|__|xx|xx|__|=
2 |__|__|__|__|__|__|xx|xx|=
1 |xx|__|__|__|__|__|__|xx|=
a b c d e f g h
|
}}
We need to compute <math>r_k</math>, the number of ways of placing <math>k</math> non-attacking rooks on <math>B</math>. For our choice of <math>B</math>, <math>r_k</math> is the number of ways of choosing <math>k</math> points, no two consecutive, from a collection of <math>2n</math> points arranged in a circle.
 
We first see how to do this in a ''line''.
{{Theorem|Lemma|
:The number of ways of choosing <math>k</math> ''non-consecutive'' objects from a collection of <math>m</math> objects arranged in a ''line'', is <math>{m-k+1\choose k}</math>.
}}
{{Proof|
We draw a line of <math>m-k</math> black points, and then insert <math>k</math> red points into the <math>m-k+1</math> spaces between the black points (including the beginning and end).
::<math>
\begin{align}
&\sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \, \sqcup \\
&\qquad\qquad\qquad\quad\Downarrow\\
&\sqcup \, \bullet \,\, {\color{Red}\bullet} \, \bullet \,\, {\color{Red}\bullet} \, \bullet \, \sqcup \, \bullet \,\, {\color{Red}\bullet}\, \, \bullet \, \sqcup \, \bullet \, \sqcup \, \bullet \,\, {\color{Red}\bullet}
\end{align}
\end{align}
</math>
</math>
This gives us a line of <math>m</math> points, and the red points specifies the chosen objects, which are non-consecutive. The mapping is 1-1 correspondence.
There are <math>{m-k+1\choose k}</math> ways of placing <math>k</math> red points into <math>m-k+1</math> spaces.
}}


The problem of choosing non-consecutive objects in a circle can be reduced to the case that the objects are in a line.
=== Invariant sets and stabilizers ===
Let <math>G</math> be a permutation group acting on a set <math>X</math>. Let <math>\pi\in G, x\in X</math>.
* The '''invariant set''' of <math>\pi</math>:
::<math>X_\pi=\{x\in X\mid \pi\circ x=x\}</math>.
* The '''stabilizer''' of <math>x</math>:
::<math>G_x=\{\pi\in G\mid \pi\circ x=x\}</math>.


There is a beautiful identity for invariant sets and stabilizers.
{{Theorem|Lemma|
{{Theorem|Lemma|
:The number of ways of choosing <math>k</math> ''non-consecutive'' objects from a collection of <math>m</math> objects arranged in a ''circle'', is <math>\frac{m}{m-k}{m-k\choose k}</math>.
:Let <math>G</math> be a permutation group acting on a set <math>X</math>. For any <math>x\in X</math>,
::<math>|G_x||Gx|=|G|\,</math>.
}}
}}
{{Proof|
{{Proof|
Let <math>f(m,k)</math> be the desired number; and let <math>g(m,k)</math> be the number of ways of choosing <math>k</math> non-consecutive points from <math>m</math> points arranged in a circle, next coloring the <math>k</math> points red, and then coloring one of the uncolored point blue.  
Recall that the orbit <math>Gx=\{\pi\circ x\mid \pi\in G\}</math>. Suppose that <math>Gx=\{x_1,x_2,\ldots,x_t\}</math>. Then there exist a set <math>P=\{\pi_1,\pi_2,\ldots,\pi_t\}</math> of permutations such that <math>\pi_i\circ x=x_i\,</math> for <math>i=1,2,\ldots,t</math>. We construct a bijection between <math>G</math> and <math>P\times G_x</math>, which will prove the lemma.


Clearly, <math>g(m,k)=(m-k)f(m,k)</math>.
For any <math>\pi\in G</math>, it holds that <math>\pi\circ x=x_i</math> for some <math>x_i</math>, and since <math>\pi_i\circ x=x_i</math>, we have
<math>\pi_i\circ x=\pi\circ x</math>, hence
:<math>(\pi_i^{-1}\cdot\pi)\circ x=\pi_i^{-1}\circ(\pi\circ x)=\pi_i^{-1}\circ(\pi_i\circ x)=(\pi_i^{-1}\cdot\pi_i)\circ x=e\circ x=x</math>.
Denote that <math>\sigma=\pi_i^{-1}\cdot \pi</math>. Then
*<math>\pi_i\cdot\sigma=\pi_i\cdot(\pi_i^{-1}\cdot\pi)=\pi</math>, and
*<math>\sigma\circ x=(\pi_i^{-1}\cdot\pi)\circ x=x</math>, which implies that <math>\sigma\in G_x</math>.
Therefore, each <math>\pi\in G</math> corresponds to a unique pair <math>(\pi_i,\sigma)\in P\times G_x</math>. We have a 1-1 mapping from <math>G</math> to <math>P\times G_x</math>.


But we can also compute <math>g(m,k)</math> as follows:
For every <math>\pi_i\in P</math> and every <math>\sigma\in G_x</math>, <math>\pi=\pi_i\cdot\sigma\in G</math>. We show that this is a 1-1 mapping. Suppose that <math>\pi_i\cdot\sigma=\pi_j\cdot\tau</math> for some <math>\pi_i,\pi_j\in P</math> and <math>\sigma,\tau\in G_x</math>. Then <math>(\pi_i\cdot\sigma)\circ x=\pi_i\circ(\sigma\circ x)=\pi_i\circ x=x_i</math> and <math>(\pi_j\cdot \tau)\circ x=\pi_j\circ(\tau\circ x)=\pi_j\circ x=x_j</math>, which implies <math>x_i=x_j</math> and correspondingly <math>\pi_i=\pi_j</math>. Since <math>\pi_i\cdot\sigma=\pi_j\cdot\tau</math>, <math>\sigma=\tau</math>. Therefore, we show that the mapping is 1-1.
* Choose one of the <math>m</math> points and color it blue. This gives us <math>m</math> ways.
* Cut the circle to make a line of <math>m-1</math> points by removing the blue point.
* Choose <math>k</math> non-consecutive points from the line of <math>m-1</math> points and color them red. This gives <math>{m-k\choose k}</math> ways due to the previous lemma.


Thus, <math>g(m,k)=m{m-k\choose k}</math>. Therefore we have the desired number <math>f(m,k)=\frac{m}{m-k}{m-k\choose k}</math>.
In conclusion, there exists a bijection between <math>P\times G_x</math> and <math>G</math>. As a consequence, <math>|Gx||G_x|=|P||G_x|=|P\times G_x|=|G|</math>.
}}
}}


By the above lemma, we have that <math>r_k=\frac{2n}{2n-k}{2n-k\choose k}</math>. Then apply the theorem of counting permutations with restricted positions,
=== Counting orbits===
:<math>
N_0=\sum_{k=0}^n(-1)^kr_k(n-k)!=\sum_{k=0}^n(-1)^k\frac{2n}{2n-k}{2n-k\choose k}(n-k)!.
</math>
 
This gives the number of ways of seating the <math>n</math> husbands ''after the ladies are seated''. Recall that there are <math>2n!</math> ways of seating the <math>n</math> ladies. Thus, the total number of ways of seating <math>n</math> couples as required by problème des ménages is
:<math>
2n!\sum_{k=0}^n(-1)^k\frac{2n}{2n-k}{2n-k\choose k}(n-k)!.
</math>
 
== Inversion ==
 
=== Posets ===
A '''partially ordered set''' or '''poset''' for short is a set <math>P</math> together with a binary relation denoted <math>\le_P</math> (or just <math>\le</math> if no confusion is caused), satisfying
* (''reflexivity'') For all <math>x\in P, x\le x</math>.
* (''antisymmetry'') If <math>x\le y</math> and <math>y\le x</math>, then <math>x=y</math>.
* (''transitivity'') If <math>x\le y</math> and <math>y\le z</math>, then <math>x\le z</math>.
 
We say two elements <math>x</math> and <math>y</math> are '''comparable''' if <math>x\le y</math> or <math>y\le x</math>; otherwise <math>x</math> and <math>y</math> are '''incomparable'''.
 
;Notation
* <math>x\ge y</math> means <math>y\le x</math>.
* <math>x<y</math> means <math>x\le y</math> and <math>x\neq y</math>.
* <math>x>y</math> means <math>y<x</math>.
 
=== The Möbius function===
Let <math>P</math> be a finite poset. Consider functions in form of <math>\alpha:P\times P\rightarrow\mathbb{R}</math> defined over domain <math>P\times P</math>. It is convenient to treat such functions as matrices whose rows and columns are indexed by <math>P</math>.
 
;Incidence algebra of poset
:Let
::<math>I(P)=\{\alpha:P\times P\rightarrow\mathbb{R}\mid \alpha(x,y)=0\text{ for all }x\not\le_P y\}</math>
:be the class of <math>\alpha</math> such that <math>\alpha(x,y)</math> is non-zero only for <math>x\le_P y</math>.
 
:Treating <math>\alpha</math> as matrix, it is trivial to see that <math>I(P)</math> is closed under addition and scalar multiplication, that is,
:* if <math>\alpha,\beta\in I(P)</math> then <math>\alpha+\beta\in I(P)</math>;
:* if <math>\alpha\in I(P)</math> then <math>c\alpha\in I(P)</math> for any <math>c\in\mathbb{R}</math>;
:where <math>\alpha,\beta</math> are treated as matrices.
 
:With this spirit, it is natural to define the matrix multiplication in <math>I(P)</math>. For <math>\alpha,\beta\in I(P)</math>,
::<math>(\alpha\beta)(x,y)=\sum_{z\in P}\alpha(x,z)\beta(z,y)=\sum_{x\le z\le y}\alpha(x,z)\beta(z,y)</math>.
:The second equation is due to that for <math>\alpha,\beta\in I(P)</math>, for all <math>z</math> other than <math>x\le z\le y</math>, <math>\alpha(x,z)\beta(z,y)</math> is zero.
:By the transitivity of relation <math>\le_P</math>, it is also easy to prove that <math>I(P)</math> is closed under matrix multiplication (the detailed proof is left as an exercise). Therefore, <math>I(P)</math> is closed under addition, scalar multiplication and matrix multiplication, so we have an algebra <math>I(P)</math>, called '''incidence algebra''', over functions on <math>P\times P</math>.
 
;Zeta function and Möbius function
:A special function in <math>I(P)</math> is the so-called '''zeta function''' <math>\zeta</math>, defined as
::<math>\zeta(x,y)=\begin{cases}1&\text{if }x\le_P y,\\0 &\text{otherwise.}\end{cases}</math>
:As a matrix (or more accurately, as an element of the incidence algebra), <math>\zeta</math> is invertible and its inversion, denoted by <math>\mu</math>, is called the '''Möbius function'''. More precisely, <math>\mu</math> is also in the incidence algebra <math>I(P)</math>, and <math>\mu\zeta=I</math> where <math>I</math> is the identity matrix (the identity of the incidence algebra <math>I(P)</math>).


There is an equivalent explicit definition of Möbius function.
{{Theorem|Burnside's Lemma|
{{Theorem|Definition (Möbius function)|
:Let <math>G</math> be a permutation group acting on a set <math>X</math>. The number of orbits, denoted <math>|X/G|</math>, is
:<math>\mu(x,y)=\begin{cases}
::<math>|X/G|=\frac{1}{|G|}\sum_{\pi\in G}|X_{\pi}|.</math>
-\sum_{x\le z< y}\mu(x,z)&\text{if }x<y,\\
1&\text{if }x=y,\\
0&\text{if }x\not\le y.
\end{cases}
</math>
}}
 
To see the equivalence between this definition and the inversion of zeta function, we may have the following proposition, which is proved by directly evaluating <math>\mu\zeta</math>.
{{Theorem|Proposition|
:For any <math>x,y\in P</math>,
::<math>\sum_{x\le z\le y}\mu(x,z)=\begin{cases}1 &\text{if }x=y,\\
0 &\text{otherwise.}\end{cases}</math>
}}
}}
{{Proof|
{{Proof|
It holds that
Let
:<math>(\mu\zeta)(x,y)=\sum_{x\le z\le y}\mu(x,z)\zeta(z,y)=\sum_{x\le z\le y}\mu(x,z)</math>.
:<math>A(\pi,x)=\begin{cases}1 & \pi\circ x=x,\\ 0 & \text{otherwise.}\end{cases}</math>
On the other hand, <math>\mu\zeta=I</math>, i.e.
Basically <math>A(\pi,x)</math> indicates whether <math>x</math> is invariant under action of <math>\pi</math>. We can think of <math>A</math> as a matrix indexed by <math>G\times X</math>. An important observation is that the row sums and column sums of <math>A</math> are <math>|X_\pi|</math> and <math>|G_x|</math> respectively, namely,
:<math>(\mu\zeta)(x,y)=\begin{cases}1 &\text{if }x=y,\\
:<math>|X_\pi|=\sum_{x\in X}A(\pi,x)</math> and <math>|G_x|=\sum_{\pi\in G}A(\pi,x)</math>.
0 &\text{otherwise.}\end{cases}</math>
Then a double counting gives that
The proposition follows.
:<math>\sum_{\pi\in G}|X_\pi|=\sum_{\pi\in G}\sum_{x\in X}A(\pi,x)=\sum_{x\in X}\sum_{\pi\in G}A(\pi,x)=\sum_{x\in X}|G_x|</math>.
}}
Note that <math>\mu(x,y)=\sum_{x\le z\le y}\mu(x,z)-\sum_{x\le z< y}\mu(x,z)</math>, which gives the above inductive definition of Möbius function.
 
=== Computing Möbius functions===
We consider the simple poset <math>P=[n]</math>, where <math>\le</math> is the total order. It follows directly from the recursive definition of Möbius function that
:<math>\mu(i,j)=\begin{cases}1 & \text{if }i=j,\\
-1 & \text{if }i+1=j,\\
0 & \text{otherwise.}
\end{cases}
</math>
 
Usually for general posets, it is difficult to directly compute the Möbius function from its definition. We introduce a rule helping us compute the Möbius function by decomposing the poset into posets with simple structures.


{{Theorem|Theorem (the product rule)|
Due to the above lemma, <math>|G_x|=\frac{|G|}{|Gx|}</math>, thus
: Let <math>P</math> and <math>Q</math> be two finite posets, and <math>P\times Q</math> be the poset resulted from Cartesian product of <math>P</math> and <math>Q</math>, where for all <math>(x,y), (x',y')\in P\times Q</math>, <math>(x,y)\le (x',y')</math> if and only if <math>x\le x'</math> and <math>y\le y'</math>. Then
:<math>\sum_{x\in X}|G_x|=\sum_{x\in X}\frac{|G|}{|Gx|}=|G|\sum_{x\in X}\frac{1}{|Gx|}</math>.
::<math>\mu_{P\times Q}((x,y),(x',y'))=\mu_P(x,x')\mu_Q(y,y')</math>.
We may enumerate the <math>x\in X</math> in the sum by first enumerating the orbits and then the elements in each orbit. Denote the orbits by <math>X_1,X_2,\ldots,X_{|X/G|}</math>. They forms a partition of <math>X</math>, and for any <math>X_i</math> and every <math>x\in X_i</math>, <math>X_i=Gx</math>. Thus,
:<math>|G|\sum_{x\in X}\frac{1}{|Gx|}=|G|\sum_{i=1}^{|X/G|}\sum_{x\in X_i}\frac{1}{|Gx|}=|G|\sum_{i=1}^{|X/G|}\sum_{x\in X_i}\frac{1}{|X_i|}=|G|\sum_{i=1}^{|X/G|}1=|G|{|X/G|}</math>.
Combining everything together
:<math>\sum_{\pi\in G}|X_\pi|=\sum_{x\in X}|G_x|=|G|\sum_{x\in X}\frac{1}{|Gx|}=|G|{|X/G|}</math>,
which gives us that the number of orbits is
:<math>{|X/G|}=\frac{1}{|G|}\sum_{\pi\in G}|X_\pi|</math>.
}}
}}
{{Proof|
We use the recursive definition
:<math>\mu(x,y)=\begin{cases}
-\sum_{x\le z< y}\mu(x,z)&\text{if }x<y,\\
1&\text{if }x=y,\\
0&\text{if }x\not\le y.
\end{cases}
</math>
to prove the equation in the theorem.


If <math>(x,y)=(x',y')</math>, then <math>x=x'</math> and <math>y=y'</math>. It is easy to see that both sides of the equation are 1. If <math>(x,y)\not\le(x',y')</math>, then either <math>x\not\le x'</math> or <math>y\not\le y'</math>. It is also easy to see that both sides are 0.
==Pólya's Theory of Counting ==
=== The cycle index ===
Suppose we want to count the number of ways to color <math>n</math> objects using up to <math>m</math> colors, discounting symmetries on the objects described by a group <math>G</math>. If a coloring <math>x:[n]\rightarrow[m]</math> is invariant under the action of a permutation <math>\pi\in G</math>, then every position in the same cycle of <math>\pi</math> must have the same color. Therefore, if <math>\pi</math> is decomposed to <math>k</math> disjoint cycles, the number of colorings invariant under the action of <math>\pi</math> is <math>|X_\pi|=m^k</math>.


The only remaining case is that <math>(x,y)<(x',y')</math>, in which case either <math>x<x'</math> or <math>y<y'</math>.
We define the cycle index of a permutation group <math>G</math>.
:<math>
For a permutation <math>\pi\in G</math> of <math>[n]</math>, if <math>\pi</math> is a product of <math>k</math> cycles, and the <math>i</math>th cycle has length <math>\ell_i</math>, let
\begin{align}
:<math>M_\pi(x_1,x_2,\ldots,x_n)=\prod_{i=1}^k x_{\ell_i}</math>.
\sum_{(x,y)\le (u,v)\le (x',y')}\mu_P(x,u)\mu_Q(y,v)
The '''cycle index''' of <math>G</math> is defined by
&=\left(\sum_{x\le u\le x'}\mu_P(x,u)\right)\left(\sum_{y\le v\le y'}\mu_Q(y,v)\right)=I(x,x')I(y,y')=0,
:<math>P_G(x_1,x_2,\ldots,x_n)=\frac{1}{|G|}\sum_{\pi\in G}M_\pi(x_1,x_2,\ldots,x_n)</math>.
\end{align}
</math>
where the last two equations are due to the proposition for <math>\mu</math>. Thus
:<math>\mu_P(x,x')\mu_Q(y,y')=-\sum_{(x,y)\le (u,v)< (x',y')}\mu_P(x,u)\mu_Q(y,v)</math>.


By induction, assume that the equation <math>\mu_{P\times Q}((x,y),(u,v))=\mu_P(x,u)\mu_Q(y,v)</math> is true for all <math>(u,v)< (x',y')</math>. Then
Due to Burnside's lemma, the number of equivalent classes of <math>m</math>-colorings of <math>n</math> objects, discounting the symmetries described by permutation group <math>G</math>, is given by
:<math>
:<math>P_G(\underbrace{m,m,\ldots,m}_{n})</math>
\begin{align}
\mu_{P\times Q}((x,y),(x',y'))
&=-\sum_{(x,y)\le (u,v)< (x',y')}\mu_{P\times Q}((x,y),(u,v))\\
&=-\sum_{(x,y)\le (u,v)< (x',y')}\mu_P(x,u)\mu_Q(y,v)\\
&=\mu_P(x,x')\mu_Q(y,y'),
\end{align}
</math>
which complete the proof.
}}


;Poset of subsets
=== Pólya's enumeration formula ===
:Consider the poset defined by all subsets of a finite universe <math>U</math>, that is <math>P=2^U</math>, and for <math>S,T\subseteq U</math>, <math>S\le_P T</math> if and only if <math>S\subseteq T</math>.
For any tuple <math>\mathbf{v}=(n_1,n_2,\ldots,n_m)</math> of nonnegative integers satisfying that <math>n_1+n_2+\cdots +n_m=n</math>, let <math>a_{\mathbf{v}}</math> represent the number of nonequivalent (with respect to a permutation group <math>G</math>) <math>m</math>-colorings of the <math>n</math> objects, where the <math>i</math>th color occurs precisely <math>n_i</math> times. The '''pattern inventory''' is the (multivariate) generating function for the sequence <math>a_{\mathbf{v}}</math>, defined as
:<math>F_G(y_1,y_2,\ldots,y_m)=\sum_{\mathbf{v}}a_{\mathbf{v}}y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}</math>,
where the sum runs over all <math>\mathbf{v}=(n_1,n_2,\ldots,n_m)</math> satisfying that <math>n_1+n_2+\cdots +n_m=n</math> and <math>n_i\ge 0, 1\le i\le m</math>.


{{Theorem| Möbius function for subsets|
{{Theorem|Theorem (Pólya's enumeration formula)|
:The Möbius function for the above defined poset <math>P</math> is that for <math>S,T\subseteq U</math>,
:Let <math>G</math> be a permutation group. The pattern inventory for the nonequivalent <math>m</math>-colorings of <math>n</math> objects under the action of <math>G</math> is:
::<math>\mu(S,T)=
::<math>F_G(y_1,y_2,\ldots,y_m)=P_G\left(\sum_{i=1}^m y_i, \sum_{i=1}^m y_i^2,\ldots, \sum_{i=1}^m y_i^n\right)</math>.
\begin{cases}
(-1)^{|T|-|S|} & \text{if }S\subseteq T,\\
0 &\text{otherwise.}
\end{cases}
</math>
}}
}}
{{Proof|
{{Proof|
We can equivalently represent each <math>S\subseteq U</math> by a boolean string <math>S\in\{0,1\}^U</math>, where <math>S(x)=1</math> if and only if <math>x\in S</math>.
Let <math>\mathbf{v}=(n_1,\ldots,n_m)</math> be a vector of nonnegative integers satisfying that <math>n_1+\cdots+n_m=n</math>, and let <math>a_{\mathbf{v}}</math> represent the number of nonequivalent (with respect to a permutation group <math>G</math>) <math>m</math>-colorings of the <math>n</math> objects, such that the <math>i</math>th color occurs precisely <math>n_i</math> times.
 
For each element <math>x\in U</math>, we can define a poset <math>P_x=\{0, 1\}</math> with <math>0\le 1</math>. By definition of Möbius function, the Möbius function of this elementary poset is given by <math>\mu_x(0,0)=\mu_x(1,1)=1</math>, <math>\mu_x(0,1)=-1</math> and <math>\mu(1,0)=0</math>.
 
The poset <math>P</math> of all subsets of <math>U</math> is the Cartesian product of all <math>P_x</math>, <math>x\in U</math>. By the product rule,
:<math>\mu(S,T)=\prod_{x\in U}\mu_x(S(x), T(x))=\prod_{x\in S\atop x\in T}1\prod_{x\not\in S\atop x\not\in T}1\prod_{x\in S\atop x\not\in T}0\prod_{x\not\in S\atop x\in T}(-1)=\begin{cases}
(-1)^{|T|-|S|} & \text{if }S\subseteq T,\\
0 &\text{otherwise.}
\end{cases}</math>
}}


:Note that the poset <math>P</math> is actually the [http://en.wikipedia.org/wiki/Boolean_algebra_(structure) Boolean algebra] of rank <math>|U|</math>. The proof relies only on that the fact that the poset is a Boolean algebra, thus the theorem holds for Boolean algebra posets.
The set of configurations is now
:<math>X^{\mathbf{v}}=\{x:[n]\rightarrow[m]\mid \forall i\in[m], x^{-1}(i)=n_i\}</math>,
which contains all <math>m</math>-colorings of the <math>n</math> objects such that the <math>i</math>th color occurs precisely <math>n_i</math> times, without considering symmetries.


;Posets of divisors
For every <math>\pi\in G</math>, the invariant set is now
:Consider the poset defined by all devisors of a positive integer <math>n</math>, that is <math>P=\{a>0\mid a|n\}</math>, and for <math>a,b\in P</math>, <math>a\le_P b</math> if and only if <math>a|b\,</math>.
:<math>X_\pi^{\mathbf{v}}=\{x\in X^{\mathbf{v}}\mid \pi\circ x=x\}</math>.


{{Theorem|Möbius function for divisors|
;Claim 1<nowiki>:</nowiki> <math>a_{\mathbf{v}}=\frac{1}{|G|}\sum_{\pi\in G}|X_\pi^{\mathbf{v}}|</math>.
:The Möbius function for the above defined poset <math>P</math> is that for <math>a,b>0</math> that <math>a|n</math> and <math>b|n</math>,
{{Proofbox|
::<math>\mu(a,b)=
;Proof of Claim 1.
\begin{cases}
For <math>\mathbf{v}=(n_1,\ldots,n_m)</math>, given any permutation <math>\pi\in G</math>, <math>|X^{\mathbf{v}}_\pi|</math> gives the number of colorings <math>x</math> in <math>X^\mathbf{v}</math> invariant under action of <math>\pi</math>, and <math>a_\mathbf{v}</math> is the number of orbits of <math>X^\mathbf{v}</math> induced by the action of permutation group <math>G</math>. Due to Burnside's lemma,
(-1)^{r} & \text{if }\frac{b}{a}\text{ is the product of }r\text{ distinct primes},\\
:<math>a_{\mathbf{v}}=\frac{1}{|G|}\sum_{\pi\in G}|X_\pi^{\mathbf{v}}|</math>.
0 &\text{otherwise, i.e. if }a\not|b\text{ or }\frac{b}{a}\text{ is not squarefree.}
\end{cases}
</math>
}}
}}
{{Proof|
;Claim 2<nowiki>:</nowiki> <math>M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right)=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}|X^{\mathbf{v}}_\pi|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}</math>.
Denote <math>n=p_1^{n_1}p_2^{n_2}\cdots p_k^{n_k}</math>. Represent <math>n</math> by a tuple <math>(n_1,n_2,\ldots,n_k)</math>. Every <math>a\in P</math> corresponds in this way to a tuple <math>(a_1,a_2,\ldots,a_k)</math> with <math>a_i\le n_i</math> for all <math>1\le i\le k</math>.
{{Proofbox|
 
;Proof of Claim 2.
Let <math>P_i=\{1,2,\ldots,n_i\}</math> be the poset with <math>\le</math> being the total order. The poset <math>P</math> of divisors of <math>n</math> is thus isomorphic to the poset constructed by the Cartesian product of all <math>P_i</math>, <math>1\le i\le k</math>. Then
Supposed that a permutation <math>\pi\in G</math> can be represented as a composition of <math>k</math> cycles, the <math>i</math>th cycle of length <math>\ell_i</math>.
:<math>\pi=\overbrace{\underbrace{(\cdots\cdots)}_{\ell_1}\underbrace{(\cdots)}_{\ell_2}\cdots\underbrace{(\cdots)}_{\ell_k}}^{k \text{ cycles}}</math>
The permutation <math>\pi</math> does not change a coloring <math>x</math>, if and only if every cycle of <math>\pi</math> are assigned the same color. Then <math>|X^{\mathbf{v}}_\pi|</math> equals the number of <math>m</math>-colorings of <math>k</math> cycles, where the <math>i</math>th cycle is of length <math>\ell_i</math>, satisfying that the <math>i</math>th color occurs precisely <math>n_i</math> times. The following generalized version of multinomial theorem generates <math>|X^{\mathbf{v}}_\pi|</math> for all <math>\mathbf{v}=(n_1,\ldots,n_m)</math>.
:<math>
:<math>
\begin{align}
\begin{align}
\mu(a,b)
&\quad\, (y_1^{\ell_1}+y_2^{\ell_1}+\cdots+y_m^{\ell_1})(y_1^{\ell_2}+y_2^{\ell_2}+\cdots+y_m^{\ell_2})\cdots(y_1^{\ell_m}+y_2^{\ell_m}+\cdots+y_m^{\ell_m})\\
&=\prod_{1\le i\le k}\mu(a_i,b_i)=\prod_{1\le i\le k\atop a_i=b_i}1\prod_{1\le i\le k\atop b_i-a_i=1}(-1)\prod_{1\le i\le k\atop b_i-a_i\not\in\{0,1\}}0
&=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}|X^{\mathbf{v}}_\pi|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}.
=\begin{cases}
(-1)^{\sum_{i}(b_i-a_i)} & \text{if all }b_i-a_i\in\{0,1\},\\
0 &\text{otherwise.}
\end{cases}\\
&=\begin{cases}
(-1)^{r} & \text{if }\frac{b}{a}\text{ is the product of }r\text{ distinct primes},\\
0 &\text{otherwise.}
\end{cases}
\end{align}
\end{align}
</math>
</math>
}}


=== Principle of Möbius inversion ===
On the other hand,
We now introduce the the famous Möbius inversion formula.
:<math>\begin{align}
{{Theorem|Möbius inversion formula|
&\quad\, (y_1^{\ell_1}+y_2^{\ell_1}+\cdots+y_m^{\ell_1})(y_1^{\ell_2}+y_2^{\ell_2}+\cdots+y_m^{\ell_2})\cdots(y_1^{\ell_m}+y_2^{\ell_m}+\cdots+y_m^{\ell_m})\\
:Let <math>P</math> be a finite poset and <math>\mu</math> its Möbius function. Let <math>f,g:P\rightarrow \mathbb{R}</math>. Then
&=M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right).
::<math>\forall x\in P,\,\, g(x)=\sum_{y\le x} f(y)</math>,
\end{align}</math>
:if and only if
Therefore,
::<math>\forall x\in P,\,\, f(x)=\sum_{y\le x}g(y)\mu(y,x)</math>.
:<math>M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right)=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}|X^{\mathbf{v}}_\pi|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}</math>.
}}
}}
The functions <math>f,g:P\rightarrow\mathbb{R}</math> are vectors. Evaluate the matrix multiplications <math>f\zeta</math> and <math>g\mu</math> as follows:
:<math>(f\zeta)(x)=\sum_{y\in P}f(y)\zeta(y,x)=\sum_{y\le x}f(y)</math>,
and
:<math>(g\mu)(x)=\sum_{y\in P}g(y)\mu(y,x)=\sum_{y\le x}g(y)\mu(y,x)</math>.
The Möbius inversion formula is nothing but the following statement
:<math>f\zeta=g\Leftrightarrow f=g\mu</math>,
which is trivially true due to <math>\mu\zeta=I</math> by basic linear algebra.


The following dual form of the inversion formula is also useful.
Combining everything together,
{{Theorem|Möbius inversion formula, dual form|
:Let <math>P</math> be a finite poset and <math>\mu</math> its Möbius function. Let <math>f,g:P\rightarrow \mathbb{R}</math>. Then
::<math>\forall x\in P, \,\, g(x)=\sum_{y{\color{red}\ge} x} f(y)</math>,
: if and only if
::<math>\forall x\in P, \,\, f(x)=\sum_{y{\color{red}\ge} x}\mu(x,y)g(y)</math>.
}}
To prove the dual form, we only need to evaluate the matrix multiplications on left:
:<math>\zeta f=g\Leftrightarrow f=\mu g</math>.
 
;Principle of Inclusion-Exclusion
:Let <math>A_1,A_2,\ldots,A_n\subseteq U</math>. For any <math>J\subseteq\{1,2,\ldots,n\}</math>,
:*let <math>f(J)</math> be the number of elements that belongs to ''exactly'' the sets <math>A_i, i\in J</math> and to no others, i.e.
:::<math>f(J)=\left|\left(\bigcap_{i\in J}A_i\right)\setminus\left(\bigcup_{i\not\in J}A_i\right)\right|</math>;
:*let <math>g(J)=\left|\bigcap_{i\in J}A_i\right|</math>.
:For any <math>J\subseteq\{1,2,\ldots,n\}</math>, the following relation holds for the above defined <math>f</math> and <math>g</math>:
::<math>g(J)=\sum_{I\supseteq J}f(I)</math>.
:Applying the dual form of the Möbius inversion formula, we have that for any <math>J\subseteq\{1,2,\ldots,n\}</math>,
::<math>f(J)=\sum_{I\supseteq J}\mu(J,I)g(I)=\sum_{I\supseteq J}\mu(J,I)\left|\bigcap_{i\in I}A_i\right|</math>,
:where the Möbius function is for the poset of all subsets of <math>\{1,2,\ldots,n\}</math>, ordered by <math>\subseteq</math>, thus it holds that <math>\mu(J,I)=(-1)^{|I|-|J|}\,</math> for <math>J\subseteq I</math>. Therefore,
::<math>f(J)=\sum_{I\supseteq J}(-1)^{|I|-|J|}\left|\bigcap_{i\in I}A_i\right|</math>.
:We have a formula for the number of elements with exactly those properties <math>A_i, i\in J</math> for any <math>J\subseteq\{1,2,\ldots,n\}</math>. For the special case that <math>J=\emptyset</math>, <math>f(\emptyset)</math> is the number of elements satisfying no property of <math>A_1,A_2,\ldots,A_n</math>, and
::<math>f(\emptyset)=\left|U\setminus\bigcup_iA_i\right|=\sum_{I\subseteq \{1,\ldots,n\}}(-1)^{|I|}\left|\bigcap_{i\in I}A_i\right|</math>
:which gives precisely the Principle of Inclusion-Exclusion.
 
;Möbius inversion formula for number theory
:The number-theoretical Möbius inversion formula is stated as such: Let <math>N</math> be a positive integer,
::<math>g(n)=\sum_{d|n}f(d)\,</math> for all <math>n|N</math>
:if and only if
::<math>f(n)=\sum_{d|n}g(d)\mu\left(\frac{n}{d}\right)\,</math> for all <math>n|N</math>,
:where <math>\mu</math> is the [http://en.wikipedia.org/wiki/M%C3%B6bius_function number-theoretical Möbius function], defined as
::<math>\mu(n)=\begin{cases}1 & \text{if }n\text{ is product of an even number of distinct primes,}\\
-1 &\text{if }n\text{ is product of an odd number of distinct primes,}\\
0 &\text{otherwise.}\end{cases}</math>
:The number-theoretical Möbius inversion formula is just a special case of the Möbius inversion formula for posets, when the poset is the set of divisors of <math>N</math>, and for any <math>a,b\in P</math>, <math>a\le_P b</math> if <math>a|b</math>.
 
== Sieve Method in Number Theory ==
=== The Euler totient function ===
Two integers <math>m, n</math> are said to be '''relatively prime''' if their greatest common diviser <math>\mathrm{gcd}(m,n)=1</math>. For a positive integer <math>n</math>, let <math>\phi(n)</math> be the number of positive integers from <math>\{1,2,\ldots,n\}</math> that are relative prime to <math>n</math>. This function, called the Euler <math>\phi</math> function or '''the Euler totient function''', is fundamental in number theory.
 
We now derive a formula for this function by using the principle of inclusion-exclusion.
{{Theorem|Theorem (The Euler totient function)|
Suppose <math>n</math> is divisible by precisely <math>r</math> different primes, denoted <math>p_1,\ldots,p_r</math>. Then
:<math>\phi(n)=n\prod_{i=1}^r\left(1-\frac{1}{p_i}\right)</math>.
}}
{{Proof|
Let <math>U=\{1,2,\ldots,n\}</math> be the universe. The number of positive integers from <math>U</math> which is divisible by some <math>p_{i_1},p_{i_2},\ldots,p_{i_s}\in\{p_1,\ldots,p_r\}</math>, is <math>\frac{n}{p_{i_1}p_{i_2}\cdots p_{i_s}}</math>.
 
<math>\phi(n)</math> is the number of integers from <math>U</math> which is not divisible by any <math>p_1,\ldots,p_r</math>.
By principle of inclusion-exclusion,
:<math>
:<math>
\begin{align}
\begin{align}
\phi(n)
&\quad\, F_G(y_1,y_2,\ldots,y_m)\\
&=n+\sum_{k=1}^r(-1)^k\sum_{1\le i_1<i_2<\cdots <i_k\le n}\frac{n}{p_{i_1}p_{i_2}\cdots p_{i_k}}\\
&=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}a_{\mathbf{v}}y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}\\
&=n-\sum_{1\le i\le n}\frac{n}{p_i}+\sum_{1\le i<j\le n}\frac{n}{p_i p_j}-\sum_{1\le i<j<k\le n}\frac{n}{p_{i} p_{j} p_{k}}+\cdots + (-1)^r\frac{n}{p_{1}p_{2}\cdots p_{r}}\\
&=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}\left(\frac{1}{|G|}\sum_{\pi\in G}|X_\pi^{\mathbf{v}}|\right)y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}
&=n\left(1-\sum_{1\le i\le n}\frac{1}{p_i}+\sum_{1\le i<j\le n}\frac{1}{p_i p_j}-\sum_{1\le i<j<k\le n}\frac{1}{p_{i} p_{j} p_{k}}+\cdots + (-1)^r\frac{1}{p_{1}p_{2}\cdots p_{r}}\right)\\
&\quad\text{(due to Claim 1)}\\
&=n\prod_{i=1}^r\left(1-\frac{1}{p_i}\right).
&=\frac{1}{|G|}\sum_{\pi\in G}\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n} |X_\pi^{\mathbf{v}}|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}\\
&=\frac{1}{|G|}\sum_{\pi\in G}M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right)
&\quad\text{(due to Claim 2)}\\
&=P_G\left(\sum_{i=1}^m y_i, \sum_{i=1}^m y_i^2,\ldots, \sum_{i=1}^m y_i^n\right).
\end{align}
\end{align}
</math>
</math>
}}
}}
== Reference ==
* ''Stanley,'' Enumerative Combinatorics, Volume 1, Chapter 2.
* ''van Lin and Wilson'', A course in combinatorics, Chapter 10, 25.

Latest revision as of 06:16, 8 October 2019

Groups

A group [math]\displaystyle{ (G,\cdot) }[/math] is set [math]\displaystyle{ G }[/math] along with a binary operator [math]\displaystyle{ \cdot }[/math] which satisfies the following axioms:

  • closure: [math]\displaystyle{ \forall g,h\in G, g\cdot h \in G }[/math];
  • associativity: [math]\displaystyle{ \forall f,g,h\in G, f\cdot(g\cdot h)=(f\cdot g)\cdot h }[/math];
  • identity: there exists a special element [math]\displaystyle{ e\in G }[/math], called the identity, such that [math]\displaystyle{ e\cdot g=g }[/math] for any [math]\displaystyle{ g\in G }[/math];
  • inverse: [math]\displaystyle{ \forall g\in G }[/math], there exists an [math]\displaystyle{ h\in G }[/math] such that [math]\displaystyle{ g\cdot h=e }[/math], and we denote that [math]\displaystyle{ h=g^{-1} }[/math].

Permutation groups

A permutation is a bijection [math]\displaystyle{ \pi:[n]\xrightarrow[\text{onto}]{\text{1-1}}[n] }[/math]. We can define a natural operator "[math]\displaystyle{ \cdot }[/math]" between permutations by function composition, i.e. for any [math]\displaystyle{ \pi,\sigma\in S_n }[/math], [math]\displaystyle{ (\pi\cdot\sigma)(i)=\pi(\sigma(i)) }[/math] for all [math]\displaystyle{ i\in[n] }[/math].

Symmetric group [math]\displaystyle{ S_n }[/math]
Let [math]\displaystyle{ S_n }[/math] denote the set of all permutations of [math]\displaystyle{ [n] }[/math]. It is easy to verify that [math]\displaystyle{ S_n }[/math] together with function composition [math]\displaystyle{ \cdot }[/math] form a group. This group is called the symmetric group on [math]\displaystyle{ n }[/math] elements.

Subgroups of [math]\displaystyle{ S_n }[/math] are called permutation groups. The most basic permutation group is [math]\displaystyle{ S_n }[/math] itself (since a group is a subgroup of itself). Besides it, there are several typical permutation groups related to natural symmetric operations.

Cyclic group [math]\displaystyle{ C_n }[/math]
Given a permutation [math]\displaystyle{ \pi\in S_n }[/math], denote that
[math]\displaystyle{ \pi^t=\underbrace{\pi\cdot\pi\cdots\pi}_t }[/math],
and [math]\displaystyle{ \pi^0=e }[/math] is the identity.
We define a special permutation [math]\displaystyle{ \sigma\, }[/math] as that [math]\displaystyle{ \sigma(i)=(i+1)\bmod n\, }[/math] for any [math]\displaystyle{ i }[/math], and let [math]\displaystyle{ C_n=\{\sigma^t\mid t\ge 0\} }[/math]. We say that [math]\displaystyle{ C_n }[/math] is generated by the permutation [math]\displaystyle{ \sigma }[/math]. It is easy to verify that [math]\displaystyle{ C_n }[/math] is a subgroup of [math]\displaystyle{ S_n }[/math]. The group [math]\displaystyle{ C_n }[/math] is called the cyclic group of order [math]\displaystyle{ n }[/math], which is the group formed by all rotational symmetries of a regular polygon of [math]\displaystyle{ n }[/math] sides.
It is easy to see that [math]\displaystyle{ |C_n|=n }[/math], that is, there are [math]\displaystyle{ n }[/math] rotations of a regular [math]\displaystyle{ n }[/math]-gon.
Dihedral group [math]\displaystyle{ D_n }[/math]
Define another special permutation [math]\displaystyle{ \rho\, }[/math] as that [math]\displaystyle{ \rho(i)=n-i-1\, }[/math] for all [math]\displaystyle{ i\in[n] }[/math]. The Dihedral group [math]\displaystyle{ D_n }[/math] is obtained by adding [math]\displaystyle{ \rho }[/math] into [math]\displaystyle{ C_n }[/math] and then take a closure (under operations of "[math]\displaystyle{ \cdot }[/math]"). This group is the group of symmetries, including rotations as well as reflections, of a regular polygon of [math]\displaystyle{ n }[/math] sides.
It is an exercise to check that [math]\displaystyle{ |D_n|=2n }[/math].

Cycle decomposition

A permutation [math]\displaystyle{ \pi:[n]\xrightarrow[\text{onto}]{\text{1-1}}[n] }[/math] can be written in the following form:

[math]\displaystyle{ \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi(1) & \pi(2) & \cdots & \pi(n) \end{pmatrix} }[/math].

For example,

[math]\displaystyle{ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 3 & 5 & 1 & 2 & 4 \end{pmatrix} }[/math].

The permutation can be equivalently described as a composition of a number of cycles. For example, in the above permutation, we have two cycles:

[math]\displaystyle{ 1\rightarrow 3\rightarrow 1 }[/math] and [math]\displaystyle{ 2\rightarrow5\rightarrow4\rightarrow2 }[/math].

We can denote the permutation by

[math]\displaystyle{ (13)(254) }[/math].

Group action

Definition (group action)
A group action of a group [math]\displaystyle{ G }[/math] on a set [math]\displaystyle{ X }[/math] is a binary operator:
[math]\displaystyle{ \circ:G\times X\rightarrow X }[/math]
satisfying:
  • Associativity: [math]\displaystyle{ (g\cdot h)\circ x=g\circ (h\circ x) }[/math] for all [math]\displaystyle{ g,h\in G }[/math] and [math]\displaystyle{ x\in X }[/math];
  • Identity: [math]\displaystyle{ e\circ x=x }[/math] for all [math]\displaystyle{ x\in X }[/math].

Example: coloring a necklace

Suppose a necklace is made of [math]\displaystyle{ n }[/math] beads, each with one of the [math]\displaystyle{ m }[/math] colors. Formally, a necklace is an assignment [math]\displaystyle{ x:[n]\rightarrow[m] }[/math] of [math]\displaystyle{ m }[/math] colors to [math]\displaystyle{ n }[/math] positions. Let [math]\displaystyle{ X=\{x:[n]\rightarrow[m]\} }[/math] be the set of all such assignments.

For example, when [math]\displaystyle{ n=4 }[/math] and [math]\displaystyle{ m=2 }[/math], [math]\displaystyle{ X }[/math] contains all possible 2-colorings (say red and blue) of 4 positions.

[math]\displaystyle{ X=\{{\color{blue}bbbb}, {\color{blue}bbb}{\color{red}r}, {\color{blue}bb}{\color{red}r}{\color{blue}b}, {\color{blue}bb}{\color{red}rr}, {\color{blue}b}{\color{red}r}{\color{blue}b}{\color{red}r}, {\color{blue}b}{\color{red}rr}{\color{blue}b}, {\color{blue}b}{\color{red}rrr}, {\color{red}r}{\color{blue}bbb}, {\color{red}r}{\color{blue}bb}{\color{red}r}, {\color{red}r}{\color{blue}b}{\color{red}r}{\color{blue}b}, {\color{red}r}{\color{blue}b}{\color{red}rr}, {\color{red}rr}{\color{blue}bb}, {\color{red}rr}{\color{blue}b}{\color{red}r}, {\color{red}rrr}{\color{blue}b}, {\color{red}rrrr}\} }[/math]

We consider two kinds of symmetric operations on necklaces:

  • Rotation: the corresponding group is the cyclic group [math]\displaystyle{ C_n }[/math].
  • Rotation and reflection: the corresponding group is the dihedral group [math]\displaystyle{ D_n }[/math].

Mathematically, these operations on necklaces are described as group actions on [math]\displaystyle{ X }[/math]. Recall that each member [math]\displaystyle{ x\in X }[/math] is an [math]\displaystyle{ m }[/math]-coloring [math]\displaystyle{ x:[n]\rightarrow[m] }[/math] of [math]\displaystyle{ n }[/math] positions. Suppose the permutation group is [math]\displaystyle{ G }[/math], for any [math]\displaystyle{ \pi\in G }[/math] and any [math]\displaystyle{ x\in X }[/math], the group action [math]\displaystyle{ \pi\circ x }[/math] is naturally defined as such:

[math]\displaystyle{ (\pi\circ x)(i)=x(\pi(i)) }[/math].

Burnside's Lemma

Orbits

Definition
Let [math]\displaystyle{ G }[/math] be a permutation group acting on a set [math]\displaystyle{ X }[/math]. For any [math]\displaystyle{ x\in X }[/math]. The orbit of [math]\displaystyle{ x }[/math] under action of [math]\displaystyle{ G }[/math], denoted [math]\displaystyle{ Gx }[/math], is defined as
[math]\displaystyle{ Gx=\{\pi\circ x\mid \pi\in G\} }[/math].

For any [math]\displaystyle{ x,y,z\in X }[/math], the followings can be verified for orbits

  • reflexivity: [math]\displaystyle{ x\in Gx }[/math] since [math]\displaystyle{ e\circ x=x }[/math];
  • symmetric: if [math]\displaystyle{ y\in Gx }[/math] then there is a [math]\displaystyle{ \pi\in G }[/math] such that [math]\displaystyle{ \pi\circ x=y }[/math], thus [math]\displaystyle{ \pi^{-1}\circ y=\pi^{-1}\circ(\pi\circ x) =(\pi^{-1}\cdot \pi)\circ x=e\circ x=x }[/math], hence [math]\displaystyle{ x\in Gy }[/math];
  • transitivity: if [math]\displaystyle{ y\in Gx }[/math] and [math]\displaystyle{ z\in Gy }[/math], then there exist [math]\displaystyle{ \pi,\sigma\in G }[/math] such that [math]\displaystyle{ \pi\circ x=y }[/math] and [math]\displaystyle{ \sigma\circ y=z }[/math], thus [math]\displaystyle{ (\sigma\cdot \pi)\circ x=\sigma\circ(\pi\circ x)=\sigma\circ y=z }[/math], hence [math]\displaystyle{ z\in Gx }[/math].

Therefore, [math]\displaystyle{ x\in Gy }[/math] defines an equivalent relation between [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math], and orbits partition the set [math]\displaystyle{ X }[/math].

Example
Back to the example of coloring necklaces.
[math]\displaystyle{ X }[/math] contains all possible 2-colorings (say red and blue) of 4 positions.
[math]\displaystyle{ X=\{{\color{blue}bbbb}, {\color{blue}bbb}{\color{red}r}, {\color{blue}bb}{\color{red}r}{\color{blue}b}, {\color{blue}bb}{\color{red}rr}, {\color{blue}b}{\color{red}r}{\color{blue}bb}, {\color{blue}b}{\color{red}r}{\color{blue}b}{\color{red}r}, {\color{blue}b}{\color{red}rr}{\color{blue}b}, {\color{blue}b}{\color{red}rrr}, {\color{red}r}{\color{blue}bbb}, {\color{red}r}{\color{blue}bb}{\color{red}r}, {\color{red}r}{\color{blue}b}{\color{red}r}{\color{blue}b}, {\color{red}r}{\color{blue}b}{\color{red}rr}, {\color{red}rr}{\color{blue}bb}, {\color{red}rr}{\color{blue}b}{\color{red}r}, {\color{red}rrr}{\color{blue}b}, {\color{red}rrrr}\} }[/math]
We consider the rotations of necklaces. The cyclic group [math]\displaystyle{ C_4 }[/math] acting on [math]\displaystyle{ X }[/math] partitions the [math]\displaystyle{ X }[/math] into the following equivalent classes:
[math]\displaystyle{ \begin{align} &\{{\color{blue}bbbb}\},\\ &\{ {\color{blue}bbb}{\color{red}r}, {\color{blue}bb}{\color{red}r}{\color{blue}b}, {\color{blue}b}{\color{red}r}{\color{blue}bb}, {\color{red}r}{\color{blue}bbb}\},\\ &\{ {\color{blue}bb}{\color{red}rr}, {\color{blue}b}{\color{red}rr}{\color{blue}b}, {\color{red}r}{\color{blue}bb}{\color{red}r}, {\color{red}rr}{\color{blue}bb}, \},\\ &\{ {\color{blue}b}{\color{red}r}{\color{blue}b}{\color{red}r}, {\color{red}r}{\color{blue}b}{\color{red}r}{\color{blue}b}\},\\ &\{ {\color{blue}b}{\color{red}rrr}, {\color{red}r}{\color{blue}b}{\color{red}rr}, {\color{red}rr}{\color{blue}b}{\color{red}r}, {\color{red}rrr}{\color{blue}b}\},\\ &\{ {\color{red}rrrr}\} \end{align} }[/math]

Invariant sets and stabilizers

Let [math]\displaystyle{ G }[/math] be a permutation group acting on a set [math]\displaystyle{ X }[/math]. Let [math]\displaystyle{ \pi\in G, x\in X }[/math].

  • The invariant set of [math]\displaystyle{ \pi }[/math]:
[math]\displaystyle{ X_\pi=\{x\in X\mid \pi\circ x=x\} }[/math].
  • The stabilizer of [math]\displaystyle{ x }[/math]:
[math]\displaystyle{ G_x=\{\pi\in G\mid \pi\circ x=x\} }[/math].

There is a beautiful identity for invariant sets and stabilizers.

Lemma
Let [math]\displaystyle{ G }[/math] be a permutation group acting on a set [math]\displaystyle{ X }[/math]. For any [math]\displaystyle{ x\in X }[/math],
[math]\displaystyle{ |G_x||Gx|=|G|\, }[/math].
Proof.

Recall that the orbit [math]\displaystyle{ Gx=\{\pi\circ x\mid \pi\in G\} }[/math]. Suppose that [math]\displaystyle{ Gx=\{x_1,x_2,\ldots,x_t\} }[/math]. Then there exist a set [math]\displaystyle{ P=\{\pi_1,\pi_2,\ldots,\pi_t\} }[/math] of permutations such that [math]\displaystyle{ \pi_i\circ x=x_i\, }[/math] for [math]\displaystyle{ i=1,2,\ldots,t }[/math]. We construct a bijection between [math]\displaystyle{ G }[/math] and [math]\displaystyle{ P\times G_x }[/math], which will prove the lemma.

For any [math]\displaystyle{ \pi\in G }[/math], it holds that [math]\displaystyle{ \pi\circ x=x_i }[/math] for some [math]\displaystyle{ x_i }[/math], and since [math]\displaystyle{ \pi_i\circ x=x_i }[/math], we have [math]\displaystyle{ \pi_i\circ x=\pi\circ x }[/math], hence

[math]\displaystyle{ (\pi_i^{-1}\cdot\pi)\circ x=\pi_i^{-1}\circ(\pi\circ x)=\pi_i^{-1}\circ(\pi_i\circ x)=(\pi_i^{-1}\cdot\pi_i)\circ x=e\circ x=x }[/math].

Denote that [math]\displaystyle{ \sigma=\pi_i^{-1}\cdot \pi }[/math]. Then

  • [math]\displaystyle{ \pi_i\cdot\sigma=\pi_i\cdot(\pi_i^{-1}\cdot\pi)=\pi }[/math], and
  • [math]\displaystyle{ \sigma\circ x=(\pi_i^{-1}\cdot\pi)\circ x=x }[/math], which implies that [math]\displaystyle{ \sigma\in G_x }[/math].

Therefore, each [math]\displaystyle{ \pi\in G }[/math] corresponds to a unique pair [math]\displaystyle{ (\pi_i,\sigma)\in P\times G_x }[/math]. We have a 1-1 mapping from [math]\displaystyle{ G }[/math] to [math]\displaystyle{ P\times G_x }[/math].

For every [math]\displaystyle{ \pi_i\in P }[/math] and every [math]\displaystyle{ \sigma\in G_x }[/math], [math]\displaystyle{ \pi=\pi_i\cdot\sigma\in G }[/math]. We show that this is a 1-1 mapping. Suppose that [math]\displaystyle{ \pi_i\cdot\sigma=\pi_j\cdot\tau }[/math] for some [math]\displaystyle{ \pi_i,\pi_j\in P }[/math] and [math]\displaystyle{ \sigma,\tau\in G_x }[/math]. Then [math]\displaystyle{ (\pi_i\cdot\sigma)\circ x=\pi_i\circ(\sigma\circ x)=\pi_i\circ x=x_i }[/math] and [math]\displaystyle{ (\pi_j\cdot \tau)\circ x=\pi_j\circ(\tau\circ x)=\pi_j\circ x=x_j }[/math], which implies [math]\displaystyle{ x_i=x_j }[/math] and correspondingly [math]\displaystyle{ \pi_i=\pi_j }[/math]. Since [math]\displaystyle{ \pi_i\cdot\sigma=\pi_j\cdot\tau }[/math], [math]\displaystyle{ \sigma=\tau }[/math]. Therefore, we show that the mapping is 1-1.

In conclusion, there exists a bijection between [math]\displaystyle{ P\times G_x }[/math] and [math]\displaystyle{ G }[/math]. As a consequence, [math]\displaystyle{ |Gx||G_x|=|P||G_x|=|P\times G_x|=|G| }[/math].

[math]\displaystyle{ \square }[/math]

Counting orbits

Burnside's Lemma
Let [math]\displaystyle{ G }[/math] be a permutation group acting on a set [math]\displaystyle{ X }[/math]. The number of orbits, denoted [math]\displaystyle{ |X/G| }[/math], is
[math]\displaystyle{ |X/G|=\frac{1}{|G|}\sum_{\pi\in G}|X_{\pi}|. }[/math]
Proof.

Let

[math]\displaystyle{ A(\pi,x)=\begin{cases}1 & \pi\circ x=x,\\ 0 & \text{otherwise.}\end{cases} }[/math]

Basically [math]\displaystyle{ A(\pi,x) }[/math] indicates whether [math]\displaystyle{ x }[/math] is invariant under action of [math]\displaystyle{ \pi }[/math]. We can think of [math]\displaystyle{ A }[/math] as a matrix indexed by [math]\displaystyle{ G\times X }[/math]. An important observation is that the row sums and column sums of [math]\displaystyle{ A }[/math] are [math]\displaystyle{ |X_\pi| }[/math] and [math]\displaystyle{ |G_x| }[/math] respectively, namely,

[math]\displaystyle{ |X_\pi|=\sum_{x\in X}A(\pi,x) }[/math] and [math]\displaystyle{ |G_x|=\sum_{\pi\in G}A(\pi,x) }[/math].

Then a double counting gives that

[math]\displaystyle{ \sum_{\pi\in G}|X_\pi|=\sum_{\pi\in G}\sum_{x\in X}A(\pi,x)=\sum_{x\in X}\sum_{\pi\in G}A(\pi,x)=\sum_{x\in X}|G_x| }[/math].

Due to the above lemma, [math]\displaystyle{ |G_x|=\frac{|G|}{|Gx|} }[/math], thus

[math]\displaystyle{ \sum_{x\in X}|G_x|=\sum_{x\in X}\frac{|G|}{|Gx|}=|G|\sum_{x\in X}\frac{1}{|Gx|} }[/math].

We may enumerate the [math]\displaystyle{ x\in X }[/math] in the sum by first enumerating the orbits and then the elements in each orbit. Denote the orbits by [math]\displaystyle{ X_1,X_2,\ldots,X_{|X/G|} }[/math]. They forms a partition of [math]\displaystyle{ X }[/math], and for any [math]\displaystyle{ X_i }[/math] and every [math]\displaystyle{ x\in X_i }[/math], [math]\displaystyle{ X_i=Gx }[/math]. Thus,

[math]\displaystyle{ |G|\sum_{x\in X}\frac{1}{|Gx|}=|G|\sum_{i=1}^{|X/G|}\sum_{x\in X_i}\frac{1}{|Gx|}=|G|\sum_{i=1}^{|X/G|}\sum_{x\in X_i}\frac{1}{|X_i|}=|G|\sum_{i=1}^{|X/G|}1=|G|{|X/G|} }[/math].

Combining everything together

[math]\displaystyle{ \sum_{\pi\in G}|X_\pi|=\sum_{x\in X}|G_x|=|G|\sum_{x\in X}\frac{1}{|Gx|}=|G|{|X/G|} }[/math],

which gives us that the number of orbits is

[math]\displaystyle{ {|X/G|}=\frac{1}{|G|}\sum_{\pi\in G}|X_\pi| }[/math].
[math]\displaystyle{ \square }[/math]

Pólya's Theory of Counting

The cycle index

Suppose we want to count the number of ways to color [math]\displaystyle{ n }[/math] objects using up to [math]\displaystyle{ m }[/math] colors, discounting symmetries on the objects described by a group [math]\displaystyle{ G }[/math]. If a coloring [math]\displaystyle{ x:[n]\rightarrow[m] }[/math] is invariant under the action of a permutation [math]\displaystyle{ \pi\in G }[/math], then every position in the same cycle of [math]\displaystyle{ \pi }[/math] must have the same color. Therefore, if [math]\displaystyle{ \pi }[/math] is decomposed to [math]\displaystyle{ k }[/math] disjoint cycles, the number of colorings invariant under the action of [math]\displaystyle{ \pi }[/math] is [math]\displaystyle{ |X_\pi|=m^k }[/math].

We define the cycle index of a permutation group [math]\displaystyle{ G }[/math]. For a permutation [math]\displaystyle{ \pi\in G }[/math] of [math]\displaystyle{ [n] }[/math], if [math]\displaystyle{ \pi }[/math] is a product of [math]\displaystyle{ k }[/math] cycles, and the [math]\displaystyle{ i }[/math]th cycle has length [math]\displaystyle{ \ell_i }[/math], let

[math]\displaystyle{ M_\pi(x_1,x_2,\ldots,x_n)=\prod_{i=1}^k x_{\ell_i} }[/math].

The cycle index of [math]\displaystyle{ G }[/math] is defined by

[math]\displaystyle{ P_G(x_1,x_2,\ldots,x_n)=\frac{1}{|G|}\sum_{\pi\in G}M_\pi(x_1,x_2,\ldots,x_n) }[/math].

Due to Burnside's lemma, the number of equivalent classes of [math]\displaystyle{ m }[/math]-colorings of [math]\displaystyle{ n }[/math] objects, discounting the symmetries described by permutation group [math]\displaystyle{ G }[/math], is given by

[math]\displaystyle{ P_G(\underbrace{m,m,\ldots,m}_{n}) }[/math]

Pólya's enumeration formula

For any tuple [math]\displaystyle{ \mathbf{v}=(n_1,n_2,\ldots,n_m) }[/math] of nonnegative integers satisfying that [math]\displaystyle{ n_1+n_2+\cdots +n_m=n }[/math], let [math]\displaystyle{ a_{\mathbf{v}} }[/math] represent the number of nonequivalent (with respect to a permutation group [math]\displaystyle{ G }[/math]) [math]\displaystyle{ m }[/math]-colorings of the [math]\displaystyle{ n }[/math] objects, where the [math]\displaystyle{ i }[/math]th color occurs precisely [math]\displaystyle{ n_i }[/math] times. The pattern inventory is the (multivariate) generating function for the sequence [math]\displaystyle{ a_{\mathbf{v}} }[/math], defined as

[math]\displaystyle{ F_G(y_1,y_2,\ldots,y_m)=\sum_{\mathbf{v}}a_{\mathbf{v}}y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m} }[/math],

where the sum runs over all [math]\displaystyle{ \mathbf{v}=(n_1,n_2,\ldots,n_m) }[/math] satisfying that [math]\displaystyle{ n_1+n_2+\cdots +n_m=n }[/math] and [math]\displaystyle{ n_i\ge 0, 1\le i\le m }[/math].

Theorem (Pólya's enumeration formula)
Let [math]\displaystyle{ G }[/math] be a permutation group. The pattern inventory for the nonequivalent [math]\displaystyle{ m }[/math]-colorings of [math]\displaystyle{ n }[/math] objects under the action of [math]\displaystyle{ G }[/math] is:
[math]\displaystyle{ F_G(y_1,y_2,\ldots,y_m)=P_G\left(\sum_{i=1}^m y_i, \sum_{i=1}^m y_i^2,\ldots, \sum_{i=1}^m y_i^n\right) }[/math].
Proof.

Let [math]\displaystyle{ \mathbf{v}=(n_1,\ldots,n_m) }[/math] be a vector of nonnegative integers satisfying that [math]\displaystyle{ n_1+\cdots+n_m=n }[/math], and let [math]\displaystyle{ a_{\mathbf{v}} }[/math] represent the number of nonequivalent (with respect to a permutation group [math]\displaystyle{ G }[/math]) [math]\displaystyle{ m }[/math]-colorings of the [math]\displaystyle{ n }[/math] objects, such that the [math]\displaystyle{ i }[/math]th color occurs precisely [math]\displaystyle{ n_i }[/math] times.

The set of configurations is now

[math]\displaystyle{ X^{\mathbf{v}}=\{x:[n]\rightarrow[m]\mid \forall i\in[m], x^{-1}(i)=n_i\} }[/math],

which contains all [math]\displaystyle{ m }[/math]-colorings of the [math]\displaystyle{ n }[/math] objects such that the [math]\displaystyle{ i }[/math]th color occurs precisely [math]\displaystyle{ n_i }[/math] times, without considering symmetries.

For every [math]\displaystyle{ \pi\in G }[/math], the invariant set is now

[math]\displaystyle{ X_\pi^{\mathbf{v}}=\{x\in X^{\mathbf{v}}\mid \pi\circ x=x\} }[/math].
Claim 1: [math]\displaystyle{ a_{\mathbf{v}}=\frac{1}{|G|}\sum_{\pi\in G}|X_\pi^{\mathbf{v}}| }[/math].
Proof of Claim 1.

For [math]\displaystyle{ \mathbf{v}=(n_1,\ldots,n_m) }[/math], given any permutation [math]\displaystyle{ \pi\in G }[/math], [math]\displaystyle{ |X^{\mathbf{v}}_\pi| }[/math] gives the number of colorings [math]\displaystyle{ x }[/math] in [math]\displaystyle{ X^\mathbf{v} }[/math] invariant under action of [math]\displaystyle{ \pi }[/math], and [math]\displaystyle{ a_\mathbf{v} }[/math] is the number of orbits of [math]\displaystyle{ X^\mathbf{v} }[/math] induced by the action of permutation group [math]\displaystyle{ G }[/math]. Due to Burnside's lemma,

[math]\displaystyle{ a_{\mathbf{v}}=\frac{1}{|G|}\sum_{\pi\in G}|X_\pi^{\mathbf{v}}| }[/math].
[math]\displaystyle{ \square }[/math]
Claim 2: [math]\displaystyle{ M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right)=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}|X^{\mathbf{v}}_\pi|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m} }[/math].
Proof of Claim 2.

Supposed that a permutation [math]\displaystyle{ \pi\in G }[/math] can be represented as a composition of [math]\displaystyle{ k }[/math] cycles, the [math]\displaystyle{ i }[/math]th cycle of length [math]\displaystyle{ \ell_i }[/math].

[math]\displaystyle{ \pi=\overbrace{\underbrace{(\cdots\cdots)}_{\ell_1}\underbrace{(\cdots)}_{\ell_2}\cdots\underbrace{(\cdots)}_{\ell_k}}^{k \text{ cycles}} }[/math]

The permutation [math]\displaystyle{ \pi }[/math] does not change a coloring [math]\displaystyle{ x }[/math], if and only if every cycle of [math]\displaystyle{ \pi }[/math] are assigned the same color. Then [math]\displaystyle{ |X^{\mathbf{v}}_\pi| }[/math] equals the number of [math]\displaystyle{ m }[/math]-colorings of [math]\displaystyle{ k }[/math] cycles, where the [math]\displaystyle{ i }[/math]th cycle is of length [math]\displaystyle{ \ell_i }[/math], satisfying that the [math]\displaystyle{ i }[/math]th color occurs precisely [math]\displaystyle{ n_i }[/math] times. The following generalized version of multinomial theorem generates [math]\displaystyle{ |X^{\mathbf{v}}_\pi| }[/math] for all [math]\displaystyle{ \mathbf{v}=(n_1,\ldots,n_m) }[/math].

[math]\displaystyle{ \begin{align} &\quad\, (y_1^{\ell_1}+y_2^{\ell_1}+\cdots+y_m^{\ell_1})(y_1^{\ell_2}+y_2^{\ell_2}+\cdots+y_m^{\ell_2})\cdots(y_1^{\ell_m}+y_2^{\ell_m}+\cdots+y_m^{\ell_m})\\ &=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}|X^{\mathbf{v}}_\pi|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}. \end{align} }[/math]

On the other hand,

[math]\displaystyle{ \begin{align} &\quad\, (y_1^{\ell_1}+y_2^{\ell_1}+\cdots+y_m^{\ell_1})(y_1^{\ell_2}+y_2^{\ell_2}+\cdots+y_m^{\ell_2})\cdots(y_1^{\ell_m}+y_2^{\ell_m}+\cdots+y_m^{\ell_m})\\ &=M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right). \end{align} }[/math]

Therefore,

[math]\displaystyle{ M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right)=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}|X^{\mathbf{v}}_\pi|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m} }[/math].
[math]\displaystyle{ \square }[/math]

Combining everything together,

[math]\displaystyle{ \begin{align} &\quad\, F_G(y_1,y_2,\ldots,y_m)\\ &=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}a_{\mathbf{v}}y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}\\ &=\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n}\left(\frac{1}{|G|}\sum_{\pi\in G}|X_\pi^{\mathbf{v}}|\right)y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m} &\quad\text{(due to Claim 1)}\\ &=\frac{1}{|G|}\sum_{\pi\in G}\sum_{\mathbf{v}=(n_1,\ldots,n_m)\atop n_1+\cdots+n_m=n} |X_\pi^{\mathbf{v}}|y_1^{n_1}y_2^{n_2}\cdots y_m^{n_m}\\ &=\frac{1}{|G|}\sum_{\pi\in G}M_\pi\left(\sum_{i=1}^m y_i, \sum_{i=1}^my_i^2,\ldots,\sum_{i=1}^my_i^n\right) &\quad\text{(due to Claim 2)}\\ &=P_G\left(\sum_{i=1}^m y_i, \sum_{i=1}^m y_i^2,\ldots, \sum_{i=1}^m y_i^n\right). \end{align} }[/math]
[math]\displaystyle{ \square }[/math]