# 组合数学 (Spring 2014)/Cayley's formula

## Cayley's Formula

We now present a theorem of the number of labeled trees on a fixed number of vertices. It is due to Cayley in 1889. The theorem is often referred by the name Cayley's formula.

 Cayley's formula for trees There are ${\displaystyle n^{n-2}}$ different trees on ${\displaystyle n}$ distinct vertices.

The theorem has several proofs, including the bijection which encodes a tree by a Prüfer code, through the Kirchhoff's matrix tree theorem, and by double counting.

### Proof of Cayley's formula by double counting

We now present a double counting proof, which is considered by the Proofs from THE BOOK "the most beautiful of them all".

Proof of Cayley's formula by double counting
 (Due to Pitman 1999) Let ${\displaystyle T_{n}}$ be the number of different trees defined on ${\displaystyle n}$ distinct vertices. A rooted tree is a tree with a special vertex. That is, one of the ${\displaystyle n}$ vertices is marked as the "root" of the tree. A rooted tree defines a natural direction of all edges, such that an edge ${\displaystyle uv}$ of the tree is directed from ${\displaystyle u}$ to ${\displaystyle v}$ if ${\displaystyle u}$ is before ${\displaystyle v}$ along the unique path from the root. We count the number of different sequences of directed edges that can be added to an empty graph on ${\displaystyle n}$ vertices to form from it a rooted tree. We note that such a sequence can be formed in two ways: Starting with an unrooted tree, choose one of its vertices as root, and fix an total order of edges to specify the order in which the edges are added. Starting from an empty graph, add the edges one by one in steps. In the first method, we pick one of the ${\displaystyle T_{n}}$ unrooted trees, choose one of the ${\displaystyle n}$ vertices as the root, and pick one of the ${\displaystyle (n-1)!}$ total orders of the ${\displaystyle n-1}$ edges. This gives us ${\displaystyle T_{n}n(n-1)!=T_{n}n!}$ ways. In the second method, we consider the number of choices in one step, and multiply the numbers of choices in all steps. This is done as follows. Given a sequence of adding ${\displaystyle n-1}$ edges to an empty graph to form a rooted tree, we reverse this sequence and get a sequence of removing edges one by one from the final rooted tree until no edge left. We observe that: At first, we remove an edge from the rooted tree. Suppose that the root of the tree is ${\displaystyle r}$, and the removed directed edge is ${\displaystyle (u,v)}$. After removing ${\displaystyle (u,v)}$, the original rooted tree is disconnected into two rooted trees, one rooted at ${\displaystyle r}$ and the other rooted at ${\displaystyle v}$. After removing ${\displaystyle k-1}$ edges, there are ${\displaystyle k}$ rooted trees. In the ${\displaystyle k}$th step, a directed edge ${\displaystyle (u,v)}$ in the current forest is removed and the tree containing ${\displaystyle (u,v)}$ is disconnected into two trees, one rooted at the old root of that tree, and the other rooted at ${\displaystyle v}$. We now again reverse the above procedure, and consider the sequence of adding directed edges to an empty graph to form a rooted tree. At first, we have ${\displaystyle n}$ rooted trees, each of 0 edge (${\displaystyle n}$ isolated vertices). After adding ${\displaystyle n-k}$ edges, there are ${\displaystyle k}$ rooted trees. Denoting the directed edge added next as ${\displaystyle (u,v)}$. As observed above, ${\displaystyle u}$ can be any one of the ${\displaystyle n}$ vertices; but ${\displaystyle v}$ must be the root of one of the ${\displaystyle k}$ trees, except the tree which contains ${\displaystyle u}$. There are ${\displaystyle n(k-1)}$ choices of such ${\displaystyle (u,v)}$. Multiplying the numbers of choices in all steps, the number of sequences of adding directed edges to an empty graph to form a rooted tree is given by ${\displaystyle \prod _{k=2}^{n}n(k-1)=n^{n-2}n!}$. By the principle of double counting, counting the same thing by different methods yield the same result. ${\displaystyle T_{n}n!=n^{n-2}n!}$, which gives that ${\displaystyle T_{n}=n^{n-2}}$.
${\displaystyle \square }$

## Prüfer code

The Prüfer code encodes a labeled tree to a sequence of labels. This gives a bijections between trees and tuples.

### Encoding

In a tree, the vertices of degree 1 are called leaves. It is easy to see that:

• each tree has at least two leaves; and
• after removing a leaf (along with the edge adjacent to it) from a tree, the resulting graph is still a tree.

The following algorithm transforms a tree ${\displaystyle T}$ of ${\displaystyle n}$ vertices ${\displaystyle 1,2,\ldots ,n}$, to a tuple ${\displaystyle (v_{1},v_{2},\ldots ,v_{n-2})\in \{1,2,\ldots ,n\}^{n-2}}$.

 Prüfer code (encoder) Input: A tree ${\displaystyle T}$ of ${\displaystyle n}$ distinct vertices, labeled by ${\displaystyle 1,2,\ldots ,n}$. let ${\displaystyle T_{1}=T}$; for ${\displaystyle i=1}$ to ${\displaystyle n-1}$, do let ${\displaystyle u_{i}}$ be the leaf in ${\displaystyle T_{i}}$ with the smallest label, and ${\displaystyle v_{i}}$ be its neighbor; let ${\displaystyle T_{i+1}}$ be the new tree obtained from deleting the leaf ${\displaystyle u_{i}}$ from ${\displaystyle T_{i}}$; end return ${\displaystyle (v_{1},v_{2},\ldots ,v_{n-2})}$;

### Decoding

It is trivial to observe the following lemma:

 Lemma 1 For each ${\displaystyle 1\leq i\leq n-1}$, ${\displaystyle T_{i}}$ is a tree of ${\displaystyle n-i+1}$ vertices. In particular, the vertices of ${\displaystyle T_{i}}$ are ${\displaystyle u_{i},u_{i+1},\ldots ,u_{n-1},v_{n-1}}$, and the edges of ${\displaystyle T_{i}}$ are precisely ${\displaystyle \{u_{j},v_{j}\}}$, ${\displaystyle i\leq j\leq n-1}$.

And there is a reason that we do not need to store ${\displaystyle v_{n-1}}$ in the Prüfer code.

 Lemma 2 It always holds that ${\displaystyle v_{n-1}=n}$.
Proof.
 Every tree (of at least two vertices) has at least two leaves. The ${\displaystyle u_{i}}$, ${\displaystyle 1\leq i\leq n-1}$, are the leaf of the smallest label in ${\displaystyle T_{i}}$, which can never be ${\displaystyle n}$, thus ${\displaystyle n}$ is never deleted.
${\displaystyle \square }$

Lemma 1 and 2 together imply that given a Prüfer code ${\displaystyle (v_{1},v_{2},\ldots ,v_{n-2})}$, the only remaining task to reconstruct the tree ${\displaystyle T}$ is to figure out those ${\displaystyle u_{i}}$, ${\displaystyle 1\leq i\leq n-1}$. The following lemma state how to obtain ${\displaystyle u_{i}}$, ${\displaystyle 1\leq i\leq n-1}$, from a Prüfer code ${\displaystyle (v_{1},v_{2},\ldots ,v_{n-2})}$.

 Lemma 3 For ${\displaystyle i=1,2,\ldots ,n-1}$, ${\displaystyle u_{i}}$ is the smallest element of ${\displaystyle \{1,2,\ldots ,n\}}$ not in ${\displaystyle \{u_{1},\ldots ,u_{i-1}\}\cup \{v_{i},\ldots ,v_{n-1}\}}$.
Proof.
 Note that ${\displaystyle u_{1},u_{2},\ldots ,u_{n-1},v_{n-1}}$ is a sequence of distinct vertices, because ${\displaystyle u_{1},u_{2},\ldots ,u_{n-1}}$ are deleted one by one from the tree, and ${\displaystyle v_{n-1}=n}$ is never deleted. Thus, each vertex ${\displaystyle v}$ appears among ${\displaystyle u_{1},u_{2},\ldots ,u_{n-1},v_{n-1}}$ exactly once. And each vertex ${\displaystyle v}$ appears for ${\displaystyle deg(v)}$ times among the edges ${\displaystyle \{u_{i},v_{i}\}}$, ${\displaystyle 1\leq i\leq n-1}$, where ${\displaystyle deg(v)}$ denotes the degree of vertex ${\displaystyle v}$ in the original tree ${\displaystyle T}$. Therefore, each vertex ${\displaystyle v}$ appears among ${\displaystyle v_{1},v_{2},\ldots ,v_{n-2}}$ for ${\displaystyle deg(v)-1}$ times. Similarly, each vertex ${\displaystyle v}$ of ${\displaystyle T_{i}}$ appears among ${\displaystyle v_{i},v_{i+1},\ldots ,v_{n-2}}$ for ${\displaystyle deg_{i}(v)-1}$ times, where ${\displaystyle deg_{i}(v)}$ is the degree of vertex ${\displaystyle v}$ in the tree ${\displaystyle T_{i}}$. In particular, the leaves of ${\displaystyle T_{i}}$ are not among ${\displaystyle \{v_{i},v_{i+1},\ldots ,v_{n-2}\}}$. Recall that the vertices of ${\displaystyle T_{i}}$ are ${\displaystyle u_{i},u_{i+1},\ldots ,u_{n-1},v_{n-1}}$. Then the leaves of ${\displaystyle T_{i}}$ are the elements of ${\displaystyle \{1,2,\ldots ,n\}}$ not in ${\displaystyle \{u_{1},\ldots ,u_{i-1}\}\cup \{v_{i},\ldots ,v_{n-1}\}}$. By definition of Prüfer code, ${\displaystyle u_{i}}$ is the leaf in ${\displaystyle T_{i}}$ of smallest label, hence the smallest element of ${\displaystyle \{1,2,\ldots ,n\}}$ not in ${\displaystyle \{u_{1},\ldots ,u_{i-1}\}\cup \{v_{i},\ldots ,v_{n-1}\}}$.
${\displaystyle \square }$

Applying Lemma 3, we have the following decoder for the Prüfer code:

 Prüfer code (decoder) Input: A tuple ${\displaystyle (v_{1},v_{2},\ldots ,v_{n-2})\in \{1,2,\ldots ,n\}^{n-2}}$. let ${\displaystyle T}$ be empty graph, and ${\displaystyle v_{n-1}=n}$; for ${\displaystyle i=1}$ to ${\displaystyle n-1}$, do let ${\displaystyle u_{i}}$ be the smallest label not in ${\displaystyle \{u_{1},\ldots ,u_{i-1}\}\cup \{v_{i},\ldots ,v_{n-1}\}}$; add an edge ${\displaystyle \{u_{i},v_{i}\}}$ to ${\displaystyle T}$; end return ${\displaystyle T}$;

In other words, the encoding of trees to tuples by the Prüfer code is reversible, thus the mapping is injective (1-1). To see it is also surjective, we need to show that for every possible ${\displaystyle (v_{1},v_{2},\ldots ,v_{n-2})\in \{1,2,\ldots ,n\}^{n-2}}$, the above decoder recovers a tree from it.

It is easy to see that the decoder always returns a graph of ${\displaystyle n-1}$ edges on the ${\displaystyle n}$ vertices. The only thing remaining to verify is that the returned graph has no cycle in it, which can be easily proved by a timeline argument (left as an exercise).

### Bijection proof of Cayley's formula

Therefore, the Prüfer code establishes a bijection between the set of trees on ${\displaystyle n}$ distinct vertices and the tuples from ${\displaystyle \{1,2,\ldots ,n\}^{n-2}}$. This proves Cayley's formula.

## Kirchhoff's Matrix-Tree Theorem

Given an undirected graph ${\displaystyle G([n],E)}$, the adjacency matrix ${\displaystyle A}$ of graph ${\displaystyle G}$ is an ${\displaystyle n\times n}$ matrix such that

${\displaystyle A(i,j)={\begin{cases}1&\{i,j\}\in E,\\0&\{i,j\}\not \in E.\end{cases}}}$

### Graph Laplacian

Let ${\displaystyle D}$ be a ${\displaystyle n\times n}$ diagonal matrix such that

${\displaystyle D(i,j)={\begin{cases}{\text{deg}}(i)&i=j,\\0&i\neq j,\end{cases}}}$

where ${\displaystyle {\text{deg}}(i)}$ denotes the degree of vertex ${\displaystyle i}$.

The Laplacian matrix ${\displaystyle L}$ of graph ${\displaystyle G}$ is defined as ${\displaystyle L=D-A}$, that is,

${\displaystyle L(i,j)={\begin{cases}{\text{deg}}(i)&i=j,\\-1&i\neq j{\text{ and }}\{i,j\}\in E,\\0&{\text{otherwise}}.\end{cases}}}$

Suppose ${\displaystyle G([n],E)}$ has ${\displaystyle m}$ edges. The incidence matrix ${\displaystyle B}$ of graph ${\displaystyle G}$ is an ${\displaystyle n\times m}$ matrix such that

${\displaystyle \forall i\in [n],\forall e\in E,\quad B(i,e)={\begin{cases}1&e=\{i,j\}{\text{ and }}ij,\\0&{\text{otherwise}}.\end{cases}}}$

The following proposition is easy to verify.

 Proposition ${\displaystyle L=BB^{T}}$.
Proof.
 For any ${\displaystyle i,j\in [n]}$, we have ${\displaystyle (BB^{T})(i,j)=\sum _{e\in E}B(i,e)B^{T}(e,j)=\sum _{e\in E}B(i,e)B(j,e)}$. It is easy to verify that ${\displaystyle \sum _{e\in E}B(i,e)B(j,e)={\begin{cases}{\text{deg}}(i)&i=j,\\-1&i\neq j{\text{ and }}\{i,j\}\in E,\\0&{\text{otherwise}},\end{cases}}}$ which is equal to the definition of ${\displaystyle L}$.
${\displaystyle \square }$

### The Matrix-tree Theorem

The matrix-tree theorem of Kirchhoff states a striking fact: The number of spanning trees in any connected graph can be computed as the determinant of some appropriate graph matrix.

 Kirchhoff's Matrix-Tree Theorem For any connected graph ${\displaystyle G}$ of ${\displaystyle n}$ vertices, the number of spanning trees in ${\displaystyle G}$ is ${\displaystyle \det(L_{i,i})\,}$ for all ${\displaystyle i\in [n]}$, where ${\displaystyle L_{i,i}\,}$ is the ${\displaystyle (n-1)\times (n-1)}$ matrix resulting from the Laplacian matrix ${\displaystyle L}$ of graph ${\displaystyle G}$ by deleting the ${\displaystyle i}$-th row and the ${\displaystyle i}$-th column.

The determinant can be computed as fast as matrix multiplication, thus is quite efficient, especially when compared to our task: counting the number of subgraphs satisfying certain nontrivial global constraint (e.g. spanning tree). Such efficient algorithm is rarely seen for counting problems, which are usually #P-hard to compute (e.g. number of matchings in a graph).

The key to prove the matrix-tree theorem is the Cauchy-Binet theorem in linear algebra, whose proof is beyond the scope of this class.

 Cauchy-Binet Theorem Let ${\displaystyle A}$ and ${\displaystyle B}$ be, respectively, ${\displaystyle n\times m}$ and ${\displaystyle m\times n}$ matrix. For any ${\displaystyle S\subseteq [m]}$ of size ${\displaystyle n}$, let ${\displaystyle A_{[n],S}}$ and ${\displaystyle B_{S,[n]}}$ denote, respectively, the ${\displaystyle n\times n}$ submatrices of ${\displaystyle A}$ and ${\displaystyle B}$, consisting of the columns of ${\displaystyle A}$, or the rows of ${\displaystyle B}$, indexed by elements of ${\displaystyle S}$. Then ${\displaystyle \det(AB)=\sum _{S\in {[m] \choose n}}\det(A_{[n],S})\det(B_{S,[n]}).}$

Let ${\displaystyle B}$ be the incidence matrix of graph ${\displaystyle G}$. Fix any vertex ${\displaystyle i}$, let ${\displaystyle C}$ be the ${\displaystyle (n-1)\times m}$ matrix resulting from ${\displaystyle B}$ by deleting the ${\displaystyle i}$-th row.

Recall that ${\displaystyle L=BB^{T}}$. It is easy to verify that

${\displaystyle L_{i,i}=CC^{T}.}$

Due to the Cauchy-Binet Theorem, we have

{\displaystyle {\begin{aligned}\det(L_{i,i})=\det(CC^{T})&=\sum _{S\in {[m] \choose n-1}}\det(C_{[n-1],S})\det(C_{S,[n-1]}^{T})\\&=\sum _{S\in {[m] \choose n-1}}\det(C_{[n-1],S})^{2}.\end{aligned}}}

The next lemma gives a key observation to prove the matrix-tree theorem.

 Lemma For any ${\displaystyle S\subseteq [m]}$ of size ${\displaystyle n-1}$, the value of ${\displaystyle \det(C_{[n-1],S})}$ is either 1, or -1, or 0. Moreover, ${\displaystyle \det(C_{[n-1],S})=\pm 1}$ if and only if ${\displaystyle S}$ indicates a spanning tree in ${\displaystyle G}$.
Proof.
 We first show ${\displaystyle \det(C_{[n-1],S})\in \{0,1,-1\}}$ by induction on ${\displaystyle n}$. Note that ${\displaystyle C_{[n-1],S}}$ is an ${\displaystyle (n-1)\times (n-1)}$ matrix such that each column contains at most one 1 and at most one -1, and all other entries are 0. For such matrix, when ${\displaystyle n-1=1}$, the induction hypothesis ${\displaystyle \det(C_{[n-1],S})\in \{0,1,-1\}}$ is trivially true. And for general ${\displaystyle n}$, if every column has a 1 and a -1, then the sum of all rows is the zero vector, so the matrix is singular. Otherwise, expand the determinant by a column with one nonzero entry to find it is ${\displaystyle \pm 1}$ times the determinant of a smaller matrix of the same property, which by induction hypothesis has value 0 or ${\displaystyle \pm 1}$. We then show that ${\displaystyle \det(C_{[n-1],S})}$ is nonzero if and only if ${\displaystyle S}$ is a spanning tree. Suppose the ${\displaystyle n-1}$ edges corresponding to ${\displaystyle S}$ is not a spanning tree of ${\displaystyle G}$. Then ${\displaystyle S}$ must have more than one components, and there must be a component ${\displaystyle R}$ not containing vertex ${\displaystyle i}$. The rows of ${\displaystyle C_{[n-1],S}}$ corresponding to ${\displaystyle R}$ add to 0, thus these rows are linearly dependent, and hence ${\displaystyle \det(C_{[n-1],S})=0}$. Suppose the ${\displaystyle n-1}$ edges corresponding to ${\displaystyle S}$ is a spanning tree of ${\displaystyle G}$. Then there is a vertex ${\displaystyle j_{1}\neq i}$ of degree 1 in ${\displaystyle S}$; let ${\displaystyle e_{1}}$ be the edge in ${\displaystyle S}$ incident to ${\displaystyle j_{1}}$. Deleting vertex ${\displaystyle j_{1}}$ and edge ${\displaystyle e_{1}}$ from ${\displaystyle S}$, we obtain a tree of ${\displaystyle n-2}$ edges. Again there is a vertex ${\displaystyle j_{2}\neq i}$ of degree 1 with incident edge ${\displaystyle e_{2}}$. Continue this we enumerate all vertices except ${\displaystyle i}$ as ${\displaystyle j_{1},j_{2},\ldots ,j_{n-1}}$ and all edges in ${\displaystyle S}$ as ${\displaystyle e_{1},e_{2},\ldots ,e_{n-1}}$. Now permute the rows and columns of ${\displaystyle C_{[n-1],S}}$ such that the ${\displaystyle k}$-th row corresponds to vertex ${\displaystyle j_{k}}$ and the ${\displaystyle k}$-th column corresponds to ${\displaystyle e_{k}}$. By our construction the permuted ${\displaystyle C_{[n-1],S}}$ is lower triangle with ${\displaystyle \pm 1}$ diagonal entries, since when each ${\displaystyle j_{k}}$ is removed it is of degree 1 in the remaining tree and is incident to edge ${\displaystyle e_{k}}$. Therefore, ${\displaystyle \det(C_{[n-1],S})=\pm 1}$.
${\displaystyle \square }$

The matrix-tree theorem follows as consequence. Recall we show that

{\displaystyle {\begin{aligned}\det(L_{i,i})&=\sum _{S\in {[m] \choose n-1}}\det(C_{[n-1],S})^{2}.\end{aligned}}}

The sum enumerates over all subgraphs ${\displaystyle S}$ of ${\displaystyle n-1}$ edges, and by the above lemma ${\displaystyle \det(C_{[n-1],S})^{2}=1}$ if and only if ${\displaystyle S}$ is a spanning tree in ${\displaystyle G}$. Therefore, ${\displaystyle \det(L_{i,i})}$ gives the number of spanning trees in ${\displaystyle G}$.

### Cayley's formula by the matrix-tree theorem

The number of trees of ${\displaystyle n}$ distinct vertices equals the number of spanning trees in the complete graph ${\displaystyle K_{n}}$. For ${\displaystyle G=K_{n}}$, for any ${\displaystyle i\in [n]}$ the ${\displaystyle (n-1)\times (n-1)}$ matrix ${\displaystyle L_{ii}}$ is given by

${\displaystyle L_{i,i}={\begin{bmatrix}n-1&-1&\cdots &-1\\-1&n-1&\cdots &-1\\\vdots &\vdots &\ddots &-1\\-1&-1&\cdots &n-1\end{bmatrix}},}$

whose determinant is ${\displaystyle \det(L_{i,i})=n^{n-2}}$.