# Cheeger's Inequality

One of the most exciting results in spectral graph theory is the following theorem, which relates graph expansion to the spectral gap.

 Theorem (Cheeger's inequality) Let ${\displaystyle G}$ be a ${\displaystyle d}$-regular graph with spectrum ${\displaystyle \lambda _{1}\geq \lambda _{2}\geq \cdots \geq \lambda _{n}}$. Then ${\displaystyle {\frac {d-\lambda _{2}}{2}}\leq \phi (G)\leq {\sqrt {2d(d-\lambda _{2})}}.}$

The theorem was first stated for Riemannian manifolds, where the two directions of the inequality were proved by Cheeger and by Buser respectively. The discrete case was proved independently by Dodziuk and by Alon-Milman.

For a ${\displaystyle d}$-regular graph, the quantity ${\displaystyle (d-\lambda _{2})}$ is called the spectral gap. The name comes from the fact that it is the gap between the largest eigenvalue ${\displaystyle \lambda _{1}=d}$ and the second largest eigenvalue ${\displaystyle \lambda _{2}}$ of the graph.

If we write ${\displaystyle \alpha =1-{\frac {\lambda _{2}}{d}}}$ (sometimes called the normalized spectral gap), Cheeger's inequality takes a nicer form:

${\displaystyle {\frac {\alpha }{2}}\leq {\frac {\phi }{d}}\leq {\sqrt {2\alpha }}}$ or equivalently ${\displaystyle {\frac {1}{2}}\left({\frac {\phi }{d}}\right)^{2}\leq \alpha \leq 2\left({\frac {\phi }{d}}\right)}$.
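The inequality can be checked numerically on a small example. The following is a minimal sketch, assuming NumPy is available; the helper names `cycle_adjacency` and `expansion` are illustrative and not part of the notes. It computes ${\displaystyle \phi (G)}$ by brute force over all vertex sets of size at most ${\displaystyle n/2}$, so it is only practical for very small graphs.

```python
import itertools
import numpy as np

def cycle_adjacency(n):
    """Adjacency matrix of the n-cycle, a 2-regular graph (n >= 3)."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

def expansion(A):
    """Brute-force phi(G): min over nonempty S with |S| <= n/2 of |boundary(S)| / |S|."""
    n = A.shape[0]
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for S in itertools.combinations(range(n), size):
            inside = np.zeros(n, dtype=bool)
            inside[list(S)] = True
            boundary = A[inside][:, ~inside].sum()   # edges leaving S
            best = min(best, boundary / size)
    return best

A = cycle_adjacency(8)
d = int(A[0].sum())
lam = np.sort(np.linalg.eigvalsh(A))[::-1]   # eigenvalues in decreasing order
gap = d - lam[1]                             # spectral gap d - lambda_2
phi = expansion(A)
lower, upper = gap / 2, np.sqrt(2 * d * gap)
print(lower, phi, upper)
```

For the 8-cycle the optimal set is an arc of four consecutive vertices, so ${\displaystyle \phi =2/4}$, which lies between the two bounds.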

## Optimization Characterization of Eigenvalues

 Theorem (Rayleigh-Ritz theorem) Let ${\displaystyle A}$ be a symmetric ${\displaystyle n\times n}$ matrix. Let ${\displaystyle \lambda _{1}\geq \lambda _{2}\geq \cdots \geq \lambda _{n}}$ be the eigenvalues of ${\displaystyle A}$ and ${\displaystyle v_{1},v_{2},\ldots ,v_{n}}$ be the corresponding eigenvectors. Then {\displaystyle {\begin{aligned}\lambda _{1}&=\max _{x\in \mathbb {R} ^{n}}{\frac {x^{T}Ax}{x^{T}x}}\end{aligned}}} and {\displaystyle {\begin{aligned}\lambda _{2}&=\max _{x\bot v_{1}}{\frac {x^{T}Ax}{x^{T}x}}.\end{aligned}}} (Both maxima are taken over nonzero vectors ${\displaystyle x}$.)
Proof.
 Without loss of generality, we may assume that ${\displaystyle v_{1},v_{2},\ldots ,v_{n}}$ form an orthonormal eigenbasis. Then it holds that ${\displaystyle {\frac {v_{1}^{T}Av_{1}}{v_{1}^{T}v_{1}}}={\frac {\lambda _{1}v_{1}^{T}v_{1}}{v_{1}^{T}v_{1}}}=\lambda _{1}}$, thus we have ${\displaystyle \max _{x\in \mathbb {R} ^{n}}{\frac {x^{T}Ax}{x^{T}x}}\geq \lambda _{1}}$. Let ${\displaystyle x\in \mathbb {R} ^{n}}$ be an arbitrary nonzero vector and let ${\displaystyle y={\frac {x}{\sqrt {x^{T}x}}}={\frac {x}{\|x\|}}}$ be its normalization. Since ${\displaystyle v_{1},v_{2},\ldots ,v_{n}}$ form an orthonormal basis, ${\displaystyle y}$ can be expressed as ${\displaystyle y=\sum _{i=1}^{n}c_{i}v_{i}}$ with ${\displaystyle \sum _{i=1}^{n}c_{i}^{2}=\|y\|^{2}=1}$. Then {\displaystyle {\begin{aligned}{\frac {x^{T}Ax}{x^{T}x}}&=y^{T}Ay=\left(\sum _{i=1}^{n}c_{i}v_{i}\right)^{T}A\left(\sum _{i=1}^{n}c_{i}v_{i}\right)=\left(\sum _{i=1}^{n}c_{i}v_{i}\right)^{T}\left(\sum _{i=1}^{n}\lambda _{i}c_{i}v_{i}\right)\\&=\sum _{i=1}^{n}\lambda _{i}c_{i}^{2}\leq \lambda _{1}\sum _{i=1}^{n}c_{i}^{2}=\lambda _{1}\|y\|^{2}=\lambda _{1}.\end{aligned}}} Therefore, ${\displaystyle \max _{x\in \mathbb {R} ^{n}}{\frac {x^{T}Ax}{x^{T}x}}\leq \lambda _{1}}$. Altogether we have ${\displaystyle \max _{x\in \mathbb {R} ^{n}}{\frac {x^{T}Ax}{x^{T}x}}=\lambda _{1}}$. The proof of ${\displaystyle \max _{x\bot v_{1}}{\frac {x^{T}Ax}{x^{T}x}}=\lambda _{2}}$ is similar. In the first part take ${\displaystyle x=v_{2}}$ to show that ${\displaystyle \max _{x\bot v_{1}}{\frac {x^{T}Ax}{x^{T}x}}\geq \lambda _{2}}$; and in the second part take an arbitrary nonzero ${\displaystyle x\bot v_{1}}$ and ${\displaystyle y={\frac {x}{\|x\|}}}$. Notice that ${\displaystyle y\bot v_{1}}$, thus ${\displaystyle y=\sum _{i=1}^{n}c_{i}v_{i}}$ with ${\displaystyle c_{1}=0}$, and the same calculation gives ${\displaystyle y^{T}Ay=\sum _{i=2}^{n}\lambda _{i}c_{i}^{2}\leq \lambda _{2}}$.
${\displaystyle \square }$
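As a sanity check of the two characterizations, here is a small sketch, assuming NumPy; the random matrix, the seed, and the helper `rayleigh` are illustrative choices. It samples random vectors and confirms that no Rayleigh quotient exceeds ${\displaystyle \lambda _{1}}$, and that none exceeds ${\displaystyle \lambda _{2}}$ once the ${\displaystyle v_{1}}$ component is projected out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                      # an arbitrary symmetric test matrix

def rayleigh(A, x):
    """Rayleigh quotient x^T A x / x^T x for a nonzero vector x."""
    return (x @ A @ x) / (x @ x)

w, V = np.linalg.eigh(A)               # eigenvalues ascending, orthonormal columns
lam1, lam2 = w[-1], w[-2]
v1, v2 = V[:, -1], V[:, -2]

samples = rng.standard_normal((1000, n))
max_all = max(rayleigh(A, x) for x in samples)
# Subtract the component along v1 so the sample is orthogonal to v1.
max_perp = max(rayleigh(A, x - (x @ v1) * v1) for x in samples)

print(max_all <= lam1, max_perp <= lam2)
```

Random sampling only approaches the maxima from below; the exact values are attained at `v1` and `v2` respectively.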

The Rayleigh-Ritz Theorem is a special case of a fundamental theorem in linear algebra, called the Courant-Fischer theorem, which characterizes the eigenvalues of a symmetric matrix by a series of optimizations:

 Theorem (Courant-Fischer theorem) Let ${\displaystyle A}$ be a symmetric matrix with eigenvalues ${\displaystyle \lambda _{1}\geq \lambda _{2}\geq \cdots \geq \lambda _{n}}$. Then {\displaystyle {\begin{aligned}\lambda _{k}&=\max _{v_{1},v_{2},\ldots ,v_{n-k}\in \mathbb {R} ^{n}}\min _{\overset {x\in \mathbb {R} ^{n},x\neq \mathbf {0} }{x\bot v_{1},v_{2},\ldots ,v_{n-k}}}{\frac {x^{T}Ax}{x^{T}x}}\\&=\min _{v_{1},v_{2},\ldots ,v_{k-1}\in \mathbb {R} ^{n}}\max _{\overset {x\in \mathbb {R} ^{n},x\neq \mathbf {0} }{x\bot v_{1},v_{2},\ldots ,v_{k-1}}}{\frac {x^{T}Ax}{x^{T}x}}.\end{aligned}}}

## Graph Laplacian

Let ${\displaystyle G(V,E)}$ be a ${\displaystyle d}$-regular graph on ${\displaystyle n}$ vertices and let ${\displaystyle A}$ be its adjacency matrix. We define ${\displaystyle L=dI-A}$ to be the Laplacian of the graph ${\displaystyle G}$. If we view ${\displaystyle x\in \mathbb {R} ^{V}}$ as an assignment of real values to the vertices, its Laplacian quadratic form ${\displaystyle x^{T}Lx}$ measures the "smoothness" of ${\displaystyle x}$ over the graph topology, just as the Laplacian operator does for differentiable functions.

 Laplacian Property For any vector ${\displaystyle x\in \mathbb {R} ^{n}}$, it holds that ${\displaystyle x^{T}Lx=\sum _{uv\in E}(x_{u}-x_{v})^{2}}$.
Proof.
 {\displaystyle {\begin{aligned}x^{T}Lx&=\sum _{u,v\in V}x_{u}(dI-A)_{uv}x_{v}\\&=\sum _{u}\left(dx_{u}^{2}-\sum _{uv\in E}x_{u}x_{v}\right)\\&=\sum _{u\in V}\sum _{uv\in E}(x_{u}^{2}-x_{u}x_{v}).\end{aligned}}} On the other hand, {\displaystyle {\begin{aligned}\sum _{uv\in E}(x_{u}-x_{v})^{2}&=\sum _{uv\in E}\left(x_{u}^{2}-2x_{u}x_{v}+x_{v}^{2}\right)\\&=\sum _{uv\in E}\left((x_{u}^{2}-x_{u}x_{v})+(x_{v}^{2}-x_{v}x_{u})\right)\\&=\sum _{u\in V}\sum _{uv\in E}(x_{u}^{2}-x_{u}x_{v}).\end{aligned}}}
${\displaystyle \square }$
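The identity is easy to confirm numerically. The sketch below assumes NumPy and uses the 3-dimensional hypercube as an arbitrary ${\displaystyle 3}$-regular example; the variable names are illustrative.

```python
import numpy as np

k = 3
n, d = 2 ** k, k
# Hypercube Q_3: u and v are adjacent iff they differ in exactly one bit.
edges = [(u, u ^ (1 << b)) for u in range(n) for b in range(k) if u < (u ^ (1 << b))]

A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0
L = d * np.eye(n) - A                  # Laplacian of a d-regular graph

rng = np.random.default_rng(1)
x = rng.standard_normal(n)
quad = x @ L @ x                                       # x^T L x
edge_sum = sum((x[u] - x[v]) ** 2 for u, v in edges)   # sum over edges of (x_u - x_v)^2
print(quad, edge_sum)
```

Each undirected edge appears once in `edges`, matching the sum over ${\displaystyle uv\in E}$ in the statement.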

Applying the Rayleigh-Ritz theorem to the Laplacian matrix of the graph, we have the following "variational characterization" of the spectral gap ${\displaystyle d-\lambda _{2}}$.

 Theorem (Variational Characterization) Let ${\displaystyle G(V,E)}$ be a ${\displaystyle d}$-regular graph of ${\displaystyle n}$ vertices. Suppose that its adjacency matrix is ${\displaystyle A}$, whose eigenvalues are ${\displaystyle \lambda _{1}\geq \lambda _{2}\geq \cdots \geq \lambda _{n}}$. Let ${\displaystyle L=dI-A}$ be the Laplacian matrix. Then {\displaystyle {\begin{aligned}d-\lambda _{2}&=\min _{x\bot {\boldsymbol {1}}}{\frac {x^{T}Lx}{x^{T}x}}=\min _{x\bot {\boldsymbol {1}}}{\frac {\sum _{uv\in E}(x_{u}-x_{v})^{2}}{\sum _{v\in V}x_{v}^{2}}}.\end{aligned}}}
Proof.
 For a ${\displaystyle d}$-regular graph, we know that ${\displaystyle \lambda _{1}=d}$ and ${\displaystyle A{\boldsymbol {1}}=d{\boldsymbol {1}}}$, thus ${\displaystyle {\boldsymbol {1}}}$ is an eigenvector for ${\displaystyle \lambda _{1}}$. By the Rayleigh-Ritz theorem, it holds that ${\displaystyle \lambda _{2}=\max _{x\bot {\boldsymbol {1}}}{\frac {x^{T}Ax}{x^{T}x}}}$. Then {\displaystyle {\begin{aligned}\min _{x\bot {\boldsymbol {1}}}{\frac {x^{T}Lx}{x^{T}x}}&=\min _{x\bot {\boldsymbol {1}}}{\frac {x^{T}(dI-A)x}{x^{T}x}}\\&=\min _{x\bot {\boldsymbol {1}}}{\frac {dx^{T}x-x^{T}Ax}{x^{T}x}}\\&=\min _{x\bot {\boldsymbol {1}}}\left(d-{\frac {x^{T}Ax}{x^{T}x}}\right)\\&=d-\max _{x\bot {\boldsymbol {1}}}{\frac {x^{T}Ax}{x^{T}x}}\\&=d-\lambda _{2}.\end{aligned}}} By the Laplacian property, ${\displaystyle x^{T}Lx=\sum _{uv\in E}(x_{u}-x_{v})^{2}}$, which gives the second equality in the theorem. So the variational characterization of the spectral gap is proved.
${\displaystyle \square }$
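A quick numerical illustration of this characterization, as a sketch assuming NumPy and again using the hypercube ${\displaystyle Q_{3}}$ as an arbitrary example: the second-smallest eigenvalue of ${\displaystyle L}$ equals ${\displaystyle d-\lambda _{2}}$, its eigenvector is orthogonal to ${\displaystyle {\boldsymbol {1}}}$, and random vectors orthogonal to ${\displaystyle {\boldsymbol {1}}}$ never give a smaller quotient.

```python
import numpy as np

n, d = 8, 3
A = np.zeros((n, n))
for u in range(n):
    for b in range(3):
        A[u, u ^ (1 << b)] = 1.0       # hypercube Q_3
L = d * np.eye(n) - A

lam = np.sort(np.linalg.eigvalsh(A))[::-1]   # adjacency eigenvalues, decreasing
gap = d - lam[1]

mu, U = np.linalg.eigh(L)              # Laplacian eigenvalues, increasing
x = U[:, 1]                            # minimizer: eigenvector for the second-smallest one
attained = (x @ L @ x) / (x @ x)

rng = np.random.default_rng(2)
Z = rng.standard_normal((500, n))
Z -= Z.mean(axis=1, keepdims=True)     # subtracting the mean makes each row orthogonal to 1
sampled_min = min((z @ L @ z) / (z @ z) for z in Z)
print(gap, attained, sampled_min)
```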

## Proof of Cheeger's Inequality

We will first give an informal explanation of why Cheeger's inequality holds.

Recall that the expansion is defined as

${\displaystyle \phi (G)=\min _{\overset {S\subset V}{|S|\leq {\frac {n}{2}}}}{\frac {|\partial S|}{|S|}}.}$

Let ${\displaystyle \chi _{S}}$ be the characteristic vector of the set ${\displaystyle S}$ such that

${\displaystyle \chi _{S}(v)={\begin{cases}1&v\in S,\\0&v\not \in S.\end{cases}}}$

It is easy to see that

${\displaystyle {\frac {\sum _{uv\in E}(\chi _{S}(u)-\chi _{S}(v))^{2}}{\sum _{v\in V}\chi _{S}(v)^{2}}}={\frac {|\partial S|}{|S|}}.}$

Thus, the expansion can be expressed algebraically as

${\displaystyle \phi (G)=\min _{\overset {S\subset V}{|S|\leq {\frac {n}{2}}}}{\frac {\sum _{uv\in E}(\chi _{S}(u)-\chi _{S}(v))^{2}}{\sum _{v\in V}\chi _{S}(v)^{2}}}=\min _{\overset {x\in \{0,1\}^{n}}{\|x\|_{1}\leq {\frac {n}{2}}}}{\frac {\sum _{uv\in E}(x_{u}-x_{v})^{2}}{\sum _{v\in V}x_{v}^{2}}}.}$

On the other hand, due to the variational characterization of the spectral gap, we have

${\displaystyle d-\lambda _{2}=\min _{x\bot {\boldsymbol {1}}}{\frac {\sum _{uv\in E}(x_{u}-x_{v})^{2}}{\sum _{v\in V}x_{v}^{2}}}.}$

We can easily observe the similarity between the two formulas. Both the expansion ratio ${\displaystyle \phi (G)}$ and the spectral gap ${\displaystyle d-\lambda _{2}}$ can be characterized by optimizing the same objective function ${\displaystyle {\frac {\sum _{uv\in E}(x_{u}-x_{v})^{2}}{\sum _{v\in V}x_{v}^{2}}}}$ over different domains: for the spectral gap, the optimization is over all ${\displaystyle x\bot {\boldsymbol {1}}}$; for the expansion ratio, it is over all nonzero vectors ${\displaystyle x\in \{0,1\}^{n}}$ with at most ${\displaystyle n/2}$ many 1-entries.
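To make the comparison concrete, the sketch below (assuming NumPy; the function name `quotient` and the choice of the 8-cycle are illustrative) evaluates the same objective on a characteristic vector and on a vector orthogonal to ${\displaystyle {\boldsymbol {1}}}$.

```python
import numpy as np

def quotient(x, edges):
    """Objective sum_{uv in E} (x_u - x_v)^2 / sum_v x_v^2."""
    num = sum((x[u] - x[v]) ** 2 for u, v in edges)
    return num / np.dot(x, x)

n = 8
edges = [(i, (i + 1) % n) for i in range(n)]   # the cycle C_8

# Domain 1: a 0/1 characteristic vector of S with |S| <= n/2.
S = {0, 1, 2, 3}
chi = np.array([1.0 if v in S else 0.0 for v in range(n)])
boundary = sum(1 for u, v in edges if (u in S) != (v in S))
ratio_combinatorial = boundary / len(S)        # |boundary(S)| / |S|
ratio_algebraic = quotient(chi, edges)         # same value via the objective

# Domain 2: a real vector orthogonal to the all-ones vector.
# x_k = cos(2*pi*k/n) is an eigenvector of the cycle for lambda_2 = 2*cos(2*pi/n).
x = np.cos(2 * np.pi * np.arange(n) / n)
relaxed_value = quotient(x, edges)             # equals d - lambda_2 = 2 - sqrt(2)
print(ratio_combinatorial, ratio_algebraic, relaxed_value)
```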

Notations

Throughout the proof, we assume that ${\displaystyle G(V,E)}$ is a ${\displaystyle d}$-regular graph on ${\displaystyle n}$ vertices, ${\displaystyle A}$ is its adjacency matrix, whose eigenvalues are ${\displaystyle \lambda _{1}\geq \lambda _{2}\geq \cdots \geq \lambda _{n}}$, and ${\displaystyle L=(dI-A)}$ is the graph Laplacian.

### Large spectral gap implies high expansion

 Cheeger's inequality (lower bound) ${\displaystyle \phi (G)\geq {\frac {d-\lambda _{2}}{2}}.}$
Proof.
 Let ${\displaystyle S^{*}\subset V}$, ${\displaystyle |S^{*}|\leq {\frac {n}{2}}}$, be the vertex set achieving the optimal expansion ratio ${\displaystyle \phi (G)=\min _{\overset {S\subset V}{|S|\leq {\frac {n}{2}}}}{\frac {|\partial S|}{|S|}}={\frac {|\partial S^{*}|}{|S^{*}|}}}$, and ${\displaystyle x\in \mathbb {R} ^{n}}$ be a vector defined as ${\displaystyle x_{v}={\begin{cases}1/|S^{*}|&{\text{if }}v\in S^{*},\\-1/\left|{\overline {S^{*}}}\right|&{\text{if }}v\in {\overline {S^{*}}}.\end{cases}}}$ Clearly, ${\displaystyle x\cdot {\boldsymbol {1}}=\sum _{v\in S^{*}}{\frac {1}{|S^{*}|}}-\sum _{v\in {\overline {S^{*}}}}{\frac {1}{\left|{\overline {S^{*}}}\right|}}=0}$, thus ${\displaystyle x\bot {\boldsymbol {1}}}$. Due to the variational characterization of the second eigenvalue, {\displaystyle {\begin{aligned}d-\lambda _{2}&\leq {\frac {\sum _{uv\in E}(x_{u}-x_{v})^{2}}{\sum _{v\in V}x_{v}^{2}}}\\&={\frac {\sum _{u\in S^{*},v\in {\overline {S^{*}}},uv\in E}\left(1/|S^{*}|+1/|{\overline {S^{*}}}|\right)^{2}}{1/|S^{*}|+1/|{\overline {S^{*}}}|}}\\&=\left({\frac {1}{|S^{*}|}}+{\frac {1}{\left|{\overline {S^{*}}}\right|}}\right)\cdot |\partial S^{*}|\\&\leq {\frac {2|\partial S^{*}|}{|S^{*}|}}&({\text{since }}|S^{*}|\leq {\frac {n}{2}})\\&=2\phi (G).\end{aligned}}}
${\displaystyle \square }$
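The chain of inequalities in this proof holds for any nonempty set ${\displaystyle S}$ with ${\displaystyle |S|\leq n/2}$, not only the optimal one, except for the final equality with ${\displaystyle 2\phi (G)}$. The sketch below, assuming NumPy and an arbitrarily chosen set on the 8-cycle, builds the test vector from the proof and checks each step.

```python
import numpy as np

n, d = 8, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0    # the cycle C_8
L = d * np.eye(n) - A

in_S = np.zeros(n, dtype=bool)
in_S[[0, 1, 2]] = True                 # an arbitrary set S with |S| <= n/2
size_S, size_rest = in_S.sum(), (~in_S).sum()

# Test vector from the proof: 1/|S| on S and -1/|complement| elsewhere.
x = np.where(in_S, 1.0 / size_S, -1.0 / size_rest)
boundary = A[in_S][:, ~in_S].sum()     # number of edges leaving S

quotient = (x @ L @ x) / (x @ x)
closed_form = (1.0 / size_S + 1.0 / size_rest) * boundary
gap = d - np.sort(np.linalg.eigvalsh(A))[-2]
print(gap, quotient, 2 * boundary / size_S)
```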

### High expansion implies large spectral gap

We next prove the upper bound direction of Cheeger's inequality:

 Cheeger's inequality (upper bound) ${\displaystyle \phi (G)\leq {\sqrt {2d(d-\lambda _{2})}}.}$

This direction is harder than the lower bound direction, but it is mathematically more interesting and also more useful to us for analyzing the mixing time of random walks.

We prove the following equivalent inequality:

${\displaystyle {\frac {\phi ^{2}}{2d}}\leq d-\lambda _{2}.}$

Let ${\displaystyle x}$ satisfy that

• ${\displaystyle Ax=\lambda _{2}x}$, i.e., it is an eigenvector for ${\displaystyle \lambda _{2}}$;
• ${\displaystyle |\{v\in V\mid x_{v}>0\}|\leq {\frac {n}{2}}}$, i.e., ${\displaystyle x}$ has at most ${\displaystyle n/2}$ positive entries. (We can always replace ${\displaystyle x}$ with ${\displaystyle -x}$ if this is not satisfied.)

Since ${\displaystyle x\bot {\boldsymbol {1}}}$ and ${\displaystyle x\neq \mathbf {0} }$, the vector ${\displaystyle x}$ has at least one positive entry. Let the nonnegative (and nonzero) vector ${\displaystyle y}$ be defined as

${\displaystyle y_{v}={\begin{cases}x_{v}&x_{v}>0,\\0&{\text{otherwise.}}\end{cases}}}$

We then prove the following inequalities:

1. ${\displaystyle {\frac {y^{T}Ly}{y^{T}y}}\leq d-\lambda _{2}}$;
2. ${\displaystyle {\frac {\phi ^{2}}{2d}}\leq {\frac {y^{T}Ly}{y^{T}y}}}$.

The theorem then follows by combining these two inequalities.

We prove the first inequality:

 Lemma ${\displaystyle {\frac {y^{T}Ly}{y^{T}y}}\leq d-\lambda _{2}}$.
Proof.
 If ${\displaystyle x_{u}\geq 0}$, then {\displaystyle {\begin{aligned}(Ly)_{u}&=((dI-A)y)_{u}=dy_{u}-\sum _{v}A(u,v)y_{v}=dx_{u}-\sum _{v:x_{v}\geq 0}A(u,v)x_{v}\\&\leq dx_{u}-\sum _{v}A(u,v)x_{v}=((dI-A)x)_{u}=(d-\lambda _{2})x_{u}.\end{aligned}}} Then {\displaystyle {\begin{aligned}y^{T}Ly&=\sum _{u}y_{u}(Ly)_{u}=\sum _{u:x_{u}\geq 0}y_{u}(Ly)_{u}=\sum _{u:x_{u}\geq 0}x_{u}(Ly)_{u}\\&\leq (d-\lambda _{2})\sum _{u:x_{u}\geq 0}x_{u}^{2}=(d-\lambda _{2})\sum _{u}y_{u}^{2}=(d-\lambda _{2})y^{T}y,\end{aligned}}} which proves the lemma.
${\displaystyle \square }$
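The lemma can be observed numerically. The following sketch assumes NumPy and uses the 4-dimensional hypercube as an arbitrary ${\displaystyle 4}$-regular example; the cleanup threshold `1e-12` is an implementation choice to suppress floating-point noise in entries that should be zero, not part of the proof.

```python
import numpy as np

k = 4
n, d = 2 ** k, k
A = np.zeros((n, n))
for u in range(n):
    for b in range(k):
        A[u, u ^ (1 << b)] = 1.0       # hypercube Q_4
L = d * np.eye(n) - A

w, V = np.linalg.eigh(A)               # eigenvalues ascending
lam2 = w[-2]
x = V[:, -2].copy()                    # an eigenvector for lambda_2
x[np.abs(x) < 1e-12] = 0.0             # treat numerical noise as exact zeros
if (x > 0).sum() > n / 2:
    x = -x                             # ensure at most n/2 positive entries

y = np.where(x > 0, x, 0.0)            # positive part of x
lhs = (y @ L @ y) / (y @ y)
print(lhs, d - lam2)
```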

We then prove the second inequality:

 Lemma ${\displaystyle {\frac {\phi ^{2}}{2d}}\leq {\frac {y^{T}Ly}{y^{T}y}}}$.
Proof.

To prove this, we introduce a new quantity ${\displaystyle {\frac {\sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|}{y^{T}y}}}$ and show that

${\displaystyle \phi \leq {\frac {\sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|}{y^{T}y}}\leq {\sqrt {2d}}\cdot {\sqrt {\frac {y^{T}Ly}{y^{T}y}}}}$.

This will give us the desired inequality ${\displaystyle {\frac {\phi ^{2}}{2d}}\leq {\frac {y^{T}Ly}{y^{T}y}}}$.

 Lemma ${\displaystyle {\frac {\sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|}{y^{T}y}}\leq {\sqrt {2d}}\cdot {\sqrt {\frac {y^{T}Ly}{y^{T}y}}}}$.
Proof.
 By the Cauchy-Schwarz Inequality, {\displaystyle {\begin{aligned}\sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|&=\sum _{uv\in E}|y_{u}-y_{v}||y_{u}+y_{v}|\\&\leq {\sqrt {\sum _{uv\in E}(y_{u}-y_{v})^{2}}}\cdot {\sqrt {\sum _{uv\in E}(y_{u}+y_{v})^{2}}}.\end{aligned}}} By the Laplacian property, the first term ${\displaystyle {\sqrt {\sum _{uv\in E}(y_{u}-y_{v})^{2}}}={\sqrt {y^{T}Ly}}}$. By the Inequality of Arithmetic and Geometric Means, the second term ${\displaystyle {\sqrt {\sum _{uv\in E}(y_{u}+y_{v})^{2}}}\leq {\sqrt {2\sum _{uv\in E}(y_{u}^{2}+y_{v}^{2})}}={\sqrt {2d\sum _{u\in V}y_{u}^{2}}}={\sqrt {2dy^{T}y}}.}$ Combining them together, we have ${\displaystyle \sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|\leq {\sqrt {2d}}\cdot {\sqrt {y^{T}Ly}}\cdot {\sqrt {y^{T}y}}}$.
${\displaystyle \square }$
 Lemma ${\displaystyle \phi \leq {\frac {\sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|}{y^{T}y}}}$.
Proof.
 Suppose that ${\displaystyle y}$ has ${\displaystyle t}$ nonzero entries. We know that ${\displaystyle t\leq n/2}$ due to the definition of ${\displaystyle y}$. We enumerate the vertices ${\displaystyle u_{1},u_{2},\ldots ,u_{n}\in V}$ such that ${\displaystyle y_{u_{1}}\geq y_{u_{2}}\geq \cdots \geq y_{u_{t}}>y_{u_{t+1}}=\cdots =y_{u_{n}}=0}$. Then, writing each difference as a telescoping sum, {\displaystyle {\begin{aligned}\sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|&=\sum _{u_{i}u_{j}\in E \atop i<j}(y_{u_{i}}^{2}-y_{u_{j}}^{2})=\sum _{u_{i}u_{j}\in E \atop i<j}\sum _{k=i}^{j-1}(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})\\&=\sum _{i=1}^{n}\sum _{j>i}A(u_{i},u_{j})\sum _{k=i}^{j-1}(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})=\sum _{i=1}^{n}\sum _{j>i}\sum _{k=i}^{j-1}A(u_{i},u_{j})(y_{u_{k}}^{2}-y_{u_{k+1}}^{2}).\end{aligned}}} Exchanging the order of summation (every term has ${\displaystyle i\leq k<j}$), and noting that ${\displaystyle y_{u_{k}}^{2}-y_{u_{k+1}}^{2}=0}$ for ${\displaystyle k>t}$, we get {\displaystyle {\begin{aligned}\sum _{i=1}^{n}\sum _{j>i}\sum _{k=i}^{j-1}A(u_{i},u_{j})(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})&=\sum _{k=1}^{n-1}\sum _{i\leq k}\sum _{j>k}A(u_{i},u_{j})(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})\\&=\sum _{k=1}^{t}(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})\sum _{i\leq k}\sum _{j>k}A(u_{i},u_{j}).\end{aligned}}} Notice that ${\displaystyle \sum _{i\leq k}\sum _{j>k}A(u_{i},u_{j})=|\partial \{u_{1},\ldots ,u_{k}\}|}$, which is at least ${\displaystyle \phi k}$ by the definition of ${\displaystyle \phi }$, since ${\displaystyle k\leq t\leq n/2}$. Since every coefficient ${\displaystyle y_{u_{k}}^{2}-y_{u_{k+1}}^{2}}$ is nonnegative, combining these together, we have {\displaystyle {\begin{aligned}\sum _{uv\in E}|y_{u}^{2}-y_{v}^{2}|&=\sum _{k=1}^{t}(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})\sum _{i\leq k}\sum _{j>k}A(u_{i},u_{j})\\&=\sum _{k=1}^{t}(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})|\partial \{u_{1},\ldots ,u_{k}\}|\\&\geq \phi \sum _{k=1}^{t}(y_{u_{k}}^{2}-y_{u_{k+1}}^{2})k\\&=\phi \sum _{k=1}^{t}y_{u_{k}}^{2}\\&=\phi y^{T}y,\end{aligned}}} where the second-to-last equality is again a telescoping sum, using ${\displaystyle y_{u_{t+1}}=0}$.
${\displaystyle \square }$
${\displaystyle \square }$
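The last lemma only involves the threshold sets ${\displaystyle \{u_{1},\ldots ,u_{k}\}}$ with ${\displaystyle k\leq t}$, so the same argument with ${\displaystyle \phi }$ replaced by ${\displaystyle \min _{k\leq t}|\partial \{u_{1},\ldots ,u_{k}\}|/k}$ shows that the best of these sets already has expansion ratio at most ${\displaystyle {\sqrt {2d(d-\lambda _{2})}}}$. This suggests a simple "sweep" procedure, sketched below under the assumption that NumPy is available; the function name `sweep_cut`, the 16-cycle test graph, and the `1e-12` cleanup threshold are illustrative choices rather than part of the notes.

```python
import numpy as np

def sweep_cut(A, y):
    """Among sets {u_1..u_k} (vertices sorted by decreasing y, k <= number of
    positive entries of y), return the smallest ratio |boundary| / k and the set."""
    order = np.argsort(-y)             # positive entries come first since y >= 0
    t = int((y > 0).sum())
    inside = np.zeros(len(y), dtype=bool)
    best_ratio, best_k = float("inf"), 0
    for k in range(1, t + 1):
        inside[order[k - 1]] = True    # grow the threshold set by one vertex
        ratio = A[inside][:, ~inside].sum() / k
        if ratio < best_ratio:
            best_ratio, best_k = ratio, k
    return best_ratio, order[:best_k]

n = 16
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0    # the cycle C_16
d = int(A[0].sum())

w, V = np.linalg.eigh(A)               # eigenvalues ascending
gap = d - w[-2]
x = V[:, -2].copy()                    # an eigenvector for lambda_2
x[np.abs(x) < 1e-12] = 0.0             # treat numerical noise as exact zeros
if (x > 0).sum() > n / 2:
    x = -x
y = np.where(x > 0, x, 0.0)

ratio, S = sweep_cut(A, y)
print(gap / 2, ratio, np.sqrt(2 * d * gap))
```

Unlike the brute-force definition of ${\displaystyle \phi (G)}$, the sweep examines at most ${\displaystyle n/2}$ candidate sets after one eigenvector computation.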