随机算法 (Fall 2011)/The Spectral Gap

It turns out that the second largest eigenvalue of a graph contains important information about the graph's expansion parameter. The following theorem is the so-called Cheeger's inequality.

Theorem (Cheeger's inequality)
Let [math]\displaystyle{ G }[/math] be a [math]\displaystyle{ d }[/math]-regular graph with spectrum [math]\displaystyle{ \lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n }[/math]. Then
[math]\displaystyle{ \frac{d-\lambda_2}{2}\le \phi(G) \le \sqrt{2d(d-\lambda_2)}. }[/math]

The theorem was first stated for Riemannian manifolds, and was proved by Cheeger and by Buser (each for one direction of the inequality). The discrete case was proved independently by Dodziuk and by Alon and Milman.

For a [math]\displaystyle{ d }[/math]-regular graph, the value [math]\displaystyle{ (d-\lambda_2) }[/math] is known as the spectral gap. The name is due to the fact that it is the gap between the first and the second eigenvalue in the spectrum of a graph. The spectral gap provides an estimate on the expansion ratio of a graph. More precisely, a [math]\displaystyle{ d }[/math]-regular graph has large expansion ratio (thus being an expander) if the spectral gap is large.

If we write [math]\displaystyle{ \alpha=1-\frac{\lambda_2}{d} }[/math] (sometimes called the normalized spectral gap), Cheeger's inequality takes a nicer form:

[math]\displaystyle{ \frac{\alpha}{2}\le \frac{\phi}{d}\le\sqrt{2\alpha} }[/math] or equivalently [math]\displaystyle{ \frac{1}{2}\left(\frac{\phi}{d}\right)^2\le \alpha\le 2\left(\frac{\phi}{d}\right) }[/math].
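To get a feel for the two bounds, here is a minimal numerical sketch, assuming Python with numpy (the choice of the cycle C_8, a 2-regular graph, is just for illustration): it computes the spectral gap [math]\displaystyle{ d-\lambda_2 }[/math] from the adjacency matrix, computes the expansion ratio [math]\displaystyle{ \phi(G) }[/math] by brute force over all vertex sets of size at most [math]\displaystyle{ n/2 }[/math], and checks both directions of Cheeger's inequality.

from itertools import combinations
import numpy as np

n, d = 8, 2
A = np.zeros((n, n))
for i in range(n):                       # adjacency matrix of the cycle C_8
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

gap = d - np.sort(np.linalg.eigvalsh(A))[-2]   # spectral gap d - lambda_2

def boundary(S):                               # |∂S|: edges leaving S
    return sum(1 for u in S for v in range(n)
               if A[u, v] == 1 and v not in S)

phi = min(boundary(set(S)) / len(S)            # brute-force expansion ratio
          for k in range(1, n // 2 + 1)
          for S in combinations(range(n), k))

# both directions of Cheeger's inequality hold:
assert gap / 2 <= phi <= np.sqrt(2 * d * gap)
print(gap / 2, phi, np.sqrt(2 * d * gap))      # 0.2928..., 0.5, 1.5307...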

We first explain briefly why the inequality works; a formal proof of the lower bound is given later.

For the spectra of graphs, Cheeger's inequality is proved via the Courant-Fischer theorem, a fundamental result in linear algebra which characterizes the eigenvalues by a series of optimizations:

Theorem (Courant-Fischer theorem)
Let [math]\displaystyle{ A }[/math] be a symmetric matrix with eigenvalues [math]\displaystyle{ \lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n }[/math]. Then
[math]\displaystyle{ \begin{align} \lambda_k &=\max_{v_1,v_2,\ldots,v_{n-k}\in \mathbb{R}^n}\min_{\overset{x\in\mathbb{R}^n, x\neq \mathbf{0}}{x\bot v_1,v_2,\ldots,v_{n-k}}}\frac{x^TAx}{x^Tx}\\ &= \min_{v_1,v_2,\ldots,v_{k-1}\in \mathbb{R}^n}\max_{\overset{x\in\mathbb{R}^n, x\neq \mathbf{0}}{x\bot v_1,v_2,\ldots,v_{k-1}}}\frac{x^TAx}{x^Tx}. \end{align} }[/math]

For a [math]\displaystyle{ d }[/math]-regular graph with adjacency matrix [math]\displaystyle{ A }[/math] and spectrum [math]\displaystyle{ \lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n }[/math], the largest eigenvalue is [math]\displaystyle{ \lambda_1=d }[/math], with the all-ones vector as eigenvector since [math]\displaystyle{ A\cdot\mathbf{1}=d\mathbf{1} }[/math]. According to the Courant-Fischer theorem, the second largest eigenvalue can be computed as

[math]\displaystyle{ \lambda_2=\max_{x\bot \mathbf{1}}\frac{x^TAx}{x^Tx}, }[/math]

and

[math]\displaystyle{ d-\lambda_2=\min_{x\bot \mathbf{1}}\frac{x^T(dI-A)x}{x^Tx}. }[/math]

The latter is an optimization problem which bears some resemblance to the expansion ratio [math]\displaystyle{ \phi(G)=\min_{\overset{S\subset V}{|S|\le\frac{n}{2}}}\frac{|\partial S|}{|S|}=\min_{\chi_S}\frac{\chi_S^T(dI-A)\chi_S}{\chi_S^T\chi_S} }[/math], where [math]\displaystyle{ \chi_S }[/math] is the characteristic vector of the set [math]\displaystyle{ S }[/math], defined as [math]\displaystyle{ \chi_S(i)=1 }[/math] if [math]\displaystyle{ i\in S }[/math] and [math]\displaystyle{ \chi_S(i)=0 }[/math] if [math]\displaystyle{ i\not\in S }[/math]. It is not hard to verify that [math]\displaystyle{ \chi_S^T\chi_S=\sum_{i}\chi_S(i)=|S| }[/math] and [math]\displaystyle{ \chi_S^T(dI-A)\chi_S=\sum_{\{i,j\}\in E}(\chi_S(i)-\chi_S(j))^2=|\partial S| }[/math].

Therefore, the spectral gap [math]\displaystyle{ d-\lambda_2 }[/math] and the expansion ratio [math]\displaystyle{ \phi(G) }[/math] both arise from optimizations of similar forms, which explains why they can be used to approximate each other.
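The two identities above are easy to check on a concrete example. A minimal sketch, assuming numpy and the same cycle C_8 as before, with an arbitrarily chosen set [math]\displaystyle{ S }[/math]:

import numpy as np

n, d = 8, 2
A = np.zeros((n, n))
for i in range(n):                       # the cycle C_8 again
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1

S = {0, 1, 2}                            # an arbitrary vertex set
chi = np.array([1.0 if v in S else 0.0 for v in range(n)])

print(chi @ chi)                         # chi^T chi = |S| = 3
print(chi @ (d * np.eye(n) - A) @ chi)   # chi^T (dI - A) chi = |∂S| = 2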


Let [math]\displaystyle{ G(V,E) }[/math] be a [math]\displaystyle{ d }[/math]-regular graph of [math]\displaystyle{ n }[/math] vertices and let [math]\displaystyle{ A }[/math] be its adjacency matrix. We define [math]\displaystyle{ L=dI-A }[/math] to be the Laplacian of the graph [math]\displaystyle{ G }[/math]. Taking [math]\displaystyle{ x\in \mathbb{R}^V }[/math] as a real-valued function over the vertices, its Laplacian quadratic form [math]\displaystyle{ x^TLx }[/math] measures the "smoothness" of [math]\displaystyle{ x }[/math] over the graph topology, just as the Laplace operator (http://en.wikipedia.org/wiki/Laplace_operator) does for differentiable functions.

Laplacian Property
For any vector [math]\displaystyle{ x\in\mathbb{R}^n }[/math], it holds that
[math]\displaystyle{ x^TLx=\sum_{\{u,v\}\in E}(x(u)-x(v))^2 }[/math].
Proof.
[math]\displaystyle{ \begin{align} x^TLx &= \sum_{u,v\in V}x_u(dI-A)_{uv}x_v\\ &= \sum_{u}\left(dx_u^2-\sum_{v\sim u}x_ux_v\right)\\ &= \sum_{u\in V}\sum_{v\sim u}(x_u^2-x_ux_v). \end{align} }[/math]

On the other hand,

[math]\displaystyle{ \begin{align} \sum_{\{u,v\}\in E}(x(u)-x(v))^2 &= \sum_{\{u,v\}\in E}\left(x_u^2-2x_ux_v+x_v^2\right)\\ &= \sum_{\{u,v\}\in E}\left((x_u^2-x_ux_v)+(x_v^2-x_vx_u)\right)\\ &= \sum_{u\in V}\sum_{v\sim u}(x_u^2-x_ux_v). \end{align} }[/math]
[math]\displaystyle{ \square }[/math]
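The property is also easy to verify numerically; the following sketch, again assuming numpy and the cycle C_8, compares [math]\displaystyle{ x^TLx }[/math] against the sum over edges for a random vector [math]\displaystyle{ x }[/math]:

import numpy as np

n, d = 8, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
L = d * np.eye(n) - A                    # the Laplacian dI - A

x = np.random.default_rng(0).standard_normal(n)            # a random vector
lhs = x @ L @ x                                            # x^T L x
rhs = sum((x[u] - x[(u + 1) % n]) ** 2 for u in range(n))  # sum over the n edges
assert np.isclose(lhs, rhs)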



Theorem (Rayleigh-Ritz theorem)
Let [math]\displaystyle{ A }[/math] be a symmetric matrix. Suppose the eigenvalues of [math]\displaystyle{ A }[/math] are [math]\displaystyle{ \lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n }[/math] and [math]\displaystyle{ v_1 }[/math] is an eigenvector corresponding to the eigenvalue [math]\displaystyle{ \lambda_1 }[/math].
Then
[math]\displaystyle{ \begin{align} \lambda_2 &=\max_{x\bot v_1}\frac{x^TAx}{x^Tx}. \end{align} }[/math]


Theorem (Variational Characterization)
Let [math]\displaystyle{ G(V,E) }[/math] be a [math]\displaystyle{ d }[/math]-regular graph of [math]\displaystyle{ n }[/math] vertices. Suppose its adjacency matrix is [math]\displaystyle{ A }[/math], Laplacian matrix is [math]\displaystyle{ L=dI-A }[/math], and the eigenvalues of [math]\displaystyle{ A }[/math] are [math]\displaystyle{ \lambda_1\ge\lambda_2\ge\cdots\ge\lambda_n }[/math].
Then
[math]\displaystyle{ \begin{align} d-\lambda_2 &=\min_{x\bot \boldsymbol{1}}\frac{x^TLx}{x^Tx} =\min_{x\bot \boldsymbol{1}}\frac{\sum_{\{u,v\}\in E}(x(u)-x(v))^2}{\sum_{v\in V}x(v)^2}. \end{align} }[/math]
Proof.

For a [math]\displaystyle{ d }[/math]-regular graph, we know that [math]\displaystyle{ \lambda_1=d }[/math] and [math]\displaystyle{ A\boldsymbol{1}=d\boldsymbol{1} }[/math], thus [math]\displaystyle{ \boldsymbol{1} }[/math] is an eigenvector of [math]\displaystyle{ \lambda_1 }[/math]. Due to the Rayleigh-Ritz theorem, it holds that [math]\displaystyle{ \lambda_2 =\max_{x\bot \boldsymbol{1}}\frac{x^TAx}{x^Tx} }[/math]. Then

[math]\displaystyle{ \begin{align} \min_{x\bot \boldsymbol{1}}\frac{x^TLx}{x^Tx} &=\min_{x\bot \boldsymbol{1}}\frac{x^T(dI-A)x}{x^Tx}\\ &=\min_{x\bot \boldsymbol{1}}\frac{dx^Tx-x^TAx}{x^Tx}\\ &=\min_{x\bot \boldsymbol{1}}\left(d-\frac{x^TAx}{x^Tx}\right)\\ &=d-\max_{x\bot \boldsymbol{1}}\frac{x^TAx}{x^Tx}\\ &=d-\lambda_2. \end{align} }[/math]

We know that the graph Laplacian satisfies [math]\displaystyle{ x^TLx=\sum_{\{u,v\}\in E}(x(u)-x(v))^2 }[/math], so the variational characterization of the second eigenvalue is proved.

[math]\displaystyle{ \square }[/math]
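The characterization says that [math]\displaystyle{ d-\lambda_2 }[/math] is a minimum over all nonzero [math]\displaystyle{ x\bot\boldsymbol{1} }[/math]. A small sketch (assuming numpy, on the cycle C_8) checks that projecting random vectors onto the orthogonal complement of [math]\displaystyle{ \boldsymbol{1} }[/math] never yields a Rayleigh quotient below [math]\displaystyle{ d-\lambda_2 }[/math]:

import numpy as np

n, d = 8, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
L = d * np.eye(n) - A
gap = d - np.sort(np.linalg.eigvalsh(A))[-2]   # d - lambda_2

rng = np.random.default_rng(1)
for _ in range(1000):
    x = rng.standard_normal(n)
    x -= x.mean()                 # now x is orthogonal to the all-ones vector
    assert x @ L @ x / (x @ x) >= gap - 1e-9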


Cheeger's inequality (lower bound)
[math]\displaystyle{ \phi(G)\ge\frac{d-\lambda_2}{2}. }[/math]
Proof.

Let [math]\displaystyle{ S^*\subset V }[/math], [math]\displaystyle{ |S^*|\le\frac{n}{2} }[/math], be the vertex set achieving the optimal expansion ratio [math]\displaystyle{ \phi(G)=\min_{\overset{S\subset V}{|S|\le\frac{n}{2}}} \frac{|\partial S|}{|S|}=\frac{|\partial S^*|}{|S^*|} }[/math], and [math]\displaystyle{ x\in\mathbb{R}^n }[/math] be a vector defined as

[math]\displaystyle{ x(v)=\begin{cases} 1/|S^*| & \text{if } v\in S^*,\\ -1/\left|\overline{S^*}\right| &\text{if }v\in\overline{S^*}. \end{cases} }[/math]

Clearly, [math]\displaystyle{ x\cdot \boldsymbol{1}=\sum_{v\in S^*}\frac{1}{|S^*|} -\sum_{v\in\overline{S^*}}\frac{1}{\left|\overline{S^*}\right|}=0 }[/math], thus [math]\displaystyle{ x\bot\boldsymbol{1} }[/math].

Due to the variational characterization of the second eigenvalue,

[math]\displaystyle{ \begin{align} d-\lambda_2 &\le\frac{\sum_{\{u,v\}\in E}(x(u)-x(v))^2}{\sum_{v\in V}x(v)^2}\\ &=\frac{\sum_{u\in S^*,v\in\overline{S^*},\{u,v\}\in E}\left(1/|S^*|+1/|\overline{S^*}|\right)^2}{1/|S^*|+1/|\overline{S^*}|}\\ &=\left(\frac{1}{|S^*|}+\frac{1}{\left|\overline{S^*}\right|}\right)\cdot|\partial S^*|\\ &\le \frac{2|\partial S^*|}{|S^*|} &(\text{since }|S^*|\le\frac{n}{2})\\ &=2\phi(G). \end{align} }[/math]
[math]\displaystyle{ \square }[/math]
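The proof's construction can be replayed numerically; the sketch below (assuming numpy, on the cycle C_8, where [math]\displaystyle{ S=\{0,1,2,3\} }[/math] happens to be an optimal set) builds the vector [math]\displaystyle{ x }[/math] from [math]\displaystyle{ S }[/math] and checks the chain [math]\displaystyle{ d-\lambda_2\le \frac{x^TLx}{x^Tx}\le \frac{2|\partial S|}{|S|} }[/math]:

import numpy as np

n, d = 8, 2
A = np.zeros((n, n))
for i in range(n):
    A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1
L = d * np.eye(n) - A
gap = d - np.sort(np.linalg.eigvalsh(A))[-2]     # d - lambda_2

S = {0, 1, 2, 3}                                 # a set of size n/2
x = np.array([1 / len(S) if v in S else -1 / (n - len(S)) for v in range(n)])
assert np.isclose(x.sum(), 0)                    # x is orthogonal to 1

cut = sum(1 for u in S for v in range(n)         # |∂S|, which is 2 here
          if A[u, v] == 1 and v not in S)
R = x @ L @ x / (x @ x)                          # the Rayleigh quotient of x
assert gap <= R <= 2 * cut / len(S) + 1e-9       # d-lambda_2 <= R <= 2|∂S|/|S|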
Cheeger's inequality (upper bound)
[math]\displaystyle{ \phi(G) \le \sqrt{2d(d-\lambda_2)}. }[/math]