随机算法 (Fall 2011)/Random Walks on Undirected Graphs

From TCS Wiki
Jump to navigation Jump to search

A walk on a graph [math]\displaystyle{ G = (V,E) }[/math] is a sequence of vertices [math]\displaystyle{ v_1, v_2, \ldots \in V }[/math] such that [math]\displaystyle{ v_{i+1} }[/math] is a neighbor of [math]\displaystyle{ v_i }[/math] for every index [math]\displaystyle{ i }[/math]. When [math]\displaystyle{ v_{i+1} }[/math] is selected uniformly at random from among [math]\displaystyle{ v_i }[/math]’s neighbors, independently for every [math]\displaystyle{ i }[/math], this is called a random walk on [math]\displaystyle{ G }[/math].

We consider the special case that [math]\displaystyle{ G(V,E) }[/math] is an undirected graph, and denote that [math]\displaystyle{ n=|V| }[/math] and [math]\displaystyle{ m=|E| }[/math].

A Markov chain is defined by this random walk, with the vertex set [math]\displaystyle{ V }[/math] as the state space, and the transition matrix [math]\displaystyle{ P }[/math], which is defined as follows:

[math]\displaystyle{ P(u,v)=\begin{cases} \frac{1}{d(u)}&\mbox{if }(u,v)\in E,\\ 0 & \mbox{otherwise }, \end{cases} }[/math]

where [math]\displaystyle{ d(u) }[/math] denotes the degree of vertex [math]\displaystyle{ u }[/math].

Note that unlike the PageRank example, now the probability depends on [math]\displaystyle{ d(u) }[/math] instead of [math]\displaystyle{ d_+(u) }[/math]. This is because the graph is undirected.

Proposition
Let [math]\displaystyle{ M_G }[/math] be the Markov chain defined as above.
  • [math]\displaystyle{ M_G }[/math] is irreducible if and only if [math]\displaystyle{ G }[/math] is connected.
  • [math]\displaystyle{ M_G }[/math] is aperiodic if and only if [math]\displaystyle{ G }[/math] is non-bipartite.

We leave the proof as an exercise.

We can just assume [math]\displaystyle{ G }[/math] is connected, so we do not have to worry about the reducibility any more.

The periodicity of a random walk on a undirected bipartite graph is usually dealt with by the following trick of "lazy" random walk.

Lazy random walk
Given an undirected graph [math]\displaystyle{ G(V,E) }[/math], a random walk is defined by the transition matrix
[math]\displaystyle{ P'(u,v)=\begin{cases} \frac{1}{2} & \mbox{if }u=v,\\ \frac{1}{2d(u)}&\mbox{if }(u,v)\in E,\\ 0 & \mbox{otherwise }. \end{cases} }[/math]
For this random walk, at each step, we first flip a fair coin to decide whether to move or to stay, and if the coin told us to move, we pick a uniform edge and move to the adjacent vertex. It is easy to see that the resulting Markov chain is aperiodic for any [math]\displaystyle{ G }[/math].

We consider the non-lazy version of random walk. We observe that the random walk has the following stationary distribution.

Theorem
The random walk on [math]\displaystyle{ G(V,E) }[/math] with [math]\displaystyle{ |E|=m }[/math] has a stationary distribution [math]\displaystyle{ \pi }[/math], where [math]\displaystyle{ \forall v\in V }[/math],
[math]\displaystyle{ \begin{align} \pi_v=\frac{d(v)}{2m}\end{align} }[/math] for [math]\displaystyle{ v\in V }[/math].
Proof.
First, since [math]\displaystyle{ \sum_{v\in V}d(v)=2m }[/math], it follows that
[math]\displaystyle{ \sum_{v\in V}\pi_v=\sum_{v\in V}\frac{d(v)}{2m}=1, }[/math]

thus [math]\displaystyle{ \pi }[/math] is a well-defined distribution. And let [math]\displaystyle{ N(v) }[/math] denote the set of neighbors of [math]\displaystyle{ v }[/math]. Then for any [math]\displaystyle{ v\in V }[/math],

[math]\displaystyle{ (\pi P)_v=\sum_{u\in V}\pi_uP(u,v)=\sum_{u\in N(v)}\frac{d(u)}{2m}\frac{1}{d(u)}=\frac{1}{2m}\sum_{u\in N(v)}1=\frac{d(v)}{2m}=\pi_v. }[/math]

Thus [math]\displaystyle{ \pi P=\pi }[/math], and [math]\displaystyle{ \pi }[/math] is stationary.

[math]\displaystyle{ \square }[/math]

For connected and non-bipartite [math]\displaystyle{ G }[/math], the random walk converges to this stationary distribution. Note that the stationary distribution is proportional to the degrees of the vertices, therefore, if [math]\displaystyle{ G }[/math] is a regular graph, that is, [math]\displaystyle{ d(v) }[/math] is the same for all [math]\displaystyle{ v\in V }[/math], the stationary distribution is the uniform distribution.

The following parameters of random walks are closely related to the performances of randomized algorithms based on random walks:

  • Hitting time: It takes how long for a random walk to visit some specific vertex.
  • Cover time: It takes how long for a random walk to visit all vertices.
  • Mixing time: It takes how long for a random walk to be close enough to the stationary distribution.