Randomized Algorithms (Spring 2010)/Markov chains and random walks: Difference between revisions
imported>WikiSysop |
imported>WikiSysop |
||
Line 75: | Line 75: | ||
Another way to picture a Markov chain is by its transition graph. A weighted directed graph <math>G(V,E,w)</math> is said to be a '''transition graph''' of a finite Markov chain with transition matrix <math>P</math> if: | Another way to picture a Markov chain is by its transition graph. A weighted directed graph <math>G(V,E,w)</math> is said to be a '''transition graph''' of a finite Markov chain with transition matrix <math>P</math> if: | ||
* <math>V=\mathcal{S}</math>, i.e. each node corresponds to a state of the Markov chain; | * <math>V=\mathcal{S}</math>, i.e. each node of the transition graph corresponds to a state of the Markov chain; | ||
* for any <math>i,j\in V</math>, <math>(i,j)\in E</math> if and only if <math>P_{i,j}>0</math>, and the weight <math>w(i,j)=P_{i,j}</math>. | * for any <math>i,j\in V</math>, <math>(i,j)\in E</math> if and only if <math>P_{i,j}>0</math>, and the weight <math>w(i,j)=P_{i,j}</math>. | ||
Revision as of 06:13, 28 April 2010
Markov Chains
The Markov property and transition matrices
A stochastic processes [math]\displaystyle{ \{X_t\mid t\in T\} }[/math] is a collection of random variables. The index [math]\displaystyle{ t }[/math] is often called time, as the process represents the value of a random variable changing over time. Let [math]\displaystyle{ \mathcal{S} }[/math] be the set of values assumed by the random variables [math]\displaystyle{ X_t }[/math]. We call each element of [math]\displaystyle{ \mathcal{S} }[/math] a state, as [math]\displaystyle{ X_t }[/math] represents the state of the process at time [math]\displaystyle{ t }[/math].
The model of stochastic processes can be very general. In this class, we only consider the stochastic processes with the following properties:
- discrete time
- The index set [math]\displaystyle{ T }[/math] is countable. Specifically, we assume the process is [math]\displaystyle{ X_0,X_1,X_2,\ldots }[/math]
- discrete space
- The state space [math]\displaystyle{ \mathcal{S} }[/math] is countable. We are especially interested in the case that [math]\displaystyle{ \mathcal{S} }[/math] is finite, in which case the process is called a finite process.
The next property is about the dependency structure among random variables. The simplest dependency structure for [math]\displaystyle{ X_0,X_1,\ldots }[/math] is no dependency at all, that is, independence. We consider the next simplest dependency structure called the Markov property.
Definition (the Markov property)
|
Informally, the Markov property means: "conditioning on the present, the future does not depend on the past." Hence, the Markov property is also called the memoryless property.
A stochastic process [math]\displaystyle{ X_0,X_1,\ldots }[/math] of discrete time and discrete space is a Markov chain if it has the Markov property.
Let [math]\displaystyle{ P^{(t+1)}_{i,j}=\Pr[X_{t+1}=j\mid X_t=i] }[/math]. For a Markov chain with a finite state space [math]\displaystyle{ \mathcal{S}=[N] }[/math]. This gives us a transition matrix [math]\displaystyle{ P^{(t+1)} }[/math] at time [math]\displaystyle{ t }[/math]. The transition matrix is an [math]\displaystyle{ N\times N }[/math] matrix of nonnegative entries such that the sum over each row of [math]\displaystyle{ P^{(t)} }[/math] is 1, since
- [math]\displaystyle{ \begin{align}\sum_{j}P^{(t+1)}_{i,j}=\sum_{j}\Pr[X_{t+1}=j\mid X_t=i]=1\end{align} }[/math].
In linear algebra, matrices of this type are called stochastic matrices.
Let [math]\displaystyle{ \pi^{(t)} }[/math] be the distribution of the chain at time [math]\displaystyle{ t }[/math], that is, [math]\displaystyle{ \begin{align}\pi^{(t)}_i=\Pr[X_t=i]\end{align} }[/math]. For a finite chain, [math]\displaystyle{ \pi^{(t)} }[/math] is a vector of [math]\displaystyle{ N }[/math] nonnegative entries such that [math]\displaystyle{ \begin{align}\sum_{i}\pi^{(t)}_i=1\end{align} }[/math]. In linear algebra, vectors of this type are called stochastic vectors. Then, it holds that
- [math]\displaystyle{ \begin{align}\pi^{(t+1)}=\pi^{(t)}P^{(t+1)}\end{align} }[/math].
To see this, we apply the law of total probability,
- [math]\displaystyle{ \begin{align} \pi^{(t+1)}_j &= \Pr[X_{t+1}=j]\\ &= \sum_{i}\Pr[X_{t+1}=j\mid X_t=i]\Pr[X_t=i]\\ &=\sum_{i}\pi^{(t)}_iP^{(t+1)}_{i,j}\\ &=(\pi^{(t)}P^{(t+1)})_j. \end{align} }[/math]
Therefore, a finite Markov chain [math]\displaystyle{ X_0,X_1,\ldots }[/math] is specified by an initial distribution [math]\displaystyle{ \pi^{(0)} }[/math] and a sequence of transition matrices [math]\displaystyle{ P^{(1)},P^{(2)},\ldots }[/math]. And the transitions of chain can be described by a series of matrix products:
- [math]\displaystyle{ \pi^{(0)}\stackrel{P^{(1)}}{\longrightarrow}\pi^{(1)}\stackrel{P^{(2)}}{\longrightarrow}\pi^{(2)}\stackrel{P^{(3)}}{\longrightarrow}\cdots\cdots\pi^{(t)}\stackrel{P^{(t+1)}}{\longrightarrow}\pi^{(t+1)}\cdots }[/math]
- Homogeneity
A Markov chain is said to be homogenous if the transitions depend only on the current states but not on the time, that is
- [math]\displaystyle{ P^{(t)}_{i,j}=P_{i,j} }[/math] for all [math]\displaystyle{ t }[/math].
The transitions of a homogenous Markov chain is given by a single matrix [math]\displaystyle{ P }[/math]. Suppose that [math]\displaystyle{ \pi^{(0)} }[/math] is the initial distribution. At each time [math]\displaystyle{ t }[/math],
- [math]\displaystyle{ \begin{align}\pi^{(t+1)}=\pi^{(t)}P\end{align} }[/math].
Expanding this recursion, we have
- [math]\displaystyle{ \begin{align}\pi^{(n)}=\pi^{(0)}P^n\end{align} }[/math].
From now on, we restrict ourselves to the homogenous Markov chains, and the term "Markov chain" means "homogenous Markov chian" unless stated otherwise.
Definition (finite Markov chain)
|
To specify a Markov chain, we only need to give:
- initial distribution [math]\displaystyle{ \pi^{(0)} }[/math];
- transition matrix [math]\displaystyle{ P }[/math].
Then the transitions can be simulated by matrix products:
- [math]\displaystyle{ \pi^{(0)}\stackrel{P}{\longrightarrow}\pi^{(1)}\stackrel{P}{\longrightarrow}\pi^{(2)}\stackrel{P}{\longrightarrow}\cdots\cdots\pi^{(t)}\stackrel{P}{\longrightarrow}\pi^{(t+1)}\stackrel{P}{\longrightarrow}\cdots }[/math]
The distribution of the chain at time [math]\displaystyle{ n }[/math] can be computed by [math]\displaystyle{ \pi^{(n)}=\pi^{(0)}P^n }[/math].
Another way to picture a Markov chain is by its transition graph. A weighted directed graph [math]\displaystyle{ G(V,E,w) }[/math] is said to be a transition graph of a finite Markov chain with transition matrix [math]\displaystyle{ P }[/math] if:
- [math]\displaystyle{ V=\mathcal{S} }[/math], i.e. each node of the transition graph corresponds to a state of the Markov chain;
- for any [math]\displaystyle{ i,j\in V }[/math], [math]\displaystyle{ (i,j)\in E }[/math] if and only if [math]\displaystyle{ P_{i,j}\gt 0 }[/math], and the weight [math]\displaystyle{ w(i,j)=P_{i,j} }[/math].
Irreducibility and aperiodicity
Definition (irreducibility)
|
Definition (aperiodicity)
|
Stationary distributions
Definition (stationary distribution)
|
Theorem (Basic limit theorem)
|
Recurrence and Ergodicity*
Definition (recurrence)
|
Definition (ergodicity)
|