Randomized Algorithms (Fall 2011)/Martingales

From EtoneWiki

Review of conditional expectations

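The conditional expectation of a random variable $Y$ with respect to an event $\mathcal{E}$ is defined by

$$\mathbf{E}[Y\mid\mathcal{E}]=\sum_{y}y\Pr[Y=y\mid\mathcal{E}].$$

In particular, if the event $\mathcal{E}$ is $X=a$, the conditional expectation

$$\mathbf{E}[Y\mid X=a]$$

defines a function

$$f(a)=\mathbf{E}[Y\mid X=a].$$

Thus, $\mathbf{E}[Y\mid X]$ can be regarded as a random variable $f(X)$.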

Example
Suppose that we sample a human uniformly at random from all human beings. Let $Y$ be his/her height, and let $X$ be the country where he/she is from. For any country $a$, $\mathbf{E}[Y\mid X=a]$ gives the average height of that country. And $\mathbf{E}[Y\mid X]$ is the random variable that can be defined in either of two equivalent ways:
  • We choose a human uniformly at random from all human beings, and $\mathbf{E}[Y\mid X]$ is the average height of the country where he/she comes from.
  • We choose a country at random with probability proportional to its population, and $\mathbf{E}[Y\mid X]$ is the average height of the chosen country.

The following proposition states some fundamental facts about conditional expectation.

Proposition (fundamental facts about conditional expectation)
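Let $X$, $Y$ and $Z$ be arbitrary random variables. Let $f$ and $g$ be arbitrary functions. Then
  1. $\mathbf{E}[X]=\mathbf{E}[\mathbf{E}[X\mid Y]]$.
  2. $\mathbf{E}[X\mid Z]=\mathbf{E}[\mathbf{E}[X\mid Y,Z]\mid Z]$.
  3. $\mathbf{E}[\mathbf{E}[f(X)g(X,Y)\mid X]]=\mathbf{E}[f(X)\cdot\mathbf{E}[g(X,Y)\mid X]]$.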

The proposition can be formally verified by computing these expectations. Although the equations look formal, their intuitive interpretations are clear.

The first equation:

$$\mathbf{E}[X]=\mathbf{E}[\mathbf{E}[X\mid Y]]$$

says that there are two ways to compute an average. Suppose again that we sample a uniform random human, this time writing $X$ for the height and $Y$ for the country where he/she is from. There are two ways to compute the average human height: one is to average directly over the heights of all humans; the other is to first compute the average height for each country, and then average over these heights weighted by the populations of the countries.

The second equation:

$$\mathbf{E}[X\mid Z]=\mathbf{E}[\mathbf{E}[X\mid Y,Z]\mid Z]$$

is the same as the first one, restricted to a particular subspace. As in the previous example, in addition to the height $X$ and the country $Y$, let $Z$ be the gender of the individual. Thus, $\mathbf{E}[X\mid Z]$ is the average height of a human being of a given sex. Again, this can be computed either directly or on a country-by-country basis.

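The third equation:

$$\mathbf{E}[\mathbf{E}[f(X)g(X,Y)\mid X]]=\mathbf{E}[f(X)\cdot\mathbf{E}[g(X,Y)\mid X]]$$

looks obscure at first glance, especially considering that $X$ and $Y$ are not necessarily independent. Nevertheless, the equation follows from the simple fact that, conditioning on any $X=a$, the function value $f(X)=f(a)$ becomes a constant, and thus can be safely taken outside the expectation by the linearity of expectation. For any value $X=a$,

$$\mathbf{E}[f(X)g(X,Y)\mid X=a]=\mathbf{E}[f(a)g(X,Y)\mid X=a]=f(a)\cdot\mathbf{E}[g(X,Y)\mid X=a].$$

The proposition holds in the more general case where each of $X$, $Y$ and $Z$ is a sequence of random variables.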
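
The three identities can also be checked by brute force on a small example. The sketch below assumes a made-up joint distribution over triples (x, y, z) and made-up functions `f` and `g`; the helper names `expect` and `cond_expect` are illustrative and not part of the notes. It uses exact rational arithmetic so that equality can be tested directly.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint distribution Pr[X=x, Y=y, Z=z]; X and Y are dependent.
joint = {
    (0, 0, 0): Fraction(1, 8),
    (0, 1, 1): Fraction(1, 4),
    (1, 0, 1): Fraction(1, 8),
    (1, 1, 0): Fraction(1, 4),
    (2, 1, 1): Fraction(1, 4),
}

def expect(h):
    """E[h(X, Y, Z)] under the joint distribution."""
    return sum(p * h(x, y, z) for (x, y, z), p in joint.items())

def cond_expect(h, key):
    """E[h | key(X, Y, Z)] as a random variable, i.e. a function of the outcome."""
    mass = defaultdict(Fraction)
    total = defaultdict(Fraction)
    for (x, y, z), p in joint.items():
        k = key(x, y, z)
        mass[k] += p
        total[k] += p * h(x, y, z)
    return lambda x, y, z: total[key(x, y, z)] / mass[key(x, y, z)]

X = lambda x, y, z: x
f = lambda x: x + 1                 # arbitrary illustrative function
g = lambda x, y: x * x + 3 * y      # arbitrary illustrative function

# Identity 1: E[X] = E[E[X | Y]]
lhs1 = expect(X)
rhs1 = expect(cond_expect(X, lambda x, y, z: y))

# Identity 2: E[X | Z] = E[E[X | Y, Z] | Z], compared outcome by outcome
inner = cond_expect(X, lambda x, y, z: (y, z))
lhs2 = cond_expect(X, lambda x, y, z: z)
rhs2 = cond_expect(inner, lambda x, y, z: z)
identity2_holds = all(lhs2(*w) == rhs2(*w) for w in joint)

# Identity 3: E[E[f(X) g(X,Y) | X]] = E[f(X) * E[g(X,Y) | X]]
by_x = lambda x, y, z: x
g_given_x = cond_expect(lambda x, y, z: g(x, y), by_x)
lhs3 = expect(cond_expect(lambda x, y, z: f(x) * g(x, y), by_x))
rhs3 = expect(lambda x, y, z: f(x) * g_given_x(x, y, z))

print(lhs1 == rhs1, identity2_holds, lhs3 == rhs3)
```

Representing a conditional expectation as a function of the outcome mirrors the view of $\mathbf{E}[Y\mid X]$ as the random variable $f(X)$.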

Martingales

"Martingale" originally referred to a betting strategy in which the gambler doubles his bet after every loss. Assuming unlimited wealth, this strategy is guaranteed to eventually yield a positive net profit. For example, starting from an initial stake of 1, after $n$ losses, if the $(n+1)$th bet wins, then it gives a net profit of

$$2^n-\sum_{i=1}^{n}2^{i-1}=1,$$

which is a positive number.

However, the assumption of unlimited wealth is unrealistic. With limited wealth and geometrically increasing bets, the gambler is very likely to end up bankrupt. You should never try this strategy in real life. And remember: gambling is bad!
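
The arithmetic of the doubling strategy can be sketched as follows. The function names and the stopping rule (quit at the first win, or when the next doubled stake no longer fits in the budget) are assumptions made for illustration, as is the fair coin. Under these assumptions the expected gain with a finite budget comes out to exactly zero: the small, likely profit of 1 is offset by the rare, large loss.

```python
from fractions import Fraction

def net_profit(n):
    """Net profit when bets 1..n (stakes 1, 2, ..., 2^(n-1)) lose and bet n+1 (stake 2^n) wins."""
    lost = sum(2 ** (i - 1) for i in range(1, n + 1))
    return 2 ** n - lost

def affordable_bets(budget):
    """Largest k such that the stakes 1, 2, ..., 2^(k-1), totalling 2^k - 1, fit in the budget."""
    k = 0
    while 2 ** (k + 1) - 1 <= budget:
        k += 1
    return k

def expected_gain(budget):
    """Expected gain with a fair coin under the assumed stopping rule."""
    k = affordable_bets(budget)
    p_all_lose = Fraction(1, 2 ** k)      # every affordable bet loses
    loss_if_all_lose = 2 ** k - 1         # total of the k lost stakes
    return (1 - p_all_lose) * 1 - p_all_lose * loss_if_all_lose

print([net_profit(n) for n in range(5)])
print(affordable_bets(1000), expected_gain(1000))
```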

Suppose that the gambler is allowed to use any strategy. His stake on the next bet is decided based on the results of all the bets so far. This gives us a highly dependent sequence of random variables $X_0,X_1,\ldots$, where $X_0$ is his initial capital, and $X_i$ represents his capital after the $i$th bet. Depending on the betting strategy, $X_i$ can be arbitrarily dependent on $X_0,\ldots,X_{i-1}$. However, as long as the game is fair, namely, winning and losing have equal chances, then conditioning on the past variables $X_0,\ldots,X_{i-1}$, we expect no change in the value of the present variable $X_i$ on average. A sequence of random variables satisfying this property is called a martingale.

Definition (martingale)
A sequence of random variables $X_0,X_1,\ldots$ is a martingale if for all $i>0$,
$$\mathbf{E}[X_i\mid X_0,\ldots,X_{i-1}]=X_{i-1}.$$
Example (coin flips)
A fair coin is flipped a number of times. Let $Z_j\in\{-1,1\}$ denote the outcome of the $j$th flip. Let
$$X_0=0\quad\text{and}\quad X_i=\sum_{j\le i}Z_j.$$
The random variables $X_0,X_1,\ldots$ define a martingale.
Proof
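We first observe that $\mathbf{E}[X_i\mid X_0,\ldots,X_{i-1}]=\mathbf{E}[X_i\mid X_{i-1}]$, which intuitively says that the next value of the running sum depends only on its current value. This property is also called the Markov property in stochastic processes. Since $X_i=X_{i-1}+Z_i$, and $Z_i$ is independent of $X_{i-1}$ with $\mathbf{E}[Z_i]=0$,
$$\mathbf{E}[X_i\mid X_{i-1}]=\mathbf{E}[X_{i-1}+Z_i\mid X_{i-1}]=X_{i-1}+\mathbf{E}[Z_i]=X_{i-1}.$$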
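The fair-game condition can be checked exhaustively for short games. The sketch below is illustrative only: `capital_after`, `is_fair_game_martingale` and the two stake rules are hypothetical names, and the initial capital is taken to be 0. It conditions on the full flip history, which determines all earlier capitals, so by the second fact of the proposition above the condition in the definition follows.

```python
from fractions import Fraction
from itertools import product

def capital_after(history, stake):
    """Capital after the flips in history (+1 win, -1 loss), starting from 0."""
    capital = 0
    for t in range(len(history)):
        # The stake for flip t may depend on everything seen before it.
        capital += stake(history[:t]) * history[t]
    return capital

def is_fair_game_martingale(n, stake):
    """Check that the average capital after the next fair flip equals the current capital."""
    for i in range(1, n + 1):
        for history in product((-1, 1), repeat=i - 1):
            x_prev = capital_after(history, stake)
            cond_exp = sum(
                Fraction(1, 2) * capital_after(history + (z,), stake)
                for z in (-1, 1)
            )
            if cond_exp != x_prev:
                return False
    return True

def unit_stake(history):
    """Always bet 1: this gives the coin-flip sum X_i."""
    return 1

def doubling_stake(history):
    """Double the stake after every loss, reset to 1 after a win."""
    s = 1
    for z in history:
        s = 2 * s if z == -1 else 1
    return s

print(is_fair_game_martingale(6, unit_stake))
print(is_fair_game_martingale(6, doubling_stake))
```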
Example (Polya's urn scheme)
Consider an urn (just a container) that initially contains $b$ black balls and $w$ white balls. At each step, we select a ball uniformly at random from the urn, and replace the ball with $c$ balls of the same color. Let $X_0=b/(b+w)$, and let $X_i$ be the fraction of black balls in the urn after the $i$th step. The sequence $X_0,X_1,\ldots$ is a martingale.
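A one-step check of the urn example, as a minimal sketch: given the current contents of the urn, the expected fraction of black balls after the next step equals the current fraction. The function name and the parameter `c` (the number of balls that replace the drawn ball, assumed to be at least 1) are illustrative choices.

```python
from fractions import Fraction

def expected_next_fraction(black, white, c):
    """Expected fraction of black balls after one step of the urn process."""
    total = black + white
    p_black = Fraction(black, total)      # probability of drawing a black ball
    new_total = total - 1 + c             # drawn ball is replaced by c balls
    frac_if_black = Fraction(black - 1 + c, new_total)
    frac_if_white = Fraction(black, new_total)
    return p_black * frac_if_black + (1 - p_black) * frac_if_white

# Compare with the current fraction for a few urn contents.
for black, white, c in [(1, 1, 2), (3, 5, 2), (4, 2, 7)]:
    print(expected_next_fraction(black, white, c), Fraction(black, black + white))
```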
Example (edge exposure in a random graph)
Consider a random graph $G$ generated as follows. Let $[n]$ be the set of vertices, and let $\binom{[n]}{2}$ be the set of all possible edges. For convenience, we enumerate these potential edges by $e_1,\ldots,e_m$, where $m=\binom{n}{2}$. For each potential edge $e_j$, we independently flip a fair coin to decide whether the edge $e_j$ appears in $G$. Let $I_j$ be the random variable that indicates whether $e_j\in G$. We are interested in some graph-theoretical parameter, say the chromatic number, of the random graph $G$. Let $\chi(G)$ be the chromatic number of $G$. Let $X_0=\mathbf{E}[\chi(G)]$, and for each $i\ge 1$, let $X_i=\mathbf{E}[\chi(G)\mid I_1,\ldots,I_i]$, namely, the expected chromatic number of the random graph after fixing the first $i$ edges. This process is called edge exposure of a random graph, as we are "exposing" the edges one by one in a random graph.
[Figure: edge exposure of a random graph]
As shown by the above figure, the sequence $X_0,X_1,\ldots,X_m$ is a martingale. In particular, $X_0=\mathbf{E}[\chi(G)]$, and $X_m=\chi(G)$. The martingale moves from no information to full information (of the random graph $G$) in small steps.

It is nontrivial to formally verify that the edge exposure sequence for a random graph is a martingale. However, we will later see that this construction can be put into a more general context.
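
For a very small graph the edge exposure sequence can be computed exhaustively, which gives a sanity check rather than a proof. The sketch below assumes 4 vertices (so 6 potential edges) and a brute-force chromatic number; all names are illustrative. It verifies that averaging over the next exposed edge returns the current value, conditioning on the exposed indicators.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations, product

N = 4                                    # number of vertices (small on purpose)
EDGES = list(combinations(range(N), 2))  # potential edges e_1, ..., e_m
M = len(EDGES)

@lru_cache(maxsize=None)
def chromatic_number(indicators):
    """Brute-force chromatic number of the graph whose edge j is present iff indicators[j] == 1."""
    present = [e for e, bit in zip(EDGES, indicators) if bit]
    for k in range(1, N + 1):
        for coloring in product(range(k), repeat=N):
            if all(coloring[u] != coloring[v] for u, v in present):
                return k
    return N  # not reached: N colors always suffice

@lru_cache(maxsize=None)
def exposed(prefix):
    """Expected chromatic number given that the first len(prefix) indicators equal prefix."""
    rest = M - len(prefix)
    total = sum(chromatic_number(prefix + tail) for tail in product((0, 1), repeat=rest))
    return Fraction(total, 2 ** rest)

def is_martingale():
    """Check that averaging over the next fair coin flip gives back the current value."""
    for i in range(M):
        for prefix in product((0, 1), repeat=i):
            avg_next = (exposed(prefix + (0,)) + exposed(prefix + (1,))) / 2
            if avg_next != exposed(prefix):
                return False
    return True

print(is_martingale())
print(exposed(()))  # X_0, the expected chromatic number for N = 4
```

With all $m$ edges exposed, `exposed` returns the chromatic number of the fully determined graph, matching the endpoint of the sequence.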