Randomized Algorithms (Fall 2011)/Chernoff Bound


Suppose that we have a fair coin. If we toss it once, then the outcome is completely unpredictable. But if we toss it, say, 1000 times, then the number of HEADs is very likely to be around 500. This striking phenomenon, illustrated in the figure, is called concentration. The Chernoff bound captures the concentration of independent trials.

[Figure: Coinflip.png — concentration of the number of HEADs in repeated fair coin tosses]
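This concentration can also be seen in a quick simulation. Below is a minimal Python sketch (the number of runs and the ±50 window are arbitrary illustrative choices, not part of any analysis here):

```python
import random

# Toss a fair coin 1000 times, count HEADs, and repeat over many runs to
# see how tightly the count concentrates around 500.
n_tosses, n_runs = 1000, 10000

counts = [sum(random.randint(0, 1) for _ in range(n_tosses))
          for _ in range(n_runs)]

within = sum(1 for c in counts if abs(c - 500) <= 50)
print(f"observed range of #HEADs: [{min(counts)}, {max(counts)}]")
print(f"fraction of runs with |#HEADs - 500| <= 50: {within / n_runs:.4f}")
```

In a typical run, nearly all counts fall within 500 ± 50, matching the exponentially small tails that the Chernoff bound predicts.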

The Chernoff bound is also a tail bound for the sum of independent random variables, and it gives exponentially sharp bounds.

Before proving the Chernoff bound, we first introduce moment generating functions.

Moment generating functions

The more we know about the moments of a random variable $X$, the more information we would have about $X$. There is a so-called moment generating function, which "packs" all the information about the moments of $X$ into one function.

Definition
The moment generating function of a random variable $X$ is defined as $\mathbf{E}\left[e^{\lambda X}\right]$, where $\lambda$ is the parameter of the function.

By Taylor's expansion and the linearity of expectations,

\[\mathbf{E}\left[e^{\lambda X}\right]=\mathbf{E}\left[\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}X^k\right]=\sum_{k=0}^{\infty}\frac{\lambda^k}{k!}\mathbf{E}\left[X^k\right].\]

The moment generating function $\mathbf{E}\left[e^{\lambda X}\right]$ is a function of $\lambda$.
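As a quick sanity check of this expansion, we can compare the closed-form moment generating function of a Bernoulli variable with its truncated Taylor series; the following Python sketch does so for arbitrary illustrative values of $p$ and $\lambda$:

```python
import math

# Sanity check of the Taylor expansion of the moment generating function
# for a Bernoulli(p) variable, whose moments satisfy E[X^k] = p for k >= 1.
p, lam = 0.3, 0.8

exact = (1 - p) + p * math.exp(lam)   # E[e^{lam*X}] computed directly

def moment(k):
    """E[X^k] for X ~ Bernoulli(p): 1 when k = 0, and p otherwise."""
    return 1.0 if k == 0 else p

# sum_{k >= 0} lam^k / k! * E[X^k], truncated at k = 20
series = sum(lam**k / math.factorial(k) * moment(k) for k in range(21))

print(f"exact MGF  = {exact:.10f}")
print(f"Taylor sum = {series:.10f}")
```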

The Chernoff bound

The Chernoff bounds are exponentially sharp tail inequalities for the sum of independent trials. The bounds are obtained by applying Markov's inequality to the moment generating function of the sum of independent trials, with some appropriate choice of the parameter $\lambda$.

Chernoff bound (the upper tail)
Let $X=\sum_{i=1}^{n}X_i$, where $X_1,X_2,\ldots,X_n$ are independent Poisson trials. Let $\mu=\mathbf{E}[X]$.
Then for any $\delta>0$,
\[\Pr[X\ge(1+\delta)\mu]\le\left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}.\]
Proof.
For any $\lambda>0$, $X\ge(1+\delta)\mu$ is equivalent to $e^{\lambda X}\ge e^{\lambda(1+\delta)\mu}$, thus

\[\Pr[X\ge(1+\delta)\mu]=\Pr\left[e^{\lambda X}\ge e^{\lambda(1+\delta)\mu}\right]\le\frac{\mathbf{E}\left[e^{\lambda X}\right]}{e^{\lambda(1+\delta)\mu}},\]

where the last step follows by Markov's inequality.

Computing the moment generating function $\mathbf{E}\left[e^{\lambda X}\right]$:

Let $p_i=\Pr[X_i=1]$ for $1\le i\le n$. Then,

\[\mathbf{E}\left[e^{\lambda X}\right]=\mathbf{E}\left[e^{\lambda\sum_{i=1}^{n}X_i}\right]=\mathbf{E}\left[\prod_{i=1}^{n}e^{\lambda X_i}\right]=\prod_{i=1}^{n}\mathbf{E}\left[e^{\lambda X_i}\right],\]

where the last equation is due to the independence of $X_1,X_2,\ldots,X_n$.

We bound the moment generating function for each individual $X_i$ as follows.

\[\mathbf{E}\left[e^{\lambda X_i}\right]=p_i\cdot e^{\lambda\cdot 1}+(1-p_i)\cdot e^{\lambda\cdot 0}=1+p_i\left(e^{\lambda}-1\right)\le e^{p_i\left(e^{\lambda}-1\right)},\]

where in the last step we apply the inequality $1+y\le e^{y}$ (from the Taylor expansion of $e^{y}$) with $y=p_i\left(e^{\lambda}-1\right)$. (By doing this, we can transform the product of the $\mathbf{E}\left[e^{\lambda X_i}\right]$ into a sum of the $p_i$ in the exponent, whose total is $\mu=\mathbf{E}[X]$.)

Therefore,

\[\mathbf{E}\left[e^{\lambda X}\right]=\prod_{i=1}^{n}\mathbf{E}\left[e^{\lambda X_i}\right]\le\prod_{i=1}^{n}e^{p_i\left(e^{\lambda}-1\right)}=\exp\left(\sum_{i=1}^{n}p_i\left(e^{\lambda}-1\right)\right)=e^{\left(e^{\lambda}-1\right)\mu}.\]

Thus, we have shown that for any $\lambda>0$,

\[\Pr[X\ge(1+\delta)\mu]\le\frac{\mathbf{E}\left[e^{\lambda X}\right]}{e^{\lambda(1+\delta)\mu}}\le\frac{e^{\left(e^{\lambda}-1\right)\mu}}{e^{\lambda(1+\delta)\mu}}=\left(\frac{e^{e^{\lambda}-1}}{e^{\lambda(1+\delta)}}\right)^{\mu}.\]

For any $\delta>0$, we can let $\lambda=\ln(1+\delta)>0$ to get

\[\Pr[X\ge(1+\delta)\mu]\le\left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}.\]

The idea of the proof is actually quite clear: we apply Markov's inequality to $e^{\lambda X}$, and for the rest we just estimate the moment generating function $\mathbf{E}\left[e^{\lambda X}\right]$. To make the bound as tight as possible, we minimize $\frac{e^{e^{\lambda}-1}}{e^{\lambda(1+\delta)}}$ by setting $\lambda=\ln(1+\delta)$, which can be justified by taking the derivative with respect to $\lambda$.
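This choice of $\lambda$ can also be confirmed numerically by a grid search, as in the following Python sketch (the value of $\delta$ and the grid are arbitrary illustrative choices):

```python
import math

# Grid search confirming that lambda = ln(1 + delta) minimizes the base
# e^{e^lambda - 1} / e^{lambda * (1 + delta)} of the bound derived above.
delta = 0.5

def base(lam):
    return math.exp(math.exp(lam) - 1 - lam * (1 + delta))

grid = [k / 10000 for k in range(1, 30000)]   # lambda in (0, 3)
best = min(grid, key=base)

print(f"grid minimizer = {best:.4f}")
print(f"ln(1 + delta)  = {math.log(1 + delta):.4f}")
print(f"minimum base   = {base(best):.6f}")
```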


We then proceed to the lower tail, the probability that the random variable $X$ deviates below its mean:

Chernoff bound (the lower tail)
Let $X=\sum_{i=1}^{n}X_i$, where $X_1,X_2,\ldots,X_n$ are independent Poisson trials. Let $\mu=\mathbf{E}[X]$.
Then for any $0<\delta<1$,
\[\Pr[X\le(1-\delta)\mu]\le\left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu}.\]
Proof.
For any $\lambda<0$, by the same analysis as in the upper tail version,

\[\Pr[X\le(1-\delta)\mu]=\Pr\left[e^{\lambda X}\ge e^{\lambda(1-\delta)\mu}\right]\le\frac{\mathbf{E}\left[e^{\lambda X}\right]}{e^{\lambda(1-\delta)\mu}}\le\left(\frac{e^{e^{\lambda}-1}}{e^{\lambda(1-\delta)}}\right)^{\mu}.\]

For any $0<\delta<1$, we can let $\lambda=\ln(1-\delta)<0$ to get

\[\Pr[X\le(1-\delta)\mu]\le\left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu}.\]
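Both tails can be compared with their bounds empirically. The following Python sketch estimates the two tail probabilities for a sum of Bernoulli trials by simulation ($n$, $p$, $\delta$, and the number of runs are arbitrary illustrative choices):

```python
import math
import random

# Estimate both tail probabilities of X = sum of n Bernoulli(p) trials by
# simulation and compare them with the Chernoff bounds proved above.
n, p, delta, runs = 100, 0.5, 0.3, 50000
mu = n * p

upper = lower = 0
for _ in range(runs):
    x = sum(random.random() < p for _ in range(n))
    upper += (x >= (1 + delta) * mu)
    lower += (x <= (1 - delta) * mu)

upper_bound = (math.exp(delta) / (1 + delta) ** (1 + delta)) ** mu
lower_bound = (math.exp(-delta) / (1 - delta) ** (1 - delta)) ** mu

print(f"upper tail: empirical {upper / runs:.6f}, bound {upper_bound:.6f}")
print(f"lower tail: empirical {lower / runs:.6f}, bound {lower_bound:.6f}")
```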


Some useful special forms of the bounds can be derived directly from the above general forms of the bounds. They show more clearly why we say that the bounds are exponentially sharp.

Useful forms of the Chernoff bound
Let $X=\sum_{i=1}^{n}X_i$, where $X_1,X_2,\ldots,X_n$ are independent Poisson trials. Let $\mu=\mathbf{E}[X]$. Then
1. for $0<\delta\le 1$,
\[\Pr[X\ge(1+\delta)\mu]<\exp\left(-\frac{\mu\delta^2}{3}\right),\qquad\Pr[X\le(1-\delta)\mu]<\exp\left(-\frac{\mu\delta^2}{2}\right);\]
2. for $t\ge 2e\mu$,
\[\Pr[X\ge t]\le 2^{-t}.\]
Proof.
To obtain the bounds in (1), we need to show that for $0<\delta\le 1$, $\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\le e^{-\delta^2/3}$ and $\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\le e^{-\delta^2/2}$. We can verify both inequalities by standard analysis techniques.

To obtain the bound in (2), let $\delta=\frac{t}{\mu}-1\ge 2e-1$. Then, applying the general upper tail bound,

\[\Pr[X\ge t]=\Pr[X\ge(1+\delta)\mu]\le\left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}\le\left(\frac{e}{1+\delta}\right)^{(1+\delta)\mu}\le\left(\frac{e}{2e}\right)^{t}=2^{-t}.\]
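The "standard analysis techniques" in (1) can at least be spot-checked numerically; the following Python sketch verifies both inequalities on a grid of $\delta$ values (the grid resolution is an arbitrary choice):

```python
import math

# Check the two inequalities behind (1) on a grid of delta in (0, 1]:
#   e^delta  / (1+delta)^(1+delta) <= e^(-delta^2 / 3)
#   e^-delta / (1-delta)^(1-delta) <= e^(-delta^2 / 2)
ok = True
for k in range(1, 1001):
    d = k / 1000
    ok = ok and math.exp(d) / (1 + d) ** (1 + d) <= math.exp(-d * d / 3)
    if d < 1:  # the lower-tail expression is defined for 0 < delta < 1
        ok = ok and math.exp(-d) / (1 - d) ** (1 - d) <= math.exp(-d * d / 2)

print("both inequalities hold on the grid:", ok)
```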

Balls into bins, revisited

Throwing $m$ balls uniformly and independently into $n$ bins, what is the maximum load of all bins with high probability? In the last class, we gave an analysis of this problem by using a counting argument.

Now we give a more "advanced" analysis by using Chernoff bounds.


For any $i\in[n]$ and $j\in[m]$, let $X_{ij}$ be the indicator variable for the event that ball $j$ is thrown to bin $i$. Obviously

\[\mathbf{E}[X_{ij}]=\Pr[\text{ball }j\text{ is thrown to bin }i]=\frac{1}{n}.\]

Let $Y_i=\sum_{j\in[m]}X_{ij}$ be the load of bin $i$.

Then the expected load of bin $i$ is

\[\mu=\mathbf{E}[Y_i]=\sum_{j\in[m]}\mathbf{E}[X_{ij}]=\frac{m}{n}.\]

For the case $m=n$, it holds that $\mu=1$.

Note that $Y_i$ is a sum of $m$ mutually independent indicator variables. Applying the Chernoff bound, for any particular bin $i\in[n]$,

\[\Pr[Y_i>(1+\delta)\mu]\le\left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}.\]

When $m=n$

When $m=n$, $\mu=1$. Write $c=1+\delta$. The above bound can be written as

\[\Pr[Y_i>c]\le\frac{e^{c-1}}{c^c}.\]

Let $c=\frac{e\ln n}{\ln\ln n}$. We evaluate $\frac{e^{c-1}}{c^c}$ by taking the logarithm of its reciprocal:

\[\ln\frac{c^c}{e^{c-1}}=c\ln c-c+1=c(\ln c-1)+1=\frac{e\ln n}{\ln\ln n}\left(\ln\ln n-\ln\ln\ln n\right)+1\ge 2\ln n,\]

where the last inequality holds for sufficiently large $n$, since $e>2$ and $\frac{\ln\ln\ln n}{\ln\ln n}\to 0$. Thus,

\[\Pr\left[Y_i>\frac{e\ln n}{\ln\ln n}\right]\le e^{-2\ln n}=\frac{1}{n^2}.\]

Applying the union bound, the probability that there exists a bin with load $>\frac{e\ln n}{\ln\ln n}$ is

\[n\cdot\Pr\left[Y_1>\frac{e\ln n}{\ln\ln n}\right]\le\frac{1}{n}.\]

Therefore, for $m=n$, with high probability, the maximum load is $O\left(\frac{\ln n}{\ln\ln n}\right)$.
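This bound can be observed in simulation. The following Python sketch throws $n$ balls into $n$ bins and records the maximum load (the value of $n$ and the trial count are arbitrary illustrative choices):

```python
import math
import random

# Throw n balls into n bins uniformly at random, record the maximum load,
# and compare it with the e*ln(n)/ln(ln(n)) threshold from the analysis.
n, trials = 10000, 20
threshold = math.e * math.log(n) / math.log(math.log(n))

max_loads = []
for _ in range(trials):
    loads = [0] * n
    for _ in range(n):
        loads[random.randrange(n)] += 1
    max_loads.append(max(loads))

print(f"threshold e*ln(n)/ln(ln(n)) = {threshold:.2f}")
print(f"observed maximum loads: {sorted(max_loads)}")
```

In typical runs the observed maxima stay below the threshold, as the analysis predicts.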

For larger $m$

When $m\ge n\ln n$, then according to $\mu=\frac{m}{n}$, we have $\mu\ge\ln n$.

We can apply an easier form of the Chernoff bounds: by form (2) above, with $t=2e\mu\ge 2e\ln n$,

\[\Pr[Y_i\ge 2e\mu]\le 2^{-2e\mu}\le 2^{-2e\ln n}<\frac{1}{n^2}.\]

By the union bound, the probability that there exists a bin with load $\ge 2e\frac{m}{n}$ is

\[n\cdot\Pr\left[Y_1\ge 2e\frac{m}{n}\right]\le\frac{1}{n}.\]

Therefore, for $m\ge n\ln n$, with high probability, the maximum load is $O\left(\frac{m}{n}\right)$.
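A similar simulation illustrates the heavily loaded case; here we take $m=n\ln n$, the smallest value covered by the analysis (the value of $n$ is an arbitrary illustrative choice):

```python
import math
import random

# The heavily loaded case: throw m = n*ln(n) balls into n bins and compare
# the maximum load with the 2e * m/n threshold used in the union bound.
n = 1000
m = int(n * math.log(n))

loads = [0] * n
for _ in range(m):
    loads[random.randrange(n)] += 1

print(f"average load m/n = {m / n:.2f}")
print(f"maximum load     = {max(loads)}")
print(f"2e * m/n         = {2 * math.e * m / n:.2f}")
```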