# Randomized Algorithms (Fall 2011)/Azuma's Inequality

We introduce a martingale tail inequality, called Azuma's inequality.

**Azuma's Inequality**
- Let $X_0,X_1,\dots$ be a martingale such that, for all $k\ge 1$,
$$|X_k-X_{k-1}|\le c_k.$$
- Then
$$\Pr\left[|X_n-X_0|\ge t\right]\le 2\exp\left(-\frac{t^2}{2\sum_{k=1}^n c_k^2}\right).$$
Before formally proving this theorem, some comments are in order. First, unlike the Chernoff bounds, there is no assumption of independence. This shows the power of martingale inequalities.

Second, the condition that
$$|X_k-X_{k-1}|\le c_k$$
is central to the proof. This condition is sometimes called the **bounded difference condition**. If we think of the martingale $X_0,X_1,\dots$ as a process evolving through time, where $X_i$ gives some measurement at time $i$, the bounded difference condition states that the process does not make big jumps. Azuma's inequality says that if this is so, then it is unlikely that the process wanders far from its starting point.

A special case is when the differences are bounded by a constant. The following corollary is directly implied by Azuma's inequality.

**Corollary**
- Let $X_0,X_1,\dots$ be a martingale such that, for all $k\ge 1$,
$$|X_k-X_{k-1}|\le c.$$
- Then
$$\Pr\left[|X_n-X_0|\ge ct\sqrt{n}\right]\le 2e^{-t^2/2}.$$
This corollary states that for any martingale sequence whose differences are bounded by a constant, the probability that it deviates $\omega(\sqrt{n})$ far away from the starting point after $n$ steps is bounded by $o(1)$.
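As a sanity check on the corollary, one can simulate a martingale with constant-bounded differences and compare the empirical tail probability against the bound. Below is a minimal sketch in Python, assuming a $\pm 1$ random walk as the martingale (a martingale with differences bounded by $c=1$); the function names and parameter values are illustrative, not from the text:

```python
import math
import random

def azuma_bound(t, n, c=1.0):
    """Two-sided Azuma bound: Pr[|X_n - X_0| >= t] <= 2*exp(-t^2 / (2*n*c^2))."""
    return 2.0 * math.exp(-t * t / (2.0 * n * c * c))

def simulate_walk(n, rng):
    """A +/-1 random walk started at 0: a martingale with |X_k - X_{k-1}| = 1."""
    return sum(rng.choice((-1, 1)) for _ in range(n))

rng = random.Random(42)
n, trials = 100, 20000
t = 3.0 * math.sqrt(n)  # a deviation of c*t*sqrt(n) with c = 1, t = 3

empirical = sum(abs(simulate_walk(n, rng)) >= t for _ in range(trials)) / trials
bound = azuma_bound(t, n)

print(f"empirical tail: {empirical:.4f}, Azuma bound: {bound:.4f}")
assert empirical <= bound  # holds comfortably: the bound is not tight here
```

Note that the bound $2e^{-t^2/2}\approx 0.022$ for $t=3$ is loose compared to the true tail of a $\pm 1$ walk, as expected from a bound that uses only the bounded-difference structure.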

The proof of Azuma's inequality uses several ideas from the proof of the Chernoff bounds. We first observe that the total deviation of the martingale sequence can be represented as the sum of the differences in every step. Thus, as with the Chernoff bounds, we are looking for a bound on the deviation of a sum of random variables. The strategy of the proof is almost the same as the proof of the Chernoff bounds: we first apply Markov's inequality to the moment generating function, then we bound the moment generating function, and at last we optimize the parameter of the moment generating function. However, unlike in the Chernoff bounds, the martingale differences are no longer independent. So we replace the use of independence in the Chernoff bound by the martingale property. The proof is detailed as follows.

In order to bound the probability of $|X_n-X_0|\ge t$, we first bound the upper tail $\Pr[X_n-X_0\ge t]$. The bound of the lower tail can be symmetrically proved with $X_i$ replaced by $-X_i$.


### Represent the deviation as the sum of differences

We define the **martingale difference sequence**: for $i\ge 1$, let
$$Y_i=X_i-X_{i-1}.$$
It holds that
$$\begin{aligned}
\mathbf{E}[Y_i\mid X_0,\dots,X_{i-1}]
&=\mathbf{E}[X_i-X_{i-1}\mid X_0,\dots,X_{i-1}]\\
&=\mathbf{E}[X_i\mid X_0,\dots,X_{i-1}]-\mathbf{E}[X_{i-1}\mid X_0,\dots,X_{i-1}]\\
&=X_{i-1}-X_{i-1}\\
&=0.
\end{aligned}$$
The second to the last equation is due to the fact that $X_0,X_1,\dots$ is a martingale and the definition of conditional expectation.

Let $Z_n$ be the accumulated differences
$$Z_n=\sum_{i=1}^n Y_i.$$
The deviation $(X_n-X_0)$ can be computed by the accumulated differences:
$$X_n-X_0=\sum_{i=1}^n(X_i-X_{i-1})=\sum_{i=1}^n Y_i=Z_n.$$
We then only need to upper bound the probability of the event $Z_n\ge t$.
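The telescoping identity above can be checked on any sampled path. A small sketch, again using a $\pm 1$ random walk as the martingale (the variable names are illustrative):

```python
import random

rng = random.Random(0)

# A sample path X_0, ..., X_n of a +/-1 random walk martingale.
X = [0]
for _ in range(50):
    X.append(X[-1] + rng.choice((-1, 1)))

# The martingale difference sequence Y_i = X_i - X_{i-1}.
Y = [X[i] - X[i - 1] for i in range(1, len(X))]

# The accumulated differences telescope back to the total deviation.
Z_n = sum(Y)
assert Z_n == X[-1] - X[0]
print("X_n - X_0 =", Z_n)
```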

### Apply Markov's inequality to the moment generating function

The event $Z_n\ge t$ is equivalent to the event $e^{\lambda Z_n}\ge e^{\lambda t}$ for $\lambda>0$. Applying Markov's inequality, we have
$$\Pr[Z_n\ge t]=\Pr\left[e^{\lambda Z_n}\ge e^{\lambda t}\right]\le\frac{\mathbf{E}\left[e^{\lambda Z_n}\right]}{e^{\lambda t}}=e^{-\lambda t}\mathbf{E}\left[e^{\lambda Z_n}\right].$$
This is exactly the same as what we did to prove the Chernoff bound. Next, we need to bound the moment generating function $\mathbf{E}\left[e^{\lambda Z_n}\right]$.
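This Markov step can be observed empirically. In fact the inequality holds pointwise, $\mathbf{1}[z\ge t]\le e^{\lambda(z-t)}$ for $\lambda>0$, so it even holds exactly for the empirical distribution of any sample. A sketch, using sums of independent $\pm 1$ steps as a stand-in for $Z_n$ (the parameter values are arbitrary):

```python
import math
import random

rng = random.Random(7)
n, trials, t, lam = 20, 50000, 6.0, 0.3

# Sample Z_n = sum of n independent +/-1 steps (a special martingale difference sum).
samples = [sum(rng.choice((-1, 1)) for _ in range(n)) for _ in range(trials)]

tail = sum(z >= t for z in samples) / trials
mgf = sum(math.exp(lam * z) for z in samples) / trials  # estimate of E[e^{lam Z_n}]
markov = math.exp(-lam * t) * mgf

print(f"Pr[Z_n >= t] ~= {tail:.4f} <= e^(-lam t) E[e^(lam Z_n)] ~= {markov:.4f}")
# Guaranteed even for the empirical estimates, since 1[z >= t] <= e^{lam(z-t)} pointwise.
assert tail <= markov
```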

### Bound the moment generating functions

The moment generating function satisfies
$$\begin{aligned}
\mathbf{E}\left[e^{\lambda Z_n}\right]
&=\mathbf{E}\left[\mathbf{E}\left[e^{\lambda Z_n}\mid X_0,\dots,X_{n-1}\right]\right]\\
&=\mathbf{E}\left[\mathbf{E}\left[e^{\lambda(Z_{n-1}+Y_n)}\mid X_0,\dots,X_{n-1}\right]\right]\\
&=\mathbf{E}\left[e^{\lambda Z_{n-1}}\mathbf{E}\left[e^{\lambda Y_n}\mid X_0,\dots,X_{n-1}\right]\right].
\end{aligned}$$
The first and the last equations are due to the fundamental facts about conditional expectation which we proved in the first section.

We then upper bound $\mathbf{E}\left[e^{\lambda Y_n}\mid X_0,\dots,X_{n-1}\right]$ by a constant. To do so, we need the following technical lemma, which is proved by the convexity of $e^{\lambda y}$.

**Lemma**
- Let $Y$ be a random variable such that $\mathbf{E}[Y]=0$ and $|Y|\le c$. Then for $\lambda>0$,
$$\mathbf{E}\left[e^{\lambda Y}\right]\le e^{\lambda^2c^2/2}.$$

**Proof.** Observe that for $\lambda>0$, the function $e^{\lambda y}$ of the variable $y$ is convex in the interval $[-c,c]$. We draw a line between the two endpoints $(-c,e^{-\lambda c})$ and $(c,e^{\lambda c})$. The curve of $e^{\lambda y}$ lies entirely below this line. Thus,
$$e^{\lambda y}\le\frac{c-y}{2c}e^{-\lambda c}+\frac{c+y}{2c}e^{\lambda c}.$$
Since $\mathbf{E}[Y]=0$, we have
$$\mathbf{E}\left[e^{\lambda Y}\right]\le\mathbf{E}\left[\frac{c-Y}{2c}e^{-\lambda c}+\frac{c+Y}{2c}e^{\lambda c}\right]=\frac{e^{-\lambda c}+e^{\lambda c}}{2}.$$
By expanding both sides as Taylor's series, it can be verified that $\frac{e^{-\lambda c}+e^{\lambda c}}{2}\le e^{\lambda^2c^2/2}$.
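The two inequalities used in this proof, the chord bound from convexity and $\cosh(\lambda c)\le e^{\lambda^2c^2/2}$, can be spot-checked numerically. A sketch in Python (the grid of test values is arbitrary):

```python
import math

def cosh_bound_holds(lam, c):
    """Check cosh(lam*c) <= exp(lam^2 * c^2 / 2), the last step of the lemma."""
    return math.cosh(lam * c) <= math.exp(lam ** 2 * c ** 2 / 2)

# Spot-check the Taylor-series inequality over a grid of parameters.
for lam in (0.1, 0.5, 1.0, 2.0, 5.0):
    for c in (0.1, 1.0, 3.0):
        assert cosh_bound_holds(lam, c), (lam, c)

# The chord bound: e^{lam*y} lies below the line through (-c, e^{-lam*c}) and (c, e^{lam*c}).
lam, c = 1.0, 2.0
for k in range(-10, 11):
    y = c * k / 10  # sample points in [-c, c]
    chord = (c - y) / (2 * c) * math.exp(-lam * c) + (c + y) / (2 * c) * math.exp(lam * c)
    assert math.exp(lam * y) <= chord + 1e-12  # tolerance for the endpoint equality
print("lemma inequalities verified on the grid")
```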

Apply the above lemma to the random variable
$$(Y_n\mid X_0,\dots,X_{n-1}).$$
We have already shown that its expectation is $\mathbf{E}[Y_n\mid X_0,\dots,X_{n-1}]=0$, and by the bounded difference condition of Azuma's inequality, we have $|Y_n|=|X_n-X_{n-1}|\le c_n$. Thus, due to the above lemma, it holds that
$$\mathbf{E}\left[e^{\lambda Y_n}\mid X_0,\dots,X_{n-1}\right]\le e^{\lambda^2c_n^2/2}.$$

Back to our analysis of the expectation $\mathbf{E}\left[e^{\lambda Z_n}\right]$, we have
$$\mathbf{E}\left[e^{\lambda Z_n}\right]=\mathbf{E}\left[e^{\lambda Z_{n-1}}\mathbf{E}\left[e^{\lambda Y_n}\mid X_0,\dots,X_{n-1}\right]\right]\le e^{\lambda^2c_n^2/2}\,\mathbf{E}\left[e^{\lambda Z_{n-1}}\right].$$
Applying the same analysis to $\mathbf{E}\left[e^{\lambda Z_{n-1}}\right]$, we can solve the above recursion as
$$\mathbf{E}\left[e^{\lambda Z_n}\right]\le\prod_{k=1}^n e^{\lambda^2c_k^2/2}=\exp\left(\frac{\lambda^2}{2}\sum_{k=1}^n c_k^2\right).$$

Going back to Markov's inequality,
$$\Pr[Z_n\ge t]\le e^{-\lambda t}\mathbf{E}\left[e^{\lambda Z_n}\right]\le\exp\left(-\lambda t+\frac{\lambda^2}{2}\sum_{k=1}^n c_k^2\right).$$
We then only need to choose a proper $\lambda>0$.

### Optimization

By choosing $\lambda=\frac{t}{\sum_{k=1}^n c_k^2}$, we have that
$$\exp\left(-\lambda t+\frac{\lambda^2}{2}\sum_{k=1}^n c_k^2\right)=\exp\left(-\frac{t^2}{2\sum_{k=1}^n c_k^2}\right).$$
Thus, the probability
$$\Pr[X_n-X_0\ge t]=\Pr[Z_n\ge t]\le\exp\left(-\frac{t^2}{2\sum_{k=1}^n c_k^2}\right).$$
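The choice of $\lambda$ can also be checked numerically: for sample values of $t$ and $s=\sum_{k=1}^n c_k^2$, the quadratic exponent $-\lambda t+\frac{\lambda^2}{2}s$ is minimized at $\lambda=t/s$ and attains $-\frac{t^2}{2s}$. A sketch (the numeric values are arbitrary):

```python
def exponent(lam, t, s):
    """The exponent -lam*t + lam^2 * s / 2, where s = sum of c_k^2."""
    return -lam * t + lam * lam * s / 2

t, s = 5.0, 4.0     # example values for t and sum_k c_k^2
lam_star = t / s    # the optimal choice of lambda

# The optimum attains -t^2 / (2s) ...
assert abs(exponent(lam_star, t, s) - (-t * t / (2 * s))) < 1e-12
# ... and no other lambda in this sample does better (the exponent is a parabola in lambda).
for lam in (0.5 * lam_star, 0.9 * lam_star, 1.1 * lam_star, 2 * lam_star):
    assert exponent(lam, t, s) >= exponent(lam_star, t, s)
print("optimal exponent:", exponent(lam_star, t, s))
```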

The upper tail of Azuma's inequality is proved. By replacing each $X_i$ by $-X_i$, the lower tail can be treated just as the upper tail. Applying the union bound, Azuma's inequality is proved.