Advanced Algorithms (Fall 2018)/Basic tail inequalities

Markov's Inequality

One of the most natural pieces of information about a random variable is its expectation, which is its first moment. Markov's inequality derives a tail bound for a random variable from its expectation.

Theorem (Markov's Inequality)
Let $X$ be a random variable assuming only nonnegative values. Then, for all $t>0$,
$\Pr[X\ge t]\le\frac{\mathbf{E}[X]}{t}.$
Proof.
Let $Y$ be the indicator such that
$Y=\begin{cases}1 & \text{if } X\ge t,\\ 0 & \text{otherwise.}\end{cases}$

It holds that $Y\le\frac{X}{t}$. Since $Y$ is 0-1 valued, $\mathbf{E}[Y]=\Pr[Y=1]=\Pr[X\ge t]$. Therefore,
$\Pr[X\ge t]=\mathbf{E}[Y]\le\mathbf{E}\!\left[\frac{X}{t}\right]=\frac{\mathbf{E}[X]}{t}.$
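
The bound is easy to check numerically. The following is a minimal Python sketch (not part of the original notes; the exponential distribution and the sample size are arbitrary illustrative choices) comparing the empirical tail $\Pr[X\ge t]$ with the Markov bound $\mathbf{E}[X]/t$:

 import random

 # Illustrative sketch: X ~ Exp(1), so E[X] = 1 and X is nonnegative.
 random.seed(0)
 samples = [random.expovariate(1.0) for _ in range(100_000)]
 mean = sum(samples) / len(samples)

 for t in [1, 2, 4, 8]:
     empirical = sum(x >= t for x in samples) / len(samples)
     markov = mean / t   # Markov's bound: Pr[X >= t] <= E[X] / t
     print(f"t={t}: empirical tail {empirical:.4f} <= Markov bound {markov:.4f}")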

Generalization

For any random variable $X$ and an arbitrary non-negative real function $h$, $h(X)$ is a non-negative random variable. Applying Markov's inequality, we directly have that
$\Pr[h(X)\ge t]\le\frac{\mathbf{E}[h(X)]}{t}.$

This trivial application of Markov's inequality gives us a powerful tool for proving tail inequalities. With a function $h$ which extracts more information about the random variable, we can prove sharper tail inequalities.
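
As a concrete instance of this generalization (the choice $h(x)=|x|^k$ is a standard one, not spelled out in this section), taking any $k\ge 1$ and $t>0$ gives the $k$-th moment bound
$\Pr[|X|\ge t]=\Pr\left[|X|^k\ge t^k\right]\le\frac{\mathbf{E}\left[|X|^k\right]}{t^k}.$
Chebyshev's inequality below is exactly the case $k=2$ applied to $X-\mathbf{E}[X]$.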

Chebyshev's inequality

Variance

Definition (variance)
The variance of a random variable $X$ is defined as
$\mathbf{Var}[X]=\mathbf{E}\left[(X-\mathbf{E}[X])^2\right]=\mathbf{E}\left[X^2\right]-(\mathbf{E}[X])^2.$
The standard deviation of random variable $X$ is
$\delta(X)=\sqrt{\mathbf{Var}[X]}.$

The variance is the diagonal case of covariance.

Definition (covariance)
The covariance of two random variables $X$ and $Y$ is
$\mathbf{Cov}(X,Y)=\mathbf{E}\left[(X-\mathbf{E}[X])(Y-\mathbf{E}[Y])\right].$
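
As a quick sanity check (a hypothetical sketch, not part of the notes; the Gaussian samples and the sample size are arbitrary), both quantities can be estimated directly from the definitions, and $\mathbf{Var}[X]$ coincides with $\mathbf{Cov}(X,X)$:

 import random

 # Illustrative sketch: estimate Var and Cov from samples, and check that
 # the variance is the covariance of a variable with itself.
 random.seed(1)
 n = 100_000
 xs = [random.gauss(0, 1) for _ in range(n)]
 ys = [x + random.gauss(0, 1) for x in xs]   # ys is correlated with xs

 def mean(v):
     return sum(v) / len(v)

 def cov(u, v):
     mu, mv = mean(u), mean(v)
     return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

 print("Var[X] ~ Cov(X,X) =", cov(xs, xs))   # should be close to 1
 print("Cov(X,Y)          =", cov(xs, ys))   # should be close to 1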

We have the following theorem for the variance of a sum.

Theorem
For any two random variables $X$ and $Y$,
$\mathbf{Var}[X+Y]=\mathbf{Var}[X]+\mathbf{Var}[Y]+2\,\mathbf{Cov}(X,Y).$
Generally, for any random variables $X_1,X_2,\ldots,X_n$,
$\mathbf{Var}\left[\sum_{i=1}^n X_i\right]=\sum_{i=1}^n\mathbf{Var}[X_i]+\sum_{i\ne j}\mathbf{Cov}(X_i,X_j).$
Proof.
The equation for two variables follows directly from the definitions of variance and covariance (see the expansion below). The equation for $n$ variables can be deduced from the equation for two variables.
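
For completeness, the omitted two-variable calculation (a routine expansion, sketched here) is as follows: writing $\mu_X=\mathbf{E}[X]$ and $\mu_Y=\mathbf{E}[Y]$,
$\mathbf{Var}[X+Y]=\mathbf{E}\left[\left((X-\mu_X)+(Y-\mu_Y)\right)^2\right]=\mathbf{E}\left[(X-\mu_X)^2\right]+\mathbf{E}\left[(Y-\mu_Y)^2\right]+2\,\mathbf{E}\left[(X-\mu_X)(Y-\mu_Y)\right]=\mathbf{Var}[X]+\mathbf{Var}[Y]+2\,\mathbf{Cov}(X,Y).$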

For independent random variables, the expectation of a product equals the product of expectations.

Theorem
For any two independent random variables $X$ and $Y$,
$\mathbf{E}[X\cdot Y]=\mathbf{E}[X]\cdot\mathbf{E}[Y].$
Proof.
$\mathbf{E}[X\cdot Y]=\sum_{x,y}xy\cdot\Pr[X=x\wedge Y=y]=\sum_{x,y}xy\cdot\Pr[X=x]\Pr[Y=y]=\left(\sum_{x}x\Pr[X=x]\right)\left(\sum_{y}y\Pr[Y=y]\right)=\mathbf{E}[X]\cdot\mathbf{E}[Y],$
where the second equality uses the independence of $X$ and $Y$.

Consequently, the covariance of independent random variables is always zero.

Theorem
For any two independent random variables $X$ and $Y$,
$\mathbf{Cov}(X,Y)=0.$
Proof.
$\mathbf{Cov}(X,Y)=\mathbf{E}\left[(X-\mathbf{E}[X])(Y-\mathbf{E}[Y])\right]=\mathbf{E}[X-\mathbf{E}[X]]\cdot\mathbf{E}[Y-\mathbf{E}[Y]]=0,$
where the second equality holds because $X-\mathbf{E}[X]$ and $Y-\mathbf{E}[Y]$ are independent, so the previous theorem applies.

The variance of the sum of pairwise independent random variables is equal to the sum of variances.

Theorem
For pairwise independent random variables $X_1,X_2,\ldots,X_n$,
$\mathbf{Var}\left[\sum_{i=1}^n X_i\right]=\sum_{i=1}^n\mathbf{Var}[X_i].$
Remark
The theorem requires only pairwise independence of the random variables, a much weaker requirement than mutual independence. This makes second-moment methods very useful for pairwise independent random variables.
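
The distinction matters in practice. Below is a small Python sketch (a hypothetical illustration, not part of the notes): two independent fair bits $X,Y$ together with $Z=X\oplus Y$ are pairwise independent but not mutually independent, yet the variance of their sum still equals the sum of their variances.

 import itertools

 # Illustrative sketch: X, Y independent fair bits, Z = X XOR Y.
 # The uniform distribution over the four (x, y) outcomes induces the joint
 # distribution of (X, Y, Z); each outcome below has probability 1/4.
 outcomes = [(x, y, x ^ y) for x, y in itertools.product([0, 1], repeat=2)]

 def E(f):
     return sum(f(o) for o in outcomes) / len(outcomes)

 def Var(f):
     return E(lambda o: f(o) ** 2) - E(f) ** 2

 total = Var(lambda o: o[0] + o[1] + o[2])
 parts = sum(Var(lambda o, i=i: o[i]) for i in range(3))
 print("Var[X+Y+Z]           =", total)   # 0.75
 print("Var[X]+Var[Y]+Var[Z] =", parts)   # 0.75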

Variance of binomial distribution

For a Bernoulli trial with parameter $p$,
$X=\begin{cases}1 & \text{with probability } p,\\ 0 & \text{with probability } 1-p.\end{cases}$

The variance is
$\mathbf{Var}[X]=\mathbf{E}\left[X^2\right]-(\mathbf{E}[X])^2=\mathbf{E}[X]-(\mathbf{E}[X])^2=p-p^2=p(1-p).$

Let $Y$ be a binomial random variable with parameters $n$ and $p$, i.e. $Y=\sum_{i=1}^n Y_i$, where the $Y_i$'s are i.i.d. Bernoulli trials with parameter $p$. The variance is
$\mathbf{Var}[Y]=\mathbf{Var}\left[\sum_{i=1}^n Y_i\right]=\sum_{i=1}^n\mathbf{Var}[Y_i]=np(1-p),$
where the second equality holds because the trials are independent.
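
As a quick check of this formula (a hypothetical sketch, not part of the notes; $n=10$ and $p=0.3$ are arbitrary), the variance can be computed exactly from the binomial probability mass function and compared with $np(1-p)$:

 from math import comb

 # Illustrative sketch: compute Var[Y] for Y ~ Binomial(n, p) from its pmf.
 n, p = 10, 0.3
 pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
 EY  = sum(k * pmf[k] for k in range(n + 1))
 EY2 = sum(k * k * pmf[k] for k in range(n + 1))
 print("Var[Y]    =", EY2 - EY**2)       # ~ 2.1
 print("n*p*(1-p) =", n * p * (1 - p))   # 2.1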

Chebyshev's inequality

Theorem (Chebyshev's Inequality)
For any $t>0$,
$\Pr\left[|X-\mathbf{E}[X]|\ge t\right]\le\frac{\mathbf{Var}[X]}{t^2}.$
Proof.
Observe that
$\Pr\left[|X-\mathbf{E}[X]|\ge t\right]=\Pr\left[(X-\mathbf{E}[X])^2\ge t^2\right].$

Since $(X-\mathbf{E}[X])^2$ is a nonnegative random variable, we can apply Markov's inequality, such that
$\Pr\left[(X-\mathbf{E}[X])^2\ge t^2\right]\le\frac{\mathbf{E}\left[(X-\mathbf{E}[X])^2\right]}{t^2}=\frac{\mathbf{Var}[X]}{t^2}.$
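
To see the inequality in action, here is a minimal Python sketch (a hypothetical illustration, not part of the notes; the parameters $n=100$, $p=0.5$ and the sample size are arbitrary) comparing the empirical tail $\Pr[|X-\mathbf{E}[X]|\ge t]$ of a binomial random variable with the Chebyshev bound $\mathbf{Var}[X]/t^2$, using $\mathbf{Var}[X]=np(1-p)$ from the previous section:

 import random

 # Illustrative sketch: X ~ Binomial(n, p), E[X] = n*p, Var[X] = n*p*(1-p).
 random.seed(3)
 n, p, trials = 100, 0.5, 50_000
 mu, var = n * p, n * p * (1 - p)
 samples = [sum(random.random() < p for _ in range(n)) for _ in range(trials)]

 for t in [5, 10, 15, 20]:
     empirical = sum(abs(s - mu) >= t for s in samples) / trials
     chebyshev = var / t**2   # Chebyshev's bound on Pr[|X - E[X]| >= t]
     print(f"t={t}: empirical tail {empirical:.4f} <= Chebyshev bound {chebyshev:.4f}")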