Probability Basics
We are familiar with the principle of inclusion-exclusion for finite sets.
Principle of Inclusion-Exclusion:
- Let [math]\displaystyle{ S_1, S_2, \ldots, S_n }[/math] be [math]\displaystyle{ n }[/math] finite sets. Then
- [math]\displaystyle{ \begin{align}
\left|\bigcup_{1\le i\le n}S_i\right|
&=
\sum_{i=1}^n|S_i|
-\sum_{i\lt j}|S_i\cap S_j|
+\sum_{i\lt j\lt k}|S_i\cap S_j\cap S_k|\\
& \quad -\cdots
+(-1)^{\ell-1}\sum_{i_1\lt i_2\lt \cdots\lt i_\ell}\left|\bigcap_{r=1}^\ell S_{i_r}\right|
+\cdots
+(-1)^{n-1} \left|\bigcap_{i=1}^n S_i\right|.
\end{align} }[/math]
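As a quick sanity check, the alternating sum above can be evaluated directly for small sets. The following sketch (the three example sets are arbitrary choices, not from the text) computes the signed sum over all nonempty intersections and compares it with the size of the union:

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    """|S1 ∪ ... ∪ Sn| as the alternating sum of intersection sizes."""
    n = len(sets)
    total = 0
    for ell in range(1, n + 1):            # size of the index subset
        sign = (-1) ** (ell - 1)           # + for odd ell, - for even ell
        for subset in combinations(sets, ell):
            total += sign * len(set.intersection(*subset))
    return total

# Example: three overlapping sets; the union is {1,2,3,4,5}, of size 5.
sets = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
assert union_size_by_inclusion_exclusion(sets) == len(set().union(*sets))
```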
The principle can be generalized to probability events.
Principle of Inclusion-Exclusion:
- Let [math]\displaystyle{ \mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n }[/math] be [math]\displaystyle{ n }[/math] events. Then
- [math]\displaystyle{ \begin{align}
\Pr\left[\bigvee_{1\le i\le n}\mathcal{E}_i\right]
&=
\sum_{i=1}^n\Pr[\mathcal{E}_i]
-\sum_{i\lt j}\Pr[\mathcal{E}_i\wedge \mathcal{E}_j]
+\sum_{i\lt j\lt k}\Pr[\mathcal{E}_i\wedge \mathcal{E}_j\wedge \mathcal{E}_k]\\
& \quad -\cdots
+(-1)^{\ell-1}\sum_{i_1\lt i_2\lt \cdots\lt i_\ell}\Pr\left[\bigwedge_{r=1}^\ell \mathcal{E}_{i_r}\right]
+\cdots
+(-1)^{n-1}\Pr\left[\bigwedge_{i=1}^n \mathcal{E}_{i}\right].
\end{align} }[/math]
The proof of the principle relies on measure theory and is omitted here. The following inequality follows (nontrivially) from the principle of inclusion-exclusion:
Theorem (the union bound):
- Let [math]\displaystyle{ \mathcal{E}_1, \mathcal{E}_2, \ldots, \mathcal{E}_n }[/math] be [math]\displaystyle{ n }[/math] events. Then
- [math]\displaystyle{ \begin{align}
\Pr\left[\bigvee_{1\le i\le n}\mathcal{E}_i\right]
&\le
\sum_{i=1}^n\Pr[\mathcal{E}_i].
\end{align} }[/math]
The formal name of this inequality is Boole's inequality, but it is usually referred to by its nickname, the "union bound".
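The union bound is easy to verify on a small sample space. In the sketch below (a fair six-sided die and three overlapping events, both chosen here purely for illustration), the probability of the union is computed exactly and compared against the sum of the individual probabilities:

```python
from fractions import Fraction

# Sample space: a fair six-sided die, each outcome with probability 1/6.
def pr(event):
    return Fraction(len(event), 6)

# Three (overlapping) example events.
events = [{1, 2}, {2, 3}, {5}]
union = set().union(*events)            # {1, 2, 3, 5}

lhs = pr(union)                          # Pr[E1 ∨ E2 ∨ E3] = 4/6
rhs = sum(pr(e) for e in events)         # 2/6 + 2/6 + 1/6 = 5/6
assert lhs <= rhs                        # the union bound holds
```

The inequality is strict here exactly because the events overlap: outcome 2 is counted twice on the right-hand side.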
Let [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] be discrete random variables.
Definition (Independence):
- Two random variables [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are independent if and only if
- [math]\displaystyle{ \begin{align}
\Pr[X=x\wedge Y=y]
&=
\Pr[X=x]\cdot\Pr[Y=y]
\end{align} }[/math]
- for all values [math]\displaystyle{ x }[/math] and [math]\displaystyle{ y }[/math].
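The definition can be checked exhaustively on a finite sample space. In the sketch below (two fair dice, an example of my choosing), the first die and the second die are independent, while the first die and the sum of the two dice are not:

```python
from fractions import Fraction
from itertools import product

# Sample space: two fair dice, all 36 ordered outcomes equally likely.
outcomes = list(product(range(1, 7), repeat=2))

def pr(pred):
    """Exact probability of the event {w : pred(w)}."""
    return Fraction(sum(1 for w in outcomes if pred(w)), len(outcomes))

# X = first die, Y = second die: independent, so the product rule
# holds for every pair of values (x, y).
for x in range(1, 7):
    for y in range(1, 7):
        joint = pr(lambda w: w[0] == x and w[1] == y)
        assert joint == pr(lambda w: w[0] == x) * pr(lambda w: w[1] == y)

# X = first die, S = X + Y: NOT independent. For example,
# Pr[X=1 ∧ S=12] = 0, but Pr[X=1]·Pr[S=12] = (1/6)·(1/36) > 0.
joint = pr(lambda w: w[0] == 1 and w[0] + w[1] == 12)
assert joint != pr(lambda w: w[0] == 1) * pr(lambda w: w[0] + w[1] == 12)
```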
The expectation of [math]\displaystyle{ X }[/math] is defined as follows.
Definition (Expectation):
- The expectation of a discrete random variable [math]\displaystyle{ X }[/math], denoted by [math]\displaystyle{ \mathbb{E}[X] }[/math], is given by
- [math]\displaystyle{ \begin{align}
\mathbb{E}[X] &= \sum_{x}x\Pr[X=x],
\end{align} }[/math]
- where the summation is over all values [math]\displaystyle{ x }[/math] in the range of [math]\displaystyle{ X }[/math].
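As a worked example of this definition (the fair die is an assumption of this sketch, not from the text), the expectation of a fair six-sided die is [math]\displaystyle{ \sum_{x=1}^6 x\cdot\frac{1}{6} = \frac{7}{2} }[/math]:

```python
from fractions import Fraction

# E[X] for a fair six-sided die: sum over the range of X of x * Pr[X = x].
values = range(1, 7)
expectation = sum(x * Fraction(1, 6) for x in values)
assert expectation == Fraction(7, 2)   # i.e. 3.5
```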
Linearity of Expectation
Balls-into-bins model
Imagine that [math]\displaystyle{ m }[/math] balls are thrown into [math]\displaystyle{ n }[/math] bins, in such a way that each ball is thrown into a bin which is uniformly and independently chosen from all [math]\displaystyle{ n }[/math] bins. We may ask several questions regarding the distribution of balls in the bins, including:
- the probability that there is no bin with more than one ball (the birthday problem)
- the expected number of balls in each bin (occupancy problem)
- the maximum load of all bins with high probability (occupancy problem)
- the probability that there is no empty bin (coupon collector problem)
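The model is straightforward to simulate, which gives a rough empirical feel for these quantities before we analyze them. The sketch below (parameter choices [math]\displaystyle{ m = n = 1000 }[/math], the trial count, and the seed are all mine) estimates the average number of empty bins and the average maximum load:

```python
import random

def throw_balls(m, n, rng):
    """Throw m balls into n bins; each ball picks a bin uniformly
    and independently. Returns the list of bin loads."""
    loads = [0] * n
    for _ in range(m):
        loads[rng.randrange(n)] += 1
    return loads

rng = random.Random(0)       # fixed seed for reproducibility
m = n = 1000
trials = 200

empty_counts, max_loads = [], []
for _ in range(trials):
    loads = throw_balls(m, n, rng)
    empty_counts.append(loads.count(0))
    max_loads.append(max(loads))

# With m = n, a bin is empty with probability (1 - 1/n)^n ≈ 1/e,
# so roughly n/e ≈ 368 bins should be empty on average.
print(sum(empty_counts) / trials)
# The maximum load should be noticeably larger than the average load 1.
print(sum(max_loads) / trials)
```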
The coupon collector problem
Deviation bounds
Markov's inequality
Chebyshev's inequality
The coupon collector revisited
The [math]\displaystyle{ k }[/math]-Median Problem