Select the Median
The selection problem asks for the [math]\displaystyle{ k }[/math]th smallest element of a set [math]\displaystyle{ S }[/math] of [math]\displaystyle{ n }[/math] elements. A typical instance of the selection problem is finding the median, the [math]\displaystyle{ (\lceil n/2\rceil) }[/math]th element in the sorted order of [math]\displaystyle{ S }[/math].
The median can be found in [math]\displaystyle{ O(n\log n) }[/math] time by sorting. There is also a linear-time deterministic algorithm, the "median of medians" algorithm, but it is quite sophisticated. Here we introduce a much simpler randomized algorithm that also runs in linear time. The idea behind this algorithm is random sampling.
Randomized median algorithm
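As an illustration of the random-sampling idea, here is a minimal Python sketch of one well-known sampling-based scheme. It is not necessarily the exact algorithm intended in this section. The sample size [math]\displaystyle{ n^{3/4} }[/math], the rank window of half-width [math]\displaystyle{ \sqrt{n} }[/math], the retry limit, the fallback to sorting, and the function name `randomized_median` are all assumptions made for the example.

```python
import math
import random


def randomized_median(items, rng=random, max_attempts=20):
    """Return the ceil(n/2)-th smallest element of a non-empty list.

    Sketch of a sampling-based scheme; all parameters are assumptions.
    """
    n = len(items)
    if n == 0:
        raise ValueError("median of an empty sequence is undefined")
    k = (n + 1) // 2                 # 1-based rank of the median, ceil(n/2)
    m = math.ceil(n ** 0.75)         # sample size (assumed parameter)
    window = math.sqrt(n)            # half-width of the rank window (assumed)

    for _ in range(max_attempts):
        # Sample m elements with replacement and sort only the sample.
        sample = sorted(rng.choice(items) for _ in range(m))
        lo = max(0, math.floor(m / 2 - window))
        hi = min(m - 1, math.ceil(m / 2 + window))
        d, u = sample[lo], sample[hi]

        # One linear pass: count elements below d, collect those in [d, u].
        below = sum(1 for x in items if x < d)
        middle = [x for x in items if d <= x <= u]

        # Retry if the median is not bracketed or the bracket is too large.
        if not (below < k <= below + len(middle)):
            continue
        if len(middle) > 4 * m:
            continue

        middle.sort()                # small set, so sorting it is cheap
        return middle[k - below - 1]

    # Fallback (an assumption of this sketch): plain sorting.
    return sorted(items)[k - 1]
```

Because the result is returned only after the bracket is verified, the answer is always correct, and randomness affects only the running time. The fallback guarantees termination on inputs with many repeated values, where the size check could otherwise fail on every attempt.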
Analysis
Chernoff Bound
Suppose that we have a fair coin. If we toss it once, the outcome is completely unpredictable. But if we toss it, say, 1000 times, the outcome becomes very predictable: the number of HEADs is very likely to be close to 500. This striking phenomenon is called concentration.
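The coin-tossing claim can be checked empirically. The following Python sketch repeats the 1000-toss experiment many times and reports the fraction of runs in which the number of HEADs falls within 50 of 500. The function name `fraction_near_half`, the slack of 50, the number of trials, and the fixed seed are assumptions chosen for illustration.

```python
import random


def fraction_near_half(tosses=1000, trials=1000, slack=50, seed=0):
    """Fraction of experiments whose HEAD count is within `slack` of tosses/2."""
    rng = random.Random(seed)        # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        # Each toss is HEAD with probability 1/2, independently.
        heads = sum(rng.random() < 0.5 for _ in range(tosses))
        if abs(heads - tosses / 2) <= slack:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(fraction_near_half())
```

Almost every run should land in the window, even though each individual toss is unpredictable.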
Moment generating functions
The Chernoff bound
Chernoff bound (the upper tail):
- Let [math]\displaystyle{ X=\sum_{i=1}^n X_i }[/math], where [math]\displaystyle{ X_1, X_2, \ldots, X_n }[/math] are independent Poisson trials. Let [math]\displaystyle{ \mu=\mathbf{E}[X] }[/math].
- Then for any [math]\displaystyle{ \delta\gt 0 }[/math],
- [math]\displaystyle{ \Pr[X\ge (1+\delta)\mu]\lt \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}. }[/math]
Chernoff bound (the lower tail):
- Let [math]\displaystyle{ X=\sum_{i=1}^n X_i }[/math], where [math]\displaystyle{ X_1, X_2, \ldots, X_n }[/math] are independent Poisson trials. Let [math]\displaystyle{ \mu=\mathbf{E}[X] }[/math].
- Then for any [math]\displaystyle{ 0\lt \delta\lt 1 }[/math],
- [math]\displaystyle{ \Pr[X\le (1-\delta)\mu]\lt \left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu}. }[/math]
Chernoff-Hoeffding bound (for continuous random variables):
- Let [math]\displaystyle{ X=\sum_{i=1}^n X_i }[/math], where for each [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ X_i }[/math] is independently distributed over the range [math]\displaystyle{ [0,1] }[/math]. Let [math]\displaystyle{ \mu=\mathbf{E}[X] }[/math].
- Then for any [math]\displaystyle{ \delta\gt 0 }[/math],
- [math]\displaystyle{ \Pr[X\ge (1+\delta)\mu]\lt \left(\frac{e^{\delta}}{(1+\delta)^{(1+\delta)}}\right)^{\mu}; }[/math]
- and for any [math]\displaystyle{ 0\lt \delta\lt 1 }[/math],
- [math]\displaystyle{ \Pr[X\le (1-\delta)\mu]\lt \left(\frac{e^{-\delta}}{(1-\delta)^{(1-\delta)}}\right)^{\mu}. }[/math]
Useful forms of the Chernoff bound
- Let [math]\displaystyle{ X=\sum_{i=1}^n X_i }[/math], where for each [math]\displaystyle{ 1\le i\le n }[/math], [math]\displaystyle{ X_i }[/math] is independently distributed over the range [math]\displaystyle{ [0,1] }[/math]. Let [math]\displaystyle{ \mu=\mathbf{E}[X] }[/math]. Then
- 1. for [math]\displaystyle{ 0\lt \delta\le 1 }[/math],
- [math]\displaystyle{ \Pr[X\ge (1+\delta)\mu]\lt \exp\left(-\frac{\mu\delta^2}{3}\right); }[/math]
- [math]\displaystyle{ \Pr[X\le (1-\delta)\mu]\lt \exp\left(-\frac{\mu\delta^2}{2}\right); }[/math]
- 2. for [math]\displaystyle{ t\gt 0 }[/math],
- [math]\displaystyle{ \Pr[X\ge\mu+t]\le \exp\left(-\frac{2t^2}{n}\right); }[/math]
- [math]\displaystyle{ \Pr[X\le\mu-t]\le \exp\left(-\frac{2t^2}{n}\right); }[/math]
- 3. for [math]\displaystyle{ t\ge 2e\mu }[/math],
- [math]\displaystyle{ \Pr[X\ge t]\le 2^{-t}. }[/math]
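To get a feel for how the different forms compare, the following Python sketch evaluates three of the upper-tail bounds for 1000 fair coin tosses and the event of at least 550 HEADs, so [math]\displaystyle{ n=1000 }[/math], [math]\displaystyle{ \mu=500 }[/math], [math]\displaystyle{ \delta=0.1 }[/math] and [math]\displaystyle{ t=50 }[/math], alongside the exact binomial tail. The function names and the choice of example are assumptions made for illustration.

```python
import math


def chernoff_upper(mu, delta):
    """General upper-tail bound (e^delta / (1+delta)^(1+delta))^mu, delta > 0."""
    # Computed in log space to avoid overflow for large mu.
    return math.exp(mu * (delta - (1 + delta) * math.log1p(delta)))


def chernoff_upper_simple(mu, delta):
    """Simplified bound exp(-mu * delta^2 / 3), stated for 0 < delta <= 1."""
    if not 0 < delta <= 1:
        raise ValueError("this form is stated only for 0 < delta <= 1")
    return math.exp(-mu * delta ** 2 / 3)


def additive_upper(n, t):
    """Additive-deviation bound exp(-2 t^2 / n) for n variables in [0, 1]."""
    return math.exp(-2 * t ** 2 / n)


def exact_upper_tail(n, threshold):
    """Exact Pr[X >= threshold] for X distributed as Binomial(n, 1/2)."""
    favourable = sum(math.comb(n, k) for k in range(threshold, n + 1))
    return favourable / 2 ** n


if __name__ == "__main__":
    n, mu, delta, t = 1000, 500, 0.1, 50
    print("exact tail       :", exact_upper_tail(n, mu + t))
    print("general Chernoff :", chernoff_upper(mu, delta))
    print("simplified form  :", chernoff_upper_simple(mu, delta))
    print("additive form    :", additive_upper(n, t))
```

In this particular example the additive form gives the smallest bound and the simplified multiplicative form the largest. Which form is tightest depends on how [math]\displaystyle{ \mu }[/math] compares to [math]\displaystyle{ n }[/math]: the additive bound depends only on [math]\displaystyle{ n }[/math] and [math]\displaystyle{ t }[/math], so the multiplicative forms tend to do better when [math]\displaystyle{ \mu }[/math] is much smaller than [math]\displaystyle{ n }[/math].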
Permutation Routing