随机算法 (Fall 2011)/Mixing Time and 高级算法 (Fall 2017)/Hashing and Sketching: Difference between pages

From TCS Wiki
(Difference between pages)
Jump to navigation Jump to search
imported>WikiSysop
(Created page with '=Total Variation Distance= In probability theory, the '''total variation distance''' measures the difference between two probability distributions. {{Theorem|Definition (total va…')
 
imported>Etone
 
Line 1: Line 1:
=Total Variation Distance=
=Count Distinct Elements=
In probability theory, the '''total variation distance''' measures the difference between two probability distributions.
{{Theorem|Definition (total variation distance)|
:Let <math>p</math> and <math>q</math> be two probability distributions over the same finite state space <math>\Omega</math>, the '''total variation distance''' between <math>p</math> and <math>q</math> is defined as
::<math>\|p-q\|_{TV}=\frac{1}{2}\sum_{x\in\Omega}|p(x)-q(x)|=\frac{1}{2}\|p-q\|_1</math>,
:where <math>\|\cdot\|_1</math> is the <math>\ell_1</math>-norm of vectors.
}}
It can be verified (left as an exercise) that
:<math>\max_{A\subset\Omega}|p(A)-q(A)|=\frac{1}{2}\sum_{x\in\Omega}|p(x)-q(x)|</math>,
thus the total variation distance can be equivalently defined as
:<math>\|p-q\|_{TV}=\max_{A\subset\Omega}|p(A)-q(A)|</math>.
So the total variation distance between two distributions gives an upper bound on the difference between the probabilities of the same event according to the two distributions.


= Mixing Time =
== An estimator by hashing ==
The '''mixing time''' of a Markov chain gives the rate at which a Markov chain converges to the stationary distribution.


{{Theorem
==Flajolet-Martin algorithm==
|Definition (mixing time)|
:Let <math>\pi</math> be the stationary of the chain, and <math>p_x^{(t)}</math> be the distribution after <math>t</math> steps when the initial state is <math>x</math>.
:* <math>
\Delta_x(t)=\|p_x^{(t)}-\pi\|_{TV}
</math> is the distance to stationary distribution <math>\pi</math> after <math>t</math> steps, started at state <math>x</math>.
:* <math>\Delta(t)=\max_{x\in\Omega}\Delta_x(t)</math> is the maximum distance to stationary distribution <math>\pi</math> after <math>t</math> steps.
:*<math>\tau_x(\epsilon)=\min\{t\mid\Delta_x(t)\le\epsilon\}</math> is the time until the total variation distance to the stationary distribution, started at the initial state <math>x</math>, reaches <math>\epsilon</math>.
:* <math>\tau(\epsilon)=\max_{x\in\Omega}\tau_x(\epsilon)</math> is the time until the total variation distance to the stationary distribution, started at the worst possible initial state, reaches <math>\epsilon</math>.
}}
We note that <math>\Delta_x(t)</math> is monotonically non-increasing in <math>t</math>. So the definition of <math>\tau_x(\epsilon)</math> makes sense, and is actually the inverse of <math>\Delta_x(t)</math>.


{{Theorem|Definition (mixing time)|
= Set  Membership=
:The '''mixing time''' <math>\tau_{\mathrm{mix}}</math> of a Markov chain is <math>\tau(1/2\mathrm{e})</math>.
}}
The mixing time is the time until the total variation distance to the stationary distribution, starting from the worst possible initial state <math>x\in\Omega</math>, reaches <math>\frac{1}{2\mathrm{e}}</math>. The value <math>\frac{1}{2\mathrm{e}}</math> is chosen just for the convenience of calculation. The next proposition says that <math>\tau(\epsilon)</math> with general <math>\epsilon</math> can be estimated from <math>\tau_{\mathrm{mix}}</math>.


{{Theorem|Proposition|
== Perfect hashing==
# <math>\Delta(k\cdot\tau_{\mathrm{mix}})\le \mathrm{e}^{-k}</math> for any integer <math>k\ge1</math>.
#<math>\tau(\epsilon)\le\tau_{\mathrm{mix}}\cdot\left\lceil\ln\frac{1}{\epsilon}\right\rceil</math>.
}}
So the distance to stationary distribution <math>\Delta(t)</math> decays exponentially in multiplications of <math>\tau_\mathrm{mix}</math>.


Both the formal proofs of the monotonicity of <math>\Delta_x(t)</math> and the above proposition uses the coupling technique and is postponed to next section.
== Bloom filter ==
 
= Frequency Estimation=
 
== Count-min sketch==

Revision as of 08:31, 10 October 2017

Count Distinct Elements

An estimator by hashing

Flajolet-Martin algorithm

Set Membership

Perfect hashing

Bloom filter

Frequency Estimation

Count-min sketch