Randomized Algorithms (Spring 2014) / Second Moment
= Erdős–Rényi Random Graphs =
Consider a graph <math>G(V,E)</math> which is randomly generated as follows:
* <math>|V|=n</math>;
* <math>\forall \{u,v\}\in{V\choose 2}</math>, <math>uv\in E</math> independently with probability <math>p</math>.
Such a graph is denoted as '''<math>G(n,p)</math>'''. This is called the '''Erdős–Rényi model''' or '''<math>G(n,p)</math> model''' for random graphs.

Informally, the presence of every edge of <math>G(n,p)</math> is determined by an independent coin flip (with probability <math>p</math> of HEADS).
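To make the model concrete, here is a minimal Python sketch (an illustration added here, with our own function name, not part of the original notes) that samples a graph from <math>G(n,p)</math> by flipping an independent coin for each of the <math>{n\choose 2}</math> potential edges.
<pre>
import random
from itertools import combinations

def sample_gnp(n, p):
    """Sample an Erdős–Rényi random graph G(n,p).

    Vertices are 0,...,n-1; each of the C(n,2) potential edges is
    included independently with probability p.
    """
    return {frozenset(e) for e in combinations(range(n), 2)
            if random.random() < p}

# Example: a sparse random graph on 10 vertices.
edges = sample_gnp(10, 0.3)
print(len(edges), "edges sampled")
</pre>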
= Threshold phenomenon =
One of the most fascinating phenomena of random graphs is that for many natural graph properties, the random graph <math>G(n,p)</math> suddenly changes from almost always not having the property to almost always having the property as <math>p</math> grows within a very small range.

A monotone graph property <math>P</math> is said to have the '''threshold''' <math>p(n)</math> if
* when <math>p\ll p(n)</math>, <math>\Pr[P(G(n,p))]\rightarrow 0</math> as <math>n\rightarrow\infty</math> (also phrased as: <math>G(n,p)</math> almost always does not have <math>P</math>); and
* when <math>p\gg p(n)</math>, <math>\Pr[P(G(n,p))]\rightarrow 1</math> as <math>n\rightarrow\infty</math> (also phrased as: <math>G(n,p)</math> almost always has <math>P</math>).

The classic method for proving the threshold is the so-called second moment method (Chebyshev's inequality).
== Threshold for 4-clique ==
{{Theorem|Theorem|
:The threshold for a random graph <math>G(n,p)</math> to contain a 4-clique is <math>p=n^{-2/3}</math>.
}}
We formulate the problem as follows.
For any <math>4</math>-subset of vertices <math>S\in{V\choose 4}</math>, let <math>X_S</math> be the indicator random variable such that
:<math>
X_S=
\begin{cases}
1 & S\mbox{ is a clique},\\
0 & \mbox{otherwise}.
\end{cases}
</math>
Let <math>X=\sum_{S\in{V\choose 4}}X_S</math> be the total number of 4-cliques in <math>G</math>.
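The following Python sketch (for illustration; the helper names are ours) computes <math>X</math> exactly as this sum of indicators, by enumerating all <math>4</math>-subsets and checking whether all 6 internal edges are present.
<pre>
from itertools import combinations

def count_4cliques(n, edges):
    """Return X = sum of the indicators X_S over all 4-subsets S.

    `edges` is a set of frozenset pairs, as produced e.g. by the
    sample_gnp sketch above.
    """
    count = 0
    for S in combinations(range(n), 4):
        # X_S = 1 iff all 6 edges inside S are present.
        if all(frozenset(e) in edges for e in combinations(S, 2)):
            count += 1
    return count

# Sanity check: the complete graph on 5 vertices contains C(5,4) = 5 four-cliques.
K5 = {frozenset(e) for e in combinations(range(5), 2)}
assert count_4cliques(5, K5) == 5
</pre>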
It is sufficient to prove the following lemma.
{{Theorem|Lemma|
*If <math>p=o(n^{-2/3})</math>, then <math>\Pr[X\ge 1]\rightarrow 0</math> as <math>n\rightarrow\infty</math>.
*If <math>p=\omega(n^{-2/3})</math>, then <math>\Pr[X\ge 1]\rightarrow 1</math> as <math>n\rightarrow\infty</math>.
}}
{{Proof|
The first claim is proved by the first moment method (expectation and Markov's inequality) and the second claim is proved by the second moment method (Chebyshev's inequality).

Every 4-clique has 6 edges, thus for any <math>S\in{V\choose 4}</math>,
:<math>\mathbf{E}[X_S]=\Pr[X_S=1]=p^6</math>.
By the linearity of expectation,
:<math>\mathbf{E}[X]=\sum_{S\in{V\choose 4}}\mathbf{E}[X_S]={n\choose 4}p^6</math>.
Applying Markov's inequality,
:<math>\Pr[X\ge 1]\le \mathbf{E}[X]=O(n^4p^6)=o(1)</math>, if <math>p=o(n^{-2/3})</math>.
The first claim is proved.
To prove the second claim, it is equivalent to show that <math>\Pr[X=0]=o(1)</math> if <math>p=\omega(n^{-2/3})</math>. By Chebyshev's inequality,
:<math>\Pr[X=0]\le\Pr[|X-\mathbf{E}[X]|\ge\mathbf{E}[X]]\le\frac{\mathbf{Var}[X]}{(\mathbf{E}[X])^2}</math>,
where the variance is computed as
:<math>\mathbf{Var}[X]=\mathbf{Var}\left[\sum_{S\in{V\choose 4}}X_S\right]=\sum_{S\in{V\choose 4}}\mathbf{Var}[X_S]+\sum_{S,T\in{V\choose 4}, S\neq T}\mathbf{Cov}(X_S,X_T)</math>.
For any <math>S\in{V\choose 4}</math>,
:<math>\mathbf{Var}[X_S]=\mathbf{E}[X_S^2]-\mathbf{E}[X_S]^2\le \mathbf{E}[X_S^2]=\mathbf{E}[X_S]=p^6</math>.
Thus the first term of the above formula is <math>\sum_{S\in{V\choose 4}}\mathbf{Var}[X_S]=O(n^4p^6)</math>.

We now compute the covariances. For any <math>S,T\in{V\choose 4}</math> with <math>S\neq T</math>:
* Case 1: <math>|S\cap T|\le 1</math>, so <math>S</math> and <math>T</math> do not share any edges. <math>X_S</math> and <math>X_T</math> are independent, thus <math>\mathbf{Cov}(X_S,X_T)=0</math>.
* Case 2: <math>|S\cap T|= 2</math>, so <math>S</math> and <math>T</math> share exactly one edge. Since <math>|S\cup T|=6</math>, there are <math>O(n^6)</math> such pairs of <math>S</math> and <math>T</math>.
::<math>\mathbf{Cov}(X_S,X_T)=\mathbf{E}[X_SX_T]-\mathbf{E}[X_S]\mathbf{E}[X_T]\le\mathbf{E}[X_SX_T]=\Pr[X_S=1\wedge X_T=1]=p^{11}</math>
:since there are 11 edges in the union of two 4-cliques that share a common edge. The contribution of these pairs is <math>O(n^6p^{11})</math>.
* Case 3: <math>|S\cap T|= 3</math>, so <math>S</math> and <math>T</math> share a triangle. Since <math>|S\cup T|=5</math>, there are <math>O(n^5)</math> such pairs of <math>S</math> and <math>T</math>. By the same argument,
::<math>\mathbf{Cov}(X_S,X_T)\le\Pr[X_S=1\wedge X_T=1]=p^{9}</math>
:since there are 9 edges in the union of two 4-cliques that share a triangle. The contribution of these pairs is <math>O(n^5p^{9})</math>.
Putting all these together,
:<math>\mathbf{Var}[X]=O(n^4p^6+n^6p^{11}+n^5p^{9}).</math>
And
:<math>\Pr[X=0]\le\frac{\mathbf{Var}[X]}{(\mathbf{E}[X])^2}=O(n^{-4}p^{-6}+n^{-2}p^{-1}+n^{-3}p^{-3})</math>,
which is <math>o(1)</math> if <math>p=\omega(n^{-2/3})</math>. The second claim is also proved.
}}
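As a quick empirical illustration of this threshold (a sketch under our own choice of parameters, not part of the original notes), one can estimate <math>\Pr[X\ge 1]</math> for edge probabilities a constant factor below and above <math>n^{-2/3}</math>:
<pre>
import random
from itertools import combinations

def has_4clique(n, p):
    """Sample G(n,p) and report whether it contains a 4-clique."""
    edges = {frozenset(e) for e in combinations(range(n), 2)
             if random.random() < p}
    return any(all(frozenset(e) in edges for e in combinations(S, 2))
               for S in combinations(range(n), 4))

n, trials = 30, 100
for c in (0.3, 1.0, 3.0):          # p = c * n^(-2/3): below, at, and above the threshold
    p = c * n ** (-2 / 3)
    hits = sum(has_4clique(n, p) for _ in range(trials))
    print(f"c = {c}: estimated Pr[X >= 1] ~= {hits / trials:.2f}")
</pre>
Even for such a small <math>n</math>, the estimated probability jumps from near 0 to near 1 as the constant <math>c</math> grows.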
== Threshold for balanced subgraphs ==
The above theorem can be generalized to all "balanced" subgraphs.
{{Theorem|Definition|
* The '''density''' of a graph <math>G(V,E)</math>, denoted <math>\rho(G)\,</math>, is defined as <math>\rho(G)=\frac{|E|}{|V|}</math>.
* A graph <math>G(V,E)</math> is '''balanced''' if <math>\rho(H)\le \rho(G)</math> for all subgraphs <math>H</math> of <math>G</math>.
}}
Cliques are balanced, because <math>\frac{{k\choose 2}}{k}\le \frac{{n\choose 2}}{n}</math> for any <math>k\le n</math>. The threshold for 4-clique is a direct corollary of the following general theorem.
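The following Python sketch (an illustration with our own helper names) computes <math>\rho(G)</math> and checks balance by brute force. It suffices to check induced subgraphs on each vertex subset, since a densest subgraph can always be taken to contain all edges available on its vertex set.
<pre>
from itertools import combinations

def density(vertices, edges):
    """rho(G) = |E| / |V|."""
    return len(edges) / len(vertices)

def is_balanced(vertices, edges):
    """Check rho(H) <= rho(G) for all subgraphs H (via induced subgraphs)."""
    rho = density(vertices, edges)
    for k in range(1, len(vertices) + 1):
        for S in combinations(vertices, k):
            induced = {e for e in edges if e <= set(S)}
            if len(induced) / k > rho:
                return False
    return True

# K4 is balanced; K4 with one pendant vertex attached is not
# (the K4 inside has density 6/4 > 7/5).
K4_edges = {frozenset(e) for e in combinations(range(4), 2)}
print(is_balanced({0, 1, 2, 3}, K4_edges))                          # True
print(is_balanced({0, 1, 2, 3, 4}, K4_edges | {frozenset((3, 4))}))  # False
</pre>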
{{Theorem|Theorem (Erdős–Rényi 1960)|
:Let <math>H</math> be a balanced graph with <math>k</math> vertices and <math>\ell</math> edges. The threshold for the property that a random graph <math>G(n,p)</math> contains a (not necessarily induced) subgraph isomorphic to <math>H</math> is <math>p=n^{-k/\ell}\,</math>.
}}
{{Prooftitle|Sketch of proof.|
For any <math>S\in{V\choose k}</math>, let <math>X_S</math> indicate whether <math>G_S</math> (the subgraph of <math>G</math> induced by <math>S</math>) contains a copy of <math>H</math>. Then
:<math>p^{\ell}\le\mathbf{E}[X_S]\le k!p^{\ell}</math>, since there are at most <math>k!</math> ways to match the substructure.
Note that <math>k</math> does not depend on <math>n</math>. Thus, <math>\mathbf{E}[X_S]=\Theta(p^{\ell})</math>. Let <math>X=\sum_{S\in{V\choose k}}X_S</math>. Then
:<math>\mathbf{E}[X]=\Theta(n^kp^{\ell})</math>.
By Markov's inequality, <math>\Pr[X\ge 1]\le \mathbf{E}[X]=\Theta(n^kp^{\ell})</math>, which is <math>o(1)</math> when <math>p\ll n^{-k/\ell}</math>.

By Chebyshev's inequality, <math>\Pr[X=0]\le \frac{\mathbf{Var}[X]}{\mathbf{E}[X]^2}</math> where
:<math>\mathbf{Var}[X]=\sum_{S\in{V\choose k}}\mathbf{Var}[X_S]+\sum_{S\neq T}\mathbf{Cov}(X_S,X_T)</math>.
The first term satisfies <math>\sum_{S\in{V\choose k}}\mathbf{Var}[X_S]\le \sum_{S\in{V\choose k}}\mathbf{E}[X_S^2]= \sum_{S\in{V\choose k}}\mathbf{E}[X_S]=\mathbf{E}[X]=\Theta(n^kp^{\ell})</math>.

For the covariances, <math>\mathbf{Cov}(X_S,X_T)\neq 0</math> only if <math>|S\cap T|=i</math> for some <math>2\le i\le k-1</math>. Note that <math>|S\cap T|=i</math> implies that <math>|S\cup T|=2k-i</math>. And for balanced <math>H</math>, the number of edges of interest in <math>S</math> and <math>T</math> is <math>2\ell-i\rho(H_{S\cap T})\ge 2\ell-i\rho(H)=2\ell-i\ell/k</math>. Thus, <math>\mathbf{Cov}(X_S,X_T)\le\mathbf{E}[X_SX_T]\le p^{2\ell-i\ell/k}</math>. And
:<math>\sum_{S\neq T}\mathbf{Cov}(X_S,X_T)=\sum_{i=2}^{k-1}O(n^{2k-i}p^{2\ell-i\ell/k}).</math>
Therefore, when <math>p\gg n^{-k/\ell}</math>,
:<math>
\Pr[X=0]\le \frac{\mathbf{Var}[X]}{\mathbf{E}[X]^2}\le \frac{\Theta(n^kp^{\ell})+\sum_{i=2}^{k-1}O(n^{2k-i}p^{2\ell-i\ell/k})}{\Theta(n^{2k}p^{2\ell})}=\Theta(n^{-k}p^{-\ell})+\sum_{i=2}^{k-1}O(n^{-i}p^{-i\ell/k})=o(1)</math>.
}}
= Two-point sampling =
== Pairwise Independent Variables ==
We now consider constructing pairwise independent random variables ranging over <math>[p]=\{0,1,2,\ldots,p-1\}</math> for some prime <math>p</math>. Let <math>X_0,X_1</math> be two independent random variables, each uniformly distributed over <math>[p]</math>.

Let <math>Y_0,Y_1,\ldots, Y_{p-1}</math> be defined as:
:<math>
\begin{align}
Y_i=(X_0+i\cdot X_1)\bmod p &\quad \mbox{for }i\in[p].
\end{align}
</math>
{{Theorem
|Theorem|
: The random variables <math>Y_0,Y_1,\ldots, Y_{p-1}</math> are pairwise independent uniform random variables over <math>[p]</math>.
}}
{{Proof| We first show that each <math>Y_i</math> is uniform. That is, we will show that for any <math>i,a\in[p]</math>,
:<math>\begin{align}
\Pr\left[(X_0+i\cdot X_1)\bmod p=a\right]
&=
\frac{1}{p}.
\end{align}</math>
Due to the law of total probability,
:<math>\begin{align}
\Pr\left[(X_0+i\cdot X_1)\bmod p=a\right]
&=
\sum_{j\in[p]}\Pr[X_1=j]\cdot\Pr\left[(X_0+ij)\bmod p=a\right]\\
&=\frac{1}{p}\sum_{j\in[p]}\Pr\left[X_0\equiv(a-ij)\pmod{p}\right].
\end{align}</math>
For prime <math>p</math> and any <math>i,j,a\in[p]</math>, there is exactly one value of <math>X_0</math> in <math>[p]</math> satisfying <math>X_0\equiv(a-ij)\pmod{p}</math>. Thus, <math>\Pr\left[X_0\equiv(a-ij)\pmod{p}\right]=1/p</math> and the above probability is <math>\frac{1}{p}</math>.
We then show that the <math>Y_i</math>'s are pairwise independent, i.e. we will show that for any <math>Y_i,Y_j</math> with <math>i\neq j</math> and any <math>a,b\in[p]</math>,
:<math>\begin{align}
\Pr\left[Y_i=a\wedge Y_j=b\right]
&=
\frac{1}{p^2}.
\end{align}</math>
The event <math>Y_i=a\wedge Y_j=b</math> is equivalent to
:<math>
\begin{cases}
(X_0+iX_1)\equiv a\pmod{p}\\
(X_0+jX_1)\equiv b\pmod{p}
\end{cases}
</math>
Since <math>i\neq j</math> and <math>p</math> is prime, subtracting the two congruences gives <math>(i-j)X_1\equiv a-b\pmod{p}</math>, which has a unique solution for <math>X_1</math>, and this in turn determines <math>X_0</math> uniquely. Hence there exists a unique solution of <math>X_0</math> and <math>X_1</math> in <math>[p]</math> to the above linear congruential system. Thus the probability of the event is <math>\frac{1}{p^2}</math>.
}}
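A minimal Python sketch of this construction (with hypothetical helper names), which also verifies the pairwise independence empirically for a small prime by enumerating all <math>p^2</math> equally likely seeds <math>(X_0,X_1)</math>:
<pre>
from itertools import product

def pairwise_vars(x0, x1, p):
    """Y_i = (X_0 + i*X_1) mod p for i in [p], built from the two seeds."""
    return [(x0 + i * x1) % p for i in range(p)]

p = 5
# Enumerate all p^2 equally likely seeds and tally the joint outcomes of (Y_1, Y_3).
counts = {}
for x0, x1 in product(range(p), repeat=2):
    Y = pairwise_vars(x0, x1, p)
    counts[(Y[1], Y[3])] = counts.get((Y[1], Y[3]), 0) + 1

# Every pair (a, b) in [p]^2 occurs exactly once, i.e. with probability 1/p^2.
assert all(counts.get((a, b), 0) == 1 for a in range(p) for b in range(p))
print("(Y_1, Y_3) is uniform over [5]^2 across all 25 seeds")
</pre>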
== Two-point sampling ==
Consider a Monte Carlo randomized algorithm with one-sided error for a decision problem <math>f</math>. We formulate the algorithm as a deterministic algorithm <math>A</math> that takes as input <math>x</math> and a uniform random number <math>r\in[p]</math>, where <math>p</math> is a prime, such that for any input <math>x</math>:
* If <math>f(x)=1</math>, then <math>\Pr[A(x,r)=1]\ge\frac{1}{2}</math>, where the probability is taken over the random choice of <math>r</math>.
* If <math>f(x)=0</math>, then <math>A(x,r)=0</math> for any <math>r</math>.

We call <math>r</math> the '''random source''' for the algorithm.

For an <math>x</math> with <math>f(x)=1</math>, we call an <math>r</math> that makes <math>A(x,r)=1</math> a '''witness''' for <math>x</math>. For a positive <math>x</math>, at least half of <math>[p]</math> are witnesses. The random source <math>r</math> has a polynomial number of bits, which means that <math>p</math> is exponentially large, so it is infeasible to find a witness for an input <math>x</math> by exhaustive search. Deterministic algorithms overcome this by using sophisticated rules for efficiently searching for a witness. Randomization, on the other hand, reduces this to a bit of luck: randomly choose an <math>r</math> and succeed with probability at least 1/2.
We can boost the accuracy (equivalently, reduce the error) of any Monte Carlo randomized algorithm with one-sided error by running the algorithm multiple times.

Suppose that we sample <math>t</math> values <math>r_1,r_2,\ldots,r_t</math> uniformly and independently from <math>[p]</math>, and run the following scheme:
:{|border="1"
|<math>B(x,r_1,r_2,\ldots,r_t):</math>
:return <math>\bigvee_{i=1}^t A(x,r_i)</math>;
|}
That is, <math>B</math> returns 1 if any instance <math>A(x,r_i)=1</math>. For any <math>x</math> with <math>f(x)=1</math>, due to the independence of <math>r_1,r_2,\ldots,r_t</math>, the probability that <math>B(x,r_1,r_2,\ldots,r_t)</math> returns an incorrect result is at most <math>2^{-t}</math>. On the other hand, <math>B</math> never makes mistakes for <math>x</math> with <math>f(x)=0</math> since <math>A</math> has no false positives. Thus, the error of the Monte Carlo algorithm is reduced to <math>2^{-t}</math>.
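A minimal Python sketch of this independent-repetition scheme. The checker <code>A</code> below is a toy stand-in (an assumption for illustration, not a real algorithm): for positive inputs it accepts exactly the lower half of <math>[p]</math> as witnesses.
<pre>
import random

P = 101  # a small prime standing in for the (exponentially large) random source space

def A(x, r, f):
    """Toy one-sided-error checker (a stand-in for a real algorithm):
    if f(x) = 1, exactly the r in the lower half of [P] are witnesses;
    if f(x) = 0, it never accepts."""
    return 1 if f(x) == 1 and r < P // 2 + 1 else 0

def B_independent(x, f, t):
    """Amplification with t mutually independent random sources."""
    rs = [random.randrange(P) for _ in range(t)]
    return max(A(x, r, f) for r in rs)   # logical OR of the t runs

# For a positive instance, the error probability drops to about 2^(-t).
f = lambda x: 1                      # pretend every input is a YES-instance
t, trials = 5, 10000
errors = sum(B_independent(0, f, t) == 0 for _ in range(trials))
print(f"empirical error ~= {errors / trials:.4f}  (bound 2^-{t} = {2**-t:.4f})")
</pre>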
Sampling <math>t</math> mutually independent random numbers from <math>[p]</math> can be quite expensive since it requires <math>\Omega(t\log p)</math> random bits. Suppose that we can only afford <math>O(\log p)</math> random bits. In particular, we sample two independent uniform random numbers <math>a</math> and <math>b</math> from <math>[p]</math>. If we use <math>a</math> and <math>b</math> directly by running two independent instances <math>A(x,a)</math> and <math>A(x,b)</math>, we only get an error upper bound of 1/4.

The following scheme reduces the error significantly with the same number of random bits:
:{{Theorem
|Algorithm|
Choose two independent uniform random numbers <math>a</math> and <math>b</math> from <math>[p]</math>.
Construct <math>t</math> random numbers <math>r_1,r_2,\ldots,r_t</math> by:
:<math>
\begin{align}
\forall 1\le i\le t, &\quad \mbox{let }r_i = (a\cdot i+b)\bmod p.
\end{align}
</math>
Run <math>B(x,r_1,r_2,\ldots,r_t)</math>.
}}
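A minimal Python sketch of this two-point scheme, reusing the same toy checker <code>A</code> as in the previous sketch (again an assumption for illustration): it spends only two random numbers, yet drives the one-sided error down to roughly <math>1/t</math>.
<pre>
import random

P = 101  # small prime for illustration

def A(x, r, f):
    """Same toy one-sided-error checker as in the previous sketch."""
    return 1 if f(x) == 1 and r < P // 2 + 1 else 0

def B_two_point(x, f, t):
    """Amplification from only two random numbers a, b via r_i = (a*i + b) mod P."""
    a, b = random.randrange(P), random.randrange(P)
    rs = [(a * i + b) % P for i in range(1, t + 1)]   # pairwise independent
    return max(A(x, r, f) for r in rs)

f = lambda x: 1                      # pretend every input is a YES-instance
t, trials = 20, 10000                # requires t <= P
errors = sum(B_two_point(0, f, t) == 0 for _ in range(trials))
print(f"empirical error ~= {errors / trials:.4f}  (Chebyshev bound 1/t = {1/t:.4f})")
</pre>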
By the discussion in the previous subsection, we know that for <math>t\le p</math>, the numbers <math>r_1,r_2,\ldots,r_t</math> are pairwise independent and uniform over <math>[p]</math>. Let <math>X_i=A(x,r_i)</math> and <math>X=\sum_{i=1}^tX_i</math>. Due to the uniformity of <math>r_i</math> and our definition of <math>A</math>, for any <math>x</math> with <math>f(x)=1</math>, it holds that
:<math>
\Pr[X_i=1]=\Pr[A(x,r_i)=1]\ge\frac{1}{2}.
</math>
By the linearity of expectation,
:<math>
\mathbf{E}[X]=\sum_{i=1}^t\mathbf{E}[X_i]=\sum_{i=1}^t\Pr[X_i=1]\ge\frac{t}{2}.
</math>
Each <math>X_i</math> is a Bernoulli trial with success probability <math>q=\Pr[X_i=1]\ge 1/2</math>, so the variance of each <math>X_i</math> is bounded as
:<math>
\mathbf{Var}[X_i]=q(1-q)\le\frac{1}{4}.
</math>
Since the <math>r_i</math>'s are pairwise independent, so are the <math>X_i</math>'s, and hence <math>\mathbf{Var}[X]=\sum_{i=1}^t\mathbf{Var}[X_i]</math>. Applying Chebyshev's inequality, we have that for any <math>x</math> with <math>f(x)=1</math>,
:<math>\begin{align}
\Pr\left[\bigvee_{i=1}^t A(x,r_i)=0\right]
&=
\Pr[X=0]\\
&\le
\Pr[|X-\mathbf{E}[X]|\ge \mathbf{E}[X]]\\
&\le
\Pr\left[|X-\mathbf{E}[X]|\ge \frac{t}{2}\right]\\
&\le
\frac{4}{t^2}\sum_{i=1}^t\mathbf{Var}[X_i]\\
&\le \frac{1}{t}.
\end{align}
</math>
The error is reduced to <math>1/t</math> with only two random numbers. This scheme works as long as <math>t\le p</math>.