Probability Theory (Summer 2014)/Problem Set 1
Problem 1
(Due to J. von Neumann.)
- Suppose you are given a coin for which the probability of HEADS, say [math]\displaystyle{ p }[/math], is unknown. How can you use this coin to generate unbiased (i.e., [math]\displaystyle{ \Pr[\mbox{HEADS}]=\Pr[\mbox{TAILS}]=1/2 }[/math]) coin-flips? Give a scheme for which the expected number of flips of the biased coin for extracting one unbiased coin-flip is no more than [math]\displaystyle{ 1/(p(1-p)) }[/math].
- Devise an extension of the scheme that extracts the largest possible number of independent, unbiased coin-flips from a given number of flips of the biased coin.
Problem 2
(Due to D.E. Knuth and A. C-C. Yao.)
- Suppose you are provided with a source of unbiased random bits. Explain how you will use this to generate uniform samples from the set [math]\displaystyle{ S=\{0,\dots,n-1\} }[/math]. Determine the expected number of random bits required by your sampling algorithm.
- What is the worst-case number of random bits required by your sampling algorithm? Consider the case when [math]\displaystyle{ n }[/math] is a power of [math]\displaystyle{ 2 }[/math], as well as the case when it is not.
- Solve (1) and (2) when, instead of unbiased random bits, you are required to use as the source of randomness uniform random samples from the set [math]\displaystyle{ \{0,\dots,p-1\} }[/math]; consider the case when [math]\displaystyle{ n }[/math] is a power of [math]\displaystyle{ p }[/math], as well as the case when it is not.
Problem 3
(Due to D.R. Karger and R. Motwani.)
- Let [math]\displaystyle{ S,T }[/math] be two disjoint subsets of a universe [math]\displaystyle{ U }[/math] such that [math]\displaystyle{ |S|=|T|=n }[/math]. Suppose we select a random set [math]\displaystyle{ R\subseteq U }[/math] by independently sampling each element of [math]\displaystyle{ U }[/math] with probability [math]\displaystyle{ p }[/math]. We say that the random sample [math]\displaystyle{ R }[/math] is good if the following two conditions hold: [math]\displaystyle{ R\cap S=\emptyset }[/math] and [math]\displaystyle{ R\cap T\ne\emptyset }[/math]. Show that for [math]\displaystyle{ p=1/n }[/math], the probability that [math]\displaystyle{ R }[/math] is good is larger than some positive constant.
- Suppose now that the random set [math]\displaystyle{ R }[/math] is chosen by sampling the elements of [math]\displaystyle{ U }[/math] with only pairwise independence. Show that for a suitable choice of the value of [math]\displaystyle{ p }[/math], the probability that [math]\displaystyle{ R }[/math] is good is larger than some positive constant.
Problem 4
We play the following game:
Start with [math]\displaystyle{ n }[/math] people, each with 2 hands. Initially, no hand is holding another. In each round, pick 2 free hands uniformly at random and let them hold each other. Repeat until no free hands are left.
- What is the expected number of cycles formed by people holding hands with each other at the end of the game? (A single person whose left hand holds their own right hand also counts as a cycle.)
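The game above can be simulated to check a conjectured answer numerically. The following is a minimal sketch, not a solution: the names `count_cycles` and `average_cycles` and the hand-numbering convention are assumptions made for illustration only.

```python
import random

def count_cycles(n, rng):
    """Play one game with n people and return the number of cycles formed.

    Assumed convention: hands 2*k and 2*k + 1 belong to person k.
    """
    free = list(range(2 * n))   # hands that are still free
    parent = list(range(n))     # union-find over people, one set per chain

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    cycles = 0
    while free:
        # Pick two distinct free hands uniformly at random.
        i, j = rng.sample(range(len(free)), 2)
        a, b = free[i], free[j]
        # Remove both hands from the free list (larger index first).
        for idx in sorted((i, j), reverse=True):
            free[idx] = free[-1]
            free.pop()
        ra, rb = find(a // 2), find(b // 2)
        if ra == rb:
            cycles += 1      # both hands were the ends of one chain: it closes
        else:
            parent[ra] = rb  # two different chains merge into a longer chain
    return cycles

def average_cycles(n, trials, seed=0):
    """Empirical mean number of cycles over independent games."""
    rng = random.Random(seed)
    return sum(count_cycles(n, rng) for _ in range(trials)) / trials
```

Calling `average_cycles(n, trials)` for a few values of [math]\displaystyle{ n }[/math] gives an empirical estimate to compare against a closed-form answer.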
Problem 5
In the balls-and-bins model, we throw [math]\displaystyle{ n }[/math] balls independently and uniformly at random into [math]\displaystyle{ n }[/math] bins. The maximum load is then [math]\displaystyle{ \Theta\left(\frac{\ln n}{\ln\ln n}\right) }[/math] with high probability.
The two-choice paradigm is another way to throw [math]\displaystyle{ n }[/math] balls into [math]\displaystyle{ n }[/math] bins: each ball is thrown into the less loaded of 2 bins chosen independently and uniformly at random, with ties broken arbitrarily. The maximum load under the two-choice paradigm is [math]\displaystyle{ \Theta(\ln\ln n) }[/math] with high probability, which is exponentially smaller than in the one-choice model. This phenomenon is called the power of two choices.
Now consider the following three paradigms:
- The first [math]\displaystyle{ n/2 }[/math] balls are thrown into bins independently and uniformly at random. The remaining [math]\displaystyle{ n/2 }[/math] balls are thrown into bins using the two-choice paradigm.
- The first [math]\displaystyle{ n/2 }[/math] balls are thrown into bins using the two-choice paradigm. The remaining [math]\displaystyle{ n/2 }[/math] balls are thrown into bins independently and uniformly at random.
- Assume all [math]\displaystyle{ n }[/math] balls are arranged in a sequence. For every [math]\displaystyle{ 1\le i\le n }[/math], if [math]\displaystyle{ i }[/math] is odd, we throw the [math]\displaystyle{ i }[/math]-th ball into a bin chosen independently and uniformly at random; otherwise, we throw it using the two-choice paradigm.
What is the maximum load with high probability in each of the three paradigms? Give an asymptotically tight bound (i.e., [math]\displaystyle{ \Theta(\cdot) }[/math]).
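To build intuition before proving anything, the three paradigms can be simulated side by side. The sketch below is only an experiment harness under stated assumptions: the names `max_load` and `paradigms` are hypothetical, balls are indexed from 1, and ties in the two-choice step are broken in favor of the first bin drawn (the problem allows any tie-breaking rule).

```python
import random

def max_load(n, use_two_choice, rng):
    """Throw n balls into n bins and return the maximum load.

    use_two_choice(i) decides whether ball i (1-based) uses two choices.
    """
    loads = [0] * n
    for i in range(1, n + 1):
        b = rng.randrange(n)
        if use_two_choice(i):
            c = rng.randrange(n)
            # Keep the less loaded bin; on a tie, keep the first draw.
            if loads[c] < loads[b]:
                b = c
        loads[b] += 1
    return max(loads)

def paradigms(n):
    """The three mixed paradigms, keyed by a descriptive (assumed) name."""
    half = n // 2
    return {
        "random_then_two_choice": lambda i: i > half,
        "two_choice_then_random": lambda i: i <= half,
        "alternating": lambda i: i % 2 == 0,
    }

if __name__ == "__main__":
    n = 100_000
    rng = random.Random(2014)
    for name, rule in paradigms(n).items():
        print(name, max_load(n, rule, rng))
```

A simulation at a single value of [math]\displaystyle{ n }[/math] cannot distinguish [math]\displaystyle{ \Theta(\ln\ln n) }[/math] from [math]\displaystyle{ \Theta\left(\frac{\ln n}{\ln\ln n}\right) }[/math] reliably, so the output should be read as a sanity check rather than as evidence for a bound.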
Problem 6
Consider a sequence of [math]\displaystyle{ n }[/math] flips of an unbiased coin. Let [math]\displaystyle{ H_i }[/math] denote the absolute value of the excess of the number of HEADS over the number of TAILS seen in the first [math]\displaystyle{ i }[/math] flips. Define [math]\displaystyle{ H=\max_i H_i }[/math]. Show that [math]\displaystyle{ \mathbf{E}[H_i]=\Theta(\sqrt{i}) }[/math], and that [math]\displaystyle{ \mathbf{E}[H]=\Theta(\sqrt{n}) }[/math].
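The quantities [math]\displaystyle{ H_i }[/math] and [math]\displaystyle{ H }[/math] can be computed for a sampled sequence as in the following sketch. The function names `excess_profile` and `estimate_mean_H` are assumptions introduced here, and the code only estimates the expectations empirically; it proves nothing.

```python
import random

def excess_profile(n, rng):
    """Return [H_1, ..., H_n] for one sequence of n fair coin flips."""
    excess = 0      # (#HEADS) - (#TAILS) so far
    profile = []
    for _ in range(n):
        excess += 1 if rng.random() < 0.5 else -1
        profile.append(abs(excess))
    return profile

def estimate_mean_H(n, trials, seed=0):
    """Empirical mean of H = max_i H_i over independent sequences."""
    rng = random.Random(seed)
    return sum(max(excess_profile(n, rng)) for _ in range(trials)) / trials
```

Comparing `estimate_mean_H(n, trials)` with `estimate_mean_H(4 * n, trials)` is one way to see whether the growth looks consistent with [math]\displaystyle{ \sqrt{n} }[/math] scaling.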
Problem 7
Consider the following experiment, which proceeds in a sequence of rounds. For the first round, we have [math]\displaystyle{ n }[/math] balls, which are thrown independently and uniformly at random into [math]\displaystyle{ n }[/math] bins. After round [math]\displaystyle{ i }[/math], for [math]\displaystyle{ i\ge 1 }[/math], we discard every ball that fell into a bin by itself in round [math]\displaystyle{ i }[/math] (i.e., we discard a ball if and only if no other ball fell into the same bin). The remaining balls are retained for round [math]\displaystyle{ i+1 }[/math], in which they are thrown independently and uniformly at random into the [math]\displaystyle{ n }[/math] bins. Show that there is a constant [math]\displaystyle{ c }[/math] such that with probability [math]\displaystyle{ 1-o(1) }[/math], the number of rounds is at most [math]\displaystyle{ c\ln\ln n }[/math].
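The experiment can be simulated directly. The sketch below assumes that the "number of rounds" means the number of rounds played until no balls remain; the function name `count_rounds` is hypothetical.

```python
import random
from collections import Counter

def count_rounds(n, rng):
    """Run the experiment with n balls and n bins; return the rounds used.

    Assumption: the experiment stops once every ball has been discarded.
    """
    balls = n
    rounds = 0
    while balls > 0:
        rounds += 1
        # Throw every remaining ball into a uniformly random bin.
        counts = Counter(rng.randrange(n) for _ in range(balls))
        # A ball is discarded exactly when it is alone in its bin.
        singles = sum(1 for c in counts.values() if c == 1)
        balls -= singles
    return rounds

if __name__ == "__main__":
    rng = random.Random(2014)
    for n in (10**3, 10**4, 10**5):
        print(n, count_rounds(n, rng))
```

Running it over a range of [math]\displaystyle{ n }[/math] shows how slowly the round count grows, which is the behavior the problem asks to prove.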