随机算法 (Spring 2014)/Problem Set 2: Difference between revisions
imported>Etone No edit summary |
imported>Etone No edit summary |
||
Line 5: | Line 5: | ||
# Suppose now that the random set <math>R</math> is chosen by sampling the elements of <math>U</math> with only <i>pairwise</i> independence. Show that for a suitable choice of the value of <math>p</math>, the probability that <math>R</math> is good is larger than some positive constant. | # Suppose now that the random set <math>R</math> is chosen by sampling the elements of <math>U</math> with only <i>pairwise</i> independence. Show that for a suitable choice of the value of <math>p</math>, the probability that <math>R</math> is good is larger than some positive constant. | ||
== Problem 2 | == Problem 2== | ||
# Generalize the LazySelect algorithm for the <math>k</math>-selection problem: Given as input an array of <math>n</math> distinct numbers and an integer <math>k</math>, find the <math>k</math>th smallest number in the array. | # Generalize the LazySelect algorithm for the <math>k</math>-selection problem: Given as input an array of <math>n</math> distinct numbers and an integer <math>k</math>, find the <math>k</math>th smallest number in the array. | ||
# Use the Chernoff bounds instead of Chebyshev's inequality in the analysis of the LazySelect Algorithm and try to use as few random samples as possible. | # Use the Chernoff bounds instead of Chebyshev's inequality in the analysis of the LazySelect Algorithm and try to use as few random samples as possible. | ||
==Problem | ==Problem 3== | ||
A '''boolean code''' is a mapping <math>C:\{0,1\}^k\rightarrow\{0,1\}^n</math>. Each <math>x\in\{0,1\}^k</math> is called a '''message''' and <math>y=C(x)</math> is called a '''codeword'''. The '''code rate''' <math>r</math> of a code <math>C</math> is <math>r=\frac{k}{n}</math>. A boolean code <math>C:\{0,1\}^k\rightarrow\{0,1\}^n</math> is a '''linear code''' if it is a linear transformation, i.e. there is a matrix <math>A\in\{0,1\}^{n\times k}</math> such that <math>C(x)=Ax</math> for any <math>x\in\{0,1\}^k</math>, where the additions and multiplications are defined over the finite field of order two, <math>(\{0,1\},+_{\bmod 2},\times_{\bmod 2})</math>. | A '''boolean code''' is a mapping <math>C:\{0,1\}^k\rightarrow\{0,1\}^n</math>. Each <math>x\in\{0,1\}^k</math> is called a '''message''' and <math>y=C(x)</math> is called a '''codeword'''. The '''code rate''' <math>r</math> of a code <math>C</math> is <math>r=\frac{k}{n}</math>. A boolean code <math>C:\{0,1\}^k\rightarrow\{0,1\}^n</math> is a '''linear code''' if it is a linear transformation, i.e. there is a matrix <math>A\in\{0,1\}^{n\times k}</math> such that <math>C(x)=Ax</math> for any <math>x\in\{0,1\}^k</math>, where the additions and multiplications are defined over the finite field of order two, <math>(\{0,1\},+_{\bmod 2},\times_{\bmod 2})</math>. | ||
Line 24: | Line 19: | ||
* Prove a similar result for linear boolean codes. | * Prove a similar result for linear boolean codes. | ||
== Problem | == Problem 4 == | ||
Let <math>X_1,X_2,\dots,X_n</math> be independent geometrically distributed random variables each having expectation 2 (each of the <math>X_i</math> is an independent experiment counting the number of tosses of an unbiased coin up to and including the first HEADS). Let <math>X=\sum_{i=1}^nX_i</math> and <math>\delta</math> be a positive real constant. Derive the best upper bound you can on <math>\Pr[X>(1+\delta)(2n)]</math>. | Let <math>X_1,X_2,\dots,X_n</math> be independent geometrically distributed random variables each having expectation 2 (each of the <math>X_i</math> is an independent experiment counting the number of tosses of an unbiased coin up to and including the first HEADS). Let <math>X=\sum_{i=1}^nX_i</math> and <math>\delta</math> be a positive real constant. Derive the best upper bound you can on <math>\Pr[X>(1+\delta)(2n)]</math>. |
Latest revision as of 12:35, 24 March 2014
Problem 1
(Due to D.R. Karger and R. Motwani.)
- Let [math]\displaystyle{ S,T }[/math] be two disjoint subsets of a universe [math]\displaystyle{ U }[/math] such that [math]\displaystyle{ |S|=|T|=n }[/math]. Suppose we select a random set [math]\displaystyle{ R\subseteq U }[/math] by independently sampling each element of [math]\displaystyle{ U }[/math] with probability [math]\displaystyle{ p }[/math]. We say that the random sample [math]\displaystyle{ R }[/math] is good if the following two conditions hold: [math]\displaystyle{ R\cap S=\emptyset }[/math] and [math]\displaystyle{ R\cap T\ne\emptyset }[/math]. Show that for [math]\displaystyle{ p=1/n }[/math], the probability that [math]\displaystyle{ R }[/math] is good is larger than some positive constant.
- Suppose now that the random set [math]\displaystyle{ R }[/math] is chosen by sampling the elements of [math]\displaystyle{ U }[/math] with only pairwise independence. Show that for a suitable choice of the value of [math]\displaystyle{ p }[/math], the probability that [math]\displaystyle{ R }[/math] is good is larger than some positive constant.
Problem 2
- Generalize the LazySelect algorithm for the [math]\displaystyle{ k }[/math]-selection problem: Given as input an array of [math]\displaystyle{ n }[/math] distinct numbers and an integer [math]\displaystyle{ k }[/math], find the [math]\displaystyle{ k }[/math]th smallest number in the array.
- Use the Chernoff bounds instead of Chebyshev's inequality in the analysis of the LazySelect Algorithm and try to use as few random samples as possible.
Problem 3
A boolean code is a mapping [math]\displaystyle{ C:\{0,1\}^k\rightarrow\{0,1\}^n }[/math]. Each [math]\displaystyle{ x\in\{0,1\}^k }[/math] is called a message and [math]\displaystyle{ y=C(x) }[/math] is called a codeword. The code rate [math]\displaystyle{ r }[/math] of a code [math]\displaystyle{ C }[/math] is [math]\displaystyle{ r=\frac{k}{n} }[/math]. A boolean code [math]\displaystyle{ C:\{0,1\}^k\rightarrow\{0,1\}^n }[/math] is a linear code if it is a linear transformation, i.e. there is a matrix [math]\displaystyle{ A\in\{0,1\}^{n\times k} }[/math] such that [math]\displaystyle{ C(x)=Ax }[/math] for any [math]\displaystyle{ x\in\{0,1\}^k }[/math], where the additions and multiplications are defined over the finite field of order two, [math]\displaystyle{ (\{0,1\},+_{\bmod 2},\times_{\bmod 2}) }[/math].
The distance between two codeword [math]\displaystyle{ y_1 }[/math] and [math]\displaystyle{ y_2 }[/math], denoted by [math]\displaystyle{ d(y_1,y_2) }[/math], is defined as the Hamming distance between them. Formally, [math]\displaystyle{ d(y_1,y_2)=\|y_1-y_2\|_1=\sum_{i=1}^n|y_1(i)-y_2(i)| }[/math]. The distance of a code [math]\displaystyle{ C }[/math] is the minimum distance between any two codewords. Formally, [math]\displaystyle{ d=\min_{x_1,x_2\in \{0,1\}^k\atop x_1\neq x_2}d(C(x_1),C(x_2)) }[/math].
Usually we want to make both the code rate [math]\displaystyle{ r }[/math] and the code distance [math]\displaystyle{ d }[/math] as large as possible, because a larger rate means that the amount of actual message per transmitted bit is high, and a larger distance allows for more error correction and detection.
- Use the probabilistic method to prove that there exists a boolean code [math]\displaystyle{ C:\{0,1\}^k\rightarrow\{0,1\}^n }[/math] of code rate [math]\displaystyle{ r }[/math] and distance [math]\displaystyle{ \left(\frac{1}{2}-\Theta\left(\sqrt{r}\right)\right)n }[/math]. Try to optimize the constant in [math]\displaystyle{ \Theta(\cdot) }[/math].
- Prove a similar result for linear boolean codes.
Problem 4
Let [math]\displaystyle{ X_1,X_2,\dots,X_n }[/math] be independent geometrically distributed random variables each having expectation 2 (each of the [math]\displaystyle{ X_i }[/math] is an independent experiment counting the number of tosses of an unbiased coin up to and including the first HEADS). Let [math]\displaystyle{ X=\sum_{i=1}^nX_i }[/math] and [math]\displaystyle{ \delta }[/math] be a positive real constant. Derive the best upper bound you can on [math]\displaystyle{ \Pr[X\gt (1+\delta)(2n)] }[/math].