Advanced Algorithms (Fall 2021)/Problem Set 1
- Every problem must be answered with a complete solution process. You may write in either Chinese or English.
Problem 1
Recall that in class we showed, by the probabilistic method, how to deduce an [math]\displaystyle{ \frac{n(n-1)}{2} }[/math] upper bound on the number of distinct min-cuts in any multigraph [math]\displaystyle{ G }[/math] with [math]\displaystyle{ n }[/math] vertices from the [math]\displaystyle{ \frac{2}{n(n-1)} }[/math] lower bound on the success probability of Karger's min-cut algorithm.
Also recall that the [math]\displaystyle{ FastCut }[/math] algorithm taught in class is guaranteed to return a min-cut with probability at least [math]\displaystyle{ \Omega(1/\log n) }[/math]. Does this imply a much tighter [math]\displaystyle{ O(\log n) }[/math] upper bound on the number of distinct min-cuts in any multigraph [math]\displaystyle{ G }[/math] with [math]\displaystyle{ n }[/math] vertices? Prove your improved upper bound if your answer is "yes", and give a satisfactory explanation if your answer is "no".
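For concreteness, below is a minimal Python sketch of a single run of Karger's contraction algorithm referred to above; the function name and the edge-list representation of the multigraph are our own illustrative choices, not part of the problem.

```python
import random

def karger_contract(n, edges):
    """One run of Karger's contraction algorithm on a connected multigraph.

    n     -- number of vertices, labeled 0..n-1
    edges -- list of (u, v) pairs; parallel edges are allowed

    Returns the size of the cut it finds, which equals the min-cut size
    with probability at least 2 / (n * (n - 1)).
    """
    parent = list(range(n))            # union-find over contracted super-vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = list(edges)
    remaining = n
    while remaining > 2:
        u, v = random.choice(edges)    # pick a uniformly random remaining edge
        parent[find(u)] = find(v)      # contract it
        remaining -= 1
        # discard edges that became self-loops after the contraction
        edges = [(a, b) for (a, b) in edges if find(a) != find(b)]

    return len(edges)                  # edges crossing the final two super-vertices
```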
Problem 2
Consider the function [math]\displaystyle{ f:\mathbb{R}^n\to\mathbb{R} }[/math] defined as
- [math]\displaystyle{ f(\vec x)=f(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i) }[/math],
where [math]\displaystyle{ \{a_i\}_{1\le i\le n} }[/math] and [math]\displaystyle{ \{b_i\}_{1\le i\le n} }[/math] are unknown coefficients satisfying [math]\displaystyle{ a_i, b_i\in \mathbb{Z} }[/math] and [math]\displaystyle{ 0\le a_i, b_i \le n }[/math] for all [math]\displaystyle{ 1\le i\le n }[/math].
Let [math]\displaystyle{ p\gt n }[/math] be the smallest prime strictly greater than [math]\displaystyle{ n }[/math]. The function [math]\displaystyle{ g:\mathbb{Z}_p^n\to\mathbb{Z}_p }[/math] is defined as
- [math]\displaystyle{ g(\vec x)=g(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i) }[/math],
where [math]\displaystyle{ + }[/math] and [math]\displaystyle{ \cdot }[/math] are defined over the finite field [math]\displaystyle{ \mathbb{Z}_p }[/math].
By the properties of finite fields, for any value [math]\displaystyle{ \vec r\in\mathbb{Z}_p^n }[/math], it holds that [math]\displaystyle{ g(\vec r)=f(\vec r)\bmod p }[/math].
Since the coefficients [math]\displaystyle{ \{a_i\}_{1\le i\le n} }[/math] and [math]\displaystyle{ \{b_i\}_{1\le i\le n} }[/math] are unknown, you cannot calculate [math]\displaystyle{ f(\vec x) }[/math] directly. However, there exists an oracle [math]\displaystyle{ O }[/math]: each time [math]\displaystyle{ O }[/math] receives an input [math]\displaystyle{ \vec x }[/math], it immediately outputs the value of [math]\displaystyle{ g(\vec x) }[/math].
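To make the oracle model concrete, here is a toy Python simulation of [math]\displaystyle{ O }[/math]; the concrete values of [math]\displaystyle{ n }[/math], [math]\displaystyle{ p }[/math] and the coefficients below are hypothetical, chosen only for illustration (in the problem they are hidden from your algorithm).

```python
n = 4
p = 5                      # smallest prime strictly greater than n = 4

# Hypothetical hidden coefficients with a_i, b_i in {0, ..., n}.
a = [1, 3, 0, 2]
b = [2, 0, 4, 1]

def oracle(x):
    """Simulates O: returns g(x) = prod_i (a_i * x_i - b_i), computed over Z_p."""
    value = 1
    for a_i, b_i, x_i in zip(a, b, x):
        value = (value * (a_i * x_i - b_i)) % p
    return value

print(oracle([1, 2, 3, 4]))   # a single query: some value in {0, 1, ..., p - 1}
```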
1. Prove that [math]\displaystyle{ f\not\equiv 0 \Rightarrow g\not\equiv 0 }[/math].
2. Use the oracle [math]\displaystyle{ O }[/math] to design an algorithm to determine whether [math]\displaystyle{ f\equiv 0 }[/math], with error probability at most [math]\displaystyle{ \epsilon }[/math], where [math]\displaystyle{ \epsilon\in (0,1) }[/math] is a constant.
Problem 3
Fix a universe [math]\displaystyle{ U }[/math] and two subsets [math]\displaystyle{ A,B \subseteq U }[/math], both of size [math]\displaystyle{ n }[/math]. We create Bloom filters [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math] for [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] respectively, using the same number of bits [math]\displaystyle{ m }[/math] and the same [math]\displaystyle{ k }[/math] hash functions.
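For concreteness, here is a minimal Python sketch of the construction described above; the class interface and the choice of SHA-256-derived hash functions are our own illustrative assumptions, not part of the problem.

```python
import hashlib

class BloomFilter:
    """A Bloom filter with m bits and k hash functions (derived here from SHA-256)."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [0] * m

    def _positions(self, item):
        # k pseudo-independent hash positions for the given item
        for seed in range(self.k):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def contains(self, item):
        # may report a false positive, but never a false negative
        return all(self.bits[pos] for pos in self._positions(item))

# F_A and F_B use the same m and the same k hash functions, as in the problem.
A, B = {"u1", "u2", "u3"}, {"u2", "u3", "u4"}
F_A, F_B = BloomFilter(64, 3), BloomFilter(64, 3)
for x in A: F_A.add(x)
for x in B: F_B.add(x)
F_C = [s & t for s, t in zip(F_A.bits, F_B.bits)]   # bitwise AND, as used below
```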
- Let [math]\displaystyle{ F_C = F_A \land F_B }[/math] be the Bloom filter formed by computing the bitwise AND of [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math]. Argue that [math]\displaystyle{ F_C }[/math] may not always be the same as the Bloom filter that is created for [math]\displaystyle{ A\cap B }[/math].
- Bloom filters can be used to estimate set differences. Express the expected number of bits where [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math] differ as a function of [math]\displaystyle{ m, n, k }[/math] and [math]\displaystyle{ |A\cap B| }[/math].
Problem 4
Let [math]\displaystyle{ X_1,X_2,\ldots,X_n }[/math] be [math]\displaystyle{ n }[/math] random variables, where each [math]\displaystyle{ X_i \in \{0, 1\} }[/math] follows the distribution [math]\displaystyle{ \mu_i }[/math]. For each [math]\displaystyle{ 1\leq i \leq n }[/math], let [math]\displaystyle{ \rho_i = \mathbb{E}[X_i] }[/math] and assume [math]\displaystyle{ \rho_i \geq \frac{1}{2} }[/math]. Consider the problem of estimating the value of
- [math]\displaystyle{ Z = \prod_{i = 1}^n \rho_i }[/math].
For each [math]\displaystyle{ 1\leq i \leq n }[/math], the algorithm draws [math]\displaystyle{ s }[/math] random samples [math]\displaystyle{ X_i^{(1)},X_i^{(2)},\ldots,X_i^{(s)} }[/math] independently from the distribution [math]\displaystyle{ \mu_i }[/math], and computes
- [math]\displaystyle{ \widehat{\rho}_{i}=\frac{1}{s}\sum_{j=1}^s X_i^{(j)} }[/math].
Finally, the algorithm outputs the product of all [math]\displaystyle{ \widehat{\rho}_{i} }[/math]:
- [math]\displaystyle{ \widehat{Z}=\prod_{i= 1}^n\widehat{\rho}_i }[/math].
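As a reference point, the following Python sketch implements the estimator just described; the sampler interface and the particular Bernoulli distributions in the usage example are hypothetical, chosen only for illustration.

```python
import random

def estimate_Z(mus, s):
    """Returns the estimator Z-hat: the product of the empirical means rho-hat_i.

    mus -- list of n samplers; mus[i]() draws one sample X_i in {0, 1} from mu_i
    s   -- number of independent samples drawn from each distribution
    """
    Z_hat = 1.0
    for mu in mus:
        rho_hat = sum(mu() for _ in range(s)) / s   # empirical mean rho-hat_i
        Z_hat *= rho_hat
    return Z_hat

# Usage with hypothetical Bernoulli distributions (each rho_i >= 1/2, as assumed).
rhos = [0.5, 0.7, 0.9]
mus = [lambda r=r: 1 if random.random() < r else 0 for r in rhos]
print(estimate_Z(mus, s=1000))
```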
Express [math]\displaystyle{ s }[/math] as a function of [math]\displaystyle{ n,\varepsilon,\delta }[/math] so that the output [math]\displaystyle{ \widehat{Z} }[/math] satisfies
- [math]\displaystyle{ \Pr\left[(1 - \varepsilon) Z \leq \widehat{Z} \leq (1 + \varepsilon)Z\right] \geq 1- \delta }[/math].
Try to make [math]\displaystyle{ s }[/math] as small as possible.