高级算法 (Fall 2021)/Problem Set 1

* Each problem must be answered with a complete solution showing all steps of the reasoning. Solutions may be written in either Chinese or English.

== Problem 1 ==
Recall that in class we showed, by the probabilistic method, how to deduce a <math>\frac{n(n-1)}{2}</math> upper bound on the number of distinct min-cuts in any multigraph <math>G</math> with <math>n</math> vertices from the <math>\frac{2}{n(n-1)}</math> lower bound on the success probability of Karger's min-cut algorithm.

Also recall that the <math>FastCut</math> algorithm taught in class is guaranteed to return a min-cut with probability at least <math>\Omega(1/\log n)</math>. Does this imply a much tighter <math>O(\log n)</math> upper bound on the number of distinct min-cuts in any multigraph <math>G</math> with <math>n</math> vertices? Prove your improved upper bound if your answer is "yes", and give a satisfactory explanation if your answer is "no".
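
For reference, the random contraction algorithm recalled above can be sketched in a few lines of Python. The multigraph is assumed to be connected and given as a vertex count together with an edge list, and the function name <code>karger_min_cut</code> is illustrative only; this sketch is an aid for recalling the algorithm, not part of the problem statement.

<syntaxhighlight lang="python">
import random

def karger_min_cut(n, edges):
    """One run of Karger's random contraction on a connected multigraph.

    n     -- number of vertices, labelled 0 .. n-1
    edges -- list of (u, v) pairs; parallel edges are allowed
    Returns the size of the cut found by this run, which is a min-cut
    with probability at least 2/(n(n-1)).
    """
    # Union-find structure representing the contracted super-vertices.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    remaining = n
    while remaining > 2:
        # Choose a uniformly random edge; edges whose endpoints are already
        # merged are self-loops and are simply skipped (rejection sampling).
        u, v = edges[random.randrange(len(edges))]
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        parent[ru] = rv                     # contract the chosen edge
        remaining -= 1

    # The cut consists of all edges crossing the two remaining super-vertices.
    return sum(1 for u, v in edges if find(u) != find(v))
</syntaxhighlight>

A single run finds any fixed min-cut with probability at least <math>\frac{2}{n(n-1)}</math>, which is exactly the bound that the counting argument recalled above is built on.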

== Problem 2 ==
Consider the function <math>f:\mathbb{R}^n\to\mathbb{R}</math> defined as
:<math>f(\vec x)=f(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i)</math>,
where <math>\{a_i\}_{1\le i\le n}</math> and <math>\{b_i\}_{1\le i\le n}</math> are unknown coefficients satisfying <math>a_i, b_i\in \mathbb{Z}</math> and <math>0\le a_i, b_i \le n</math> for all <math>1\le i\le n</math>.

Let <math>p>n</math> be the smallest prime strictly greater than <math>n</math>. The function <math>g:\mathbb{Z}_p^n\to\mathbb{Z}_p</math> is defined as
:<math>g(\vec x)=g(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i)</math>,
where <math>+</math> and <math>\cdot</math> are defined over the finite field <math>\mathbb{Z}_p</math>.

By the properties of finite fields, for any value <math>\vec r\in\mathbb{Z}_p^n</math>, it holds that <math>g(\vec r)=f(\vec r)\bmod p</math>.

Since the coefficients <math>\{a_i\}_{1\le i\le n}</math> and <math>\{b_i\}_{1\le i\le n}</math> are unknown, you cannot evaluate <math>f(\vec x)</math> directly. However, there exists an oracle <math>O</math>: each time <math>O</math> receives an input <math>\vec x</math>, it immediately outputs the value of <math>g(\vec x)</math>.

# Prove that <math>f\not\equiv 0 \Rightarrow g\not\equiv 0</math>.
# Use the oracle <math>O</math> to design an algorithm that determines whether <math>f\equiv 0</math>, with error probability at most <math>\epsilon</math>, where <math>\epsilon\in (0,1)</math> is a constant.
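
To make the setting of part 2 concrete, here is a minimal sketch of a randomized test that only queries the oracle, assuming the oracle is modelled as a Python callable <code>oracle</code>; the number of queries <code>t</code> is deliberately left as a parameter, since relating it to <math>\epsilon</math> is the point of the exercise.

<syntaxhighlight lang="python">
import random

def is_identically_zero(oracle, n, p, t):
    """Randomized test of whether f ≡ 0, using only the oracle for g over Z_p.

    oracle -- callable that maps a length-n tuple over Z_p to g at that point
    n, p   -- number of variables and the prime modulus (p > n)
    t      -- number of independent random queries
    """
    for _ in range(t):
        # Query g at a uniformly random point of Z_p^n.
        r = tuple(random.randrange(p) for _ in range(n))
        if oracle(r) % p != 0:
            # g(r) = f(r) mod p is nonzero, so f is certainly not identically zero.
            return False
    # Every query returned 0: report f ≡ 0.  This answer can be wrong, but the
    # error probability decays geometrically in t.
    return True
</syntaxhighlight>

The error is one-sided: whenever the test answers "not identically zero" it is correct, while the answer "identically zero" may be wrong with a probability that your analysis needs to bound by <math>\epsilon</math>.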

== Problem 3 ==
Let <math>X_1,X_2,\ldots,X_n</math> be <math>n</math> random variables, where each <math>X_i \in \{0, 1\}</math> follows the distribution <math>\mu_i</math>. For each <math>1\leq i \leq n</math>, let <math>\rho_i = \mathbb{E}[X_i]</math> and assume <math>\rho_i \geq \frac{1}{2}</math>. Consider the problem of estimating the value of
:<math>Z = \prod_{i = 1}^n \rho_i</math>.
For each <math>1\leq i \leq n</math>, the algorithm draws <math>s</math> random samples <math>X_i^{(1)},X_i^{(2)},\ldots,X_i^{(s)}</math> independently from the distribution <math>\mu_i</math>, and computes
:<math>\widehat{\rho}_{i}=\frac{1}{s}\sum_{j=1}^s X_i^{(j)}</math>.
Finally, the algorithm outputs the product of all <math>\widehat{\rho}_{i}</math>:
:<math>\widehat{Z}=\prod_{i= 1}^n\widehat{\rho}_i</math>.
Express <math>s</math> as a function of <math>n,\varepsilon,\delta</math> so that the output <math>\widehat{Z}</math> satisfies
:<math>\Pr\left[(1 - \varepsilon) Z \leq \widehat{Z} \leq (1 + \varepsilon)Z\right] \geq 1- \delta</math>.
Try to make <math>s</math> as small as possible.
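
The estimator described above translates directly into code. In the sketch below each <math>\mu_i</math> is simulated as a Bernoulli distribution whose mean <math>\rho_i</math> is supplied only so that samples can be generated; this simulation setup is an assumption made for illustration and is not part of the problem.

<syntaxhighlight lang="python">
import random

def estimate_Z(rhos, s):
    """Compute the estimator Z_hat = prod_i rho_hat_i from s samples per variable.

    rhos -- the true means rho_1, ..., rho_n; they are used here only to
            simulate drawing X_i^{(j)} from mu_i = Bernoulli(rho_i)
    s    -- number of independent samples drawn for each variable
    """
    Z_hat = 1.0
    for rho in rhos:
        # Draw s independent samples of X_i and take their empirical mean.
        samples = [1 if random.random() < rho else 0 for _ in range(s)]
        rho_hat = sum(samples) / s
        Z_hat *= rho_hat
    return Z_hat
</syntaxhighlight>

Experimenting with a fixed <math>s</math> while <math>n</math> grows illustrates how the relative errors of the <math>n</math> factors compound in the product, which is the effect that your choice of <math>s</math> has to control.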

== Problem 4 ==
Suppose there is a coin <math>C</math>. During each query, it outputs HEAD with probability <math>p</math> and TAIL with probability <math>1-p</math>, where <math>p \in (0, 1)</math> is a real number.
* Let <math>q \in (0, 1)</math> be another real number. Design an algorithm that outputs HEAD with probability <math>q</math> and TAIL with probability <math>1-q</math>. There is no source of randomness available to your algorithm other than the coin <math>C</math>. Make sure that your algorithm halts with probability <math>1</math>. (One possible approach is sketched after this list.)
* What is the expected number of queries to the coin <math>C</math> that your algorithm makes before it halts?
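
One possible approach to the first part (sketched below, and not necessarily the intended one) combines von Neumann's trick for extracting fair bits from the biased coin with a lazy bit-by-bit comparison against the binary expansion of <math>q</math>. The coin <math>C</math> is modelled as a Python callable <code>flip</code> returning <code>True</code> for HEAD, and <code>Fraction</code> is used so that the bits of <math>q</math> can be handled exactly; both are assumptions made for illustration. The expected number of queries is left for you to analyse.

<syntaxhighlight lang="python">
import random
from fractions import Fraction

def fair_bit(flip):
    """Von Neumann's trick: extract an unbiased bit from the biased coin.

    HEAD-TAIL and TAIL-HEAD are equally likely (each has probability p(1-p)),
    so conditioning on the two flips being different yields a fair bit.
    """
    while True:
        a, b = flip(), flip()
        if a != b:
            return 1 if a else 0

def simulate_q_coin(flip, q):
    """Output HEAD with probability q using only the biased coin `flip`.

    Builds a uniform U = 0.b1 b2 b3 ... in [0, 1) one fair bit at a time and
    compares it lazily with the binary expansion of q; the output is HEAD
    exactly when U < q, which happens with probability q.  Each round stops
    with probability 1/2, so the procedure halts with probability 1.
    """
    q = Fraction(q)
    while True:
        q *= 2
        q_bit = 1 if q >= 1 else 0     # next bit of the binary expansion of q
        if q_bit:
            q -= 1
        u_bit = fair_bit(flip)         # next bit of the uniform number U
        if u_bit != q_bit:
            return "HEAD" if u_bit < q_bit else "TAIL"
        # bits agree so far: keep refining the comparison

if __name__ == "__main__":
    p, q = 0.7, Fraction(1, 3)
    flip = lambda: random.random() < p          # stand-in for the coin C
    runs = 100_000
    heads = sum(simulate_q_coin(flip, q) == "HEAD" for _ in range(runs))
    print(heads / runs)                         # should be close to 1/3
</syntaxhighlight>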