高级算法 (Fall 2023)/Problem Set 1

  • The problem set is still being updated and is not yet the final version.
  • Every solution must include the complete reasoning; answers may be written in either Chinese or English.
  • We recommend typesetting your solutions with LaTeX, Markdown, or similar tools.

Problem 1 (Min-cut/Max-cut)

  • [Counting [math]\displaystyle{ \alpha }[/math]-approximate min-cut] For any [math]\displaystyle{ \alpha \ge 1 }[/math], a cut is called an [math]\displaystyle{ \alpha }[/math]-approximate min-cut in a multigraph [math]\displaystyle{ G }[/math] if the number of edges in it is at most [math]\displaystyle{ \alpha }[/math] times that of the min-cut. Prove that the number of [math]\displaystyle{ \alpha }[/math]-approximate min-cuts in a multigraph [math]\displaystyle{ G }[/math] on [math]\displaystyle{ n }[/math] vertices is at most [math]\displaystyle{ n^{2\alpha} / 2 }[/math]. Hint: Run Karger's algorithm until it has [math]\displaystyle{ \lceil 2\alpha \rceil }[/math] supernodes. What is the chance that a particular [math]\displaystyle{ \alpha }[/math]-approximate min-cut is still available? How many possible cuts does this collapsed graph have? (For reference, a Python sketch of the contraction procedure appears after this list.)
  • [Weighted min-cut problem] Modify Karger's contraction algorithm so that it works for the weighted min-cut problem. Prove that the modified algorithm returns a weighted minimum cut with probability at least [math]\displaystyle{ {2}/{n(n-1)} }[/math]. The weighted min-cut problem is defined as follows.
    • Input: an undirected weighted graph [math]\displaystyle{ G(V, E) }[/math], where every edge [math]\displaystyle{ e \in E }[/math] is associated with a positive real weight [math]\displaystyle{ w_e }[/math]
    • Output: a cut [math]\displaystyle{ C }[/math] in [math]\displaystyle{ G }[/math] such that [math]\displaystyle{ \sum_{e \in C} w_e }[/math] is minimized.
  • [Max directed-cut] In the maximum directed cut (MAX-DICUT) problem, we are given as input a directed graph [math]\displaystyle{ G(V,E) }[/math]. Each directed arc [math]\displaystyle{ (i, j) \in E }[/math] has nonnegative weight [math]\displaystyle{ w_{ij} \ge 0 }[/math]. The goal is to partition [math]\displaystyle{ V }[/math] into disjoint sets [math]\displaystyle{ U }[/math] and [math]\displaystyle{ W=V\setminus U }[/math] so as to maximize the total weight of the arcs going from [math]\displaystyle{ U }[/math] to [math]\displaystyle{ W }[/math]. Give a randomized [math]\displaystyle{ 1/4 }[/math]-approximation algorithm for this problem.
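
For reference, here is a minimal Python sketch of the (unweighted) contraction procedure that the hints above refer to: repeatedly pick a uniformly random remaining edge and merge its endpoints, stopping once only t supernodes remain (t = 2 for the basic min-cut algorithm, t = ⌈2α⌉ in the first hint). The multigraph representation, function names, and the union-find shortcut are illustrative choices, not part of the problem statement.

    import random

    def karger_contract(edges, n, t=2):
        """Contract a multigraph, given as a list of edges (u, v) on vertices
        0..n-1, until t supernodes remain; return the surviving edges, i.e.
        the edges crossing the induced partition of the vertices."""
        parent = list(range(n))              # union-find over supernodes

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        edges = list(edges)
        supernodes = n
        while supernodes > t and edges:
            u, v = random.choice(edges)      # uniform over remaining multi-edges
            ru, rv = find(u), find(v)
            if ru == rv:                     # self-loop: just discard it
                edges.remove((u, v))
                continue
            parent[ru] = rv                  # contract the chosen edge
            supernodes -= 1
            # drop edges that became self-loops after the merge
            edges = [e for e in edges if find(e[0]) != find(e[1])]
        return edges

Running karger_contract repeatedly with t = 2 and keeping the smallest cut found recovers a min-cut with probability at least 2/(n(n-1)) per run, as discussed in class.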

Problem 2 (Fingerprinting)

  • [Polynomial Identity Testing] Consider the function [math]\displaystyle{ f:\mathbb{R}^n\to\mathbb{R} }[/math] defined as
[math]\displaystyle{ f(\vec x)=f(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i) }[/math],

where [math]\displaystyle{ \{a_i\}_{1\le i\le n} }[/math] and [math]\displaystyle{ \{b_i\}_{1\le i\le n} }[/math] are unknown coefficients satisfying [math]\displaystyle{ a_i, b_i\in \mathbb{Z} }[/math] and [math]\displaystyle{ 0\le a_i, b_i \le n }[/math] for all [math]\displaystyle{ 1\le i\le n }[/math].

Let [math]\displaystyle{ p\gt n }[/math] be the smallest prime strictly greater than [math]\displaystyle{ n }[/math]. The function [math]\displaystyle{ g:\mathbb{Z}_p^n\to\mathbb{Z}_p }[/math] is defined as

[math]\displaystyle{ g(\vec x)=g(x_1,x_2,\dots,x_n)=\prod_{i=1}^{n}(a_ix_i-b_i) }[/math],

where [math]\displaystyle{ + }[/math] and [math]\displaystyle{ \cdot }[/math] are defined over the finite field [math]\displaystyle{ \mathbb{Z}_p }[/math].

By the properties of finite fields, for any value [math]\displaystyle{ \vec r\in\mathbb{Z}_p^n }[/math], it holds that [math]\displaystyle{ g(\vec r)=f(\vec r)\bmod p }[/math]. (A small numeric illustration of this identity appears at the end of this problem.)

Since the coefficients [math]\displaystyle{ \{a_i\}_{1\le i\le n} }[/math] and [math]\displaystyle{ \{b_i\}_{1\le i\le n} }[/math] are unknown, you cannot calculate [math]\displaystyle{ f(\vec x) }[/math] directly. However, there is an oracle [math]\displaystyle{ O }[/math] that, on any input [math]\displaystyle{ \vec x }[/math], immediately outputs the value of [math]\displaystyle{ g(\vec x) }[/math].

  1. Prove that [math]\displaystyle{ f\not\equiv 0 \Rightarrow g\not\equiv 0 }[/math].
  2. Use the oracle [math]\displaystyle{ O }[/math] to design an algorithm to determine whether [math]\displaystyle{ f\equiv 0 }[/math], with error probability at most [math]\displaystyle{ \epsilon }[/math], where [math]\displaystyle{ \epsilon\in (0,1) }[/math] is a constant.
  • [Test isomorphism of rooted trees] Two rooted trees [math]\displaystyle{ T_1 }[/math] and [math]\displaystyle{ T_2 }[/math] are said to be isomorphic if there exists a one-to-one mapping [math]\displaystyle{ f }[/math] from the nodes of [math]\displaystyle{ T_1 }[/math] to those of [math]\displaystyle{ T_2 }[/math] satisfying the following condition: [math]\displaystyle{ v }[/math] is a child of [math]\displaystyle{ w }[/math] in [math]\displaystyle{ T_1 }[/math] if and only if [math]\displaystyle{ f(v) }[/math] is a child of [math]\displaystyle{ f(w) }[/math] in [math]\displaystyle{ T_2 }[/math]. Observe that no ordering is assumed on the children of any vertex. Devise an efficient randomized algorithm for testing the isomorphism of rooted trees and analyze its performance. Hint: Recursively associate a polynomial [math]\displaystyle{ P_v }[/math] with each vertex [math]\displaystyle{ v }[/math] in a tree [math]\displaystyle{ T }[/math].
  • [2D pattern matching] Consider the problem of image matching. You are given an [math]\displaystyle{ n\times n }[/math]-bit matrix as "text" and an [math]\displaystyle{ m\times m }[/math]-bit “pattern” you want to find in the text. Devise an efficient (expected time [math]\displaystyle{ O(n^2) }[/math] ) algorithm for the problem. Hint: The key is rapidly updating a fingerprint as you shift the “window” over the matrix. You may first want to consider the case of an [math]\displaystyle{ n \times m }[/math]-bit “text.” Can you transform this into a standard string-matching problem?
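
As a quick numeric illustration of the identity [math]\displaystyle{ g(\vec r)=f(\vec r)\bmod p }[/math] from the polynomial identity testing item above, the following Python snippet evaluates both functions at one point. The coefficients and the evaluation point are made up for the example only; in the problem they are unknown and only the oracle [math]\displaystyle{ O }[/math] is available.

    # Hypothetical coefficients for illustration; in the problem they are unknown.
    n, p = 4, 5                      # p = 5 is the smallest prime greater than n = 4
    a = [3, 0, 2, 4]                 # satisfy 0 <= a_i, b_i <= n
    b = [2, 2, 0, 3]

    def f(x):                        # product over the integers
        prod = 1
        for ai, bi, xi in zip(a, b, x):
            prod *= ai * xi - bi
        return prod

    def g(x):                        # the same product, reduced modulo p at each step
        prod = 1
        for ai, bi, xi in zip(a, b, x):
            prod = prod * (ai * xi - bi) % p
        return prod

    r = [2, 4, 1, 3]                 # an arbitrary point of Z_p^n
    print(f(r), g(r), f(r) % p)      # g(r) always equals f(r) mod p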

Problem 3 (Hashing)

  • [Set differences] Fix a universe [math]\displaystyle{ U }[/math] and two subsets [math]\displaystyle{ A,B \subseteq U }[/math], both of size [math]\displaystyle{ n }[/math]. We create Bloom filters [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math] for [math]\displaystyle{ A }[/math] and [math]\displaystyle{ B }[/math] respectively, using the same number of bits [math]\displaystyle{ m }[/math] and the same [math]\displaystyle{ k }[/math] hash functions.
  1. Let [math]\displaystyle{ F_C = F_A \land F_B }[/math] be the Bloom filter formed by computing the bitwise AND of [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math]. Argue that [math]\displaystyle{ F_C }[/math] may not always be the same as the Bloom filter that would be created for [math]\displaystyle{ A\cap B }[/math].
  2. Bloom filters can be used to estimate set differences. Express the expected number of bits where [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math] differ as a function of [math]\displaystyle{ m, n, k }[/math] and [math]\displaystyle{ |A\cap B| }[/math].
  • [Count Distinct Element] In class, we saw how to estimate the number of distinct elements in a data stream using the Flajolet-Martin algorithm. Consider the following alternative formulation of the distinct elements problem: given an [math]\displaystyle{ N }[/math] dimensional vector [math]\displaystyle{ x }[/math], we want to process a stream of arbitrary increments to entries in [math]\displaystyle{ x }[/math]. In other words, if we see a number [math]\displaystyle{ i\in 1,\dots,N }[/math] in the stream, we update entry [math]\displaystyle{ x_i\gets x_i + 1 }[/math]. Our goal is to estimate [math]\displaystyle{ \left \|x\right \|_0 }[/math], which measures the number of non-zero entries in [math]\displaystyle{ x }[/math]. With [math]\displaystyle{ x }[/math] viewed as a histogram that maintains counts for [math]\displaystyle{ N }[/math] potential elements, [math]\displaystyle{ \left \|x\right \|_0 }[/math] is exactly the number of distinct elements processed. In this problem we will develop an alternative algorithm for estimating [math]\displaystyle{ \left \|x\right \|_0 }[/math] that can also handle decrements to entries in x. Specifically, instead of the stream containing just indices [math]\displaystyle{ i }[/math], it contains pairs [math]\displaystyle{ (i, +) }[/math] and [math]\displaystyle{ (i, −) }[/math]. On receiving [math]\displaystyle{ (i, +) }[/math], [math]\displaystyle{ x }[/math] should update so that [math]\displaystyle{ x_i\gets x_i + 1 }[/math] and on receiving [math]\displaystyle{ (i, −) }[/math], [math]\displaystyle{ x }[/math] should update so that [math]\displaystyle{ x_i\gets x_i - 1 }[/math]. For this problem we will assume that, at the end of our stream, each [math]\displaystyle{ x_i \ge 0 }[/math] (i.e. for a specific index we can’t receive more decrements than increments).
  1. Consider a simpler problem. For a given value [math]\displaystyle{ T }[/math], let’s design an algorithm that succeeds with probability [math]\displaystyle{ (1 − \delta) }[/math], outputting LOW if [math]\displaystyle{ T \lt \frac{1}{2}\left \|x\right \|_0 }[/math] and HIGH if [math]\displaystyle{ T \gt 2\left \|x\right \|_0 }[/math]:
    • Assume we have access to a completely random hash function [math]\displaystyle{ h(\cdot) }[/math] that maps each [math]\displaystyle{ i }[/math] to a random point in [math]\displaystyle{ [0, 1] }[/math]. We maintain the estimator [math]\displaystyle{ s=\sum_{i:h(i)\lt \frac{1}{2T}}x_i }[/math] as we receive increment and decrement updates (a Python sketch of maintaining this estimator appears after this list). Show that, at the end of our stream, (i) If [math]\displaystyle{ T \lt \frac{1}{2}\left \|x\right \|_0 }[/math], [math]\displaystyle{ \Pr_h[s=0]\lt 1/e\approx 0.37 }[/math] and (ii) If [math]\displaystyle{ T \gt 2\left \|x\right \|_0 }[/math], [math]\displaystyle{ \Pr_h[s=0]\gt 0.5 }[/math].
    • Using this fact, show how to use [math]\displaystyle{ k=O(\log 1/\delta) }[/math] independent random hash functions, and corresponding individual estimators [math]\displaystyle{ s_1, s_2, \dots, s_k }[/math], to output LOW if [math]\displaystyle{ T \lt \frac{1}{2}\left \|x\right \|_0 }[/math] and HIGH if [math]\displaystyle{ T \gt 2\left \|x\right \|_0 }[/math]. If neither event occurs you can output either LOW or HIGH. Your algorithm should succeed with probability [math]\displaystyle{ (1 − \delta) }[/math].
  2. Using [math]\displaystyle{ O(\log N) }[/math] repetitions of your algorithm for the above decision problem (with [math]\displaystyle{ \delta }[/math] set appropriately), show how to obtain an estimate [math]\displaystyle{ F }[/math] for [math]\displaystyle{ \left \|x\right \|_0 }[/math] such that [math]\displaystyle{ \frac{1}{4}\left \|x\right \|_0\le F\le 4\left \|x\right \|_0 }[/math] w.h.p. (with probability [math]\displaystyle{ 1-O(1/N) }[/math]).
  • [Rethinking bloom filter] In class, we argued that with the right setting of parameters, the probability of a false positive for an [math]\displaystyle{ m }[/math]-bit bloom filter on [math]\displaystyle{ n }[/math] items was roughly [math]\displaystyle{ (.6185)^{m/n} }[/math]. This problem will consider another approach. Divide the [math]\displaystyle{ m }[/math] bits into [math]\displaystyle{ r }[/math] "blocks" of [math]\displaystyle{ m/r }[/math] bits. Associate a (random) [math]\displaystyle{ 2 }[/math]-universal hash function with each block. For each item, hash it to one bit in each block using the block’s hash function, and set those bits to [math]\displaystyle{ 1 }[/math]. Now consider an item [math]\displaystyle{ x }[/math] not in the inserted set, and suppose we hash it into the blocks in the same way. We wish to bound the probability we get a false positive in this data structure (i.e. all bits [math]\displaystyle{ 1 }[/math]).
  1. Bound the probability that, in a particular block, [math]\displaystyle{ x }[/math] hashes to a set bit, and from that derive a bound on the probability that [math]\displaystyle{ x }[/math] is a false positive.
  2. Determine the best choice of [math]\displaystyle{ r }[/math] to minimize the above probability as a function of [math]\displaystyle{ m }[/math] and [math]\displaystyle{ n }[/math].
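
A minimal Python sketch of maintaining the estimator [math]\displaystyle{ s }[/math] from the distinct-elements item above. The "completely random" hash function is simulated here by lazily caching independent uniform draws per index; the stream format (pairs of an index and '+' or '-') and all names are illustrative assumptions.

    import random

    def final_s(stream, T, seed=0):
        """Process (i, '+') / (i, '-') updates and return the final value of
        s = sum of x_i over indices i with h(i) < 1/(2T)."""
        rng = random.Random(seed)
        h = {}                               # lazily sampled h(i), uniform in [0, 1]
        s = 0
        for i, sign in stream:
            if i not in h:
                h[i] = rng.random()
            if h[i] < 1.0 / (2 * T):
                s += 1 if sign == '+' else -1
        return s

The quantities asked about above are the probabilities, over the choice of the hash function (here, the seed), that this final value equals zero.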

Problem 4 (Concentration of measure)

  • [[math]\displaystyle{ k }[/math]-th moment bound] Let [math]\displaystyle{ X }[/math] be a random variable with expectation [math]\displaystyle{ 0 }[/math] such that the moment generating function [math]\displaystyle{ \mathbf{E}[\exp(t|X|)] }[/math] is finite for some [math]\displaystyle{ t \gt 0 }[/math]. We can use the following two kinds of tail inequalities for [math]\displaystyle{ X }[/math].

Chernoff Bound

[math]\displaystyle{ \begin{align} \mathbf{Pr}[|X| \geq \delta] \leq \min_{t \geq 0} \frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}} \end{align} }[/math]

[math]\displaystyle{ k }[/math]th-Moment Bound

[math]\displaystyle{ \begin{align} \mathbf{Pr}[|X| \geq \delta] \leq \frac{\mathbf{E}[|X|^k]}{\delta^k} \end{align} }[/math]
  1. Show that for each [math]\displaystyle{ \delta }[/math], there exists a choice of [math]\displaystyle{ k }[/math] such that the [math]\displaystyle{ k }[/math]th-moment bound is stronger than the Chernoff bound. Hint: Consider the Taylor expansion of the moment generating function and apply the probabilistic method. (A numeric illustration comparing the two bounds appears after this list.)
  2. Why would we still prefer the Chernoff bound to the (seemingly) stronger [math]\displaystyle{ k }[/math]-th moment bound?
  • [the median trick]
  • [cut size in random graph]
  • [code rate of boolean code]
  • [balls into bins with the "power of two choices"]
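
As a purely numeric illustration of the first question above (not a proof), the Python snippet below evaluates both bounds exactly for one concrete zero-mean variable [math]\displaystyle{ X }[/math] (the number of heads in 100 fair coin flips, minus 50), with [math]\displaystyle{ \delta = 15 }[/math]. The distribution, the value of [math]\displaystyle{ \delta }[/math], and the search ranges for [math]\displaystyle{ t }[/math] and [math]\displaystyle{ k }[/math] are arbitrary choices for the example; the best [math]\displaystyle{ k }[/math]-th moment bound should come out no larger than the Chernoff-style bound.

    import math

    n, delta = 100, 15                   # X = (#heads in n fair flips) - n/2
    pmf  = [math.comb(n, s) / 2 ** n for s in range(n + 1)]
    absx = [abs(s - n / 2) for s in range(n + 1)]

    def chernoff(t):                     # E[exp(t|X|)] / exp(t * delta)
        return sum(p * math.exp(t * a) for p, a in zip(pmf, absx)) / math.exp(t * delta)

    def moment(k):                       # E[|X|^k] / delta^k
        return sum(p * a ** k for p, a in zip(pmf, absx)) / delta ** k

    chernoff_bound = min(chernoff(t / 100) for t in range(1, 301))   # t in (0, 3]
    moment_bound   = min(moment(k) for k in range(1, 61))            # k in {1, ..., 60}
    exact_tail     = sum(p for p, a in zip(pmf, absx) if a >= delta)

    print(f"exact Pr[|X| >= {delta}]   = {exact_tail:.4e}")
    print(f"best Chernoff-style bound = {chernoff_bound:.4e}")
    print(f"best k-th moment bound    = {moment_bound:.4e}")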

Problem 5 (Dimension reduction)

  • [inner product]
  • [linear separability]
  • [sparse vector]

Problem 6 (Lovász Local Lemma)

  • [colorable hypergraph]
  • [directed cycle]
  • [algorithmic LLL]