高级算法 (Fall 2020)/Problem Set 2

From TCS Wiki
Revision as of 14:27, 3 November 2020 by imported>TCSseminar (→‎Problem 2)
Jump to navigation Jump to search
  • 每道题目的解答都要有完整的解题过程。中英文不限。

Problem 1

Suppose we want to estimate the value of [math]\displaystyle{ Z }[/math]. Let [math]\displaystyle{ \mathcal{A} }[/math] be an algorithm that outputs [math]\displaystyle{ \widehat{Z} }[/math] satisfying [math]\displaystyle{ \Pr[ (1-\epsilon)Z \leq \widehat{Z} \leq (1+\epsilon )Z] \geq \frac{3}{4} . }[/math]

We run [math]\displaystyle{ \mathcal{A} }[/math] independently for [math]\displaystyle{ s }[/math] times, and obtain the outputs [math]\displaystyle{ \widehat{Z}_1,\widehat{Z}_2,\ldots,\widehat{Z}_s }[/math].

Let [math]\displaystyle{ X }[/math] be the median (中位数) of [math]\displaystyle{ \widehat{Z}_1,\widehat{Z}_2,\ldots,\widehat{Z}_s }[/math]. Find the number [math]\displaystyle{ s }[/math] such that [math]\displaystyle{ \Pr[ (1-\epsilon)Z \leq X \leq (1+\epsilon )Z] \geq 1 - \delta . }[/math]

Express [math]\displaystyle{ s }[/math] as a function of [math]\displaystyle{ \delta }[/math]. Make [math]\displaystyle{ s }[/math] as small as possible.

Remark: in this problem, we boost the probability of success from [math]\displaystyle{ \frac{3}{4} }[/math] to [math]\displaystyle{ 1-\delta }[/math]. This method is called the median trick.

Hint: Chernoff bound.

Problem 2

In class, we saw how to estimate the number of distinct elements in a data stream using the Flajolet-Martin algorithm. Consider the following alternative formulation of the distinct elements problem: given an [math]\displaystyle{ N }[/math] dimensional vector [math]\displaystyle{ x }[/math], we want to process a stream of arbitrary increments to entries in [math]\displaystyle{ x }[/math]. In other words, if we see a number [math]\displaystyle{ i\in 1,\dots,N }[/math] in the stream, we update entry [math]\displaystyle{ x_i\gets x_i + 1 }[/math]. Our goal is to estimate [math]\displaystyle{ \left \|x\right \|_0 }[/math], which measures the number of non-zero entries in [math]\displaystyle{ x }[/math]. With [math]\displaystyle{ x }[/math] viewed as a histogram that maintains counts for [math]\displaystyle{ N }[/math] potential elements, [math]\displaystyle{ \left \|x\right \|_0 }[/math] is exactly the number of distinct elements processed. In this problem we will develop an alternative algorithm for estimating [math]\displaystyle{ \left \|x\right \|_0 }[/math] that can also handle decrements to entries in x. Specifically, instead of the stream containing just indices [math]\displaystyle{ i }[/math], it contains pairs [math]\displaystyle{ (i, +) }[/math] and [math]\displaystyle{ (i, −) }[/math]. On receiving [math]\displaystyle{ (i, +) }[/math], [math]\displaystyle{ x }[/math] should update so that [math]\displaystyle{ x_i\gets x_i + 1 }[/math] and on receiving [math]\displaystyle{ (i, −) }[/math], [math]\displaystyle{ x }[/math] should update so that [math]\displaystyle{ x_i\gets x_i - 1 }[/math]. For this problem we will assume that, at the end of our stream, each [math]\displaystyle{ x_i \ge 0 }[/math] (i.e. for a specific index we can’t receive more decrements than increments).

  • Consider a simpler problem. For a given value [math]\displaystyle{ T }[/math], let’s design an algorithm that succeeds with probability [math]\displaystyle{ (1 − \delta) }[/math], outputing LOW if [math]\displaystyle{ T \lt \frac{1}{2}\left \|x\right \|_0 }[/math] and HIGH if [math]\displaystyle{ T \gt 2\left \|x\right \|_0 }[/math]:
    • Assume we have access to a completely random hash function [math]\displaystyle{ h(\cdot) }[/math] that maps each [math]\displaystyle{ i }[/math] to a random point in [math]\displaystyle{ [0, 1] }[/math]. We maintain the estimator [math]\displaystyle{ s=\sum_{i:h(i)\lt \frac{1}{2T}}x_i }[/math] as we receive increment and decrement updates. Show that, at the end of our stream,
      (i) If [math]\displaystyle{ T \lt \frac{1}{2}\left \|x\right \|_0 }[/math], [math]\displaystyle{ Pr_h[s=0]\lt 1/e\approx 0.37 }[/math];
      (ii) If [math]\displaystyle{ T \gt 2\left \|x\right \|_0 }[/math], [math]\displaystyle{ Pr_h[s=0]\gt 0.5 }[/math].
    • Using this fact, show how to use [math]\displaystyle{ k=O(\log 1/\delta) }[/math] independent random hash functions, and corresponding individual estimators [math]\displaystyle{ s_1, s_2, . . . , s_k }[/math], to output LOW if [math]\displaystyle{ T \lt \frac{1}{2}\left \|x\right \|_0 }[/math] and HIGH if [math]\displaystyle{ T \gt 2\left \|x\right \|_0 }[/math]. If neither event occurs you can output either LOW or HIGH. Your algorithm should succeed with probability [math]\displaystyle{ (1 − \delta) }[/math].
  • Using [math]\displaystyle{ O(\log N) }[/math] repetitions of your algorithm for the above decision problem (with [math]\displaystyle{ \delta }[/math] set appropriately), show how to obtain an estimate [math]\displaystyle{ F }[/math] for [math]\displaystyle{ \left \|x\right \|_0 }[/math] such that [math]\displaystyle{ \frac{1}{4}\left \|x\right \|_0\le F\le 4\left \|x\right \|_0 }[/math] w.h.p.(with probability [math]\displaystyle{ 1-O(1/N) }[/math]).

Problem 3

Let [math]\displaystyle{ U }[/math] be a universal set. We use [math]\displaystyle{ 2^U }[/math] to denote the collection of all subsets of [math]\displaystyle{ U }[/math]. Let [math]\displaystyle{ \mathcal{F} }[/math] be a family of hash functions, in which each hash function [math]\displaystyle{ h:2^U \rightarrow \{0,1\}^m }[/math] maps subsets of [math]\displaystyle{ U }[/math] to 0-1 vectors of length [math]\displaystyle{ m }[/math]. A locality sensitive hashing scheme is a distribution on a family [math]\displaystyle{ \mathcal{F} }[/math] of hash functions operating on [math]\displaystyle{ 2^U }[/math], such that for two subsets [math]\displaystyle{ A,B \in 2^U }[/math],

[math]\displaystyle{ (1) \qquad \Pr_{h\in\mathcal{F}}[h(A)=h(B)]=sim(A,B). }[/math]

Here [math]\displaystyle{ sim:2^U \times 2^U \rightarrow [0,1] }[/math] is called the similarity function. Given a hash function family [math]\displaystyle{ \mathcal{F} }[/math] that satisfies Equation (1), we will say that [math]\displaystyle{ \mathcal{F} }[/math] is a locality sensitive hash function family corresponding to similarity function [math]\displaystyle{ sim(\cdot,\cdot) }[/math].

  • For any similarity function [math]\displaystyle{ sim(A,B) }[/math] that admits a locality sensitive hash function family as defined in Equation (1), prove that the distance function [math]\displaystyle{ d(A,B) \triangleq 1-sim(A,B) }[/math] satisfies triangle inequality, formally,

[math]\displaystyle{ \forall A,B,C \in 2^U :\quad d(A,B) + d(B,C) \geq d(A,C). }[/math]

  • Show that there is no locality sensitive hash function family corresponding to Dice's and the Overlap coefficient. Dice's coefficient is defined as:

[math]\displaystyle{ sim_{Dice}(A,B)=\frac{2|A\cap B|}{|A| + |B|}. }[/math]

Overlap coefficient is defined as:

[math]\displaystyle{ sim_{Ovl}(A,B) = \frac{|A\cap B|}{\min(|A|, |B|)}. }[/math]

Hint: use the triangle inequality result.

  • Construct a collection of hash function [math]\displaystyle{ \mathcal{B} }[/math] where [math]\displaystyle{ f : \{0,1\}^m \rightarrow \{0,1\} }[/math] for each [math]\displaystyle{ f \in \mathcal{B} }[/math], together with a probability distribution on [math]\displaystyle{ \mathcal{B} }[/math] such that

[math]\displaystyle{ \forall x, y \in \{0,1\}^m:\quad \Pr_{f \in \mathcal{B}}[f(x) = f(y)] = \begin{cases} 1 &\text{ if } x= y;\\ \frac{1}{2} &\text{ if } x \neq y. \end{cases} }[/math]

Then use the hash function family [math]\displaystyle{ \mathcal{B} }[/math] to prove the following result. Given a locality sensitive hash function family [math]\displaystyle{ \mathcal{F} }[/math] ([math]\displaystyle{ h:2^U \rightarrow \{0,1\}^m }[/math] for each [math]\displaystyle{ h \in \mathcal{F} }[/math]) corresponding to a similarity function [math]\displaystyle{ sim(A,B) }[/math], we can obtain a locality sensitive hash function [math]\displaystyle{ \mathcal{F}' }[/math] ([math]\displaystyle{ h':2^U \rightarrow \{0,1\} }[/math] for each [math]\displaystyle{ h' \in \mathcal{F}' }[/math]) corresponding to the similarity function [math]\displaystyle{ \frac{1+sim(A,B)}{2} }[/math].

Problem 4

Let [math]\displaystyle{ G(V,E) }[/math] be an undirected graph with positive edge weights [math]\displaystyle{ w:E\to\mathbb{Z}^+ }[/math]. Given a partition of [math]\displaystyle{ V }[/math] into [math]\displaystyle{ k }[/math] disjoint subsets [math]\displaystyle{ S_1,S_2,\ldots,S_k }[/math], we define

[math]\displaystyle{ w(S_1,S_2,\ldots,S_k)=\sum_{uv\in E\atop \exists i\neq j: u\in S_i,v\in S_j}w(uv) }[/math]

as the cost of the [math]\displaystyle{ k }[/math]-cut [math]\displaystyle{ \{S_1,S_2,\ldots,S_k\} }[/math]. Our goal is to find a [math]\displaystyle{ k }[/math]-cut with maximum cost.

  1. Give a poly-time greedy algorithm for finding the weighted max [math]\displaystyle{ k }[/math]-cut. Prove that the approximation ratio is [math]\displaystyle{ (1-1/k) }[/math].
  2. Consider the following local search algorithm for the weighted max cut (max 2-cut).
Fill in the blank parenthesis. Give an analysis of the running time of the algorithm. And prove that the approximation ratio is 0.5.
start with an arbitrary bipartition of [math]\displaystyle{ V }[/math] into disjoint [math]\displaystyle{ S_0,S_1 }[/math];

while (true) do

if [math]\displaystyle{ \exists i\in\{0,1\} }[/math] and [math]\displaystyle{ v\in S_i }[/math] such that (______________)
then [math]\displaystyle{ v }[/math] leaves [math]\displaystyle{ S_i }[/math] and joins [math]\displaystyle{ S_{1-i} }[/math];
continue;
end if
break;

end