随机算法 \ 高级算法 (Fall 2016)/Problem Set 1 and 高级算法 (Fall 2017)/Hashing and Sketching: Difference between pages

From TCS Wiki
(Difference between pages)
Jump to navigation Jump to search
imported>Etone
 
imported>Etone
 
Line 1: Line 1:
每道题目的解答都要有<font color="red" >完整的解题过程</font>。中英文不限。
=Count Distinct Elements=


== An estimator by hashing ==


== Problem 1==
==Flajolet-Martin algorithm==
For any <math>\alpha\ge 1</math>, a cut <math>C</math> in an undirected (multi)graph <math>G(V,E)</math> is called an <math>\alpha</math>-min-cut if <math>|C|\le\alpha|C^*|</math> where <math>C^*</math> is a min-cut in <math>G</math>.


# Give a lower bound to the probability that Karger's Random Contraction algorithm returns an <math>\alpha</math>-min-cut in a graph <math>G(V,E)</math> of <math>n</math> vertices.
= Set  Membership=
# Use the above bound to estimate the number of distinct <math>\alpha</math>-min cuts in <math>G</math>.


== Problem 2==
== Perfect hashing==
Let <math>G(V,E)</math> be an undirected graph with positive edge weights <math>w:E\to\mathbb{Z}^+</math>. Given a partition of <math>V</math> into <math>k</math> disjoint subsets <math>S_1,S_2,\ldots,S_k</math>, we define
:<math>w(S_1,S_2,\ldots,S_k)=\sum_{uv\in E\atop \exists i\neq j: u\in S_i,v\in S_j}w(uv)</math>
as the cost of the '''<math>k</math>-cut''' <math>\{S_1,S_2,\ldots,S_k\}</math>. Our goal is to find a <math>k</math>-cut with maximum cost.
# Give a poly-time greedy algorithm for finding the weighted max <math>k</math>-cut. Prove that the approximation ratio is <math>(1-1/k)</math>.
# Consider the following local search algorithm for the weighted max cut (max 2-cut).
::Fill in the blank parenthesis. Give an analysis of the running time of the algorithm. And prove that the approximation ratio is 0.5.
start with an arbitrary bipartition of <math>V</math> into disjoint <math>S_0,S_1</math>;
while (true) do
    if <math>\exists i\in\{0,1\}</math> and <math>v\in S_i</math> such that <font color=red>(______________)</font>
      then <math>v</math> leaves <math>S_i</math> and joins <math>S_{1-i}</math>;
      continue;
    end if
    break;
end


== Problem 3==
== Bloom filter ==
Given <math>m</math> subsets <math>S_1,S_2,\ldots, S_m\subseteq U</math> of a universe <math>U</math> of size <math>n</math>, we want to find a <math>C\subseteq\{1,2,\ldots, n\}</math> of fixed size <math>k=|C|</math> with the maximum '''coverage''' <math>\left|\bigcup_{i\in C}S_i\right|</math>.


* Give a poly-time greedy algorithm for the problem. Prove that the approximation ratio is <math>1-(1-1/k)^k>1-1/e</math>.
= Frequency Estimation=


== Problem 4==
== Count-min sketch==
We consider minimum makespan scheduling on parallel identical machines when jobs are subject to '''precedence constraints'''.
 
We still want to schedule <math>n</math> jobs <math>j=1,2,\ldots, n</math> on <math>m</math> identical machines, where job <math>j</math> has  processing time <math>p_j</math>. But now a partial order <math>\preceq</math> is defined on jobs, so that if <math>j\prec k</math> then job <math>j</math> must be completely finished before job <math>k</math> begins. The following is a variant of the ''List'' algorithm for this problem: we still assume that the input is a list of <math>n</math> jobs with processing times <math>p_1,p_2,\ldots, p_n</math>.
 
whenever a machine becomes idle
    assign the next ''available'' job on the list to the machine;
 
Here a job <math>k</math> is available if all jobs <math>j\prec k</math> have already been completely processed.
 
* Prove that the approximation ratio is 2.
 
== Problem 5 ==
For a '''hypergraph''' <math>H(V,E)</math> with vertex set <math>V</math>, every '''hyperedge''' <math>e\in E</math> is a subset <math>e\subset V</math> of vertices, not necessarily of size 2. A hypergraph <math>H(V,E)</math> is '''<math>k</math>-uniform''' if every hyperedge <math>e\in V</math> is of size <math>k=|e|</math>.
 
A hypergraph <math>H(V,E)</math> is said to have '''property B''' (named after Bernstein) if <math>H</math> is 2-coloable; that is, if there is a '''proper 2-coloring''' <math>f:V\to\{{\color{red}R},{\color{blue}B}\}</math> which assigns each vertex one of the two colors <font color=red>Red</font> or <font color=blue>Blue</font>, such that none of the hyperedge is ''monochromatic''.
 
# Let <math>H(V,E)</math> be a <math>k</math>-uniform hypergraph in which every hyperedge <math>e\in E</math> shares vertices with at most <math>d</math> other hyperedges.
#*Prove that if <math>2\mathrm{e}\cdot (d+1)\le 2^{k}</math>, then <math>H</math> has property B.
#*Describe how to use Moser's recursive Fix algorithm to find a proper 2-coloring of <math>H</math>. Give the pseudocode. Prove the condition in interns of <math>d</math> and <math>k</math> under which the algorithm can find a 2-coloring of <math>H</math> with high probability.
#*Describe how to use Moser-Tardos random solver to find a proper 2-coloring of <math>H</math>. Give the pseudocode. Prove the condition in interns of <math>d</math> and <math>k</math> under which the algorithm can find a 2-coloring of <math>H</math> within bounded expected time. Give an upper bound on the expected running time.
# Let <math>H(V,E)</math> be a hypergraph (not necessarily uniform) with at least <math>n\ge 2</math> vertices satisfying that
#:<math>\forall v\in V, \sum_{e\ni v}(1-1/n)^{-|e|}2^{-|e|+1}\le \frac{1}{n}</math>.
#*Prove that <math>H</math> has property B.
#*Describe how to use Moser-Tardos random solver to find a proper 2-coloring of <math>H</math>. Give an upper bound on the expected running time.

Revision as of 08:31, 10 October 2017

Count Distinct Elements

An estimator by hashing

Flajolet-Martin algorithm

Set Membership

Perfect hashing

Bloom filter

Frequency Estimation

Count-min sketch