高级算法 (Fall 2018)/Problem Set 1: Difference between revisions

From TCS Wiki
Jump to navigation Jump to search
imported>TCSseminar
(Created page with "== Problem 1== Recall that in class we show by the probabilistic method how to deduce a <math>\frac{n(n-1)}{2}</math> upper bound on the number of distinct min-cuts in any mul...")
 
imported>TCSseminar
Line 10: Line 10:


== Problem 3 ==
== Problem 3 ==
Consider the following problem:
Fix a universe <math>U</math> and two subset <math>A,B \subseteq U</math>, both with size <math>n</math>. we both create Bloom filters <math>F_A</math>(<math>F_B</math>) for <math>A</math> (<math>B</math>), using the same number of bits <math> m</math> and the same <math>k</math> hash functions.
*Fixed a Universe <math>U</math> of size <math>m</math>, given a set <math>S \subseteq U</math> of size <math>n</math> and for each <math>x \in S</math> a <math>r</math>-bit variable <math>v_x</math>, we want to store <math>S</math> and <math>\{v_x\}_{x\in S}</math> into a table, so that the query "Is <math>x\in S</math>, and if so, what's <math>v_x</math>?" can be answer quickly.
*Let <math>F_C = F_A \and F_B</math> be the Bloom filter formed by computing the bitwise AND of <math>F_A</math> and <math>F_B</math>. Argue that <math>F_C</math> may not always be the same as the Bloom filter that are created for <math>A\cap B </math>.
 
*Bloom filters can be used to estimate set differences. Express the expected number of bits where <math>F_A</math> and <math>F_B</math> differ as a function of <math>m, n, k</math> and <math>|A\cap B|</math>.
The above problem, which is often refered to as '''dictionary''',  is a generalization of the set membership problem that we have introduced in class, and there is another hash table called '''bloomier filter''' that solves this problem in linear (<math>O(nr)</math>) space and constant query time with tiny one-sided error probability.
 
Now if we are allowed to answer arbitrarily when <math>x\notin S</math>, then without resorting to bloomier filter, show that we can store any instance(<math>S</math> and <math>\{v_x\}_{x\in S}</math>) into a table using at most <math>O(nr+\log\log m)</math> bits space, and then correctly answer this relaxed query based on table contents only.
 
''Hint: You don't have to actually design an algorithm, just prove its existence is fine. You might want to use probabilistic method and [https://en.wikipedia.org/wiki/Perfect_hash_function perfect hashing].''

Revision as of 17:07, 18 September 2018

Problem 1

Recall that in class we show by the probabilistic method how to deduce a [math]\displaystyle{ \frac{n(n-1)}{2} }[/math] upper bound on the number of distinct min-cuts in any multigraph [math]\displaystyle{ G }[/math] with [math]\displaystyle{ n }[/math] vertices from the [math]\displaystyle{ \frac{2}{n(n-1)} }[/math] lower bound for success probability of Karger's min-cut algorithm.

Also recall that the [math]\displaystyle{ FastCut }[/math] algorithm taught in class guarantees to return a min-cut with probability at least [math]\displaystyle{ \Omega(1/\log n) }[/math]. Does this imply a much tighter [math]\displaystyle{ O(\log n) }[/math] upper bound on the number of distinct min-cuts in any multigraph [math]\displaystyle{ G }[/math] with [math]\displaystyle{ n }[/math] vertices? Prove your improved upper bound if your answer is "yes", and give a satisfactory explanation if your answer is "no".

Problem 2

Two rooted trees [math]\displaystyle{ T_1 }[/math] and [math]\displaystyle{ T_2 }[/math] are said to be isomorphic if there exists a bijection [math]\displaystyle{ \phi }[/math] that maps vertices of [math]\displaystyle{ T_1 }[/math] to those of [math]\displaystyle{ T_2 }[/math] satisfying the following condition: for each internal vertex [math]\displaystyle{ v }[/math] of [math]\displaystyle{ T_1 }[/math] with children [math]\displaystyle{ u_1,u_2,\ldots, u_k }[/math], the set of children of vertex [math]\displaystyle{ \phi(v) }[/math] in [math]\displaystyle{ T_2 }[/math] is precisely [math]\displaystyle{ \{\phi(u_1), \phi(u_2),\ldots,\phi(u_k)\} }[/math], no ordering among children assumed.

Give an efficient randomized algorithm with bounded one-sided error (false positive), for testing isomorphism between rooted trees with [math]\displaystyle{ n }[/math] vertices. Analyze your algorithm.

Problem 3

Fix a universe [math]\displaystyle{ U }[/math] and two subset [math]\displaystyle{ A,B \subseteq U }[/math], both with size [math]\displaystyle{ n }[/math]. we both create Bloom filters [math]\displaystyle{ F_A }[/math]([math]\displaystyle{ F_B }[/math]) for [math]\displaystyle{ A }[/math] ([math]\displaystyle{ B }[/math]), using the same number of bits [math]\displaystyle{ m }[/math] and the same [math]\displaystyle{ k }[/math] hash functions.

  • Let [math]\displaystyle{ F_C = F_A \and F_B }[/math] be the Bloom filter formed by computing the bitwise AND of [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math]. Argue that [math]\displaystyle{ F_C }[/math] may not always be the same as the Bloom filter that are created for [math]\displaystyle{ A\cap B }[/math].
  • Bloom filters can be used to estimate set differences. Express the expected number of bits where [math]\displaystyle{ F_A }[/math] and [math]\displaystyle{ F_B }[/math] differ as a function of [math]\displaystyle{ m, n, k }[/math] and [math]\displaystyle{ |A\cap B| }[/math].