随机算法 (Spring 2014)/Random Recurrence and 组合数学 (Spring 2014)/Problem Set 2: Difference between pages

From TCS Wiki
(Difference between pages)
Jump to navigation Jump to search
imported>Etone
 
imported>Etone
 
Line 1: Line 1:
=Random Quicksort=
== Problem 1==
Given as input a set <math>S</math> of <math>n</math> numbers, we want to sort the numbers in <math>S</math> in increasing order. One of the most famous algorithm for this problem is the  [http://en.wikipedia.org/wiki/Quicksort Quicksort] algorithm.
Prove the following identity:  
* if <math>|S|>1</math> do:
*<math>\sum_{k=1}^n k{n\choose k}= n2^{n-1}</math>.
** pick an <math>x\in S</math> as the ''pivot'';
** partition <math>S</math> into <math>S_1</math>, <math>\{x\}</math>, and <math>S_2</math>, where all numbers in <math>S_1</math> are smaller than <math>x</math> and all numbers in <math>S_2</math> are  larger than <math>x</math>;
** recursively sort <math>S_1</math> and <math>S_2</math>;


The time complexity of this sorting algorithm is measured by the '''number of comparisons'''.
(Hint: Use double counting.)


For the '''deterministic''' quicksort algorithm, the pivot is picked from a fixed position (e.g. the first number in the array). The worst-case time complexity in terms of number of comparisons is <math>\Theta(n^2)</math>.
== Problem 2 ==
(Erdős-Spencer 1974)


We consider the following randomized version of the quicksort.
Let <math>n</math> coins of weights 0 and 1 be given. We are also given a scale with which we may weigh any subset of the coins. Our goal is to determine the weights of coins (that is, to known which coins are 0 and which are 1) with the minimal number of weighings.
* if <math>|S|>1</math> do:
** ''uniformly'' pick a ''random'' <math>x\in S</math> as the pivot;
** partition <math>S</math> into <math>S_1</math>, <math>\{x\}</math>, and <math>S_2</math>, where all numbers in <math>S_1</math> are smaller than <math>x</math> and all numbers in <math>S_2</math> are larger than <math>x</math>;
** recursively sort <math>S_1</math> and <math>S_2</math>;


== Analysis of Random Quicksort==
This problem can be formalized as follows: A collection <math>S_1,S_1,\ldots,S_m\subseteq [n]</math> is called '''determining''' if an arbitrary subset <math>T\subseteq[n]</math> can be uniquely determined by the cardinalities <math>|S_i\cap T|, 1\le i\le m</math>.
Our goal is to analyze the expected number of comparisons during an execution of RandQSort with an arbitrary input <math>S</math>. We achieve this by measuring the chance that each pair of elements are compared, and summing all of them up due to [http://en.wikipedia.org/wiki/Expected_value#Linearity Linearity of Expectation].


Let <math>a_i</math> denote the <math>i</math>th smallest element in <math>S</math>.
* Prove that if there is a determining collection <math>S_1,S_1,\ldots,S_m\subseteq [n]</math>, then there is a way to determine the weights of <math>n</math> coins with <math>m</math> weighings.
Let <math>X_{ij}\in\{0,1\}</math> be the random variable which indicates whether <math>a_i</math> and <math>a_j</math> are compared during the execution of RandQSort. That is:
* Use pigeonhole principle to show that if a collection <math>S_1,S_1,\ldots,S_m\subseteq [n]</math> is determining, then it must hold that <math>m\ge \frac{n}{\log_2(n+1)}</math>.


:<math>
(This gives a lower bound for the number of weighings required to determine the weights of <math>n</math> coins.)
\begin{align}
X_{ij} &=
\begin{cases}
1 & a_i\mbox{ and }a_j\mbox{ are compared}\\
0 & \mbox{otherwise}
\end{cases}.
\end{align}
</math>


Elements <math>a_i</math> and <math>a_j</math> are compared only if one of them is chosen as pivot. After comparison they are separated (thus are never compared again). So we have the following observations:


'''Observation 1:  Every pair of <math>a_i</math> and <math>a_j</math> are compared at most once.'''
== Problem 3 ==


Therefore the sum of <math>X_{ij}</math> for all pair <math>\{i, j\}</math> gives the total number of comparisons. The expected number of comparisons is <math>\mathbf{E}\left[\sum_{i=1}^n\sum_{j>i}X_{ij}\right]</math>. Due to [http://en.wikipedia.org/wiki/Expected_value#Linearity Linearity of Expectation], <math>\mathbf{E}\left[\sum_{i=1}^n\sum_{j>i}X_{ij}\right] = \sum_{i=1}^n\sum_{j>i}\mathbf{E}\left[X_{ij}\right]</math>.
A set of vertices <math>D\subseteq V</math> of graph <math>G(V,E)</math> is a [http://en.wikipedia.org/wiki/Dominating_set ''dominating set''] if for every <math>v\in V</math>, it holds that <math>v\in D</math> or <math>v</math> is adjacent to a vertex in <math>D</math>. The problem of computing minimum dominating set is NP-hard.  
Our next step is to analyze <math>\mathbf{E}\left[X_{ij}\right]</math> for each <math>\{i, j\}</math>.


By the definition of expectation and <math>X_{ij}</math>,
* Prove that for every <math>d</math>-regular graph with <math>n</math> vertices, there exists a dominating set with size at most <math>\frac{n(1+\ln(d+1))}{d+1}</math>.


:<math>\begin{align}
* Try to obtain an upper bound for the size of dominating set using Lovász Local Lemma. Is it better or worse than previous one?
\mathbf{E}\left[X_{ij}\right]
&= 1\cdot \Pr[a_i\mbox{ and }a_j\mbox{ are compared}] + 0\cdot \Pr[a_i\mbox{ and }a_j\mbox{ are not compared}]\\
&= \Pr[a_i\mbox{ and }a_j\mbox{ are compared}].
\end{align}</math>


We are going to bound this probability.
== Problem 4 ==
Let <math>H(W,F)</math> be a graph and <math>n>|W|</math> be an integer. It is known that for some graph <math>G(V,E)</math> such that <math>|V|=n</math>, <math>|E|=m</math>, <math>G</math> does not contain <math>H</math> as a subgraph. Prove that for <math>k>\frac{n^2\ln n}{m}</math>, there is an edge <math>k</math>-coloring for <math>K_n</math> that <math>K_n</math> contains no monochromatic <math>H</math>.


'''Observation 2: <math>a_i</math> and <math>a_j</math> are compared if and only if one of them is chosen as pivot when they are still in the same subset.'''
Remark: Let <math>E=\binom{V}{2}</math> be the edge set of <math>K_n</math>. "An edge <math>k</math>-coloring for <math>K_n</math>" is a mapping <math>f:E\to[k]</math>.


This is easy to verify: just check the algorithm. The next one is a bit complicated.
== Problem 5 ==


'''Observation 3: If <math>a_i</math> and <math>a_j</math> are still in the same subset then all <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math> are in the same subset.'''
Let <math>G(V,E)</math> be a cycle of length <math>k\cdot n</math> and let <math>V=V_1\cup V_2\cup\dots V_n</math> be a partition of its <math>k\cdot n</math> vertices into <math>n</math> pairwise disjoint subsets, each of cardinality <math>k</math>.
For <math>k\ge 11</math>, show that there must be an independent set of <math>G</math> containing precisely one vertex from each <math>V_i</math>.


We can verify this by induction. Initially, <math>S</math> itself has the property described above; and partitioning any <math>S</math> with the property into <math>S_1</math> and <math>S_2</math> will preserve the property for both <math>S_1</math> and <math>S_2</math>. Therefore Claim 3 holds.
Hint: you can use Lovász Local Lemma.
 
Combining Observation 2 and 3, we have:
 
'''Observation 4: <math>a_i</math> and <math>a_j</math> are compared only if one of <math>\{a_i, a_j\}</math> is chosen from <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math>.'''
 
And,
 
'''Observation 5: Every one of <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math> is chosen equal-probably.'''
 
This is because the Random Quicksort chooses the pivot ''uniformly at random''.
 
Observation 4 and 5 together imply:
 
:<math>\begin{align}
\Pr[a_i\mbox{ and }a_j\mbox{ are compared}]
&\le \frac{2}{j-i+1}.
\end{align}</math>
 
{|border="1"
|'''Remark:''' Perhaps you feel confused about the above argument. You may ask: "''The algorithm chooses pivots for many times during the execution. Why in the above argument, it looks like the pivot is chosen only once?''" Good question! Let's see what really happens by looking closely.
 
For any pair <math>a_i</math> and <math>a_j</math>, initially <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math> are all in the same set <math>S</math> (obviously!). During the execution of the algorithm, the set which containing <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math> are shrinking (due to the pivoting), until one of <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math> is chosen, and the set is partitioned into different subsets. We ask for the probability that the chosen one is among <math>\{a_i, a_j\}</math>. So we really care about "the last" pivoting before <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math> is split.
 
Formally, let <math>Y</math> be the random variable denoting the pivot element. We know that for each <math>a_k\in\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math>, <math>Y=a_k</math> with the same probability, and <math>Y\not\in\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math> with an unknown probability (remember that there might be other elements in the same subset with <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math>). The probability we are looking for is actually
<math>\Pr[Y\in \{a_i, a_j\}\mid Y\in\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}]</math>, which is always <math>\frac{2}{j-i+1}</math>, provided that <math>Y</math> is uniform over <math>\{a_i, a_{i+1}, \ldots, a_{j-1}, a_{j}\}</math>.
 
The '''conditional probability''' rules out the ''irrelevant'' events in a probabilistic argument.
|}
 
Summing all up:
 
:<math>\begin{align}
\mathbf{E}\left[\sum_{i=1}^n\sum_{j>i}X_{ij}\right]
&=
\sum_{i=1}^n\sum_{j>i}\mathbf{E}\left[X_{ij}\right]\\
&\le \sum_{i=1}^n\sum_{j>i}\frac{2}{j-i+1}\\
&= \sum_{i=1}^n\sum_{k=2}^{n-i+1}\frac{2}{k} & & (\mbox{Let }k=j-i+1)\\
&\le \sum_{i=1}^n\sum_{k=1}^{n}\frac{2}{k}\\
&= 2n\sum_{k=1}^{n}\frac{1}{k}\\
&= 2n H(n).
\end{align}</math>
 
<math>H(n)</math> is the <math>n</math>th [http://en.wikipedia.org/wiki/Harmonic_number Harmonic number]. It holds that
 
:<math>\begin{align}H(n) = \ln n+O(1)\end{align}</math>.
 
Therefore, for an arbitrary input <math>S</math> of <math>n</math> numbers, the expected number of comparisons taken by RandQSort to sort <math>S</math> is <math>\mathrm{O}(n\log n)</math>.
 
=Random Recurrence=
 
== Karp-Upfal-Wigderson bound==

Revision as of 11:26, 9 April 2014

Problem 1

Prove the following identity:

  • [math]\displaystyle{ \sum_{k=1}^n k{n\choose k}= n2^{n-1} }[/math].

(Hint: Use double counting.)

Problem 2

(Erdős-Spencer 1974)

Let [math]\displaystyle{ n }[/math] coins of weights 0 and 1 be given. We are also given a scale with which we may weigh any subset of the coins. Our goal is to determine the weights of coins (that is, to known which coins are 0 and which are 1) with the minimal number of weighings.

This problem can be formalized as follows: A collection [math]\displaystyle{ S_1,S_1,\ldots,S_m\subseteq [n] }[/math] is called determining if an arbitrary subset [math]\displaystyle{ T\subseteq[n] }[/math] can be uniquely determined by the cardinalities [math]\displaystyle{ |S_i\cap T|, 1\le i\le m }[/math].

  • Prove that if there is a determining collection [math]\displaystyle{ S_1,S_1,\ldots,S_m\subseteq [n] }[/math], then there is a way to determine the weights of [math]\displaystyle{ n }[/math] coins with [math]\displaystyle{ m }[/math] weighings.
  • Use pigeonhole principle to show that if a collection [math]\displaystyle{ S_1,S_1,\ldots,S_m\subseteq [n] }[/math] is determining, then it must hold that [math]\displaystyle{ m\ge \frac{n}{\log_2(n+1)} }[/math].

(This gives a lower bound for the number of weighings required to determine the weights of [math]\displaystyle{ n }[/math] coins.)


Problem 3

A set of vertices [math]\displaystyle{ D\subseteq V }[/math] of graph [math]\displaystyle{ G(V,E) }[/math] is a dominating set if for every [math]\displaystyle{ v\in V }[/math], it holds that [math]\displaystyle{ v\in D }[/math] or [math]\displaystyle{ v }[/math] is adjacent to a vertex in [math]\displaystyle{ D }[/math]. The problem of computing minimum dominating set is NP-hard.

  • Prove that for every [math]\displaystyle{ d }[/math]-regular graph with [math]\displaystyle{ n }[/math] vertices, there exists a dominating set with size at most [math]\displaystyle{ \frac{n(1+\ln(d+1))}{d+1} }[/math].
  • Try to obtain an upper bound for the size of dominating set using Lovász Local Lemma. Is it better or worse than previous one?

Problem 4

Let [math]\displaystyle{ H(W,F) }[/math] be a graph and [math]\displaystyle{ n\gt |W| }[/math] be an integer. It is known that for some graph [math]\displaystyle{ G(V,E) }[/math] such that [math]\displaystyle{ |V|=n }[/math], [math]\displaystyle{ |E|=m }[/math], [math]\displaystyle{ G }[/math] does not contain [math]\displaystyle{ H }[/math] as a subgraph. Prove that for [math]\displaystyle{ k\gt \frac{n^2\ln n}{m} }[/math], there is an edge [math]\displaystyle{ k }[/math]-coloring for [math]\displaystyle{ K_n }[/math] that [math]\displaystyle{ K_n }[/math] contains no monochromatic [math]\displaystyle{ H }[/math].

Remark: Let [math]\displaystyle{ E=\binom{V}{2} }[/math] be the edge set of [math]\displaystyle{ K_n }[/math]. "An edge [math]\displaystyle{ k }[/math]-coloring for [math]\displaystyle{ K_n }[/math]" is a mapping [math]\displaystyle{ f:E\to[k] }[/math].

Problem 5

Let [math]\displaystyle{ G(V,E) }[/math] be a cycle of length [math]\displaystyle{ k\cdot n }[/math] and let [math]\displaystyle{ V=V_1\cup V_2\cup\dots V_n }[/math] be a partition of its [math]\displaystyle{ k\cdot n }[/math] vertices into [math]\displaystyle{ n }[/math] pairwise disjoint subsets, each of cardinality [math]\displaystyle{ k }[/math]. For [math]\displaystyle{ k\ge 11 }[/math], show that there must be an independent set of [math]\displaystyle{ G }[/math] containing precisely one vertex from each [math]\displaystyle{ V_i }[/math].

Hint: you can use Lovász Local Lemma.