高级算法 (Fall 2021)/Problem Set 3: Difference between revisions
imported>TCSseminar No edit summary |
imported>TCSseminar No edit summary |
||
Line 54: | Line 54: | ||
* Apply the randomized rounding such that for every <math>v\in V</math>, <math>\hat{x}_v=1</math> independently with probability <math>x_v^*</math>. Analyze the approximation ratio (between the expected size of the random cut and OPT). | * Apply the randomized rounding such that for every <math>v\in V</math>, <math>\hat{x}_v=1</math> independently with probability <math>x_v^*</math>. Analyze the approximation ratio (between the expected size of the random cut and OPT). | ||
* Apply another randomized rounding such that for every <math>v\in V</math>, <math>\hat{x}_v=1</math> independently with probability <math>1/4+x_v^*/2</math>. Analyze the approximation ratio for this algorithm. | * Apply another randomized rounding such that for every <math>v\in V</math>, <math>\hat{x}_v=1</math> independently with probability <math>1/4+x_v^*/2</math>. Analyze the approximation ratio for this algorithm. | ||
== Problem 4 == |
Revision as of 08:30, 22 November 2021
- 每道题目的解答都要有完整的解题过程。中英文不限。
Problem 1
In this problem, we will demonstrate a common algorithmic application of the Johnson-Lindenstrauss Theorem (JLT). Specifically, we will consider the [math]\displaystyle{ k }[/math]-means clustering problem, which is widely used in areas like machine learning and data mining.
In the [math]\displaystyle{ k }[/math]-means clustering problem, we are given [math]\displaystyle{ n }[/math] points [math]\displaystyle{ \mathbf{x}_1,\cdots,\mathbf x_n\in\mathbb{R}^d }[/math], and the goal is to partition these points into [math]\displaystyle{ k }[/math] disjoint sets (or, cluster) [math]\displaystyle{ C_1,\cdots,C_k }[/math] so as to minimize the cost:
- [math]\displaystyle{ \texttt{cost}(\mathbf{x}_1,\cdots,\mathbf{x}_n,C_1,\cdots,C_k)=\sum_{i=1}^{k}\sum_{j\in C_i}\lVert\mathbf{x}_j-\mathbf{\mu}_i\rVert^2_2 }[/math]
Here, [math]\displaystyle{ \mathbf{\mu}_i=(1/|C_i|)\cdot\sum_{j\in C_i}\mathbf{x}_j }[/math] is the mean of the points in cluster [math]\displaystyle{ C_i }[/math]. In other words, we want to find [math]\displaystyle{ k }[/math] clusters that have as little internal variance as possible.
(a) Prove the following equality:
- [math]\displaystyle{ \texttt{cost}(\mathbf{x}_1,\cdots,\mathbf{x}_n,C_1,\cdots,C_k)=\sum_{i=1}^{k}\frac{1}{|C_i|}\sum_{j,l\in C_i,j\lt l}\lVert\mathbf{x}_j-\mathbf{x}_l\rVert^2_2 }[/math]
(b) Suppose we use the Johnson-Lindenstrauss method taught in class to embed [math]\displaystyle{ \mathbf{x}_1,\cdots,\mathbf{x}_n }[/math] into [math]\displaystyle{ O(\log{n}/\epsilon^2) }[/math]-dimensional points [math]\displaystyle{ \mathbf{\tilde{x}}_1,\cdots,\mathbf{\tilde{x}}_n }[/math]. Show that with high probability in [math]\displaystyle{ n }[/math], the following inequality holds for all clusters:
- [math]\displaystyle{ (1-\epsilon)\texttt{cost}(\mathbf{x}_1,\cdots,\mathbf{x}_n,C_1,\cdots,C_k) \leq\texttt{cost}(\mathbf{\tilde{x}}_1,\cdots,\mathbf{\tilde{x}}_n,C_1,\cdots,C_k) \leq(1+\epsilon)\texttt{cost}(\mathbf{x}_1,\cdots,\mathbf{x}_n,C_1,\cdots,C_k) }[/math]
(c) Suppose we know a set of clusters (i.e., a partition of the [math]\displaystyle{ n }[/math] points) [math]\displaystyle{ \tilde{C}_1,\cdots,\tilde{C}_k }[/math] such that:
- [math]\displaystyle{ \texttt{cost}(\mathbf{\tilde{x}}_1,\cdots,\mathbf{\tilde{x}}_n,\tilde{C}_1,\cdots,\tilde{C}_k)\leq\gamma\cdot\texttt{cost}(\mathbf{\tilde{x}}_1,\cdots,\mathbf{\tilde{x}}_n,\tilde{C}^*_1,\cdots,\tilde{C}^*_k) }[/math]
where [math]\displaystyle{ \tilde{C}^*_1,\cdots,\tilde{C}^*_k }[/math] is the optimal partition for points [math]\displaystyle{ \mathbf{\tilde{x}}_1,\cdots,\mathbf{\tilde{x}}_n }[/math]. That is, assume we know a [math]\displaystyle{ \gamma }[/math] approximation to the optimal clustering for the dimensionality reduced points. Show that, when [math]\displaystyle{ \epsilon\leq 1/2 }[/math],
- [math]\displaystyle{ \texttt{cost}(\mathbf{x}_1,\cdots,\mathbf{x}_n,\tilde{C}_1,\cdots,\tilde{C}_k)\leq(1+O(\epsilon))\cdot\gamma\cdot\texttt{cost}(\mathbf{x}_1,\cdots,\mathbf{x}_n,C^*_1,\cdots,C^*_k) }[/math]
where [math]\displaystyle{ C^*_1,\cdots,C^*_k }[/math] is the optimal partition for [math]\displaystyle{ \mathbf{x}_1,\cdots,\mathbf{x}_n }[/math].
In other words, we can compute an approximately optimal clustering for high dimensional original data using just the dimensionality reduced data. In many cases, this approach will speed up algorithms for solving [math]\displaystyle{ k }[/math]-means clustering.
Problem 2
The following is the weighted version of set cover problem:
Given [math]\displaystyle{ m }[/math] subsets [math]\displaystyle{ S_1,S_2,\ldots,S_m\subseteq U }[/math], where [math]\displaystyle{ U }[/math] is a universe of size [math]\displaystyle{ n=|U| }[/math], and each subset [math]\displaystyle{ S_i }[/math] is assigned a positive weight [math]\displaystyle{ w_i\gt 0 }[/math], the goal is to find a [math]\displaystyle{ C\subseteq\{1,2,\ldots,m\} }[/math] such that [math]\displaystyle{ U=\bigcup_{i\in C}S_i }[/math] and the total weight [math]\displaystyle{ \sum_{I\in C}w_i }[/math] is minimized.
- Give an integer program for the problem and its LP relaxation.
- Consider the following idea of randomized rounding: independently round each fractional value to [math]\displaystyle{ \{0,1\} }[/math] with the probability of the fractional value itself; and repeatedly apply this process to the variables rounded to 0 in previous iterations until [math]\displaystyle{ U }[/math] is fully covered. Show that this can return a set cover with [math]\displaystyle{ O(\log n) }[/math] approximation ratio with probability at least [math]\displaystyle{ 0.99 }[/math].
Problem 3
In the maximum directed cut (MAX-DICUT) problem, we are given as input a directed graph [math]\displaystyle{ G(V,E) }[/math]. The goal is to partition [math]\displaystyle{ V }[/math] into disjoint [math]\displaystyle{ S }[/math] and [math]\displaystyle{ T }[/math] so that the number of edges in [math]\displaystyle{ E(S,T)=\{(u,v)\in E\mid u\in S, v\in T\} }[/math] is maximized. The following is the integer program for MAX-DICUT:
- [math]\displaystyle{ \begin{align} \text{maximize} &&& \sum_{(u,v)\in E}y_{u,v}\\ \text{subject to} && y_{u,v} &\le x_u, & \forall (u,v)&\in E,\\ && y_{u,v} &\le 1-x_v, & \forall (u,v)&\in E,\\ && x_v &\in\{0,1\}, & \forall v&\in V,\\ && y_{u,v} &\in\{0,1\}, & \forall (u,v)&\in E. \end{align} }[/math]
Let [math]\displaystyle{ x_v^*,y_{u,v}^* }[/math] denote the optimal solution to the LP-relaxation of the above integer program.
- Apply the randomized rounding such that for every [math]\displaystyle{ v\in V }[/math], [math]\displaystyle{ \hat{x}_v=1 }[/math] independently with probability [math]\displaystyle{ x_v^* }[/math]. Analyze the approximation ratio (between the expected size of the random cut and OPT).
- Apply another randomized rounding such that for every [math]\displaystyle{ v\in V }[/math], [math]\displaystyle{ \hat{x}_v=1 }[/math] independently with probability [math]\displaystyle{ 1/4+x_v^*/2 }[/math]. Analyze the approximation ratio for this algorithm.