# Advanced Algorithms (Fall 2018)/Problem Set 5

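The solution to every problem must include the complete solution process, analysis, and proof. You may write in either Chinese or English.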


## Problem 1

In the facility location problem, some clients may be far away from all facilities. If we require every client to be connected to at least one open facility, a large portion of the total cost may be due to connecting these remote clients. This observation motivates the following variant of the facility location problem: with each client $j\in C$ we associate a penalty $p_{j}>0$, and we must pay $p_{j}$ if client $j$ is not connected to any open facility. The objective is to minimize the sum of the connection costs, facility opening costs, and penalties.
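To make the objective concrete, here is a minimal sketch that evaluates the total cost of one candidate solution. The function name `penalized_cost`, the dictionary-based data layout, and the toy numbers are all assumptions made for illustration; they are not part of the problem statement.

```python
def penalized_cost(open_facilities, assignment, opening_cost, connection_cost, penalty):
    """Total cost of a candidate solution (hypothetical helper).

    assignment maps each client to an open facility, or to None when the
    client is left unconnected and pays its penalty instead.
    """
    total = sum(opening_cost[i] for i in open_facilities)
    for client, facility in assignment.items():
        if facility is None:
            total += penalty[client]  # unconnected client pays p_j
        else:
            if facility not in open_facilities:
                raise ValueError(f"client {client} uses closed facility {facility}")
            total += connection_cost[facility][client]
    return total


# Toy instance with made-up numbers: client "c" is remote.
opening_cost = {"f1": 4, "f2": 3}
connection_cost = {
    "f1": {"a": 1, "b": 2, "c": 10},
    "f2": {"a": 5, "b": 1, "c": 9},
}
penalty = {"a": 6, "b": 6, "c": 2}

connect_all = penalized_cost(
    {"f1"}, {"a": "f1", "b": "f1", "c": "f1"}, opening_cost, connection_cost, penalty
)
skip_remote = penalized_cost(
    {"f1"}, {"a": "f1", "b": "f1", "c": None}, opening_cost, connection_cost, penalty
)
print(connect_all, skip_remote)  # 17 9
```

In this toy instance, paying the penalty for the remote client is cheaper than connecting it, which is exactly the situation the variant is meant to capture.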

(a) Give the primal integer linear program for this variant of the facility location problem, its LP-relaxation and the dual LP.

(b) Modify the 3-approximation algorithm for the facility location problem to obtain an algorithm for this variant. Describe your modified algorithm.

(c) Prove that your modified algorithm is a 3-approximation algorithm for this variant.

## Problem 2

Assume we have a set $V$ containing $n$ different images. We are also given two multisets $E^{+}$ and $E^{-}$, each of which is a multiset of pairs of images. Each element $(i,j)$ in $E^{+}$ means that some user has marked image $i$ and image $j$ as similar, while each element $(i,j)$ in $E^{-}$ means that some user has marked image $i$ and image $j$ as dissimilar. Note that these ratings were generated by different users and may be inconsistent; that is, some image pair $(i,j)$ may appear in both $E^{+}$ and $E^{-}$. We call the elements of $E^{+}$ $+$ edges and the elements of $E^{-}$ $-$ edges. We wish to partition the images into clusters $S_{1},S_{2},\cdots$ so as to maximize: (number of $+$ edges that lie within clusters) $+$ (number of $-$ edges that lie between clusters).
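As an illustration of the objective only, the following sketch counts the edges that a given clustering agrees with. The name `clustering_objective` and the representation of a clustering as a dictionary from image to cluster label are assumptions chosen for this example.

```python
def clustering_objective(cluster_of, plus_edges, minus_edges):
    """Number of + edges inside clusters plus number of - edges between clusters.

    plus_edges and minus_edges are lists, so a repeated pair counts once
    per occurrence, matching the multiset semantics of E+ and E-.
    """
    within = sum(1 for i, j in plus_edges if cluster_of[i] == cluster_of[j])
    between = sum(1 for i, j in minus_edges if cluster_of[i] != cluster_of[j])
    return within + between


# Made-up ratings: the pair (0, 1) is rated similar twice and dissimilar once.
plus_edges = [(0, 1), (0, 1), (1, 2)]
minus_edges = [(0, 1), (0, 2)]

print(clustering_objective({0: "A", 1: "A", 2: "B"}, plus_edges, minus_edges))  # 3
print(clustering_objective({0: "A", 1: "B", 2: "C"}, plus_edges, minus_edges))  # 2
```

Because the ratings for the pair (0, 1) are inconsistent, no clustering of this toy instance can agree with all five edges.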

(a) Argue that the following SDP gives an upper bound on the above objective, where $w_{(i,j)}^{+}$ and $w_{(i,j)}^{-}$ denote the number of times the image pair $(i,j)$ appears in $E^{+}$ and $E^{-}$, respectively.

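$$
\begin{aligned}
\text{maximize}\quad & \sum_{i\in V,\,j\in V,\,i<j}\left(w_{(i,j)}^{+}\langle v_{i},v_{j}\rangle+w_{(i,j)}^{-}\bigl(1-\langle v_{i},v_{j}\rangle\bigr)\right)\\
\text{subject to}\quad & \langle v_{i},v_{i}\rangle=1, && \forall i\in V\\
& \langle v_{i},v_{j}\rangle\geq 0, && \forall i,j\in V\\
& v_{i}\in\mathbb{R}^{n}, && \forall i\in V
\end{aligned}
$$

(b) Devise a randomized algorithm that partitions the images into 4 clusters and in expectation achieves an objective value of at least 0.75 times the optimal SDP value. Hint: use Goemans-Williamson style rounding, but with two random hyperplanes instead of one.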

## Problem 3

A $k$-uniform hypergraph is an ordered pair $G=(V,E)$, where $V$ denotes the set of vertices and $E$ denotes the set of edges. Each edge in $E$ now contains $k$ distinct vertices instead of 2. (So a 2-uniform hypergraph is just what we normally call a graph.) A hypergraph is $k$-regular if all vertices have degree $k$; that is, each vertex is contained in exactly $k$ hypergraph edges.

Show that for sufficiently large $k$ , the vertices of a $k$ -uniform, $k$ -regular hypergraph can be 2-colored so that no edge is monochromatic. What’s the smallest value of $k$ you can achieve?
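To illustrate the definitions on a small case, the sketch below brute-forces all 2-colorings of the Fano plane, a standard example of a 3-uniform, 3-regular hypergraph. The helper names and the particular labeling of the seven lines are choices made for this example; the check shows why the statement can only hold for sufficiently large $k$.

```python
from itertools import product


def monochromatic_edges(edges, color):
    """Return the edges whose vertices all received the same color."""
    return [e for e in edges if len({color[v] for v in e}) == 1]


def has_proper_2_coloring(vertices, edges):
    """Brute force over all 2^|V| colorings; only feasible for tiny hypergraphs."""
    for bits in product((0, 1), repeat=len(vertices)):
        color = dict(zip(vertices, bits))
        if not monochromatic_edges(edges, color):
            return True
    return False


# Fano plane: 7 vertices, 7 edges, 3 vertices per edge, 3 edges per vertex.
fano_vertices = [1, 2, 3, 4, 5, 6, 7]
fano_edges = [
    (1, 2, 3), (1, 4, 5), (1, 6, 7),
    (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6),
]

print(has_proper_2_coloring(fano_vertices, fano_edges))  # False
```

Every one of the 128 colorings leaves at least one line of the Fano plane monochromatic, so $k=3$ is too small for the claim.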

## Programming Assignment

Finally, you can get your hands dirty and do some coding! You do not need to hand in solutions or source-code or anything for this programming assignment, but you should find some time to actually do this, so as to better understand the algorithmic Lovász Local Lemma.

More specifically, you will implement Moser's algorithm, introduced in class, for the following scenario. Consider a 9-SAT formula in which each variable appears in 8 clauses. Set up a formula with 112,500 variables and 100,000 clauses in the following manner: create 8 copies of each of the 112,500 variables (900,000 variable copies in total), permute them randomly, and use the resulting order to assign the copies to the 100,000 clauses. (If a variable appears in a clause multiple times, try to correct this locally by swapping one copy into another clause.) Then assign a random "sign" to each variable: with probability 1/2, use ${\overline {x}}$ instead of $x$. This gives a formula that satisfies the condition $d\leq 2^{k-3}-1$. (Try to verify this!)
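The construction can be sketched as follows. This is one possible reading of the procedure, not a reference implementation: the names `build_formula` and `max_dependency_degree`, the literal encoding `(variable, negated)`, the specific swap rule, and the choice to draw a sign independently for each of the 900,000 copies are all assumptions.

```python
import random


def build_formula(n_vars=112_500, copies=8, k=9, rng=None):
    """Return a list of clauses; each clause is a list of (variable, negated) pairs."""
    rng = rng or random.Random()
    slots = [v for v in range(n_vars) for _ in range(copies)]
    assert len(slots) % k == 0, "copies must divide evenly into clauses"
    rng.shuffle(slots)
    n_clauses = len(slots) // k
    clauses = [slots[c * k:(c + 1) * k] for c in range(n_clauses)]

    # Local correction (assumed rule): while a clause holds a repeated variable,
    # swap one copy with a slot of a random other clause, but only if the swap
    # creates no repeat in either clause. This loop assumes such a swap exists.
    for c, clause in enumerate(clauses):
        for p in range(k):
            while clause.count(clause[p]) > 1:
                c2, p2 = rng.randrange(n_clauses), rng.randrange(k)
                other = clauses[c2]
                if c2 != c and other[p2] not in clause and clause[p] not in other:
                    clause[p], other[p2] = other[p2], clause[p]

    # Assumption: an independent random sign for every variable copy.
    return [[(v, rng.random() < 0.5) for v in clause] for clause in clauses]


def max_dependency_degree(formula):
    """Largest number of other clauses that share a variable with some clause."""
    clauses_of = {}
    for c, clause in enumerate(formula):
        for v, _ in clause:
            clauses_of.setdefault(v, set()).add(c)
    return max(
        len(set().union(*(clauses_of[v] for v, _ in clause)) - {c})
        for c, clause in enumerate(formula)
    )


# Small instance for a quick check; the assignment uses the default sizes.
small = build_formula(n_vars=45, copies=8, k=9, rng=random.Random(1))
print(len(small), max_dependency_degree(small))
```

With $k=9$ the bound $2^{k-3}-1$ equals 63, and a clause with 9 distinct variables, each appearing in 7 other clauses, shares a variable with at most $9\cdot 7=63$ other clauses, which is what `max_dependency_degree` lets you confirm empirically.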

For each execution of your implementation, you should track how many times the local correction procedure (i.e., ${\texttt {Fix}}$ ) is required before termination. Repeat this experiment with 100 (or more) different formulas derived from the process above, and observe the distribution of the number of local corrections required. Note that you may want to take some care to make the local correction step efficient in order to have your program run effectively.
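A minimal sketch of the resampling procedure with a counter is given below, assuming the recursive formulation of ${\texttt {Fix}}$ in which a violated clause is resampled and then every violated clause sharing a variable with it is fixed in turn. The version covered in class may differ in details. The function names, the `(variable, negated)` literal encoding, and the explicit stack (used to avoid Python's recursion limit) are assumptions of this example.

```python
import random


def violated(clause, assignment):
    """A clause is violated when every literal is false.

    The literal (v, negated) is true exactly when assignment[v] != negated.
    """
    return all(assignment[v] == negated for v, negated in clause)


def moser_solve(formula, n_vars, rng=None):
    """Return (satisfying assignment, number of Fix calls)."""
    rng = rng or random.Random()
    assignment = [rng.random() < 0.5 for _ in range(n_vars)]

    # Index from variable to the clauses containing it, so that only clauses
    # overlapping the resampled one are re-checked.
    clauses_of = [[] for _ in range(n_vars)]
    for c, clause in enumerate(formula):
        for v, _ in clause:
            clauses_of[v].append(c)

    fixes = 0

    def resample(c):
        nonlocal fixes
        fixes += 1
        for v, _ in formula[c]:
            assignment[v] = rng.random() < 0.5

    def first_violated_neighbour(c):
        # The neighbourhood includes c itself, since c contains its own variables.
        for v, _ in formula[c]:
            for d in clauses_of[v]:
                if violated(formula[d], assignment):
                    return d
        return None

    for c in range(len(formula)):
        if not violated(formula[c], assignment):
            continue
        resample(c)
        stack = [c]  # explicit stack standing in for the recursive Fix calls
        while stack:
            bad = first_violated_neighbour(stack[-1])
            if bad is None:
                stack.pop()  # this Fix call returns
            else:
                resample(bad)  # nested Fix call
                stack.append(bad)
    return assignment, fixes


# Tiny made-up 9-SAT instance, only to exercise the code.
demo_rng = random.Random(0)
demo = [[(v, demo_rng.random() < 0.5) for v in demo_rng.sample(range(45), 9)]
        for _ in range(40)]
solution, fix_count = moser_solve(demo, 45, random.Random(0))
print(fix_count, any(violated(cl, solution) for cl in demo))
```

Running `moser_solve` on many independently generated formulas and collecting the returned counts gives the distribution the assignment asks you to observe. The variable-to-clause index is one way to keep each correction step local rather than rescanning the entire formula.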