# 随机算法 (Spring 2014)/Problem Set 3

## Problem 1

(Due to J. Naor.)

The Chernoff bound is an exponentially decreasing bound on tail distributions. Let $\displaystyle{ X_1,\dots,X_n }$ be independent random variables and $\displaystyle{ \mathbf{E}[X_i]=0 }$ for all $\displaystyle{ 1\le i\le n }$. Define $\displaystyle{ X=X_1+X_2+\dots+X_n }$. We can use the following two kinds of tail inequalities for $\displaystyle{ X }$.

• Chernoff Bounds:
$\displaystyle{ \Pr[|X|\ge\delta]\le\min_{t\ge 0}\frac{\mathbf{E}[e^{t|X|}]}{e^{t\delta}} }$.

• $\displaystyle{ k }$th-Moment Bound:
$\displaystyle{ \Pr[|X|\ge\delta]\le\frac{\mathbf{E}[|X|^k]}{\delta^k} }$.
1. Show that for each $\displaystyle{ \delta }$, there exists a choice of $\displaystyle{ k }$ such that the $\displaystyle{ k }$th-moment bound is stronger than the Chernoff bound. (Hint: You may use the probabilistic method.)
2. Why would we still prefer the Chernoff bound to the seemingly stronger $\displaystyle{ k }$th-moment bound?

## Problem 2

Given a binary string, define a run as a maximal sequence of contiguous 1s; for example, the following string

$\displaystyle{ \underbrace{111}_{3}00\underbrace{11}_{2}00\underbrace{111111}_{6}0\underbrace{1}_{1}0\underbrace{11}_{2} }$

contains 5 runs, of length 3, 2, 6, 1, and 2.

Let $\displaystyle{ S }$ be a binary string of length $\displaystyle{ n }$, generated uniformly at random. Let $\displaystyle{ X_k }$ be the number of runs in $\displaystyle{ S }$ of length $\displaystyle{ k }$ or more.

• Compute the exact value of $\displaystyle{ \mathbb{E}[X_k] }$ as a function of $\displaystyle{ n }$ and $\displaystyle{ k }$.
• Give the best concentration bound you can for $\displaystyle{ |X_k -\mathbb{E}[X_k]| }$.

## Problem 3

The maximum directed cut problem (MAX-DICUT).

We are given as input a directed graph $\displaystyle{ G=(V,E) }$, with each directed edge $\displaystyle{ (u,v)\in E }$ having a nonnegative weight $\displaystyle{ w_{uv}\ge 0 }$. The goal is to partition $\displaystyle{ V }$ into two sets $\displaystyle{ S\, }$ and $\displaystyle{ \bar{S}=V\setminus S }$ so as to maximize the value of $\displaystyle{ \sum_{(u,v)\in E\atop u\in S,v\not\in S}w_{uv} }$, that is, the total weight of the edges going from $\displaystyle{ S\, }$ to $\displaystyle{ \bar{S} }$.

• Give a randomized $\displaystyle{ \frac{1}{4} }$-approximation algorithm based on random sampling.
• Prove that the following is an integer programming for the problem:
\displaystyle{ \begin{align} \text{maximize} && \sum_{(i,j)\in E}w_{ij}z_{ij}\\ \text{subject to} && z_{ij} &\le x_i, & \forall (i,j)&\in E,\\ && z_{ij} &\le 1-x_j, & \forall (i,j)&\in E,\\ && x_i &\in\{0,1\}, & \forall i&\in V,\\ && 0 \le z_{ij}&\le 1, & \forall (i,j)&\in E. \end{align} }
• Consider a randomized rounding algorithm that solves an LP relaxation of the above integer programming and puts vertex $\displaystyle{ i }$ in $\displaystyle{ S }$ with probability $\displaystyle{ f(x_i^*) }$. We may assume that $\displaystyle{ f(x) }$ is a linear function in the form $\displaystyle{ f(x)=ax+b }$ with some constant $\displaystyle{ a }$ and $\displaystyle{ b }$ to be fixed. Try to find good $\displaystyle{ a }$ and $\displaystyle{ b }$ so that the randomized rounding algorithm has a good approximation ratio.

## Problem 4

The set cover problem is defined as follows:

• Let $\displaystyle{ U=\{u_1,u_2,\ldots,u_n\} }$ be a set of $\displaystyle{ n }$ elements, and let $\displaystyle{ \mathcal{S}=\{S_1,S_2,\ldots,S_m\} }$ be a family of subsets of $\displaystyle{ U }$. For each $\displaystyle{ u_i\in U }$, let $\displaystyle{ w_i }$ be a nonnegative weight of $\displaystyle{ u_i }$. The goal is to find a subset $\displaystyle{ V\subseteq U }$ with the minimum total weight $\displaystyle{ \sum_{i\in V}w_i }$, that intersects with all $\displaystyle{ S_i\in\mathcal{S} }$.

This problem is NP-hard.

(Remark: There are two equivalent definitions of the set cover problem. We take the hitting set version.)

Questions:

• Prove that the following is an integer programming for the problem:
\displaystyle{ \begin{align} \text{minimize} && \sum_{(i,j)\in E}w_{i}x_{i}\\ \text{subject to} && \sum_{i:u_i\in S_j}x_i &\ge 1, &1\le j\le m,\\ && x_i &\in\{0,1\}, & 1\le i\le n. \end{align} }
• Give a randomized rounding algorithm which returns an $\displaystyle{ O(\log m) }$-approximate solution with probability at least $\displaystyle{ \frac{1}{2} }$. (Hint: you may repeat the randomized rounding process if there remains some uncovered subsets after one time of applying the randomized rounding.)