# 随机算法 (Spring 2014)/Problem Set 3

## Problem 1

(Due to J. Naor.)

The Chernoff bound is an exponentially decreasing bound on tail distributions. Let ${\displaystyle X_{1},\dots ,X_{n}}$ be independent random variables and ${\displaystyle \mathbf {E} [X_{i}]=0}$ for all ${\displaystyle 1\leq i\leq n}$. Define ${\displaystyle X=X_{1}+X_{2}+\dots +X_{n}}$. We can use the following two kinds of tail inequalities for ${\displaystyle X}$.

• Chernoff Bounds:
${\displaystyle \Pr[|X|\geq \delta ]\leq \min _{t\geq 0}{\frac {\mathbf {E} [e^{t|X|}]}{e^{t\delta }}}}$.

• ${\displaystyle k}$th-Moment Bound:
${\displaystyle \Pr[|X|\geq \delta ]\leq {\frac {\mathbf {E} [|X|^{k}]}{\delta ^{k}}}}$.
1. Show that for each ${\displaystyle \delta }$, there exists a choice of ${\displaystyle k}$ such that the ${\displaystyle k}$th-moment bound is stronger than the Chernoff bound. (Hint: You may use the probabilistic method.)
2. Why would we still prefer the Chernoff bound to the seemingly stronger ${\displaystyle k}$th-moment bound?

## Problem 2

Given a binary string, define a run as a maximal sequence of contiguous 1s; for example, the following string

${\displaystyle \underbrace {111} _{3}00\underbrace {11} _{2}00\underbrace {111111} _{6}0\underbrace {1} _{1}0\underbrace {11} _{2}}$

contains 5 runs, of length 3, 2, 6, 1, and 2.

Let ${\displaystyle S}$ be a binary string of length ${\displaystyle n}$, generated uniformly at random. Let ${\displaystyle X_{k}}$ be the number of runs in ${\displaystyle S}$ of length ${\displaystyle k}$ or more.

• Compute the exact value of ${\displaystyle \mathbb {E} [X_{k}]}$ as a function of ${\displaystyle n}$ and ${\displaystyle k}$.
• Give the best concentration bound you can for ${\displaystyle |X_{k}-\mathbb {E} [X_{k}]|}$.

## Problem 3

The maximum directed cut problem (MAX-DICUT).

We are given as input a directed graph ${\displaystyle G=(V,E)}$, with each directed edge ${\displaystyle (u,v)\in E}$ having a nonnegative weight ${\displaystyle w_{uv}\geq 0}$. The goal is to partition ${\displaystyle V}$ into two sets ${\displaystyle S\,}$ and ${\displaystyle {\bar {S}}=V\setminus S}$ so as to maximize the value of ${\displaystyle \sum _{(u,v)\in E \atop u\in S,v\not \in S}w_{uv}}$, that is, the total weight of the edges going from ${\displaystyle S\,}$ to ${\displaystyle {\bar {S}}}$.

• Give a randomized ${\displaystyle {\frac {1}{4}}}$-approximation algorithm based on random sampling.
• Prove that the following is an integer programming for the problem:
{\displaystyle {\begin{aligned}{\text{maximize}}&&\sum _{(i,j)\in E}w_{ij}z_{ij}\\{\text{subject to}}&&z_{ij}&\leq x_{i},&\forall (i,j)&\in E,\\&&z_{ij}&\leq 1-x_{j},&\forall (i,j)&\in E,\\&&x_{i}&\in \{0,1\},&\forall i&\in V,\\&&0\leq z_{ij}&\leq 1,&\forall (i,j)&\in E.\end{aligned}}}
• Consider a randomized rounding algorithm that solves an LP relaxation of the above integer programming and puts vertex ${\displaystyle i}$ in ${\displaystyle S}$ with probability ${\displaystyle f(x_{i}^{*})}$. We may assume that ${\displaystyle f(x)}$ is a linear function in the form ${\displaystyle f(x)=ax+b}$ with some constant ${\displaystyle a}$ and ${\displaystyle b}$ to be fixed. Try to find good ${\displaystyle a}$ and ${\displaystyle b}$ so that the randomized rounding algorithm has a good approximation ratio.

## Problem 4

The set cover problem is defined as follows:

• Let ${\displaystyle U=\{u_{1},u_{2},\ldots ,u_{n}\}}$ be a set of ${\displaystyle n}$ elements, and let ${\displaystyle {\mathcal {S}}=\{S_{1},S_{2},\ldots ,S_{m}\}}$ be a family of subsets of ${\displaystyle U}$. For each ${\displaystyle u_{i}\in U}$, let ${\displaystyle w_{i}}$ be a nonnegative weight of ${\displaystyle u_{i}}$. The goal is to find a subset ${\displaystyle V\subseteq U}$ with the minimum total weight ${\displaystyle \sum _{i\in V}w_{i}}$, that intersects with all ${\displaystyle S_{i}\in {\mathcal {S}}}$.

This problem is NP-hard.

(Remark: There are two equivalent definitions of the set cover problem. We take the hitting set version.)

Questions:

• Prove that the following is an integer programming for the problem:
{\displaystyle {\begin{aligned}{\text{minimize}}&&\sum _{(i,j)\in E}w_{i}x_{i}\\{\text{subject to}}&&\sum _{i:u_{i}\in S_{j}}x_{i}&\geq 1,&1\leq j\leq m,\\&&x_{i}&\in \{0,1\},&1\leq i\leq n.\end{aligned}}}
• Give a randomized rounding algorithm which returns an ${\displaystyle O(\log m)}$-approximate solution with probability at least ${\displaystyle {\frac {1}{2}}}$. (Hint: you may repeat the randomized rounding process if there remains some uncovered subsets after one time of applying the randomized rounding.)