# 随机算法 (Spring 2013)/Problem Set 3

## Contents

## Problem 1

Given a binary string, define a **run** as a maximal sequence of contiguous 1s; for example, the following string

contains 5 runs, of length 3, 2, 6, 1, and 2.

Let be a binary string of length , generated uniformly at random. Let be the number of runs in of length or more.

- Compute the exact value of as a function of and .
- Give the best concentration bound you can for .

## Problem 2

A **boolean code** is a mapping . Each is called a **message** and is called a **codeword**. The **code rate** of a code is . A boolean code is a **linear code** if it is a linear transformation, i.e. there is a matrix such that for any , where the additions and multiplications are defined over the finite field of order two, .

The **distance** between two codeword and , denoted by , is defined as the Hamming distance between them. Formally, . The distance of a code is the minimum distance between any two codewords. Formally, .

Usually we want to make both the code rate and the code distance as large as possible, because a larger rate means that the amount of actual message per transmitted bit is high, and a larger distance allows for more error correction and detection.

- Use the probabilistic method to prove that there exists a boolean code of code rate and distance . Try to optimize the constant in .
- Prove a similar result for linear boolean codes.

## Problem 3

- The maximum directed cut problem (MAX-DICUT).

We are given as input a directed graph , with each directed edge having a nonnegative weight . The goal is to partition into two sets and so as to maximize the value of , that is, the total weight of the edges going from to .

- Give a randomized -approximation algorithm based on random sampling.
- Prove that the following is an integer programming for the problem:

- Consider a randomized rounding algorithm that solves an LP relaxation of the above integer programming and puts vertex in with probability . We may assume that is a linear function in the form with some constant and to be fixed. Try to find good and so that the randomized rounding algorithm has a good approximation ratio.

## Problem 4

The set cover problem is defined as follows:

- Let be a set of elements, and let be a family of subsets of . For each , let be a nonnegative weight of . The goal is to find a subset with the minimum total weight , that intersects with all .

This problem is **NP-hard**.

(**Remark**: There are two equivalent definitions of the set cover problem. We take the **hitting set** version.)

Questions:

- Prove that the following is an integer programming for the problem:

- Give a randomized rounding algorithm which returns an -approximate solution with probability at least . (Hint: you may repeat the randomized rounding process if there remains some uncovered subsets after one time of applying the randomized rounding.)