Randomized Algorithms (Spring 2010)/Randomized approximation algorithms

Approximation Algorithms

An optimization problem consists of:

• A set of valid instances (inputs).
• Each instance ${\displaystyle I}$ defines a set of feasible solutions ${\displaystyle {\mathcal {F}}_{I}}$.
• Each instance ${\displaystyle I}$ also defines an objective function ${\displaystyle \mathrm {obj} _{I}:{\mathcal {F}}_{I}\rightarrow \mathbb {R} }$.
• The problem is specified to be either a minimization problem or a maximization problem. Given an instance ${\displaystyle I}$, our goal is to find a feasible solution ${\displaystyle x\in {\mathcal {F}}_{I}}$with the minimum (or maximum) value of ${\displaystyle \mathrm {obj} _{I}(x)}$.
 Example: the ${\displaystyle s}$-${\displaystyle t}$ shortest path problem Each instance is a (nonnegatively)weighted directed graph ${\displaystyle G(V,E,w)}$ along with two distinct vertices ${\displaystyle s,t\in V}$. Given an instance, the feasible solutions are the simple paths from ${\displaystyle s}$ to ${\displaystyle t}$ in ${\displaystyle G}$. The objective function is the length of the path, defined by the total weight of the path. The problem is a minimization problem.

An optimization problem can be transformed to a decision problem by "thresholding". For example, the optimization problem: "Given an instance ${\displaystyle I}$, find the ${\displaystyle x\in {\mathcal {F}}_{I}}$ with minimum ${\displaystyle \mathrm {obj} _{I}(x)}$." can be transformed to a (parameterized) decision problem: "Given an instance ${\displaystyle (I,T)}$, determine whether there exists an ${\displaystyle x\in {\mathcal {F}}_{I}}$ with ${\displaystyle \mathrm {obj} _{I}(x)."

Given an optimization problem, we can certainly solve its decision version by solving the optimization problem itself. The converse is also true: we can find the optimum by making queries to the decision version. Through binary search, this can be done efficiently.

Coping with the NP-hardness

An optimization problem is an NP-hard optimization problem if its decision version is NP-hard. Many interesting optimization problems are NP-hard. There are basically two ways to deal with the hardness.

Heuristics
We just apply some heuristic ideas to search for the optimal solution. This might work pretty good on some natural instances. On the other hand, giving a theoretical explanation why the heuristic method performs well on your input instance is usually very hard.
Approximation algorithms
We design an algorithm with guaranteed worst-case running time, which approximates the optimum within a guaranteed approximation ratio.

Let ${\displaystyle \mathrm {OPT} _{I}}$ denote the optimum of the instance ${\displaystyle I}$.

For a minimization problem, for any instance ${\displaystyle I}$, the algorithm returns a feasible solution ${\displaystyle x\in {\mathcal {F}}_{I}}$, such that ${\displaystyle \mathrm {obj} _{I}(x)\leq \alpha \cdot \mathrm {OPT} _{I}}$, where ${\displaystyle \alpha >1}$.

For a maximization problem, for any instance ${\displaystyle I}$, the algorithm returns a feasible solution ${\displaystyle x\in {\mathcal {F}}_{I}}$, such that ${\displaystyle \mathrm {obj} _{I}(x)\geq \alpha \cdot \mathrm {OPT} _{I}}$, where ${\displaystyle 0<\alpha <1}$.

Such an algorithm is called an ${\displaystyle \alpha }$-approximation algorithm for the optimization problem. The ratio ${\displaystyle \alpha }$ is called the approximation ratio.

We are particularly interested in the approximation algorithms running in poly-time of the input size, for the NP-hard optimization problems.

Combinatorial approximation algorithms

Classic approximation algorithms are mostly combinatorial algorithms, which uses standard algorithm design techniques (e.g., greedy, dynamic programming, etc.). In addition to the time complexity, we have to bound the approximation ratio.

We will introduce two examples, the minimum makespan scheduling, and the knapsack problem.

Scheduling

Scheduling is a class of problems. We consider a central problem in scheduling theory: the minimum makespan scheduling.

 Problem (Minimum makespan scheduling) Given processing times for ${\displaystyle n}$ jobs, ${\displaystyle p_{1},p_{2},\ldots ,p_{n}}$, and an integer ${\displaystyle m}$, find an assignment of the jobs to ${\displaystyle m}$ identical machines so that the completion time, also called the makespan, is minimum. Formally, given as input a sequence ${\displaystyle p_{1},p_{2},\ldots ,p_{n}}$ and an integer ${\displaystyle m}$, find an ${\displaystyle h:[n]\rightarrow [m]}$ such that ${\displaystyle \max _{i\in [m]}\sum _{j\in h^{-1}(i)}p_{j}}$ is minimized.

This problem is known to be NP-hard. We consider a very simple and natural algorithm:

• Schedule the jobs one by one, in an arbitrary order, each job being assigned to a machine with least amount of work so far.

This algorithms is due to Graham (1966), who also shows that the algorithm approximates the optimal makespan within a factor of 2.

Let OPT be the optimal makespan. We have two observations:

• ${\displaystyle \mathrm {OPT} \geq {\frac {1}{m}}\sum _{j=1}^{n}p_{j}}$, i.e. the optimal makespan is no smaller than the average processing time.
• ${\displaystyle \mathrm {OPT} \geq \max _{1\leq j\leq n}p_{j}}$, i.e. the optimal makespan is no smaller than the largest processing time.

Proof of the approximation factor: We assume that the last job scheduled is ${\displaystyle p_{n}}$, assigned to machine ${\displaystyle i}$, and the makespan produced by the algorithm is ${\displaystyle M}$. Since the algorithm assigns a job to the least loaded machine, right before ${\displaystyle p_{n}}$ is scheduled, machine ${\displaystyle i}$ whose current load is ${\displaystyle M-p_{n}}$, must be the least loaded. Thus, ${\displaystyle M-p_{n}}$ is no greater than the average processing time. Then,

${\displaystyle M-p_{n}\leq {\frac {1}{m}}\sum _{j=1}^{n}p_{j}\leq \mathrm {OPT} }$.

And since ${\displaystyle p_{n}\leq \max _{1\leq j\leq n}p_{j}\leq \mathrm {OPT} }$, it holds that ${\displaystyle M=M-p_{n}+p_{n}\leq 2\cdot \mathrm {OPT} }$.

${\displaystyle \square }$

There are poly-time algorithm using different ideas with much better approximation ratio for this problem.

Knapsack

 Problem (Knapsack) Given a set ${\displaystyle S=\{a_{1},a_{2},\ldots ,a_{n}\}}$ of objects, each with specified weight ${\displaystyle w_{i}\in \mathbb {Z} ^{+}}$ and profit ${\displaystyle p_{i}\in \mathbb {Z} ^{+}}$, and a knapsack capacity ${\displaystyle B}$, find a subset of objects whose total weight is bounded by ${\displaystyle B}$ and total profits is maximized.

A heuristic for the problem is to sort the objects by decreasing ratio of profit to weight, and then greedily pick objects in this order. However, the ratio between the solution of this heuristic and the OPT can be made arbitrarily bad. This is left as an exercise.

A pseudo-polynomial time algorithm by dynamic programming

Let ${\displaystyle P=\sum _{i=1}^{n}p_{i}}$ be the largest possible total profit. We first minimize the total weight for each possible total profit value ${\displaystyle p}$.

For each ${\displaystyle i\in \{1,2,\ldots ,n\}}$ and ${\displaystyle p\in \{1,2,\ldots ,P\}}$, let ${\displaystyle S_{i,p}\subseteq \{a_{1},a_{2},\ldots ,a_{i}\}}$ denote a subset of the first ${\displaystyle i}$ objects whose total profit is exactly ${\displaystyle p}$ and whose total weight is minimized.

Let ${\displaystyle A(i,p)}$ be the total weight of the set ${\displaystyle S_{i},p}$ (${\displaystyle A(i,p)=\infty }$ if no such set exists). Clearly ${\displaystyle A(1,p)}$ is known for every ${\displaystyle p}$. The following recurrence holds:

{\displaystyle {\begin{aligned}A(1,p)&=\min\{w_{i}\mid p_{i}=p\};\\A(i+1,p)&={\begin{cases}\min\{A(i,p),w_{i+1}+A(i,p-p_{i+1})\}&{\mbox{if }}p_{i+1}\leq p,\\A(i,p)&{\mbox{otherwise}}.\end{cases}}\end{aligned}}}

By dynamic programming, we can compute all values ${\displaystyle A(i,p)}$ by constructing the ${\displaystyle n\times P}$ matrix ${\displaystyle A(\cdot ,\cdot )}$ in an appropriate order. This takes totally ${\displaystyle O(nP)}$ time.

The optimal profit is given by ${\displaystyle \max\{p\mid A(i,p)\leq B\}}$.

Did we just solve an NP-hard problem in polynomial time, and prove that P=NP?

Actually, no, because of the factor ${\displaystyle P}$ in the time complexity. A poly-time algorithm, by definition, is that the worst-case running time is bounded with a polynomial of the size of the input, where the input is represented in binary.

The binary representation of the input of the knapsack problem only takes about ${\displaystyle O(\sum _{i=1}^{n}\log p_{i})}$ size, which means that the above algorithm is not poly-time. However, if we represent the input in the unary (一进制) form, it takes ${\displaystyle \Omega (P+n)}$ unary bits to represent the inputs. With unary representation of the input, our algorithm is poly-time. An algorithm with this property is called a pseudo-polynomial time algorithm.

An FPTAS by by "rounding and scaling"

We notice that the above dynamic programming algorithm is poly-time if the total profit ${\displaystyle p}$ of sets of objects only has ${\displaystyle \mathrm {poly} (n)}$ many possible values, and thus the matrix ${\displaystyle A}$ filled by the dynamic programming, is only polynomially large.

This leads us to an approximation algorithm by approximating all the profits to nearby lattice points. The idea is informally called the "rounding and scaling" technique:

 Algorithm Given ${\displaystyle \epsilon >0}$, let ${\displaystyle K={\frac {\epsilon p_{\max }}{n}}}$, where ${\displaystyle p_{\max }=\max _{1\leq i\leq n}p_{i}}$. For each object ${\displaystyle a_{i}}$, let ${\displaystyle p_{i}'=\left\lfloor {\frac {p_{i}}{K}}\right\rfloor }$. With these new profits of objects, using the dynamic programming algorithm, find the most profit set, say ${\displaystyle S}$. Return ${\displaystyle S}$.

The following lemma bounds the approximation ratio of the algorithm. The algorithm takes as input an instance of the knapsack problem and a parameter ${\displaystyle 0<\epsilon <1}$, and returns a knapset solution with approximation ratio of ${\displaystyle (1-\epsilon )}$ (it is a maximization problem, so the approximation ratio is less than 1).

 Lemma Let ${\displaystyle S}$ be the output of the algorithm, and ${\displaystyle p(S)=\sum _{i\in S}p_{i}}$ be the total profit of the objects in ${\displaystyle S}$. Then ${\displaystyle p(S)\geq (1-\epsilon )\cdot \mathrm {OPT} }$.
Proof.
 Let ${\displaystyle O}$ be the optimal set. For any object ${\displaystyle a_{i}}$, because of rounding down as ${\displaystyle p'_{i}=\left\lfloor {\frac {p_{i}}{K}}\right\rfloor }$, it holds that ${\displaystyle {\frac {p_{i}}{K}}-1\leq p'_{i}\leq {\frac {p_{i}}{K}}}$, and equivalently ${\displaystyle p_{i}-K\leq K\cdot p'_{i}\leq p_{i}}$. Therefore, ${\displaystyle p(O)-K\cdot p'(O)\leq nK}$. The dynamic programming algorithm returns the optimal solution for the new profits ${\displaystyle p'}$. Therefore, {\displaystyle {\begin{aligned}p(S)&\geq K\cdot p'(O)\\&\geq p(O)-nK\\&=\mathrm {OPT} -\epsilon p_{\max }&&\quad (K={\frac {\epsilon p_{\max }}{n}})\\&\geq (1-\epsilon )\cdot \mathrm {OPT} .&&\quad (\mathrm {OPT} \geq p_{\max })\end{aligned}}}
${\displaystyle \square }$

The running time of the approximation algorithm depends on the size of the matrix ${\displaystyle A}$ of dynamic programming, which is

${\displaystyle O\left(n\sum _{i=1}^{n}\left\lfloor {\frac {p_{i}}{K}}\right\rfloor \right)=O\left(n^{2}\left\lfloor {\frac {p_{\max }}{K}}\right\rfloor \right)=O\left(n^{2}\left\lfloor {\frac {n}{\epsilon }}\right\rfloor \right)=O(n^{3}\epsilon ^{-1})}$.

Therefore, the algorithm finds a solution with a total profit at least ${\displaystyle (1-\epsilon )\mathrm {OPT} }$ in time polynomial of ${\displaystyle n}$ and ${\displaystyle {\frac {1}{\epsilon }}}$, for any ${\displaystyle 0<\epsilon <1}$. This is called a fully polynomial-time approximation scheme (FPTAS). This is the best we can expect from a poly-time algorithm assuming that P${\displaystyle \neq }$NP.

LP-based approximation algorithms

A more advanced technique for designing approximation algorithms is linear programming. Many combinatorial optimization problems can be stated as integer programs (IP). The integer programs can be relaxed to linear programming problems. A feasible solution to the LP-relaxation is a fractional solution to the original combinatorial problem.

However, in the case of an NP-hard problem, we cannot expect the LP-relaxation has the same optimal solution as the original integer program (or otherwise a poly-time algorithm would solve an NP-hard problem). Thus, our task is not to look for an optimal solution to the LP-relaxation, but rather an integral solution which approximate the optimum within some bounded ratio.

There are two fundamental techniques for designing approximation algorithms using linear programming:

• The first is to solve the LP-relaxation and convert the fractional optimal solution to an integral solution, with bounded approximation ratio to the optimal integral solution. This technique is called rounding.
• The second is less obvious but with deeper insight, which uses the idea of duality. This technique is called the primal-dual schema.

We will focus on the rounding technique and demonstrate the power of randomized rounding. For other uses of LP for designing approximation algorithms, please refer to the classic textbook "Approximation Algorithms" by Vazirani.

Randomized Rounding

The integrality gap

Given an LP-relaxation for a minimization problem, let ${\displaystyle \mathrm {OPT} (I)}$ be the cost of an optimal solution to instance ${\displaystyle I}$, and let ${\displaystyle \mathrm {OPT} _{f}(I)}$ denote the cost of an optimal fractional solution to instance ${\displaystyle I}$, i.e., the value of the objective function of an optimal solution to the LP-relaxation. The integrality gap is defined to be

${\displaystyle \sup _{I}{\frac {\mathrm {OPT} (I)}{\mathrm {OPT} _{f}(I)}}.}$

The integrality gap for a maximization problem can be defined similarly.

Max-SAT

We consider the Max-SAT problem:

 Problem (Max-SAT) Given a conjunctive normal form (CNF) formula of ${\displaystyle m}$ clauses defined on ${\displaystyle n}$ boolean variables ${\displaystyle x_{1},x_{2},\ldots ,x_{n}}$, find a truth assignment to the boolean variables that maximizes the number of satisfied clauses.

and obtain two approximation algorithms by two different techniques.

The probabilistic method

A straightforward way to solve Max-SAT is to uniformly and independently assign each variable a random truth assignment. We have learned in the lecture about the probabilistic method that this "algorithm" has fine expected performance. Specifically, a clause ${\displaystyle C_{j}}$ of ${\displaystyle k}$ distinct variables is unsatisfied only if every one of the ${\displaystyle k}$ variables has the wrong "guess". The probability of this bad event is ${\displaystyle 2^{-k}}$, thus the clause ${\displaystyle C_{j}}$ is satisfied with probability ${\displaystyle 1-2^{-k}\geq {\frac {1}{2}}}$. Due to the linearity of expectation, expectedly at least ${\displaystyle {\frac {m}{2}}\geq {\frac {1}{2}}\cdot \mathrm {OPT} }$ clauses are satisfied.

We will see that by randomized rounding an LP, we can have a better approximation ratio.

LP relaxation + randomized rounding

For a clause ${\displaystyle C_{j}}$, let ${\displaystyle C_{i}^{+}}$ be the set of indices of the variables that appear in the uncomplemented form in clause ${\displaystyle C_{j}}$, and let ${\displaystyle C_{i}^{-}}$ be the set of indices of the variables that appear in the complemented form in clause ${\displaystyle C_{j}}$. The Max-SAT problem can be formulated as the following integer linear programing.

{\displaystyle {\begin{aligned}{\mbox{maximize}}&\quad \sum _{j=1}^{m}z_{j}\\{\mbox{subject to}}&\quad \sum _{i\in C_{j}^{+}}y_{i}+\sum _{i\in C_{j}^{-}}(1-y_{i})\geq z_{j},&&\forall 1\leq j\leq m\\&\qquad \qquad y_{i}\in \{0,1\},&&\forall 1\leq i\leq n\\&\qquad \qquad z_{j}\in \{0,1\},&&\forall 1\leq j\leq m\end{aligned}}}

Each ${\displaystyle y_{i}}$ in the programing indicates the truth assignment to the variable ${\displaystyle x_{i}}$, and each ${\displaystyle z_{j}}$ indicates whether the claus ${\displaystyle C_{j}}$ is satisfied. The inequalities ensure that a clause is deemed to be true only if at least one of the literals in the clause is assigned the value 1.

The integer linear programming is relaxed to the following linear programming:

{\displaystyle {\begin{aligned}{\mbox{maximize}}&\quad \sum _{j=1}^{m}z_{j}\\{\mbox{subject to}}&\quad \sum _{i\in C_{j}^{+}}y_{i}+\sum _{i\in C_{j}^{-}}(1-y_{i})\geq z_{j},&&\forall 1\leq j\leq m\\&\qquad \qquad 0\leq y_{i}\leq 1,&&\forall 1\leq i\leq n\\&\qquad \qquad 0\leq z_{j}\leq 1,&&\forall 1\leq j\leq m\end{aligned}}}

Let ${\displaystyle {\hat {y}}_{i}}$ and ${\displaystyle {\hat {z}}_{j}}$ be the fractional optimal solutions to the above linear programming. Clearly, ${\displaystyle \sum _{j=1}^{m}{\hat {z}}_{j}}$ is an upper bound on the optimal number of satisfied clauses.

Apply a very natural randomized rounding scheme. For each ${\displaystyle 1\leq i\leq n}$, independently

${\displaystyle y_{i}={\begin{cases}1&{\mbox{with probability }}{\hat {y}}_{i}.\\0&{\mbox{with probability }}1-{\hat {y}}_{i}.\end{cases}}}$

Correspondingly, each ${\displaystyle x_{i}}$ is assigned to TRUE independently with probability ${\displaystyle {\hat {y}}_{i}}$.

 Lemma Let ${\displaystyle C_{j}}$ be a clause with ${\displaystyle k}$ literals. The probability that it is satisfied by randomized rounding is at least ${\displaystyle (1-(1-1/k)^{k}){\hat {z}}_{j}}$.
Proof.
 Without loss of generality, we assume that all ${\displaystyle k}$ variables appear in ${\displaystyle C_{j}}$ in the uncomplemented form, and we assume that ${\displaystyle C_{j}=x_{1}\vee x_{2}\vee \cdots \vee x_{k}}$. The complemented cases are symmetric. Clause ${\displaystyle C_{j}}$ remains unsatisfied by randomized rounding only if every one of ${\displaystyle x_{i}}$, ${\displaystyle 1\leq i\leq k}$, is assigned to FALSE, which corresponds to that every one of ${\displaystyle y_{i}}$, ${\displaystyle 1\leq i\leq k}$, is rounded to 0. This event occurs with probability ${\displaystyle \prod _{i=1}^{k}(1-{\hat {y}}_{i})}$. Therefore, the clause ${\displaystyle C_{j}}$ is satisfied by the randomized rounding with probability ${\displaystyle 1-\prod _{i=1}^{k}(1-{\hat {y}}_{i})}$. By the linear programming constraints, ${\displaystyle {\hat {y}}_{1}+{\hat {y}}_{2}+\cdots +{\hat {y}}_{k}\geq {\hat {z}}_{j}}$. Then the value of ${\displaystyle 1-\prod _{i=1}^{k}(1-{\hat {y}}_{i})}$ is minimized when all ${\displaystyle {\hat {y}}_{i}}$ are equal and ${\displaystyle {\hat {y}}_{i}={\hat {z}}_{j}/k}$. Thus, the probability that ${\displaystyle C_{j}}$ is satisfied is ${\displaystyle 1-\prod _{i=1}^{k}(1-{\hat {y}}_{i})\geq 1-(1-{\hat {z}}_{j}/k)^{k}\geq (1-(1-1/k)^{k}){\hat {z}}_{j}}$, where the last inequality is due to the concaveness of the function ${\displaystyle 1-(1-{\hat {z}}_{j}/k)^{k}}$ of variable ${\displaystyle {\hat {z}}_{j}}$.
${\displaystyle \square }$

For any ${\displaystyle k\geq 1}$, it holds that ${\displaystyle 1-(1-1/k)^{k}>1-1/e}$. Therefore, by the linearity of expectation, the expected number of satisfied clauses by the randomized rounding, is at least

${\displaystyle (1-1/e)\sum _{j=1}{\hat {z}}_{j}\geq (1-1/e)\cdot \mathrm {OPT} }$.

The inequality is due to the fact that ${\displaystyle {\hat {z}}_{j}}$ are the optimal fractional solutions to the relaxed LP, thus are no worse than the optimal integral solutions.

Let the two algorithms compete with each other

For any instance of the Max-SAT, let ${\displaystyle m_{1}}$ be the expected number of satisfied clauses when each variable is independently set to TRUE with probability ${\displaystyle {\frac {1}{2}}}$; and let ${\displaystyle m_{2}}$ be the expected number of satisfied clauses when we use the linear programming followed by randomized rounding.

We will show that on any instance of the Max-SAT, one of the two algorithms is a ${\displaystyle {\frac {3}{4}}}$-approximation algorithm.

 Theorem ${\displaystyle \max\{m_{1},m_{2}\}\geq {\frac {3}{4}}\cdot \mathrm {OPT} .}$
Proof.
 It suffices to show that ${\displaystyle {\frac {(m_{1}+m_{2})}{2}}\geq {\frac {3}{4}}\sum _{j=1}^{m}{\hat {z}}_{j}}$. Letting ${\displaystyle S_{k}}$ denote the set of clauses that contain ${\displaystyle k}$ literals, we know that ${\displaystyle m_{1}=\sum _{k=1}^{n}\sum _{C_{j}\in S_{k}}(1-2^{-k})\geq \sum _{k=1}^{n}\sum _{C_{j}\in S_{k}}(1-2^{-k}){\hat {z}}_{j}.}$ By the analysis of randomized rounding, ${\displaystyle m_{2}\geq \sum _{k=1}^{n}\sum _{C_{j}\in S_{k}}(1-(1-1/k)^{k}){\hat {z}}_{j}.}$ Thus ${\displaystyle {\frac {(m_{1}+m_{2})}{2}}\geq \sum _{k=1}^{n}\sum _{C_{j}\in S_{k}}{\frac {1-2^{-k}+1-(1-1/k)^{k}}{2}}{\hat {z}}_{j}.}$ An easy calculation shows that ${\displaystyle {\frac {1-2^{-k}+1-(1-1/k)^{k}}{2}}\geq {\frac {3}{4}}}$ for any ${\displaystyle k}$, so that we have ${\displaystyle {\frac {(m_{1}+m_{2})}{2}}\geq {\frac {3}{4}}\sum _{k=1}^{n}\sum _{C_{j}\in S_{k}}{\hat {z}}_{j}={\frac {3}{4}}\sum _{j=1}^{m}{\hat {z}}_{j}\geq {\frac {3}{4}}\cdot \mathrm {OPT} .}$
${\displaystyle \square }$