Randomized Algorithms (Spring 2010)/Randomized approximation algorithms: Difference between revisions

From TCS Wiki
Jump to navigation Jump to search
imported>WikiSysop
No edit summary
imported>WikiSysop
No edit summary
Line 71: Line 71:
We give the following algorithm:
We give the following algorithm:
# Given <math>\epsilon>0</math>, let <math>K=\frac{\epsilon p_\max}{n}</math>.
# Given <math>\epsilon>0</math>, let <math>K=\frac{\epsilon p_\max}{n}</math>.
# For each object <math>a_i</math>, let <math>p'_i=\left\floor\frac{p_i}{K}\right\floor</math>.
# For each object <math>a_i</math>, let <math>p_i'=\left\lfloor\frac{p_i}{K}\right\rfloor</math>.
# With these new profits of objects, using the dynamic programming algorithm, find the most profit set, say <math>S</math>.
# With these new profits of objects, using the dynamic programming algorithm, find the most profit set, say <math>S</math>.
# Return <math>S</math>.
# Return <math>S</math>.

Revision as of 09:56, 27 May 2010

Approximation Algorithms

Coping with the NP-hardness

Combinatorial approximation algorithms

Scheduling

Scheduling is a class of problems. We consider a central problem in scheduling theory: the minimum makespan scheduling.

Problem (Minimum makespan scheduling)
Given processing times for [math]\displaystyle{ n }[/math] jobs, [math]\displaystyle{ p_1,p_2,\ldots,p_n }[/math], and an integer [math]\displaystyle{ m }[/math], find an assignment of the jobs to [math]\displaystyle{ m }[/math] identical machines so that the completion time, also called the makespan, is minimum.
Formally, given as input a sequence [math]\displaystyle{ p_1,p_2,\ldots,p_n }[/math] and an integer [math]\displaystyle{ m }[/math], find an [math]\displaystyle{ h:[n]\rightarrow[m] }[/math] such that [math]\displaystyle{ \max_{i\in[m]}\sum_{j\in h^{-1}(i)}p_j }[/math] is minimized.

This problem is known to be NP-hard. We consider a very simple and natural algorithm:

  • Schedule the jobs one by one, in an arbitrary order, each job being assigned to a machine with least amount of work so far.

This algorithms is due to Graham (1966), who also shows that the algorithm approximates the optimal makespan within a factor of 2.

Let OPT be the optimal makespan. We have two observations:

  • [math]\displaystyle{ \mathrm{OPT}\ge\frac{1}{m}\sum_{j=1}^n p_j }[/math], i.e. the optimal makespan is no smaller than the average processing time.
  • [math]\displaystyle{ \mathrm{OPT}\ge \max_{1\le j\le n}p_j }[/math], i.e. the optimal makespan is no smaller than the largest processing time.

Proof of the approximation factor: We assume that the last job scheduled is [math]\displaystyle{ p_n }[/math], assigned to machine [math]\displaystyle{ i }[/math], and the makespan produced by the algorithm is [math]\displaystyle{ M }[/math]. Since the algorithm assigns a job to the least loaded machine, right before [math]\displaystyle{ p_n }[/math] is scheduled, machine [math]\displaystyle{ i }[/math] whose current load is [math]\displaystyle{ M-p_n }[/math], must be the least loaded. Thus, [math]\displaystyle{ M-p_n }[/math] is no greater than the average processing time. Then,

[math]\displaystyle{ M-p_n\le \frac{1}{m}\sum_{j=1}^n p_j\le \mathrm{OPT} }[/math].

And since [math]\displaystyle{ p_n\le\max_{1\le j\le n}p_j\le \mathrm{OPT} }[/math], it holds that [math]\displaystyle{ M=M-p_n+p_n\le 2\cdot\mathrm{OPT} }[/math].

[math]\displaystyle{ \square }[/math]

There are poly-time algorithm using different ideas with much better approximation ratio for this problem.

Knapsack

Problem (Knapsack)
Given a set [math]\displaystyle{ S=\{a_1,a_2,\ldots,a_n\} }[/math]of objects, each with specified weight [math]\displaystyle{ w_i\in\mathbb{Z}^+ }[/math] and profit [math]\displaystyle{ p_i\in\mathbb{Z}^+ }[/math], and a knapsack capacity [math]\displaystyle{ B }[/math], find a subset of objects whose total weight is bounded by [math]\displaystyle{ B }[/math] and total profits is maximized.

A heuristic for the problem is to sort the objects by decreasing ratio of profit to weight, and then greedily pick objects in this order. However, the ratio between the solution of this heuristic and the OPT can be made arbitrarily bad. This is left as an exercise.

A pseudo-polynomial time algorithm by dynamic programming

Let [math]\displaystyle{ P=\sum_{i=1}^n p_i }[/math] be the largest possible total profit. We first minimize the total weight for each possible total profit value [math]\displaystyle{ p }[/math].

For each [math]\displaystyle{ i\in\{1,2,\ldots,n\} }[/math] and [math]\displaystyle{ p\in\{1,2,\ldots,P\} }[/math], let [math]\displaystyle{ S_{i,p}\subseteq\{a_1,a_2,\ldots,a_i\} }[/math] denote a subset of the first [math]\displaystyle{ i }[/math] objects whose total profit is exactly [math]\displaystyle{ p }[/math] and whose total weight is minimized.

Let [math]\displaystyle{ A(i,p) }[/math] be the total weight of the set [math]\displaystyle{ S_i,p }[/math] ([math]\displaystyle{ A(i,p)=\infty }[/math] if no such set exists). Clearly [math]\displaystyle{ A(1,p) }[/math] is known for every [math]\displaystyle{ p }[/math]. The following recurrence holds:

[math]\displaystyle{ \begin{align} A(1,p) &= \min\{w_i\mid p_i=p\};\\ A(i+1,p) &= \begin{cases} \min\{A(i,p),w_{i+1}+A(i,p-p_{i+1})\} & \mbox{if }p_{i+1}\le p,\\ A(i,p) & \mbox{otherwise}. \end{cases} \end{align} }[/math]

By dynamic programming, we can compute all values [math]\displaystyle{ A(i,p) }[/math] by constructing the [math]\displaystyle{ n\times P }[/math] matrix [math]\displaystyle{ A(\cdot,\cdot) }[/math] in an appropriate order. This takes totally [math]\displaystyle{ O(n P) }[/math] time.

The optimal profit is given by [math]\displaystyle{ \max\{p\mid A(i,p)\le B\} }[/math].

Did we just solve an NP-hard problem in polynomial time, and prove that P=NP?

Actually, no, because of the factor [math]\displaystyle{ P }[/math] in the time complexity. A poly-time algorithm, by definition, is that the worst-case running time is bounded with a polynomial of the size of the input, where the input is represented in binary.

The binary representation of the input of the knapsack problem only takes about [math]\displaystyle{ O(\sum_{i=1}^n\log p_i) }[/math] size, which means that the above algorithm is not poly-time. However, if we represent the input in the unary (一进制) form, it takes [math]\displaystyle{ \Omega(P+n) }[/math] unary bits to represent the inputs. With unary representation of the input, our algorithm is poly-time. An algorithm with this property is called a pseudo-polynomial time algorithm.

An FPTAS by by "rounding and scaling"

We give the following algorithm:

  1. Given [math]\displaystyle{ \epsilon\gt 0 }[/math], let [math]\displaystyle{ K=\frac{\epsilon p_\max}{n} }[/math].
  2. For each object [math]\displaystyle{ a_i }[/math], let [math]\displaystyle{ p_i'=\left\lfloor\frac{p_i}{K}\right\rfloor }[/math].
  3. With these new profits of objects, using the dynamic programming algorithm, find the most profit set, say [math]\displaystyle{ S }[/math].
  4. Return [math]\displaystyle{ S }[/math].


LP-based approximation algorithms

Randomized Rounding

The integrality gap

Randomized rounding

Max-SAT

Covering and Packing