Advanced Algorithms (Fall 2016) / Greedy and Local Search

From TCS Wiki



Revision as of 03:04, 29 September 2016

Under construction. 

Set cover

Given <math>m</math> subsets <math>S_1,S_2,\ldots,S_m\subseteq U</math> of a universe <math>U</math> of size <math>n=|U|</math>, a <math>C\subseteq\{1,2,\ldots,m\}</math> forms a set cover if <math>U=\bigcup_{i\in C}S_i</math>, that is, <math>C</math> is a sub-collection of sets whose union "covers" all elements in the universe.

Without loss of generality, we always assume that the universe is <math>U=\bigcup_{i=1}^m S_i</math>.

This defines an important optimization problem:

Set Cover Problem
  • Input: <math>m</math> subsets <math>S_1,S_2,\ldots,S_m\subseteq U</math> of a universe <math>U</math> of size <math>n</math>;
  • Output: the smallest <math>C\subseteq\{1,2,\ldots,m\}</math> such that <math>U=\bigcup_{i\in C}S_i</math>.

We can think of each instance as a bipartite graph <math>G(U,\{S_1,S_2,\ldots,S_m\},E)</math> with <math>n</math> vertices on the left side, each corresponding to an element <math>x\in U</math>, <math>m</math> vertices on the right side, each corresponding to one of the <math>m</math> subsets <math>S_1,S_2,\ldots,S_m</math>, and a bipartite edge connecting <math>x</math> with <math>S_i</math> if and only if <math>x\in S_i</math>. Under this translation, the set cover problem is precisely the following problem: given a bipartite graph <math>G(U,V,E)</math> as input, find the smallest subset <math>C\subseteq V</math> of vertices on the right side that "covers" all vertices on the left side, i.e. such that every vertex <math>x\in U</math> on the left side is adjacent to some vertex in <math>C</math>.

By exchanging the roles of sets and elements in the above bipartite-graph interpretation of set cover instances, the set cover problem translates to the following equivalent hitting set problem:

Hitting Set Problem
  • Input: <math>n</math> subsets <math>S_1,S_2,\ldots,S_n\subseteq U</math> of a universe <math>U</math> of size <math>m</math>;
  • Output: the smallest subset <math>C\subseteq U</math> of elements such that <math>C</math> intersects with every set <math>S_i</math> for <math>1\le i\le n</math>.

Frequency and Vertex Cover

Given an instance <math>S_1,S_2,\ldots,S_m\subseteq U</math> of the set cover problem, for every element <math>x\in U</math>, its frequency, denoted <math>\mathrm{frequency}(x)</math>, is defined as the number of sets containing <math>x</math>. Formally,

<math>\mathrm{frequency}(x)=|\{i\mid x\in S_i\}|</math>.

In the hitting set version, the frequency is instead defined for each set: for a set <math>S_i</math>, its frequency <math>\mathrm{frequency}(S_i)=|S_i|</math> is just the size of the set <math>S_i</math>.

The set cover problem restricted to instances with frequency 2 becomes the vertex cover problem.

Given an undirected graph <math>G(V,E)</math>, a vertex cover is a subset <math>C\subseteq V</math> of vertices such that every edge <math>uv\in E</math> has at least one endpoint in <math>C</math>.

Vertex Cover Problem
  • Input: an undirected graph <math>G(V,E)</math>;
  • Output: the smallest <math>C\subseteq V</math> such that every edge <math>e\in E</math> is incident to at least one vertex in <math>C</math>.

It is easy to see the correspondence with the hitting set problem:

  • For a graph <math>G(V,E)</math>, its edges <math>e_1,e_2,\ldots,e_n\subseteq V</math> are vertex-sets of size 2.
  • A subset <math>C\subseteq V</math> of vertices is a vertex cover if and only if it is a hitting set for <math>e_1,e_2,\ldots,e_n</math>, i.e. every <math>e_i</math> intersects with <math>C</math>.

Therefore vertex cover is just the set cover problem with frequency 2.

The vertex cover problem is NP-hard. Its decision version is among Karp's 21 NP-complete problems. Since vertex cover is a special case of set cover, the set cover problem is also NP-hard.

Greedy Algorithm

We present our algorithms in the original set cover setting (instead of the hitting set version).

A natural algorithm is the greedy algorithm: repeatedly add to the cover <math>C</math> the index <math>i</math> whose set <math>S_i</math> covers the largest number of currently uncovered elements, until no element is left uncovered.

GreedyCover
Input: sets <math>S_1,S_2,\ldots,S_m</math>;

initially, <math>U=\bigcup_{i=1}^m S_i</math>, and <math>C=\emptyset</math>;
while <math>U\neq\emptyset</math> do
  find <math>i\in\{1,2,\ldots,m\}</math> with the largest <math>|S_i\cap U|</math>;
  let <math>C=C\cup\{i\}</math> and <math>U=U\setminus S_i</math>;
return <math>C</math>;
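As a concrete illustration, here is a minimal Python sketch of the GreedyCover procedure above (the representation of the input as a list of Python sets is our own choice, not from the notes):

```python
def greedy_cover(sets):
    """Greedy set cover: repeatedly pick the set covering the most
    currently uncovered elements; return the chosen indices."""
    uncovered = set().union(*sets)          # U = S_1 ∪ ... ∪ S_m
    cover = []
    while uncovered:
        # index i with the largest |S_i ∩ U|
        i = max(range(len(sets)), key=lambda i: len(sets[i] & uncovered))
        cover.append(i)                     # C = C ∪ {i}
        uncovered -= sets[i]                # U = U \ S_i
    return cover
```

For example, `greedy_cover([{1, 2, 3}, {3, 4}, {4, 5}])` picks set 0 first (covering three new elements), then set 2, returning `[0, 2]`.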

Obviously the algorithm runs in polynomial time and always returns a set cover. We then show how good the returned set cover is compared to the optimal solution by analyzing its approximation ratio.

We define the following notations:

  • We enumerate the elements of the universe <math>U</math> as <math>x_1,x_2,\ldots,x_n</math>, in the order in which they are covered by the algorithm.
  • For <math>t=1,2,\ldots</math>, let <math>U_t</math> denote the set of uncovered elements at the beginning of the <math>t</math>-th iteration of the algorithm.
  • For the <math>k</math>-th covered element <math>x_k</math>, supposing it is covered by <math>S_i</math> in the <math>t</math>-th iteration, define
<math>\mathrm{price}(x_k)=\frac{1}{|S_i\cap U_t|}</math>
to be the average "price" paid to cover element <math>x_k</math> by the algorithm.

Observe that if <math>x_k</math> is covered by <math>S_i</math> in the <math>t</math>-th iteration, then precisely <math>|S_i\cap U_t|</math> elements, including <math>x_k</math>, become covered in that iteration, and all of these elements have price <math>1/|S_i\cap U_t|</math>. This gives the following lemma:

Lemma 1
For the set cover <math>C</math> returned by the algorithm, <math>|C|=\sum_{k=1}^n \mathrm{price}(x_k)</math>.

Indeed, each iteration adds one set to <math>C</math> and distributes a total price of exactly 1 over the elements it newly covers. This lemma connects the size of the returned set cover to the prices of elements. The next lemma connects the price of each element to the optimal solution.

Lemma 2
For each <math>x_k</math>, <math>\mathrm{price}(x_k)\le\frac{OPT}{n-k+1}</math>, where <math>OPT</math> is the size of the optimal set cover.
Proof.

For an instance <math>S_1,S_2,\ldots,S_m\subseteq U</math> with a universe of size <math>n=|U|</math>, let <math>C^*\subseteq\{1,2,\ldots,m\}</math> denote an optimal set cover. Then

<math>U=\bigcup_{i\in C^*}S_i</math>.

By the averaging principle, there must be an <math>S_i</math> of size

<math>|S_i|\ge\frac{n}{|C^*|}=\frac{n}{OPT}</math>.

By the greediness of the algorithm, in the first iteration the algorithm must choose a set of at least this size to add to the set cover <math>C</math>, which means the first covered element <math>x_1</math>, along with all elements covered in the first iteration, is priced as

<math>\mathrm{price}(x_1)=\frac{1}{\max_i|S_i|}\le\frac{OPT}{n}</math>.

For the <math>k</math>-th element <math>x_k</math> covered by the algorithm, suppose it is covered by <math>S_i</math> in the <math>t</math>-th iteration, and the current set of uncovered elements is <math>U_t</math>. It holds that

<math>|U_t|\ge n-k+1</math>,

since at most <math>k-1</math> elements are covered before <math>x_k</math>.

The uncovered elements constitute a set cover instance <math>S_1\cap U_t, S_2\cap U_t,\ldots,S_m\cap U_t</math> (some of which may be empty), with universe <math>U_t</math>. Note that this smaller instance has an optimal set cover of size at most <math>OPT</math> (since the optimal solution for the original instance also covers this smaller instance), and <math>x_k</math> is among the elements covered in the first iteration of the algorithm running on this smaller instance. Also note that the value <math>\mathrm{price}(x_k)=\frac{1}{|S_i\cap U_t|}</math> is the same whether computed in the <math>t</math>-th iteration of the algorithm on the original instance or in the first iteration of the algorithm on the smaller instance. By the argument above for the first iteration,

<math>\mathrm{price}(x_k)=\frac{1}{|S_i\cap U_t|}\le\frac{OPT}{|U_t|}\le\frac{OPT}{n-k+1}</math>.

The lemma is proved.


Combining Lemma 1 and Lemma 2, we have

<math>|C|=\sum_{k=1}^n \mathrm{price}(x_k)\le\sum_{k=1}^n\frac{OPT}{n-k+1}=\sum_{k=1}^n\frac{OPT}{k}=H_n\cdot OPT</math>,

where <math>H_n=\sum_{k=1}^n\frac{1}{k}=\ln n+O(1)</math> is the <math>n</math>-th Harmonic number.

This shows that the GreedyCover algorithm has approximation ratio <math>H_n\approx\ln n</math>.

Theorem
For any set cover instance <math>S_1,S_2,\ldots,S_m\subseteq U</math> with optimal set cover of size <math>OPT</math>, GreedyCover returns a set cover of size
<math>|C|\le H_n\cdot OPT</math>,
where <math>n=|U|</math> is the size of the universe and <math>H_n\approx\ln n</math> is the <math>n</math>-th Harmonic number.

Ignoring lower order terms, <math>\ln n</math> is also the best approximation ratio achievable by polynomial time algorithms, assuming that <math>\mathbf{NP}\neq\mathbf{P}</math>.

  • Lund and Yannakakis (1994) showed that there is no poly-time algorithm with approximation ratio <math><\frac{1}{2}\log_2 n</math>, unless all NP problems have quasi-polynomial time algorithms (which run in time <math>n^{\mathrm{polylog}(n)}</math>).
  • Feige (1998) showed that there is no poly-time algorithm with approximation ratio better than <math>(1-o(1))\ln n</math> under the same complexity assumption.
  • Raz and Safra (1997) showed that there is no poly-time algorithm with approximation ratio better than <math>c\ln n</math> for a constant <math>c</math>, assuming <math>\mathbf{NP}\neq\mathbf{P}</math>.
  • Dinur and Steurer (2014) showed that there is no poly-time algorithm with approximation ratio better than <math>(1-o(1))\ln n</math>, assuming <math>\mathbf{NP}\neq\mathbf{P}</math>.

Primal-Dual Algorithm

Given an instance <math>S_1,S_2,\ldots,S_m\subseteq U</math> for set cover, the set cover problem asks to minimize the size <math>|C|</math> subject to the constraints that <math>C\subseteq\{1,2,\ldots,m\}</math> and <math>U=\bigcup_{i\in C}S_i</math>, i.e. <math>C</math> is a set cover. We can define a dual problem on the same instance. The original problem, here the set cover problem, is called the primal problem. Formally, the primal and dual problems are defined as follows:

Primal:
  minimize <math>|C|</math>
  subject to <math>C\subseteq\{1,2,\ldots,m\}</math>
             <math>U=\bigcup_{i\in C}S_i</math>

Dual:
  maximize <math>|M|</math>
  subject to <math>M\subseteq U</math>
             <math>|S_i\cap M|\le 1</math>, <math>1\le i\le m</math>

The dual problem is a "maximum matching" problem, where the matching is defined for the set system instead of for a graph. Given an instance <math>S_1,S_2,\ldots,S_m\subseteq U</math>, an <math>M\subseteq U</math> is called a matching for <math>S_1,S_2,\ldots,S_m</math> if <math>|S_i\cap M|\le 1</math> for all <math>i=1,2,\ldots,m</math>.

It is easier to see that these two optimization problems are dual to each other if we write them as mathematical programs.

For the primal problem (set cover), for each <math>1\le i\le m</math>, let <math>x_i\in\{0,1\}</math> indicate whether <math>i\in C</math>. The set cover problem can be written as the following integer linear program (ILP).

Primal:
  minimize <math>\sum_{i=1}^m x_i</math>
  subject to <math>\sum_{i:\,v\in S_i} x_i\ge 1</math>, <math>\forall v\in U</math>
             <math>x_i\in\{0,1\}</math>, <math>1\le i\le m</math>

For the dual problem (maximum matching), for each <math>v\in U</math>, let <math>y_v\in\{0,1\}</math> indicate whether <math>v\in M</math>. Then the dual problem can be written as the following ILP.

Dual:
  maximize <math>\sum_{v\in U} y_v</math>
  subject to <math>\sum_{v\in S_i} y_v\le 1</math>, <math>1\le i\le m</math>
             <math>y_v\in\{0,1\}</math>, <math>\forall v\in U</math>

It is a fundamental fact that for a minimization problem, every feasible solution to the dual problem (which is a maximization problem) gives a lower bound for the optimal solution to the primal problem. This is called the weak duality phenomenon. The easy direction of the famous "max-flow min-cut" theorem (that the value of every flow is at most the capacity of every cut) is an instance of weak duality.

Theorem (Weak Duality)
For every matching <math>M</math> and every set cover <math>C</math> for <math>S_1,S_2,\ldots,S_m\subseteq U</math>, it holds that <math>|M|\le|C|</math>.
Proof.

If <math>M\subseteq U</math> is a matching for <math>S_1,S_2,\ldots,S_m\subseteq U</math>, then every <math>S_i</math> intersects <math>M</math> in at most one element, which means no two elements in <math>M</math> can be covered by the same <math>S_i</math>, and hence covering each element of <math>M</math> consumes at least one distinct <math>S_i</math>. Therefore, any set cover must use at least <math>|M|</math> sets in order to cover all elements in <math>M</math>.

More formally, for any matching <math>M</math> and set cover <math>C</math>, it holds that

<math>|M|=\left|\bigcup_{i\in C}(S_i\cap M)\right|\le\sum_{i\in C}|S_i\cap M|\le|C|</math>,

where the first equality holds because <math>C</math> is a set cover (every element of <math>M</math> lies in some <math>S_i</math> with <math>i\in C</math>), and the last inequality holds because <math>M</math> is a matching (<math>|S_i\cap M|\le 1</math> for each <math>i</math>).

As a corollary, the size of every matching <math>M</math> for <math>S_1,S_2,\ldots,S_m\subseteq U</math> is a lower bound for the optimal set cover:

Corollary
Let <math>S_1,S_2,\ldots,S_m\subseteq U</math> be an instance of set cover, and <math>OPT</math> the size of the optimal set cover.
For every matching <math>M</math> for <math>S_1,S_2,\ldots,S_m</math>, it holds that <math>|M|\le OPT</math>.

Now we are ready to present our algorithm. It is a greedy algorithm in the dual world, and a maximal (locally optimal) solution to the dual problem helps us find a good enough solution to the primal problem.

DualCover
Input: sets <math>S_1,S_2,\ldots,S_m\subseteq U</math>;

construct a maximal matching <math>M\subseteq U</math> such that <math>|S_i\cap M|\le 1</math> for all <math>i=1,2,\ldots,m</math>
  by sequentially adding elements to <math>M</math> until nothing can be added;
return <math>C=\{i\mid S_i\cap M\neq\emptyset\}</math>;
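A minimal Python sketch of DualCover (illustrative only; for determinism we scan the universe in sorted order, which the abstract algorithm does not require):

```python
def dual_cover(sets):
    """Build a maximal matching M greedily, then return the indices of
    all sets that intersect M; the result is a set cover."""
    matching = set()
    for x in sorted(set().union(*sets)):
        # add x only if every S_i still intersects M in at most one element
        if all(len(s & (matching | {x})) <= 1 for s in sets):
            matching.add(x)
    # C = { i : S_i ∩ M ≠ ∅ }
    return [i for i, s in enumerate(sets) if s & matching]
```

On the frequency-2 instance `[{1, 2}, {2, 3}, {3, 4}]` (a path on four vertices), the maximal matching becomes `{1, 3}` and the returned cover is `[0, 1, 2]`, within a factor `f = 2` of the optimum `{0, 2}`.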

The algorithm constructs the maximal matching <math>M</math> by sequentially adding elements into <math>M</math> until maximality is reached. This obviously takes polynomial time.

It is less obvious that the returned <math>C</math> is always a set cover. This is due to the maximality of the matching <math>M</math>:

  • By contradiction, assume that <math>C</math> is not a set cover. Then there is an element <math>x\in U</math> such that <math>x\notin S_i</math> for all <math>i\in C</math>, which implies that <math>x\notin M</math> and that <math>M'=M\cup\{x\}</math> is still a matching, contradicting the maximality of <math>M</math>.

Therefore the <math>C</math> constructed by the DualCover algorithm must be a set cover.

For the maximal matching <math>M</math> constructed by the DualCover algorithm, the output set cover is the collection of all sets which contain at least one element of <math>M</math>. Recall that the frequency <math>\mathrm{frequency}(x)</math> of an element is defined as the number of sets <math>S_i</math> containing <math>x</math>. Each element <math>x\in M</math> may contribute at most <math>\mathrm{frequency}(x)</math> sets to the set cover <math>C</math>. Then it holds that

<math>|C|\le\sum_{x\in M}\mathrm{frequency}(x)\le f\cdot|M|\le f\cdot OPT</math>,

where <math>f=\max_{x\in U}\mathrm{frequency}(x)</math> denotes the maximum frequency over all elements, and the last inequality is due to the corollary above.

We have proved the following <math>f</math>-approximation bound for the DualCover algorithm on set cover instances with maximum frequency <math>f</math>.

Theorem
For any set cover instance <math>S_1,S_2,\ldots,S_m\subseteq U</math> with optimal set cover of size <math>OPT</math>, DualCover returns a set cover of size
<math>|C|\le f\cdot OPT</math>,
where <math>f=\max_{x\in U}\mathrm{frequency}(x)=\max_{x\in U}|\{i\mid x\in S_i\}|</math> is the maximum frequency.

When the maximum frequency is <math>f=2</math>, the set cover problem becomes the vertex cover problem, and the DualCover algorithm works simply as follows:

DualCover for the vertex cover problem
Input: an undirected graph <math>G(V,E)</math>;

initially <math>C=\emptyset</math>;
while <math>E\neq\emptyset</math>
  pick an arbitrary edge <math>uv\in E</math> and add both <math>u</math> and <math>v</math> to <math>C</math>;
  remove all edges in <math>E</math> incident to <math>u</math> or <math>v</math>;
return <math>C</math>;
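In plain Python, this specialization can be sketched as follows (illustrative; edges are given as pairs of vertices):

```python
def vertex_cover_2approx(edges):
    """Repeatedly take an uncovered edge and add both its endpoints."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:   # edge uv is still uncovered
            cover.add(u)
            cover.add(v)
    return cover
```

Skipping edges that already have a covered endpoint implements "remove all edges incident to <math>u</math> or <math>v</math>" implicitly.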

Since this algorithm is just an implementation of the DualCover algorithm on the vertex cover instances (set cover instances with frequency f=2), by the analysis of the DualCover algorithm, it is a 2-approximation algorithm for the vertex cover problem.

Ignoring lower order terms, 2 is also the best approximation ratio achievable by polynomial time algorithms, under certain complexity assumptions.

Scheduling

We consider the following scheduling problem:

  • There are <math>n</math> jobs to be processed.
  • There are <math>m</math> identical parallel machines. Each machine can start processing jobs at time 0 and can process at most one job at a time.
  • Each job <math>j=1,2,\ldots,n</math> must be processed on one of these machines for <math>p_j</math> time units without interruption. <math>p_j</math> is called the processing time of job <math>j</math>.

In a schedule, each job is assigned to a machine to be processed starting at some time, respecting the above rules. The goal is to complete all jobs as soon as possible.

Supposing each job <math>j</math> is completed at time <math>C_j</math>, the objective is to minimize the makespan <math>C_{\max}=\max_j C_j</math>.

This problem is called minimum makespan on identical parallel machines. It can be equivalently described as the following load balancing problem:

Minimum Makespan on Identical Parallel Machines (load balancing version)
  • Input: <math>n</math> positive integers <math>p_1,p_2,\ldots,p_n</math> and a positive integer <math>m</math>;
  • Output: an assignment <math>\sigma:[n]\to[m]</math> which minimizes <math>C_{\max}=\max_{i\in[m]}\sum_{j:\,\sigma(j)=i}p_j</math>.

With the <math>\alpha|\beta|\gamma</math> notation for scheduling, this problem is the scheduling problem <math>P||C_{\max}</math>.

The <math>\alpha|\beta|\gamma</math> notation was introduced by Ron Graham et al. to classify scheduling problems. See this note for more details.

The problem is NP-hard. In particular, already for <math>m=2</math> the problem can encode the partition problem, which is among Karp's 21 NP-complete problems.

Partition Problem
  • Input: a set of <math>n</math> positive integers <math>S=\{x_1,x_2,\ldots,x_n\}</math>;
  • Output: determine whether there is a partition of <math>S</math> into <math>A</math> and <math>B</math> such that <math>\sum_{x\in A}x=\sum_{x\in B}x</math>.

Graham's List algorithm

In a technical report at Bell Labs in 1966, Graham described a natural greedy procedure for scheduling jobs on parallel identical machines and gave an elegant analysis of its performance. It was probably the first approximation algorithm in the modern sense with a provable approximation ratio. Interestingly, it even predates the discovery of the notion of NP-hardness.

Graham's List algorithm takes a list of jobs as input. The load of a machine is defined as the total processing time of the jobs currently assigned to the machine.

The List algorithm (Graham 1966)
Input: a list of jobs <math>j=1,2,\ldots,n</math> with processing times <math>p_1,p_2,\ldots,p_n</math>;

for <math>j=1,2,\ldots,n</math>
  assign job <math>j</math> to the machine that currently has the smallest load;
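A Python sketch of the List algorithm, using a min-heap over machine loads to find the least-loaded machine (the heap is our implementation choice, not part of Graham's description):

```python
import heapq

def list_schedule(p, m):
    """Assign jobs in list order to the currently least-loaded machine.
    Returns (makespan, sigma), where sigma[j] is the machine of job j."""
    loads = [(0, i) for i in range(m)]      # (load, machine id) pairs
    heapq.heapify(loads)
    sigma = []
    for pj in p:
        load, i = heapq.heappop(loads)      # machine with the smallest load
        sigma.append(i)
        heapq.heappush(loads, (load + pj, i))
    return max(load for load, _ in loads), sigma
```

For example, `list_schedule([2, 2, 2, 3], 2)` yields makespan 5 with assignment `[0, 1, 0, 1]`, which here happens to match the optimum.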

In a scheduling language, the List algorithm can be more simply described as:

  • Whenever a machine becomes idle, it starts processing the next job on the list.

It is well known that the List algorithm has approximation ratio <math>\left(2-\frac{1}{m}\right)</math>.

Theorem
For every instance of scheduling <math>n</math> jobs with processing times <math>p_1,p_2,\ldots,p_n</math> on <math>m</math> parallel identical machines, the List algorithm finds a schedule with makespan <math>C_{\max}\le\left(2-\frac{1}{m}\right)\cdot OPT</math>, where <math>OPT</math> is the makespan of an optimal schedule.
Proof.

Obviously, for any schedule the makespan is at least the maximum processing time:

<math>OPT\ge\max_{1\le j\le n}p_j</math>,

and by the averaging principle, the makespan (maximum load) is at least the average load:

<math>OPT\ge\frac{1}{m}\sum_{j=1}^n p_j</math>.

Suppose that in the schedule given by the List algorithm, job <math>\ell</math> finishes last, so the makespan is <math>C_{\max}=C_\ell</math>, where <math>C_\ell</math> is the completion time of job <math>\ell</math>.

By the greediness of the List algorithm, at the moment job <math>\ell</math> is scheduled, the machine to which it is assigned has the smallest load. By the averaging principle:

<math>C_\ell-p_\ell\le\frac{1}{m}\sum_{j\neq\ell}p_j</math>.

On the other hand,

<math>p_\ell\le\max_{1\le j\le n}p_j</math>.

Together, we have

<math>C_{\max}=C_\ell\le\frac{1}{m}\sum_{j=1}^n p_j+\left(1-\frac{1}{m}\right)p_\ell\le\frac{1}{m}\sum_{j=1}^n p_j+\left(1-\frac{1}{m}\right)\max_{1\le j\le n}p_j\le\left(2-\frac{1}{m}\right)\cdot OPT</math>.

The analysis is tight: you can try to construct a family of instances on which List returns schedules with makespan at least <math>\left(2-\frac{1}{m}\right)\cdot OPT</math>.

Local Search

Local Search for Scheduling
start with an arbitrary schedule of <math>n</math> jobs to <math>m</math> machines;
while (true)
  let <math>\ell</math> denote the job that finishes last in the current schedule;
  if there is a machine <math>i</math> such that job <math>\ell</math> can finish earlier by moving to machine <math>i</math>
    job <math>\ell</math> moves to machine <math>i</math>;
  else break;
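A Python sketch of this local search (illustrative; as a simplification we treat any job on the currently most-loaded machine as the one finishing last, and move it to the least-loaded machine whenever that strictly improves its completion time):

```python
def local_search_schedule(p, m, assign):
    """Locally improve a schedule: while the last-finishing job can finish
    earlier on another machine, move it there. Returns (assign, makespan)."""
    loads = [0] * m
    for j, i in enumerate(assign):
        loads[i] += p[j]
    while True:
        i_max = max(range(m), key=lambda i: loads[i])   # busiest machine
        ell = next(j for j in range(len(p)) if assign[j] == i_max)
        i_min = min(range(m), key=lambda i: loads[i])   # least-loaded machine
        # moving job ell lets it finish at time loads[i_min] + p[ell]
        if loads[i_min] + p[ell] < loads[i_max]:
            loads[i_max] -= p[ell]
            loads[i_min] += p[ell]
            assign[ell] = i_min
        else:
            break
    return assign, max(loads)
```

Starting from the all-on-one-machine schedule `[0, 0, 0]` for `p = [3, 3, 3]` and `m = 2`, one move yields loads 6 and 3, after which no job can finish earlier, so the search stops with makespan 6.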

Longest Processing Time (LPT)

Online Algorithms and Competitive Ratio