高级算法 (Fall 2016)/Greedy and Local Search: Difference between revisions

From TCS Wiki
Jump to navigation Jump to search
imported>Etone
imported>Etone
No edit summary
Line 2: Line 2:


= Set cover =
= Set cover =
Given a number of subsets <math>S_1,S_2,\ldots,S_n\subseteq U</math> of a universe <math>U</math>, a <math>C\subseteq\{1,2,\ldots,n\}</math> forms a '''set cover''' if <math>U=\bigcup_{i\in\mathcal{C}}S_i</math>, that is, <math>\mathcal{C}</math> is a sub-collection of sets whose union "covers" all elements in the universe.
Given <math>m</math> subsets <math>S_1,S_2,\ldots,S_m\subseteq U</math> of a universe <math>U</math> of size <math>n=|U|</math>, a <math>C\subseteq\{1,2,\ldots,m\}</math> forms a '''set cover''' if <math>U=\bigcup_{i\in\mathcal{C}}S_i</math>, that is, <math>C</math> is a sub-collection of sets whose union "covers" all elements in the universe.  


This defines a natural optimization problem:
Without loss of generality, we always assume that the universe is <math>U==\bigcup_{i=1}^mS_i</math>.
 
This defines an important optimization problem:
{{Theorem|Set Cover Problem|
{{Theorem|Set Cover Problem|
*'''Input''': a number of sets <math>S_1,S_2,\ldots,S_n</math> with the universe <math>U=\bigcup_{i=1}^nS_i</math>;
*'''Input''': <math>m</math> subsets <math>S_1,S_2,\ldots,S_m\subseteq U</math> of a universe <math>U</math> of size <math>n</math>;
*'''Output''': the smallest <math>C\subseteq\{1,2,\ldots,n\}</math> such that <math>U=\bigcup_{i\in C}S_i</math>.
*'''Output''': the smallest <math>C\subseteq\{1,2,\ldots,m\}</math> such that <math>U=\bigcup_{i\in C}S_i</math>.
}}
}}


We can think of each instance as a bipartite graph <math>G(U,\{S_1,S_2,\ldots,S_n\}, E)</math> with elements <math>x\in U</math> of the universe on the left side, subsets <math>S_1,S_2,\ldots,S_n</math> on the right side, and there is a bipartite edge between element <math>x</math> and set <math>S_i</math> if and only if <math>x\in S_i</math>. By this translation the set cover problem is precisely the problem of given as input a bipartite graph <math>G(U,V,E)</math>, to find the smallest subset <math>C\subseteq V</math> of vertices on the right side to "cover" all vertices on the left side, i.e. every vertex on the left side <math>x\in U</math> is incident to some vertex in <math>C</math>.
We can think of each instance as a bipartite graph <math>G(U,\{S_1,S_2,\ldots,S_n\}, E)</math> with <math>n</math> vertices on the right side, each corresponding to an element <math>x\in U</math>, <math>m</math> vertices on the left side, each corresponding to one of the <math>m</math> subsets <math>S_1,S_2,\ldots,S_m</math>, and there is a bipartite edge connecting <math>x</math> with <math>S_i</math> if and only if <math>x\in S_i</math>. By this translation the set cover problem is precisely the problem of given as input a bipartite graph <math>G(U,V,E)</math>, to find the smallest subset <math>C\subseteq V</math> of vertices on the right side to "cover" all vertices on the left side, such that every vertex on the left side <math>x\in U</math> is incident to some vertex in <math>C</math>.


By alternating the roles of sets and elements in the above interpretation of set cover as bipartite cover, the set cover problem can be translated to the following equivalent hitting set problem.
By alternating the roles of sets and elements in the above interpretation of set cover instances as bipartite graphs, the set cover problem can be translated to the following equivalent hitting set problem:
{{Theorem|Hitting Set Problem|
{{Theorem|Hitting Set Problem|
*'''Input''': a number of sets <math>S_1,S_2,\ldots,S_m</math> with the universe <math>U=\bigcup_{i=1}^mS_i</math>;
*'''Input''': <math>n</math> subsets <math>S_1,S_2,\ldots,S_n\subseteq U</math> of a universe <math>U</math> of size <math>m</math>;
*'''Output''': the smallest subset <math>C\subseteq U</math> of elements such that <math>C</math> intersects with every set <math>S_i</math> for <math>1\le i\le m</math>.
*'''Output''': the smallest subset <math>C\subseteq U</math> of elements such that <math>C</math> intersects with every set <math>S_i</math> for <math>1\le i\le n</math>.
}}
}}


== Frequency and Vertex Cover==
== Frequency and Vertex Cover==
Given an instance of set cover problem <math>S_1,S_2,\ldots,S_n\subseteq U</math>, for every element <math>x\in U</math>, its '''frequency''' denoted as <math>frequency(x)</math> is defined as the number of sets containing <math>X</math>. Formally,
Given an instance of set cover problem <math>S_1,S_2,\ldots,S_m\subseteq U</math>, for every element <math>x\in U</math>, its '''frequency''', denoted as <math>frequency(x)</math>, is defined as the number of sets containing <math>X</math>. Formally,
:<math>frequency(x)=\{i\mid x\in S_i\}</math>.
:<math>frequency(x)=|\{i\mid x\in S_i\}|</math>.


In the hitting set version, the frequency should be defined on each set and for a set <math>S_i</math> its frequency <math>frequency(S_i)=|S_i|</math> is just the size of <math>|S_i|</math>.
In the hitting set version, the frequency should be defined for each set: for a set <math>S_i</math> its frequency <math>frequency(S_i)=|S_i|</math> is just the size of the set <math>S_i</math>.


The set cover problem restricted to the instances with frequency 2, becomes the vertex cover problem.  
The set cover problem restricted to the instances with frequency 2 becomes the vertex cover problem.  


Given an undirected graph <math>G(U,V)</math>, a '''vertex cover''' is a subset <math>C\subseteq V</math> of vertices such that every edge <math>uv\in E</math> has at least one endpoint in <math>C</math>.
Given an undirected graph <math>G(U,V)</math>, a '''vertex cover''' is a subset <math>C\subseteq V</math> of vertices such that every edge <math>uv\in E</math> has at least one endpoint in <math>C</math>.
Line 32: Line 34:
}}
}}
It is easy to compare with the hitting set problem:  
It is easy to compare with the hitting set problem:  
*For graph <math>G(V,E)</math>, its edges <math>e_1,e_2,\ldots,e_m\subseteq V</math> are vertex-sets of size 2.  
*For graph <math>G(V,E)</math>, its edges <math>e_1,e_2,\ldots,e_n\subseteq V</math> are vertex-sets of size 2.  
*A subset <math>C\subseteq V</math> of vertices is a vertex cover if and only if it is a hitting sets for <math>e_1,e_2,\ldots,e_m</math>, i.e. every <math>e_i</math> intersects with <math>C</math>.
*A subset <math>C\subseteq V</math> of vertices is a vertex cover if and only if it is a hitting sets for <math>e_1,e_2,\ldots,e_n</math>, i.e. every <math>e_i</math> intersects with <math>C</math>.
Therefore vertex cover is just set cover with frequency 2.
Therefore vertex cover is just set cover with frequency 2.


Line 41: Line 43:
We present our algorithms in the original set cover setting (instead of the hitting set version).
We present our algorithms in the original set cover setting (instead of the hitting set version).


A natural choice is the greedy algorithm: sequentially find the <math>S_i</math> that covers the largest number of ''currently uncovered'' elements, and add <math>i</math> to <math>C</math>, until no element is left uncovered.
A natural algorithm is the greedy algorithm: sequentially add such <math>i</math> to the cover <math>C</math>, where each <math>S_i</math> covers the largest number of ''currently uncovered'' elements, until no element is left uncovered.


{{Theorem|''GreedyCover''|
{{Theorem|''GreedyCover''|
:'''Input:''' sets <math>S_1,S_2,\ldots,S_n</math>;
:'''Input:''' sets <math>S_1,S_2,\ldots,S_m</math>;
----
----
:initially, <math>U=\bigcup_{i=1}^nS_i</math>, and <math>C=\emptyset</math>;
:initially, <math>U=\bigcup_{i=1}^mS_i</math>, and <math>C=\emptyset</math>;
:while <math>U\neq\emptyset</math> do
:while <math>U\neq\emptyset</math> do
::find <math>i\in\{1,2,\ldots, n\}</math> with the largest <math>|S_i\cap U|</math>;
::find <math>i\in\{1,2,\ldots, m\}</math> with the largest <math>|S_i\cap U|</math>;
::let <math>C=C\cup\{i\}</math> and <math>U=U\setminus S_i</math>;
::let <math>C=C\cup\{i\}</math> and <math>U=U\setminus S_i</math>;
:return <math>C</math>;
:return <math>C</math>;
Line 56: Line 58:


{{Theorem|Theorem|
{{Theorem|Theorem|
:
:For any set cover instance <math>S_1,S_2,\ldots,S_m\subseteq U</math> with optimal set cover of size <math>OPT</math>, the ''GreedyCover'' returns a set cover <math>C</math> that <math>\frac{|C|}{OPT}\le H_n</math>, where <math>n=|U|</math> is the size of the universe and <math>H_n</math> represents the <math>n</math>-th Harmonic number.
}}
}}


Line 63: Line 65:


{{Theorem|''DualCover''|
{{Theorem|''DualCover''|
:'''Input:''' sets <math>S_1,S_2,\ldots,S_n\subseteq U</math>;
:'''Input:''' sets <math>S_1,S_2,\ldots,S_m\subseteq U</math>;
----
----
:construct a ''maximal'' <math>M\subseteq U</math> such that <math>|S_i\cap M|\le 1</math> for all <math>i=1,2,\ldots, n</math>;
:construct a ''maximal'' <math>M\subseteq U</math> such that <math>|S_i\cap M|\le 1</math> for all <math>i=1,2,\ldots, m</math>;
:return <math>C=\{i\mid S_i\cap M\neq\}</math>
:return <math>C=\{i\mid S_i\cap M\neq\}</math>
}}
}}


= Scheduling =
= Scheduling =

Revision as of 11:18, 25 September 2016

Under construction. 

Set cover

Given [math]\displaystyle{ m }[/math] subsets [math]\displaystyle{ S_1,S_2,\ldots,S_m\subseteq U }[/math] of a universe [math]\displaystyle{ U }[/math] of size [math]\displaystyle{ n=|U| }[/math], a [math]\displaystyle{ C\subseteq\{1,2,\ldots,m\} }[/math] forms a set cover if [math]\displaystyle{ U=\bigcup_{i\in\mathcal{C}}S_i }[/math], that is, [math]\displaystyle{ C }[/math] is a sub-collection of sets whose union "covers" all elements in the universe.

Without loss of generality, we always assume that the universe is [math]\displaystyle{ U==\bigcup_{i=1}^mS_i }[/math].

This defines an important optimization problem:

Set Cover Problem
  • Input: [math]\displaystyle{ m }[/math] subsets [math]\displaystyle{ S_1,S_2,\ldots,S_m\subseteq U }[/math] of a universe [math]\displaystyle{ U }[/math] of size [math]\displaystyle{ n }[/math];
  • Output: the smallest [math]\displaystyle{ C\subseteq\{1,2,\ldots,m\} }[/math] such that [math]\displaystyle{ U=\bigcup_{i\in C}S_i }[/math].

We can think of each instance as a bipartite graph [math]\displaystyle{ G(U,\{S_1,S_2,\ldots,S_n\}, E) }[/math] with [math]\displaystyle{ n }[/math] vertices on the right side, each corresponding to an element [math]\displaystyle{ x\in U }[/math], [math]\displaystyle{ m }[/math] vertices on the left side, each corresponding to one of the [math]\displaystyle{ m }[/math] subsets [math]\displaystyle{ S_1,S_2,\ldots,S_m }[/math], and there is a bipartite edge connecting [math]\displaystyle{ x }[/math] with [math]\displaystyle{ S_i }[/math] if and only if [math]\displaystyle{ x\in S_i }[/math]. By this translation the set cover problem is precisely the problem of given as input a bipartite graph [math]\displaystyle{ G(U,V,E) }[/math], to find the smallest subset [math]\displaystyle{ C\subseteq V }[/math] of vertices on the right side to "cover" all vertices on the left side, such that every vertex on the left side [math]\displaystyle{ x\in U }[/math] is incident to some vertex in [math]\displaystyle{ C }[/math].

By alternating the roles of sets and elements in the above interpretation of set cover instances as bipartite graphs, the set cover problem can be translated to the following equivalent hitting set problem:

Hitting Set Problem
  • Input: [math]\displaystyle{ n }[/math] subsets [math]\displaystyle{ S_1,S_2,\ldots,S_n\subseteq U }[/math] of a universe [math]\displaystyle{ U }[/math] of size [math]\displaystyle{ m }[/math];
  • Output: the smallest subset [math]\displaystyle{ C\subseteq U }[/math] of elements such that [math]\displaystyle{ C }[/math] intersects with every set [math]\displaystyle{ S_i }[/math] for [math]\displaystyle{ 1\le i\le n }[/math].

Frequency and Vertex Cover

Given an instance of set cover problem [math]\displaystyle{ S_1,S_2,\ldots,S_m\subseteq U }[/math], for every element [math]\displaystyle{ x\in U }[/math], its frequency, denoted as [math]\displaystyle{ frequency(x) }[/math], is defined as the number of sets containing [math]\displaystyle{ X }[/math]. Formally,

[math]\displaystyle{ frequency(x)=|\{i\mid x\in S_i\}| }[/math].

In the hitting set version, the frequency should be defined for each set: for a set [math]\displaystyle{ S_i }[/math] its frequency [math]\displaystyle{ frequency(S_i)=|S_i| }[/math] is just the size of the set [math]\displaystyle{ S_i }[/math].

The set cover problem restricted to the instances with frequency 2 becomes the vertex cover problem.

Given an undirected graph [math]\displaystyle{ G(U,V) }[/math], a vertex cover is a subset [math]\displaystyle{ C\subseteq V }[/math] of vertices such that every edge [math]\displaystyle{ uv\in E }[/math] has at least one endpoint in [math]\displaystyle{ C }[/math].

Vertex Cover Problem
  • Input: an undirected graph [math]\displaystyle{ G(V,E) }[/math]
  • Output: the smallest [math]\displaystyle{ C\subseteq V }[/math] such that every edge [math]\displaystyle{ e\in E }[/math] is incident to at least one vertex in [math]\displaystyle{ C }[/math].

It is easy to compare with the hitting set problem:

  • For graph [math]\displaystyle{ G(V,E) }[/math], its edges [math]\displaystyle{ e_1,e_2,\ldots,e_n\subseteq V }[/math] are vertex-sets of size 2.
  • A subset [math]\displaystyle{ C\subseteq V }[/math] of vertices is a vertex cover if and only if it is a hitting sets for [math]\displaystyle{ e_1,e_2,\ldots,e_n }[/math], i.e. every [math]\displaystyle{ e_i }[/math] intersects with [math]\displaystyle{ C }[/math].

Therefore vertex cover is just set cover with frequency 2.

The vertex cover problem is NP-hard. Its decision version is among Karp's 21 NP-complete problems. Since vertex cover is a special case of set cover, the set cover problem is also NP-hard.

Greedy Algorithm for Set Cover

We present our algorithms in the original set cover setting (instead of the hitting set version).

A natural algorithm is the greedy algorithm: sequentially add such [math]\displaystyle{ i }[/math] to the cover [math]\displaystyle{ C }[/math], where each [math]\displaystyle{ S_i }[/math] covers the largest number of currently uncovered elements, until no element is left uncovered.

GreedyCover
Input: sets [math]\displaystyle{ S_1,S_2,\ldots,S_m }[/math];

initially, [math]\displaystyle{ U=\bigcup_{i=1}^mS_i }[/math], and [math]\displaystyle{ C=\emptyset }[/math];
while [math]\displaystyle{ U\neq\emptyset }[/math] do
find [math]\displaystyle{ i\in\{1,2,\ldots, m\} }[/math] with the largest [math]\displaystyle{ |S_i\cap U| }[/math];
let [math]\displaystyle{ C=C\cup\{i\} }[/math] and [math]\displaystyle{ U=U\setminus S_i }[/math];
return [math]\displaystyle{ C }[/math];

Obviously the algorithm runs in polynomial time. We will show that it has approximation ratio [math]\displaystyle{ O(\ln n) }[/math].

Theorem
For any set cover instance [math]\displaystyle{ S_1,S_2,\ldots,S_m\subseteq U }[/math] with optimal set cover of size [math]\displaystyle{ OPT }[/math], the GreedyCover returns a set cover [math]\displaystyle{ C }[/math] that [math]\displaystyle{ \frac{|C|}{OPT}\le H_n }[/math], where [math]\displaystyle{ n=|U| }[/math] is the size of the universe and [math]\displaystyle{ H_n }[/math] represents the [math]\displaystyle{ n }[/math]-th Harmonic number.

Primal-Dual Greedy Algorithm for Set Cover

DualCover
Input: sets [math]\displaystyle{ S_1,S_2,\ldots,S_m\subseteq U }[/math];

construct a maximal [math]\displaystyle{ M\subseteq U }[/math] such that [math]\displaystyle{ |S_i\cap M|\le 1 }[/math] for all [math]\displaystyle{ i=1,2,\ldots, m }[/math];
return [math]\displaystyle{ C=\{i\mid S_i\cap M\neq\} }[/math]

Scheduling