Combinatorics (组合数学, Fall 2011): Optimization
== Duality ==


Consider the following LP:
:<math>
\begin{align}
\text{minimize} && 7x_1+x_2+5x_3\\
\text{subject to}  &&
x_1-x_2+3x_3 &\ge 10\\
&&
5x_1+2x_2-x_3 &\ge 6\\
&& x_1,x_2,x_3 &\ge 0
\end{align}
</math>
Let <math>OPT</math> be the value of the optimal solution. We want to estimate upper and lower bounds on <math>OPT</math>.
Since <math>OPT</math> is the minimum over the feasible set, every feasible solution yields an upper bound on <math>OPT</math>. For example, <math>\boldsymbol{x}=(2,1,3)</math> is a feasible solution, thus <math>OPT\le 7\cdot 2+1+5\cdot 3=30</math>.
For the lower bound, the optimal solution must satisfy the two constraints:
:<math>
\begin{align}
x_1-x_2+3x_3 &\ge 10,\\
5x_1+2x_2-x_3 &\ge 6.\\
\end{align}
</math>
Since the <math>x_i</math>'s are restricted to be nonnegative, term-by-term comparison of coefficients shows that
:<math>7x_1+x_2+5x_3\ge(x_1-x_2+3x_3)+(5x_1+2x_2-x_3)\ge 16.</math>
The idea behind this lower bound process is that we are finding suitable nonnegative multipliers (in the above case the multipliers are all 1s) for the constraints so that when we take their sum, the coefficient of each <math>x_i</math> in the sum is dominated by the coefficient in the objective function. It is important to ensure that the multipliers are nonnegative, so they do not reverse the direction of the constraint inequality.
To find the best lower bound, we need to choose the multipliers in such a way that the sum is as large as possible. Interestingly, the problem of finding the best lower bound can be formulated as another LP:
::<math>
\begin{align}
\text{maximize} && 10y_1+6y_2\\
\text{subject to}  &&
y_1+5y_2 &\le 7\\
&&
-y_1+2y_2 &\le 1\\
&&3y_1-y_2 &\le 5\\
&& y_1,y_2&\ge 0
\end{align}
</math>
Here <math>y_1</math> and <math>y_2</math> were chosen to be nonnegative multipliers for the first and the second constraint, respectively. We call the first LP the '''primal program''' and the second LP the '''dual program'''. By definition, every feasible solution to the dual program gives a lower bound for the primal program.
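The bound can be made concrete with a small numeric check. The candidate solutions <math>\boldsymbol{x}^*=(7/4,0,11/4)</math> and <math>\boldsymbol{y}^*=(2,1)</math> below are supplied for illustration (they are not derived in the text); verifying that both are feasible and have equal objective values certifies, by the lower-bound argument above, that both are optimal:

```python
# Check a primal-dual pair for the LP above. The candidates x_star and
# y_star are supplied for illustration, not derived here.
c = [7, 1, 5]                  # primal objective
A = [[1, -1, 3], [5, 2, -1]]   # constraint rows (A x >= b)
b = [10, 6]

x_star = [7/4, 0, 11/4]        # candidate primal solution
y_star = [2, 1]                # candidate dual solution (multipliers)

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

# primal feasibility: A x >= b and x >= 0
primal_ok = all(dot(row, x_star) >= bi for row, bi in zip(A, b)) and min(x_star) >= 0
# dual feasibility: A^T y <= c and y >= 0 (columns of A are rows of A^T)
dual_ok = all(dot(col, y_star) <= cj for col, cj in zip(zip(*A), c)) and min(y_star) >= 0

primal_val = dot(c, x_star)
dual_val = dot(b, y_star)
```

Since every dual feasible value is a lower bound for every primal feasible value, the equal objective values (both 26) imply <math>OPT=26</math>.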
=== LP duality ===
Given an LP in canonical form, called the '''primal''' LP:
:<math>
\begin{align}
\text{minimize} && \boldsymbol{c}^T\boldsymbol{x}\\
\text{subject to} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
the '''dual''' LP is defined as follows:
:<math>
\begin{align}
\text{maximize} && \boldsymbol{b}^T\boldsymbol{y}\\
\text{subject to} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
</math>
We now give some examples.
;Surviving problem (diet problem)
Consider the surviving problem. Suppose there are <math>n</math> types of natural food, each containing some of <math>m</math> types of vitamins. The <math>j</math>th food contains <math>a_{ij}</math> units of vitamin <math>i</math>, and its price is <math>c_j</math>. To stay healthy, we must consume at least <math>b_i</math> units of vitamin <math>i</math> for each <math>1\le i\le m</math>. We want to minimize the total cost of the food while staying healthy. The problem can be formalized as the following LP:
:<math>
\begin{align}
\text{minimize} \quad& c_1x_1+c_2x_2+\cdots+c_nx_n\\
\begin{align}
\text{subject to} \\
\\
\end{align}
\quad &
\begin{align} a_{i1}x_{1}+a_{i2}x_{2}+\cdots+a_{in}x_{n} &\ge b_{i} &\quad& \forall 1\le i\le m\\
x_{j}&\ge 0 &\quad& \forall 1\le j\le n
\end{align}
\end{align}
</math>
The dual LP is
:<math>
\begin{align}
\text{maximize} \quad& b_1y_1+b_2y_2+\cdots+b_my_m\\
\begin{align}
\text{subject to} \\
\\
\end{align}
\quad &
\begin{align} a_{1j}y_{1}+a_{2j}y_{2}+\cdots+a_{mj}y_{m} &\le c_{j} &\quad& \forall 1\le j\le n\\
y_{i}&\ge 0 &\quad& \forall 1\le i\le m
\end{align}
\end{align}
</math>
The problem can be interpreted as follows: a food company produces <math>m</math> types of vitamin pills. The company wants to design a pricing system such that
* Vitamin <math>i</math> has a nonnegative price <math>y_i</math>.
* The pricing is competitive with every natural food: a customer cannot replace the pills by any natural food for a cheaper price, that is, <math>\sum_{i=1}^m a_{ij}y_i\le c_j</math> for every <math>1\le j\le n</math>.
* The company maximizes its revenue, assuming that the customer buys exactly the necessary amount of each vitamin (<math>b_i</math> units of vitamin <math>i</math>).
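This pricing interpretation can be checked on a toy instance. All numbers below (2 foods, 2 vitamins, and the candidate diet and prices) are made up for illustration; the check confirms that the pill company's best revenue matches the consumer's cheapest adequate diet:

```python
from fractions import Fraction

# Toy diet instance (illustrative numbers): constraints sum_j a[i][j] x_j >= b[i].
c = [3, 2]                       # food prices
a = [[2, 1], [1, 2]]             # a[i][j]: units of vitamin i in food j
b = [8, 7]                       # required units of each vitamin

x_star = [3, 2]                              # candidate cheapest diet
y_star = [Fraction(4, 3), Fraction(1, 3)]    # candidate vitamin prices

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

feeds = all(dot(row, x_star) >= bi for row, bi in zip(a, b))    # diet is adequate
competitive = all(dot(col, y_star) <= cj                        # pills undercut every food
                  for col, cj in zip(zip(*a), c))

cost = dot(c, x_star)       # consumer's food bill
revenue = dot(b, y_star)    # pill company's revenue
```

Here both the cost and the revenue equal 13, illustrating that the two sides of the duality meet at the optimum.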
;Maximum flow problem
In the last lecture, we defined the maximum flow problem, whose LP is
:<math>
\begin{align}
\text{maximize} \quad& \sum_{v:(s,v)\in E}f_{sv}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\
\sum_{u:(u,v)\in E}f_{uv}-\sum_{w:(v,w)\in E}f_{vw} &=0 &\quad& \forall v\in V\setminus\{s,t\}\\
f_{uv}&\ge 0 &\quad& \forall (u,v)\in E
\end{align}
\end{align}
</math>
where directed graph <math>G(V,E)</math> is the flow network, <math>s\in V</math> is the source, <math>t\in V</math> is the sink, and <math>c_{uv}</math> is the capacity of directed edge <math>(u,v)\in E</math>.
We add to <math>E</math> a new edge from <math>t</math> to <math>s</math> with capacity <math>c_{ts}=\infty</math>. Let <math>E'</math> be the new edge set. The LP for the max-flow problem can be rewritten as:
:<math>
\begin{align}
\text{maximize} \quad& f_{ts}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\
\sum_{u:(u,v)\in E'}f_{uv}-\sum_{w:(v,w)\in E'}f_{vw} &\le0 &\quad& \forall v\in V\\
f_{uv}&\ge 0 &\quad& \forall (u,v)\in E'
\end{align}
\end{align}
</math>
The second set of inequalities seems weaker than the original flow-conservation constraints. However, summing the left-hand sides over all <math>v\in V</math> gives zero (each <math>f_{uv}</math> appears once with a plus sign and once with a minus sign), so if the inequality holds at every node it must in fact hold with equality at every node, which implies flow conservation.
To obtain the dual program we introduce variables <math>d_{uv}</math> and <math>p_v</math> corresponding to the two types of inequalities in the primal. The dual LP is:
:<math>
\begin{align}
\text{minimize} \quad& \sum_{(u,v)\in E}c_{uv}d_{uv}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} d_{uv}-p_u+p_v &\ge 0 &\quad& \forall (u,v)\in E\\
p_s-p_t &\ge1 \\
d_{uv} &\ge 0 &\quad& \forall (u,v)\in E\\
p_v&\ge 0 &\quad& \forall v\in V
\end{align}
\end{align}
</math>
It is more helpful to consider its integer version:
:<math>
\begin{align}
\text{minimize} \quad& \sum_{(u,v)\in E}c_{uv}d_{uv}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align} d_{uv}-p_u+p_v &\ge 0 &\quad& \forall (u,v)\in E\\
p_s-p_t &\ge1 \\
d_{uv} &\in\{0,1\} &\quad& \forall (u,v)\in E\\
p_v&\in\{0,1\} &\quad& \forall v\in V
\end{align}
\end{align}
</math>
From the last lecture, we know that the constraint matrix of the LP for max-flow is totally unimodular, and the same holds for this dual LP; therefore the optimal solutions to the integer program are also optimal solutions to the LP.
The variables <math>p_v</math> define a bipartition of the vertex set <math>V</math>. Let <math>S=\{v\in V\mid p_v=1\}</math> and <math>\bar{S}=\{v\in V\mid p_v=0\}</math>.
For 0/1-valued variables, the only way to satisfy <math>p_s-p_t\ge1</math> is to have <math>p_s=1</math> and <math>p_t=0</math>. Therefore, <math>(S,\bar{S})</math> is an <math>s</math>-<math>t</math> cut.
In an optimal solution, <math>d_{uv}=1</math> if and only if <math>u\in S,v\in\bar{S}</math> and <math>(u,v)\in E</math>. Therefore, the objective function of an optimal solution <math>\sum_{u\in S,v\not\in S\atop (u,v)\in E}c_{uv}</math> is the capacity of the minimum <math>s</math>-<math>t</math> cut <math>(S,\bar{S})</math>.
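The max-flow min-cut correspondence can be verified directly on a small example. The network below is made up for illustration, and the augmenting-path routine is a standard Edmonds-Karp sketch (shortest augmenting paths in the residual graph), used here only as a checking device, not as the lecture's formulation:

```python
from collections import deque
from itertools import combinations

# Illustrative network: capacities on directed edges.
cap = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 'b'): 1, ('a', 't'): 2, ('b', 't'): 3}
nodes = ['s', 'a', 'b', 't']

def max_flow(cap, s, t):
    # Edmonds-Karp: repeatedly augment along a shortest s-t path
    # in the residual graph until none remains.
    res = dict(cap)                  # residual capacities
    for (u, v) in list(cap):
        res.setdefault((v, u), 0)    # reverse residual edges
    flow = 0
    while True:
        parent = {s: None}           # BFS for an augmenting path
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for (x, y), r in res.items():
                if x == u and r > 0 and y not in parent:
                    parent[y] = u
                    q.append(y)
        if t not in parent:
            return flow
        path, v = [], t              # recover the path and its bottleneck
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[e] for e in path)
        for (u, v) in path:
            res[(u, v)] -= aug
            res[(v, u)] += aug
        flow += aug

def min_cut(cap, nodes, s, t):
    # Brute-force over all vertex sets S with s in S and t outside S.
    best = float('inf')
    others = [v for v in nodes if v not in (s, t)]
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            S = {s} | set(extra)
            best = min(best, sum(w for (u, v), w in cap.items()
                                 if u in S and v not in S))
    return best
```

On this network both routines return 5, as the duality predicts.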
=== Duality theorems ===
Let the primal LP be:
:<math>
\begin{align}
\text{minimize} && \boldsymbol{c}^T\boldsymbol{x}\\
\text{subject to} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
Its dual LP is:
:<math>
\begin{align}
\text{maximize} && \boldsymbol{b}^T\boldsymbol{y}\\
\text{subject to} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
</math>
{{Theorem|Theorem|
: The dual of a dual is the primal.
}}
{{Proof|
The dual program can be written as the following minimization in canonical form:
:<math>
\begin{align}
\min && -\boldsymbol{b}^T\boldsymbol{y}\\
\text{s.t.} &&
-A^T\boldsymbol{y} &\ge-\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
</math>
Its dual is:
:<math>
\begin{align}
\max && -\boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
-A\boldsymbol{x} &\le-\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
which is equivalent to the primal:
:<math>
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
}}
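The proof above can be mechanized. The sketch below writes the dual of a canonical-form minimization <math>(\boldsymbol{c},A,\boldsymbol{b})</math> again as a canonical-form minimization <math>(-\boldsymbol{b},-A^T,-\boldsymbol{c})</math>, exactly as in the proof, and checks on the data of the Duality section that applying the operation twice returns the original LP:

```python
# Represent a canonical-form LP  min c^T x  s.t.  A x >= b, x >= 0
# as the triple (c, A, b); its dual, rewritten in the same form,
# is (-b, -A^T, -c), as in the proof above.
def negate(v):
    return [-x for x in v]

def neg_transpose(A):
    # returns -A^T as a list of rows
    return [[-A[i][j] for i in range(len(A))] for j in range(len(A[0]))]

def dual_canonical(c, A, b):
    return negate(b), neg_transpose(A), negate(c)

c = [7, 1, 5]
A = [[1, -1, 3], [5, 2, -1]]
b = [10, 6]
c2, A2, b2 = dual_canonical(*dual_canonical(c, A, b))   # dual of the dual
```

The round trip returns <math>(\boldsymbol{c},A,\boldsymbol{b})</math> unchanged: the dual of the dual is the primal.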
We have shown that feasible solutions of a dual program can be used to lower bound the optimum of the primal program. This is formalized by the following important theorem.
{{Theorem|Theorem (Weak duality theorem)|
:If there exists an optimal solution to the primal LP:
::<math>
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
:then,
::<math>
\begin{align}
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
&\begin{align}
\ge\\
\\
\\
\end{align}&\quad
\begin{align}
\max && \boldsymbol{b}^T\boldsymbol{y}\\
\text{s.t.} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
\end{align}
</math>
}}
{{proof|
Let <math>\boldsymbol{x}</math> be an arbitrary feasible solution to the primal LP, and <math>\boldsymbol{y}</math> be an arbitrary feasible solution to the dual LP.
We estimate <math>\boldsymbol{y}^TA\boldsymbol{x}</math> in two ways. Recall that <math>A\boldsymbol{x} \ge\boldsymbol{b}</math> and <math>A^T\boldsymbol{y} \le\boldsymbol{c}</math>, thus
:<math>\boldsymbol{y}^T\boldsymbol{b}\le\boldsymbol{y}^TA\boldsymbol{x}\le\boldsymbol{c}^T\boldsymbol{x}</math>.
Since this holds for any feasible solutions, it must also hold for the optimal solutions.
}}
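The chain of inequalities in the proof can be illustrated numerically with the LP from the Duality section (coefficients restated below) and a deliberately non-optimal feasible pair:

```python
# Illustrate y^T b <= y^T A x <= c^T x on a non-optimal feasible pair.
c = [7, 1, 5]
A = [[1, -1, 3], [5, 2, -1]]
b = [10, 6]
x = [2, 1, 3]    # feasible for the primal
y = [1, 1]       # feasible for the dual

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

Ax = [dot(row, x) for row in A]       # A x, entrywise >= b
yAx = dot(y, Ax)                      # y^T A x, the middle quantity
lower, upper = dot(y, b), dot(c, x)   # y^T b and c^T x
```

Here the chain reads <math>16\le 19\le 30</math>: the dual value lower-bounds the primal value, with slack because neither solution is optimal.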
A beautiful and harmonious result is that the optima of the primal LP and its dual are always equal. This is called the strong duality theorem of linear programming.
{{Theorem|Theorem (Strong duality theorem)|
:If there exists an optimal solution to the primal LP:
::<math>
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
</math>
:then,
::<math>
\begin{align}
\begin{align}
\min && \boldsymbol{c}^T\boldsymbol{x}\\
\text{s.t.} &&
A\boldsymbol{x} &\ge\boldsymbol{b}\\
&& \boldsymbol{x} &\ge \boldsymbol{0}
\end{align}
&\begin{align}
=\\
\\
\\
\end{align}&\quad
\begin{align}
\max && \boldsymbol{b}^T\boldsymbol{y}\\
\text{s.t.} &&
A^T\boldsymbol{y} &\le\boldsymbol{c}\\
&& \boldsymbol{y} &\ge \boldsymbol{0}
\end{align}
\end{align}
</math>
}}
== Unimodularity ==
=== Integer Programming===
Consider the '''maximum integral flow''' problem: given as input a flow network <math>(G(V,E),c,s,t)</math> where every capacity <math>c_{uv}</math>, <math>(u,v)\in E</math>, is an integer, we want to find an integral flow <math>f:E\rightarrow\mathbb{Z}</math> of maximum value.
The mathematical programming for the problem is:
:<math>
\begin{align}
\text{maximize} \quad& \sum_{v:(s,v)\in E}f_{sv}\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\\
\end{align}
\quad &
\begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\
\sum_{u:(u,v)\in E}f_{uv}-\sum_{w:(v,w)\in E}f_{vw} &=0 &\quad& \forall v\in V\setminus\{s,t\}\\
f_{uv}&\in\mathbb{N} &\quad& \forall (u,v)\in E
\end{align}
\end{align}
</math>
where <math>\mathbb{N}</math> is the set of all nonnegative integers. Compared to the LP for the max-flow problem, we have simply replaced the last line <math> f_{uv}\ge 0</math> with <math> f_{uv}\in\mathbb{N}</math>. The resulting optimization problem is called an '''integer program (IP)''', or more specifically an '''integer linear program (ILP)'''.
Due to the Flow Integrality Theorem, when capacities are integers, there must be an integral flow whose value is maximum among all flows (integral or not). This means the above IP can be efficiently solved by solving its LP-relaxation. This is usually impossible for general IPs.
Generally, an IP of canonical form is written as
:<math>
\begin{align}
\text{maximize} \quad& \boldsymbol{c}^T\boldsymbol{x}\\
\begin{align}
\text{subject to} \\
\\
\\
\end{align}
\quad &
\begin{align}
A\boldsymbol{x} &\ge\boldsymbol{b} \\
\boldsymbol{x}&\ge \boldsymbol{0}\\
\boldsymbol{x}&\in\mathbb{Z}^n
\end{align}
\end{align}
</math>
Consider the '''3SAT''' problem. Each instance is a '''3CNF (conjunctive normal form)''' formula <math>\bigwedge_{i=1}^m(\ell_{i_1}\vee\ell_{i_2}\vee\ell_{i_3})</math>, where each <math>(\ell_{i_1}\vee\ell_{i_2}\vee\ell_{i_3})</math> is a '''clause''' and each <math>\ell_{i_r}\in\{x_{j},\neg x_{j}\mid 1\le j\le n\}</math>, called a '''literal''', is either a boolean variable or the negation of a boolean variable. We want to determine whether there exists a truth assignment of the <math>n</math> boolean variables <math>x_1,\ldots,x_n</math> such that the input formula is satisfied (i.e., is true).
The following IP solves 3SAT, in the sense that the formula is satisfiable if and only if the optimum of the IP equals <math>m</math>:
:<math>
\begin{align}
\text{maximize} \quad& \sum_{i=1}^mz_i\\
\begin{align}
\text{subject to} \\
\\
\\
\\
\end{align}
\quad &
\begin{align}
z_i &\le y_{i_1}+y_{i_2}+y_{i_3} &\quad& \forall 1\le i\le m\\
y_{i_r}&\le x_{j} &\quad& \text{if }\ell_{i_r}=x_{j} \\
y_{i_r}&\le 1-x_{j} &\quad& \text{if }\ell_{i_r}=\neg x_{j} \\
y_{i_r},z_i, x_j&\in\{0,1\} &\quad& \forall 1\le i\le m, 1\le r\le 3, 1\le j\le n
\end{align}
\end{align}
</math>
Since 3SAT is NP-hard (it was among the very first problems shown to be NP-hard), integer programming in general is NP-hard.
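The connection between the IP and satisfiability can be checked by brute force on tiny instances. The two formulas below are made up for illustration: for each 0/1 assignment of the <math>x_j</math>, the best choice of the auxiliary variables sets <math>z_i=1</math> exactly when clause <math>i</math> is satisfied, so the IP optimum equals the maximum number of simultaneously satisfiable clauses:

```python
from itertools import product

# Encode a literal as +j for x_j and -j for its negation (variables from 1).
sat_formula = [(1, -2, 3), (-1, 2, 3)]      # satisfiable 3CNF
unsat_formula = [(1, 1, 1), (-1, -1, -1)]   # essentially (x1) and (not x1)

def ip_optimum(clauses, n):
    # Brute-force the 0/1 IP: for a fixed assignment x, the optimal
    # auxiliary variables make the objective count the satisfied clauses.
    best = 0
    for x in product((0, 1), repeat=n):
        sat = sum(1 for cl in clauses
                  if any((l > 0 and x[l - 1] == 1) or (l < 0 and x[-l - 1] == 0)
                         for l in cl))
        best = max(best, sat)
    return best
```

For the first formula the optimum equals the number of clauses, so it is satisfiable; for the second it falls short, so it is not.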
=== Integrality of polytopes ===
A point in an <math>n</math>-dimensional space is integral if it belongs to <math>\mathbb{Z}^n</math>, i.e., if all its coordinates are integers.
A polyhedron is said to be '''integral''' if all its vertices are integral.
An easy observation is that an integer program has the same optimum as its LP-relaxation whenever the polyhedron defined by the LP-relaxation is integral.
{{Theorem|Theorem (Hoffman 1974)|
: If a polyhedron <math>P</math> is integral then for all integer vectors <math>\boldsymbol{c}</math> there is an optimal solution to <math>\max\{\boldsymbol{c}^T\boldsymbol{x}\mid \boldsymbol{x}\in P\}</math> which is integral.
}}
{{Proof|
There always exists an optimal solution which is a vertex in <math>P</math>. For integral <math>P</math>, all vertices are integral.
}}
=== Unimodularity and total unimodularity ===
{{Theorem|Definition (Unimodularity)|
:An <math>n\times n</math> integer matrix <math>A</math> is called '''unimodular''' if <math>\det(A)=\pm1</math>.
:An <math>m\times n</math> integer matrix <math>A</math> is called '''totally unimodular''' if every square submatrix <math>B</math> of <math>A</math> has <math>\det(B)\in\{1,-1,0\}</math>, that is, every square, nonsingular submatrix of <math>A</math> is unimodular.
}}
A totally unimodular matrix defines an integral convex polyhedron.
{{Theorem|Theorem|
:Let <math>A</math> be an <math>m\times n</math> integer matrix.
:If <math>A</math> is totally unimodular, then for any integer vector <math>\boldsymbol{b}\in\mathbb{Z}^m</math> the polyhedron <math>\{\boldsymbol{x}\in\mathbb{R}^n\mid A\boldsymbol{x}=\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\}</math> is integral.
}}
{{Proof|
Let <math>B</math> be a basis of <math>A</math>, i.e., a nonsingular <math>m\times m</math> submatrix. The corresponding basic solution consists of <math>B^{-1}\boldsymbol{b}</math> in the basic coordinates and zeros elsewhere. Since <math>A</math> is totally unimodular and <math>B</math> is nonsingular, <math>\det(B)=\pm1</math>. By [http://en.wikipedia.org/wiki/Cramer's_rule Cramer's rule], <math>B^{-1}</math> has integer entries, thus <math>B^{-1}\boldsymbol{b}</math> is integral. Therefore, every basic solution of <math>A\boldsymbol{x}=\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}</math> is integral, and since every vertex is a basic solution, the polyhedron <math>\{\boldsymbol{x}\in\mathbb{R}^n\mid A\boldsymbol{x}=\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\}</math> is integral.
}}
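The definition of total unimodularity can be checked mechanically on small matrices. The brute-force sketch below enumerates every square submatrix, which is exponential and only suitable for tiny examples; the incidence matrix of a directed graph is a classic totally unimodular instance to test it on:

```python
from itertools import combinations

def det(M):
    # integer determinant by cofactor expansion along the first row
    # (exponential, but exact and fine for the small matrices used here)
    n = len(M)
    if n == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j]
               * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(n))

def is_totally_unimodular(A):
    # check det(B) in {-1, 0, 1} for every square submatrix B of A
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                B = [[A[i][j] for j in cols] for i in rows]
                if det(B) not in (-1, 0, 1):
                    return False
    return True

# incidence matrix of the directed triangle a->b, b->c, c->a
# (+1 at the tail of each edge, -1 at its head)
triangle = [[1, 0, -1],
            [-1, 1, 0],
            [0, -1, 1]]
```

The triangle's incidence matrix passes the check, while a matrix containing a <math>2\times 2</math> submatrix of determinant 2, such as <math>\begin{bmatrix}1&1\\-1&1\end{bmatrix}</math>, fails it.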
Our next result is the famous Hoffman-Kruskal theorem.
{{Theorem|Theorem (Hoffman-Kruskal 1956)|
:Let <math>A</math> be an <math>m\times n</math> integer matrix.
:If <math>A</math> is totally unimodular, then for any integer vector <math>\boldsymbol{b}\in\mathbb{Z}^m</math> the polyhedron <math>\{\boldsymbol{x}\in\mathbb{R}^n\mid A\boldsymbol{x}\ge\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\}</math> is integral.
}}
{{Proof|
Let <math>A'=\begin{bmatrix}A & -I\end{bmatrix}</math>. We claim that <math>A'</math> is also totally unimodular. Any square submatrix <math>B</math> of <math>A'</math> can be written in the following form after row and column permutations:
:<math>B=\begin{bmatrix}
C & 0\\
D & I
\end{bmatrix}</math>
where <math>C</math> is a square submatrix of <math>A</math> and <math>I</math> is an identity matrix. Therefore,
:<math>\det(B)=\pm\det(C)\in\{1,-1,0\}</math>,
thus <math>A'</math> is totally unimodular.
Add slack variables to transform the constraints into the standard form <math>A'\boldsymbol{z}=\boldsymbol{b},\boldsymbol{z}\ge\boldsymbol{0}</math>. The polyhedron <math>\{\boldsymbol{x}\mid A\boldsymbol{x}\ge\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\}</math> is integral if the polyhedron <math>\{\boldsymbol{z}\mid A'\boldsymbol{z}=\boldsymbol{b}, \boldsymbol{z}\ge \boldsymbol{0}\}</math> is, which follows from the total unimodularity of <math>A'</math> and the previous theorem.
}}

Revision as of 03:41, 17 August 2011

Duality

Consider the following LP:

[math]\displaystyle{ \begin{align} \text{minimize} && 7x_1+x_2+5x_3\\ \text{subject to} && x_1-x_2+3x_3 &\ge 10\\ && 5x_1-2x_2-x_3 &\ge 6\\ && x_1,x_2,x_3 &\ge 0 \end{align} }[/math]

Let [math]\displaystyle{ OPT }[/math] be the value of the optimal solution. We want to estimate the upper and lower bound of [math]\displaystyle{ OPT }[/math].

Since [math]\displaystyle{ OPT }[/math] is the minimum over the feasible set, every feasible solution forms an upper bound for [math]\displaystyle{ OPT }[/math]. For example [math]\displaystyle{ \boldsymbol{x}=(2,1,3) }[/math] is a feasible solution, thus [math]\displaystyle{ OPT\le 7\cdot 2+1+5\cdot 3=30 }[/math].

For the lower bound, the optimal solution must satisfy the two constraints:

[math]\displaystyle{ \begin{align} x_1-x_2+3x_3 &\ge 10,\\ 5x_1-2x_2-x_3 &\ge 6.\\ \end{align} }[/math]

Since the [math]\displaystyle{ x_i }[/math]'s are restricted to be nonnegative, term-by-term comparison of coefficients shows that

[math]\displaystyle{ 7x_1+x_2+5x_3\ge(x_1-x_2+3x_3)+(5x_1-2x_2-x_3)\ge 16. }[/math]

The idea behind this lower bound process is that we are finding suitable nonnegative multipliers (in the above case the multipliers are all 1s) for the constraints so that when we take their sum, the coefficient of each [math]\displaystyle{ x_i }[/math] in the sum is dominated by the coefficient in the objective function. It is important to ensure that the multipliers are nonnegative, so they do not reverse the direction of the constraint inequality.

To find the best lower bound, we need to choose the multipliers in such a way that the sum is as large as possible. Interestingly, the problem of finding the best lower bound can be formulated as another LP:

[math]\displaystyle{ \begin{align} \text{maximize} && 10y_1+6y_2\\ \text{subject to} && y_1+5y_2 &\le 7\\ && -y_1+2y_2 &\le 1\\ &&3y_1-y_2 &\le 5\\ && y_1,y_2&\ge 0 \end{align} }[/math]

Here [math]\displaystyle{ y_1 }[/math] and [math]\displaystyle{ y_2 }[/math] were chosen to be nonnegative multipliers for the first and the second constraint, respectively. We call the first LP the primal program and the second LP the dual program. By definition, every feasible solution to the dual program gives a lower bound for the primal program.

LP duality

Given an LP in canonical form, called the primal LP:

[math]\displaystyle{ \begin{align} \text{minimize} && \boldsymbol{c}^T\boldsymbol{x}\\ \text{subject to} && A\boldsymbol{x} &\ge\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} }[/math]

the dual LP is defined as follows:

[math]\displaystyle{ \begin{align} \text{maximum} && \boldsymbol{b}^T\boldsymbol{y}\\ \text{subject to} && A^T\boldsymbol{y} &\le\boldsymbol{c}\\ && \boldsymbol{y} &\ge \boldsymbol{0} \end{align} }[/math]

We then give some examples.

Surviving problem (diet problem)

Let us consider the surviving problem. Suppose we have [math]\displaystyle{ n }[/math] types of natural food, each containing up to [math]\displaystyle{ m }[/math] types of vitamins. The [math]\displaystyle{ j }[/math]th food has [math]\displaystyle{ a_{ij} }[/math] amount of vitamin [math]\displaystyle{ i }[/math], and the price of the [math]\displaystyle{ j }[/math]th food is [math]\displaystyle{ c_j }[/math]. We need to consume [math]\displaystyle{ b_i }[/math] amount of vitamin [math]\displaystyle{ i }[/math] for each [math]\displaystyle{ 1\le i\le m }[/math] to keep a good health. We want to minimize the total costs of food while keeping healthy. The problem can be formalized as the following LP:

[math]\displaystyle{ \begin{align} \text{minimize} \quad& c_1x_1+c_2x_2+\cdots+c_nx_n\\ \begin{align} \text{subject to} \\ \\ \end{align} \quad & \begin{align} a_{i1}x_{1}+a_{i2}x_{2}+\cdots+a_{in}x_{n} &\le b_{i} &\quad& \forall 1\le i\le m\\ x_{j}&\ge 0 &\quad& \forall 1\le j\le n \end{align} \end{align} }[/math]

The dual LP is

[math]\displaystyle{ \begin{align} \text{maximize} \quad& b_1y_1+b_2y_2+\cdots+b_ny_m\\ \begin{align} \text{subject to} \\ \\ \end{align} \quad & \begin{align} a_{1j}y_{1}+a_{2j}y_{2}+\cdots+a_{mj}y_{m} &\le c_{j} &\quad& \forall 1\le j\le n\\ y_{i}&\ge 0 &\quad& \forall 1\le i\le m \end{align} \end{align} }[/math]

The problem can be interpreted as follows: A food company produces [math]\displaystyle{ m }[/math] types of vitamin pills. The company wants to design a pricing system such that

  • The vitamin [math]\displaystyle{ i }[/math] has a nonnegative price [math]\displaystyle{ y_i }[/math].
  • The price system should be competitive to any natural food. A costumer cannot replace the vitamins by any natural food and get a cheaper price, that is, [math]\displaystyle{ \sum_{i=1}^my_ja_{ij}\le c_j }[/math] for any [math]\displaystyle{ 1\le j\le n }[/math].
  • The company wants to find the maximal profit, assuming that the customer only buy exactly the necessary amount of vitamins ([math]\displaystyle{ b_i }[/math] for vitamin [math]\displaystyle{ i }[/math]).
Maximum flow problem

In the last lecture, we defined the maximum flow problem, whose LP is

[math]\displaystyle{ \begin{align} \text{maximize} \quad& \sum_{v:(s,v)\in E}f_{sv}\\ \begin{align} \text{subject to} \\ \\ \\ \\ \end{align} \quad & \begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\ \sum_{u:(u,v)\in E}f_{uv}-\sum_{w:(v,w)\in E}f_{vw} &=0 &\quad& \forall v\in V\setminus\{s,t\}\\ f_{uv}&\ge 0 &\quad& \forall (u,v)\in E \end{align} \end{align} }[/math]

where directed graph [math]\displaystyle{ G(V,E) }[/math] is the flow network, [math]\displaystyle{ s\in V }[/math] is the source, [math]\displaystyle{ t\in V }[/math] is the sink, and [math]\displaystyle{ c_{uv} }[/math] is the capacity of directed edge [math]\displaystyle{ (u,v)\in E }[/math].

We add a new edge from [math]\displaystyle{ t }[/math] to [math]\displaystyle{ s }[/math] to [math]\displaystyle{ E }[/math], and let the capacity be [math]\displaystyle{ c_{ts}=\infty }[/math]. Let [math]\displaystyle{ E' }[/math] be the new edge set. The LP for the max-flow problem can be rewritten as:

[math]\displaystyle{ \begin{align} \text{maximize} \quad& f_{ts}\\ \begin{align} \text{subject to} \\ \\ \\ \\ \end{align} \quad & \begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\ \sum_{u:(u,v)\in E'}f_{uv}-\sum_{w:(v,w)\in E'}f_{vw} &\le0 &\quad& \forall v\in V\\ f_{uv}&\ge 0 &\quad& \forall (u,v)\in E' \end{align} \end{align} }[/math]

The second set of inequalities seem weaker than the original conservation constraint of flows, however, if this inequality holds at every node, then in fact it must be satisfied with equality at every node, thereby implying the flow conservation.

To obtain the dual program we introduce variables [math]\displaystyle{ d_{uv} }[/math] and [math]\displaystyle{ p_v }[/math] corresponding to the two types of inequalities in the primal. The dual LP is:

[math]\displaystyle{ \begin{align} \text{minimize} \quad& \sum_{(u,v)\in E}c_{uv}d_{uv}\\ \begin{align} \text{subject to} \\ \\ \\ \\ \end{align} \quad & \begin{align} d_{uv}-p_u+p_v &\ge 0 &\quad& \forall (u,v)\in E\\ p_s-p_t &\ge1 \\ d_{uv} &\ge 0 &\quad& \forall (u,v)\in E\\ p_v&\ge 0 &\quad& \forall v\in V \end{align} \end{align} }[/math]

It is more helpful to consider its integer version:

[math]\displaystyle{ \begin{align} \text{minimize} \quad& \sum_{(u,v)\in E}c_{uv}d_{uv}\\ \begin{align} \text{subject to} \\ \\ \\ \\ \end{align} \quad & \begin{align} d_{uv}-p_u+p_v &\ge 0 &\quad& \forall (u,v)\in E\\ p_s-p_t &\ge1 \\ d_{uv} &\in\{0,1\} &\quad& \forall (u,v)\in E\\ p_v&\in\{0,1\} &\quad& \forall v\in V \end{align} \end{align} }[/math]

In the last lecture, we know that the LP for max-flow is totally unimordular, so is this dual LP, therefore the optimal solutions to the integer program are the optimal solutions to the LP.

The variables [math]\displaystyle{ p_v }[/math] defines a bipartition of vertex set [math]\displaystyle{ V }[/math]. Let [math]\displaystyle{ S=\{v\in V\mid p_v=1\} }[/math]. The complement [math]\displaystyle{ \bar{S}=\{v\in V\mid p_v=1\} }[/math].

For 0/1-valued variables, the only way to satisfy [math]\displaystyle{ p_s-p_t\ge1 }[/math] is to have [math]\displaystyle{ p_s=1 }[/math] and [math]\displaystyle{ p_t=0 }[/math]. Therefore, [math]\displaystyle{ (S,\bar{S}) }[/math] is an [math]\displaystyle{ s }[/math]-[math]\displaystyle{ t }[/math] cut.

In an optimal solution, [math]\displaystyle{ d_{uv}=1 }[/math] if and only if [math]\displaystyle{ u\in S,v\in\bar{S} }[/math] and [math]\displaystyle{ (u,v)\in E }[/math]. Therefore, the objective function of an optimal solution [math]\displaystyle{ \sum_{u\in S,v\not\in S\atop (u,v)\in E}c_{uv} }[/math] is the capacity of the minimum [math]\displaystyle{ s }[/math]-[math]\displaystyle{ t }[/math] cut [math]\displaystyle{ (S,\bar{S}) }[/math].

Duality theorems

Let the primal LP be:

[math]\displaystyle{ \begin{align} \text{minimize} && \boldsymbol{c}^T\boldsymbol{x}\\ \text{subject to} && A\boldsymbol{x} &\ge\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} }[/math]

Its dual LP is:

[math]\displaystyle{ \begin{align} \text{maximum} && \boldsymbol{b}^T\boldsymbol{y}\\ \text{subject to} && A^T\boldsymbol{y} &\le\boldsymbol{c}\\ && \boldsymbol{y} &\ge \boldsymbol{0} \end{align} }[/math]
Theorem
The dual of a dual is the primal.
Proof.

The dual program can be written as the following minimization in canonical form:

[math]\displaystyle{ \begin{align} \min && -\boldsymbol{b}^T\boldsymbol{y}\\ \text{s.t.} && -A^T\boldsymbol{y} &\ge-\boldsymbol{c}\\ && \boldsymbol{y} &\ge \boldsymbol{0} \end{align} }[/math]

Its dual is:

[math]\displaystyle{ \begin{align} \max && -\boldsymbol{c}^T\boldsymbol{x}\\ \text{s.t.} && -A\boldsymbol{x} &\le-\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} }[/math]

which is equivalent to the primal:

[math]\displaystyle{ \begin{align} \min && \boldsymbol{c}^T\boldsymbol{x}\\ \text{s.t.} && A\boldsymbol{x} &\ge\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} }[/math]
[math]\displaystyle{ \square }[/math]

We have shown that feasible solutions of a dual program can be used to lower bound the optimum of the primal program. This is formalized by the following important theorem.

Theorem (Weak duality theorem)
If there exists an optimal solution to the primal LP:
[math]\displaystyle{ \begin{align} \min && \boldsymbol{c}^T\boldsymbol{x}\\ \text{s.t.} && A\boldsymbol{x} &\ge\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} }[/math]
then,
[math]\displaystyle{ \begin{align} \begin{align} \min && \boldsymbol{c}^T\boldsymbol{x}\\ \text{s.t.} && A\boldsymbol{x} &\ge\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} &\begin{align} \ge\\ \\ \\ \end{align}&\quad \begin{align} \max && \boldsymbol{b}^T\boldsymbol{y}\\ \text{s.t.} && A^T\boldsymbol{y} &\le\boldsymbol{c}\\ && \boldsymbol{y} &\ge \boldsymbol{0} \end{align} \end{align} }[/math]
Proof.

Let [math]\displaystyle{ \boldsymbol{x} }[/math] be an arbitrary feasible solution to the primal LP, and [math]\displaystyle{ \boldsymbol{y} }[/math] be an arbitrary feasible solution to the dual LP.

We estimate [math]\displaystyle{ \boldsymbol{y}^TA\boldsymbol{x} }[/math] in two ways. Recall that [math]\displaystyle{ A\boldsymbol{x} \ge\boldsymbol{b} }[/math] and [math]\displaystyle{ A^T\boldsymbol{y} \le\boldsymbol{c} }[/math], thus

[math]\displaystyle{ \boldsymbol{y}^T\boldsymbol{b}\le\boldsymbol{y}^TA\boldsymbol{x}\le\boldsymbol{c}^T\boldsymbol{x} }[/math].

Since this holds for any feasible solutions, it must also hold for the optimal solutions.

[math]\displaystyle{ \square }[/math]

A harmonically beautiful result is that the optimums of the primal LP and its dual are equal. This is called the strong duality theorem of linear programming.

Theorem (Strong duality theorem)
If there exists an optimal solution to the primal LP:
[math]\displaystyle{ \begin{align} \min && \boldsymbol{c}^T\boldsymbol{x}\\ \text{s.t.} && A\boldsymbol{x} &\ge\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} }[/math]
then,
[math]\displaystyle{ \begin{align} \begin{align} \min && \boldsymbol{c}^T\boldsymbol{x}\\ \text{s.t.} && A\boldsymbol{x} &\ge\boldsymbol{b}\\ && \boldsymbol{x} &\ge \boldsymbol{0} \end{align} &\begin{align} =\\ \\ \\ \end{align}&\quad \begin{align} \max && \boldsymbol{b}^T\boldsymbol{y}\\ \text{s.t.} && A^T\boldsymbol{y} &\le\boldsymbol{c}\\ && \boldsymbol{y} &\ge \boldsymbol{0} \end{align} \end{align} }[/math]


Unimodularity

Integer Programming

Consider the maximum integral flow problem: given as input a flow network [math]\displaystyle{ (G(V,E),c,s,t) }[/math] where for every [math]\displaystyle{ uv\in E }[/math] the capacity [math]\displaystyle{ c_{uv} }[/math] is integer. We want to find the integral flow [math]\displaystyle{ f:E\rightarrow\mathbb{Z} }[/math] with maximum value.

The mathematical programming for the problem is:

[math]\displaystyle{ \begin{align} \text{maximize} \quad& \sum_{v:(s,v)\in E}f_{sv}\\ \begin{align} \text{subject to} \\ \\ \\ \\ \\ \end{align} \quad & \begin{align} f_{uv}&\le c_{uv} &\quad& \forall (u,v)\in E\\ \sum_{u:(u,v)\in E}f_{uv}-\sum_{w:(v,w)\in E}f_{vw} &=0 &\quad& \forall v\in V\setminus\{s,t\}\\ f_{uv}&\in\mathbb{N} &\quad& \forall (u,v)\in E \end{align} \end{align} }[/math]

where [math]\displaystyle{ \mathbb{N} }[/math] is the set of all nonnegative integers. Compared to the LP for the max-flow problem, we just replace the last line [math]\displaystyle{ f_{uv}\ge 0 }[/math] with [math]\displaystyle{ f_{uv}\in\mathbb{N} }[/math]. The resulting optimization is called an integer programming (IP), or more specific integer linear programming (ILP).

Due to the Flow Integrality Theorem, when capacities are integers, there must be an integral flow whose value is maximum among all flows (integral or not). This means the above IP can be efficiently solved by solving its LP-relaxation. This is usually impossible for general IPs.

In general, an IP in canonical form is written as

[math]\displaystyle{ \begin{align} \text{maximize} \quad& \boldsymbol{c}^T\boldsymbol{x}\\ \begin{align} \text{subject to} \\ \\ \\ \end{align} \quad & \begin{align} A\boldsymbol{x} &\ge\boldsymbol{b} \\ \boldsymbol{x}&\ge \boldsymbol{0}\\ \boldsymbol{x}&\in\mathbb{Z}^n \end{align} \end{align} }[/math]
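For general IPs, unlike max-flow, the LP-relaxation can have a strictly better (fractional) optimum than the IP. A minimal made-up example, with the constraint written as [math]\displaystyle{ \le }[/math] for convenience and the IP optimum found by brute force (assuming SciPy is available):

```python
from itertools import product
from scipy.optimize import linprog

# Toy IP (made up): max x1 + x2  s.t.  2x1 + 2x2 <= 3,  x1, x2 nonnegative integers.
# LP-relaxation, as a minimization for linprog:
lp = linprog(c=[-1, -1], A_ub=[[2, 2]], b_ub=[3], bounds=[(0, None)] * 2)
lp_opt = -lp.fun    # 1.5, attained only at fractional points

# IP optimum by brute force; feasible integer points have x1, x2 in {0, 1}:
ip_opt = max(x1 + x2
             for x1, x2 in product(range(2), repeat=2)
             if 2 * x1 + 2 * x2 <= 3)

print(lp_opt, ip_opt)  # 1.5 versus 1
```

The gap between 1.5 and 1 shows that rounding an LP optimum does not in general solve the IP.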

Consider the 3SAT problem. Each instance is a 3CNF (conjunctive normal form) formula [math]\displaystyle{ \bigwedge_{i=1}^m(\ell_{i_1}\vee\ell_{i_2}\vee\ell_{i_3}) }[/math], where each [math]\displaystyle{ (\ell_{i_1}\vee\ell_{i_2}\vee\ell_{i_3}) }[/math] is a clause and each [math]\displaystyle{ \ell_{i_r}\in\{x_{j},\neg x_{j}\mid 1\le j\le n\} }[/math], called a literal, is either a boolean variable or the negation of a boolean variable. We want to determine whether there exists a truth assignment of the [math]\displaystyle{ n }[/math] boolean variables [math]\displaystyle{ x_1,\ldots,x_n }[/math] such that the input formula is satisfied (i.e., evaluates to true).

The following IP solves 3SAT:

[math]\displaystyle{ \begin{align} \text{maximize} \quad& \sum_{i=1}^mz_i\\ \begin{align} \text{subject to} \\ \\ \\ \\ \end{align} \quad & \begin{align} z_i &\le y_{i_1}+y_{i_2}+y_{i_3} &\quad& \forall 1\le i\le m\\ y_{i_r}&\le x_{j} &\quad& \text{if }\ell_{i_r}=x_{j} \\ y_{i_r}&\le 1-x_{j} &\quad& \text{if }\ell_{i_r}=\neg x_{j} \\ z_i, x_j&\in\{0,1\} &\quad& \forall 1\le i\le m, 1\le j\le n \end{align} \end{align} }[/math]

The formula is satisfiable if and only if the optimal value of this IP is [math]\displaystyle{ m }[/math]. Since 3SAT is NP-hard (satisfiability was the first problem proved NP-complete), solving general IPs is NP-hard.
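The encoding works because, for any fixed 0/1 assignment to the [math]\displaystyle{ x_j }[/math], the constraints force the best choice of [math]\displaystyle{ z_i }[/math] to be 1 exactly when clause [math]\displaystyle{ i }[/math] contains a true literal, so the IP optimum equals the maximum number of simultaneously satisfiable clauses. A brute-force sketch of this value for a toy formula (the formula and helper name are made up for illustration):

```python
from itertools import product

# Toy 3CNF (made up): (x1 v ~x2 v x3) ^ (~x1 v x2 v ~x3), so m = 2.
# A literal is a pair (variable index, sign); sign True means un-negated.
clauses = [[(0, True), (1, False), (2, True)],
           [(0, False), (1, True), (2, False)]]

def ip_optimum(clauses, n):
    # For a fixed assignment x, the IP's best z_i is 1 iff clause i is
    # satisfied, so the IP optimum is the maximum number of satisfied
    # clauses over all 2^n assignments.
    return max(sum(any(x[j] == s for j, s in c) for c in clauses)
               for x in product([False, True], repeat=n))

print(ip_optimum(clauses, 3))  # 2, equal to m, so the formula is satisfiable
```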

=== Integrality of polytopes ===

A point in an [math]\displaystyle{ n }[/math]-dimensional space is integral if it belongs to [math]\displaystyle{ \mathbb{Z}^n }[/math], i.e., if all its coordinates are integers.

A polyhedron is said to be integral if all its vertices are integral.

An easy observation is that an integer program has the same optimal value as its LP-relaxation whenever the polyhedron defined by the LP-relaxation is integral: some vertex of the polyhedron is an optimal solution of the LP, and every vertex is integral, hence feasible for the IP.

Theorem (Hoffman 1974)
If a polyhedron [math]\displaystyle{ P }[/math] is integral then for all integer vectors [math]\displaystyle{ \boldsymbol{c} }[/math] there is an optimal solution to [math]\displaystyle{ \max\{\boldsymbol{c}^T\boldsymbol{x}\mid \boldsymbol{x}\in P\} }[/math] which is integral.
Proof.

There always exists an optimal solution which is a vertex in [math]\displaystyle{ P }[/math]. For integral [math]\displaystyle{ P }[/math], all vertices are integral.

[math]\displaystyle{ \square }[/math]

=== Unimodularity and total unimodularity ===

Definition (Unimodularity)
An [math]\displaystyle{ n\times n }[/math] integer matrix [math]\displaystyle{ A }[/math] is called unimodular if [math]\displaystyle{ \det(A)=\pm1 }[/math].
An [math]\displaystyle{ m\times n }[/math] integer matrix [math]\displaystyle{ A }[/math] is called totally unimodular if every square submatrix [math]\displaystyle{ B }[/math] of [math]\displaystyle{ A }[/math] has [math]\displaystyle{ \det(B)\in\{1,-1,0\} }[/math], that is, if every square nonsingular submatrix of [math]\displaystyle{ A }[/math] is unimodular.
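For small matrices the definition can be checked directly by enumerating all square submatrices. A brute-force sketch (assuming NumPy is available; the two example matrices are classical: the vertex-edge incidence matrix of a bipartite graph is known to be totally unimodular, while the incidence matrix of an odd cycle is not):

```python
from itertools import combinations
import numpy as np

def is_totally_unimodular(A):
    """Brute force: every square submatrix has determinant in {-1, 0, 1}."""
    A = np.asarray(A)
    m, n = A.shape
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                if round(np.linalg.det(A[np.ix_(rows, cols)])) not in (-1, 0, 1):
                    return False
    return True

# Vertex-edge incidence matrix of the bipartite graph K_{2,2}:
bipartite = [[1, 1, 0, 0],
             [0, 0, 1, 1],
             [1, 0, 1, 0],
             [0, 1, 0, 1]]
# Incidence matrix of a triangle (odd cycle); its own determinant is +-2:
triangle = [[1, 0, 1],
            [1, 1, 0],
            [0, 1, 1]]

print(is_totally_unimodular(bipartite))  # True
print(is_totally_unimodular(triangle))   # False
```

The enumeration is exponential in the matrix size, so this is only a teaching aid, not a practical test.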

A totally unimodular constraint matrix defines an integral convex polyhedron.

Theorem
Let [math]\displaystyle{ A }[/math] be an [math]\displaystyle{ m\times n }[/math] integer matrix.
If [math]\displaystyle{ A }[/math] is totally unimodular, then for any integer vector [math]\displaystyle{ \boldsymbol{b}\in\mathbb{Z}^m }[/math] the polyhedron [math]\displaystyle{ \{\boldsymbol{x}\in\mathbb{R}^n\mid A\boldsymbol{x}=\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\} }[/math] is integral.
Proof.

Let [math]\displaystyle{ B }[/math] be a basis of [math]\displaystyle{ A }[/math], i.e. a nonsingular [math]\displaystyle{ m\times m }[/math] submatrix formed by [math]\displaystyle{ m }[/math] linearly independent columns of [math]\displaystyle{ A }[/math]. The corresponding basic solution assigns [math]\displaystyle{ B^{-1}\boldsymbol{b} }[/math] to the basic variables and zero to the rest. Since [math]\displaystyle{ A }[/math] is totally unimodular and [math]\displaystyle{ B }[/math] is nonsingular, [math]\displaystyle{ \det(B)\in\{1,-1\} }[/math]. By Cramer's rule, [math]\displaystyle{ B^{-1}=\mathrm{adj}(B)/\det(B) }[/math] has integer entries, thus [math]\displaystyle{ B^{-1}\boldsymbol{b} }[/math] is integral. Therefore, every basic solution of [math]\displaystyle{ A\boldsymbol{x}=\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0} }[/math] is integral; since the vertices of the polyhedron [math]\displaystyle{ \{\boldsymbol{x}\in\mathbb{R}^n\mid A\boldsymbol{x}=\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\} }[/math] are exactly its basic feasible solutions, the polyhedron is integral.

[math]\displaystyle{ \square }[/math]
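The key step, that the inverse of a unimodular matrix is again an integer matrix, is easy to verify on a small example (a sketch assuming NumPy is available; the matrix is made up):

```python
import numpy as np

# A unimodular matrix: integer entries and determinant 1.
B = np.array([[2, 1],
              [1, 1]])
assert round(np.linalg.det(B)) == 1

# By Cramer's rule B^{-1} = adj(B)/det(B); the adjugate of an integer matrix
# is integer and det(B) = +-1, so B^{-1} has integer entries.
B_inv = np.linalg.inv(B)
print(B_inv)  # [[ 1. -1.] [-1.  2.]]
```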

Our next result is the famous Hoffman-Kruskal theorem.

Theorem (Hoffman-Kruskal 1956)
Let [math]\displaystyle{ A }[/math] be an [math]\displaystyle{ m\times n }[/math] integer matrix.
If [math]\displaystyle{ A }[/math] is totally unimodular, then for any integer vector [math]\displaystyle{ \boldsymbol{b}\in\mathbb{Z}^m }[/math] the polyhedron [math]\displaystyle{ \{\boldsymbol{x}\in\mathbb{R}^n\mid A\boldsymbol{x}\ge\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\} }[/math] is integral.
Proof.

Let [math]\displaystyle{ A'=\begin{bmatrix}A & -I\end{bmatrix} }[/math]. We claim that [math]\displaystyle{ A' }[/math] is also totally unimodular. After suitable row and column permutations, any square submatrix [math]\displaystyle{ B }[/math] of [math]\displaystyle{ A' }[/math] can be written in the following form:

[math]\displaystyle{ B=\begin{bmatrix} C & 0\\ D & I \end{bmatrix} }[/math]

where [math]\displaystyle{ C }[/math] is a square submatrix of [math]\displaystyle{ A }[/math] and [math]\displaystyle{ I }[/math] is an identity matrix. Since row and column permutations change the determinant only by a sign,

[math]\displaystyle{ \det(B)=\pm\det(C)\in\{1,-1,0\} }[/math],

thus [math]\displaystyle{ A' }[/math] is totally unimodular.

Add slack variables to transform the constraints to the standard form [math]\displaystyle{ A'\boldsymbol{z}=\boldsymbol{b},\boldsymbol{z}\ge\boldsymbol{0} }[/math]. The polyhedron [math]\displaystyle{ \{\boldsymbol{x}\mid A\boldsymbol{x}\ge\boldsymbol{b}, \boldsymbol{x}\ge \boldsymbol{0}\} }[/math] is integral if the polyhedron [math]\displaystyle{ \{\boldsymbol{z}\mid A'\boldsymbol{z}=\boldsymbol{b}, \boldsymbol{z}\ge \boldsymbol{0}\} }[/math] is integral, which is implied by the total unimodularity of [math]\displaystyle{ A'\, }[/math].

[math]\displaystyle{ \square }[/math]
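The claim that appending [math]\displaystyle{ -I }[/math] (the slack columns) preserves total unimodularity can also be observed numerically on a small example (a self-contained sketch assuming NumPy is available; the matrix is the incidence matrix of the bipartite graph K_{2,2}, which is known to be totally unimodular):

```python
from itertools import combinations
import numpy as np

def is_totally_unimodular(A):
    # Brute force: every square submatrix has determinant in {-1, 0, 1}.
    A = np.asarray(A)
    m, n = A.shape
    return all(round(np.linalg.det(A[np.ix_(r, c)])) in (-1, 0, 1)
               for k in range(1, min(m, n) + 1)
               for r in combinations(range(m), k)
               for c in combinations(range(n), k))

# A totally unimodular matrix: incidence matrix of the bipartite graph K_{2,2}.
A = np.array([[1, 1, 0, 0],
              [0, 0, 1, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1]])
# Appending -I (the slack columns) preserves total unimodularity:
A_slack = np.hstack([A, -np.eye(4, dtype=int)])

print(is_totally_unimodular(A), is_totally_unimodular(A_slack))  # True True
```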
