组合数学 (Fall 2011)/Matroid

From TCS Wiki
Revision as of 03:40, 17 August 2011 by imported>WikiSysop (Created page with '== Matroid == The matroid is a structure shared by a class of optimization problems for which greedy algorithms work. === Kruskal's greedy algorithm for MST === For the '''minimu…')
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to navigation Jump to search

Matroid

The matroid is a structure shared by a class of optimization problems for which greedy algorithms work.

Kruskal's greedy algorithm for MST

For the minimum-weight spanning tree (MST) problem. We are given a connected undirected graph [math]\displaystyle{ G(V,E) }[/math] with positive edge weights [math]\displaystyle{ w:E\rightarrow\mathbb{R}^+ }[/math], and want to find a spanning tree [math]\displaystyle{ T }[/math] of the minimum accumulated weight [math]\displaystyle{ \sum_{e\in T}w_e }[/math].

We consider the equivalent maximum problem to find a spanning tree with maximum weight. To see this makes the problem no harder, we can replace the weight [math]\displaystyle{ w_e }[/math] for every edge [math]\displaystyle{ e\in E }[/math] by [math]\displaystyle{ W-w_e }[/math], where [math]\displaystyle{ W }[/math] is a sufficient large constant which is greater than all weights. The minimum-weight spanning tree for the modified weight is the maximum-weight spanning tree for the original weight.

The following greedy algorithm solves the maximum-weight spanning tree problem.

Kruskal's Algorithm
[math]\displaystyle{ S=\emptyset }[/math];
while [math]\displaystyle{ \exists e\in E }[/math] that [math]\displaystyle{ S\cup\{e\} }[/math] is forest
pick such [math]\displaystyle{ e }[/math] with maximum [math]\displaystyle{ w_e }[/math];
[math]\displaystyle{ S=S\cup\{e\} }[/math];

It is not hard to verify the correctness of this greedy algorithm. But we are more interested in the general framework underlying this algorithm.

Matroids

Let [math]\displaystyle{ X }[/math] be a finite set and [math]\displaystyle{ \mathcal{F}\subseteq 2^X }[/math] be a family of subsets of [math]\displaystyle{ X }[/math]. A member set [math]\displaystyle{ S\in\mathcal{F} }[/math] is called maximal if [math]\displaystyle{ S\cup\{x\}\not\in\mathcal{F} }[/math] for any [math]\displaystyle{ x\in X\setminus S }[/math].

For [math]\displaystyle{ Y\subseteq X }[/math], denote [math]\displaystyle{ \mathcal{F}_Y=\{S\in\mathcal{F}\mid S\subseteq Y\} }[/math]. Obviously,[math]\displaystyle{ \mathcal{F}_Y=\mathcal{F}\cap 2^Y\, }[/math].

Definition
A set system [math]\displaystyle{ \mathcal{F}\subseteq 2^X }[/math] is a matroid if it satisfies:
  • (hereditary) if [math]\displaystyle{ T\subseteq S\in\mathcal{F} }[/math] then [math]\displaystyle{ T\in\mathcal{F} }[/math];
  • (matroid property) for every [math]\displaystyle{ Y\subseteq X }[/math], all maximal [math]\displaystyle{ S\in\mathcal{F}_Y }[/math] have the same [math]\displaystyle{ |S| }[/math].

Suppose [math]\displaystyle{ \mathcal{F} }[/math] is a matroid. Some matroid terminologies:

  • Each member set [math]\displaystyle{ S\in\mathcal{F} }[/math] is called an independent set.
  • A maximal independent subset of a set [math]\displaystyle{ Y\subset X }[/math], i.e., a maximal [math]\displaystyle{ S\in\mathcal{F}_Y }[/math], is called a basis of [math]\displaystyle{ Y }[/math].
  • The size of the maximal [math]\displaystyle{ S\in\mathcal{F}_Y }[/math] is called the rank of [math]\displaystyle{ Y }[/math], denoted [math]\displaystyle{ r(Y) }[/math].

Graph matroids

Let [math]\displaystyle{ G(V,E) }[/math] be a graph. Define a set system with ground set [math]\displaystyle{ E }[/math] as

[math]\displaystyle{ \mathcal{F}=\{S\subseteq E\mid \text{there is no cycle in }S\}. }[/math]

That is, [math]\displaystyle{ \mathcal{F} }[/math] is the set of all forests in [math]\displaystyle{ G }[/math].

We claim that [math]\displaystyle{ \mathcal{F} }[/math] is a matroid.

First, [math]\displaystyle{ \mathcal{F} }[/math] is hereditary since any subgraph of a forest must also be a forest.

We then verify the matroid property of [math]\displaystyle{ \mathcal{F} }[/math]. Let [math]\displaystyle{ Y\subseteq E }[/math] be an arbitrary subgraph of [math]\displaystyle{ G }[/math]. Suppose [math]\displaystyle{ Y }[/math] has [math]\displaystyle{ k }[/math] connected components. For any maximal forest [math]\displaystyle{ S }[/math] in [math]\displaystyle{ Y }[/math] (i.e., [math]\displaystyle{ S }[/math] is a spanning forest in [math]\displaystyle{ Y }[/math]), it holds that [math]\displaystyle{ |S|=n-k }[/math]. In other words, for any [math]\displaystyle{ Y\subseteq E }[/math], all maximal member of [math]\displaystyle{ \mathcal{F}_Y }[/math] have the same cardinality.

Therefore, [math]\displaystyle{ \mathcal{F} }[/math] is a matroid. Each independent set (of matroid) is a forest in [math]\displaystyle{ G }[/math]. For any subgraph [math]\displaystyle{ Y\subseteq G }[/math], the rank of [math]\displaystyle{ Y }[/math] is the size of a spanning forest of [math]\displaystyle{ Y }[/math].

Linear matroids

Let [math]\displaystyle{ A }[/math] be an [math]\displaystyle{ m\times n }[/math] matrix. Define a set system [math]\displaystyle{ \mathcal{F}\subseteq 2^{[n]} }[/math] as

[math]\displaystyle{ \mathcal{F}=\{S\subseteq [n]\mid S\text{ is a set of linearly independent columns in }A\}. }[/math]

[math]\displaystyle{ \mathcal{F} }[/math] is hereditary since every any subset of a set of linearly independent vectors is still linearly independent.

For any subset [math]\displaystyle{ Y\subseteq [n] }[/math] of columns of [math]\displaystyle{ A }[/math]. Let [math]\displaystyle{ B }[/math] be the submatrix composed by these columns. Then [math]\displaystyle{ \mathcal{F}_Y }[/math] contains all sets of linearly independent columns of [math]\displaystyle{ B }[/math]. Clearly, all maximal such sets have the same size, which is the column-rank of [math]\displaystyle{ B }[/math].

Therefore, [math]\displaystyle{ \mathcal{F} }[/math] is a matroid. Each independent set (of matroid) is a linearly independent set of columns of matrix [math]\displaystyle{ A }[/math]. For any set [math]\displaystyle{ Y\subseteq[n] }[/math] of columns of matrix [math]\displaystyle{ A }[/math], the rank of [math]\displaystyle{ Y }[/math] is the column-rank of the submatrix defined by the columns in [math]\displaystyle{ Y }[/math].

Weighted matroid maximization

Consider the following weighted matroid maximization problem. Let [math]\displaystyle{ \mathcal{F}\subseteq2^X }[/math] be a matroid. We define positive weights [math]\displaystyle{ w:X\rightarrow\mathbb{R}^+ }[/math] of elements in [math]\displaystyle{ X }[/math]. Our goal is to find an independent set [math]\displaystyle{ S\in\mathcal{F} }[/math] with the maximum accumulated weight [math]\displaystyle{ \sum_{x\in S}w(x) }[/math].

We then introduce the Greedy Algorithm which finds the maximum-weight independent set.

Greedy Algorithm
[math]\displaystyle{ S=\emptyset }[/math];
while [math]\displaystyle{ \exists x\not\in S }[/math] with [math]\displaystyle{ S\cup\{x\}\in\mathcal{F} }[/math]
choose such [math]\displaystyle{ x }[/math] with maximum [math]\displaystyle{ w(x) }[/math];
[math]\displaystyle{ S=S\cup\{x\} }[/math];

The correctness of the greedy algorithm is due to the next theorem.

Theorem (Rado 1957; Edmonds 1970)
The greedy algorithm finds an independent set [math]\displaystyle{ S\in\mathcal{F} }[/math] with the maximum weight.
Proof.

Suppose the theorem is false. Let [math]\displaystyle{ S }[/math] be the independent set returned by the greedy algorithm and let [math]\displaystyle{ T }[/math] be a maximum-weight independent set.

Suppose [math]\displaystyle{ S=\{x_1,x_2,\ldots,x_m\} }[/math], where the [math]\displaystyle{ x_i }[/math]s are chosen by the algorithm in that order. Then it is easy to see that [math]\displaystyle{ w(x_1)\ge w(x_2)\ge\cdots\ge w(x_m) }[/math].

Suppose [math]\displaystyle{ T=\{y_1,y_2,\ldots,y_\ell\} }[/math], where [math]\displaystyle{ w(y_1)\ge w(y_2)\ge\cdots\ge w(y_\ell) }[/math].

Choose the least index [math]\displaystyle{ k }[/math] such that [math]\displaystyle{ w(x_k)\gt w(y_k) }[/math]. If none exists, then we must have [math]\displaystyle{ \ell\gt m }[/math]; in this case we can let [math]\displaystyle{ k=m+1 }[/math].

In either case we know that the greedy algorithm did not add any of [math]\displaystyle{ y_1,\ldots,y_{k} }[/math] in step [math]\displaystyle{ k }[/math]. Since what it did choose has smaller weight, it must be that [math]\displaystyle{ y_i }[/math], for [math]\displaystyle{ 1\le i\le k }[/math], has the property either that [math]\displaystyle{ y_i\in\{x_1,\ldots,x_{k-1}\} }[/math] or that [math]\displaystyle{ \{x_1,\ldots,x_{k-1},y_i\}\not\in\mathcal{F} }[/math]. In other words, [math]\displaystyle{ \{x_1,\ldots,x_{k-1}\} }[/math] is a basis of [math]\displaystyle{ Y=\{x_1,\ldots,x_{k-1},y_1,\ldots,y_k\} }[/math]. But this contradicts the matroid property, since [math]\displaystyle{ \{y_1,\ldots,y_k\} }[/math], being a subset of [math]\displaystyle{ T }[/math], is also an independent subset of [math]\displaystyle{ Y }[/math] and is larger.

[math]\displaystyle{ \square }[/math]

Matroid intersections