APPROXIMATION THEORY IN COMBINATORIAL OPTIMIZATION. APPLICATION TO THE GENERALIZED MINIMUM SPANNING TREE PROBLEM

Abstract. We present an overview of approximation theory in combinatorial optimization. As an application we consider the Generalized Minimum Spanning Tree (GMST) problem, which is defined on an undirected complete graph with the nodes partitioned into clusters and with non-negative costs associated with the edges. This problem is NP-hard, and it is known that no polynomial-time approximation algorithm can exist for it unless P = NP. We present an inapproximability result for the GMST problem, and under special assumptions (a cost function satisfying the triangle inequality and cluster sizes bounded by ρ) we give an approximation algorithm with ratio 2ρ.

Combinatorial Optimization is the process of finding one or more best (optimal) solutions in a well-defined discrete problem space, i.e. a space containing a finite set of possible solutions, that optimize a certain function, the so-called objective function. The finite set of possible solutions can be described by inequality and equality constraints, and by integrality constraints. The integrality constraints force the variables to be integers. The set of points that satisfy all these constraints is called the (feasible) solution set.
Such problems occur in almost all fields of management (e.g. finance, marketing, production, scheduling, inventory control, facility location, etc.), as well as in many engineering disciplines (e.g. optimal design of waterways or bridges, design and analysis of data networks, energy resource-planning models, logistics of electrical power generation and transport, etc.). A survey of applications of combinatorial optimization is given by Grötschel in [7].
Combinatorial Optimization models are often referred to as integer programming models where some or all of the variables can take on only a finite number of alternative possibilities.
Many of the optimization problems we would like to solve are NP-hard. Therefore it is very unlikely that these problems could be solved by a polynomial-time algorithm. However, these problems still have to be solved in practice. In order to do that, we have to relax some of the requirements. There are, in general, three different possibilities.
• Superpolynomial-time algorithms: Even though an optimization problem is NP-hard, there are "good" and "not so good" algorithms for solving it exactly. To the "not so good" algorithms certainly belong most of the simple enumeration methods, where one enumerates all the feasible solutions and then chooses the one with the optimal value of the objective function. Such methods have very high time complexity. Among the "good" algorithms belong methods like branch-and-bound, where an analysis of the problem at hand is used to discard most of the feasible solutions before they are even considered. These approaches allow one to obtain exact solutions of reasonably large problem instances, but their running time still depends exponentially on the size of the problem.
• Average-case polynomial-time algorithms: For some problems, it is possible to have algorithms which require superpolynomial time on only a few instances and run in polynomial time on all the others. A famous example is the Simplex Method for solving problems in Linear Programming.
• Approximation algorithms: We may also relax the requirement of obtaining an exact solution of the optimization problem and content ourselves with a solution which is "not too far" from the optimum. This is partially justified by the fact that, in practice, it is usually enough to obtain a solution that is slightly sub-optimal.
Clearly, there are good approximation algorithms and bad ones as well. What we need is some means of determining the quality of an approximation algorithm and a way of comparing different algorithms. There are a few criteria to consider:
Average-case performance: One has to consider some probability distribution on the set of all possible instances of a given problem. Based on this assumption, an expectation of the performance can then be found. Results of this kind strongly depend on the choice of the initial distribution and do not provide us with any information about the performance on a particular instance.
Experimental performance: This approach is based on running the algorithm on a few "typical" instances. It has been used mostly to compare the performance of several approximation algorithms. Of course the results depend on the choice of the "typical" instances and may vary from experiment to experiment.
Worst-case performance: This is usually done by establishing upper and lower bounds for approximate solutions in terms of the optimum value. In the case of minimization problems we try to establish upper bounds; in the case of maximization problems we want to find lower bounds.
The advantage of worst-case bounds on the performance of approximation algorithms is the fact that, given any instance of the optimization problem, we are guaranteed that the approximate solution stays within these bounds. It should also be noted that approximation algorithms usually output solutions much closer to the optimum than the worst-case bounds suggest. Thus it is of independent interest to see how tight the bounds on the performance of each algorithm are; that is, how bad the approximate solution can really get. This is usually done by providing examples of specific instances for which the approximate solution is very far from the optimum solution.
Establishing worst-case performance bounds for even simple algorithms often requires a very deep understanding of the problem at hand and the use of powerful theoretical results from areas like linear programming, combinatorics, graph theory, probability theory, etc.
We consider an NP-hard optimization problem for which, as we have seen, it is difficult to find the exact optimal solution in polynomial time. At the expense of reducing the quality of the solution by relaxing some of the requirements, we can often get a considerable speed-up. This leads us to the following definition.

Definition 1 (Approximation algorithm). Let X be a minimization problem and α ≥ 1. An algorithm APP is called an α-approximation algorithm for problem X if for all instances I of X it delivers in polynomial time a feasible solution with objective value APP(I) such that

(1) APP(I) ≤ α OPT(I),

where APP(I) and OPT(I) denote the values of an approximate solution and of an optimal solution for instance I, respectively.
The value α is called the performance guarantee or the worst-case ratio of the approximation algorithm APP. The closer α is to 1, the better the algorithm is.
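To make Definition 1 concrete, here is a small Python sketch (not from the paper) contrasting a classic 2-approximation for the minimum node-cover problem, which reappears in the reduction later in this paper, with an exact brute-force solver, and checking the guarantee APP(I) ≤ α OPT(I) with α = 2. The function names and the example graph are illustrative.

```python
from itertools import combinations

def node_cover_2approx(edges):
    """Classic maximal-matching heuristic: take both endpoints of any
    still-uncovered edge; a well-known 2-approximation for node cover."""
    cover = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover

def node_cover_opt(nodes, edges):
    """Exact minimum node cover by brute force (exponential time)."""
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
app = node_cover_2approx(edges)
opt = node_cover_opt([1, 2, 3, 4, 5], edges)
assert len(app) <= 2 * len(opt)   # the guarantee of Definition 1 with alpha = 2
```

The greedy routine runs in polynomial time, while the exact solver does not; the assertion checks the worst-case ratio on this one instance.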

THE GENERALIZED MINIMUM SPANNING TREE PROBLEM
The generalized minimum spanning tree (GMST) problem is the problem of finding a minimum-cost tree spanning a subset of nodes which includes exactly one node from each cluster. We will call a tree containing one node from each cluster a generalized spanning tree. In [10], Myung et al. proved that the GMST problem is NP-hard, and in [13] we presented a stronger result, namely that the GMST problem is NP-hard even on trees, as well as an exact exponential-time algorithm based on dynamic programming.
There are various (slight) generalizations of the GMST problem. For example, the clusters may not be required to be disjoint, or it may be feasible to choose more than one node per cluster. The latter problem, i.e., the problem of finding a minimum cost tree spanning at least one node per cluster, is also known as the generalized Steiner tree problem.
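As a concrete baseline for the problem definition, the following Python sketch solves a tiny GMST instance exactly by enumerating one node per cluster and computing a minimum spanning tree for each selection. This is exponential in the number of clusters and purely illustrative; the instance data and function names are hypothetical.

```python
from itertools import product

def mst_cost(nodes, cost):
    """Prim's algorithm on the complete graph over `nodes`;
    cost[(u, v)] with u < v gives the edge cost."""
    nodes = list(nodes)
    in_tree, total = {nodes[0]}, 0.0
    while len(in_tree) < len(nodes):
        u, v, w = min(((u, v, cost[min(u, v), max(u, v)])
                       for u in in_tree for v in nodes if v not in in_tree),
                      key=lambda t: t[2])
        in_tree.add(v)
        total += w
    return total

def gmst_brute_force(clusters, cost):
    """Exact GMST value: try every choice of one node per cluster
    (exponential in the number of clusters; illustrative only)."""
    return min(mst_cost(choice, cost) for choice in product(*clusters))

# A tiny instance: clusters {0, 1}, {2, 3}, {4} (hypothetical data).
clusters = [[0, 1], [2, 3], [4]]
cost = {(0, 1): 9, (0, 2): 1, (0, 3): 4, (0, 4): 3,
        (1, 2): 2, (1, 3): 1, (1, 4): 5,
        (2, 3): 9, (2, 4): 1, (3, 4): 2}
assert gmst_brute_force(clusters, cost) == 2   # best choice: nodes 0, 2, 4
```

Even on this toy instance the search visits every cluster selection, which is why the NP-hardness results below matter.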
In [4], Feremans and Grigoriev proposed an approximation scheme for a special case of the GMST problem. They considered a geometric case of the problem in which all the vertices are situated in the plane and the edge costs are given by the Euclidean distance.

A NEGATIVE RESULT FOR THE GMST PROBLEM
For some hard combinatorial optimization problems it is possible to show that they admit no approximation algorithm unless P = NP. In order to give a result of this form, it is enough to show that the existence of an α-approximation algorithm would allow one to solve some decision problem, known to be NP-complete, in polynomial time.
Applying this scheme to the GMST problem, we obtain an inapproximability result. This result is a reformulation, in terms of approximation algorithms, of a result provided by Myung et al. [10], which says that even finding a near-optimal solution for the GMST problem is NP-hard. Our proof is slightly different from the proof provided in [10].
Theorem 2. Under the assumption P ≠ NP, there is no α-approximation algorithm for the GMST problem, for any α ≥ 1.
Proof. Assume that there exists an α-approximation algorithm APP for the GMST problem, where α is a real number greater than or equal to 1. This means that APP(I) ≤ α OPT(I) for every instance I, where OPT(I) and APP(I) are the values of the optimal solution and of the solution found by the algorithm APP, respectively. We will show that APP can then be used to solve the node-cover problem in polynomial time for a given graph G = (V, E) and an integer k such that k < |V|, contradicting the assumption that P ≠ NP.
We construct a graph G' = (V', E') and an edge cost function such that the algorithm APP finds a feasible solution with a value no greater than α times the optimal cost if and only if G contains a node cover C of size k, i.e. a subset C of V such that every edge of G is incident to at least one node of C.

The graph G' contains the following m = k + 1 + |E| clusters:
• V_1, consisting of a single node denoted by r;
• V_2, ..., V_{k+1}, identical node sets, each of which has |V| nodes corresponding to the nodes of V;
• |E| node sets V_{k+2}, ..., V_m, each of which contains a single node corresponding to an edge e ∈ E.

The edges of G' are constructed as follows:
• Each node of V_t, for all t = 2, ..., k+1, is connected to r by an edge. The set consisting of these edges is denoted by E_1.
• Let i be a node of V_t for any t ∈ {2, ..., k+1} and j be the node of V_{t'} for any t' ∈ {k+2, ..., m}. An edge is constructed between i and j if the edge of G corresponding to j is incident to the node of G corresponding to i; let E_2 denote the set of those edges.
• We also construct an edge between i and j when the edge of G corresponding to j is not incident to the node of G corresponding to i, and we let E_3 denote the set of those edges.

The cost of each edge is defined as follows:
c(e) = 0 for all e ∈ E_1,
c(e) = 1 for all e ∈ E_2,
c(e) = (|E| + 1)α for all e ∈ E_3.

We claim that G contains a node cover C of size k if and only if APP(I) ≤ α|E|, where instance I corresponds to G' and its cost function.

Note that there always exists a generalized spanning tree in G': all the clusters different from the identical clusters V_2, ..., V_{k+1} have only one node, and if we select k nodes from V_2, ..., V_{k+1}, one node from each cluster such that each node of C is included, then these k nodes together with the nodes of the remaining single-node clusters form a generalized spanning tree of G' using only edges in E_1 ∪ E_2, by the definition of G'. Conversely, suppose that G' contains a generalized spanning tree using only edges in E_1 ∪ E_2, and let C be the set of distinct nodes of V selected from the clusters V_2, ..., V_{k+1} in the tree; then C is a node cover of G.

Therefore, a generalized spanning tree using only edges in E_1 ∪ E_2 exists if and only if G contains a node cover C. If G contains C, then OPT(I) = |E|. Moreover, if G does not contain C, then any generalized spanning tree of G' must use at least one edge in E_3, and thus OPT(I) ≥ |E| − 1 + α(|E| + 1) > α|E|. Consequently, if G contains a node cover C, the approximation algorithm APP produces a solution with value APP(I) ≤ α OPT(I) = α|E|, i.e. a solution that uses no edge from E_3, and APP thereby identifies a node cover. Hence APP decides the node-cover problem in polynomial time, a contradiction. □
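The construction in the proof above can be sketched in Python as follows. The encoding of nodes and clusters (tuples like ('v', t, u)) is a hypothetical choice made for illustration; the cluster structure and the 0 / 1 / (|E| + 1)α cost pattern follow the proof.

```python
def build_reduction(nodes, edges, k, alpha):
    """Build the GMST instance G' of the reduction from a node-cover
    instance G = (nodes, edges) and integer k: clusters V_1, ..., V_m
    and edge costs 0 / 1 / (|E| + 1) * alpha for E_1 / E_2 / E_3."""
    clusters = [['r']]                                    # V_1 = {r}
    clusters += [[('v', t, u) for u in nodes]             # V_2 .. V_{k+1}
                 for t in range(2, k + 2)]
    clusters += [[('e', e)] for e in edges]               # V_{k+2} .. V_m
    cost = {}
    for t in range(2, k + 2):
        for u in nodes:
            cost[('v', t, u), 'r'] = 0                    # E_1 edges
            for e in edges:
                # E_2 if edge e of G is incident to node u, E_3 otherwise
                cost[('v', t, u), ('e', e)] = (
                    1 if u in e else (len(edges) + 1) * alpha)
    return clusters, cost
```

Running any GMST solver on the returned instance and comparing its value against α|E| decides node cover, exactly as the proof argues.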
However, under further assumptions, in the next section we present a positive result for the GMST problem.

AN APPROXIMATION ALGORITHM FOR THE GMST PROBLEM
In this section we provide a polynomial-time approximation algorithm for the metric GMST problem with bounded cluster size. That is, we consider the case where the cost function c : E → R_+ satisfies the triangle inequality and the cluster sizes are bounded by ρ for some ρ > 0. For this class of problem instances we can efficiently construct a solution with cost at most 2ρ times the optimum (which is admittedly rather poor for practical purposes).
Our approach is based on ideas from Slavik [16], where a similar type of approximation algorithm is described for the so-called generalized TSP, the problem of determining a shortest cycle through m nodes, one from each cluster V_k. The additional difficulty we encounter in our situation (as compared to the generalized TSP) is to compare the length of a minimum cost spanning tree to the minimum of c^T x over the cut polytope (or subtour elimination polytope in TSP terminology), cf. Lemma 10 in the proof of correctness below.
We consider the GMST problem on the graph G = (V, E) and assume that the cluster sizes are bounded, |V_k| ≤ ρ for all k ∈ K, and that the cost function c satisfies the triangle inequality. We introduce binary variables x ∈ R^E and y ∈ R^V with values x_e = 1 if edge e is selected and x_e = 0 otherwise, and y_i = 1 if node i is selected and y_i = 0 otherwise. The optimum value of the corresponding GMST problem can then be defined by the following integer linear program:

(ILP) z*_ILP = min c^T x
subject to x(δ(S)) ≥ y_i for all S ⊆ V \ V_1 and all i ∈ S,
y(V_k) = 1 for all k ∈ K,
x ∈ {0, 1}^E, y ∈ {0, 1}^V.

Here, as usual, δ(S) ⊆ E denotes the cut induced by S ⊆ V, i.e., the set of edges joining S to its complement V \ S. Furthermore, we use the shorthand notation y(V_k) = Σ_{i ∈ V_k} y_i, so that the constraints y(V_k) = 1 state that exactly one node is "picked" in each cluster. Furthermore, the cut constraints x(δ(S)) ≥ y_i ensure that x ∈ {0, 1}^E connects all selected nodes (to the node v_1 selected in the "root cluster" V_1).
Obviously, the GMSTs of the instance are in 1-1 correspondence with the optimum solutions of (ILP). For each possible root v ∈ V_1, we consider the corresponding rooted LP relaxation of (ILP):

(LP(v)) z*(v) = min c^T x
subject to x(δ(S)) ≥ y_i for all S ⊆ V \ {v} and all i ∈ S,
y(V_k) = 1 for all k ∈ K, y_v = 1,
x ≥ 0, 0 ≤ y ≤ 1.

By assumption, |V_1| ≤ ρ, so there are at most ρ LPs that we solve. Each of these can be solved in polynomial time (relative to the input size of the GMST instance) by means of the Ellipsoid Method (cf., e.g., [8]). (Note that we can check whether a given (x, y) is feasible by computing a maximum v-i flow with respect to edge capacities x ∈ R^E_+ for all i ∈ V.) Furthermore, clearly,

(2) min_{v ∈ V_1} z*(v) ≤ z*_ILP.

Assume that v_1 ∈ V_1 achieves the minimum in (2) and let (x*, y*) be an optimal solution of (LP(v_1)). Since |V_k| ≤ ρ holds by assumption, we may choose in each cluster V_k a node v_k with y*_{v_k} ≥ 1/ρ. Let W = {v_1, ..., v_m} ⊆ V denote the resulting set of chosen nodes. We now compute an MST on W (relative to the given cost function c) and claim that this tree, say T = T(W), has cost c(T) at most 2ρ times the optimum z*_ILP. More precisely, we show:

Theorem 9. The tree T = T(W) has cost at most 2ρ z*(v_1).
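The rounding step of the algorithm described above (choosing in each cluster a node with y*-value at least 1/ρ and computing an MST on the chosen nodes) can be sketched as follows. The LP-solving step producing y* is omitted and assumed given; all names, and the frozenset-keyed cost map, are illustrative choices.

```python
def pick_and_span(clusters, y_star, cost):
    """Rounding step sketched above: in each cluster pick a node whose
    fractional value y*_v is at least 1/rho (one exists, since the y*
    values in a cluster sum to 1 and clusters have at most rho nodes),
    then build an MST on the chosen nodes W with Prim's algorithm.
    `cost` maps frozenset node pairs to edge costs."""
    rho = max(len(cluster) for cluster in clusters)
    W = []
    for cluster in clusters:
        v = max(cluster, key=lambda u: y_star[u])
        assert y_star[v] >= 1.0 / rho - 1e-9
        W.append(v)
    in_tree, tree_edges = {W[0]}, []
    while len(in_tree) < len(W):
        u, v = min(((u, v) for u in in_tree for v in W if v not in in_tree),
                   key=lambda pair: cost[frozenset(pair)])
        in_tree.add(v)
        tree_edges.append((u, v))
    return tree_edges
```

By Theorem 9, the returned tree costs at most 2ρ z*(v_1) when y_star comes from the rooted LP relaxation.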

PROOF OF CORRECTNESS
Our crucial argument relies on the following result of [6] (extending earlier work of [9]). Consider fixed connectivity requirements r_ij ≥ 0 (i, j ∈ V) that are symmetric in the sense that r_ij = r_ji for all i, j ∈ V. Then, since the costs satisfy the triangle inequality, for every D ⊆ V the linear programs

(R_D) min c^T x
subject to x(δ(S)) ≥ max {r_ij | i ∈ S, j ∈ V \ S} for all ∅ ≠ S ⊂ V,
x(δ({i})) = max_j r_ij for all i ∈ D,
x ≥ 0

all have the same optimum value (independent of D ⊆ V). This is referred to as the parsimonious property (as we can force the least possible value of x on each elementary cut δ({i})).
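A tiny numeric illustration of the parsimonious property, assuming SciPy is available: on a 3-node instance with metric costs and all requirements equal to 1, the cut LP with and without the degree equalities attains the same optimum. The instance data is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

# Complete graph on 3 nodes; variables x = (x_01, x_02, x_12).
# Metric edge costs (toy data) and requirements r_ij = 1 for all pairs.
c = [1.0, 2.0, 2.0]

# Singleton-cut constraints x(delta({i})) >= 1, written as A_ub x <= b_ub.
A_cut = -np.array([[1, 1, 0],    # delta({0}) = {01, 02}
                   [1, 0, 1],    # delta({1}) = {01, 12}
                   [0, 1, 1]])   # delta({2}) = {02, 12}
b_cut = -np.ones(3)

# (R_empty): cut constraints only.
r_empty = linprog(c, A_ub=A_cut, b_ub=b_cut)

# (R_V): force degree equalities x(delta({i})) = 1 on every node instead.
r_v = linprog(c, A_eq=-A_cut, b_eq=np.ones(3))

# Parsimonious property: both LPs have the same optimum value.
assert abs(r_empty.fun - r_v.fun) < 1e-7
```

Both optima equal 2.5 here (x = (0.5, 0.5, 0.5)); with non-metric costs the equality can fail, which is why the triangle inequality assumption matters.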
We apply this result with the connectivity requirements

(4) r_ij = 1/ρ for all i, j ∈ W with i ≠ j, and r_ij = 0 otherwise.

Since y*_{v_k} ≥ 1/ρ for every v_k ∈ W, the vector x* satisfies x*(δ(S)) ≥ 1/ρ for every cut separating two nodes of W, so x* is feasible for (R_∅). Thus we conclude that (relative to r_ij as in (4)) the optimum value z of (R_∅), and hence of (R_V), is less than or equal to z*(v_1):

(5) z = min {c^T x | x feasible for (R_V)} ≤ z*(v_1).

To prove Theorem 9, we are left to show that a min cost tree T in the (complete) subgraph induced by W has cost c(T) ≤ 2ρ z.

Lemma 10. A min cost tree T spanning all nodes in W has cost c(T) ≤ 2ρ z.
Proof. The value of a min cost spanning tree in the (complete) subgraph (W, E(W)) induced by W is given by (cf. e.g. Schrijver [15, (50.12)])

(MST) min c^T x
subject to x(E(S)) ≤ |S| − 1 for all ∅ ≠ S ⊆ W,
x(E(W)) = |W| − 1,
x ≥ 0.

Here, E(W) is the set of edges induced by W and E(S) is the set of edges induced by S ⊆ W. Furthermore, by slightly misusing the notation, the vectors x and c in (MST) are restricted to the edges in E(W).

To prove the claim of the lemma, it suffices to show that an optimal solution x ∈ R^E of the LP in (5) (which obviously has its support contained in E(W)) gives rise, after scaling, to a feasible solution of (MST). First note that the degree constraints in (R_V) give x(δ({i})) = 1/ρ for i ∈ W and x(δ({i})) = 0 for i ∉ W, which implies x(E) = x(E(W)) = (1/2) Σ_{i ∈ W} x(δ({i})) = |W|/(2ρ). Hence x' := 2ρ(1 − 1/|W|) x satisfies x'(E(W)) = |W| − 1. Moreover, for every ∅ ≠ S ⊊ W we have x(δ(S)) ≥ 1/ρ and hence

x(E(S)) = (1/2) (Σ_{i ∈ S} x(δ({i})) − x(δ(S))) ≤ (1/2) (|S|/ρ − 1/ρ) = (|S| − 1)/(2ρ),

so that x'(E(S)) ≤ 2ρ x(E(S)) ≤ |S| − 1; for S = W the corresponding constraint holds with equality. Thus x' is feasible for (MST), and the optimal tree cost satisfies c(T) ≤ c^T x' ≤ 2ρ z, which proves the lemma. □

We finally comment on the variants of the GMST problem where the clusters are non-disjoint or where at least one node has to be selected per cluster. Our approximation algorithm applies to these variants as well. Indeed, for the first case no change is necessary. For the second variant, one has to solve O(2^ρ) rooted relaxations (one for each possible choice of roots) to ensure connectivity of supp(x) ⊆ E. All other arguments remain unchanged.