
Approximation Algorithms

Introduction to Algorithms

Chapter 35: Approximation Algorithms

Many problems of practical significance are NP-complete, yet too important to abandon. Even if a problem is NP-complete, we have at least three ways to cope:

  1. If actual inputs are small, an exponential-time algorithm may suffice
  2. Isolate important special cases solvable in polynomial time
  3. Find near-optimal solutions in polynomial time

An algorithm that returns near-optimal solutions is called an approximation algorithm.

Performance Ratios for Approximation Algorithms

For an optimization problem where each potential solution has a positive cost:

  • Let C = cost of the approximate solution
  • Let C* = cost of an optimal solution

An algorithm has an approximation ratio of ρ(n) if, for any input of size n:

$$\max\left(\frac{C}{C^*},\ \frac{C^*}{C}\right) \leq \rho(n)$$

  • A ρ(n)-approximation algorithm achieves approximation ratio ρ(n)
  • The ratio is always ≥ 1 (a 1-approximation algorithm produces an optimal solution)
  • For minimization: C/C* gives the factor by which the approximate cost exceeds optimal
  • For maximization: C*/C gives the factor by which optimal exceeds the approximate cost

Approximation Schemes

  • An approximation scheme takes an additional input ε > 0 and is a (1 + ε)-approximation algorithm for any fixed ε
  • A polynomial-time approximation scheme (PTAS): for any fixed ε > 0, runs in time polynomial in input size n
    • Example running time: O(n^{2/ε})
  • A fully polynomial-time approximation scheme (FPTAS): runs in time polynomial in both 1/ε and n
    • Example running time: O((1/ε)² · n³)
    • Any constant-factor decrease in ε yields only a constant-factor increase in running time

35.1 The Vertex-Cover Problem

A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that for every edge (u, v) ∈ E, at least one of u or v is in V'. The vertex-cover problem is to find a vertex cover of minimum size (an optimal vertex cover).

This is the optimization version of an NP-complete decision problem.

Algorithm: APPROX-VERTEX-COVER

APPROX-VERTEX-COVER(G)
1  C = ∅
2  E' = G.E
3  while E' ≠ ∅
4      let (u, v) be an arbitrary edge of E'
5      C = C ∪ {u, v}
6      remove from E' every edge incident on either u or v
7  return C

How it works:

  • Repeatedly pick an arbitrary edge, add both endpoints to the cover
  • Remove all edges incident on those endpoints
  • Continue until all edges are covered

Running time: O(V + E) using adjacency lists.
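
The pseudocode above translates almost line for line into Python. This is an illustrative sketch (the edge-list representation and function name are my own choices, not from the text):

```python
def approx_vertex_cover(edges):
    """2-approximation for vertex cover: repeatedly pick an arbitrary
    uncovered edge, add BOTH endpoints, and discard every edge they touch."""
    cover = set()
    remaining = set(edges)
    while remaining:
        u, v = next(iter(remaining))  # an arbitrary edge of E'
        cover |= {u, v}
        # remove from E' every edge incident on either u or v
        remaining = {(a, b) for (a, b) in remaining
                     if a not in (u, v) and b not in (u, v)}
    return cover

# Example: a path a-b-c-d plus the chord b-d (optimal cover is {b, c})
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")]
cover = approx_vertex_cover(edges)
```

Whichever edge the loop happens to pick first, the result covers every edge and has at most twice the optimal size (here, at most 4 vertices).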

Theorem 35.1

APPROX-VERTEX-COVER is a polynomial-time 2-approximation algorithm.

Proof sketch:

  • Let A = set of edges picked in line 4
  • A is a maximal matching — no two edges in A share an endpoint
  • Any vertex cover (including optimal C*) must include at least one endpoint of each edge in A:
    • |C*| ≥ |A| (lower bound)
  • Each picked edge adds exactly 2 vertices to C:
    • |C| = 2|A| (upper bound)
  • Combining: |C| = 2|A| ≤ 2|C*|

Key insight: We prove the bound by relating the solution to a lower bound on the optimal, without knowing the optimal itself.


35.2 The Traveling-Salesman Problem

Given a complete undirected graph G = (V, E) with nonnegative integer cost c(u, v) on each edge, find a hamiltonian cycle (tour) of minimum cost.

The triangle inequality states: for all vertices u, v, w ∈ V,

$$c(u, w) \leq c(u, v) + c(v, w)$$

This holds naturally in many applications (e.g., Euclidean distances).

35.2.1 TSP with the Triangle Inequality

Strategy: Compute a minimum spanning tree (MST) as a lower bound, then convert it to a tour.

APPROX-TSP-TOUR(G, c)
1  select a vertex r ∈ G.V to be a "root" vertex
2  compute a minimum spanning tree T for G from root r
     using MST-PRIM(G, c, r)
3  let H be a list of vertices, ordered according to when they
     are first visited in a preorder tree walk of T
4  return the hamiltonian cycle H

Running time: Θ(V²) with a simple MST-PRIM implementation.
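
The algorithm can be sketched in Python for Euclidean instances, where the triangle inequality holds automatically. The O(V²) Prim loop and the stack-based preorder walk below are illustrative choices, not from the text:

```python
import math

def approx_tsp_tour(points):
    """2-approximation for Euclidean TSP: build an MST with a simple
    O(V^2) Prim's algorithm, then return a preorder walk of the tree."""
    n = len(points)
    dist = lambda i, j: math.dist(points[i], points[j])

    # MST-PRIM from root vertex 0
    in_tree = [False] * n
    key = [math.inf] * n        # cheapest connection to the tree so far
    parent = [None] * n
    key[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: key[i])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)   # parent[u] is final once u enters
        for v in range(n):
            if not in_tree[v] and dist(u, v) < key[v]:
                key[v], parent[v] = dist(u, v), u

    # preorder walk of T: list vertices in order of first visit
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

pts = [(0, 0), (1, 3), (4, 0), (2, 2), (3, 3)]
tour = approx_tsp_tour(pts)
```

The returned list is the hamiltonian cycle H (the edge back to the start is implicit); by Theorem 35.2 its cost is at most twice the optimal tour cost.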

Theorem 35.2

APPROX-TSP-TOUR is a polynomial-time 2-approximation algorithm for TSP with the triangle inequality.

Proof sketch:

  1. Let H* be an optimal tour. Deleting any edge from H* gives a spanning tree, so:
    • c(T) ≤ c(H*) — MST weight is a lower bound on optimal tour cost
  2. A full walk W of T traverses every edge exactly twice:
    • c(W) = 2·c(T)
  3. Combining: c(W) ≤ 2·c(H*)
  4. W is not a tour (visits vertices multiple times). By the triangle inequality, removing repeated visits does not increase cost
  5. The preorder walk H removes duplicates from W:
    • c(H) ≤ c(W) ≤ 2·c(H*)

Example walkthrough:

  • Full walk: a, b, c, b, h, b, a, d, e, f, e, g, e, d, a
  • Preorder walk (first visits only): a, b, c, h, d, e, f, g
  • The preorder tour cost ≈ 19.074 vs. optimal ≈ 14.715

35.2.2 The General TSP (Without Triangle Inequality)

Theorem 35.3

If P ≠ NP, then for any constant ρ ≥ 1, there is no polynomial-time ρ-approximation algorithm for the general TSP.

Proof sketch (by contradiction):

  1. Assume a ρ-approximation algorithm A exists
  2. Given a hamiltonian-cycle instance G = (V, E), construct a TSP instance G' = (V, E') where:
    • c(u, v) = 1 if (u, v) ∈ E
    • c(u, v) = ρ|V| + 1 otherwise
  3. If G has a hamiltonian cycle → optimal tour cost = |V|
  4. If G has no hamiltonian cycle → any tour costs > ρ|V|
  5. The gap (factor > ρ) means algorithm A would distinguish the two cases, solving the NP-complete hamiltonian-cycle problem in polynomial time

General technique: Show that "yes" instances of an NP-hard problem X map to instances of Y with value ≤ k, while "no" instances map to value > ρk. This proves no ρ-approximation exists (unless P = NP).
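
The gap construction from the proof can be demonstrated on tiny graphs, where a brute-force TSP solver is feasible. This sketch (function names and the two 4-vertex example graphs are my own) shows the cost gap between "has a hamiltonian cycle" and "has none":

```python
import itertools

def gap_costs(n, edges, rho):
    """Cost function from the Theorem 35.3 reduction:
    cost 1 on edges of G, cost rho*n + 1 on non-edges."""
    E = {frozenset(e) for e in edges}
    return lambda u, v: 1 if frozenset((u, v)) in E else rho * n + 1

def brute_force_tsp(n, c):
    """Optimal tour cost by enumeration -- only for tiny n."""
    tours = ((0,) + p for p in itertools.permutations(range(1, n)))
    return min(sum(c(t[i], t[(i + 1) % n]) for i in range(n)) for t in tours)

n, rho = 4, 2
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]   # has a hamiltonian cycle
star  = [(0, 1), (0, 2), (0, 3)]           # has none
opt_yes = brute_force_tsp(n, gap_costs(n, cycle, rho))  # = n = 4
opt_no  = brute_force_tsp(n, gap_costs(n, star, rho))   # > rho*n = 8
```

Any ρ-approximation algorithm would return a tour of cost at most ρn = 8 in the first case but could not do better than 20 in the second, so it would decide hamiltonicity.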


35.3 The Set-Covering Problem

An instance (X, F) consists of:

  • A finite set X (the universe)
  • A family F of subsets of X, where X = ∪_{S ∈ F} S

Goal: Find a minimum-size subset C ⊆ F that covers all of X (i.e., X = ∪_{S ∈ C} S).

The size of C is the number of sets it contains. The decision version is NP-complete.

Practical example: X = set of required skills, F = sets of skills each person has. Find the smallest committee covering all skills.

Greedy Approximation Algorithm

GREEDY-SET-COVER(X, F)
1  U = X
2  C = ∅
3  while U ≠ ∅
4      select an S ∈ F that maximizes |S ∩ U|
5      U = U - S
6      C = C ∪ {S}
7  return C

Strategy: At each step, pick the set covering the most uncovered elements (breaking ties arbitrarily).

Running time: O(|X| · |F| · min(|X|, |F|)) with a simple implementation; can be improved to O(Σ_{S ∈ F} |S|).
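
The greedy rule is a one-liner in Python. The committee example below is a hypothetical instance of the skills scenario described above (assumes every skill is covered by at least one person, otherwise the loop would not terminate):

```python
def greedy_set_cover(X, F):
    """Greedy set cover: repeatedly take the set covering the most
    still-uncovered elements (ties broken by list order)."""
    uncovered = set(X)
    cover = []
    while uncovered:
        best = max(F, key=lambda S: len(S & uncovered))
        uncovered -= best
        cover.append(best)
    return cover

# Hypothetical committee example: each set is one person's skills
skills = {"C", "SQL", "ML", "UX", "Ops"}
people = [{"C", "SQL"}, {"ML", "UX"}, {"SQL", "ML"}, {"Ops"}, {"C", "Ops"}]
committee = greedy_set_cover(skills, people)   # 3 people suffice here
```

By Theorem 35.4, the committee found is at most H(max set size) times larger than the smallest possible one.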

Theorem 35.4

GREEDY-SET-COVER is a polynomial-time ρ(n)-approximation algorithm, where:

$$\rho(n) = H\left(\max\{|S| : S \in F\}\right)$$

Here H(d) = Σ_{i=1}^{d} 1/i is the d-th harmonic number.

Proof sketch:

  • Assign a cost of 1 to each set selected; distribute this cost evenly among newly covered elements
  • For element x first covered by set Sᵢ: cₓ = 1 / |Sᵢ - (S₁ ∪ ... ∪ Sᵢ₋₁)|
  • Total cost: |C| = Σ_{x ∈ X} cₓ
  • Since each x ∈ X belongs to at least one set in optimal C*:
    • |C| ≤ Σ_{S ∈ C*} Σ_{x ∈ S} cₓ
  • Key inequality: For any S ∈ F, Σ_{x ∈ S} cₓ ≤ H(|S|)
    • Proved by tracking uᵢ = |S - (S₁ ∪ ... ∪ Sᵢ)| (uncovered elements of S after step i)
    • The greedy choice ensures Sᵢ covers at least as many new elements as S would
    • The sum telescopes to H(|S|)
  • Therefore: |C| ≤ |C*| · H(max{|S| : S ∈ F})

Corollary 35.5

GREEDY-SET-COVER is a polynomial-time (ln |X| + 1)-approximation algorithm.

Special cases: When max{|S| : S ∈ F} is a small constant, the approximation ratio is a small constant. For example, finding a vertex cover in a graph with max degree 3 gives ratio H(3) = 11/6 ≈ 1.83.


35.4 Randomization and Linear Programming

Two powerful techniques for designing approximation algorithms.

Randomized Approximation for MAX-3-CNF Satisfiability

MAX-3-CNF satisfiability: Given a 3-CNF formula, find an assignment maximizing the number of satisfied clauses.

Algorithm: Independently set each variable to 1 with probability 1/2 and to 0 with probability 1/2.

A randomized ρ(n)-approximation algorithm satisfies:

$$\max\left(\frac{C}{C^*},\ \frac{C^*}{C}\right) \leq \rho(n)$$

where C is the expected cost of the solution.

Theorem 35.6

The random assignment algorithm is a randomized 8/7-approximation algorithm for MAX-3-CNF satisfiability.

Proof:

  • Each clause has exactly 3 distinct literals; no clause contains both a variable and its negation
  • A clause is unsatisfied only if all 3 literals are set to 0
  • Pr[clause i is not satisfied] = (1/2)³ = 1/8
  • Pr[clause i is satisfied] = 7/8
  • Let Y = total number of satisfied clauses:
    • E[Y] = 7m/8 (by linearity of expectation)
  • Since m is an upper bound on the optimum:
    • Approximation ratio ≤ m / (7m/8) = 8/7
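
The expectation E[Y] = 7m/8 can be checked exactly by averaging over all 2ⁿ assignments of a small formula. The 3-clause formula below is hypothetical; the only requirement is that each clause have 3 literals over distinct variables:

```python
from itertools import product

# Literal (v, True) means x_v; (v, False) means NOT x_v.
clauses = [
    [(0, True), (1, True), (2, False)],
    [(1, False), (2, True), (3, True)],
    [(0, False), (2, False), (3, False)],
]

def satisfied(clause, a):
    """A clause is satisfied if any of its literals evaluates to true."""
    return any(a[v] == want for v, want in clause)

n = 4
total = sum(
    sum(satisfied(c, a) for c in clauses)
    for a in product([False, True], repeat=n)
)
expected = total / 2**n   # exact average over the uniform random assignment
```

Each clause is falsified by exactly 1/8 of the assignments, so the average equals 7m/8 = 21/8 = 2.625, as linearity of expectation predicts.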

Approximating Weighted Vertex Cover Using Linear Programming

Minimum-weight vertex-cover problem: Given G = (V, E) with positive vertex weights w(v), find a vertex cover of minimum total weight.

Step 1: Formulate as a 0-1 integer program:

minimize   Σ_{v ∈ V} w(v) · x(v)
subject to x(u) + x(v) ≥ 1    for each (u, v) ∈ E
           x(v) ∈ {0, 1}       for each v ∈ V

Step 2: Relax to a linear program (replace x(v) ∈ {0, 1} with 0 ≤ x(v) ≤ 1):

minimize   Σ_{v ∈ V} w(v) · x(v)
subject to x(u) + x(v) ≥ 1    for each (u, v) ∈ E
           0 ≤ x(v) ≤ 1       for each v ∈ V

The LP optimal value is a lower bound on the integer program (and hence on the optimal vertex cover weight).

Step 3: Round the LP solution:

APPROX-MIN-WEIGHT-VC(G, w)
1  C = ∅
2  compute x̄, an optimal solution to the LP relaxation
3  for each v ∈ V
4      if x̄(v) ≥ 1/2
5          C = C ∪ {v}
6  return C

Theorem 35.7

APPROX-MIN-WEIGHT-VC is a polynomial-time 2-approximation algorithm for minimum-weight vertex cover.

Proof:

  1. C is a valid vertex cover: For any edge (u, v), the LP constraint gives x̄(u) + x̄(v) ≥ 1, so at least one of them is ≥ 1/2 and gets included
  2. Weight bound: Let z* = optimal LP value. Then:
    • z* ≥ Σ_{v: x̄(v) ≥ 1/2} w(v) · x̄(v) ≥ Σ_{v: x̄(v) ≥ 1/2} w(v) · (1/2) = w(C)/2
    • So w(C) ≤ 2z*
  3. Since z* ≤ w(C*) (LP relaxation lower bound):
    • w(C) ≤ 2z* ≤ 2w(C*)
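
The rounding step (line 3-5 of the pseudocode) is simple once an LP solution is in hand. This sketch assumes the fractional solution x̄ is supplied, sidestepping the LP solver; the triangle example is my own:

```python
def round_lp_solution(edges, x_bar):
    """Rounding step of APPROX-MIN-WEIGHT-VC: keep every vertex whose
    fractional value is at least 1/2. Assumes x_bar is LP-feasible,
    i.e. x_bar[u] + x_bar[v] >= 1 for every edge (u, v)."""
    cover = {v for v, x in x_bar.items() if x >= 0.5}
    # feasibility of x_bar guarantees this is a valid vertex cover
    assert all(u in cover or v in cover for u, v in edges)
    return cover

# Triangle with unit weights: the LP optimum sets every x(v) = 1/2,
# with value z* = 3/2
edges = [("a", "b"), ("b", "c"), ("a", "c")]
x_bar = {"a": 0.5, "b": 0.5, "c": 0.5}
cover = round_lp_solution(edges, x_bar)
```

Here the rounded cover has weight 3 = 2z*, exactly the bound of Theorem 35.7, while the true optimum (any two vertices) has weight 2 — the triangle is a tight case for the weight analysis.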

35.5 The Subset-Sum Problem

Instance: A set S = {x₁, x₂, ..., xₙ} of positive integers and a target value t.

Optimization version: Find a subset of S whose sum is as large as possible but not larger than t.

Practical example: A truck with weight limit t and n boxes of various weights — maximize the load.

Exponential-Time Exact Algorithm

Key idea: Iteratively compute Lᵢ, the sorted list of all subset sums of {x₁, ..., xᵢ} that don't exceed t.

Let L + x denote the list obtained by adding x to every element of L.

EXACT-SUBSET-SUM(S, t)
1  n = |S|
2  L₀ = ⟨0⟩
3  for i = 1 to n
4      Lᵢ = MERGE-LISTS(Lᵢ₋₁, Lᵢ₋₁ + xᵢ)
5      remove from Lᵢ every element that is greater than t
6  return the largest element in Lₙ

  • MERGE-LISTS(L, L'): Merges two sorted lists, removing duplicates, in O(|L| + |L'|) time
  • Uses the identity: Pᵢ = Pᵢ₋₁ ∪ (Pᵢ₋₁ + xᵢ), where Pᵢ is the set of all achievable sums from {x₁, ..., xᵢ}
  • |Lᵢ| can be as large as 2ⁱ → exponential time in general
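
In Python, a set takes the place of MERGE-LISTS plus duplicate removal. A minimal sketch (the function name is my own):

```python
def exact_subset_sum(S, t):
    """Exact subset sum: after processing the first i numbers, L holds
    every distinct subset sum that does not exceed t. Exponential in
    the worst case, since |L| can reach 2**i."""
    L = {0}
    for x in S:
        # P_i = P_{i-1} union (P_{i-1} + x_i), then discard sums > t
        L = {s for s in (L | {y + x for y in L}) if s <= t}
    return max(L)

result = exact_subset_sum([104, 102, 201, 101], 308)   # 307 = 104 + 102 + 101
```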

Fully Polynomial-Time Approximation Scheme

Key idea: Trim each list Lᵢ after creation. If two values are close, keep only one representative.

Trimming: Given a list L and parameter δ (0 < δ < 1), remove elements so that for every removed y, some remaining z satisfies:

$$\frac{y}{1 + \delta} \leq z \leq y$$

TRIM(L, δ)
1  let m be the length of L
2  L' = ⟨y₁⟩
3  last = y₁
4  for i = 2 to m
5      if yᵢ > last · (1 + δ)
6          append yᵢ onto the end of L'
7          last = yᵢ
8  return L'

Running time: Θ(m) for a list of length m.
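
The TRIM pseudocode maps directly to Python; the sample list is illustrative:

```python
def trim(L, delta):
    """TRIM: L is sorted ascending. Drop any element within a factor
    of (1 + delta) of the last element kept, so every removed y has a
    surviving representative z with y/(1+delta) <= z <= y."""
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:
        if y > last * (1 + delta):   # y too far from last kept value
            trimmed.append(y)
            last = y
    return trimmed

short = trim([10, 11, 12, 15, 20, 21, 22, 23, 24, 29], 0.10)
# keeps [10, 12, 15, 20, 23, 29]
```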

The Approximation Scheme

APPROX-SUBSET-SUM(S, t, ε)
1  n = |S|
2  L₀ = ⟨0⟩
3  for i = 1 to n
4      Lᵢ = MERGE-LISTS(Lᵢ₋₁, Lᵢ₋₁ + xᵢ)
5      Lᵢ = TRIM(Lᵢ, ε/2n)
6      remove from Lᵢ every element that is greater than t
7  let z* be the largest value in Lₙ
8  return z*

Example: S = ⟨104, 102, 201, 101⟩, t = 308, ε = 0.40, so δ = ε/(2n) = 0.40/8 = 0.05

  • L₀: ⟨0⟩
  • L₁ (after trim): ⟨0, 104⟩
  • L₂ (after trim): ⟨0, 102, 206⟩
  • L₃ (after trim & remove): ⟨0, 102, 201, 303⟩
  • L₄ (after trim & remove): ⟨0, 101, 201, 302⟩

Returns z* = 302, which is within 2% of optimal 307 = 104 + 102 + 101 (well within ε = 40%).
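
Putting MERGE-LISTS, TRIM, and the removal step together gives a compact Python sketch of APPROX-SUBSET-SUM (trimming is inlined here for self-containment):

```python
def approx_subset_sum(S, t, eps):
    """FPTAS for subset sum: like the exact algorithm, but trim every
    list with delta = eps / (2n) so its length stays polynomial."""
    n = len(S)
    delta = eps / (2 * n)
    L = [0]
    for x in S:
        merged = sorted(set(L) | {y + x for y in L})   # MERGE-LISTS
        trimmed, last = [merged[0]], merged[0]         # TRIM(L, delta)
        for y in merged[1:]:
            if y > last * (1 + delta):
                trimmed.append(y)
                last = y
        L = [y for y in trimmed if y <= t]             # drop sums > t
    return max(L)

z = approx_subset_sum([104, 102, 201, 101], 308, 0.40)   # 302, as in the example above
```

Running this reproduces the walkthrough: the final list is ⟨0, 101, 201, 302⟩ and the answer 302 is within a factor (1 + ε) of the optimum 307.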

Theorem 35.8

APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme for the subset-sum problem.

Proof of approximation ratio:

  • For every element y in Pᵢ that is ≤ t, there exists z ∈ Lᵢ such that:
    • y / (1 + ε/2n)ⁱ ≤ z ≤ y
  • For the optimal y* ∈ Pₙ and the returned z*:
    • y* / z* ≤ (1 + ε/2n)ⁿ ≤ e^{ε/2} ≤ 1 + ε (using the fact that ε < 1)

Proof of polynomial running time:

  • After trimming, successive elements differ by a factor of at least (1 + ε/2n)
  • Each list Lᵢ has at most O(n·ln(t)/ε) elements
  • This is polynomial in both the input size and 1/ε
  • Total running time is polynomial in n, lg t, and 1/ε

Summary of Approximation Results

  • Vertex Cover: APPROX-VERTEX-COVER, ratio 2, O(V + E)
  • TSP with triangle inequality: APPROX-TSP-TOUR, ratio 2, Θ(V²)
  • General TSP: no constant-ratio polynomial-time algorithm exists unless P = NP
  • Set Cover: GREEDY-SET-COVER, ratio H(max{|S| : S ∈ F}) ≤ ln|X| + 1, O(|X|·|F|·min(|X|, |F|))
  • MAX-3-CNF SAT: random assignment, ratio 8/7 (in expectation), O(n)
  • Weighted Vertex Cover: APPROX-MIN-WEIGHT-VC (LP rounding), ratio 2, polynomial (dominated by solving the LP)
  • Subset Sum: APPROX-SUBSET-SUM, ratio 1 + ε (an FPTAS), polynomial in n, lg t, and 1/ε
