Approximation Algorithms
Notes on Chapter 35 of Introduction to Algorithms.
Many problems of practical significance are NP-complete, yet too important to abandon. Even if a problem is NP-complete, we have at least three ways to cope:
- If actual inputs are small, an exponential-time algorithm may suffice
- Isolate important special cases solvable in polynomial time
- Find near-optimal solutions in polynomial time
An algorithm that returns near-optimal solutions is called an approximation algorithm.
Performance Ratios for Approximation Algorithms
For an optimization problem where each potential solution has a positive cost:
- Let C = cost of the approximate solution
- Let C* = cost of an optimal solution
An algorithm has an approximation ratio of ρ(n) if, for any input of size n:
$$\max\left(\frac{C}{C^*},\ \frac{C^*}{C}\right) \leq \rho(n)$$
- A ρ(n)-approximation algorithm achieves approximation ratio ρ(n)
- The ratio is always ≥ 1 (a 1-approximation algorithm produces an optimal solution)
- For minimization: C/C* gives the factor by which the approximate cost exceeds optimal
- For maximization: C*/C gives the factor by which optimal exceeds the approximate cost
Approximation Schemes
- An approximation scheme takes an additional input ε > 0 and is a (1 + ε)-approximation algorithm for any fixed ε
- A polynomial-time approximation scheme (PTAS): for any fixed ε > 0, runs in time polynomial in input size n
- Example running time: O(n^{2/ε})
- A fully polynomial-time approximation scheme (FPTAS): runs in time polynomial in both 1/ε and n
- Example running time: O((1/ε)² · n³)
- Any constant-factor decrease in ε yields only a constant-factor increase in running time
35.1 The Vertex-Cover Problem
A vertex cover of an undirected graph G = (V, E) is a subset V' ⊆ V such that for every edge (u, v) ∈ E, at least one of u or v is in V'. The vertex-cover problem is to find a vertex cover of minimum size (an optimal vertex cover).
This is the optimization version of an NP-complete decision problem.
Algorithm: APPROX-VERTEX-COVER
```
APPROX-VERTEX-COVER(G)
1  C = ∅
2  E' = G.E
3  while E' ≠ ∅
4      let (u, v) be an arbitrary edge of E'
5      C = C ∪ {u, v}
6      remove from E' every edge incident on either u or v
7  return C
```
How it works:
- Repeatedly pick an arbitrary edge, add both endpoints to the cover
- Remove all edges incident on those endpoints
- Continue until all edges are covered
Running time: O(V + E) using adjacency lists.
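The pseudocode can be sketched in Python. This is a minimal sketch: the edge-list representation and "first remaining edge" choice are implementation decisions, and rebuilding the list each round costs O(VE) rather than the O(V + E) achievable with adjacency lists.

```python
def approx_vertex_cover(edges):
    """2-approximation for vertex cover: repeatedly take both
    endpoints of an arbitrary uncovered edge, then discard every
    edge those endpoints cover."""
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]              # an arbitrary edge of E'
        cover.update((u, v))
        # remove every edge incident on either u or v
        remaining = [(a, b) for a, b in remaining
                     if a not in (u, v) and b not in (u, v)]
    return cover

# e.g. on a triangle the first edge (0, 1) is picked and covers everything
```

On the triangle {(0, 1), (1, 2), (0, 2)} this returns {0, 1}, which here happens to be optimal; in general the output is only guaranteed to be within a factor of 2.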
Theorem 35.1
APPROX-VERTEX-COVER is a polynomial-time 2-approximation algorithm.
Proof sketch:
- Let A = set of edges picked in line 4
- A is a maximal matching — no two edges in A share an endpoint
- Any vertex cover (including optimal C*) must include at least one endpoint of each edge in A:
- |C*| ≥ |A| (lower bound)
- Each picked edge adds exactly 2 vertices to C:
- |C| = 2|A| (upper bound)
- Combining: |C| = 2|A| ≤ 2|C*|
Key insight: We prove the bound by relating the solution to a lower bound on the optimal, without knowing the optimal itself.
35.2 The Traveling-Salesman Problem
Given a complete undirected graph G = (V, E) with nonnegative integer cost c(u, v) on each edge, find a hamiltonian cycle (tour) of minimum cost.
The triangle inequality states: for all vertices u, v, w ∈ V,
$$c(u, w) \leq c(u, v) + c(v, w)$$
This holds naturally in many applications (e.g., Euclidean distances).
35.2.1 TSP with the Triangle Inequality
Strategy: Compute a minimum spanning tree (MST) as a lower bound, then convert it to a tour.
```
APPROX-TSP-TOUR(G, c)
1  select a vertex r ∈ G.V to be a "root" vertex
2  compute a minimum spanning tree T for G from root r,
       using MST-PRIM(G, c, r)
3  let H be a list of vertices, ordered according to when
       they are first visited in a preorder tree walk of T
4  return the hamiltonian cycle H
```
Running time: Θ(V²) with a simple MST-PRIM implementation.
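A runnable sketch of the same strategy, assuming Euclidean points so the triangle inequality holds. Prim's algorithm and the stack-based preorder walk below are standard choices standing in for the book's MST-PRIM and tree walk; the 5-point sanity check at the bottom is illustrative.

```python
import math
from itertools import permutations

def approx_tsp_tour(points):
    """2-approximation for metric TSP: preorder walk of an MST."""
    n = len(points)
    def c(u, v):
        return math.dist(points[u], points[v])
    # Prim's algorithm rooted at vertex 0, O(V^2)
    in_tree = [False] * n
    parent = [None] * n
    key = [math.inf] * n
    key[0] = 0.0
    children = [[] for _ in range(n)]
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: key[i])
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            if not in_tree[v] and c(u, v) < key[v]:
                key[v] = c(u, v)
                parent[v] = u
    # preorder walk of T, recording each vertex at its first visit
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_cost(points, tour):
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

# sanity check on 5 points: cost is within twice the brute-force optimum
pts = [(0, 0), (2, 0), (3, 2), (1, 3), (-1, 1)]
best = min(tour_cost(pts, [0] + list(p)) for p in permutations(range(1, 5)))
assert tour_cost(pts, approx_tsp_tour(pts)) <= 2 * best
```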
Theorem 35.2
APPROX-TSP-TOUR is a polynomial-time 2-approximation algorithm for TSP with the triangle inequality.
Proof sketch:
- Let H* be an optimal tour. Deleting any edge from H* gives a spanning tree, so:
- c(T) ≤ c(H*) — MST weight is a lower bound on optimal tour cost
- A full walk W of T traverses every edge exactly twice:
- c(W) = 2·c(T)
- Combining: c(W) ≤ 2·c(H*)
- W is not a tour (visits vertices multiple times). By the triangle inequality, removing repeated visits does not increase cost
- The preorder walk H removes duplicates from W:
- c(H) ≤ c(W) ≤ 2·c(H*)
Example walkthrough:
- Full walk: a, b, c, b, h, b, a, d, e, f, e, g, e, d, a
- Preorder walk (first visits only): a, b, c, h, d, e, f, g
- The preorder tour cost ≈ 19.074 vs. optimal ≈ 14.715
35.2.2 The General TSP (Without Triangle Inequality)
Theorem 35.3
If P ≠ NP, then for any constant ρ ≥ 1, there is no polynomial-time ρ-approximation algorithm for the general TSP.
Proof sketch (by contradiction):
- Assume a ρ-approximation algorithm A exists
- Given a hamiltonian-cycle instance G = (V, E), construct a TSP instance G' = (V, E') where:
- c(u, v) = 1 if (u, v) ∈ E
- c(u, v) = ρ|V| + 1 otherwise
- If G has a hamiltonian cycle → optimal tour cost = |V|
- If G has no hamiltonian cycle → any tour costs > ρ|V|
- The gap (factor > ρ) means algorithm A would distinguish the two cases, solving the NP-complete hamiltonian-cycle problem in polynomial time
General technique: Show that "yes" instances of an NP-hard problem X map to instances of Y with value ≤ k, while "no" instances map to value > ρk. This proves no ρ-approximation exists (unless P = NP).
35.3 The Set-Covering Problem
An instance (X, F) consists of:
- A finite set X (the universe)
- A family F of subsets of X, where X = ∪_{S ∈ F} S
Goal: Find a minimum-size subset C ⊆ F that covers all of X (i.e., X = ∪_{S ∈ C} S).
The size of C is the number of sets it contains. The decision version is NP-complete.
Practical example: X = set of required skills, F = sets of skills each person has. Find the smallest committee covering all skills.
Greedy Approximation Algorithm
```
GREEDY-SET-COVER(X, F)
1  U = X
2  C = ∅
3  while U ≠ ∅
4      select an S ∈ F that maximizes |S ∩ U|
5      U = U - S
6      C = C ∪ {S}
7  return C
```
Strategy: At each step, pick the set covering the most uncovered elements (breaking ties arbitrarily).
Running time: O(|X| · |F| · min(|X|, |F|)) with a simple implementation; can be improved to O(Σ_{S ∈ F} |S|).
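A direct Python sketch of the greedy loop (the simple O(|X|·|F|·min(|X|, |F|)) version, with ties broken by position in F; the example instance below is illustrative):

```python
def greedy_set_cover(X, F):
    """Greedy set cover: repeatedly pick the set covering the most
    still-uncovered elements."""
    uncovered = set(X)
    chosen = []
    while uncovered:
        best = max(F, key=lambda S: len(S & uncovered))
        if not (best & uncovered):
            raise ValueError("F does not cover X")
        uncovered -= best
        chosen.append(best)
    return chosen

# A 12-element universe where greedy uses 4 sets but 3 suffice
# (the third, fourth, and fifth sets alone cover X):
X = set(range(1, 13))
F = [{1, 2, 3, 4, 5, 6}, {5, 6, 8, 9}, {1, 4, 7, 10},
     {2, 5, 7, 8, 11}, {3, 6, 9, 12}, {10, 11}]
```

Running `greedy_set_cover(X, F)` on this instance picks the 6-element set first and ends up with 4 sets, while the optimum is 3, illustrating that greedy is approximate rather than exact.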
Theorem 35.4
GREEDY-SET-COVER is a polynomial-time ρ(n)-approximation algorithm, where:
$$\rho(n) = H\left(\max\{|S| : S \in F\}\right)$$
Here H(d) = Σ_{i=1}^{d} 1/i is the d-th harmonic number.
Proof sketch:
- Assign a cost of 1 to each set selected; distribute this cost evenly among newly covered elements
- For element x first covered by set Sᵢ: cₓ = 1 / |Sᵢ - (S₁ ∪ ... ∪ Sᵢ₋₁)|
- Total cost: |C| = Σ_{x ∈ X} cₓ
- Since each x ∈ X belongs to at least one set in optimal C*:
- |C| ≤ Σ_{S ∈ C*} Σ_{x ∈ S} cₓ
- Key inequality: For any S ∈ F, Σ_{x ∈ S} cₓ ≤ H(|S|)
- Proved by tracking uᵢ = |S - (S₁ ∪ ... ∪ Sᵢ)| (uncovered elements of S after step i)
- The greedy choice ensures Sᵢ covers at least as many new elements as S would
- The sum telescopes to H(|S|)
- Therefore: |C| ≤ |C*| · H(max{|S| : S ∈ F})
Corollary 35.5
GREEDY-SET-COVER is a polynomial-time (ln |X| + 1)-approximation algorithm.
Special cases: When max{|S| : S ∈ F} is a small constant, the approximation ratio is a small constant. For example, finding a vertex cover in a graph with max degree 3 gives ratio H(3) = 11/6 ≈ 1.83.
35.4 Randomization and Linear Programming
Two powerful techniques for designing approximation algorithms.
Randomized Approximation for MAX-3-CNF Satisfiability
MAX-3-CNF satisfiability: Given a 3-CNF formula, find an assignment maximizing the number of satisfied clauses.
Algorithm: Independently set each variable to 1 with probability 1/2 and to 0 with probability 1/2.
A randomized ρ(n)-approximation algorithm satisfies:
$$\max\left(\frac{C}{C^*},\ \frac{C^*}{C}\right) \leq \rho(n)$$
where C is the expected cost of the solution.
Theorem 35.6
The random assignment algorithm is a randomized 8/7-approximation algorithm for MAX-3-CNF satisfiability.
Proof:
- Each clause has exactly 3 distinct literals; no clause contains both a variable and its negation
- A clause is unsatisfied only if all 3 literals are set to 0
- Pr[clause i is not satisfied] = (1/2)³ = 1/8
- Pr[clause i is satisfied] = 7/8
- Let Y = total number of satisfied clauses, where the formula has m clauses:
- E[Y] = 7m/8 (by linearity of expectation over per-clause indicator variables)
- Since m is an upper bound on the optimum:
- Approximation ratio ≤ m / (7m/8) = 8/7
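The 7/8 expectation can be checked by exhaustive enumeration on a toy formula (made up for illustration: three literals per clause, each over distinct variables), since the average over all assignments equals the expectation under a uniform random assignment.

```python
from itertools import product

# Toy 3-CNF formula: each clause is three (variable, is_positive)
# literals over distinct variables.
clauses = [((0, True),  (1, True),  (2, True)),
           ((0, False), (1, True),  (3, True)),
           ((1, True),  (2, False), (3, False))]
n_vars, m = 4, len(clauses)

# Sum the number of satisfied clauses over all 2^n assignments,
# then average: each clause fails in exactly 1/8 of assignments.
total = sum(
    sum(any(assign[v] == pos for v, pos in clause) for clause in clauses)
    for assign in product((False, True), repeat=n_vars)
)
expected = total / 2 ** n_vars
assert expected == 7 * m / 8
```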
Approximating Weighted Vertex Cover Using Linear Programming
Minimum-weight vertex-cover problem: Given G = (V, E) with positive vertex weights w(v), find a vertex cover of minimum total weight.
Step 1: Formulate as a 0-1 integer program:
minimize Σ_{v ∈ V} w(v) · x(v)
subject to x(u) + x(v) ≥ 1 for each (u, v) ∈ E
x(v) ∈ {0, 1} for each v ∈ V
Step 2: Relax to a linear program (replace x(v) ∈ {0, 1} with 0 ≤ x(v) ≤ 1):
minimize Σ_{v ∈ V} w(v) · x(v)
subject to x(u) + x(v) ≥ 1 for each (u, v) ∈ E
0 ≤ x(v) ≤ 1 for each v ∈ V
The LP optimal value is a lower bound on the integer program (and hence on the optimal vertex cover weight).
Step 3: Round the LP solution:
```
APPROX-MIN-WEIGHT-VC(G, w)
1  C = ∅
2  compute x̄, an optimal solution to the LP relaxation
3  for each v ∈ G.V
4      if x̄(v) ≥ 1/2
5          C = C ∪ {v}
6  return C
```
Theorem 35.7
APPROX-MIN-WEIGHT-VC is a polynomial-time 2-approximation algorithm for minimum-weight vertex cover.
Proof:
- C is a valid vertex cover: For any edge (u, v), the LP constraint gives x̄(u) + x̄(v) ≥ 1, so at least one of them is ≥ 1/2 and gets included
- Weight bound: Let z* = optimal LP value. Then:
- z* ≥ Σ_{v: x̄(v) ≥ 1/2} w(v) · x̄(v) ≥ Σ_{v: x̄(v) ≥ 1/2} w(v) · (1/2) = w(C)/2
- So w(C) ≤ 2z*
- Since z* ≤ w(C*) (LP relaxation lower bound):
- w(C) ≤ 2z* ≤ 2w(C*)
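To see the rounding step concretely, take the unit-weight triangle: setting x̄(v) = 1/2 at every vertex is a feasible fractional solution of value 3/2 (and in fact optimal, since any integral cover needs 2 vertices). The LP solution is hardcoded below as an assumption rather than produced by a solver; only the rounding step from the pseudocode is exercised.

```python
# Triangle graph with unit weights; x_bar is an assumed optimal LP
# solution of value 3/2, not computed by an LP solver.
V = [0, 1, 2]
E = [(0, 1), (1, 2), (0, 2)]
w = {v: 1 for v in V}
x_bar = {v: 0.5 for v in V}

# LP feasibility: every edge constraint x(u) + x(v) >= 1 holds.
assert all(x_bar[u] + x_bar[v] >= 1 for u, v in E)

# Rounding: keep every vertex with x_bar(v) >= 1/2.
C = {v for v in V if x_bar[v] >= 0.5}
weight = sum(w[v] for v in C)
z_star = sum(w[v] * x_bar[v] for v in V)
assert weight <= 2 * z_star  # here 3 <= 3, the bound is tight
```

The triangle shows the factor-2 analysis is tight: rounding takes all three vertices even though two suffice.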
35.5 The Subset-Sum Problem
Instance: A set S = {x₁, x₂, ..., xₙ} of positive integers and a target value t.
Optimization version: Find a subset of S whose sum is as large as possible but not larger than t.
Practical example: A truck with weight limit t and n boxes of various weights — maximize the load.
Exponential-Time Exact Algorithm
Key idea: Iteratively compute Lᵢ, the sorted list of all subset sums of {x₁, ..., xᵢ} that don't exceed t.
Let L + x denote the list obtained by adding x to every element of L.
```
EXACT-SUBSET-SUM(S, t)
1  n = |S|
2  L₀ = ⟨0⟩
3  for i = 1 to n
4      Lᵢ = MERGE-LISTS(Lᵢ₋₁, Lᵢ₋₁ + xᵢ)
5      remove from Lᵢ every element that is greater than t
6  return the largest element in Lₙ
```
- MERGE-LISTS(L, L'): Merges two sorted lists, removing duplicates, in O(|L| + |L'|) time
- Uses the identity: Pᵢ = Pᵢ₋₁ ∪ (Pᵢ₋₁ + xᵢ), where Pᵢ is the set of all achievable sums from {x₁, ..., xᵢ}
- |Lᵢ| can be as large as 2ⁱ → exponential time in general
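A Python sketch of the exact algorithm; a Python set stands in for MERGE-LISTS (it deduplicates the same way, changing the data structure but not the result):

```python
def exact_subset_sum(S, t):
    """Exponential-time exact algorithm: L holds every distinct
    subset sum of the prefix processed so far, capped at t."""
    L = [0]
    for x in S:
        # L_i = merge of L_{i-1} and L_{i-1} + x, duplicates removed
        merged = sorted(set(L) | {y + x for y in L})
        # discard every sum exceeding the target
        L = [y for y in merged if y <= t]
    return max(L)
```

Since |L| can double at every step, the running time is exponential in the worst case, which is what motivates trimming.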
Fully Polynomial-Time Approximation Scheme
Key idea: Trim each list Lᵢ after creation. If two values are close, keep only one representative.
Trimming: Given a sorted list L = ⟨y₁, y₂, ..., yₘ⟩ and a parameter δ (0 < δ < 1), remove elements so that for every removed y, some remaining z satisfies:
$$\frac{y}{1 + \delta} \leq z \leq y$$
```
TRIM(L, δ)
1  let m be the length of L
2  L' = ⟨y₁⟩
3  last = y₁
4  for i = 2 to m
5      if yᵢ > last · (1 + δ)
6          append yᵢ onto the end of L'
7          last = yᵢ
8  return L'
```
Running time: Θ(m) for a list of length m.
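TRIM translates almost line for line (the sample list below is illustrative):

```python
def trim(L, delta):
    """Trim a sorted list: keep an element only if it exceeds the
    last kept element by more than a factor of (1 + delta); every
    dropped y is then represented by a kept z with y/(1+delta) <= z <= y."""
    trimmed = [L[0]]
    last = L[0]
    for y in L[1:]:
        if y > last * (1 + delta):
            trimmed.append(y)
            last = y
    return trimmed

# e.g. trim([10, 11, 12, 15, 20, 21, 22, 23, 24, 29], 0.1)
# drops 11 (represented by 10), 21, 22 (by 20), and 24 (by 23)
```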
The Approximation Scheme
```
APPROX-SUBSET-SUM(S, t, ε)
1  n = |S|
2  L₀ = ⟨0⟩
3  for i = 1 to n
4      Lᵢ = MERGE-LISTS(Lᵢ₋₁, Lᵢ₋₁ + xᵢ)
5      Lᵢ = TRIM(Lᵢ, ε/2n)
6      remove from Lᵢ every element that is greater than t
7  let z* be the largest value in Lₙ
8  return z*
```
Example: S = ⟨104, 102, 201, 101⟩, t = 308, ε = 0.40, so δ = ε/(2n) = 0.40/8 = 0.05
| Step | List |
|---|---|
| L₀ | ⟨0⟩ |
| L₁ (after trim) | ⟨0, 104⟩ |
| L₂ (after trim) | ⟨0, 102, 206⟩ |
| L₃ (after trim & remove) | ⟨0, 102, 201, 303⟩ |
| L₄ (after trim & remove) | ⟨0, 101, 201, 302⟩ |
Returns z* = 302, which is within 2% of optimal 307 = 104 + 102 + 101 (well within ε = 40%).
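The worked example can be reproduced with a short self-contained sketch (a Python set stands in for MERGE-LISTS, and the TRIM loop is inlined):

```python
def approx_subset_sum(S, t, eps):
    """FPTAS for subset sum: the exact merge-and-filter loop, but
    each list is trimmed with delta = eps/(2n) so its length stays
    polynomial in n, lg t, and 1/eps."""
    n = len(S)
    delta = eps / (2 * n)
    L = [0]
    for x in S:
        merged = sorted(set(L) | {y + x for y in L})
        # TRIM: keep y only if it exceeds the last kept value
        # by more than a factor of (1 + delta)
        trimmed, last = [merged[0]], merged[0]
        for y in merged[1:]:
            if y > last * (1 + delta):
                trimmed.append(y)
                last = y
        # discard sums exceeding the target
        L = [y for y in trimmed if y <= t]
    return max(L)

# reproduces the worked example: returns 302 for the instance above
```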
Theorem 35.8
APPROX-SUBSET-SUM is a fully polynomial-time approximation scheme for the subset-sum problem.
Proof of approximation ratio:
- For every element y ∈ Pᵢ with y ≤ t, there exists z ∈ Lᵢ such that:
- y / (1 + ε/(2n))ⁱ ≤ z ≤ y
- For the optimal sum y* ∈ Pₙ and the returned z*:
- y*/z* ≤ (1 + ε/(2n))ⁿ ≤ e^{ε/2} ≤ 1 + ε (using 0 < ε < 1)
Proof of polynomial running time:
- After trimming, successive elements differ by a factor of at least (1 + ε/2n)
- Each list Lᵢ therefore has O((n ln t)/ε) elements
- This is polynomial in both the input size and 1/ε
- Total running time is polynomial in n, lg t, and 1/ε
Summary of Approximation Results
| Problem | Algorithm | Approximation Ratio | Time |
|---|---|---|---|
| Vertex Cover | APPROX-VERTEX-COVER | 2 | O(V + E) |
| TSP (triangle inequality) | APPROX-TSP-TOUR | 2 | Θ(V²) |
| General TSP | — | No constant ratio (unless P = NP) | — |
| Set Cover | GREEDY-SET-COVER | H(max|S|) ≤ ln|X| + 1 | O(|X|·|F|·min(|X|,|F|)) |
| MAX-3-CNF SAT | Random assignment | 8/7 (expected) | O(n) |
| Weighted Vertex Cover | APPROX-MIN-WEIGHT-VC (LP rounding) | 2 | Poly (LP solve) |
| Subset Sum | APPROX-SUBSET-SUM | (1 + ε) — FPTAS | Poly in n, lg t, 1/ε |