Greedy Algorithms
Introduction to Algorithms, Chapter 16: Greedy Algorithms
A greedy algorithm always makes the choice that looks best at the moment — it makes a locally optimal choice in the hope that this leads to a globally optimal solution.
- Greedy algorithms do not always yield optimal solutions, but for many problems they do.
- They are typically simpler and more efficient than dynamic programming when applicable.
- Key applications: minimum spanning trees (Ch. 23), Dijkstra's shortest paths (Ch. 24), Chvátal's set-covering heuristic (Ch. 35).
16.1 An Activity-Selection Problem
Problem Definition
Given a set S = {a₁, a₂, ..., aₙ} of n proposed activities that require exclusive use of a common resource (e.g., a lecture hall):
- Each activity aᵢ has a start time sᵢ and a finish time fᵢ, where 0 ≤ sᵢ < fᵢ < ∞.
- Activity aᵢ occupies the half-open interval [sᵢ, fᵢ).
- Two activities aᵢ and aⱼ are compatible if their intervals do not overlap: sᵢ ≥ fⱼ or sⱼ ≥ fᵢ.
- Goal: Select a maximum-size subset of mutually compatible activities.
Assumption: Activities are sorted in monotonically increasing order of finish time:
f₁ ≤ f₂ ≤ f₃ ≤ ... ≤ fₙ₋₁ ≤ fₙ
Example:
| i | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| sᵢ | 1 | 3 | 0 | 5 | 3 | 5 | 6 | 8 | 8 | 2 | 12 |
| fᵢ | 4 | 5 | 6 | 7 | 9 | 9 | 10 | 11 | 12 | 14 | 16 |
- {a₃, a₉, a₁₁} is a compatible set but not maximum.
- {a₁, a₄, a₈, a₁₁} is a maximum compatible set (size 4).
- {a₂, a₄, a₉, a₁₁} is another maximum compatible set.
Optimal Substructure
Define Sᵢⱼ = set of activities that start after aᵢ finishes and finish before aⱼ starts.
If a maximum-size set Aᵢⱼ of mutually compatible activities in Sᵢⱼ includes activity aₖ, then:
- Aᵢₖ = activities in Aᵢⱼ that finish before aₖ starts
- Aₖⱼ = activities in Aᵢⱼ that start after aₖ finishes
- Aᵢⱼ = Aᵢₖ ∪ {aₖ} ∪ Aₖⱼ
- |Aᵢⱼ| = |Aᵢₖ| + |Aₖⱼ| + 1
DP recurrence (if we didn't know the greedy shortcut):
c[i, j] = 0 if Sᵢⱼ = ∅
c[i, j] = max over aₖ ∈ Sᵢⱼ { c[i,k] + c[k,j] + 1 } if Sᵢⱼ ≠ ∅
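The recurrence can be checked directly against the example table with a short memoized sketch (the names and sentinel setup are illustrative): add a fictitious a₀ with f₀ = 0 and a fictitious a₁₂ with s₁₂ = ∞, so that c(0, 12) covers the whole instance.

```python
from functools import lru_cache
import math

# 1-indexed start/finish times from the example; indices 0 and 12 are sentinels
s = [0, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12, math.inf]
f = [0, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16, math.inf]

@lru_cache(maxsize=None)
def c(i, j):
    # c[i, j] = size of a maximum compatible subset of S_ij
    best = 0
    for k in range(i + 1, j):
        if f[i] <= s[k] and f[k] <= s[j]:  # a_k belongs to S_ij
            best = max(best, c(i, k) + c(k, j) + 1)
    return best

print(c(0, 12))  # 4, matching the maximum-size sets above
```

This solves Θ(n²) subproblems, each scanning O(n) candidates — the greedy shortcut below removes all of that work.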
Making the Greedy Choice
Key insight: Choose the activity that leaves the resource available for as many other activities as possible — pick the activity with the earliest finish time.
- Since activities are sorted by finish time, the greedy choice is always a₁.
- After choosing a₁, only one subproblem remains: find activities that start after a₁ finishes.
- No activities can finish before a₁ starts (since f₁ is the earliest finish time).
Theorem 16.1: Consider any nonempty subproblem Sₖ, and let aₘ be the activity in Sₖ with the earliest finish time. Then aₘ is included in some maximum-size subset of mutually compatible activities of Sₖ.
Proof sketch: Take any optimal solution Aₖ. If its earliest-finishing activity aⱼ ≠ aₘ, swap aⱼ for aₘ. Since fₘ ≤ fⱼ, all activities remain compatible, and the set size is unchanged. So the modified set is also optimal and contains aₘ.
Recursive Greedy Algorithm
Add a fictitious activity a₀ with f₀ = 0. Initial call: RECURSIVE-ACTIVITY-SELECTOR(s, f, 0, n).
RECURSIVE-ACTIVITY-SELECTOR(s, f, k, n)
1  m = k + 1
2  while m ≤ n and s[m] < f[k]    // find the first activity in Sₖ to finish
3      m = m + 1
4  if m ≤ n
5      return {aₘ} ∪ RECURSIVE-ACTIVITY-SELECTOR(s, f, m, n)
6  else return ∅
- Running time: Θ(n) — each activity is examined exactly once across all recursive calls.
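A direct transcription into Python (a sketch; the 1-indexed lists with f[0] = 0 mirror the fictitious a₀, and the return value is a list of activity indices):

```python
def recursive_activity_selector(s, f, k, n):
    """Sketch of RECURSIVE-ACTIVITY-SELECTOR; s, f are 1-indexed, f[0] = 0."""
    m = k + 1
    while m <= n and s[m] < f[k]:   # find the first activity in S_k to finish
        m += 1
    if m <= n:
        return [m] + recursive_activity_selector(s, f, m, n)
    return []

# data from the example table; index 0 holds the fictitious a0
s = [None, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [0,    4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(recursive_activity_selector(s, f, 0, 11))  # [1, 4, 8, 11]
```

The result is the maximum compatible set {a₁, a₄, a₈, a₁₁} from the example.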
Iterative Greedy Algorithm
GREEDY-ACTIVITY-SELECTOR(s, f)
1  n = s.length
2  A = {a₁}
3  k = 1
4  for m = 2 to n
5      if s[m] ≥ f[k]
6          A = A ∪ {aₘ}
7          k = m
8  return A
- Variable k tracks the most recent addition to A.
- f[k] is always the maximum finish time of any activity in A.
- To check compatibility, it suffices to verify s[m] ≥ f[k].
- Running time: Θ(n), assuming activities are pre-sorted by finish time.
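The iterative version admits an equally short sketch (again 1-indexed and returning indices, with data from the example table):

```python
def greedy_activity_selector(s, f):
    """Iterative sketch; assumes activities 1..n are sorted by finish time."""
    n = len(s) - 1
    A = [1]                # the greedy choice is always a1
    k = 1                  # index of the most recent addition to A
    for m in range(2, n + 1):
        if s[m] >= f[k]:   # a_m starts after a_k finishes: compatible
            A.append(m)
            k = m
    return A

s = [None, 1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
f = [None, 4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
print(greedy_activity_selector(s, f))  # [1, 4, 8, 11]
```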
Design pattern: Greedy algorithms work top-down — make a choice, then solve the remaining subproblem. This contrasts with DP's bottom-up approach of solving subproblems before making choices.
16.2 Elements of the Greedy Strategy
Developing a Greedy Algorithm (Detailed Steps)
- Determine the optimal substructure of the problem.
- Develop a recursive solution.
- Show that if we make the greedy choice, only one subproblem remains.
- Prove that it is always safe to make the greedy choice.
- Develop a recursive algorithm implementing the greedy strategy.
- Convert the recursive algorithm to an iterative algorithm.
Streamlined Design Process
- Cast the optimization problem as one where we make a choice and are left with one subproblem to solve.
- Prove there is always an optimal solution that makes the greedy choice (greedy choice is safe).
- Demonstrate optimal substructure: greedy choice + optimal subproblem solution = optimal overall solution.
Greedy-Choice Property
- We can assemble a globally optimal solution by making locally optimal (greedy) choices.
- The choice does not depend on solutions to subproblems (unlike DP).
- Greedy: make a choice first, then solve the remaining subproblem (top-down).
- DP: solve subproblems first, then make a choice (bottom-up).
Proof technique: Examine a globally optimal solution, then show how to modify it to include the greedy choice without worsening the solution (exchange argument).
Efficiency: Preprocessing input or using a priority queue often enables making greedy choices quickly.
Optimal Substructure
- An optimal solution to the problem contains optimal solutions to subproblems.
- Shared ingredient with dynamic programming.
- For greedy algorithms: after making the greedy choice, argue that an optimal subproblem solution combined with the greedy choice yields an optimal overall solution.
- This implicitly uses induction on subproblems.
Greedy vs. Dynamic Programming
0-1 Knapsack Problem (requires DP):
- n items, item i worth vᵢ dollars, weighs wᵢ pounds.
- Knapsack capacity W pounds. Must take whole items or leave them.
- Greedy by value-per-pound fails: e.g., items (10 lb, $60), (20 lb, $100), (30 lb, $120) with W = 50.
- Greedy takes item 1 first ($6/lb) → $60 + $100 = $160 or $60 + $120 = $180.
- Optimal: items 2 + 3 = $220.
- Problem has overlapping subproblems → use DP.
Fractional Knapsack Problem (greedy works):
- Same setup, but can take fractions of items.
- Greedy strategy: sort by value per pound (vᵢ/wᵢ), take greedily.
- Running time: O(n lg n) for sorting.
- Works because taking a fraction doesn't waste capacity.
Key difference: In 0-1 knapsack, taking an item may leave wasted capacity, lowering effective value per pound. In fractional knapsack, the thief can always fill the knapsack completely.
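The contrast can be made concrete on the three-item example (a sketch; `fractional_knapsack` is an illustrative name):

```python
def fractional_knapsack(items, W):
    """Greedy by value per pound; items is a list of (value, weight) pairs."""
    total = 0.0
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        take = min(weight, W)           # take as much of the best item as fits
        total += value * take / weight
        W -= take
        if W == 0:
            break
    return total

# (value $, weight lb) from the text; capacity W = 50
items = [(60, 10), (100, 20), (120, 30)]
print(fractional_knapsack(items, 50))  # 240.0, beating the 0-1 optimum of 220
```

The final iteration takes 20 of item 3's 30 pounds, so no capacity is wasted — exactly the property the 0-1 variant lacks.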
16.3 Huffman Codes
Overview
Huffman codes compress data by using variable-length binary codewords:
- Frequent characters get short codewords.
- Infrequent characters get long codewords.
- Typical savings: 20% to 90%.
Example:
| Character | a | b | c | d | e | f |
|---|---|---|---|---|---|---|
| Frequency (thousands) | 45 | 13 | 12 | 16 | 9 | 5 |
| Fixed-length code | 000 | 001 | 010 | 011 | 100 | 101 |
| Variable-length code | 0 | 101 | 100 | 111 | 1101 | 1100 |
- Fixed-length: 300,000 bits for 100,000 characters.
- Variable-length (Huffman): (45×1 + 13×3 + 12×3 + 16×3 + 9×4 + 5×4) × 1,000 = 224,000 bits (~25% savings).
Prefix Codes
A prefix code (or prefix-free code) is one where no codeword is a prefix of another codeword.
- Encoding: Concatenate codewords. E.g., "abc" → 0·101·100 = 0101100.
- Decoding: Unambiguous — identify the initial codeword, translate, repeat. E.g., 001011101 → 0|0|101|1101 → aabe.
- Prefix codes achieve optimal compression among all character codes.
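Unambiguous decoding follows from the prefix property: scanning bits left to right, the first time the buffer matches a codeword it must be the intended one. A sketch using the variable-length code above:

```python
def decode(bits, code):
    """Decode a bit string under a prefix code by matching the initial codeword."""
    inverse = {w: ch for ch, w in code.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:        # prefix property: the first match is correct
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

code = {"a": "0", "b": "101", "c": "100", "d": "111", "e": "1101", "f": "1100"}
print(decode("001011101", code))  # aabe, as in the example
```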
Tree Representation
- Represent a prefix code as a binary tree where leaves are characters.
- Path from root to leaf gives the codeword: 0 = left child, 1 = right child.
- An optimal code corresponds to a full binary tree (every internal node has exactly 2 children).
- For alphabet C with all positive frequencies: |C| leaves, |C| − 1 internal nodes.
Cost of tree T:
B(T) = Σ over c ∈ C of c.freq × d_T(c)
where d_T(c) is the depth of character c's leaf (= length of its codeword).
Constructing a Huffman Code
The algorithm builds the tree bottom-up using a min-priority queue Q keyed on frequency.
HUFFMAN(C)
1  n = |C|
2  Q = C
3  for i = 1 to n − 1
4      allocate a new node z
5      z.left = x = EXTRACT-MIN(Q)
6      z.right = y = EXTRACT-MIN(Q)
7      z.freq = x.freq + y.freq
8      INSERT(Q, z)
9  return EXTRACT-MIN(Q)    // return the root of the tree
How it works:
- Initialize Q with all characters.
- Repeat n − 1 times: extract the two lowest-frequency nodes x and y, create a new internal node z with z.freq = x.freq + y.freq, insert z back into Q.
- The last node in Q is the root of the Huffman tree.
Running time:
- Building the initial min-heap: O(n).
- Loop executes n − 1 times, each iteration does O(lg n) heap operations.
- Total: O(n lg n).
- Can be reduced to O(n lg lg n) using a van Emde Boas tree.
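Using Python's heapq as the min-priority queue, the construction can be sketched as follows (names are illustrative; the tie-break counter keeps tuple comparisons away from the tree payloads):

```python
import heapq

def huffman_depths(freqs):
    """Build a Huffman tree bottom-up; return each character's depth (codeword length)."""
    # heap entries: (freq, tiebreak, tree); a tree is a char or a (left, right) pair
    heap = [(fr, i, ch) for i, (ch, fr) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:                 # n - 1 merges
        fx, _, x = heapq.heappop(heap)   # the two lowest-frequency nodes
        fy, _, y = heapq.heappop(heap)
        heapq.heappush(heap, (fx + fy, counter, (x, y)))
        counter += 1
    depths = {}
    def walk(node, d):
        if isinstance(node, tuple):      # internal node: recurse into children
            walk(node[0], d + 1)
            walk(node[1], d + 1)
        else:                            # leaf: record the codeword length
            depths[node] = d
    walk(heap[0][2], 0)
    return depths

freqs = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
depths = huffman_depths(freqs)
cost = sum(freqs[c] * depths[c] for c in freqs)   # B(T)
print(cost)  # 224 (thousand bits, matching the example)
```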
Correctness of Huffman's Algorithm
Lemma 16.2 (Greedy-choice property): Let x and y be two characters in C with the lowest frequencies. Then there exists an optimal prefix code in which x and y have the same length codewords and differ only in the last bit (i.e., they are siblings at maximum depth).
Proof idea: Take any optimal tree T. Let a, b be sibling leaves at maximum depth. Swap a with x, then b with y. Each swap does not increase cost (since x, y have minimum frequency and a, b are at maximum depth). The resulting tree T'' is optimal with x, y as siblings at maximum depth.
Lemma 16.3 (Optimal substructure): Let x, y be two minimum-frequency characters. Let C' = C − {x, y} ∪ {z} where z.freq = x.freq + y.freq. If T' is an optimal tree for C', then the tree T obtained by replacing z's leaf with an internal node having x and y as children is optimal for C.
Key relationship:
B(T) = B(T') + x.freq + y.freq
Proof: By contradiction — if T were not optimal for C, we could construct a tree for C' with cost less than B(T'), contradicting T' being optimal.
Theorem 16.4: Procedure HUFFMAN produces an optimal prefix code. (Follows directly from Lemmas 16.2 and 16.3.)
16.4 Matroids and Greedy Methods
Definition of a Matroid
A matroid is an ordered pair M = (S, 𝓘) satisfying:
- S is a finite set.
- 𝓘 is a nonempty family of subsets of S, called independent subsets, such that if B ∈ 𝓘 and A ⊆ B, then A ∈ 𝓘. (𝓘 is hereditary.) Note: ∅ ∈ 𝓘.
- Exchange property: If A ∈ 𝓘, B ∈ 𝓘, and |A| < |B|, then ∃ x ∈ B − A such that A ∪ {x} ∈ 𝓘.
Graphic Matroid
Given an undirected graph G = (V, E), the graphic matroid M_G = (S_G, 𝓘_G):
- S_G = E (the edge set).
- A ∈ 𝓘_G if and only if A is acyclic (forms a forest).
Theorem 16.5: M_G is a matroid.
Proof sketch:
- E is finite. ✓
- Hereditary: a subset of a forest is a forest. ✓
- Exchange property: A forest F = (V_F, E_F) contains exactly |V_F| − |E_F| trees. If |B| > |A|, the forest (V, B) has fewer trees than (V, A), so some tree of (V, B) contains vertices from two different trees of (V, A). Since that tree is connected, it contains an edge (u, v) whose endpoints lie in different trees of (V, A); this edge is not in A, and adding it to A cannot create a cycle. ✓
Key Matroid Concepts
- Extension: Element x ∉ A is an extension of A ∈ 𝓘 if A ∪ {x} ∈ 𝓘.
- Maximal independent subset: An independent set with no extensions.
Theorem 16.6: All maximal independent subsets in a matroid have the same size.
Proof: If maximal A were smaller than maximal B, the exchange property would let us extend A — contradicting maximality.
- In a graphic matroid for a connected graph, every maximal independent subset is a spanning tree with exactly |V| − 1 edges.
Weighted Matroids
A matroid M = (S, 𝓘) is weighted if it has an associated weight function w that assigns a strictly positive weight w(x) to each element x ∈ S.
w(A) = Σ over x ∈ A of w(x)
Optimal subset: An independent set A ∈ 𝓘 with maximum w(A). Since all weights are positive, an optimal subset is always maximal.
Connection to MST: For minimum spanning tree, define w'(e) = w₀ − w(e) where w₀ > max edge weight. Maximizing w'(A) over spanning trees is equivalent to minimizing w(A).
Greedy Algorithm on a Weighted Matroid
GREEDY(M, w)
1  A = ∅
2  sort M.S into monotonically decreasing order by weight w
3  for each x ∈ M.S, taken in monotonically decreasing order by weight w(x)
4      if A ∪ {x} ∈ M.𝓘
5          A = A ∪ {x}
6  return A
- Consider elements in decreasing weight order.
- Add element to A if independence is maintained.
- Running time: O(n lg n + n·f(n)), where f(n) is the time to check independence.
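GREEDY is generic: it needs only the ground set, the weights, and an independence oracle. A sketch instantiating it for the graphic matroid (independence = acyclicity, checked with union-find), computing a maximum-weight forest of a small hypothetical graph:

```python
def greedy_matroid(elements, weight, independent):
    """GREEDY(M, w): scan in decreasing weight; keep x if A + {x} stays independent."""
    A = []
    for x in sorted(elements, key=weight, reverse=True):
        if independent(A + [x]):
            A.append(x)
    return A

def acyclic(edges):
    """Independence oracle for the graphic matroid: the edge set must be a forest."""
    parent = {}
    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    for u, v, _ in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False        # this edge would close a cycle
        parent[ru] = rv
    return True

# hypothetical weighted graph: (u, v, weight) edges
edges = [("a", "b", 4), ("b", "c", 3), ("a", "c", 2), ("c", "d", 5)]
A = greedy_matroid(edges, weight=lambda e: e[2], independent=acyclic)
print(sorted(e[2] for e in A))  # [3, 4, 5] — the edge (a, c) is rejected
```

Here f(n) is the cost of `acyclic`, which rebuilds the union-find structure each call; a production version would maintain it incrementally, as Kruskal's algorithm does.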
Correctness Proof
Lemma 16.7 (Greedy-choice property): Let x be the first element of S (heaviest) such that {x} is independent. Then there exists an optimal subset containing x.
Proof: Take any optimal B. If x ∉ B, use the exchange property to build A from {x} by adding elements of B until |A| = |B|. Then A = B − {y} ∪ {x} for some y, and w(A) = w(B) − w(y) + w(x) ≥ w(B) since w(x) ≥ w(y).
Lemma 16.8: If x is an extension of some independent subset A, then x is also an extension of ∅.
Corollary 16.9: If x is not an extension of ∅, then x is not an extension of any independent subset. (GREEDY never errs by skipping elements that aren't extensions of ∅.)
Lemma 16.10 (Optimal substructure): After choosing x, the remaining problem reduces to finding a maximum-weight independent subset of the contraction M' = (S', 𝓘'):
- S' = {y ∈ S : {x, y} ∈ 𝓘}
- 𝓘' = {B ⊆ S − {x} : B ∪ {x} ∈ 𝓘}
Theorem 16.11: GREEDY(M, w) returns an optimal subset. (Follows from Corollary 16.9, Lemma 16.7, and Lemma 16.10.)
16.5 A Task-Scheduling Problem as a Matroid
Problem Definition
Schedule unit-time tasks on a single processor to minimize total penalty for missed deadlines.
Inputs:
- A set S = {a₁, a₂, ..., aₙ} of n unit-time tasks.
- Integer deadlines d₁, d₂, ..., dₙ (each 1 ≤ dᵢ ≤ n); task aᵢ should finish by time dᵢ.
- Nonnegative penalties w₁, w₂, ..., wₙ; penalty wᵢ incurred if aᵢ misses its deadline.
Key Definitions
- A task is early if it finishes by its deadline; otherwise it is late.
- Early-first form: All early tasks precede all late tasks.
- Canonical form: Early tasks precede late tasks, and early tasks are in order of monotonically increasing deadlines.
- A set A of tasks is independent if there exists a schedule for A with no late tasks.
Independence Characterization (Lemma 16.12)
For any set of tasks A, the following are equivalent:
- A is independent.
- For t = 0, 1, ..., n: Nₜ(A) ≤ t, where Nₜ(A) = number of tasks in A with deadline ≤ t.
- If tasks in A are scheduled in order of monotonically increasing deadlines, no task is late.
Matroid Structure (Theorem 16.13)
The system (S, 𝓘), where 𝓘 is the set of all independent task sets, is a matroid.
Proof sketch:
- Hereditary: subset of an independent set is independent. ✓
- Exchange property: Suppose A and B are independent with |B| > |A|. Let k be the largest t such that Nₜ(B) ≤ Nₜ(A); then Nₖ₊₁(B) > Nₖ₊₁(A), so B contains more tasks with deadline exactly k + 1 than A does. Moving such a task from B − A into A preserves independence. ✓
Solving with the Greedy Algorithm
- Minimizing total penalty of late tasks = maximizing total penalty of early tasks.
- Apply GREEDY to the weighted matroid (S, 𝓘) with weights wᵢ.
- The greedy algorithm finds a maximum-weight independent set A (the early tasks).
- Schedule: list A in increasing deadline order, then list S − A in any order.
Example:
| Task aᵢ | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|
| dᵢ | 4 | 2 | 4 | 3 | 1 | 4 | 6 |
| wᵢ | 70 | 60 | 50 | 40 | 30 | 20 | 10 |
- Greedy examines tasks in decreasing penalty order: accepts a₁, a₂, a₃, and a₄; rejects a₅ and a₆ (adding either would violate Nₜ(A) ≤ t for t = 4); accepts a₇.
- Early set: {a₁, a₂, a₃, a₄, a₇}.
- Optimal schedule: ⟨a₂, a₄, a₁, a₃, a₇, a₅, a₆⟩.
- Total penalty: w₅ + w₆ = 30 + 20 = 50.
Running time: O(n²) — O(n) independence checks, each taking O(n).
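The whole section can be sketched end to end: greedy over penalties with the Lemma 16.12 independence test (the sorted deadlines d₁ ≤ d₂ ≤ ... must satisfy dᵢ ≥ i), run on the example table:

```python
def schedule_tasks(tasks):
    """tasks: list of (deadline, penalty) pairs; returns (early set, total late penalty)."""
    def independent(A):
        # Lemma 16.12: A is independent iff N_t(A) <= t for all t, i.e. the
        # sorted deadlines d_1 <= d_2 <= ... satisfy d_i >= i
        deadlines = sorted(d for d, _ in A)
        return all(d >= i + 1 for i, d in enumerate(deadlines))

    early = []
    for task in sorted(tasks, key=lambda t: t[1], reverse=True):  # decreasing penalty
        if independent(early + [task]):
            early.append(task)
    late_penalty = sum(w for d, w in tasks if (d, w) not in early)
    return early, late_penalty

# (d_i, w_i) for tasks a1..a7 from the example
tasks = [(4, 70), (2, 60), (4, 50), (3, 40), (1, 30), (4, 20), (6, 10)]
early, penalty = schedule_tasks(tasks)
print(penalty)  # 50 — tasks a5 and a6 are late
```

Each independence check sorts O(n) deadlines, giving the O(n²) bound above (up to the O(n lg n) sort inside each check).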