Minimum Spanning Trees
Introduction to AlgorithmsChapter 23: Minimum Spanning Trees
Given a connected, undirected graph G = (V, E) with weight function w : E → ℝ, we want to find an acyclic subset T ⊆ E that connects all vertices and minimizes the total weight:
w(T) = Σ w(u, v) for all (u, v) ∈ T
Since T is acyclic and connects all vertices, it forms a spanning tree. The problem of finding T is the minimum-spanning-tree (MST) problem.
- Both Kruskal's and Prim's algorithms solve this in O(E lg V) with binary heaps.
- With Fibonacci heaps, Prim's runs in O(E + V lg V), an improvement when |V| ≪ |E|.
- Both are greedy algorithms: at each step they make the locally optimal choice, which for MST provably yields a globally optimal solution.
23.1 Growing a Minimum Spanning Tree
The generic strategy grows the MST one edge at a time by maintaining a set A of edges with the following loop invariant:
Prior to each iteration, A is a subset of some minimum spanning tree.
An edge (u, v) that can be added to A while preserving this invariant is called a safe edge for A.
GENERIC-MST(G, w)
1 A = ∅
2 while A does not form a spanning tree
3 find an edge (u, v) that is safe for A
4 A = A ∪ {(u, v)}
5 return A
Loop invariant proof:
- Initialization: A = ∅ is trivially a subset of every MST.
- Maintenance: Only safe edges are added, so A remains a subset of some MST.
- Termination: A contains |V| − 1 edges forming an MST.
Key Definitions
- Cut: A partition (S, V − S) of the vertex set V.
- Crossing edge: An edge (u, v) with one endpoint in S and the other in V − S.
- Respecting a cut: A cut (S, V − S) respects a set A of edges if no edge in A crosses the cut.
- Light edge: An edge crossing a cut whose weight is minimum among all edges crossing that cut (ties allowed).
Theorem 23.1 (Recognizing Safe Edges)
Let G = (V, E) be a connected, undirected graph with weight function w. Let A ⊆ E be included in some MST, let (S, V − S) be any cut that respects A, and let (u, v) be a light edge crossing (S, V − S). Then (u, v) is safe for A.
Proof sketch (cut-and-paste):
- Let T be an MST containing A. If (u, v) ∈ T, done.
- Otherwise, (u, v) creates a cycle with the path p from u to v in T.
- Some edge (x, y) on p also crosses the cut. Since the cut respects A, (x, y) ∉ A.
- Form T' = T − {(x, y)} ∪ {(u, v)}.
- Since (u, v) is light: w(u, v) ≤ w(x, y), so w(T') ≤ w(T). Thus T' is also an MST.
- A ∪ {(u, v)} ⊆ T', so (u, v) is safe for A.
Corollary 23.2
Let A ⊆ E be included in some MST, and let C = (V_C, E_C) be a connected component (tree) in the forest G_A = (V, A). If (u, v) is a light edge connecting C to some other component in G_A, then (u, v) is safe for A.
Proof: The cut (V_C, V − V_C) respects A, and (u, v) is a light edge for this cut.
Properties of the Generic Method
- A is always acyclic — the graph G_A = (V, A) is a forest.
- Each safe edge connects distinct components of G_A.
- The while loop executes exactly |V| − 1 times (one edge per iteration).
- Initially there are |V| trees (one per vertex); each iteration merges two, until one tree remains.
23.2 Kruskal's Algorithm
Strategy: The set A is a forest. At each step, add the least-weight edge in the entire graph that connects two distinct components.
- By Corollary 23.2, such an edge is always safe.
- Uses a disjoint-set (union-find) data structure to track components.
MST-KRUSKAL(G, w)
1 A = ∅
2 for each vertex v ∈ G.V
3 MAKE-SET(v)
4 sort the edges of G.E into nondecreasing order by weight w
5 for each edge (u, v) ∈ G.E, taken in nondecreasing order by weight
6 if FIND-SET(u) ≠ FIND-SET(v)
7 A = A ∪ {(u, v)}
8 UNION(u, v)
9 return A
How It Works
- Lines 1–3: Initialize A = ∅ and create |V| singleton sets (one per vertex).
- Line 4: Sort all edges by weight.
- Lines 5–8: Process edges in order. For each edge (u, v):
- If u and v are in different components → add edge to A and merge components.
- If u and v are in the same component → discard (would create a cycle).
Running Time Analysis
| Operation | Cost |
|---|---|
| Initialize A | O(1) |
| V | |
| Sort edges | O(E lg E) |
| O(E) FIND-SET and UNION operations | O((V + E) α(V)) |
- With union-by-rank and path compression, disjoint-set operations take O(E α(V)) time, where α is the inverse Ackermann function (effectively constant).
- Sorting dominates: O(E lg E).
- Since |E| < |V|², we have lg|E| = O(lg V), so the total is O(E lg V).
23.3 Prim's Algorithm
Strategy: The set A always forms a single tree, starting from an arbitrary root vertex r and growing until it spans all vertices. At each step, add a light edge connecting the tree to an isolated vertex.
- By Corollary 23.2, this always adds a safe edge.
- Resembles Dijkstra's shortest-paths algorithm.
Data Structures
- Min-priority queue Q containing all vertices not yet in the tree.
- For each vertex v:
- v.key = minimum weight of any edge connecting v to a vertex in the tree (∞ if none).
- v.π = parent of v in the tree.
- The MST edge set is implicitly: A = {(v, v.π) : v ∈ V − {r} − Q}
- At termination: A = {(v, v.π) : v ∈ V − {r}}
MST-PRIM(G, w, r)
1 for each u ∈ G.V
2 u.key = ∞
3 u.π = NIL
4 r.key = 0
5 Q = G.V
6 while Q ≠ ∅
7 u = EXTRACT-MIN(Q)
8 for each v ∈ G.Adj[u]
9 if v ∈ Q and w(u, v) < v.key
10 v.π = u
11 v.key = w(u, v)
How It Works
- Lines 1–5: Set all keys to ∞, root key to 0, initialize Q with all vertices.
- Line 7: Extract vertex u with minimum key (first iteration: u = r).
- Lines 8–11: For each neighbor v of u still in Q, if w(u, v) < v.key, update v's key and parent.
Loop invariant (three parts):
- A = {(v, v.π) : v ∈ V − {r} − Q}
- Vertices in the MST so far are V − Q.
- For all v ∈ Q with v.π ≠ NIL: v.key is the weight of a light edge connecting v to some vertex already in the tree.
Running Time Analysis
With binary min-heap:
| Operation | Count | Cost Each | Total |
|---|---|---|---|
| BUILD-MIN-HEAP (lines 1–5) | 1 | O(V) | O(V) |
| EXTRACT-MIN | V | ||
| DECREASE-KEY (line 11) | O(E) | O(lg V) | O(E lg V) |
| Membership test (line 9) | O(E) | O(1) | O(E) |
Total with binary heap: O(V lg V + E lg V) = O(E lg V)
With Fibonacci heap:
| Operation | Count | Amortized Cost | Total |
|---|---|---|---|
| EXTRACT-MIN | V | ||
| DECREASE-KEY | O(E) | O(1) | O(E) |
Total with Fibonacci heap: O(E + V lg V)
This improves over the binary-heap version when |V| is much smaller than |E| (dense graphs).
Summary Comparison
| Property | Kruskal's | Prim's |
|---|---|---|
| Edge set A | Forest (multiple trees) | Single tree |
| Edge selection | Global minimum-weight edge connecting two components | Minimum-weight edge crossing tree boundary |
| Data structure | Disjoint-set (union-find) | Min-priority queue |
| Time (binary heap) | O(E lg V) | O(E lg V) |
| Time (Fibonacci heap) | — | O(E + V lg V) |
| Similar to | Connected-components algorithm | Dijkstra's algorithm |
| Best for | Sparse graphs | Dense graphs (with Fibonacci heap) |
Previous chapter
Elementary Graph AlgorithmsNext chapter
Single Source Shortest Paths