Pagefy

Pagefy

Back to Data Structures and Algorithms

Minimum Spanning Trees

Introduction to Algorithms

Chapter 23: Minimum Spanning Trees

Given a connected, undirected graph G = (V, E) with weight function w : E → ℝ, we want to find an acyclic subset T ⊆ E that connects all vertices and minimizes the total weight:

w(T) = Σ w(u, v) for all (u, v) ∈ T

Since T is acyclic and connects all vertices, it forms a spanning tree. The problem of finding T is the minimum-spanning-tree (MST) problem.

  • Both Kruskal's and Prim's algorithms solve this in O(E lg V) with binary heaps.
  • With Fibonacci heaps, Prim's runs in O(E + V lg V), an improvement when |V| ≪ |E|.
  • Both are greedy algorithms: at each step they make the locally optimal choice, which for MST provably yields a globally optimal solution.

23.1 Growing a Minimum Spanning Tree

The generic strategy grows the MST one edge at a time by maintaining a set A of edges with the following loop invariant:

Prior to each iteration, A is a subset of some minimum spanning tree.

An edge (u, v) that can be added to A while preserving this invariant is called a safe edge for A.

GENERIC-MST(G, w)
1  A = ∅
2  while A does not form a spanning tree
3      find an edge (u, v) that is safe for A
4      A = A ∪ {(u, v)}
5  return A

Loop invariant proof:

  • Initialization: A = ∅ is trivially a subset of every MST.
  • Maintenance: Only safe edges are added, so A remains a subset of some MST.
  • Termination: A contains |V| − 1 edges forming an MST.

Key Definitions

  • Cut: A partition (S, V − S) of the vertex set V.
  • Crossing edge: An edge (u, v) with one endpoint in S and the other in V − S.
  • Respecting a cut: A cut (S, V − S) respects a set A of edges if no edge in A crosses the cut.
  • Light edge: An edge crossing a cut whose weight is minimum among all edges crossing that cut (ties allowed).

Theorem 23.1 (Recognizing Safe Edges)

Let G = (V, E) be a connected, undirected graph with weight function w. Let A ⊆ E be included in some MST, let (S, V − S) be any cut that respects A, and let (u, v) be a light edge crossing (S, V − S). Then (u, v) is safe for A.

Proof sketch (cut-and-paste):

  1. Let T be an MST containing A. If (u, v) ∈ T, done.
  2. Otherwise, (u, v) creates a cycle with the path p from u to v in T.
  3. Some edge (x, y) on p also crosses the cut. Since the cut respects A, (x, y) ∉ A.
  4. Form T' = T − {(x, y)} ∪ {(u, v)}.
  5. Since (u, v) is light: w(u, v) ≤ w(x, y), so w(T') ≤ w(T). Thus T' is also an MST.
  6. A ∪ {(u, v)} ⊆ T', so (u, v) is safe for A.

Corollary 23.2

Let A ⊆ E be included in some MST, and let C = (V_C, E_C) be a connected component (tree) in the forest G_A = (V, A). If (u, v) is a light edge connecting C to some other component in G_A, then (u, v) is safe for A.

Proof: The cut (V_C, V − V_C) respects A, and (u, v) is a light edge for this cut.

Properties of the Generic Method

  • A is always acyclic — the graph G_A = (V, A) is a forest.
  • Each safe edge connects distinct components of G_A.
  • The while loop executes exactly |V| − 1 times (one edge per iteration).
  • Initially there are |V| trees (one per vertex); each iteration merges two, until one tree remains.

23.2 Kruskal's Algorithm

Strategy: The set A is a forest. At each step, add the least-weight edge in the entire graph that connects two distinct components.

  • By Corollary 23.2, such an edge is always safe.
  • Uses a disjoint-set (union-find) data structure to track components.
MST-KRUSKAL(G, w)
1  A = ∅
2  for each vertex v ∈ G.V
3      MAKE-SET(v)
4  sort the edges of G.E into nondecreasing order by weight w
5  for each edge (u, v) ∈ G.E, taken in nondecreasing order by weight
6      if FIND-SET(u) ≠ FIND-SET(v)
7          A = A ∪ {(u, v)}
8          UNION(u, v)
9  return A

How It Works

  1. Lines 1–3: Initialize A = ∅ and create |V| singleton sets (one per vertex).
  2. Line 4: Sort all edges by weight.
  3. Lines 5–8: Process edges in order. For each edge (u, v):
    • If u and v are in different components → add edge to A and merge components.
    • If u and v are in the same component → discard (would create a cycle).

Running Time Analysis

OperationCost
Initialize AO(1)
V
Sort edgesO(E lg E)
O(E) FIND-SET and UNION operationsO((V + E) α(V))
  • With union-by-rank and path compression, disjoint-set operations take O(E α(V)) time, where α is the inverse Ackermann function (effectively constant).
  • Sorting dominates: O(E lg E).
  • Since |E| < |V|², we have lg|E| = O(lg V), so the total is O(E lg V).

23.3 Prim's Algorithm

Strategy: The set A always forms a single tree, starting from an arbitrary root vertex r and growing until it spans all vertices. At each step, add a light edge connecting the tree to an isolated vertex.

  • By Corollary 23.2, this always adds a safe edge.
  • Resembles Dijkstra's shortest-paths algorithm.

Data Structures

  • Min-priority queue Q containing all vertices not yet in the tree.
  • For each vertex v:
    • v.key = minimum weight of any edge connecting v to a vertex in the tree (∞ if none).
    • v.π = parent of v in the tree.
  • The MST edge set is implicitly: A = {(v, v.π) : v ∈ V − {r} − Q}
  • At termination: A = {(v, v.π) : v ∈ V − {r}}
MST-PRIM(G, w, r)
1  for each u ∈ G.V
2      u.key = ∞
3      u.π = NIL
4  r.key = 0
5  Q = G.V
6  while Q ≠ ∅
7      u = EXTRACT-MIN(Q)
8      for each v ∈ G.Adj[u]
9          if v ∈ Q and w(u, v) < v.key
10             v.π = u
11             v.key = w(u, v)

How It Works

  1. Lines 1–5: Set all keys to ∞, root key to 0, initialize Q with all vertices.
  2. Line 7: Extract vertex u with minimum key (first iteration: u = r).
  3. Lines 8–11: For each neighbor v of u still in Q, if w(u, v) < v.key, update v's key and parent.

Loop invariant (three parts):

  1. A = {(v, v.π) : v ∈ V − {r} − Q}
  2. Vertices in the MST so far are V − Q.
  3. For all v ∈ Q with v.π ≠ NIL: v.key is the weight of a light edge connecting v to some vertex already in the tree.

Running Time Analysis

With binary min-heap:

OperationCountCost EachTotal
BUILD-MIN-HEAP (lines 1–5)1O(V)O(V)
EXTRACT-MINV
DECREASE-KEY (line 11)O(E)O(lg V)O(E lg V)
Membership test (line 9)O(E)O(1)O(E)

Total with binary heap: O(V lg V + E lg V) = O(E lg V)

With Fibonacci heap:

OperationCountAmortized CostTotal
EXTRACT-MINV
DECREASE-KEYO(E)O(1)O(E)

Total with Fibonacci heap: O(E + V lg V)

This improves over the binary-heap version when |V| is much smaller than |E| (dense graphs).


Summary Comparison

PropertyKruskal'sPrim's
Edge set AForest (multiple trees)Single tree
Edge selectionGlobal minimum-weight edge connecting two componentsMinimum-weight edge crossing tree boundary
Data structureDisjoint-set (union-find)Min-priority queue
Time (binary heap)O(E lg V)O(E lg V)
Time (Fibonacci heap)O(E + V lg V)
Similar toConnected-components algorithmDijkstra's algorithm
Best forSparse graphsDense graphs (with Fibonacci heap)