Dynamic Programming

Introduction to Algorithms

Chapter 15: Dynamic Programming

Dynamic programming solves problems by combining solutions to subproblems. Unlike divide-and-conquer (which partitions into disjoint subproblems), DP applies when subproblems overlap — i.e., they share sub-subproblems. Each sub-subproblem is solved only once and its answer is saved in a table, avoiding redundant recomputation.

DP is typically applied to optimization problems: problems with many possible solutions where we seek one with the optimal (min or max) value.

Four Steps of Dynamic Programming

  1. Characterize the structure of an optimal solution.
  2. Recursively define the value of an optimal solution.
  3. Compute the value of an optimal solution (typically bottom-up).
  4. Construct an optimal solution from the computed information.

Steps 1–3 form the core. Step 4 is only needed if we want the actual solution (not just its value), and often requires maintaining additional information during step 3.


15.1 Rod Cutting

Problem Statement

Given a rod of length n inches and a price table p_i for rod pieces of length i = 1, 2, …, n, determine the maximum revenue r_n obtainable by cutting the rod and selling the pieces.

length i    1   2   3   4   5    6    7    8    9    10
price p_i   1   5   8   9   10   17   17   20   24   30

  • A rod of length n can be cut in 2^(n−1) different ways.
  • If an optimal solution cuts the rod into k pieces of lengths i_1, i_2, …, i_k, then: r_n = p_{i_1} + p_{i_2} + … + p_{i_k}

Optimal Revenue Values (Example)

r₁ = 1, r₂ = 5, r₃ = 8, r₄ = 10, r₅ = 13, r₆ = 17, r₇ = 18, r₈ = 22, r₉ = 25, r₁₀ = 30

Recursive Formulation

The general recurrence (viewing the first piece of length i as fixed, with only the remainder further divided):

r_n = max₁≤i≤n (p_i + r_{n−i})

This exhibits optimal substructure: optimal solutions incorporate optimal solutions to subproblems.

Naive Recursive Implementation

CUT-ROD(p, n)
  if n == 0
    return 0
  q = -∞
  for i = 1 to n
    q = max(q, p[i] + CUT-ROD(p, n - i))
  return q
  • Running time: T(n) = 2^n — exponential, because it solves the same subproblems repeatedly.
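
The pseudocode translates almost line-for-line into Python; a minimal sketch, assuming `p` is padded with a leading 0 so that `p[i]` is the price of a length-i piece:

```python
def cut_rod(p, n):
    """Naive recursive rod cutting; p[i] is the price of a length-i piece
    (p[0] is an unused placeholder). Exponential time: the same
    subproblems are re-solved over and over."""
    if n == 0:
        return 0
    q = float("-inf")
    for i in range(1, n + 1):
        q = max(q, p[i] + cut_rod(p, n - i))
    return q

# Price table from the example above (index 0 is a placeholder).
p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
print(cut_rod(p, 4))  # 10
```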

Top-Down with Memoization

Save the result of each subproblem in an array. Before computing, check if the answer is already known.

MEMOIZED-CUT-ROD(p, n)
  let r[0..n] be a new array
  for i = 0 to n
    r[i] = -∞
  return MEMOIZED-CUT-ROD-AUX(p, n, r)

MEMOIZED-CUT-ROD-AUX(p, n, r)
  if r[n] ≥ 0
    return r[n]
  if n == 0
    q = 0
  else q = -∞
    for i = 1 to n
      q = max(q, p[i] + MEMOIZED-CUT-ROD-AUX(p, n - i, r))
  r[n] = q
  return q
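
A Python sketch of the memoized version, using a closure over the `r` array in place of the AUX helper:

```python
def memoized_cut_rod(p, n):
    """Top-down rod cutting with memoization; Θ(n²) time."""
    r = [float("-inf")] * (n + 1)  # r[i] caches the best revenue for length i

    def aux(n):
        if r[n] >= 0:              # already computed?
            return r[n]
        if n == 0:
            q = 0
        else:
            q = max(p[i] + aux(n - i) for i in range(1, n + 1))
        r[n] = q
        return q

    return aux(n)

p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
print(memoized_cut_rod(p, 10))  # 30
```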

Bottom-Up Method

Solve subproblems in order of increasing size. When solving size j, all smaller subproblems are already solved.

BOTTOM-UP-CUT-ROD(p, n)
  let r[0..n] be a new array
  r[0] = 0
  for j = 1 to n
    q = -∞
    for i = 1 to j
      q = max(q, p[i] + r[j - i])
    r[j] = q
  return r[n]
  • Running time: Θ(n²) for both top-down and bottom-up approaches.
  • Bottom-up typically has better constant factors (no recursion overhead).
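
The bottom-up version is shorter still; a sketch under the same 1-indexed price convention:

```python
def bottom_up_cut_rod(p, n):
    """Bottom-up rod cutting; fills r[0..n] in increasing subproblem size,
    Θ(n²) time."""
    r = [0] * (n + 1)
    for j in range(1, n + 1):
        r[j] = max(p[i] + r[j - i] for i in range(1, j + 1))
    return r[n]

p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
print(bottom_up_cut_rod(p, 10))  # 30
```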

Subproblem Graphs

  • A subproblem graph is a directed graph with one vertex per distinct subproblem and an edge from x to y if solving x directly requires the solution to y.
  • It is a "collapsed" version of the recursion tree.
  • Bottom-up DP processes vertices in reverse topological order.
  • The running time of DP is linear in the number of vertices and edges of the subproblem graph.

Reconstructing a Solution

Track not just the optimal value, but also the choice that led to it.

EXTENDED-BOTTOM-UP-CUT-ROD(p, n)
  let r[0..n] and s[0..n] be new arrays
  r[0] = 0
  for j = 1 to n
    q = -∞
    for i = 1 to j
      if q < p[i] + r[j - i]
        q = p[i] + r[j - i]
        s[j] = i
    r[j] = q
  return r and s

PRINT-CUT-ROD-SOLUTION(p, n)
  (r, s) = EXTENDED-BOTTOM-UP-CUT-ROD(p, n)
  while n > 0
    print s[n]
    n = n - s[n]
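
A sketch of the extended version in Python; `cut_rod_pieces` is an illustrative helper (not from the text) that returns the piece lengths instead of printing them:

```python
def extended_bottom_up_cut_rod(p, n):
    """Returns (r, s): r[j] is the optimal revenue for length j,
    s[j] the size of the first piece in an optimal cut of length j."""
    r = [0] * (n + 1)
    s = [0] * (n + 1)
    for j in range(1, n + 1):
        q = float("-inf")
        for i in range(1, j + 1):
            if q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                s[j] = i
        r[j] = q
    return r, s

def cut_rod_pieces(p, n):
    """Piece lengths in an optimal cut (mirrors PRINT-CUT-ROD-SOLUTION)."""
    _, s = extended_bottom_up_cut_rod(p, n)
    pieces = []
    while n > 0:
        pieces.append(s[n])
        n -= s[n]
    return pieces

p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
print(cut_rod_pieces(p, 7))  # [1, 6]
```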

15.2 Matrix-Chain Multiplication

Problem Statement

Given a chain of n matrices ⟨A₁, A₂, …, Aₙ⟩ to multiply, where matrix A_i has dimensions p_{i−1} × p_i, find the parenthesization that minimizes the total number of scalar multiplications.

  • Multiplying a p × q matrix by a q × r matrix costs p·q·r scalar multiplications.
  • The order of parenthesization can have a dramatic impact on cost.

Example: For ⟨A₁, A₂, A₃⟩ with dimensions 10×100, 100×5, 5×50:

  • ((A₁A₂)A₃) costs 5,000 + 2,500 = 7,500
  • (A₁(A₂A₃)) costs 25,000 + 50,000 = 75,000 — 10× slower!
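
The two costs can be checked mechanically; `mult_cost` below is an illustrative helper (not from the text) that charges p·q·r scalar multiplications per product and tracks the result's shape:

```python
def mult_cost(a, b):
    """Cost of multiplying an a[0]×a[1] matrix by a b[0]×b[1] matrix,
    plus the shape of the product."""
    assert a[1] == b[0], "inner dimensions must agree"
    return a[0] * a[1] * b[1], (a[0], b[1])

# ((A1 A2) A3)
c1, ab = mult_cost((10, 100), (100, 5))  # 5000 scalar mults, result 10×5
c2, _ = mult_cost(ab, (5, 50))           # 2500
# (A1 (A2 A3))
c3, bc = mult_cost((100, 5), (5, 50))    # 25000 scalar mults, result 100×50
c4, _ = mult_cost((10, 100), bc)         # 50000
print(c1 + c2, c3 + c4)  # 7500 75000
```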

Number of Parenthesizations

The number of ways to parenthesize n matrices is given by the Catalan numbers, which grow as Ω(4ⁿ/n^(3/2)) — exhaustive search is infeasible.

Step 1: Optimal Substructure

  • To parenthesize A_i A_{i+1} … A_j, we split between A_k and A_{k+1} for some i ≤ k < j.
  • Within an optimal solution, the parenthesizations of the prefix A_i…A_k and the suffix A_{k+1}…A_j must themselves be optimal (by a cut-and-paste argument).

Step 2: Recursive Solution

Let m[i, j] = minimum scalar multiplications to compute A_{i..j}.

m[i, j] = 0 if i = j

m[i, j] = min_{i≤k<j} { m[i, k] + m[k+1, j] + p_{i−1}·p_k·p_j } if i < j

Also define s[i, j] = the value of k that achieves the optimal split.

Step 3: Bottom-Up Computation

MATRIX-CHAIN-ORDER(p)
  n = p.length - 1
  let m[1..n, 1..n] and s[1..n-1, 2..n] be new tables
  for i = 1 to n
    m[i, i] = 0
  for l = 2 to n                          // l = chain length
    for i = 1 to n - l + 1
      j = i + l - 1
      m[i, j] = ∞
      for k = i to j - 1
        q = m[i, k] + m[k+1, j] + p_{i-1} · p_k · p_j
        if q < m[i, j]
          m[i, j] = q
          s[i, j] = k
  return m and s
  • Running time: O(n³)
  • Space: Θ(n²) for the m and s tables.
  • Fills the table by increasing chain length l.

Step 4: Constructing the Solution

PRINT-OPTIMAL-PARENS(s, i, j)
  if i == j
    print "A_i"
  else print "("
    PRINT-OPTIMAL-PARENS(s, i, s[i, j])
    PRINT-OPTIMAL-PARENS(s, s[i, j] + 1, j)
    print ")"

Example (n = 6): Dimensions ⟨30, 35, 15, 5, 10, 20, 25⟩ → optimal cost m[1,6] = 15,125, parenthesization: ((A₁(A₂A₃))((A₄A₅)A₆))
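
A Python sketch of MATRIX-CHAIN-ORDER together with a string-building variant of PRINT-OPTIMAL-PARENS, using tables padded so that indices match the pseudocode:

```python
def matrix_chain_order(p):
    """Bottom-up matrix-chain DP; A_i has dimensions p[i-1] × p[i].
    Returns (m, s): m[i][j] is the minimum cost, s[i][j] the optimal split."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):                # l = chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = float("inf")
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

def optimal_parens(s, i, j):
    """Render the optimal parenthesization as a string."""
    if i == j:
        return f"A{i}"
    k = s[i][j]
    return "(" + optimal_parens(s, i, k) + optimal_parens(s, k + 1, j) + ")"

p = [30, 35, 15, 5, 10, 20, 25]              # the n = 6 example above
m, s = matrix_chain_order(p)
print(m[1][6], optimal_parens(s, 1, 6))      # 15125 ((A1(A2A3))((A4A5)A6))
```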


15.3 Elements of Dynamic Programming

Two key ingredients an optimization problem must have for DP to apply:

1. Optimal Substructure

A problem exhibits optimal substructure if an optimal solution contains within it optimal solutions to subproblems.

Pattern for discovering optimal substructure:

  1. Show that a solution consists of making a choice (e.g., where to cut, where to split), leaving subproblems to solve.
  2. Suppose you are given the choice leading to an optimal solution.
  3. Determine which subproblems ensue and characterize the subproblem space.
  4. Show subproblem solutions must be optimal via cut-and-paste: if a subproblem solution weren't optimal, you could substitute a better one, contradicting the optimality of the overall solution.

Optimal substructure varies in two ways:

  • How many subproblems an optimal solution uses.
  • How many choices we have in determining which subproblem(s) to use.

Running time informally depends on: (number of subproblems) × (number of choices per subproblem).

Problem                       Subproblems   Choices   Running Time
Rod cutting                   Θ(n)          ≤ n       O(n²)
Matrix-chain multiplication   Θ(n²)         ≤ n−1     O(n³)

Subtlety: When Optimal Substructure Fails

  • Unweighted shortest path — has optimal substructure (subpaths of shortest paths are shortest paths).
  • Unweighted longest simple path — does NOT have optimal substructure. Subproblems are not independent (they share vertices/resources). This problem is NP-complete.

Key insight: Subproblems must be independent — the solution to one subproblem must not affect the solution to another subproblem of the same problem.

2. Overlapping Subproblems

The subproblem space must be "small" — a recursive algorithm solves the same subproblems over and over rather than always generating new ones. Typically the total number of distinct subproblems is polynomial in the input size.

  • DP stores solutions in a table for O(1) lookup.
  • Divide-and-conquer generates brand-new subproblems at each step → no overlap → DP doesn't help.

Memoization (Top-Down DP)

Memoization maintains a table of subproblem solutions with a top-down recursive control structure:

  1. Initialize all table entries to a sentinel value (e.g., ∞).
  2. On each recursive call, check if the answer is already in the table.
  3. If yes, return it immediately. If no, compute it, store it, then return.
MEMOIZED-MATRIX-CHAIN(p)
  n = p.length - 1
  let m[1..n, 1..n] be a new table
  for i = 1 to n
    for j = i to n
      m[i, j] = ∞
  return LOOKUP-CHAIN(m, p, 1, n)

LOOKUP-CHAIN(m, p, i, j)
  if m[i, j] < ∞
    return m[i, j]
  if i == j
    m[i, j] = 0
  else for k = i to j - 1
    q = LOOKUP-CHAIN(m, p, i, k)
      + LOOKUP-CHAIN(m, p, k+1, j) + p_{i-1} · p_k · p_j
    if q < m[i, j]
      m[i, j] = q
  return m[i, j]

Running time: O(n³) — same as bottom-up.
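
The same top-down computation can be sketched in Python, letting `functools.lru_cache` play the role of the explicit `m` table:

```python
from functools import lru_cache

def memoized_matrix_chain(p):
    """Top-down matrix-chain DP; the lru_cache decorator memoizes
    lookup(i, j) just as LOOKUP-CHAIN caches m[i, j]. O(n³) time."""
    n = len(p) - 1

    @lru_cache(maxsize=None)
    def lookup(i, j):
        if i == j:
            return 0
        return min(lookup(i, k) + lookup(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return lookup(1, n)

print(memoized_matrix_chain([30, 35, 15, 5, 10, 20, 25]))  # 15125
```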

Bottom-Up vs. Top-Down (Memoized)

                     Bottom-Up                          Top-Down (Memoized)
Approach             Solve smallest subproblems first   Recurse naturally, cache results
Constant factors     Better (no recursion overhead)     Slightly worse
Subproblems solved   All of them                        Only those actually needed
When to prefer       All subproblems must be solved     Some subproblems may be skipped

15.4 Longest Common Subsequence (LCS)

Definitions

  • A subsequence of X = ⟨x₁, x₂, …, x_m⟩ is obtained by deleting zero or more elements (preserving order).
  • A common subsequence of X and Y is a subsequence of both.
  • The longest common subsequence (LCS) is a common subsequence of maximum length.

Example: X = ⟨A, B, C, B, D, A, B⟩, Y = ⟨B, D, C, A, B, A⟩ → LCS = ⟨B, C, B, A⟩ (length 4).

  • Brute force: enumerate all 2^m subsequences of X → exponential.

Step 1: Optimal Substructure (Theorem 15.1)

Let X = ⟨x₁, …, x_m⟩, Y = ⟨y₁, …, y_n⟩, and Z = ⟨z₁, …, z_k⟩ be any LCS of X and Y. Define prefix X_i = ⟨x₁, …, x_i⟩.

  1. If x_m = y_n, then z_k = x_m = y_n and Z_{k−1} is an LCS of X_{m−1} and Y_{n−1}.
  2. If x_m ≠ y_n and z_k ≠ x_m, then Z is an LCS of X_{m−1} and Y.
  3. If x_m ≠ y_n and z_k ≠ y_n, then Z is an LCS of X and Y_{n−1}.

Step 2: Recursive Solution

Let c[i, j] = length of an LCS of X_i and Y_j.

c[i, j] = 0 if i = 0 or j = 0

c[i, j] = c[i−1, j−1] + 1 if i, j > 0 and x_i = y_j

c[i, j] = max(c[i, j−1], c[i−1, j]) if i, j > 0 and x_i ≠ y_j

Step 3: Computing the Length

LCS-LENGTH(X, Y)
  m = X.length
  n = Y.length
  let b[1..m, 1..n] and c[0..m, 0..n] be new tables
  for i = 1 to m
    c[i, 0] = 0
  for j = 0 to n
    c[0, j] = 0
  for i = 1 to m
    for j = 1 to n
      if x_i == y_j
        c[i, j] = c[i-1, j-1] + 1
        b[i, j] = "↖"
      elseif c[i-1, j] ≥ c[i, j-1]
        c[i, j] = c[i-1, j]
        b[i, j] = "↑"
      else
        c[i, j] = c[i, j-1]
        b[i, j] = "←"
  return c and b
  • Running time: Θ(mn)
  • Space: Θ(mn)

Step 4: Constructing an LCS

Follow the arrows in table b from b[m, n] back to a boundary:

PRINT-LCS(b, X, i, j)
  if i == 0 or j == 0
    return
  if b[i, j] == "↖"
    PRINT-LCS(b, X, i-1, j-1)
    print x_i
  elseif b[i, j] == "↑"
    PRINT-LCS(b, X, i-1, j)
  else
    PRINT-LCS(b, X, i, j-1)
  • Running time: O(m + n) — decrements at least one index per call.
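
LCS-LENGTH and PRINT-LCS can be sketched together in Python (0-indexed strings, with single-character codes standing in for the arrows):

```python
def lcs_length(X, Y):
    """Θ(mn) time and space; mirrors LCS-LENGTH, returning the c and b
    tables. Arrow codes: 'd' = ↖ (diagonal), 'u' = ↑, 'l' = ←."""
    m, n = len(X), len(Y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "d"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "u"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "l"
    return c, b

def lcs_string(b, X, i, j):
    """Follow the arrows back from b[i][j]; mirrors PRINT-LCS but
    returns the subsequence instead of printing it."""
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "d":
        return lcs_string(b, X, i - 1, j - 1) + X[i - 1]
    if b[i][j] == "u":
        return lcs_string(b, X, i - 1, j)
    return lcs_string(b, X, i, j - 1)

X, Y = "ABCBDAB", "BDCABA"
c, b = lcs_length(X, Y)
print(c[len(X)][len(Y)], lcs_string(b, X, len(X), len(Y)))  # 4 BCBA
```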

Space Optimization

  • The b table can be eliminated: given c[i, j], we can determine in O(1) which of the three entries was used.
  • For computing only the LCS length, only two rows of c are needed at a time → O(min(m, n)) space.
  • With only two rows, however, the table no longer retains enough information to walk back and recover the actual LCS in O(m + n) time; reconstruction requires the full table (or a divide-and-conquer refinement).
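
The two-row trick for computing only the length might look like this in Python:

```python
def lcs_length_two_rows(X, Y):
    """LCS length in O(min(m, n)) space by keeping only two rows of c."""
    if len(Y) > len(X):
        X, Y = Y, X                   # make Y the shorter string
    prev = [0] * (len(Y) + 1)         # row i-1 of the c table
    for x in X:
        curr = [0] * (len(Y) + 1)     # row i of the c table
        for j, y in enumerate(Y, 1):
            if x == y:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr
    return prev[-1]

print(lcs_length_two_rows("ABCBDAB", "BDCABA"))  # 4
```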

15.5 Optimal Binary Search Trees

Problem Statement

Given n distinct keys K = ⟨k₁ < k₂ < … < kₙ⟩ with search probabilities p_i, and n + 1 dummy keys d₀, d₁, …, dₙ (representing unsuccessful searches) with probabilities q_i, construct a BST that minimizes the expected search cost.

The probabilities satisfy: Σp_i + Σq_i = 1

Expected search cost:

E[search cost in T] = 1 + Σᵢ depth_T(k_i)·p_i + Σᵢ depth_T(d_i)·q_i

Key insight: An optimal BST is not necessarily the one with minimum height, nor the one with the highest-probability key at the root.

Step 1: Optimal Substructure

  • Any subtree of a BST contains keys in a contiguous range k_i, …, k_j and dummy keys d_{i−1}, …, d_j.
  • If T is an optimal BST and T' is a subtree containing k_i, …, k_j, then T' must be an optimal BST for that subproblem (cut-and-paste argument).
  • For keys k_i, …, k_j, some key k_r (i ≤ r ≤ j) is the root. The left subtree contains k_i, …, k_{r−1} and the right subtree contains k_{r+1}, …, k_j.

Step 2: Recursive Solution

Let e[i, j] = expected search cost of an optimal BST containing keys k_i, …, k_j.

Define the weight (sum of probabilities in a subtree):

w(i, j) = Σ_{l=i}^{j} p_l + Σ_{l=i−1}^{j} q_l

When a subtree becomes a child of a node, every node's depth increases by 1, adding w(i, j) to the cost.

Recurrence:

e[i, j] = q_{i−1} if j = i − 1 (only dummy key d_{i−1})

e[i, j] = min_{i≤r≤j} { e[i, r−1] + e[r+1, j] + w(i, j) } if i ≤ j

Also define root[i, j] = the index r achieving the minimum.

The weight can be computed incrementally:

w[i, j] = w[i, j−1] + p_j + q_j

Step 3: Bottom-Up Computation

OPTIMAL-BST(p, q, n)
  let e[1..n+1, 0..n], w[1..n+1, 0..n],
      and root[1..n, 1..n] be new tables
  for i = 1 to n + 1
    e[i, i-1] = q_{i-1}
    w[i, i-1] = q_{i-1}
  for l = 1 to n
    for i = 1 to n - l + 1
      j = i + l - 1
      e[i, j] = ∞
      w[i, j] = w[i, j-1] + p_j + q_j
      for r = i to j
        t = e[i, r-1] + e[r+1, j] + w[i, j]
        if t < e[i, j]
          e[i, j] = t
          root[i, j] = r
  return e and root
  • Running time: Θ(n³) — three nested loops, each up to n.
  • Space: Θ(n²) for the e, w, and root tables.
  • Structure is very similar to MATRIX-CHAIN-ORDER.
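
A Python sketch of OPTIMAL-BST, padding the probability lists so indices match the pseudocode:

```python
def optimal_bst(p, q, n):
    """Θ(n³) optimal BST; p[1..n] are key probabilities (p[0] a placeholder),
    q[0..n] dummy-key probabilities. Returns (e, root)."""
    e = [[0.0] * (n + 1) for _ in range(n + 2)]   # rows 1..n+1, cols 0..n
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]                    # empty subtree: dummy only
        w[i][i - 1] = q[i - 1]
    for l in range(1, n + 1):                     # l = number of keys
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = float("inf")
            w[i][j] = w[i][j - 1] + p[j] + q[j]   # incremental weight
            for r in range(i, j + 1):             # try each root
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e, root

# The n = 5 example above.
p = [0, 0.15, 0.10, 0.05, 0.10, 0.20]
q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]
e, root = optimal_bst(p, q, 5)
print(round(e[1][5], 2), root[1][5])  # 2.75 2
```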

Example (n = 5)

i     0      1      2      3      4      5
p_i          0.15   0.10   0.05   0.10   0.20
q_i   0.05   0.10   0.05   0.05   0.05   0.10

Optimal expected cost: e[1, 5] = 2.75, with k₂ as root.

Note: Knuth showed that root[i, j−1] ≤ root[i, j] ≤ root[i+1, j], which can be exploited to reduce the running time to Θ(n²).


Summary of Key DP Problems

Problem              Subproblems   Recurrence                                        Time    Space
Rod Cutting          Θ(n)          r_n = max(p_i + r_{n−i})                          Θ(n²)   Θ(n)
Matrix-Chain Mult.   Θ(n²)         m[i,j] = min(m[i,k] + m[k+1,j] + p_{i−1}p_kp_j)   O(n³)   Θ(n²)
LCS                  Θ(mn)         c[i,j] based on character match                   Θ(mn)   Θ(mn)
Optimal BST          Θ(n²)         e[i,j] = min(e[i,r−1] + e[r+1,j] + w(i,j))        Θ(n³)   Θ(n²)

Two Hallmarks of DP

  1. Optimal substructure — optimal solutions contain optimal solutions to subproblems.
  2. Overlapping subproblems — a recursive solution revisits the same subproblems repeatedly.

Two Implementation Approaches

  1. Top-down with memoization — natural recursion + caching.
  2. Bottom-up tabulation — solve subproblems in size order.