
Divide and Conquer

Introduction to Algorithms

Chapter 4: Divide-and-Conquer

Overview

The divide-and-conquer paradigm solves a problem recursively by applying three steps at each level of recursion:

  1. Divide the problem into a number of subproblems that are smaller instances of the same problem.
  2. Conquer the subproblems by solving them recursively. If subproblem sizes are small enough, solve them directly (base case).
  3. Combine the solutions to the subproblems into the solution for the original problem.
  • When subproblems are large enough to solve recursively → recursive case
  • When subproblems are small enough to solve directly → base case

Recurrences

Recurrences describe the running time of divide-and-conquer algorithms — equations or inequalities that describe a function in terms of its value on smaller inputs.

  • Example (merge sort): T(n) = 2T(n/2) + Θ(n), solution: T(n) = Θ(n lg n)
  • Unequal splits are possible: T(n) = T(2n/3) + T(n/3) + Θ(n)
  • Linear recursion: T(n) = T(n − 1) + Θ(1)

Three methods for solving recurrences:

  • Substitution method — guess a bound, prove by induction
  • Recursion-tree method — convert to a tree, sum costs at each level
  • Master method — cookbook method for recurrences of the form T(n) = aT(n/b) + f(n)

Technicalities

  • Floors and ceilings are typically omitted for convenience (they usually don't affect asymptotic bounds)
  • Boundary conditions (e.g., T(1) = Θ(1)) are generally omitted; T(n) is assumed constant for small n
  • Recurrence inequalities: T(n) ≤ 2T(n/2) + Θ(n) → use O-notation; T(n) ≥ ... → use Ω-notation

4.1 The Maximum-Subarray Problem

Problem Statement

Given an array of numbers (containing some negatives), find the contiguous subarray whose values have the greatest sum.

  • Motivated by the stock trading problem: maximize profit by buying low and selling high
  • Transform daily prices into daily changes → finding the maximum subarray of changes gives the optimal buy/sell window
  • Brute force: check all Θ(n²) pairs → Θ(n²) time
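The brute-force approach can be sketched in Python (a 0-indexed sketch; the function name is mine). Keeping a running sum means extending a subarray by one element costs O(1), so all Θ(n²) pairs take Θ(n²) total:

```python
def max_subarray_brute(A):
    """Try every pair (i, j) with i <= j, keeping a running sum: Θ(n²) time."""
    best_low, best_high, best_sum = 0, 0, A[0]
    for i in range(len(A)):
        total = 0
        for j in range(i, len(A)):
            total += A[j]          # sum of A[i..j], extended by one element
            if total > best_sum:
                best_low, best_high, best_sum = i, j, total
    return best_low, best_high, best_sum
```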

Divide-and-Conquer Solution

For subarray A[low..high], find midpoint mid = ⌊(low + high)/2⌋. A maximum subarray must lie in exactly one of:

  • Entirely in A[low..mid] (left half)
  • Entirely in A[mid+1..high] (right half)
  • Crossing the midpoint (spans both halves)

Solve left and right halves recursively. Find the max crossing subarray in linear time, then return the best of the three.

Finding the Max Crossing Subarray

Any subarray crossing the midpoint is composed of A[i..mid] and A[mid+1..j]. Find each half's maximum independently.

FIND-MAX-CROSSING-SUBARRAY(A, low, mid, high)
    left-sum = -∞
    sum = 0
    for i = mid downto low
        sum = sum + A[i]
        if sum > left-sum
            left-sum = sum
            max-left = i
    right-sum = -∞
    sum = 0
    for j = mid + 1 to high
        sum = sum + A[j]
        if sum > right-sum
            right-sum = sum
            max-right = j
    return (max-left, max-right, left-sum + right-sum)
  • Runs in Θ(n) time where n = high − low + 1

Main Algorithm

FIND-MAXIMUM-SUBARRAY(A, low, high)
    if high == low
        return (low, high, A[low])          // base case: one element
    else mid = ⌊(low + high)/2⌋
        (left-low, left-high, left-sum) =
            FIND-MAXIMUM-SUBARRAY(A, low, mid)
        (right-low, right-high, right-sum) =
            FIND-MAXIMUM-SUBARRAY(A, mid + 1, high)
        (cross-low, cross-high, cross-sum) =
            FIND-MAX-CROSSING-SUBARRAY(A, low, mid, high)
        if left-sum ≥ right-sum and left-sum ≥ cross-sum
            return (left-low, left-high, left-sum)
        elseif right-sum ≥ left-sum and right-sum ≥ cross-sum
            return (right-low, right-high, right-sum)
        else return (cross-low, cross-high, cross-sum)
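Both procedures translate almost line-for-line into Python (a 0-indexed sketch; names are lowercased versions of the pseudocode's):

```python
import math

def find_max_crossing_subarray(A, low, mid, high):
    """Best subarray that straddles mid: scan outward from the midpoint, Θ(n)."""
    left_sum = -math.inf
    s = 0
    for i in range(mid, low - 1, -1):      # grow A[i..mid] leftward
        s += A[i]
        if s > left_sum:
            left_sum, max_left = s, i
    right_sum = -math.inf
    s = 0
    for j in range(mid + 1, high + 1):     # grow A[mid+1..j] rightward
        s += A[j]
        if s > right_sum:
            right_sum, max_right = s, j
    return max_left, max_right, left_sum + right_sum

def find_maximum_subarray(A, low, high):
    """Return (start, end, sum) of a maximum subarray of A[low..high]."""
    if high == low:
        return low, high, A[low]           # base case: one element
    mid = (low + high) // 2
    ll, lh, ls = find_maximum_subarray(A, low, mid)
    rl, rh, rs = find_maximum_subarray(A, mid + 1, high)
    cl, ch, cs = find_max_crossing_subarray(A, low, mid, high)
    if ls >= rs and ls >= cs:
        return ll, lh, ls
    elif rs >= ls and rs >= cs:
        return rl, rh, rs
    else:
        return cl, ch, cs
```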

Analysis

The recurrence is:

  • T(1) = Θ(1)
  • T(n) = 2T(n/2) + Θ(n) for n > 1

Solution: T(n) = Θ(n lg n) — same as merge sort.

Note: A Θ(n) algorithm exists (Kadane's algorithm, a dynamic-programming approach), but the divide-and-conquer solution illustrates the paradigm.
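For comparison, a minimal sketch of Kadane's algorithm, which tracks the best subarray ending at each position (this version returns only the sum, for brevity):

```python
def kadane(A):
    """Maximum subarray sum in one pass: Θ(n) time, Θ(1) space."""
    best = current = A[0]
    for x in A[1:]:
        current = max(x, current + x)   # best sum of a subarray ending here
        best = max(best, current)
    return best
```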


4.2 Strassen's Algorithm for Matrix Multiplication

Standard Matrix Multiplication

For n × n matrices A and B, the product C = A · B is defined by:

c_ij = Σ(k=1 to n) a_ik · b_kj
SQUARE-MATRIX-MULTIPLY(A, B)
    n = A.rows
    let C be a new n × n matrix
    for i = 1 to n
        for j = 1 to n
            c_ij = 0
            for k = 1 to n
                c_ij = c_ij + a_ik · b_kj
    return C
  • Running time: Θ(n³) due to triply-nested loops
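A direct Python rendering of the pseudocode (a sketch; matrices are plain lists of row lists, 0-indexed):

```python
def square_matrix_multiply(A, B):
    """Textbook triple loop: Θ(n³) scalar multiplications."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C
```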

Simple Divide-and-Conquer Approach

Partition each n × n matrix into four (n/2 × n/2) submatrices:

A = | A11  A12 |    B = | B11  B12 |    C = | C11  C12 |
    | A21  A22 |        | B21  B22 |        | C21  C22 |

This gives:

  • C11 = A11·B11 + A12·B21
  • C12 = A11·B12 + A12·B22
  • C21 = A21·B11 + A22·B21
  • C22 = A21·B12 + A22·B22
SQUARE-MATRIX-MULTIPLY-RECURSIVE(A, B)
    n = A.rows
    let C be a new n × n matrix
    if n == 1
        c11 = a11 · b11
    else partition A, B, and C into n/2 × n/2 submatrices
        C11 = RECURSIVE(A11, B11) + RECURSIVE(A12, B21)
        C12 = RECURSIVE(A11, B12) + RECURSIVE(A12, B22)
        C21 = RECURSIVE(A21, B11) + RECURSIVE(A22, B21)
        C22 = RECURSIVE(A21, B12) + RECURSIVE(A22, B22)
    return C
  • 8 recursive multiplications of (n/2 × n/2) matrices + 4 matrix additions of Θ(n²)
  • Recurrence: T(n) = 8T(n/2) + Θ(n²)
  • Solution: T(n) = Θ(n³) — no improvement over the naive method

Key insight: Partitioning by index calculation takes Θ(1), not Θ(n²). Asymptotic notation subsumes constant factors, but recursive notation like T(n/2) does not — the factor of 8 matters.
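The eight-product recursion can be sketched in Python. Note one deliberate simplification: this sketch copies submatrices rather than partitioning by index calculation, which costs Θ(n²) per call but does not change the Θ(n³) solution. It assumes n is a power of 2, and all helper names are mine:

```python
def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    """Quarter M into (M11, M12, M21, M22); assumes len(M) is a power of 2."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])

def join(C11, C12, C21, C22):
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

def smm_recursive(A, B):
    """Eight recursive (n/2)x(n/2) products plus four Θ(n²) additions."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    return join(
        mat_add(smm_recursive(A11, B11), smm_recursive(A12, B21)),  # C11
        mat_add(smm_recursive(A11, B12), smm_recursive(A12, B22)),  # C12
        mat_add(smm_recursive(A21, B11), smm_recursive(A22, B21)),  # C21
        mat_add(smm_recursive(A21, B12), smm_recursive(A22, B22)),  # C22
    )
```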

Strassen's Method

The key idea: reduce from 8 to 7 recursive multiplications at the cost of more additions. This makes the recursion tree less "bushy."

Four steps:

  1. Divide A, B, C into n/2 × n/2 submatrices (Θ(1) by index calculation)
  2. Create 10 matrices S1..S10, each n/2 × n/2 (sums/differences of submatrices, Θ(n²) total)
  3. Recursively compute 7 products P1..P7 (each n/2 × n/2)
  4. Compute C submatrices from P1..P7 by addition/subtraction (Θ(n²))

Step 2 — The S matrices:

  • S1 = B12 − B22
  • S2 = A11 + A12
  • S3 = A21 + A22
  • S4 = B21 − B11
  • S5 = A11 + A22
  • S6 = B11 + B22
  • S7 = A12 − A22
  • S8 = B21 + B22
  • S9 = A11 − A21
  • S10 = B11 + B12

Step 3 — The 7 products:

  • P1 = A11 · S1
  • P2 = S2 · B22
  • P3 = S3 · B11
  • P4 = A22 · S4
  • P5 = S5 · S6
  • P6 = S7 · S8
  • P7 = S9 · S10

Step 4 — Combine:

  • C11 = P5 + P4 − P2 + P6
  • C12 = P1 + P2
  • C21 = P3 + P4
  • C22 = P5 + P1 − P3 − P7
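The four steps above can be sketched in Python (a sketch, not a tuned implementation: it assumes n is a power of 2 and copies submatrices rather than using Θ(1) index calculation; helper names are mine):

```python
def mat_add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def split(M):
    """Quarter M into (M11, M12, M21, M22); assumes len(M) is a power of 2."""
    h = len(M) // 2
    return ([row[:h] for row in M[:h]], [row[h:] for row in M[:h]],
            [row[:h] for row in M[h:]], [row[h:] for row in M[h:]])

def join(C11, C12, C21, C22):
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bottom = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bottom

def strassen(A, B):
    """Seven recursive products instead of eight: T(n) = 7T(n/2) + Θ(n²)."""
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]]
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Step 2: the ten S matrices (sums/differences of submatrices).
    S1 = mat_sub(B12, B22); S2 = mat_add(A11, A12)
    S3 = mat_add(A21, A22); S4 = mat_sub(B21, B11)
    S5 = mat_add(A11, A22); S6 = mat_add(B11, B22)
    S7 = mat_sub(A12, A22); S8 = mat_add(B21, B22)
    S9 = mat_sub(A11, A21); S10 = mat_add(B11, B12)
    # Step 3: the seven recursive products.
    P1 = strassen(A11, S1); P2 = strassen(S2, B22)
    P3 = strassen(S3, B11); P4 = strassen(A22, S4)
    P5 = strassen(S5, S6);  P6 = strassen(S7, S8)
    P7 = strassen(S9, S10)
    # Step 4: combine by addition/subtraction only.
    C11 = mat_add(mat_sub(mat_add(P5, P4), P2), P6)
    C12 = mat_add(P1, P2)
    C21 = mat_add(P3, P4)
    C22 = mat_sub(mat_sub(mat_add(P5, P1), P3), P7)
    return join(C11, C12, C21, C22)
```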

Analysis

  • Recurrence: T(n) = 7T(n/2) + Θ(n²)
  • Solution: T(n) = Θ(n^(lg 7)) ≈ Θ(n^2.81)

Practical Considerations

Strassen's algorithm is not always preferred in practice because:

  1. The constant factor hidden in Θ(n^(lg 7)) is larger than in Θ(n³)
  2. Not as efficient for sparse matrices
  3. Less numerically stable (larger floating-point errors accumulate)
  4. Recursive submatrices consume more memory

In practice, implementations use Strassen's for large matrices and switch to the naive method below a crossover point (system-dependent, typically n = 400–2150).


4.3 The Substitution Method for Solving Recurrences

Overview

The substitution method has two steps:

  1. Guess the form of the solution
  2. Use mathematical induction to find the constants and prove the guess correct

Example: T(n) = 2T(⌊n/2⌋) + n

Guess: T(n) = O(n lg n), i.e., T(n) ≤ cn lg n for some c > 0.

Inductive step: Assume T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋). Then:

T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n
     ≤ cn lg(n/2) + n
     = cn lg n − cn lg 2 + n
     = cn lg n − cn + n
     ≤ cn lg n          (when c ≥ 1)

Base case: T(1) = 1 but cn lg n = 0 at n = 1 — doesn't work! Fix: use T(2) and T(3) as base cases (n₀ = 2). Choose c ≥ 2 so that T(2) ≤ c·2·lg 2 and T(3) ≤ c·3·lg 3.
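The bound can also be sanity-checked numerically; a small sketch (assuming the concrete base case T(1) = 1, which the recurrence leaves unspecified):

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # T(n) = 2T(floor(n/2)) + n, with the concrete base case T(1) = 1.
    return 1 if n == 1 else 2 * T(n // 2) + n

# Inductive claim with c = 2 and base cases n0 = 2, 3:
#   T(2) = 4 <= 2*2*lg 2 = 4,  T(3) = 5 <= 2*3*lg 3 ~ 9.5
assert all(T(n) <= 2 * n * math.log2(n) for n in range(2, 4096))
```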

Making a Good Guess

  • Use similarity to known recurrences (e.g., T(n) = 2T(⌊n/2⌋ + 17) + n is still O(n lg n))
  • Start with loose bounds and tighten: prove Ω(n) lower bound and O(n²) upper bound, then converge
  • Use recursion trees to generate guesses

Subtleties: Strengthening the Inductive Hypothesis

For T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1:

  • Guessing T(n) ≤ cn gives T(n) ≤ cn + 1 — doesn't work!

  • Fix: Subtract a lower-order term. Guess T(n) ≤ cn − d (where d ≥ 0):

    T(n) ≤ (c⌊n/2⌋ − d) + (c⌈n/2⌉ − d) + 1
         = cn − 2d + 1
         ≤ cn − d          (when d ≥ 1) ✓
    

Counterintuitive: when induction fails, subtracting a lower-order term (strengthening the hypothesis) can be easier to prove than a weaker bound.

Avoiding Pitfalls

Do NOT do this:

T(n) ≤ 2(c⌊n/2⌋) + n ≤ cn + n = O(n)    ← WRONG!

You must prove the exact form T(n) ≤ cn, not just that T(n) = O(n).

Changing Variables

For T(n) = 2T(⌊√n⌋) + lg n:

  1. Let m = lg n, so T(2^m) = 2T(2^(m/2)) + m
  2. Let S(m) = T(2^m), giving S(m) = 2S(m/2) + m
  3. Solve: S(m) = O(m lg m)
  4. Back-substitute: T(n) = O(lg n · lg lg n)
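A quick numeric sanity check of the result (assuming the base case T(n) = 1 for n ≤ 2, which the recurrence leaves unspecified); for n of the form 2^(2^k) the floors vanish and the recursion unwinds exactly:

```python
import math

def T(n):
    # T(n) = 2T(floor(sqrt(n))) + lg n, with T(n) = 1 for n <= 2.
    if n <= 2:
        return 1
    return 2 * T(math.isqrt(n)) + math.log2(n)

# At n = 65536: lg n * lg lg n = 16 * 4 = 64, and T(65536) = 80,
# consistent with T(n) = O(lg n * lg lg n) up to the constant factor.
assert T(16) == 12.0       # 2*T(4) + 4 = 2*(2*1 + 2) + 4
assert T(65536) == 80.0
```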

4.4 The Recursion-Tree Method for Solving Recurrences

Overview

A recursion tree visualizes the cost of a recurrence:

  • Each node represents the cost of a single subproblem
  • Sum costs within each level to get per-level costs
  • Sum all levels to get the total cost

Best used to generate a good guess, then verify with the substitution method.

Example 1: T(n) = 3T(n/4) + cn²

Building the tree:

  • Root: cost cn²
  • Each node has 3 children, subproblem size shrinks by factor of 4
  • Depth i: 3^i nodes, each with cost c(n/4^i)²
  • Total cost at depth i: 3^i · c(n/4^i)² = (3/16)^i · cn²
  • Height: log₄ n (subproblem size n/4^i = 1 when i = log₄ n)
  • Leaves: 3^(log₄ n) = n^(log₄ 3) leaves, each Θ(1)

Summing all levels:

T(n) = Σ(i=0 to log₄n − 1) (3/16)^i · cn² + Θ(n^(log₄ 3))

Since 3/16 < 1, this is a decreasing geometric series:

T(n) < (1/(1 − 3/16)) · cn² + Θ(n^(log₄ 3))
     = (16/13) · cn² + Θ(n^(log₄ 3))
     = O(n²)

Verification by substitution: Show T(n) ≤ dn² for d ≥ (16/13)c. ✓

The root dominates the total cost — a constant fraction of the total is at the top level.
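The level-by-level accounting can be checked numerically for n a power of 4 (a sketch assuming c = 1 and T(1) = 1):

```python
def T(n):
    # T(n) = 3T(n/4) + n^2 with c = 1 and T(1) = 1, for n a power of 4.
    return 1 if n == 1 else 3 * T(n // 4) + n * n

n = 256                      # 4^4, so the tree has height log4 n = 4
leaves = 3 ** 4              # n^(log4 3) leaves, each costing Θ(1)
internal = sum((3 / 16) ** i * n * n for i in range(4))   # per-level costs
assert T(n) == internal + leaves      # tree sum matches the recurrence
assert internal < (16 / 13) * n * n   # decreasing geometric series bound
```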

Example 2: T(n) = T(n/3) + T(2n/3) + cn

  • Unbalanced tree — left branch shrinks faster than right
  • Every level sums to at most cn
  • Longest root-to-leaf path: n → (2/3)n → (2/3)²n → ... → 1, height = log₃/₂ n
  • Guess: T(n) = O(n lg n)

Verification: Show T(n) ≤ dn lg n:

T(n) ≤ d(n/3)lg(n/3) + d(2n/3)lg(2n/3) + cn
     = dn lg n − d(n/3·lg 3 + 2n/3·lg(3/2)) + cn
     = dn lg n − dn(lg 3 − 2/3) + cn
     ≤ dn lg n          (when d ≥ c/(lg 3 − 2/3)) ✓

4.5 The Master Method for Solving Recurrences

The Master Theorem

For recurrences of the form:

T(n) = aT(n/b) + f(n)

where a ≥ 1, b > 1, and f(n) is asymptotically positive:

Compare f(n) with n^(log_b a):

| Case | Condition | Solution |
| --- | --- | --- |
| 1 | f(n) = O(n^(log_b a − ε)) for some ε > 0 | T(n) = Θ(n^(log_b a)) |
| 2 | f(n) = Θ(n^(log_b a)) | T(n) = Θ(n^(log_b a) · lg n) |
| 3 | f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and af(n/b) ≤ cf(n) for some c < 1 | T(n) = Θ(f(n)) |

Intuition

  • Case 1: n^(log_b a) dominates → leaf costs dominate the tree
  • Case 2: f(n) and n^(log_b a) are equal → costs spread evenly, multiply by lg n
  • Case 3: f(n) dominates → root cost dominates the tree

Important Caveats

  • In Case 1, f(n) must be polynomially smaller than n^(log_b a) (by a factor of n^ε)
  • In Case 3, f(n) must be polynomially larger AND satisfy the regularity condition af(n/b) ≤ cf(n)
  • There are gaps between the cases where the master method does not apply (e.g., f(n) = n lg n with a = 2, b = 2)

Examples

Example 1: T(n) = 9T(n/3) + n

  • a = 9, b = 3, f(n) = n, n^(log₃ 9) = n²
  • f(n) = O(n^(2−1)) → Case 1: T(n) = Θ(n²)

Example 2: T(n) = T(2n/3) + 1

  • a = 1, b = 3/2, f(n) = 1, n^(log₃/₂ 1) = n⁰ = 1
  • f(n) = Θ(1) → Case 2: T(n) = Θ(lg n)

Example 3: T(n) = 3T(n/4) + n lg n

  • a = 3, b = 4, n^(log₄ 3) ≈ n^0.793
  • f(n) = n lg n = Ω(n^(0.793 + ε)) for ε ≈ 0.2
  • Regularity: 3(n/4)lg(n/4) ≤ (3/4)n lg n ✓
  • Case 3: T(n) = Θ(n lg n)

Example 4 (does NOT apply): T(n) = 2T(n/2) + n lg n

  • a = 2, b = 2, n^(log₂ 2) = n
  • f(n) = n lg n is asymptotically larger than n but not polynomially larger: lg n = o(n^ε) for every constant ε > 0
  • Falls in the gap between Case 2 and Case 3
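For the common special case f(n) = Θ(n^k) with k a constant, the case analysis can be automated; a sketch (the function and its return format are mine, and it deliberately cannot express non-polynomial drivers like n lg n, which fall in the gaps):

```python
import math

def master_case(a, b, k):
    """Simplified master theorem for T(n) = aT(n/b) + Θ(n^k).

    Returns (case, bound). For polynomial f(n) = n^k, case 3's regularity
    condition holds automatically: a*f(n/b) = (a / b^k) * n^k, and
    a / b^k = b^(log_b a - k) < 1 exactly when k > log_b a.
    """
    d = math.log(a, b)                   # critical exponent log_b a
    if math.isclose(k, d):
        return 2, f"Theta(n^{d:.3g} lg n)"
    if k < d:
        return 1, f"Theta(n^{d:.3g})"
    return 3, f"Theta(n^{k})"

# The worked examples above (and one case-3 instance of my own):
assert master_case(9, 3, 1)[0] == 1      # T(n) = 9T(n/3) + n  -> Θ(n²)
assert master_case(1, 1.5, 0)[0] == 2    # T(n) = T(2n/3) + 1  -> Θ(lg n)
assert master_case(1, 2, 1)[0] == 3      # T(n) = T(n/2) + n   -> Θ(n)
```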

Applying to Earlier Algorithms

| Algorithm | Recurrence | a | b | n^(log_b a) | Case | Solution |
| --- | --- | --- | --- | --- | --- | --- |
| Max subarray / merge sort | T(n) = 2T(n/2) + Θ(n) | 2 | 2 | n | 2 | Θ(n lg n) |
| Naive recursive matrix multiply | T(n) = 8T(n/2) + Θ(n²) | 8 | 2 | n³ | 1 | Θ(n³) |
| Strassen's algorithm | T(n) = 7T(n/2) + Θ(n²) | 7 | 2 | n^(lg 7) | 1 | Θ(n^(lg 7)) ≈ Θ(n^2.81) |

Summary Table

| Topic | Key Result |
| --- | --- |
| Maximum subarray (divide-and-conquer) | Θ(n lg n) |
| Maximum subarray (Kadane's / DP) | Θ(n) |
| Naive matrix multiplication | Θ(n³) |
| Recursive matrix multiplication (8 subproblems) | Θ(n³) |
| Strassen's algorithm (7 subproblems) | Θ(n^(lg 7)) ≈ Θ(n^2.81) |
| Best known matrix multiplication (Coppersmith-Winograd) | O(n^2.376) |
| Substitution method | Guess + induction |
| Recursion-tree method | Visualize + sum levels |
| Master method | T(n) = aT(n/b) + f(n), three cases |