Divide and Conquer
Introduction to Algorithms, Chapter 4: Divide-and-Conquer
Overview
The divide-and-conquer paradigm solves a problem recursively by applying three steps at each level of recursion:
- Divide the problem into a number of subproblems that are smaller instances of the same problem.
- Conquer the subproblems by solving them recursively. If subproblem sizes are small enough, solve them directly (base case).
- Combine the solutions to the subproblems into the solution for the original problem.
- When subproblems are large enough to solve recursively → recursive case
- When subproblems are small enough to solve directly → base case
Recurrences
Recurrences describe the running time of divide-and-conquer algorithms — equations or inequalities that describe a function in terms of its value on smaller inputs.
- Example (merge sort): T(n) = 2T(n/2) + Θ(n), solution: T(n) = Θ(n lg n)
- Unequal splits are possible: T(n) = T(2n/3) + T(n/3) + Θ(n)
- Linear recursion: T(n) = T(n − 1) + Θ(1)
Three methods for solving recurrences:
- Substitution method — guess a bound, prove by induction
- Recursion-tree method — convert to a tree, sum costs at each level
- Master method — cookbook method for recurrences of the form T(n) = aT(n/b) + f(n)
Technicalities
- Floors and ceilings are typically omitted for convenience (they usually don't affect asymptotic bounds)
- Boundary conditions (e.g., T(1) = Θ(1)) are generally omitted; T(n) is assumed constant for small n
- Recurrence inequalities: T(n) ≤ 2T(n/2) + Θ(n) → use O-notation; T(n) ≥ ... → use Ω-notation
4.1 The Maximum-Subarray Problem
Problem Statement
Given an array of numbers (containing some negatives), find the contiguous subarray whose values have the greatest sum.
- Motivated by the stock trading problem: maximize profit by buying low and selling high
- Transform daily prices into daily changes → finding the maximum subarray of changes gives the optimal buy/sell window
- Brute force: check all Θ(n²) pairs → Θ(n²) time
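The brute-force approach can be sketched directly in Python; keeping a running sum for each starting index avoids re-summing every subarray, which keeps it at Θ(n²) rather than Θ(n³) (function name is mine):

```python
def max_subarray_brute_force(A):
    """Try every pair of endpoints; Theta(n^2) subarrays in total.

    Returns (low, high, sum) for a maximum subarray of A (0-indexed).
    """
    best_low, best_high, best_sum = 0, 0, A[0]
    for i in range(len(A)):
        running = 0
        for j in range(i, len(A)):
            running += A[j]          # extend A[i..j] by one element
            if running > best_sum:
                best_low, best_high, best_sum = i, j, running
    return best_low, best_high, best_sum
```

On the 16-element example array from CLRS, this finds the subarray with sum 43.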
Divide-and-Conquer Solution
For subarray A[low..high], find midpoint mid = ⌊(low + high)/2⌋. A maximum subarray must lie in exactly one of:
- Entirely in A[low..mid] (left half)
- Entirely in A[mid+1..high] (right half)
- Crossing the midpoint (spans both halves)
Solve left and right halves recursively. Find the max crossing subarray in linear time, then return the best of the three.
Finding the Max Crossing Subarray
Any subarray crossing the midpoint is composed of A[i..mid] and A[mid+1..j]. Find each half's maximum independently.
FIND-MAX-CROSSING-SUBARRAY(A, low, mid, high)
    left-sum = -∞
    sum = 0
    for i = mid downto low
        sum = sum + A[i]
        if sum > left-sum
            left-sum = sum
            max-left = i
    right-sum = -∞
    sum = 0
    for j = mid + 1 to high
        sum = sum + A[j]
        if sum > right-sum
            right-sum = sum
            max-right = j
    return (max-left, max-right, left-sum + right-sum)
- Runs in Θ(n) time where n = high − low + 1
Main Algorithm
FIND-MAXIMUM-SUBARRAY(A, low, high)
    if high == low
        return (low, high, A[low])   // base case: one element
    else mid = ⌊(low + high)/2⌋
        (left-low, left-high, left-sum) =
            FIND-MAXIMUM-SUBARRAY(A, low, mid)
        (right-low, right-high, right-sum) =
            FIND-MAXIMUM-SUBARRAY(A, mid + 1, high)
        (cross-low, cross-high, cross-sum) =
            FIND-MAX-CROSSING-SUBARRAY(A, low, mid, high)
        if left-sum ≥ right-sum and left-sum ≥ cross-sum
            return (left-low, left-high, left-sum)
        elseif right-sum ≥ left-sum and right-sum ≥ cross-sum
            return (right-low, right-high, right-sum)
        else return (cross-low, cross-high, cross-sum)
Analysis
The recurrence is:
- T(1) = Θ(1)
- T(n) = 2T(n/2) + Θ(n) for n > 1
Solution: T(n) = Θ(n lg n) — same as merge sort.
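The two procedures translate almost line-for-line into Python; a 0-indexed sketch:

```python
import math

def find_max_crossing_subarray(A, low, mid, high):
    # Best subarray of the form A[i..mid], scanning leftward from mid.
    left_sum, s, max_left = -math.inf, 0, mid
    for i in range(mid, low - 1, -1):
        s += A[i]
        if s > left_sum:
            left_sum, max_left = s, i
    # Best subarray of the form A[mid+1..j], scanning rightward.
    right_sum, s, max_right = -math.inf, 0, mid + 1
    for j in range(mid + 1, high + 1):
        s += A[j]
        if s > right_sum:
            right_sum, max_right = s, j
    return max_left, max_right, left_sum + right_sum

def find_maximum_subarray(A, low, high):
    if low == high:
        return low, high, A[low]             # base case: one element
    mid = (low + high) // 2
    left = find_maximum_subarray(A, low, mid)
    right = find_maximum_subarray(A, mid + 1, high)
    cross = find_max_crossing_subarray(A, low, mid, high)
    return max(left, right, cross, key=lambda t: t[2])
```

Using `max` with a key replaces the pseudocode's three-way comparison; since `max` returns the first maximal element, ties break in the order left, right, cross, matching the pseudocode's ≥ tests.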
Note: A linear-time algorithm, Kadane's algorithm (dynamic programming), solves the problem in Θ(n), but the divide-and-conquer approach illustrates the paradigm.
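For contrast, a sketch of that linear-time approach (returning only the sum; tracking the endpoints needs a little extra bookkeeping):

```python
def max_subarray_kadane(A):
    """Kadane's algorithm: the best sum ending at position i is either
    A[i] alone or A[i] appended to the best sum ending at i - 1."""
    best = current = A[0]
    for x in A[1:]:
        current = max(x, current + x)   # extend the run or restart at x
        best = max(best, current)
    return best
```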
4.2 Strassen's Algorithm for Matrix Multiplication
Standard Matrix Multiplication
For n × n matrices A and B, the product C = A · B is defined by:
c_ij = Σ(k=1 to n) a_ik · b_kj
SQUARE-MATRIX-MULTIPLY(A, B)
    n = A.rows
    let C be a new n × n matrix
    for i = 1 to n
        for j = 1 to n
            c_ij = 0
            for k = 1 to n
                c_ij = c_ij + a_ik · b_kj
    return C
- Running time: Θ(n³) due to triply-nested loops
Simple Divide-and-Conquer Approach
Partition each n × n matrix into four (n/2 × n/2) submatrices:
A = | A11  A12 |    B = | B11  B12 |    C = | C11  C12 |
    | A21  A22 |        | B21  B22 |        | C21  C22 |
This gives:
- C11 = A11·B11 + A12·B21
- C12 = A11·B12 + A12·B22
- C21 = A21·B11 + A22·B21
- C22 = A21·B12 + A22·B22
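These four block equations can be spot-checked on a small concrete case; a self-contained sketch in plain Python (the helper names are mine):

```python
def matmul(X, Y):
    """Textbook Theta(n^3) product of square matrices (lists of rows)."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def madd(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def blocks(M):
    """Split M into its four n/2 x n/2 submatrices (M11, M12, M21, M22)."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
B = [[1, 0, 2, 0], [0, 1, 0, 2], [3, 0, 1, 0], [0, 3, 0, 1]]
A11, A12, A21, A22 = blocks(A)
B11, B12, B21, B22 = blocks(B)

C11 = madd(matmul(A11, B11), matmul(A12, B21))
C12 = madd(matmul(A11, B12), matmul(A12, B22))
C21 = madd(matmul(A21, B11), matmul(A22, B21))
C22 = madd(matmul(A21, B12), matmul(A22, B22))

# Reassemble the blocks and compare against the definition of the product.
C = ([l + r for l, r in zip(C11, C12)] +
     [l + r for l, r in zip(C21, C22)])
assert C == matmul(A, B)
```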
SQUARE-MATRIX-MULTIPLY-RECURSIVE(A, B)
    n = A.rows
    let C be a new n × n matrix
    if n == 1
        c11 = a11 · b11
    else partition A, B, and C into n/2 × n/2 submatrices
        C11 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A11, B11)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A12, B21)
        C12 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A11, B12)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A12, B22)
        C21 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A21, B11)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A22, B21)
        C22 = SQUARE-MATRIX-MULTIPLY-RECURSIVE(A21, B12)
              + SQUARE-MATRIX-MULTIPLY-RECURSIVE(A22, B22)
    return C
- 8 recursive multiplications of (n/2 × n/2) matrices + 4 matrix additions of Θ(n²)
- Recurrence: T(n) = 8T(n/2) + Θ(n²)
- Solution: T(n) = Θ(n³) — no improvement over the naive method
Key insight: Partitioning by index calculation takes Θ(1), not Θ(n²). Asymptotic notation subsumes constant factors, but recursive notation like T(n/2) does not — the factor of 8 matters.
Strassen's Method
The key idea: reduce from 8 to 7 recursive multiplications at the cost of more additions. This makes the recursion tree less "bushy."
Four steps:
- Divide A, B, C into n/2 × n/2 submatrices (Θ(1) by index calculation)
- Create 10 matrices S1..S10, each n/2 × n/2 (sums/differences of submatrices, Θ(n²) total)
- Recursively compute 7 products P1..P7 (each n/2 × n/2)
- Compute C submatrices from P1..P7 by addition/subtraction (Θ(n²))
Step 2 — The S matrices:
- S1 = B12 − B22
- S2 = A11 + A12
- S3 = A21 + A22
- S4 = B21 − B11
- S5 = A11 + A22
- S6 = B11 + B22
- S7 = A12 − A22
- S8 = B21 + B22
- S9 = A11 − A21
- S10 = B11 + B12
Step 3 — The 7 products:
- P1 = A11 · S1
- P2 = S2 · B22
- P3 = S3 · B11
- P4 = A22 · S4
- P5 = S5 · S6
- P6 = S7 · S8
- P7 = S9 · S10
Step 4 — Combine:
- C11 = P5 + P4 − P2 + P6
- C12 = P1 + P2
- C21 = P3 + P4
- C22 = P5 + P1 − P3 − P7
Analysis
- Recurrence: T(n) = 7T(n/2) + Θ(n²)
- Solution: T(n) = Θ(n^(lg 7)) ≈ Θ(n^2.81)
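The four steps can be sketched in pure Python. This is a teaching sketch, not an optimized implementation: it assumes n is a power of 2, copies submatrices instead of using index arithmetic, and recurses all the way to 1×1 base cases, whereas practical versions switch to the naive method at a crossover size:

```python
def strassen(A, B):
    """Multiply square matrices given as lists of rows; n a power of 2."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    add = lambda X, Y: [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    sub = lambda X, Y: [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]
    blk = lambda M, r, c: [row[c:c + h] for row in M[r:r + h]]
    # Step 1: partition into n/2 x n/2 blocks.
    A11, A12, A21, A22 = blk(A, 0, 0), blk(A, 0, h), blk(A, h, 0), blk(A, h, h)
    B11, B12, B21, B22 = blk(B, 0, 0), blk(B, 0, h), blk(B, h, 0), blk(B, h, h)
    # Steps 2-3: the S sums/differences folded into the 7 recursive products.
    P1 = strassen(A11, sub(B12, B22))            # A11 · S1
    P2 = strassen(add(A11, A12), B22)            # S2 · B22
    P3 = strassen(add(A21, A22), B11)            # S3 · B11
    P4 = strassen(A22, sub(B21, B11))            # A22 · S4
    P5 = strassen(add(A11, A22), add(B11, B22))  # S5 · S6
    P6 = strassen(sub(A12, A22), add(B21, B22))  # S7 · S8
    P7 = strassen(sub(A11, A21), add(B11, B12))  # S9 · S10
    # Step 4: combine into C's four blocks.
    C11 = add(sub(add(P5, P4), P2), P6)
    C12 = add(P1, P2)
    C21 = add(P3, P4)
    C22 = sub(sub(add(P5, P1), P3), P7)
    return ([l + r for l, r in zip(C11, C12)] +
            [l + r for l, r in zip(C21, C22)])
```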
Practical Considerations
Strassen's algorithm is not always preferred in practice because:
- The constant factor hidden in Θ(n^(lg 7)) is larger than in Θ(n³)
- Not as efficient for sparse matrices
- Less numerically stable (larger floating-point errors accumulate)
- Recursive submatrices consume more memory
In practice, implementations use Strassen's for large matrices and switch to the naive method below a crossover point (system-dependent, typically n = 400–2150).
4.3 The Substitution Method for Solving Recurrences
Overview
The substitution method has two steps:
- Guess the form of the solution
- Use mathematical induction to find the constants and prove the guess correct
Example: T(n) = 2T(⌊n/2⌋) + n
Guess: T(n) = O(n lg n), i.e., T(n) ≤ cn lg n for some c > 0.
Inductive step: Assume T(⌊n/2⌋) ≤ c⌊n/2⌋ lg(⌊n/2⌋). Then:
T(n) ≤ 2(c⌊n/2⌋ lg(⌊n/2⌋)) + n
≤ cn lg(n/2) + n
= cn lg n − cn lg 2 + n
= cn lg n − cn + n
≤ cn lg n (when c ≥ 1)
Base case: T(1) = 1 but cn lg n = 0 at n = 1 — doesn't work! Fix: use T(2) and T(3) as base cases (n₀ = 2). Choose c ≥ 2 so that T(2) ≤ c·2·lg 2 and T(3) ≤ c·3·lg 3.
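The full claim (base cases n = 2, 3 with c = 2) can also be checked numerically; a small sketch, assuming T(1) = 1:

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    # The recurrence T(n) = 2*T(floor(n/2)) + n with T(1) = 1.
    return 1 if n == 1 else 2 * T(n // 2) + n

# With base cases n = 2 and n = 3 and c = 2, T(n) <= c*n*lg(n) for n >= 2.
c = 2
assert all(T(n) <= c * n * math.log2(n) for n in range(2, 5000))
```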
Making a Good Guess
- Use similarity to known recurrences (e.g., T(n) = 2T(⌊n/2⌋ + 17) + n is still O(n lg n))
- Start with loose bounds and tighten: prove Ω(n) lower bound and O(n²) upper bound, then converge
- Use recursion trees to generate guesses
Subtleties: Strengthening the Inductive Hypothesis
For T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + 1:
- Guessing T(n) ≤ cn gives T(n) ≤ cn + 1 — doesn't work!
- Fix: Subtract a lower-order term. Guess T(n) ≤ cn − d (where d ≥ 0):
  T(n) ≤ (c⌊n/2⌋ − d) + (c⌈n/2⌉ − d) + 1 = cn − 2d + 1 ≤ cn − d (when d ≥ 1) ✓
Counterintuitive: when induction fails, subtracting a lower-order term (strengthening the hypothesis) can be easier to prove than a weaker bound.
Avoiding Pitfalls
Do NOT do this:
T(n) ≤ 2(c⌊n/2⌋) + n ≤ cn + n = O(n) ← WRONG!
You must prove the exact form T(n) ≤ cn, not just that T(n) = O(n).
Changing Variables
For T(n) = 2T(⌊√n⌋) + lg n:
- Let m = lg n, so T(2^m) = 2T(2^(m/2)) + m
- Let S(m) = T(2^m), giving S(m) = 2S(m/2) + m
- Solve: S(m) = O(m lg m)
- Back-substitute: T(n) = O(lg n · lg lg n)
4.4 The Recursion-Tree Method for Solving Recurrences
Overview
A recursion tree visualizes the cost of a recurrence:
- Each node represents the cost of a single subproblem
- Sum costs within each level to get per-level costs
- Sum all levels to get the total cost
Best used to generate a good guess, then verify with the substitution method.
Example 1: T(n) = 3T(n/4) + cn²
Building the tree:
- Root: cost cn²
- Each node has 3 children, subproblem size shrinks by factor of 4
- Depth i: 3^i nodes, each with cost c(n/4^i)²
- Total cost at depth i: 3^i · c(n/4^i)² = (3/16)^i · cn²
- Height: log₄ n (subproblem size n/4^i = 1 when i = log₄ n)
- Leaves: 3^(log₄ n) = n^(log₄ 3) leaves, each Θ(1)
Summing all levels:
T(n) = Σ(i=0 to log₄n − 1) (3/16)^i · cn² + Θ(n^(log₄ 3))
Since 3/16 < 1, this is a decreasing geometric series:
T(n) < (1/(1 − 3/16)) · cn² + Θ(n^(log₄ 3))
= (16/13) · cn² + Θ(n^(log₄ 3))
= **O(n²)**
Verification by substitution: Show T(n) ≤ dn² for d ≥ (16/13)c. ✓
The root dominates the total cost — a constant fraction of the total is at the top level.
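The per-level costs can be tabulated directly; a small numeric sketch of the geometric series, choosing c = 1 and n = 4¹⁰ so the tree has depth log₄ n = 10 (the leaf level's Θ(n^(log₄ 3)) cost is omitted here):

```python
# Per-level costs of the recursion tree for T(n) = 3T(n/4) + c*n^2:
# level i has 3^i nodes of cost c*(n/4^i)^2, totalling (3/16)^i * c*n^2.
c, n = 1.0, 4 ** 10
levels = [(3 / 16) ** i * c * n ** 2 for i in range(10)]
internal = sum(levels)

# The finite sum is below the infinite geometric bound (16/13)*c*n^2 ...
assert internal < (16 / 13) * c * n ** 2
# ... and the root alone carries a constant fraction of the total.
assert levels[0] / internal > 13 / 16
```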
Example 2: T(n) = T(n/3) + T(2n/3) + cn
- Unbalanced tree — left branch shrinks faster than right
- Every level sums to at most cn
- Longest root-to-leaf path: n → (2/3)n → (2/3)²n → ... → 1, height = log₃/₂ n
- Guess: T(n) = O(n lg n)
Verification: Show T(n) ≤ dn lg n:
T(n) ≤ d(n/3)lg(n/3) + d(2n/3)lg(2n/3) + cn
= dn lg n − d(n/3·lg 3 + 2n/3·lg(3/2)) + cn
= dn lg n − dn(lg 3 − 2/3) + cn
≤ dn lg n (when d ≥ c/(lg 3 − 2/3)) ✓
4.5 The Master Method for Solving Recurrences
The Master Theorem
For recurrences of the form:
T(n) = aT(n/b) + f(n)
where a ≥ 1, b > 1, and f(n) is asymptotically positive:
Compare f(n) with n^(log_b a):
| Case | Condition | Solution |
|---|---|---|
| 1 | f(n) = O(n^(log_b a − ε)) for some ε > 0 | T(n) = Θ(n^(log_b a)) |
| 2 | f(n) = Θ(n^(log_b a)) | T(n) = Θ(n^(log_b a) · lg n) |
| 3 | f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and af(n/b) ≤ cf(n) for some c < 1 | T(n) = Θ(f(n)) |
Intuition
- Case 1: n^(log_b a) dominates → leaf costs dominate the tree
- Case 2: f(n) and n^(log_b a) are equal → costs spread evenly, multiply by lg n
- Case 3: f(n) dominates → root cost dominates the tree
Important Caveats
- In Case 1, f(n) must be polynomially smaller than n^(log_b a) (by a factor of n^ε)
- In Case 3, f(n) must be polynomially larger AND satisfy the regularity condition af(n/b) ≤ cf(n)
- There are gaps between the cases where the master method does not apply (e.g., f(n) = n lg n with a = 2, b = 2)
Examples
Example 1: T(n) = 9T(n/3) + n
- a = 9, b = 3, f(n) = n, n^(log₃ 9) = n²
- f(n) = n = O(n^(2−ε)) for ε = 1 → Case 1: T(n) = Θ(n²)
Example 2: T(n) = T(2n/3) + 1
- a = 1, b = 3/2, f(n) = 1, n^(log₃/₂ 1) = n⁰ = 1
- f(n) = Θ(1) → Case 2: T(n) = Θ(lg n)
Example 3: T(n) = 3T(n/4) + n lg n
- a = 3, b = 4, n^(log₄ 3) ≈ n^0.793
- f(n) = n lg n = Ω(n^(0.793 + ε)) for ε ≈ 0.2
- Regularity: 3(n/4)lg(n/4) ≤ (3/4)n lg n ✓
- Case 3: T(n) = Θ(n lg n)
Example 4 (does NOT apply): T(n) = 2T(n/2) + n lg n
- a = 2, b = 2, n^(log₂ 2) = n
- f(n) = n lg n is asymptotically larger than n but not polynomially larger: the ratio f(n)/n = lg n grows more slowly than n^ε for every constant ε > 0
- Falls in the gap between Case 2 and Case 3
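The case analysis for polynomial driving functions can be captured in a few lines. A sketch (the function `master_case` is mine; it covers only f(n) = n^k, since extra log factors, as in Example 4, can fall into the gaps between cases):

```python
import math

def master_case(a, b, k):
    """Classify T(n) = a*T(n/b) + Theta(n^k) by the master theorem.

    For f(n) = n^k with k > log_b(a), the regularity condition
    a*(n/b)^k <= c*n^k holds automatically with c = a/b^k < 1.
    """
    crit = math.log(a, b)           # exponent of the watershed n^(log_b a)
    if math.isclose(k, crit):
        return 2                    # T(n) = Theta(n^crit * lg n)
    return 1 if k < crit else 3     # Theta(n^crit) or Theta(n^k)

# The worked examples above:
assert master_case(9, 3, 1) == 1        # Example 1
assert master_case(1, 3 / 2, 0) == 2    # Example 2
```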
Applying to Earlier Algorithms
| Algorithm | Recurrence | a | b | n^(log_b a) | Case | Solution |
|---|---|---|---|---|---|---|
| Max subarray / Merge sort | T(n) = 2T(n/2) + Θ(n) | 2 | 2 | n | 2 | Θ(n lg n) |
| Naive recursive matrix multiply | T(n) = 8T(n/2) + Θ(n²) | 8 | 2 | n³ | 1 | Θ(n³) |
| Strassen's algorithm | T(n) = 7T(n/2) + Θ(n²) | 7 | 2 | n^(lg 7) | 1 | Θ(n^(lg 7)) ≈ Θ(n^2.81) |
Summary Table
| Topic | Key Result |
|---|---|
| Maximum subarray (divide-and-conquer) | Θ(n lg n) |
| Maximum subarray (Kadane's / DP) | Θ(n) |
| Naive matrix multiplication | Θ(n³) |
| Recursive matrix multiplication (8 subproblems) | Θ(n³) |
| Strassen's algorithm (7 subproblems) | Θ(n^(lg 7)) ≈ Θ(n^2.81) |
| Best known matrix multiplication (Coppersmith-Winograd) | O(n^2.376) |
| Substitution method | Guess + induction |
| Recursion-tree method | Visualize + sum levels |
| Master method | T(n) = aT(n/b) + f(n), three cases |