Pagefy

Book

Introduction to Algorithms

1.The Role of Algorithms in Computing 2.Getting Started 3.Growth of Functions 4.Divide and Conquer 5.Probabilistic Analysis and Randomized Algorithms 6.Heapsort 7.Quicksort 8.Sorting in Linear Time 9.Medians and Order Statistics 10.Elementary Data Structures 11.Hash Tables 12.Binary Search Trees 13.Red Black Trees 14.Augmenting Data Structures 15.Dynamic Programming 16.Greedy Algorithms 17.Amortized Analysis 18.B Trees 19.Fibonacci Heaps 20.van Emde Boas Trees 21.Data Structures for Disjoint Sets 22.Elementary Graph Algorithms 23.Minimum Spanning Trees 24.Single Source Shortest Paths 25.All Pairs Shortest Paths 26.Maximum Flow 27.Multithreaded Algorithms 28.Matrix Operations 29.Linear Programming 30.Polynomials and the FFT 31.Number Theoretic Algorithms 32.String Matching 33.Computational Geometry 34.NP Completeness 35.Approximation Algorithms 36.Appendix A Summations 37.Appendix B Sets Etc 38.Appendix C Counting and Probability 39.Appendix D Matrices

Polynomials and the FFT

Introduction to Algorithms

Chapter 30: Polynomials and the FFT

The fast Fourier transform (FFT) reduces polynomial multiplication from Θ(n²) to Θ(n lg n) time. The key insight: multiply in point-value form (Θ(n)), and use the FFT to convert between coefficient and point-value representations in Θ(n lg n).

Common applications include signal processing, digital audio/video compression (e.g., MP3), and computing convolutions.

30.1 Representing Polynomials

A polynomial in variable x over a field F:

$$A(x) = \sum_{j=0}^{n-1} a_j x^j$$

Degree: highest nonzero coefficient index k, written degree(A) = k
Degree-bound: any integer strictly greater than the degree (a polynomial of degree-bound n has degree between 0 and n−1)

Polynomial Operations

Addition: C(x) = A(x) + B(x), where c_j = a_j + b_j — takes Θ(n) time
Multiplication: C(x) = A(x)·B(x), a degree-bound 2n−1 polynomial where:
- c_j = Σ(k=0 to j) a_k · b_{j−k}
- This is the convolution of vectors a and b, denoted c = a ⊗ b
- Straightforward method takes Θ(n²) time

Coefficient Representation

A polynomial A(x) of degree-bound n is represented by the vector a = (a₀, a₁, ..., a_{n−1}).

Evaluation at a point x₀ using Horner's rule in Θ(n):
- A(x₀) = a₀ + x₀(a₁ + x₀(a₂ + ··· + x₀(a_{n−2} + x₀·a_{n−1}) ···))
Addition: Θ(n)
Multiplication: Θ(n²)

Point-Value Representation

A polynomial A(x) of degree-bound n is represented by n point-value pairs:

{(x₀, y₀), (x₁, y₁), ..., (x_{n−1}, y_{n−1})}

where all x_k are distinct and y_k = A(x_k).

Evaluation (coefficient → point-value): evaluate at n points — Θ(n²) naively, Θ(n lg n) with FFT
Interpolation (point-value → coefficient): the inverse operation

Theorem 30.1 (Uniqueness of Interpolating Polynomial): For any set of n point-value pairs with distinct x_k values, there is a unique polynomial of degree-bound n passing through them.

Proof uses the Vandermonde matrix V(x₀, ..., x_{n−1}), which is invertible when all x_k are distinct
Interpolation via LU decomposition: O(n³)
Lagrange's formula (Θ(n²)):

$$A(x) = \sum_{k=0}^{n-1} y_k \prod_{j \neq k} \frac{x - x_j}{x_k - x_j}$$

Operations in Point-Value Form

Operation	Time
Addition	Θ(n) — add corresponding y-values
Multiplication	Θ(n) — multiply corresponding y-values

Important caveat for multiplication: Since degree(C) = degree(A) + degree(B), we need 2n point-value pairs (an "extended" representation) to uniquely determine the product polynomial of degree-bound 2n.

Fast Multiplication of Polynomials in Coefficient Form

The strategy (illustrated in Figure 30.1 of CLRS):

Coefficient form                          Coefficient form
a₀,...,a_{n-1}  ──── Ordinary mult Θ(n²) ────►  c₀,...,c_{2n-2}
b₀,...,b_{n-1}
    │                                              ▲
    │ Evaluate (FFT)                               │ Interpolate (Inverse FFT)
    │ Θ(n lg n)                                    │ Θ(n lg n)
    ▼                                              │
Point-value form ── Pointwise mult Θ(n) ──► Point-value form

Procedure for Θ(n lg n) polynomial multiplication (n assumed a power of 2):

Double degree-bound: pad A(x) and B(x) with n high-order zero coefficients → degree-bound 2n
Evaluate: apply FFT of order 2n to get point-value representations at the (2n)th roots of unity
Pointwise multiply: multiply corresponding values — Θ(n)
Interpolate: apply inverse FFT to recover coefficient representation

Steps 1, 3: Θ(n). Steps 2, 4: Θ(n lg n). Total: Θ(n lg n).

Theorem 30.2: Two polynomials of degree-bound n can be multiplied in Θ(n lg n) time, with both input and output in coefficient form.

30.2 The DFT and FFT

Complex Roots of Unity

A complex nth root of unity is a complex number ω such that ωⁿ = 1.

There are exactly n such roots: e^{2πik/n} for k = 0, 1, ..., n−1
Using Euler's formula: e^{iu} = cos(u) + i·sin(u)
The n roots are equally spaced on the unit circle in the complex plane

The principal nth root of unity:

$$\omega_n = e^{2\pi i / n}$$

All other nth roots are powers of ω_n: ω_n⁰, ω_n¹, ..., ω_n^{n−1}.

These roots form a group under multiplication isomorphic to (Z_n, +).

Essential Properties

Lemma 30.3 (Cancellation Lemma): For any n ≥ 0, k ≥ 0, d > 0:

$$\omega_{dn}^{dk} = \omega_n^k$$

Corollary 30.4: For any even n > 0: ω_n^{n/2} = ω₂ = −1

Lemma 30.5 (Halving Lemma): If n > 0 is even, the squares of the n complex nth roots of unity are the n/2 complex (n/2)th roots of unity. Each (n/2)th root appears exactly twice:

(ω_n^k)² = ω_{n/2}^k
ω_n^{k+n/2} = −ω_n^k, so (ω_n^{k+n/2})² = (ω_n^k)²

This is essential for the divide-and-conquer approach — subproblems are half the size.

Lemma 30.6 (Summation Lemma): For any n ≥ 1 and nonzero integer k not divisible by n:

$$\sum_{j=0}^{n-1} (\omega_n^k)^j = 0$$

The DFT

Given polynomial A(x) = Σ a_j x^j of degree-bound n, evaluate at ω_n⁰, ω_n¹, ..., ω_n^{n−1}:

$$y_k = A(\omega_n^k) = \sum_{j=0}^{n-1} a_j \omega_n^{kj}$$

The vector y = (y₀, y₁, ..., y_{n−1}) is the Discrete Fourier Transform (DFT) of coefficient vector a. Written y = DFT_n(a).

As a matrix equation: y = V_n · a, where V_n is the Vandermonde matrix with (k, j) entry ω_n^{kj}.

The FFT Algorithm

The fast Fourier transform computes DFT_n(a) in Θ(n lg n) time using divide-and-conquer. Requires n to be a power of 2.

Key idea: Separate A(x) into even-indexed and odd-indexed coefficients:

A⁰(x) = a₀ + a₂x + a₄x² + ··· + a_{n−2}x^{n/2−1} (even-indexed)
A¹(x) = a₁ + a₃x + a₅x² + ··· + a_{n−1}x^{n/2−1} (odd-indexed)

Then: A(x) = A⁰(x²) + x·A¹(x²)

By the halving lemma, evaluating A at n nth roots of unity reduces to evaluating A⁰ and A¹ at n/2 (n/2)th roots of unity — two subproblems of half the size.

RECURSIVE-FFT(a)
  n = a.length                    // n is a power of 2
  if n == 1
      return a
  ω_n = e^{2πi/n}
  ω = 1
  a⁰ = (a₀, a₂, ..., a_{n-2})
  a¹ = (a₁, a₃, ..., a_{n-1})
  y⁰ = RECURSIVE-FFT(a⁰)
  y¹ = RECURSIVE-FFT(a¹)
  for k = 0 to n/2 - 1
      y_k       = y⁰_k + ω · y¹_k
      y_{k+n/2} = y⁰_k - ω · y¹_k
      ω = ω · ω_n
  return y

Lines 11–12 explanation:

For k = 0, ..., n/2−1: y_k = A⁰(ω_n^{2k}) + ω_n^k · A¹(ω_n^{2k}) = A(ω_n^k)
For k+n/2: y_{k+n/2} = A⁰(ω_n^{2k}) − ω_n^k · A¹(ω_n^{2k}) = A(ω_n^{k+n/2})
- Uses the fact that ω_n^{k+n/2} = −ω_n^k

The factors ω_n^k are called twiddle factors.

Running time: T(n) = 2T(n/2) + Θ(n) = Θ(n lg n)

Interpolation at Complex Roots of Unity (Inverse DFT)

The DFT as a matrix equation: y = V_n · a

Theorem 30.7: The (j, k) entry of V_n⁻¹ is ω_n^{−kj} / n.

Therefore the inverse DFT:

$$a_j = \frac{1}{n} \sum_{k=0}^{n-1} y_k \omega_n^{-kj}$$

To compute DFT⁻¹ in Θ(n lg n): modify the FFT to:

Swap roles of a and y
Replace ω_n with ω_n⁻¹
Divide each result element by n

Theorem 30.8 (Convolution Theorem): For any two vectors a and b of length n (power of 2), padded with zeros to length 2n:

$$a \otimes b = \text{DFT}{2n}^{-1}(\text{DFT}{2n}(a) \cdot \text{DFT}_{2n}(b))$$

where · denotes componentwise multiplication.

30.3 Efficient FFT Implementations

The Butterfly Operation

The core operation of the FFT loop, optimized to avoid redundant computation:

for k = 0 to n/2 - 1
    t = ω · y¹_k              // twiddle factor times odd element
    y_k       = y⁰_k + t
    y_{k+n/2} = y⁰_k - t
    ω = ω · ω_n

This is the butterfly operation: multiply by twiddle factor, then add and subtract. Eliminates the common subexpression ω_n^k · y¹_k that was computed twice in the naive loop.

Bit-Reversal Permutation

To convert the recursive FFT to an iterative bottom-up algorithm, the input must be reordered according to a bit-reversal permutation.

Place element a_k at position A[rev(k)], where rev(k) reverses the lg(n) bits of k
Example for n = 8: input order becomes (a₀, a₄, a₂, a₆, a₁, a₅, a₃, a₇)
- Indices: 0,4,2,6,1,5,3,7
- Binary: 000,100,010,110,001,101,011,111
- Reversed: 000,001,010,011,100,101,110,111

Intuition: At each tree level, even-indexed elements go left, odd-indexed go right. Stripping the low-order bit at each level produces the bit-reversal order at the leaves.

BIT-REVERSE-COPY(a, A)
  n = a.length
  for k = 0 to n - 1
      A[rev(k)] = a_k

Iterative FFT

ITERATIVE-FFT(a)
  BIT-REVERSE-COPY(a, A)
  n = a.length                        // n is a power of 2
  for s = 1 to lg(n)                  // stage s: combine 2^(s-1)-element DFTs into 2^s-element DFTs
      m = 2^s
      ω_m = e^{2πi/m}
      for k = 0 to n - 1 by m         // each group of m elements
          ω = 1
          for j = 0 to m/2 - 1        // each butterfly in the group
              t = ω · A[k + j + m/2]
              u = A[k + j]
              A[k + j]       = u + t
              A[k + j + m/2] = u - t
              ω = ω · ω_m
  return A

How it works (bottom-up):

After bit-reversal, take elements in pairs → compute 2-element DFTs via butterfly
Take pairs of 2-element DFTs → compute 4-element DFTs
Continue doubling until the full n-element DFT is assembled

Running time: Θ(n lg n)

BIT-REVERSE-COPY: O(n lg n), or Θ(n) with a precomputed lookup table
Innermost loop body executes L(n) = Σ(s=1 to lg n) (n/2^s)·2^{s−1} = Σ(s=1 to lg n) n/2 = Θ(n lg n)

Parallel FFT Circuit

The iterative FFT naturally maps to a parallel circuit:

Input: bit-reverse permuted values on n wires
lg(n) stages, each with n/2 butterfly operations executed in parallel
Circuit depth: Θ(lg n)
Total work: Θ(n lg n) butterfly operations

For stage s (s = 1, 2, ..., lg n):

n/2^s groups of butterflies
2^{s−1} butterflies per group
Twiddle factors: ω_m⁰, ω_m¹, ..., ω_m^{m/2−1} where m = 2^s

Example for n = 8:

Stage 1 (s=1): 4 butterflies with twiddle factor ω₂⁰
Stage 2 (s=2): 4 butterflies with twiddle factors ω₄⁰, ω₄¹
Stage 3 (s=3): 4 butterflies with twiddle factors ω₈⁰, ω₈¹, ω₈², ω₈³

Summary Table

Operation	Coefficient Form	Point-Value Form
Evaluation	Θ(n) (Horner's)	—
Addition	Θ(n)	Θ(n)
Multiplication	Θ(n²) naive / Θ(n lg n) via FFT	Θ(n)
Interpolation	—	Θ(n²) Lagrange / Θ(n lg n) inverse FFT

Key takeaway: The FFT exploits complex roots of unity and the halving lemma to convert between coefficient and point-value representations in Θ(n lg n), enabling fast polynomial multiplication via the convolution theorem.

Previous chapter

Linear Programming

Next chapter

Number Theoretic Algorithms