Polynomials and the FFT
Introduction to AlgorithmsChapter 30: Polynomials and the FFT
The fast Fourier transform (FFT) reduces polynomial multiplication from Θ(n²) to Θ(n lg n) time. The key insight: multiply in point-value form (Θ(n)), and use the FFT to convert between coefficient and point-value representations in Θ(n lg n).
Common applications include signal processing, digital audio/video compression (e.g., MP3), and computing convolutions.
30.1 Representing Polynomials
A polynomial in variable x over a field F:
$$A(x) = \sum_{j=0}^{n-1} a_j x^j$$
- Degree: highest nonzero coefficient index k, written degree(A) = k
- Degree-bound: any integer strictly greater than the degree (a polynomial of degree-bound n has degree between 0 and n−1)
Polynomial Operations
- Addition: C(x) = A(x) + B(x), where c_j = a_j + b_j — takes Θ(n) time
- Multiplication: C(x) = A(x)·B(x), a degree-bound 2n−1 polynomial where:
- c_j = Σ(k=0 to j) a_k · b_{j−k}
- This is the convolution of vectors a and b, denoted c = a ⊗ b
- Straightforward method takes Θ(n²) time
Coefficient Representation
A polynomial A(x) of degree-bound n is represented by the vector a = (a₀, a₁, ..., a_{n−1}).
- Evaluation at a point x₀ using Horner's rule in Θ(n):
- A(x₀) = a₀ + x₀(a₁ + x₀(a₂ + ··· + x₀(a_{n−2} + x₀·a_{n−1}) ···))
- Addition: Θ(n)
- Multiplication: Θ(n²)
Point-Value Representation
A polynomial A(x) of degree-bound n is represented by n point-value pairs:
{(x₀, y₀), (x₁, y₁), ..., (x_{n−1}, y_{n−1})}
where all x_k are distinct and y_k = A(x_k).
- Evaluation (coefficient → point-value): evaluate at n points — Θ(n²) naively, Θ(n lg n) with FFT
- Interpolation (point-value → coefficient): the inverse operation
Theorem 30.1 (Uniqueness of Interpolating Polynomial): For any set of n point-value pairs with distinct x_k values, there is a unique polynomial of degree-bound n passing through them.
- Proof uses the Vandermonde matrix V(x₀, ..., x_{n−1}), which is invertible when all x_k are distinct
- Interpolation via LU decomposition: O(n³)
- Lagrange's formula (Θ(n²)):
$$A(x) = \sum_{k=0}^{n-1} y_k \prod_{j \neq k} \frac{x - x_j}{x_k - x_j}$$
Operations in Point-Value Form
| Operation | Time |
|---|---|
| Addition | Θ(n) — add corresponding y-values |
| Multiplication | Θ(n) — multiply corresponding y-values |
Important caveat for multiplication: Since degree(C) = degree(A) + degree(B), we need 2n point-value pairs (an "extended" representation) to uniquely determine the product polynomial of degree-bound 2n.
Fast Multiplication of Polynomials in Coefficient Form
The strategy (illustrated in Figure 30.1 of CLRS):
Coefficient form Coefficient form
a₀,...,a_{n-1} ──── Ordinary mult Θ(n²) ────► c₀,...,c_{2n-2}
b₀,...,b_{n-1}
│ ▲
│ Evaluate (FFT) │ Interpolate (Inverse FFT)
│ Θ(n lg n) │ Θ(n lg n)
▼ │
Point-value form ── Pointwise mult Θ(n) ──► Point-value form
Procedure for Θ(n lg n) polynomial multiplication (n assumed a power of 2):
- Double degree-bound: pad A(x) and B(x) with n high-order zero coefficients → degree-bound 2n
- Evaluate: apply FFT of order 2n to get point-value representations at the (2n)th roots of unity
- Pointwise multiply: multiply corresponding values — Θ(n)
- Interpolate: apply inverse FFT to recover coefficient representation
Steps 1, 3: Θ(n). Steps 2, 4: Θ(n lg n). Total: Θ(n lg n).
Theorem 30.2: Two polynomials of degree-bound n can be multiplied in Θ(n lg n) time, with both input and output in coefficient form.
30.2 The DFT and FFT
Complex Roots of Unity
A complex nth root of unity is a complex number ω such that ωⁿ = 1.
- There are exactly n such roots: e^{2πik/n} for k = 0, 1, ..., n−1
- Using Euler's formula: e^{iu} = cos(u) + i·sin(u)
- The n roots are equally spaced on the unit circle in the complex plane
The principal nth root of unity:
$$\omega_n = e^{2\pi i / n}$$
All other nth roots are powers of ω_n: ω_n⁰, ω_n¹, ..., ω_n^{n−1}.
These roots form a group under multiplication isomorphic to (Z_n, +).
Essential Properties
Lemma 30.3 (Cancellation Lemma): For any n ≥ 0, k ≥ 0, d > 0:
$$\omega_{dn}^{dk} = \omega_n^k$$
Corollary 30.4: For any even n > 0: ω_n^{n/2} = ω₂ = −1
Lemma 30.5 (Halving Lemma): If n > 0 is even, the squares of the n complex nth roots of unity are the n/2 complex (n/2)th roots of unity. Each (n/2)th root appears exactly twice:
- (ω_n^k)² = ω_{n/2}^k
- ω_n^{k+n/2} = −ω_n^k, so (ω_n^{k+n/2})² = (ω_n^k)²
This is essential for the divide-and-conquer approach — subproblems are half the size.
Lemma 30.6 (Summation Lemma): For any n ≥ 1 and nonzero integer k not divisible by n:
$$\sum_{j=0}^{n-1} (\omega_n^k)^j = 0$$
The DFT
Given polynomial A(x) = Σ a_j x^j of degree-bound n, evaluate at ω_n⁰, ω_n¹, ..., ω_n^{n−1}:
$$y_k = A(\omega_n^k) = \sum_{j=0}^{n-1} a_j \omega_n^{kj}$$
The vector y = (y₀, y₁, ..., y_{n−1}) is the Discrete Fourier Transform (DFT) of coefficient vector a. Written y = DFT_n(a).
As a matrix equation: y = V_n · a, where V_n is the Vandermonde matrix with (k, j) entry ω_n^{kj}.
The FFT Algorithm
The fast Fourier transform computes DFT_n(a) in Θ(n lg n) time using divide-and-conquer. Requires n to be a power of 2.
Key idea: Separate A(x) into even-indexed and odd-indexed coefficients:
- A⁰(x) = a₀ + a₂x + a₄x² + ··· + a_{n−2}x^{n/2−1} (even-indexed)
- A¹(x) = a₁ + a₃x + a₅x² + ··· + a_{n−1}x^{n/2−1} (odd-indexed)
Then: A(x) = A⁰(x²) + x·A¹(x²)
By the halving lemma, evaluating A at n nth roots of unity reduces to evaluating A⁰ and A¹ at n/2 (n/2)th roots of unity — two subproblems of half the size.
RECURSIVE-FFT(a)
n = a.length // n is a power of 2
if n == 1
return a
ω_n = e^{2πi/n}
ω = 1
a⁰ = (a₀, a₂, ..., a_{n-2})
a¹ = (a₁, a₃, ..., a_{n-1})
y⁰ = RECURSIVE-FFT(a⁰)
y¹ = RECURSIVE-FFT(a¹)
for k = 0 to n/2 - 1
y_k = y⁰_k + ω · y¹_k
y_{k+n/2} = y⁰_k - ω · y¹_k
ω = ω · ω_n
return y
Lines 11–12 explanation:
- For k = 0, ..., n/2−1: y_k = A⁰(ω_n^{2k}) + ω_n^k · A¹(ω_n^{2k}) = A(ω_n^k)
- For k+n/2: y_{k+n/2} = A⁰(ω_n^{2k}) − ω_n^k · A¹(ω_n^{2k}) = A(ω_n^{k+n/2})
- Uses the fact that ω_n^{k+n/2} = −ω_n^k
The factors ω_n^k are called twiddle factors.
Running time: T(n) = 2T(n/2) + Θ(n) = Θ(n lg n)
Interpolation at Complex Roots of Unity (Inverse DFT)
The DFT as a matrix equation: y = V_n · a
Theorem 30.7: The (j, k) entry of V_n⁻¹ is ω_n^{−kj} / n.
Therefore the inverse DFT:
$$a_j = \frac{1}{n} \sum_{k=0}^{n-1} y_k \omega_n^{-kj}$$
To compute DFT⁻¹ in Θ(n lg n): modify the FFT to:
- Swap roles of a and y
- Replace ω_n with ω_n⁻¹
- Divide each result element by n
Theorem 30.8 (Convolution Theorem): For any two vectors a and b of length n (power of 2), padded with zeros to length 2n:
$$a \otimes b = \text{DFT}{2n}^{-1}(\text{DFT}{2n}(a) \cdot \text{DFT}_{2n}(b))$$
where · denotes componentwise multiplication.
30.3 Efficient FFT Implementations
The Butterfly Operation
The core operation of the FFT loop, optimized to avoid redundant computation:
for k = 0 to n/2 - 1
t = ω · y¹_k // twiddle factor times odd element
y_k = y⁰_k + t
y_{k+n/2} = y⁰_k - t
ω = ω · ω_n
This is the butterfly operation: multiply by twiddle factor, then add and subtract. Eliminates the common subexpression ω_n^k · y¹_k that was computed twice in the naive loop.
Bit-Reversal Permutation
To convert the recursive FFT to an iterative bottom-up algorithm, the input must be reordered according to a bit-reversal permutation.
- Place element a_k at position A[rev(k)], where rev(k) reverses the lg(n) bits of k
- Example for n = 8: input order becomes (a₀, a₄, a₂, a₆, a₁, a₅, a₃, a₇)
- Indices: 0,4,2,6,1,5,3,7
- Binary: 000,100,010,110,001,101,011,111
- Reversed: 000,001,010,011,100,101,110,111
Intuition: At each tree level, even-indexed elements go left, odd-indexed go right. Stripping the low-order bit at each level produces the bit-reversal order at the leaves.
BIT-REVERSE-COPY(a, A)
n = a.length
for k = 0 to n - 1
A[rev(k)] = a_k
Iterative FFT
ITERATIVE-FFT(a)
BIT-REVERSE-COPY(a, A)
n = a.length // n is a power of 2
for s = 1 to lg(n) // stage s: combine 2^(s-1)-element DFTs into 2^s-element DFTs
m = 2^s
ω_m = e^{2πi/m}
for k = 0 to n - 1 by m // each group of m elements
ω = 1
for j = 0 to m/2 - 1 // each butterfly in the group
t = ω · A[k + j + m/2]
u = A[k + j]
A[k + j] = u + t
A[k + j + m/2] = u - t
ω = ω · ω_m
return A
How it works (bottom-up):
- After bit-reversal, take elements in pairs → compute 2-element DFTs via butterfly
- Take pairs of 2-element DFTs → compute 4-element DFTs
- Continue doubling until the full n-element DFT is assembled
Running time: Θ(n lg n)
- BIT-REVERSE-COPY: O(n lg n), or Θ(n) with a precomputed lookup table
- Innermost loop body executes L(n) = Σ(s=1 to lg n) (n/2^s)·2^{s−1} = Σ(s=1 to lg n) n/2 = Θ(n lg n)
Parallel FFT Circuit
The iterative FFT naturally maps to a parallel circuit:
- Input: bit-reverse permuted values on n wires
- lg(n) stages, each with n/2 butterfly operations executed in parallel
- Circuit depth: Θ(lg n)
- Total work: Θ(n lg n) butterfly operations
For stage s (s = 1, 2, ..., lg n):
- n/2^s groups of butterflies
- 2^{s−1} butterflies per group
- Twiddle factors: ω_m⁰, ω_m¹, ..., ω_m^{m/2−1} where m = 2^s
Example for n = 8:
- Stage 1 (s=1): 4 butterflies with twiddle factor ω₂⁰
- Stage 2 (s=2): 4 butterflies with twiddle factors ω₄⁰, ω₄¹
- Stage 3 (s=3): 4 butterflies with twiddle factors ω₈⁰, ω₈¹, ω₈², ω₈³
Summary Table
| Operation | Coefficient Form | Point-Value Form |
|---|---|---|
| Evaluation | Θ(n) (Horner's) | — |
| Addition | Θ(n) | Θ(n) |
| Multiplication | Θ(n²) naive / Θ(n lg n) via FFT | Θ(n) |
| Interpolation | — | Θ(n²) Lagrange / Θ(n lg n) inverse FFT |
Key takeaway: The FFT exploits complex roots of unity and the halving lemma to convert between coefficient and point-value representations in Θ(n lg n), enabling fast polynomial multiplication via the convolution theorem.
Previous chapter
Linear ProgrammingNext chapter
Number Theoretic Algorithms