Pagefy

Pagefy

Back to Data Structures and Algorithms

Polynomials and the FFT

Introduction to Algorithms

Chapter 30: Polynomials and the FFT

The fast Fourier transform (FFT) reduces polynomial multiplication from Θ(n²) to Θ(n lg n) time. The key insight: multiply in point-value form (Θ(n)), and use the FFT to convert between coefficient and point-value representations in Θ(n lg n).

Common applications include signal processing, digital audio/video compression (e.g., MP3), and computing convolutions.


30.1 Representing Polynomials

A polynomial in variable x over a field F:

$$A(x) = \sum_{j=0}^{n-1} a_j x^j$$

  • Degree: highest nonzero coefficient index k, written degree(A) = k
  • Degree-bound: any integer strictly greater than the degree (a polynomial of degree-bound n has degree between 0 and n−1)

Polynomial Operations

  • Addition: C(x) = A(x) + B(x), where c_j = a_j + b_j — takes Θ(n) time
  • Multiplication: C(x) = A(x)·B(x), a degree-bound 2n−1 polynomial where:
    • c_j = Σ(k=0 to j) a_k · b_{j−k}
    • This is the convolution of vectors a and b, denoted c = a ⊗ b
    • Straightforward method takes Θ(n²) time

Coefficient Representation

A polynomial A(x) of degree-bound n is represented by the vector a = (a₀, a₁, ..., a_{n−1}).

  • Evaluation at a point x₀ using Horner's rule in Θ(n):
    • A(x₀) = a₀ + x₀(a₁ + x₀(a₂ + ··· + x₀(a_{n−2} + x₀·a_{n−1}) ···))
  • Addition: Θ(n)
  • Multiplication: Θ(n²)

Point-Value Representation

A polynomial A(x) of degree-bound n is represented by n point-value pairs:

{(x₀, y₀), (x₁, y₁), ..., (x_{n−1}, y_{n−1})}

where all x_k are distinct and y_k = A(x_k).

  • Evaluation (coefficient → point-value): evaluate at n points — Θ(n²) naively, Θ(n lg n) with FFT
  • Interpolation (point-value → coefficient): the inverse operation

Theorem 30.1 (Uniqueness of Interpolating Polynomial): For any set of n point-value pairs with distinct x_k values, there is a unique polynomial of degree-bound n passing through them.

  • Proof uses the Vandermonde matrix V(x₀, ..., x_{n−1}), which is invertible when all x_k are distinct
  • Interpolation via LU decomposition: O(n³)
  • Lagrange's formula (Θ(n²)):

$$A(x) = \sum_{k=0}^{n-1} y_k \prod_{j \neq k} \frac{x - x_j}{x_k - x_j}$$

Operations in Point-Value Form

OperationTime
AdditionΘ(n) — add corresponding y-values
MultiplicationΘ(n) — multiply corresponding y-values

Important caveat for multiplication: Since degree(C) = degree(A) + degree(B), we need 2n point-value pairs (an "extended" representation) to uniquely determine the product polynomial of degree-bound 2n.

Fast Multiplication of Polynomials in Coefficient Form

The strategy (illustrated in Figure 30.1 of CLRS):

Coefficient form                          Coefficient form
a₀,...,a_{n-1}  ──── Ordinary mult Θ(n²) ────►  c₀,...,c_{2n-2}
b₀,...,b_{n-1}
    │                                              ▲
    │ Evaluate (FFT)                               │ Interpolate (Inverse FFT)
    │ Θ(n lg n)                                    │ Θ(n lg n)
    ▼                                              │
Point-value form ── Pointwise mult Θ(n) ──► Point-value form

Procedure for Θ(n lg n) polynomial multiplication (n assumed a power of 2):

  1. Double degree-bound: pad A(x) and B(x) with n high-order zero coefficients → degree-bound 2n
  2. Evaluate: apply FFT of order 2n to get point-value representations at the (2n)th roots of unity
  3. Pointwise multiply: multiply corresponding values — Θ(n)
  4. Interpolate: apply inverse FFT to recover coefficient representation

Steps 1, 3: Θ(n). Steps 2, 4: Θ(n lg n). Total: Θ(n lg n).

Theorem 30.2: Two polynomials of degree-bound n can be multiplied in Θ(n lg n) time, with both input and output in coefficient form.


30.2 The DFT and FFT

Complex Roots of Unity

A complex nth root of unity is a complex number ω such that ωⁿ = 1.

  • There are exactly n such roots: e^{2πik/n} for k = 0, 1, ..., n−1
  • Using Euler's formula: e^{iu} = cos(u) + i·sin(u)
  • The n roots are equally spaced on the unit circle in the complex plane

The principal nth root of unity:

$$\omega_n = e^{2\pi i / n}$$

All other nth roots are powers of ω_n: ω_n⁰, ω_n¹, ..., ω_n^{n−1}.

These roots form a group under multiplication isomorphic to (Z_n, +).

Essential Properties

Lemma 30.3 (Cancellation Lemma): For any n ≥ 0, k ≥ 0, d > 0:

$$\omega_{dn}^{dk} = \omega_n^k$$

Corollary 30.4: For any even n > 0: ω_n^{n/2} = ω₂ = −1

Lemma 30.5 (Halving Lemma): If n > 0 is even, the squares of the n complex nth roots of unity are the n/2 complex (n/2)th roots of unity. Each (n/2)th root appears exactly twice:

  • (ω_n^k)² = ω_{n/2}^k
  • ω_n^{k+n/2} = −ω_n^k, so (ω_n^{k+n/2})² = (ω_n^k)²

This is essential for the divide-and-conquer approach — subproblems are half the size.

Lemma 30.6 (Summation Lemma): For any n ≥ 1 and nonzero integer k not divisible by n:

$$\sum_{j=0}^{n-1} (\omega_n^k)^j = 0$$

The DFT

Given polynomial A(x) = Σ a_j x^j of degree-bound n, evaluate at ω_n⁰, ω_n¹, ..., ω_n^{n−1}:

$$y_k = A(\omega_n^k) = \sum_{j=0}^{n-1} a_j \omega_n^{kj}$$

The vector y = (y₀, y₁, ..., y_{n−1}) is the Discrete Fourier Transform (DFT) of coefficient vector a. Written y = DFT_n(a).

As a matrix equation: y = V_n · a, where V_n is the Vandermonde matrix with (k, j) entry ω_n^{kj}.

The FFT Algorithm

The fast Fourier transform computes DFT_n(a) in Θ(n lg n) time using divide-and-conquer. Requires n to be a power of 2.

Key idea: Separate A(x) into even-indexed and odd-indexed coefficients:

  • A⁰(x) = a₀ + a₂x + a₄x² + ··· + a_{n−2}x^{n/2−1} (even-indexed)
  • A¹(x) = a₁ + a₃x + a₅x² + ··· + a_{n−1}x^{n/2−1} (odd-indexed)

Then: A(x) = A⁰(x²) + x·A¹(x²)

By the halving lemma, evaluating A at n nth roots of unity reduces to evaluating A⁰ and A¹ at n/2 (n/2)th roots of unity — two subproblems of half the size.

RECURSIVE-FFT(a)
  n = a.length                    // n is a power of 2
  if n == 1
      return a
  ω_n = e^{2πi/n}
  ω = 1
  a⁰ = (a₀, a₂, ..., a_{n-2})
  a¹ = (a₁, a₃, ..., a_{n-1})
  y⁰ = RECURSIVE-FFT(a⁰)
  y¹ = RECURSIVE-FFT(a¹)
  for k = 0 to n/2 - 1
      y_k       = y⁰_k + ω · y¹_k
      y_{k+n/2} = y⁰_k - ω · y¹_k
      ω = ω · ω_n
  return y

Lines 11–12 explanation:

  • For k = 0, ..., n/2−1: y_k = A⁰(ω_n^{2k}) + ω_n^k · A¹(ω_n^{2k}) = A(ω_n^k)
  • For k+n/2: y_{k+n/2} = A⁰(ω_n^{2k}) − ω_n^k · A¹(ω_n^{2k}) = A(ω_n^{k+n/2})
    • Uses the fact that ω_n^{k+n/2} = −ω_n^k

The factors ω_n^k are called twiddle factors.

Running time: T(n) = 2T(n/2) + Θ(n) = Θ(n lg n)

Interpolation at Complex Roots of Unity (Inverse DFT)

The DFT as a matrix equation: y = V_n · a

Theorem 30.7: The (j, k) entry of V_n⁻¹ is ω_n^{−kj} / n.

Therefore the inverse DFT:

$$a_j = \frac{1}{n} \sum_{k=0}^{n-1} y_k \omega_n^{-kj}$$

To compute DFT⁻¹ in Θ(n lg n): modify the FFT to:

  1. Swap roles of a and y
  2. Replace ω_n with ω_n⁻¹
  3. Divide each result element by n

Theorem 30.8 (Convolution Theorem): For any two vectors a and b of length n (power of 2), padded with zeros to length 2n:

$$a \otimes b = \text{DFT}{2n}^{-1}(\text{DFT}{2n}(a) \cdot \text{DFT}_{2n}(b))$$

where · denotes componentwise multiplication.


30.3 Efficient FFT Implementations

The Butterfly Operation

The core operation of the FFT loop, optimized to avoid redundant computation:

for k = 0 to n/2 - 1
    t = ω · y¹_k              // twiddle factor times odd element
    y_k       = y⁰_k + t
    y_{k+n/2} = y⁰_k - t
    ω = ω · ω_n

This is the butterfly operation: multiply by twiddle factor, then add and subtract. Eliminates the common subexpression ω_n^k · y¹_k that was computed twice in the naive loop.

Bit-Reversal Permutation

To convert the recursive FFT to an iterative bottom-up algorithm, the input must be reordered according to a bit-reversal permutation.

  • Place element a_k at position A[rev(k)], where rev(k) reverses the lg(n) bits of k
  • Example for n = 8: input order becomes (a₀, a₄, a₂, a₆, a₁, a₅, a₃, a₇)
    • Indices: 0,4,2,6,1,5,3,7
    • Binary: 000,100,010,110,001,101,011,111
    • Reversed: 000,001,010,011,100,101,110,111

Intuition: At each tree level, even-indexed elements go left, odd-indexed go right. Stripping the low-order bit at each level produces the bit-reversal order at the leaves.

BIT-REVERSE-COPY(a, A)
  n = a.length
  for k = 0 to n - 1
      A[rev(k)] = a_k

Iterative FFT

ITERATIVE-FFT(a)
  BIT-REVERSE-COPY(a, A)
  n = a.length                        // n is a power of 2
  for s = 1 to lg(n)                  // stage s: combine 2^(s-1)-element DFTs into 2^s-element DFTs
      m = 2^s
      ω_m = e^{2πi/m}
      for k = 0 to n - 1 by m         // each group of m elements
          ω = 1
          for j = 0 to m/2 - 1        // each butterfly in the group
              t = ω · A[k + j + m/2]
              u = A[k + j]
              A[k + j]       = u + t
              A[k + j + m/2] = u - t
              ω = ω · ω_m
  return A

How it works (bottom-up):

  1. After bit-reversal, take elements in pairs → compute 2-element DFTs via butterfly
  2. Take pairs of 2-element DFTs → compute 4-element DFTs
  3. Continue doubling until the full n-element DFT is assembled

Running time: Θ(n lg n)

  • BIT-REVERSE-COPY: O(n lg n), or Θ(n) with a precomputed lookup table
  • Innermost loop body executes L(n) = Σ(s=1 to lg n) (n/2^s)·2^{s−1} = Σ(s=1 to lg n) n/2 = Θ(n lg n)

Parallel FFT Circuit

The iterative FFT naturally maps to a parallel circuit:

  • Input: bit-reverse permuted values on n wires
  • lg(n) stages, each with n/2 butterfly operations executed in parallel
  • Circuit depth: Θ(lg n)
  • Total work: Θ(n lg n) butterfly operations

For stage s (s = 1, 2, ..., lg n):

  • n/2^s groups of butterflies
  • 2^{s−1} butterflies per group
  • Twiddle factors: ω_m⁰, ω_m¹, ..., ω_m^{m/2−1} where m = 2^s

Example for n = 8:

  • Stage 1 (s=1): 4 butterflies with twiddle factor ω₂⁰
  • Stage 2 (s=2): 4 butterflies with twiddle factors ω₄⁰, ω₄¹
  • Stage 3 (s=3): 4 butterflies with twiddle factors ω₈⁰, ω₈¹, ω₈², ω₈³

Summary Table

OperationCoefficient FormPoint-Value Form
EvaluationΘ(n) (Horner's)
AdditionΘ(n)Θ(n)
MultiplicationΘ(n²) naive / Θ(n lg n) via FFTΘ(n)
InterpolationΘ(n²) Lagrange / Θ(n lg n) inverse FFT

Key takeaway: The FFT exploits complex roots of unity and the halving lemma to convert between coefficient and point-value representations in Θ(n lg n), enabling fast polynomial multiplication via the convolution theorem.