Number Theoretic Algorithms
Introduction to AlgorithmsChapter 31: Number-Theoretic Algorithms
Number theory was once viewed as a beautiful but largely useless subject in pure mathematics. Today number-theoretic algorithms are used widely, due in large part to the invention of cryptographic schemes based on large prime numbers. These schemes are feasible because we can find large primes easily, and they are secure because we do not know how to factor the product of large primes efficiently.
Key idea: In this chapter, a "large input" means an input containing "large integers" rather than "many integers." We measure input size in terms of the number of bits required to represent the input. Multiplying two β-bit integers by the ordinary method uses Θ(β²) bit operations.
31.1 Elementary Number-Theoretic Notions
Divisibility and Divisors
- The notation d | a (read "d divides a") means that a = kd for some integer k
- Every integer divides 0
- If a > 0 and d | a, then |d| ≤ |a|
- If d | a, then a is a multiple of d
- If d does not divide a, we write d ∤ a
- A divisor of a nonzero integer a is at least 1 but not greater than |a|
- The trivial divisors of a positive integer a are 1 and a; the nontrivial divisors are the factors of a
Prime and Composite Numbers
- An integer a > 1 whose only divisors are 1 and a is a prime number
- First 20 primes: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71
- An integer a > 1 that is not prime is a composite number
- The integer 1 is a unit — neither prime nor composite
Division Theorem, Remainders, and Modular Equivalence
Theorem 31.1 (Division Theorem): For any integer a and any positive integer n, there exist unique integers q and r such that 0 ≤ r < n and a = qn + r.
- q = ⌊a/n⌋ is the quotient
- r = a mod n is the remainder (or residue)
- n | a if and only if a mod n = 0
Equivalence class modulo n containing integer a:
[a]_n = {a + kn : k ∈ ℤ}
The set of all equivalence classes: Z_n = {[a]_n : 0 ≤ a ≤ n − 1}, commonly written as Z_n = {0, 1, ..., n − 1}.
Common Divisors and Greatest Common Divisors
Key properties of common divisors:
- d | a and d | b implies d | (a + b) and d | (a − b)
- d | a and d | b implies d | (ax + by) for any integers x, y
- a | b and b | a implies a = ±b
The greatest common divisor gcd(a, b) is the largest common divisor of a and b (not both zero).
Properties:
- gcd(a, b) = gcd(b, a)
- gcd(a, b) = gcd(−a, b)
- gcd(a, b) = gcd(|a|, |b|)
- gcd(a, 0) = |a|
- gcd(a, ka) = |a| for any k ∈ ℤ
Theorem 31.2: gcd(a, b) is the smallest positive element of the set {ax + by : x, y ∈ ℤ} of linear combinations of a and b.
Corollary 31.3: If d | a and d | b, then d | gcd(a, b).
Corollary 31.4: For all integers a, b and any nonnegative integer n: gcd(an, bn) = n · gcd(a, b).
Corollary 31.5: For all positive integers n, a, b: if n | ab and gcd(a, n) = 1, then n | b.
Relatively Prime Integers
- Two integers a and b are relatively prime if gcd(a, b) = 1
- Theorem 31.6: If gcd(a, p) = 1 and gcd(b, p) = 1, then gcd(ab, p) = 1
- Integers n₁, n₂, ..., nₖ are pairwise relatively prime if gcd(nᵢ, nⱼ) = 1 whenever i ≠ j
Unique Factorization
Theorem 31.7: For all primes p and all integers a, b: if p | ab, then p | a or p | b (or both).
Theorem 31.8 (Unique Factorization): There is exactly one way to write any composite integer a as a product of the form a = p₁^e₁ · p₂^e₂ · ... · pᵣ^eᵣ, where the pᵢ are prime, p₁ < p₂ < ... < pᵣ, and the eᵢ are positive integers.
31.2 Greatest Common Divisor
GCD Recursion Theorem
Theorem 31.9 (GCD Recursion Theorem): For any nonnegative integer a and any positive integer b:
gcd(a, b) = gcd(b, a mod b)
Euclid's Algorithm
EUCLID(a, b)
1 if b == 0
2 return a
3 else return EUCLID(b, a mod b)
Example: gcd(30, 21):
- EUCLID(30, 21) → EUCLID(21, 9) → EUCLID(9, 3) → EUCLID(3, 0) → 3
Running Time of Euclid's Algorithm
Lemma 31.10: If a > b ≥ 1 and EUCLID(a, b) performs k ≥ 1 recursive calls, then a ≥ F_{k+2} and b ≥ F_{k+1} (where Fₖ is the kth Fibonacci number).
Theorem 31.11 (Lamé's Theorem): For any integer k ≥ 1, if a > b ≥ 1 and b < F_{k+1}, then EUCLID(a, b) makes fewer than k recursive calls.
- The number of recursive calls is O(lg b)
- For two β-bit numbers: O(β) arithmetic operations and O(β³) bit operations
Extended Euclid's Algorithm
Computes integers x and y such that d = gcd(a, b) = ax + by.
EXTENDED-EUCLID(a, b)
1 if b == 0
2 return (a, 1, 0)
3 else (d', x', y') = EXTENDED-EUCLID(b, a mod b)
4 (d, x, y) = (d', y', x' − ⌊a/b⌋ · y')
5 return (d, x, y)
Example: EXTENDED-EUCLID(99, 78) returns (3, −11, 14), since gcd(99, 78) = 3 = 99·(−11) + 78·14.
| a | b | ⌊a/b⌋ | d | x | y |
|---|---|---|---|---|---|
| 99 | 78 | 1 | 3 | −11 | 14 |
| 78 | 21 | 3 | 3 | 3 | −11 |
| 21 | 15 | 1 | 3 | −2 | 3 |
| 15 | 6 | 2 | 3 | 1 | −2 |
| 6 | 3 | 2 | 3 | 0 | 1 |
| 3 | 0 | — | 3 | 1 | 0 |
Same running time as EUCLID: O(lg b) recursive calls.
31.3 Modular Arithmetic
Finite Groups
A group (S, ⊕) is a set S with a binary operation ⊕ satisfying:
- Closure: For all a, b ∈ S, we have a ⊕ b ∈ S
- Identity: There exists an element e ∈ S such that e ⊕ a = a ⊕ e = a for all a ∈ S
- Associativity: For all a, b, c ∈ S: (a ⊕ b) ⊕ c = a ⊕ (b ⊕ c)
- Inverses: For each a ∈ S, there exists a unique b ∈ S such that a ⊕ b = b ⊕ a = e
- If a ⊕ b = b ⊕ a for all a, b ∈ S, the group is abelian
- If |S| < ∞, it is a finite group
Additive Group Modulo n
The additive group modulo n is (Z_n, +_n) with:
- [a]_n +_n [b]_n = [a + b]_n
- Identity: 0
- Inverse of a: −a (i.e., n − a)
- Size: |Z_n| = n
Theorem 31.12: (Z_n, +_n) is a finite abelian group.
Multiplicative Group Modulo n
The multiplicative group modulo n is (Z*_n, ·_n) where:
Z*_n = {[a]_n ∈ Z_n : gcd(a, n) = 1}
- Identity: 1
- Size: |Z*_n| = φ(n) (Euler's phi function)
Theorem 31.13: (Z*_n, ·_n) is a finite abelian group.
Computing multiplicative inverses: Use EXTENDED-EUCLID(a, n). If gcd(a, n) = 1, then ax + ny = 1, so ax ≡ 1 (mod n), meaning x is the multiplicative inverse of a modulo n.
Euler's Phi Function
φ(n) = n · ∏_{p prime, p|n} (1 − 1/p)
Special cases:
- If p is prime: φ(p) = p − 1
- If n is composite: φ(n) < n − 1
- Lower bound for n > 5: φ(n) > n / (6 ln ln n)
Subgroups
- (S', ⊕) is a subgroup of (S, ⊕) if S' ⊆ S and (S', ⊕) is also a group
Theorem 31.14: If (S, ⊕) is a finite group and S' is any nonempty subset of S closed under ⊕, then (S', ⊕) is a subgroup of (S, ⊕).
Theorem 31.15 (Lagrange's Theorem): If (S', ⊕) is a subgroup of finite group (S, ⊕), then |S'| divides |S|.
Corollary 31.16: If S' is a proper subgroup of finite group S, then |S'| ≤ |S|/2.
Subgroups Generated by an Element
The subgroup generated by a, denoted ⟨a⟩:
⟨a⟩ = {a^(k) : k ≥ 1}
- In Z_n: a^(k) = ka mod n
- In Z*_n: a^(k) = aᵏ mod n
The order of a, denoted ord(a), is the smallest positive integer t such that a^(t) = e.
Theorem 31.17: ord(a) = |⟨a⟩|
Corollary 31.18: The sequence a^(1), a^(2), ... is periodic with period t = ord(a).
Corollary 31.19: For all a ∈ S in a finite group with identity e: a^(|S|) = e.
31.4 Solving Modular Linear Equations
We want to find all x satisfying: ax ≡ b (mod n), given a > 0, n > 0, and b.
Key Theorems
Theorem 31.20: For any positive integers a and n, if d = gcd(a, n), then ⟨a⟩ = ⟨d⟩ = {0, d, 2d, ..., ((n/d) − 1)d} in Z_n, and |⟨a⟩| = n/d.
Corollary 31.21: ax ≡ b (mod n) is solvable if and only if d | b, where d = gcd(a, n).
Corollary 31.22: The equation either has d distinct solutions modulo n, or no solutions.
Theorem 31.23: If d | b, then one solution is: x₀ = x'(b/d) mod n, where d = ax' + ny' (from EXTENDED-EUCLID).
Theorem 31.24: All d solutions are: xᵢ = x₀ + i(n/d) for i = 0, 1, ..., d − 1.
Algorithm
MODULAR-LINEAR-EQUATION-SOLVER(a, b, n)
1 (d, x', y') = EXTENDED-EUCLID(a, n)
2 if d | b
3 x₀ = x'(b/d) mod n
4 for i = 0 to d − 1
5 print (x₀ + i(n/d)) mod n
6 else print "no solutions"
Example: 14x ≡ 30 (mod 100)
- EXTENDED-EUCLID(14, 100) → (d, x', y') = (2, −7, 1)
- Since 2 | 30: x₀ = (−7)(15) mod 100 = 95
- Solutions: 95 and 45
Running time: O(lg n + gcd(a, n)) arithmetic operations.
Important Corollaries
Corollary 31.25: If gcd(a, n) = 1, then ax ≡ b (mod n) has a unique solution modulo n.
Corollary 31.26: If gcd(a, n) = 1, then ax ≡ 1 (mod n) has a unique solution (the multiplicative inverse a⁻¹ mod n). Otherwise, no solution exists.
Computing a⁻¹ mod n: Use EXTENDED-EUCLID(a, n) — the returned x is the inverse.
31.5 The Chinese Remainder Theorem
Statement
Theorem 31.27 (Chinese Remainder Theorem): Let n = n₁n₂···nₖ, where the nᵢ are pairwise relatively prime. Then the mapping:
a ↔ (a₁, a₂, ..., aₖ)
where aᵢ = a mod nᵢ, is a bijection between Z_n and Z_{n₁} × Z_{n₂} × ... × Z_{nₖ}.
Operations are performed componentwise:
- (a + b) mod n ↔ ((a₁ + b₁) mod n₁, ..., (aₖ + bₖ) mod nₖ)
- (a − b) mod n ↔ ((a₁ − b₁) mod n₁, ..., (aₖ − bₖ) mod nₖ)
- (ab) mod n ↔ (a₁b₁ mod n₁, ..., aₖbₖ mod nₖ)
Construction (Computing a from (a₁, ..., aₖ))
- Define mᵢ = n/nᵢ (product of all nⱼ except nᵢ)
- Define cᵢ = mᵢ · (mᵢ⁻¹ mod nᵢ)
- Compute: a ≡ (a₁c₁ + a₂c₂ + ... + aₖcₖ) (mod n)
The cᵢ form a "basis": cᵢ ↔ (0, 0, ..., 0, 1, 0, ..., 0) with 1 in position i.
Example
Solve: a ≡ 2 (mod 5) and a ≡ 3 (mod 13), so n = 65.
- m₁ = 13, m₂ = 5
- 13⁻¹ ≡ 2 (mod 5), so c₁ = 13 · 2 = 26
- 5⁻¹ ≡ 8 (mod 13), so c₂ = 5 · 8 = 40
- a ≡ 2·26 + 3·40 ≡ 52 + 120 ≡ 42 (mod 65)
Key Corollaries
Corollary 31.28: If n₁, ..., nₖ are pairwise relatively prime and n = n₁···nₖ, then the system x ≡ aᵢ (mod nᵢ) for i = 1, ..., k has a unique solution modulo n.
Corollary 31.29: x ≡ a (mod nᵢ) for all i = 1, ..., k if and only if x ≡ a (mod n).
31.6 Powers of an Element
Consider the sequence of powers of a modulo n: a⁰, a¹, a², a³, ... (mod n).
Euler's and Fermat's Theorems
Theorem 31.30 (Euler's Theorem): For any integer n > 1:
a^φ(n) ≡ 1 (mod n) for all a ∈ Z*_n
Theorem 31.31 (Fermat's Theorem): If p is prime:
a^(p−1) ≡ 1 (mod p) for all a ∈ Z*_p
Also: aᵖ ≡ a (mod p) for all a ∈ Z_p.
Primitive Roots and Discrete Logarithms
- If ord_n(g) = |Z_n|, then g is a primitive root (or generator) of Z_n
- If Z*_n has a primitive root, the group is cyclic
Theorem 31.32: Z*_n is cyclic for n = 2, 4, pᵉ, and 2pᵉ (for odd primes p, positive integers e).
- If g is a primitive root and a ∈ Z*_n, then gᶻ ≡ a (mod n) for some z
- z is the discrete logarithm (or index) of a: ind_{n,g}(a)
Theorem 31.33 (Discrete Logarithm Theorem): If g is a primitive root of Z*_n, then gˣ ≡ gʸ (mod n) if and only if x ≡ y (mod φ(n)).
Square Roots of 1
Theorem 31.34: If p is an odd prime and e ≥ 1, then x² ≡ 1 (mod pᵉ) has only two solutions: x = 1 and x = −1.
A nontrivial square root of 1 modulo n satisfies x² ≡ 1 (mod n) but x ≢ ±1 (mod n).
Corollary 31.35: If there exists a nontrivial square root of 1 modulo n, then n is composite.
Repeated Squaring (Modular Exponentiation)
Efficiently computes aᵇ mod n using the binary representation of b = ⟨bₖ, bₖ₋₁, ..., b₁, b₀⟩.
MODULAR-EXPONENTIATION(a, b, n)
1 c = 0
2 d = 1
3 let ⟨bₖ, bₖ₋₁, ..., b₀⟩ be the binary representation of b
4 for i = k downto 0
5 c = 2c
6 d = (d · d) mod n
7 if bᵢ == 1
8 c = c + 1
9 d = (d · a) mod n
10 return d
Loop invariant: Before each iteration, c equals the prefix ⟨bₖ, ..., b_{i+1}⟩ of b, and d = aᶜ mod n.
Running time: O(β) arithmetic operations, O(β³) bit operations for β-bit inputs.
31.7 The RSA Public-Key Cryptosystem
Public-Key Cryptosystem Concepts
Each participant has a public key P and a secret key S, which define inverse functions:
- M = S(P(M)) — decrypt after encrypt
- M = P(S(M)) — verify after sign
Encryption: Bob sends message M to Alice:
- Bob obtains Alice's public key P_A
- Bob computes ciphertext C = P_A(M) and sends C
- Alice decrypts: M = S_A(C) = S_A(P_A(M))
Digital Signatures: Alice signs message M':
- Alice computes signature σ = S_A(M')
- Alice sends (M', σ) to Bob
- Bob verifies: checks if M' = P_A(σ)
RSA Key Generation
- Select two large random primes p and q (p ≠ q), e.g., 1024 bits each
- Compute n = pq
- Select small odd integer e relatively prime to φ(n) = (p−1)(q−1)
- Compute d = e⁻¹ mod φ(n) (using EXTENDED-EUCLID)
- Public key: P = (e, n)
- Secret key: S = (d, n)
RSA Operations
- Encrypt/Apply public key: P(M) = Mᵉ mod n
- Decrypt/Apply secret key: S(C) = Cᵈ mod n
Both use MODULAR-EXPONENTIATION.
Running time:
- Public key operation: O(1) modular multiplications, O(β²) bit operations
- Secret key operation: O(β) modular multiplications, O(β³) bit operations
Correctness
Theorem 31.36 (Correctness of RSA): P(S(M)) = S(P(M)) = M^(ed) mod n = M for all M ∈ Z_n.
Proof sketch: Since ed ≡ 1 (mod φ(n)), we have ed = 1 + k(p−1)(q−1). By Fermat's theorem, M^(ed) ≡ M (mod p) and M^(ed) ≡ M (mod q). By the Chinese remainder theorem, M^(ed) ≡ M (mod n).
Security
- Security rests on the difficulty of factoring large integers
- If an adversary can factor n, they can derive the secret key
- No method easier than factoring n has been found to break RSA
- Recommended key sizes: 768 to 2048+ bits
Practical Considerations
- Hybrid mode: Use RSA to encrypt a short symmetric key K, then use K with a fast symmetric cipher for the actual message
- Efficient signatures: Use a collision-resistant hash function h; sign h(M) instead of M directly
- Certificates: A trusted authority T signs "Alice's public key is P_A" — anyone knowing P_T can verify
31.8 Primality Testing
Density of Primes
Theorem 31.37 (Prime Number Theorem): π(n) ~ n / ln n, where π(n) is the number of primes ≤ n.
- To find a 1024-bit prime: expect to test approximately ln(2¹⁰²⁴) ≈ 710 random candidates
- Trial division up to √n takes Θ(√n) = Θ(2^(β/2)) time — exponential in input length
Pseudoprimality Testing
n is a base-a pseudoprime if n is composite and a^(n−1) ≡ 1 (mod n).
PSEUDOPRIME(n)
1 if MODULAR-EXPONENTIATION(2, n − 1, n) ≢ 1 (mod n)
2 return COMPOSITE // definitely
3 else return PRIME // we hope!
- Only errs by declaring a composite as prime (one-sided error)
- Only 22 values < 10,000 fool this test (341, 561, 645, 1105, ...)
- Carmichael numbers satisfy a^(n−1) ≡ 1 (mod n) for all a ∈ Z*_n (e.g., 561, 1105, 1729) — they fool all base tests
The Miller-Rabin Primality Test
Two improvements over pseudoprimality testing:
- Try several randomly chosen base values a
- Check for nontrivial square roots of 1 during squaring steps
Write n − 1 = 2ᵗu where u is odd. Compute aᵘ mod n, then square t times.
WITNESS(a, n)
1 let t and u be such that t ≥ 1, u is odd, and n − 1 = 2ᵗu
2 x₀ = MODULAR-EXPONENTIATION(a, u, n)
3 for i = 1 to t
4 xᵢ = x²_{i−1} mod n
5 if xᵢ == 1 and x_{i−1} ≠ 1 and x_{i−1} ≠ n − 1
6 return TRUE // nontrivial square root found
7 if xₜ ≠ 1
8 return TRUE // Fermat test failed
9 return FALSE
MILLER-RABIN(n, s)
1 for j = 1 to s
2 a = RANDOM(1, n − 1)
3 if WITNESS(a, n)
4 return COMPOSITE // definitely
5 return PRIME // almost surely
How WITNESS Works (Four Cases for Sequence X = ⟨x₀, x₁, ..., xₜ⟩)
- X = ⟨..., d⟩ where d ≠ 1: Return TRUE (Fermat test fails)
- X = ⟨1, 1, ..., 1⟩: Return FALSE (not a witness)
- X = ⟨..., −1, 1, ..., 1⟩: Return FALSE (not a witness)
- X = ⟨..., d, 1, ..., 1⟩ where d ≠ ±1: Return TRUE (nontrivial square root found)
Error Analysis
Theorem 31.38: If n is an odd composite, the number of witnesses to compositeness is at least (n − 1)/2.
Theorem 31.39: The probability that MILLER-RABIN(n, s) errs is at most 2⁻ˢ.
- If n is prime: always returns PRIME
- If n is composite: chance of returning PRIME is ≤ 2⁻ˢ
- For practical use: s = 50 suffices for virtually any application
- Running time: O(sβ) arithmetic operations, O(sβ³) bit operations
31.9 Integer Factorization
Factoring a large integer n into its prime factors appears to be much harder than primality testing. Even with the best algorithms, we cannot feasibly factor an arbitrary 1024-bit number.
Pollard's Rho Heuristic
Uses only constant memory. Factors numbers up to R⁴ with the same work that trial division needs for R².
POLLARD-RHO(n)
1 i = 1
2 x₁ = RANDOM(0, n − 1)
3 y = x₁
4 k = 2
5 while TRUE
6 i = i + 1
7 xᵢ = (x²_{i−1} − 1) mod n
8 d = gcd(y − xᵢ, n)
9 if d ≠ 1 and d ≠ n
10 print d
11 if i == k
12 y = xᵢ
13 k = 2k
How It Works
- Generates a pseudorandom sequence: xᵢ = (x²_{i−1} − 1) mod n
- Periodically saves values of xᵢ (at powers of 2: x₁, x₂, x₄, x₈, ...)
- Checks gcd(y − xᵢ, n) for nontrivial factors
- The sequence eventually cycles (forming a ρ shape — hence the name)
Analysis
- Let p be a nontrivial factor of n with gcd(p, n/p) = 1
- The sequence modulo p also follows the same recurrence: x'ᵢ = (x'²_{i−1} − 1) mod p
- By the birthday paradox, expect Θ(√p) steps before the sequence mod p repeats
- A factor is found when two values are equal mod p (but not necessarily mod n)
Expected performance:
- Finds factor p after Θ(√p) iterations
- To completely factor a β-bit number: approximately n^(1/4) = 2^(β/4) arithmetic operations
- Bit operations: 2^(β/4) · β²
Best Known Factoring Algorithms
- General number field sieve: Running time roughly L(1/3, n)^(1.902+o(1)), where L(α, n) = e^((ln n)^α (ln ln n)^(1−α))
- Elliptic curve method (Lenstra): Effective for finding small prime factors; time to find p estimated as L(1/2, p)^(√2+o(1))
- No polynomial-time factoring algorithm is known — this is what makes RSA secure
Previous chapter
Polynomials and the FFTNext chapter
String Matching