Matrix Operations
Introduction to Algorithms, Chapter 28: Matrix Operations
This chapter covers efficient algorithms for matrix operations central to scientific computing: solving systems of linear equations via LUP decomposition, inverting matrices, and working with symmetric positive-definite matrices for least-squares approximation.
Numerical stability is a practical concern throughout — floating-point round-off errors can amplify and produce incorrect results. LUP decomposition is preferred over direct inversion because it is more numerically stable.
28.1 Solving Systems of Linear Equations
Given n equations in n unknowns:
a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = b₂
⋮
aₙ₁x₁ + aₙ₂x₂ + ⋯ + aₙₙxₙ = bₙ
This can be written as the matrix-vector equation:
Ax = b
- If A is nonsingular (invertible), the unique solution is x = A⁻¹b.
- If rank(A) < n → underdetermined system (typically infinitely many solutions, or none if inconsistent).
- If there are more equations than unknowns → overdetermined system (may have no exact solution; see §28.3 for least-squares).
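As a quick sanity check on the nonsingular case, here is a minimal sketch (my own, not from the text) in NumPy. Note that `np.linalg.solve` computes x by factorization rather than by forming A⁻¹ explicitly, in line with the stability remark above:

```python
import numpy as np

# A small nonsingular system Ax = b:
#   2x1 +  x2 =  5
#    x1 + 3x2 = 10
# has the unique solution x = (1, 3).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)   # factorization-based solve, not A^-1 @ b
# x == [1.0, 3.0]
```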
Overview of LUP Decomposition
The goal is to find three n × n matrices L, U, P such that:
PA = LU
where:
- L is a unit lower-triangular matrix (1s on diagonal, 0s above)
- U is an upper-triangular matrix
- P is a permutation matrix
Why LUP? Solving triangular systems is easy. Once we have PA = LU, solving Ax = b reduces to:
- Multiply both sides by P: PAx = Pb, giving LUx = Pb
- Let y = Ux
- Solve Ly = Pb via forward substitution → Θ(n²)
- Solve Ux = y via back substitution → Θ(n²)
Correctness: Since A = P⁻¹LU, we verify Ax = P⁻¹LUx = P⁻¹Ly = P⁻¹Pb = b. ✓
Forward and Back Substitution
Forward substitution solves the lower-triangular system Ly = Pb:
y₁ = b[π(1)]
yᵢ = b[π(i)] − Σⱼ₌₁ⁱ⁻¹ lᵢⱼyⱼ for i = 2, 3, …, n
Back substitution solves the upper-triangular system Ux = y:
xₙ = yₙ / uₙₙ
xᵢ = (yᵢ − Σⱼ₌ᵢ₊₁ⁿ uᵢⱼxⱼ) / uᵢᵢ for i = n−1, n−2, …, 1
Both run in Θ(n²) time.
LUP-SOLVE(L, U, π, b)
n = L.rows
let x and y be new vectors of length n
for i = 1 to n // forward substitution
yᵢ = b[π(i)] − Σⱼ₌₁ⁱ⁻¹ lᵢⱼyⱼ
for i = n downto 1 // back substitution
xᵢ = (yᵢ − Σⱼ₌ᵢ₊₁ⁿ uᵢⱼxⱼ) / uᵢᵢ
return x
Running time: Θ(n²) (each loop has an implicit inner summation).
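LUP-SOLVE translates almost line-for-line into Python. The sketch below (my own, using 0-based indexing) assumes L is unit lower-triangular, U is upper-triangular, and pi encodes the permutation as in the pseudocode:

```python
import numpy as np

def lup_solve(L, U, pi, b):
    """Solve Ax = b given PA = LU (0-indexed LUP-SOLVE).
    pi[i] gives the row of b selected by row i of P."""
    n = len(b)
    y = np.zeros(n)
    x = np.zeros(n)
    for i in range(n):                        # forward substitution: Ly = Pb
        y[i] = b[pi[i]] - L[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):            # back substitution: Ux = y
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Example with P = I: A = [[2, 1], [1, 3]] factors as L @ U below.
L = np.array([[1.0, 0.0], [0.5, 1.0]])
U = np.array([[2.0, 1.0], [0.0, 2.5]])
x = lup_solve(L, U, [0, 1], np.array([5.0, 10.0]))   # x == [1, 3]
```

Each loop body does O(i) work via the slice product, matching the Θ(n²) total.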
Computing an LU Decomposition
LU decomposition is the special case where P = Iₙ (no pivoting), so A = LU.
The method uses Gaussian elimination:
- Subtract multiples of the first equation from subsequent equations to eliminate the first variable
- Repeat for each subsequent variable
- The result is an upper-triangular matrix U; the multipliers form L
Recursive formulation: Partition A as:
A = | a₁₁ wᵀ |
| v A' |
where v is an (n−1)-vector (first column below a₁₁), wᵀ is a row (n−1)-vector, and A' is (n−1)×(n−1). Then:
A = | 1 0 | × | a₁₁ wᵀ |
| v/a₁₁ Iₙ₋₁ | | 0 A' − vwᵀ/a₁₁ |
The matrix A' − vwᵀ/a₁₁ is the Schur complement of A with respect to a₁₁. If A is nonsingular, the Schur complement is also nonsingular, so we can recurse.
The elements we divide by are called pivots (diagonal elements of U). If any pivot is 0, LU decomposition fails — this is why we need LUP decomposition with permutations.
LU-DECOMPOSITION(A)
n = A.rows
let L and U be new n × n matrices
initialize U with 0s below the diagonal
initialize L with 1s on the diagonal and 0s above the diagonal
for k = 1 to n
uₖₖ = aₖₖ
for i = k + 1 to n
lᵢₖ = aᵢₖ / aₖₖ // multiplier (v/a₁₁)
uₖᵢ = aₖᵢ // row of U (wᵀ)
for i = k + 1 to n
for j = k + 1 to n
aᵢⱼ = aᵢⱼ − lᵢₖ · uₖⱼ // Schur complement update
return L and U
Running time: Θ(n³) due to the triply nested loop.
In-place optimization: L and U can be stored directly in A, since lᵢⱼ occupies positions where i > j and uᵢⱼ occupies positions where i ≤ j.
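The procedure, including the in-place storage trick, can be sketched in NumPy as follows (my own vectorized version; the two inner loops become one outer-product update):

```python
import numpy as np

def lu_decomposition(A):
    """LU decomposition without pivoting, stored in place: multipliers
    (entries of L) overwrite A below the diagonal, U overwrites the rest."""
    A = A.astype(float).copy()
    n = A.shape[0]
    for k in range(n):
        if A[k, k] == 0.0:
            raise ZeroDivisionError("zero pivot; LUP decomposition needed")
        A[k + 1:, k] /= A[k, k]                                    # column of L
        A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])  # Schur update
    L = np.tril(A, -1) + np.eye(n)   # unpack: strict lower part + unit diagonal
    U = np.triu(A)                   # upper-triangular part
    return L, U

A0 = np.array([[4.0, 3.0], [6.0, 3.0]])
L, U = lu_decomposition(A0)          # L @ U == A0
```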
Computing an LUP Decomposition
To avoid dividing by 0 (or by small values causing numerical instability), we pivot: at each step, select the element in the current column with the greatest absolute value and swap rows.
Key differences from LU decomposition:
- Before processing column k, find the row k' (with k' ≥ k) that has the largest |aₖ'ₖ|
- Swap rows k and k' (tracked via permutation array π)
- Both the multiplier column v/aₖₖ and the Schur complement have the row exchanges applied to them as well
LUP-DECOMPOSITION(A)
n = A.rows
let π[1..n] be a new array
for i = 1 to n
π[i] = i
for k = 1 to n
p = 0
for i = k to n
if |aᵢₖ| > p
p = |aᵢₖ|
k' = i
if p == 0
error "singular matrix"
exchange π[k] with π[k']
for i = 1 to n
exchange aₖᵢ with aₖ'ᵢ
for i = k + 1 to n
aᵢₖ = aᵢₖ / aₖₖ
for j = k + 1 to n
aᵢⱼ = aᵢⱼ − aᵢₖ · aₖⱼ
Running time: Θ(n³) — same asymptotic cost as LU decomposition. Pivoting adds at most a constant factor.
Output encoding: When the procedure terminates, the matrix A stores both L and U in place:
aᵢⱼ = lᵢⱼ if i > j
aᵢⱼ = uᵢⱼ if i ≤ j
The permutation is stored in array π, where π[i] = j means row i of P has a 1 in column j.
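A 0-indexed sketch of the procedure (my own), with the unpacking of L and U shown at the end:

```python
import numpy as np

def lup_decomposition(A):
    """LUP decomposition, in place: returns A overwritten with L and U,
    plus the permutation array pi, so that rows of A in order pi equal L @ U."""
    A = A.astype(float).copy()
    n = A.shape[0]
    pi = list(range(n))
    for k in range(n):
        kp = k + int(np.argmax(np.abs(A[k:, k])))   # largest |pivot| in column k
        if A[kp, k] == 0.0:
            raise ValueError("singular matrix")
        pi[k], pi[kp] = pi[kp], pi[k]
        A[[k, kp], :] = A[[kp, k], :]               # exchange full rows
        A[k + 1:, k] /= A[k, k]                     # multipliers
        A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])  # Schur update
    return A, pi

A0 = np.array([[2.0, 0.0], [4.0, 1.0]])
packed, pi = lup_decomposition(A0)
L = np.tril(packed, -1) + np.eye(2)   # l_ij where i > j, unit diagonal
U = np.triu(packed)                   # u_ij where i <= j
# PA = LU: A0's rows in order pi equal L @ U
```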
28.2 Inverting Matrices
Computing a Matrix Inverse from an LUP Decomposition
To compute A⁻¹, treat the equation AX = Iₙ as n separate systems:
AXᵢ = eᵢ for i = 1, 2, …, n
where Xᵢ is the i-th column of X = A⁻¹ and eᵢ is the i-th unit vector.
Procedure:
- Compute the LUP decomposition of A once → Θ(n³)
- For each of the n columns, solve using LUP-SOLVE → Θ(n²) each
- Total: Θ(n³)
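The factor-once, solve-n-times structure can be sketched with SciPy's LU interface (an assumption of this sketch: SciPy is available; `lu_factor`/`lu_solve` are its standard factor-and-solve pair):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def inverse_via_lup(A):
    """Invert A with one LUP factorization (Theta(n^3)) followed by
    n triangular solves (Theta(n^2) each), one per column of AX = I_n."""
    n = A.shape[0]
    lu, piv = lu_factor(A)                  # PA = LU, computed once
    X = np.empty((n, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = 1.0                          # unit vector e_i
        X[:, i] = lu_solve((lu, piv), e)    # solve A X_i = e_i
    return X

A = np.array([[4.0, 3.0], [6.0, 3.0]])
X = inverse_via_lup(A)                      # X is A^-1
```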
Matrix Multiplication and Matrix Inversion Are Equivalently Hard
Let M(n) = time to multiply two n × n matrices, I(n) = time to invert an n × n matrix.
Theorem 28.1 — Multiplication is no harder than inversion:
If I(n) = Ω(n²) and I(3n) = O(I(n)), then M(n) = O(I(n)).
Proof idea: To compute AB, construct the 3n × 3n matrix:
D = | Iₙ A 0 |
| 0 Iₙ B |
| 0 0 Iₙ |
Then:
D⁻¹ = | Iₙ −A AB |
| 0 Iₙ −B |
| 0 0 Iₙ |
The product AB appears in the upper-right n × n block of D⁻¹.
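The construction is easy to verify numerically; the sketch below (my addition, using random matrices) reads AB out of the inverse of D:

```python
import numpy as np

# Numerical check of the reduction in Theorem 28.1:
# invert the 3n x 3n block matrix D and read AB off its upper-right block.
rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, n))
I, Z = np.eye(n), np.zeros((n, n))

D = np.block([[I, A, Z],
              [Z, I, B],
              [Z, Z, I]])
product = np.linalg.inv(D)[:n, 2 * n:]   # equals A @ B
```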
Theorem 28.2 — Inversion is no harder than multiplication:
If M(n) = Ω(n²) with regularity conditions M(n + k) = O(M(n)) for 0 ≤ k ≤ n and M(n/2) ≤ cM(n) for some c < 1/2, then I(n) = O(M(n)).
Proof sketch (for symmetric positive-definite A):
Partition A and A⁻¹ into four n/2 × n/2 blocks:
A = | B Cᵀ | A⁻¹ = | R T |
| C D | | U V |
Let S = D − CB⁻¹Cᵀ (Schur complement). Then:
A⁻¹ = | B⁻¹ + B⁻¹CᵀS⁻¹CB⁻¹ −B⁻¹CᵀS⁻¹ |
| −S⁻¹CB⁻¹ S⁻¹ |
This requires:
- 2 recursive inversions of n/2 × n/2 matrices
- 4 multiplications of n/2 × n/2 matrices
- O(n²) additional work
Recurrence: I(n) ≤ 2I(n/2) + 4M(n/2) + O(n²) = O(M(n))
For general nonsingular A: Use the identity A⁻¹ = (AᵀA)⁻¹Aᵀ, since AᵀA is always symmetric positive-definite (when A is nonsingular). This reduces general inversion to SPD inversion plus matrix multiplications, all in O(M(n)).
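This identity can be checked directly; in the sketch below (my own), `np.linalg.inv` stands in for the recursive SPD inversion applied to AᵀA:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))   # a random matrix is almost surely nonsingular

# A^T A is symmetric positive-definite when A is nonsingular, so it can be
# inverted by the SPD method; np.linalg.inv stands in for that step here.
AtA_inv = np.linalg.inv(A.T @ A)
A_inv = AtA_inv @ A.T             # A^-1 = (A^T A)^-1 A^T
# A_inv @ A is the identity (up to round-off)
```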
Practical implication: Strassen's O(n^lg7) matrix multiplication algorithm gives O(n^lg7) matrix inversion.
28.3 Symmetric Positive-Definite Matrices and Least-Squares Approximation
A matrix A is symmetric positive-definite (SPD) if:
- A = Aᵀ (symmetric)
- xᵀAx > 0 for all nonzero vectors x (positive-definite)
Key Properties of SPD Matrices
Lemma 28.3: Any positive-definite matrix is nonsingular.
- Proof: If A were singular, ∃ nonzero x with Ax = 0, so xᵀAx = 0, contradicting positive-definiteness.
Lemma 28.4: Every leading submatrix of an SPD matrix is also SPD.
- The k-th leading submatrix Aₖ consists of the first k rows and k columns of A.
- Proof: For any nonzero k-vector xₖ, extend it to x = (xₖ, 0)ᵀ. Then xᵀAx = xₖᵀAₖxₖ > 0.
Lemma 28.5 (Schur Complement Lemma): If A is SPD and is partitioned as
A = | Aₖ Bᵀ |
    | B  C  |
(where Aₖ is the leading k × k submatrix), then the Schur complement
S = C − B Aₖ⁻¹ Bᵀ
is also symmetric and positive-definite.
- Proof sketch: For any nonzero z, choose y = −Aₖ⁻¹Bᵀz and form x = (y, z)ᵀ. Then xᵀAx = zᵀSz > 0.
Corollary 28.6: LU decomposition of an SPD matrix never causes a division by 0.
- Every pivot is strictly positive.
- Proof: The first pivot a₁₁ = e₁ᵀAe₁ > 0. Each subsequent pivot is the first element of a Schur complement, which is SPD by Lemma 28.5. By induction, all pivots are positive.
Practical consequence: SPD matrices can use LU decomposition without pivoting — no need for the permutation matrix P.
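Corollary 28.6 can be observed directly by running elimination without pivoting on an SPD matrix and collecting the pivots (a sketch of my own; the tridiagonal matrix below is a standard SPD example):

```python
import numpy as np

def lu_pivots(A):
    """Run Gaussian elimination without pivoting and collect the pivots.
    For an SPD input, Corollary 28.6 guarantees they are all positive."""
    A = A.astype(float).copy()
    n = A.shape[0]
    pivots = []
    for k in range(n):
        pivots.append(A[k, k])
        A[k + 1:, k] /= A[k, k]
        A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])
    return pivots

M = np.array([[ 2.0, -1.0,  0.0],
              [-1.0,  2.0, -1.0],
              [ 0.0, -1.0,  2.0]])   # SPD: the 1-D discrete Laplacian
pivots = lu_pivots(M)                # [2, 1.5, 4/3], all strictly positive
```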
Least-Squares Approximation
Given m data points (x₁, y₁), (x₂, y₂), …, (xₘ, yₘ), we want to find a function:
F(x) = Σⱼ₌₁ⁿ cⱼfⱼ(x)
that minimizes the approximation errors ηᵢ = F(xᵢ) − yᵢ.
- The basis functions fⱼ are chosen based on the problem (e.g., fⱼ(x) = x^(j−1) for polynomial fitting)
- Typically n ≪ m (fewer parameters than data points) to avoid overfitting
Setup: Define the m × n matrix A where aᵢⱼ = fⱼ(xᵢ), and let c = (c₁, …, cₙ)ᵀ. Then:
- Predicted values: Ac
- Error vector: η = Ac − y
- Goal: minimize ‖η‖² = Σᵢ ηᵢ²
Derivation: Setting the derivative of ‖η‖² with respect to each cₖ to zero yields the normal equation:
AᵀA c = Aᵀy
Key facts about AᵀA:
- It is symmetric (by properties of transpose)
- If A has full column rank, AᵀA is positive-definite (by Theorem D.6)
- Therefore (AᵀA)⁻¹ exists
Solution:
c = (AᵀA)⁻¹ Aᵀ y = A⁺ y
where A⁺ = (AᵀA)⁻¹Aᵀ is the pseudoinverse of A.
The pseudoinverse generalizes the matrix inverse to non-square matrices: compare c = A⁺y (approximate solution to Ac = y) with x = A⁻¹b (exact solution to Ax = b).
Least-Squares Example
Given five data points: (−1, 2), (1, 1), (2, 1), (3, 0), (5, 3), fit a quadratic F(x) = c₁ + c₂x + c₃x².
The matrix of basis-function values:
A = | 1 -1 1 |
| 1 1 1 |
| 1 2 4 |
| 1 3 9 |
| 1 5 25 |
Solving the normal equation yields:
c = (1.200, −0.757, 0.214)ᵀ
So the best-fit quadratic is: F(x) = 1.200 − 0.757x + 0.214x²
Practical Computation
To solve the normal equation in practice:
- Compute Aᵀy
- Compute AᵀA
- Find the LU decomposition of AᵀA (guaranteed nonsingular since AᵀA is SPD when A has full rank)
- Solve via forward and back substitution
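The worked example above can be reproduced with exactly these steps (a sketch of my own; `np.linalg.solve` stands in for the LU-based forward and back substitution on the SPD matrix AᵀA):

```python
import numpy as np

# Data points from the example: (-1, 2), (1, 1), (2, 1), (3, 0), (5, 3).
xs = np.array([-1.0, 1.0, 2.0, 3.0, 5.0])
ys = np.array([2.0, 1.0, 1.0, 0.0, 3.0])

# Design matrix with columns f_j(x) = x^(j-1), i.e. [1, x, x^2].
A = np.vander(xs, 3, increasing=True)

# Solve the normal equation A^T A c = A^T y.
c = np.linalg.solve(A.T @ A, A.T @ ys)   # c is about (1.200, -0.757, 0.214)
```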
Summary of Running Times
| Operation | Time |
|---|---|
| Forward / back substitution | Θ(n²) |
| LU decomposition | Θ(n³) |
| LUP decomposition | Θ(n³) |
| Solving Ax = b (via LUP) | Θ(n³) |
| Solving k systems with same A | Θ(n³ + kn²) |
| Matrix inversion (via LUP) | Θ(n³) |
| Matrix inversion (via Strassen) | O(n^lg7) |
Key Takeaways
- LUP decomposition (PA = LU) is the preferred method for solving linear systems — more numerically stable than computing A⁻¹ directly
- Pivoting (choosing the largest element in the column) prevents division by zero and improves numerical stability
- Matrix multiplication and inversion are equivalently hard — O(M(n)) for both, enabling Strassen's speedup for inversion
- Symmetric positive-definite matrices are always nonsingular, have all-positive pivots, and need no pivoting for LU decomposition
- Least-squares approximation reduces to solving the normal equation AᵀAc = Aᵀy, which is guaranteed to have a unique solution when A has full column rank