
Introduction to Algorithms

Chapter 28: Matrix Operations

This chapter covers efficient algorithms for matrix operations central to scientific computing: solving systems of linear equations via LUP decomposition, inverting matrices, and working with symmetric positive-definite matrices for least-squares approximation.

Numerical stability is a practical concern throughout — floating-point round-off errors can amplify and produce incorrect results. LUP decomposition is preferred over direct inversion because it is more numerically stable.


28.1 Solving Systems of Linear Equations

Given n equations in n unknowns:

a₁₁x₁ + a₁₂x₂ + ⋯ + a₁ₙxₙ = b₁
a₂₁x₁ + a₂₂x₂ + ⋯ + a₂ₙxₙ = b₂
              ⋮
aₙ₁x₁ + aₙ₂x₂ + ⋯ + aₙₙxₙ = bₙ

This can be written as the matrix-vector equation:

Ax = b
  • If A is nonsingular (invertible), the unique solution is x = A⁻¹b.
  • If rank(A) < n (fewer independent equations than unknowns) → underdetermined system (typically infinitely many solutions, or none if inconsistent).
  • If more equations than unknowns → overdetermined system (may have no exact solution; see §28.3 for least-squares).

Overview of LUP Decomposition

The goal is to find three n × n matrices L, U, P such that:

PA = LU

where:

  • L is a unit lower-triangular matrix (1s on diagonal, 0s above)
  • U is an upper-triangular matrix
  • P is a permutation matrix

Why LUP? Solving triangular systems is easy. Once we have PA = LU, solving Ax = b reduces to:

  1. Multiply both sides by P: PAx = Pb, giving LUx = Pb
  2. Let y = Ux
  3. Solve Ly = Pb via forward substitution → Θ(n²)
  4. Solve Ux = y via back substitution → Θ(n²)

Correctness: Since A = P⁻¹LU, we verify Ax = P⁻¹LUx = P⁻¹Ly = P⁻¹Pb = b. ✓

Forward and Back Substitution

Forward substitution solves the lower-triangular system Ly = Pb:

y₁ = b[π(1)]
yᵢ = b[π(i)] − Σⱼ₌₁ⁱ⁻¹ lᵢⱼyⱼ    for i = 2, 3, …, n

Back substitution solves the upper-triangular system Ux = y:

xₙ = yₙ / uₙₙ
xᵢ = (yᵢ − Σⱼ₌ᵢ₊₁ⁿ uᵢⱼxⱼ) / uᵢᵢ    for i = n−1, n−2, …, 1

Both run in Θ(n²) time.

LUP-SOLVE(L, U, π, b)
  n = L.rows
  let x and y be new vectors of length n
  for i = 1 to n                              // forward substitution
      yᵢ = b[π(i)] − Σⱼ₌₁ⁱ⁻¹ lᵢⱼyⱼ
  for i = n downto 1                          // back substitution
      xᵢ = (yᵢ − Σⱼ₌ᵢ₊₁ⁿ uᵢⱼxⱼ) / uᵢᵢ
  return x

Running time: Θ(n²) (each loop has an implicit inner summation).
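The pseudocode above translates almost line for line into Python. The sketch below is my own 0-indexed rendering (the book's pseudocode is 1-indexed), using plain lists of lists for L and U and a list for the permutation π:

```python
# LUP-SOLVE in Python (0-indexed). L is unit lower-triangular, U is
# upper-triangular, pi is the permutation array, b the right-hand side.
def lup_solve(L, U, pi, b):
    n = len(L)
    y = [0.0] * n
    x = [0.0] * n
    # Forward substitution: solve Ly = Pb (diagonal of L is all 1s).
    for i in range(n):
        y[i] = b[pi[i]] - sum(L[i][j] * y[j] for j in range(i))
    # Back substitution: solve Ux = y.
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x
```

Each loop iteration does Θ(n) work in its inner summation, which is where the Θ(n²) total comes from.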

Computing an LU Decomposition

LU decomposition is the special case where P = Iₙ (no pivoting), so A = LU.

The method uses Gaussian elimination:

  • Subtract multiples of the first equation from subsequent equations to eliminate the first variable
  • Repeat for each subsequent variable
  • The result is an upper-triangular matrix U; the multipliers form L

Recursive formulation: Partition A as:

A = | a₁₁   wᵀ  |
    |  v    A'   |

where v is an (n−1)-vector (first column below a₁₁), wᵀ is a row (n−1)-vector, and A' is (n−1)×(n−1). Then:

A = | 1        0    | × | a₁₁       wᵀ           |
    | v/a₁₁   Iₙ₋₁ |   |  0    A' − vwᵀ/a₁₁     |

The matrix A' − vwᵀ/a₁₁ is the Schur complement of A with respect to a₁₁. If A is nonsingular, the Schur complement is also nonsingular, so we can recurse.

The elements we divide by are called pivots (diagonal elements of U). If any pivot is 0, LU decomposition fails — this is why we need LUP decomposition with permutations.

LU-DECOMPOSITION(A)
  n = A.rows
  let L and U be new n × n matrices
  initialize U with 0s below the diagonal
  initialize L with 1s on the diagonal and 0s above the diagonal
  for k = 1 to n
      uₖₖ = aₖₖ
      for i = k + 1 to n
          lᵢₖ = aᵢₖ / aₖₖ          // multiplier (v/aₖₖ)
          uₖᵢ = aₖᵢ                 // row of U (wᵀ)
      for i = k + 1 to n
          for j = k + 1 to n
              aᵢⱼ = aᵢⱼ − lᵢₖ · uₖⱼ  // Schur complement update
  return L and U

Running time: Θ(n³) due to the triply nested loop.

In-place optimization: L and U can be stored directly in A, since lᵢⱼ occupies positions where i > j and uᵢⱼ occupies positions where i ≤ j.
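For concreteness, here is a 0-indexed Python sketch of LU-DECOMPOSITION (my own rendering, not the book's code). It returns L and U as separate matrices for clarity rather than using the in-place trick:

```python
# LU decomposition without pivoting. Assumes A is a square list-of-lists
# whose pivots are all nonzero (e.g., A is symmetric positive-definite).
def lu_decomposition(A):
    n = len(A)
    A = [row[:] for row in A]                # work on a copy
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for k in range(n):
        U[k][k] = A[k][k]                    # pivot
        for i in range(k + 1, n):
            L[i][k] = A[i][k] / A[k][k]      # multiplier (v / a_kk)
            U[k][i] = A[k][i]                # row of U (w^T)
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] -= L[i][k] * U[k][j]  # Schur complement update
    return L, U
```

Multiplying the returned factors back together reproduces A, which is a convenient correctness check.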

Computing an LUP Decomposition

To avoid dividing by 0 (or by small values causing numerical instability), we pivot: at each step, select the element in the current column with the greatest absolute value and swap rows.

Key differences from LU decomposition:

  • Before processing column k, find the row k' (with k' ≥ k) that maximizes |aₖ'ₖ|
  • Exchange rows k and k' in full — including any multipliers already stored in earlier columns — and record the swap in the permutation array π
  • Then divide by the pivot aₖₖ and update the Schur complement exactly as in LU decomposition

LUP-DECOMPOSITION(A)
  n = A.rows
  let π[1..n] be a new array
  for i = 1 to n
      π[i] = i
  for k = 1 to n
      p = 0
      for i = k to n
          if |aᵢₖ| > p
              p = |aᵢₖ|
              k' = i
      if p == 0
          error "singular matrix"
      exchange π[k] with π[k']
      for i = 1 to n
          exchange aₖᵢ with aₖ'ᵢ
      for i = k + 1 to n
          aᵢₖ = aᵢₖ / aₖₖ
          for j = k + 1 to n
              aᵢⱼ = aᵢⱼ − aᵢₖ · aₖⱼ

Running time: Θ(n³) — same asymptotic cost as LU decomposition. Pivoting adds at most a constant factor.

Output encoding: When the procedure terminates, the matrix A stores both L and U in place:

  • aᵢⱼ = lᵢⱼ if i > j
  • aᵢⱼ = uᵢⱼ if i ≤ j

The permutation is stored in array π, where π[i] = j means row i of P has a 1 in column j.
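A 0-indexed Python sketch of LUP-DECOMPOSITION (my own rendering) that follows the in-place encoding just described — A is overwritten with L strictly below the diagonal and U on and above it, and π is returned:

```python
# LUP decomposition with partial pivoting. Mutates A in place and
# returns the permutation array pi (pi[i] = source row of row i).
def lup_decomposition(A):
    n = len(A)
    pi = list(range(n))
    for k in range(n):
        # Pick the row with the largest |a_ik| in column k.
        k_prime = max(range(k, n), key=lambda i: abs(A[i][k]))
        if A[k_prime][k] == 0:
            raise ValueError("singular matrix")
        pi[k], pi[k_prime] = pi[k_prime], pi[k]
        A[k], A[k_prime] = A[k_prime], A[k]   # swap full rows
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]                # store l_ik in place
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]  # Schur complement update
    return pi
```

Extracting L and U from the overwritten A and checking PA = LU verifies the factorization.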


28.2 Inverting Matrices

Computing a Matrix Inverse from an LUP Decomposition

To compute A⁻¹, treat the equation AX = Iₙ as n separate systems:

AXᵢ = eᵢ    for i = 1, 2, …, n

where Xᵢ is the i-th column of X = A⁻¹ and eᵢ is the i-th unit vector.

Procedure:

  1. Compute the LUP decomposition of A once → Θ(n³)
  2. For each of the n columns, solve using LUP-SOLVE → Θ(n²) each
  3. Total: Θ(n³)
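The whole procedure can be sketched end to end in Python (helper names are mine; 0-indexed): factor once, then solve one unit-vector system per column against the in-place L/U encoding.

```python
# Invert A by solving AX = I column by column via one LUP factorization.
def lup_decomposition(A):
    n = len(A)
    pi = list(range(n))
    for k in range(n):
        k_prime = max(range(k, n), key=lambda i: abs(A[i][k]))
        if A[k_prime][k] == 0:
            raise ValueError("singular matrix")
        pi[k], pi[k_prime] = pi[k_prime], pi[k]
        A[k], A[k_prime] = A[k_prime], A[k]
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    return pi

def lup_solve_inplace(LU, pi, b):
    n = len(LU)
    y = [0.0] * n
    for i in range(n):                        # forward substitution
        y[i] = b[pi[i]] - sum(LU[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):            # back substitution
        x[i] = (y[i] - sum(LU[i][j] * x[j] for j in range(i + 1, n))) / LU[i][i]
    return x

def invert(A):
    n = len(A)
    LU = [row[:] for row in A]
    pi = lup_decomposition(LU)                # one Theta(n^3) factorization
    cols = []
    for i in range(n):
        e = [0.0] * n
        e[i] = 1.0
        cols.append(lup_solve_inplace(LU, pi, e))  # Theta(n^2) per column
    # cols[i] is the i-th column of A^-1; transpose columns into rows.
    return [[cols[j][i] for j in range(n)] for i in range(n)]
```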

Matrix Multiplication and Matrix Inversion Are Equivalently Hard

Let M(n) = time to multiply two n × n matrices, I(n) = time to invert an n × n matrix.

Theorem 28.1 — Multiplication is no harder than inversion:

If I(n) = Ω(n²) and I(3n) = O(I(n)), then M(n) = O(I(n)).

Proof idea: To compute AB, construct the 3n × 3n matrix:

D = | Iₙ   A   0  |
    |  0   Iₙ   B  |
    |  0    0   Iₙ |

Then:

D⁻¹ = | Iₙ   −A    AB |
       |  0    Iₙ   −B |
       |  0     0    Iₙ |

The product AB appears in the upper-right n × n block of D⁻¹.
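A quick numeric sanity check of this construction for n = 1 (so A and B are scalars a and b, and D is 3 × 3); `matmul` is a throwaway helper of mine:

```python
# Verify that the claimed D^-1 really is the inverse of D, and that
# its upper-right entry is the product ab.
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

a, b = 3.0, 5.0
D     = [[1.0,  a, 0.0], [0.0, 1.0,  b], [0.0, 0.0, 1.0]]
D_inv = [[1.0, -a, a * b], [0.0, 1.0, -b], [0.0, 0.0, 1.0]]
I3 = matmul(D, D_inv)   # the 3 x 3 identity
```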

Theorem 28.2 — Inversion is no harder than multiplication:

If M(n) = Ω(n²) with regularity conditions M(n + k) = O(M(n)) for 0 ≤ k ≤ n and M(n/2) ≤ cM(n) for some c < 1/2, then I(n) = O(M(n)).

Proof sketch (for symmetric positive-definite A):

Partition A and A⁻¹ into four n/2 × n/2 blocks:

A = | B   Cᵀ |    A⁻¹ = | R   T |
    | C    D |           | U   V |

Let S = D − CB⁻¹Cᵀ (Schur complement). Then:

A⁻¹ = | B⁻¹ + B⁻¹CᵀS⁻¹CB⁻¹    −B⁻¹CᵀS⁻¹ |
       |       −S⁻¹CB⁻¹              S⁻¹     |

This requires:

  • 2 recursive inversions of n/2 × n/2 matrices
  • 4 multiplications of n/2 × n/2 matrices
  • O(n²) additional work

Recurrence: I(n) ≤ 2I(n/2) + 4M(n/2) + O(n²) = O(M(n))
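With 1 × 1 blocks (B, C, D reduce to scalars b, c, d), the block-inversion formula can be checked numerically against the familiar 2 × 2 inverse. A toy check, with variable names matching the blocks above:

```python
# Block-inversion formula for the SPD matrix [[b, c], [c, d]],
# with scalar blocks B = b, C = c, D = d.
b, c, d = 4.0, 2.0, 3.0                 # SPD: b > 0 and b*d - c^2 > 0
S = d - c * (1.0 / b) * c               # Schur complement of b
R = 1.0 / b + (1.0 / b) * c * (1.0 / S) * c * (1.0 / b)
T = -(1.0 / b) * c * (1.0 / S)
U = -(1.0 / S) * c * (1.0 / b)
V = 1.0 / S
# [[R, T], [U, V]] should equal (1/det) * [[d, -c], [-c, b]].
```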

For general nonsingular A: Use the identity A⁻¹ = (AᵀA)⁻¹Aᵀ, since AᵀA is always symmetric positive-definite (when A is nonsingular). This reduces general inversion to SPD inversion plus matrix multiplications, all in O(M(n)).

Practical implication: Strassen's O(n^lg7) matrix multiplication algorithm gives O(n^lg7) matrix inversion.


28.3 Symmetric Positive-Definite Matrices and Least-Squares Approximation

A matrix A is symmetric positive-definite (SPD) if:

  • A = Aᵀ (symmetric)
  • xᵀAx > 0 for all nonzero vectors x (positive-definite)

Key Properties of SPD Matrices

Lemma 28.3: Any positive-definite matrix is nonsingular.

  • Proof: If A were singular, ∃ nonzero x with Ax = 0, so xᵀAx = 0, contradicting positive-definiteness.

Lemma 28.4: Every leading submatrix of an SPD matrix is also SPD.

  • The k-th leading submatrix Aₖ consists of the first k rows and k columns of A.
  • Proof: For any nonzero k-vector xₖ, extend it to x = (xₖ, 0)ᵀ. Then xᵀAx = xₖᵀAₖxₖ > 0.

Lemma 28.5 (Schur Complement Lemma): Partition an SPD matrix A as

A = | Aₖ   Bᵀ |
    | B    C  |

where Aₖ is the leading k × k submatrix, B is (n−k) × k, and C is (n−k) × (n−k). Then the Schur complement

S = C − B Aₖ⁻¹ Bᵀ

is also symmetric and positive-definite.

  • Proof sketch: For any nonzero z, choose y = −Aₖ⁻¹Bᵀz and form x = (y, z)ᵀ. Then xᵀAx = zᵀSz > 0.

Corollary 28.6: LU decomposition of an SPD matrix never causes a division by 0.

  • Every pivot is strictly positive.
  • Proof: The first pivot a₁₁ = e₁ᵀAe₁ > 0. Each subsequent pivot is the first element of a Schur complement, which is SPD by Lemma 28.5. By induction, all pivots are positive.

Practical consequence: SPD matrices can use LU decomposition without pivoting — no need for the permutation matrix P.
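Corollary 28.6 is easy to observe empirically. The sketch below (my own helper) runs Gaussian elimination without pivoting on an SPD matrix and records each pivot as it appears:

```python
# Collect the pivots encountered during pivot-free LU elimination.
def lu_pivots(A):
    n = len(A)
    A = [row[:] for row in A]            # work on a copy
    pivots = []
    for k in range(n):
        pivots.append(A[k][k])           # pivot = a_kk of current Schur complement
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k + 1, n):
                A[i][j] -= f * A[k][j]   # Schur complement update
    return pivots

# Symmetric, strictly diagonally dominant with positive diagonal => SPD.
spd = [[4.0, 2.0, 1.0], [2.0, 6.0, 3.0], [1.0, 3.0, 6.0]]
pivots = lu_pivots(spd)                  # every entry is strictly positive
```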

Least-Squares Approximation

Given m data points (x₁, y₁), (x₂, y₂), …, (xₘ, yₘ), we want to find a function:

F(x) = Σⱼ₌₁ⁿ cⱼfⱼ(x)

that minimizes the approximation errors ηᵢ = F(xᵢ) − yᵢ.

  • The basis functions fⱼ are chosen based on the problem (e.g., fⱼ(x) = x^(j−1) for polynomial fitting)
  • Typically n ≪ m (fewer parameters than data points) to avoid overfitting

Setup: Define the m × n matrix A where aᵢⱼ = fⱼ(xᵢ), and let c = (c₁, …, cₙ)ᵀ. Then:

  • Predicted values: Ac
  • Error vector: η = Ac − y
  • Goal: minimize ‖η‖² = Σᵢ ηᵢ²

Derivation: Setting the derivative of ‖η‖² with respect to each cₖ to zero yields the normal equation:

AᵀA c = Aᵀy

Key facts about AᵀA:

  • It is symmetric (by properties of transpose)
  • If A has full column rank, AᵀA is positive-definite (by Theorem D.6)
  • Therefore (AᵀA)⁻¹ exists

Solution:

c = (AᵀA)⁻¹ Aᵀ y = A⁺ y

where A⁺ = (AᵀA)⁻¹Aᵀ is the pseudoinverse of A.

The pseudoinverse generalizes the matrix inverse to non-square matrices: compare c = A⁺y (approximate solution to Ac = y) with x = A⁻¹b (exact solution to Ax = b).

Least-Squares Example

Given five data points: (−1, 2), (1, 1), (2, 1), (3, 0), (5, 3), fit a quadratic F(x) = c₁ + c₂x + c₃x².

The matrix of basis-function values:

A = | 1  -1   1 |
    | 1   1   1 |
    | 1   2   4 |
    | 1   3   9 |
    | 1   5  25 |

Solving the normal equation yields:

c = (1.200, −0.757, 0.214)ᵀ

So the best-fit quadratic is: F(x) = 1.200 − 0.757x + 0.214x²
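This fit can be reproduced with a short script: build A from the basis 1, x, x², form the normal equations AᵀAc = Aᵀy, and solve them. The `solve` helper below is a generic Gaussian-elimination routine of my own, not the book's:

```python
# Solve Mx = b by Gaussian elimination with partial pivoting.
def solve(M, b):
    n = len(M)
    M = [row[:] + [b[i]] for i, row in enumerate(M)]   # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

pts = [(-1.0, 2.0), (1.0, 1.0), (2.0, 1.0), (3.0, 0.0), (5.0, 3.0)]
A = [[1.0, x, x * x] for x, _ in pts]                  # a_ij = f_j(x_i)
y = [yi for _, yi in pts]
AtA = [[sum(A[i][r] * A[i][c] for i in range(5)) for c in range(3)] for r in range(3)]
Aty = [sum(A[i][r] * y[i] for i in range(5)) for r in range(3)]
c = solve(AtA, Aty)   # approximately [1.200, -0.757, 0.214]
```

The exact solution is c = (6/5, −53/70, 3/14), which rounds to the values quoted above.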

Practical Computation

To solve the normal equation in practice:

  1. Compute Aᵀy
  2. Compute AᵀA
  3. Find the LU decomposition of AᵀA (guaranteed nonsingular since AᵀA is SPD when A has full rank)
  4. Solve via forward and back substitution

Summary of Running Times

Operation                         | Time
----------------------------------|------------
Forward / back substitution       | Θ(n²)
LU decomposition                  | Θ(n³)
LUP decomposition                 | Θ(n³)
Solving Ax = b (via LUP)          | Θ(n³)
Solving k systems with same A     | Θ(n³ + kn²)
Matrix inversion (via LUP)        | Θ(n³)
Matrix inversion (via Strassen)   | O(n^lg 7)

Key Takeaways

  • LUP decomposition (PA = LU) is the preferred method for solving linear systems — more numerically stable than computing A⁻¹ directly
  • Pivoting (choosing the largest element in the column) prevents division by zero and improves numerical stability
  • Matrix multiplication and inversion are equivalently hard — O(M(n)) for both, enabling Strassen's speedup for inversion
  • Symmetric positive-definite matrices are always nonsingular, have all-positive pivots, and need no pivoting for LU decomposition
  • Least-squares approximation reduces to solving the normal equation AᵀAc = Aᵀy, which is guaranteed to have a unique solution when A has full column rank