
Introduction to Algorithms

Chapter 5: Probabilistic Analysis and Randomized Algorithms

This chapter introduces probabilistic analysis and randomized algorithms — two powerful techniques for analyzing and designing algorithms using probability and randomness.


5.1 The Hiring Problem

Problem Setup

  • You need to hire an office assistant through an employment agency
  • Candidates arrive one per day, numbered 1 through n
  • After each interview, you decide to hire or not
  • Interviewing has a low cost cᵢ; hiring has a high cost cₕ
  • Strategy: always keep the best candidate seen so far — if a new candidate is better, fire the current one and hire the new one

Pseudocode: HIRE-ASSISTANT(n)

HIRE-ASSISTANT(n)
1  best = 0              // dummy candidate 0 (least qualified)
2  for i = 1 to n
3      interview candidate i
4      if candidate i is better than candidate best
5          best = i
6          hire candidate i
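HIRE-ASSISTANT translates almost line for line into Python. The sketch below is illustrative: candidate quality is modeled as a list of comparable scores (an assumption for this example), and the function returns m, the number of hires, since that is the quantity the cost analysis tracks.

```python
def hire_assistant(scores):
    """Run the hire-the-best-so-far strategy.

    scores[i] models how candidate i+1 compares in interviews; the return
    value is m, the number of hires, which drives the hiring cost c_h * m.
    """
    best = float("-inf")   # dummy candidate 0: worse than every real candidate
    hires = 0
    for score in scores:   # interview candidates in arrival order
        if score > best:   # candidate i is better than candidate best
            best = score
            hires += 1     # fire the current assistant, hire candidate i
    return hires
```

A strictly increasing score list triggers the worst case (every candidate hired), while a decreasing one yields a single hire.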

Cost Analysis

  • Total cost: O(cᵢ·n + cₕ·m) where m = number of people hired
  • Interviewing cost cᵢ·n is fixed (always interview all n candidates)
  • Focus is on analyzing the hiring cost cₕ·m, which varies per run

Worst-Case Analysis

  • Worst case: candidates arrive in strictly increasing order of quality
  • Every candidate is hired → m = n
  • Total hiring cost: O(cₕ·n)

Probabilistic Analysis

  • Uses probability to analyze expected behavior over a distribution of inputs
  • Assumes candidates arrive in random order — i.e., the rank list ⟨rank(1), rank(2), …, rank(n)⟩ is a uniform random permutation of ⟨1, 2, …, n⟩
  • Each of the n! permutations is equally likely
  • Computes an average-case running time averaged over all possible inputs

Randomized Algorithms

  • When we don't know the input distribution, we impose randomness ourselves
  • A randomized algorithm has behavior determined by both its input and values from a random-number generator
  • RANDOM(a, b): returns an integer in [a, b] uniformly at random; each call is independent
  • Key distinction:
    • Average-case running time: probability distribution is over the inputs
    • Expected running time: the algorithm itself makes random choices
  • Advantage: no particular input elicits worst-case behavior — even an adversary cannot produce a bad input

5.2 Indicator Random Variables

Definition

Given a sample space S and an event A, the indicator random variable I{A} is:

I{A} = 1  if A occurs
       0  if A does not occur

Lemma 5.1

Given a sample space S and an event A, let X_A = I{A}. Then E[X_A] = Pr{A}.

Proof: E[X_A] = 1 · Pr{A} + 0 · Pr{Ā} = Pr{A}

Example: Coin Flips

  • For n flips of a fair coin, let Xᵢ = I{ith flip is heads}
  • Total heads: X = X₁ + X₂ + … + Xₙ
  • By linearity of expectation: E[X] = Σ E[Xᵢ] = Σ (1/2) = n/2
  • Linearity of expectation applies even when variables are dependent

Analysis of the Hiring Problem

  1. Let Xᵢ = I{candidate i is hired}
  2. Total hires: X = X₁ + X₂ + … + Xₙ
  3. Candidate i is hired iff they are the best among the first i candidates
  4. Probability: Pr{candidate i is hired} = 1/i
  5. By Lemma 5.1: E[Xᵢ] = 1/i
  6. By linearity of expectation:
E[X] = Σᵢ₌₁ⁿ (1/i) = ln n + O(1)
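A quick numerical check of this bound: the exact expectation is the harmonic number H_n, and a simulation over random permutations should land close to it. This is a sketch; the function names, trial count, and seed are arbitrary choices for illustration.

```python
import math
import random

def expected_hires(n):
    """Exact expected number of hires: the harmonic number H_n = sum 1/i."""
    return sum(1.0 / i for i in range(1, n + 1))

def simulate_hires(n, trials, seed=0):
    """Average number of hires over random permutations of candidate ranks."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        ranks = list(range(n))
        rng.shuffle(ranks)          # random arrival order
        best, hires = -1, 0
        for r in ranks:
            if r > best:
                best, hires = r, hires + 1
        total += hires
    return total / trials

n = 20
print(expected_hires(n))   # H_20 ≈ 3.5977
print(math.log(n))         # ln 20 ≈ 2.9957, consistent with H_n = ln n + O(1)
```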

Lemma 5.2

Assuming candidates are presented in random order, HIRE-ASSISTANT has an average-case total hiring cost of O(cₕ ln n).

This is a significant improvement over the worst-case cost of O(cₕ·n).


5.3 Randomized Algorithms

Key Idea

  • Instead of assuming a distribution on inputs, we impose one
  • Randomly permute the candidates before running the algorithm
  • This enforces the property that every permutation is equally likely

Pseudocode: RANDOMIZED-HIRE-ASSISTANT(n)

RANDOMIZED-HIRE-ASSISTANT(n)
1  randomly permute the list of candidates
2  best = 0
3  for i = 1 to n
4      interview candidate i
5      if candidate i is better than candidate best
6          best = i
7          hire candidate i

Lemma 5.3

The expected hiring cost of RANDOMIZED-HIRE-ASSISTANT is O(cₕ ln n).

Probabilistic Analysis vs. Randomized Algorithms

Aspect            | Probabilistic Analysis (Lemma 5.2) | Randomized Algorithm (Lemma 5.3)
Assumption        | Input comes in random order        | No assumption on input
Randomness source | The input                          | The algorithm itself
Cost term         | Average-case                       | Expected
Guarantee         | Over the input distribution        | For any input

Randomly Permuting Arrays

Method 1: PERMUTE-BY-SORTING

PERMUTE-BY-SORTING(A)
1  n = A.length
2  let P[1..n] be a new array
3  for i = 1 to n
4      P[i] = RANDOM(1, n³)
5  sort A, using P as sort keys

  • Assign each element a random priority, then sort by priority
  • Running time: Θ(n lg n) (dominated by the sort)
  • Range 1 to n³ makes it likely all priorities are unique (probability ≥ 1 − 1/n)
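A direct Python rendering of this method (a sketch: it returns a new list rather than permuting A in place, and ties between priorities are left to the sort, which is why distinct priorities matter for uniformity):

```python
import random

def permute_by_sorting(A, rng=random):
    """Assign each element a priority drawn from 1..n^3, then sort by it."""
    n = len(A)
    P = [rng.randint(1, n ** 3) for _ in range(n)]  # random priorities
    # Sort on the priority alone, so A's elements are never compared directly.
    return [a for _, a in sorted(zip(P, A), key=lambda pair: pair[0])]
```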

Lemma 5.4: PERMUTE-BY-SORTING produces a uniform random permutation (assuming distinct priorities).

Proof sketch: Fix a target permutation. The probability that its first element receives the smallest priority is 1/n; given that, the second element receives the smallest remaining priority with probability 1/(n−1), and so on:

(1/n) · (1/(n-1)) · (1/(n-2)) · … · (1/1) = 1/n!

Method 2: RANDOMIZE-IN-PLACE (Fisher-Yates Shuffle)

RANDOMIZE-IN-PLACE(A)
1  n = A.length
2  for i = 1 to n
3      swap A[i] with A[RANDOM(i, n)]

  • Runs in O(n) time — more efficient than sorting
  • In iteration i, choose A[i] randomly from A[i..n]
  • After iteration i, A[i] is never altered
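In Python this is a few lines (a minimal sketch; Python's own random.shuffle implements the same algorithm):

```python
import random

def randomize_in_place(A, rng=random):
    """Fisher-Yates shuffle: uniform over all n! permutations, O(n) time."""
    n = len(A)
    for i in range(n):
        j = rng.randrange(i, n)    # RANDOM(i, n): pick from the suffix A[i..n-1]
        A[i], A[j] = A[j], A[i]    # A[i] is never altered after this iteration
```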

Lemma 5.5: RANDOMIZE-IN-PLACE computes a uniform random permutation.

Proof (by loop invariant): Just prior to iteration i, for each possible (i−1)-permutation, the subarray A[1..i−1] contains it with probability (n−i+1)!/n!.

  • Initialization (i=1): Empty subarray trivially contains the 0-permutation with probability n!/n! = 1 ✓
  • Maintenance: Pr{i-permutation in A[1..i]} = (1/(n−i+1)) · ((n−i+1)!/n!) = (n−i)!/n! ✓
  • Termination (i=n+1): A[1..n] is any n-permutation with probability 0!/n! = 1/n! ✓

Common Pitfall

Simply swapping A[i] with A[RANDOM(1, n)] (choosing from the entire array) does not produce a uniform random permutation.
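This can be verified exhaustively for n = 3 by replaying every possible sequence of random choices: the naive version has 3³ = 27 equally likely execution paths spread over 3! = 6 permutations, and since 27 is not divisible by 6 the distribution cannot be uniform. A sketch (indices are 0-based here):

```python
from collections import Counter
from itertools import product

def apply_swaps(n, choices):
    """Replay a shuffle whose i-th step swaps A[i] with A[choices[i]]."""
    A = list(range(n))
    for i, j in enumerate(choices):
        A[i], A[j] = A[j], A[i]
    return tuple(A)

n = 3
# Naive: swap A[i] with A[RANDOM(1, n)]; every step picks from all n slots.
naive = Counter(apply_swaps(n, c) for c in product(range(n), repeat=n))
# Fisher-Yates: swap A[i] with A[RANDOM(i, n)]; step i picks from the suffix.
fisher_yates = Counter(apply_swaps(n, c)
                       for c in product(*(range(i, n) for i in range(n))))

print(sorted(naive.values()))         # uneven counts: not uniform
print(sorted(fisher_yates.values()))  # each of the 6 permutations exactly once
```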


5.4 Probabilistic Analysis and Further Uses of Indicator Random Variables

5.4.1 The Birthday Paradox

Question: How many people must be in a room before there is a ≥ 50% chance that two share a birthday?

Answer: Surprisingly, only 23 people.

Setup

  • k people, n = 365 days in a year
  • Birthdays are independent and uniformly distributed
  • Pr{person i and person j share a birthday} = 1/n

Exact Analysis

Let Bₖ = event that all k birthdays are distinct:

Pr{Bₖ} = 1 · (n-1)/n · (n-2)/n · … · (n-k+1)/n
        = ∏ᵢ₌₁ᵏ⁻¹ (1 - i/n)

Using the inequality 1 + x ≤ eˣ:

Pr{Bₖ} ≤ e^(-1/n) · e^(-2/n) · … · e^(-(k-1)/n)
        = e^(-k(k-1)/(2n))

This is ≤ 1/2 when k(k−1) ≥ 2n ln 2, giving:

k ≥ (1 + √(1 + 8n ln 2)) / 2

  • For n = 365: k ≥ 23
  • For Mars (n = 669 Martian days): k ≥ 31
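The exact product for Pr{Bₖ} is easy to evaluate numerically. The sketch below finds the smallest k giving at least a 50% chance of a shared birthday, under the same assumptions as above (n = 365, uniform independent birthdays):

```python
def prob_all_distinct(k, n=365):
    """Pr{B_k}: the probability that k birthdays are pairwise distinct."""
    p = 1.0
    for i in range(1, k):
        p *= 1 - i / n    # the (i+1)-st person avoids the first i birthdays
    return p

k = 1
while prob_all_distinct(k) > 0.5:
    k += 1
print(k)                          # 23
print(1 - prob_all_distinct(k))   # ≈ 0.5073, just past the 50% mark
```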

Analysis Using Indicator Random Variables

  • Let Xᵢⱼ = I{person i and person j share a birthday}, for 1 ≤ i < j ≤ k
  • E[Xᵢⱼ] = 1/n
  • Total matching pairs: X = Σᵢ Σⱼ>ᵢ Xᵢⱼ
  • By linearity of expectation:
E[X] = C(k,2) · (1/n) = k(k-1) / (2n)
  • When k(k−1) ≥ 2n, expected matching pairs ≥ 1
  • For n = 365: need k ≥ 28 for E[X] ≥ 1
  • Both analyses show that k = Θ(√n) people suffice

5.4.2 Balls and Bins

Consider tossing identical balls into b bins independently and uniformly at random.

Key Results

  1. Expected balls in a given bin (after n tosses): n/b

    • Follows binomial distribution b(k; n, 1/b)
  2. Expected tosses until a given bin gets a ball: b

    • Follows geometric distribution with probability 1/b
  3. Expected tosses until every bin has at least one ball: b ln b + O(b)

    • Partition the tosses into stages: stage i = the tosses after the (i−1)st hit, up to and including the ith hit
    • A "hit" = ball lands in an empty bin
    • During stage i: (i−1) bins occupied, (b−i+1) bins empty
    • Probability of a hit in stage i: (b−i+1)/b
    • nᵢ (tosses in stage i) follows geometric distribution
    • E[nᵢ] = b/(b−i+1)
    • Total: E[n] = Σᵢ₌₁ᵇ b/(b−i+1) = b · Σᵢ₌₁ᵇ (1/i) = b(ln b + O(1))

This is also known as the coupon collector's problem: collecting b different coupons requires approximately b ln b random draws.
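The b·H_b total can be checked directly against a simulation. This is a sketch; the bin count, trial count, and seed are arbitrary choices.

```python
import random

def tosses_to_fill(b, rng):
    """Toss balls into b bins uniformly until no bin is empty; count tosses."""
    seen = [False] * b
    empty, tosses = b, 0
    while empty:
        tosses += 1
        j = rng.randrange(b)        # the ball lands in bin j
        if not seen[j]:
            seen[j] = True          # a "hit": a previously empty bin is filled
            empty -= 1
    return tosses

b = 100
exact = b * sum(1 / i for i in range(1, b + 1))   # b * H_b = b ln b + O(b)
rng = random.Random(0)
avg = sum(tosses_to_fill(b, rng) for _ in range(2000)) / 2000
print(exact)   # ≈ 518.74 for b = 100
print(avg)     # should land close to the exact value
```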

5.4.3 Streaks

Question: In n fair coin flips, what is the expected length of the longest streak of consecutive heads?

Answer: Θ(lg n)

Upper Bound: O(lg n)

  • Let Aᵢₖ = event that a streak of ≥ k heads starts at position i
  • Pr{Aᵢₖ} = 1/2ᵏ (by independence)
  • For k = 2⌈lg n⌉: Pr{Aᵢₖ} ≤ 1/n²
  • By Boole's inequality (union bound): probability of any streak of length ≥ 2⌈lg n⌉ is < 1/n
  • Splitting E[L] into two ranges and bounding:
E[L] < 2⌈lg n⌉ · 1 + n · (1/n) = O(lg n)
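A small helper makes this bound easy to experiment with (a sketch; the flip count, trial count, and seed are arbitrary). For n = 1024 flips, the longest run of heads should hover around lg n = 10:

```python
import random

def longest_streak(flips):
    """Length of the longest run of consecutive heads (1s) in flips."""
    best = run = 0
    for f in flips:
        run = run + 1 if f == 1 else 0   # extend the run or reset on tails
        best = max(best, run)
    return best

rng = random.Random(0)
n, trials = 1024, 500
mean = sum(longest_streak([rng.randrange(2) for _ in range(n)])
           for _ in range(trials)) / trials
print(mean)    # typically a little below lg 1024 = 10
```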

Lower Bound: Ω(lg n)

  • Partition n flips into groups of s = ⌊(lg n)/2⌋ consecutive flips
  • Probability one group is all heads: ≥ 1/√n
  • Probability no group is all heads: ≤ (1 − 1/√n)^(2n/lg n − 1) = O(1/n)
  • So with probability ≥ 1 − O(1/n), the longest streak is ≥ ⌊(lg n)/2⌋
  • Therefore: E[L] = Ω(lg n)

Indicator Variable Approach

  • Let Xᵢₖ = I{streak of ≥ k heads starts at position i}
  • Expected number of streaks of length k: E[X] = (n−k+1)/2ᵏ
  • For k = c·lg n: E[X] = Θ(1/n^(c−1))
    • c large → few such streaks (unlikely)
    • c = 1/2 → E[X] = Θ(√n) → many such streaks (likely)
  • Confirms expected longest streak is Θ(lg n)

Probability of Long Streaks

Streak length          | Probability of occurring
≥ r⌈lg n⌉              | ≤ 1/n^(r−1)
≥ 20 (n = 1000, r = 2) | ≤ 1/1000
≥ 30 (n = 1000, r = 3) | ≤ 1/1,000,000

5.4.4 The On-Line Hiring Problem

A variant where you must hire exactly once and decide immediately after each interview.

Strategy

  1. Interview and reject the first k candidates (observation phase)
  2. Then hire the first candidate who is better than all of the first k
  3. If no such candidate appears, hire the last one (candidate n)

Pseudocode: ON-LINE-MAXIMUM(k, n)

ON-LINE-MAXIMUM(k, n)
1  bestscore = -∞
2  for i = 1 to k
3      if score(i) > bestscore
4          bestscore = score(i)
5  for i = k+1 to n
6      if score(i) > bestscore
7          return i
8  return n
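A direct rendering of the pseudocode (a sketch; scores is a 0-indexed list, so the returned index is one less than the pseudocode's candidate number):

```python
def on_line_maximum(k, scores):
    """Reject the first k candidates, then hire the first one who beats them."""
    n = len(scores)
    bestscore = float("-inf")
    for i in range(k):            # observation phase: record scores, never hire
        bestscore = max(bestscore, scores[i])
    for i in range(k, n):         # selection phase
        if scores[i] > bestscore:
            return i              # first candidate better than all of the first k
    return n - 1                  # nobody better appeared: take the last one
```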

Analysis

  • Let S = event of successfully hiring the best candidate
  • Sᵢ = event of success when the best candidate is in position i
  • Pr{Sᵢ} = 0 for i = 1, …, k (always rejected)
  • For i > k: Pr{Sᵢ} = Pr{Bᵢ} · Pr{Oᵢ}
    • Bᵢ = best candidate is in position i → Pr{Bᵢ} = 1/n
    • Oᵢ = no candidate in positions k+1 through i−1 is selected, which happens iff the best of the first i−1 candidates lies among the first k → Pr{Oᵢ} = k/(i−1)
    • These events are independent
    • Pr{Sᵢ} = k / (n(i−1))
Pr{S} = Σᵢ₌ₖ₊₁ⁿ k/(n(i-1)) = (k/n) · Σᵢ₌ₖⁿ⁻¹ (1/i)

Optimal k

  • Approximate the sum with integrals: Pr{S} ≈ (k/n)(ln n − ln k)
  • Differentiate and set to 0: ln k = ln n − 1 = ln(n/e)
  • Optimal k = n/e
  • Success probability: at least 1/e ≈ 0.368

By interviewing and rejecting the first n/e candidates, then hiring the first one better than all of them, we find the best candidate with probability at least 1/e — regardless of n.
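The trade-off can be evaluated exactly from Pr{S} = (k/n) Σ 1/i. The sketch below (n = 1000 is an arbitrary choice) locates the best k and confirms it sits near n/e, with success probability around 1/e:

```python
import math

def success_prob(k, n):
    """Pr{S} = (k/n) * sum_{i=k}^{n-1} 1/i, for 1 <= k < n."""
    return (k / n) * sum(1.0 / i for i in range(k, n))

n = 1000
best_k = max(range(1, n), key=lambda k: success_prob(k, n))
print(best_k, round(n / math.e))   # the best k lands near n/e ≈ 368
print(success_prob(best_k, n))     # slightly above 1/e ≈ 0.368
```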
