Introduction to Algorithms, Chapter 2: Getting Started
This chapter introduces the framework for designing and analyzing algorithms. It covers insertion sort, algorithm analysis, and the divide-and-conquer design approach with merge sort.
2.1 Insertion Sort
The Sorting Problem
- Input: A sequence of n numbers ⟨a₁, a₂, ..., aₙ⟩
- Output: A permutation ⟨a'₁, a'₂, ..., a'ₙ⟩ such that a'₁ ≤ a'₂ ≤ ... ≤ a'ₙ
The numbers to be sorted are called keys. The input comes as an array with n elements.
How Insertion Sort Works
- Works like sorting a hand of playing cards
- Start with an empty left hand, cards face down on table
- Pick one card at a time and insert it into the correct position in the left hand
- Compare with cards already in hand from right to left
- At all times, the cards in the left hand are sorted
Algorithm
The algorithm sorts in place — it rearranges the numbers within the array A, with at most a constant number of them stored outside the array at any time.
INSERTION-SORT(A)
1  for j = 2 to A.length
2      key = A[j]
3      // Insert A[j] into the sorted sequence A[1..j-1]
4      i = j - 1
5      while i > 0 and A[i] > key
6          A[i + 1] = A[i]
7          i = i - 1
8      A[i + 1] = key
def insertion_sort(A):
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
Loop Invariants and Correctness
Loop Invariant: At the start of each iteration of the for loop, the subarray A[1..j-1] consists of the elements originally in A[1..j-1], but in sorted order.
Three properties must be shown:
- Initialization: True prior to the first iteration of the loop.
  - When j = 2, the subarray A[1..j-1] = A[1] — a single element, trivially sorted.
- Maintenance: If true before an iteration, it remains true before the next iteration.
  - The body moves A[j-1], A[j-2], ... one position right until the proper position for A[j] is found, then inserts A[j]. The subarray A[1..j] is now sorted.
- Termination: When the loop terminates, the invariant gives a useful property showing correctness.
  - The loop terminates when j = n + 1. Substituting: A[1..n] consists of the original elements in sorted order. The entire array is sorted.
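The invariant can also be checked mechanically on the 0-indexed Python version: before each pass, the prefix A[:j] must be a sorted permutation of the elements originally in the first j positions. A small sketch — the assertions and the name `insertion_sort_checked` are our additions for illustration, not part of the text:

```python
def insertion_sort_checked(A):
    original = list(A)
    for j in range(1, len(A)):
        # Loop invariant (0-indexed analogue of A[1..j-1]): A[:j] holds
        # the elements originally in A[:j], in sorted order.
        assert A[:j] == sorted(original[:j])
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key
    # Termination: the whole array is sorted.
    assert A == sorted(original)
    return A

insertion_sort_checked([5, 2, 4, 6, 1, 3])
```

Comparing against `sorted(original[:j])` checks both parts of the invariant at once: the prefix is sorted, and it contains exactly the original first j elements.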
Note the similarity to mathematical induction: initialization = base case, maintenance = inductive step. The difference is that we stop when the loop terminates rather than applying indefinitely.
Pseudocode Conventions
- Indentation indicates block structure
- Loop constructs (while, for, repeat-until) and if-else work like C/Java/Python
- The loop counter retains its value after exiting (the first value exceeding the bound)
- to for incrementing loops, downto for decrementing, by for step size > 1
- // indicates comments
- Variables are local to the procedure
- Arrays accessed with square brackets: A[i]; ranges: A[1..j]
- Objects use dot notation: A.length
- Parameters passed by value; arrays/objects passed by pointer
- Boolean operators and/or are short-circuiting
- NIL represents a pointer to no object
2.2 Analyzing Algorithms
Analyzing an algorithm means predicting the resources it requires — most often computational time.
The RAM Model
- Random-Access Machine (RAM) model: generic one-processor, instructions executed sequentially (no concurrency)
- Contains instructions commonly found in real computers:
- Arithmetic: add, subtract, multiply, divide, remainder, floor, ceiling
- Data movement: load, store, copy
- Control: conditional/unconditional branch, subroutine call and return
- Each instruction takes constant time
- Data types: integer and floating point
- Word size: assumed c·lg(n) bits for constant c ≥ 1
- c ≥ 1 so each word can hold value n (to index input elements)
- c is constant so word size doesn't grow arbitrarily
- Does not model memory hierarchy (caches, virtual memory)
Input Size
The best notion depends on the problem:
- Sorting: number of items (array size n)
- Multiplying integers: total number of bits
- Graphs: number of vertices and edges
Running Time
- Number of primitive operations or steps executed
- Each execution of line i takes time cᵢ (a constant) — machine-independent view
Analysis of Insertion Sort
For each j = 2, 3, ..., n, let tⱼ = number of times the while loop test (line 5) is executed for that value of j.
| Line | Cost | Times |
|---|---|---|
| 1: for j = 2 to A.length | c₁ | n |
| 2: key = A[j] | c₂ | n - 1 |
| 3: // comment | 0 | n - 1 |
| 4: i = j - 1 | c₄ | n - 1 |
| 5: while i > 0 and A[i] > key | c₅ | Σ(j=2 to n) tⱼ |
| 6: A[i+1] = A[i] | c₆ | Σ(j=2 to n) (tⱼ - 1) |
| 7: i = i - 1 | c₇ | Σ(j=2 to n) (tⱼ - 1) |
| 8: A[i+1] = key | c₈ | n - 1 |
Total running time:
T(n) = c₁n + c₂(n-1) + c₄(n-1) + c₅·Σtⱼ + c₆·Σ(tⱼ-1) + c₇·Σ(tⱼ-1) + c₈(n-1)
Best Case: Already Sorted
- tⱼ = 1 for all j (while loop test fails immediately)
- T(n) = (c₁ + c₂ + c₄ + c₅ + c₈)n - (c₂ + c₄ + c₅ + c₈)
- This is an + b → a linear function of n
Worst Case: Reverse Sorted
- tⱼ = j for all j (must compare with every element)
- Using: Σ(j=2 to n) j = n(n+1)/2 - 1 and Σ(j=2 to n)(j-1) = n(n-1)/2
- T(n) = an² + bn + c → a quadratic function of n
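Both cases can be observed directly by instrumenting the Python version to count how many times the while-loop test runs, i.e. the sum of the tⱼ. A sketch (the counter is our addition):

```python
def insertion_sort_count(A):
    """Sort A in place; return total executions of the while-loop test."""
    tests = 0
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        tests += 1                 # the test that may fail immediately
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
            tests += 1             # re-test after each shift
        A[i + 1] = key
    return tests

n = 8
print(insertion_sort_count(list(range(n))))         # sorted: n - 1 = 7
print(insertion_sort_count(list(range(n, 0, -1))))  # reversed: n(n+1)/2 - 1 = 35
```

On sorted input each test fails immediately (tⱼ = 1, a linear total); on reverse-sorted input every preceding element is examined (tⱼ = j, a quadratic total).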
Worst-Case and Average-Case Analysis
We usually focus on worst-case running time for three reasons:
- Provides an upper bound — guarantees the algorithm never takes longer
- For some algorithms, the worst case occurs fairly often (e.g., searching for absent information)
- The average case is often roughly as bad as the worst case (for insertion sort, average tⱼ ≈ j/2, still gives quadratic time)
Order of Growth
- We consider only the leading term of the formula
- Ignore lower-order terms (insignificant for large n)
- Ignore the leading term's constant coefficient
- Insertion sort worst case: Θ(n²)
We consider one algorithm more efficient than another if its worst-case running time has a lower order of growth. For large enough inputs, a Θ(n²) algorithm will always run faster than a Θ(n³) algorithm.
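A toy calculation illustrates why constant factors do not change this conclusion; the hypothetical step counts 100n² and n³ below are arbitrary choices, not from the text:

```python
# 100n² (quadratic, large constant) vs n³ (cubic, constant 1).
# Past the crossover point n = 100, the quadratic count is always smaller.
for n in (10, 100, 1000, 10000):
    print(n, 100 * n**2, n**3)
```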
2.3 Designing Algorithms
Incremental Approach
- Used by insertion sort
- Having sorted A[1..j-1], insert A[j] into its proper place to get sorted A[1..j]
2.3.1 The Divide-and-Conquer Approach
Many algorithms are recursive in structure — they call themselves to deal with closely related subproblems.
Three steps at each level of recursion:
- Divide the problem into subproblems that are smaller instances of the same problem
- Conquer the subproblems by solving them recursively. If small enough, solve directly (base case)
- Combine the solutions to subproblems into the solution for the original problem
Merge Sort
Applies divide-and-conquer to sorting:
- Divide: Split the n-element sequence into two subsequences of n/2 elements each
- Conquer: Sort the two subsequences recursively using merge sort
- Combine: Merge the two sorted subsequences to produce the sorted answer
- Base case: Sequence of length 1 is already sorted
The MERGE Procedure
MERGE(A, p, q, r) — merges two sorted subarrays A[p..q] and A[q+1..r] into a single sorted subarray A[p..r].
- Uses sentinel value (∞) at the bottom of each pile to avoid checking if either pile is empty
- Runs in Θ(n) time where n = r - p + 1
MERGE(A, p, q, r)
 1  n1 = q - p + 1
 2  n2 = r - q
 3  let L[1..n1+1] and R[1..n2+1] be new arrays
 4  for i = 1 to n1
 5      L[i] = A[p + i - 1]
 6  for j = 1 to n2
 7      R[j] = A[q + j]
 8  L[n1 + 1] = ∞
 9  R[n2 + 1] = ∞
10  i = 1
11  j = 1
12  for k = p to r
13      if L[i] ≤ R[j]
14          A[k] = L[i]
15          i = i + 1
16      else A[k] = R[j]
17          j = j + 1
def merge(A, p, q, r):
    L = A[p:q+1] + [float('inf')]
    R = A[q+1:r+1] + [float('inf')]
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1
Loop invariant for MERGE (lines 12–17):
At the start of each iteration, A[p..k-1] contains the k-p smallest elements of L[1..n1+1] and R[1..n2+1] in sorted order. Moreover, L[i] and R[j] are the smallest elements of their arrays not yet copied back into A.
- Initialization: k = p, so A[p..k-1] is empty (0 smallest elements). i = j = 1, both point to smallest elements.
- Maintenance: The smaller of L[i] and R[j] is copied into A[k], maintaining sorted order.
- Termination: k = r + 1, so A[p..r] contains all r-p+1 smallest elements in sorted order. Only sentinels remain.
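The sentinels can also be avoided by testing explicitly whether a pile is exhausted (CLRS poses this variant as an exercise). One possible sketch in the same 0-indexed style; the name `merge_no_sentinel` is ours:

```python
def merge_no_sentinel(A, p, q, r):
    # Merge sorted runs A[p..q] and A[q+1..r] (0-indexed, inclusive)
    # without infinity sentinels: check for exhaustion instead.
    L, R = A[p:q + 1], A[q + 1:r + 1]
    i = j = 0
    for k in range(p, r + 1):
        # Take from L when R is empty, or when L's head is <= R's head.
        if j == len(R) or (i < len(L) and L[i] <= R[j]):
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

A = [2, 4, 5, 7, 1, 2, 3, 6]
merge_no_sentinel(A, 0, 3, 7)
print(A)  # → [1, 2, 2, 3, 4, 5, 6, 7]
```

Taking from L on ties (`<=`) keeps the merge stable, matching line 13 of the pseudocode.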
MERGE-SORT Procedure
MERGE-SORT(A, p, r)
1  if p < r
2      q = ⌊(p + r) / 2⌋
3      MERGE-SORT(A, p, q)
4      MERGE-SORT(A, q + 1, r)
5      MERGE(A, p, q, r)
def merge_sort(A, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)
        merge_sort(A, q + 1, r)
        merge(A, p, q, r)
Initial call: MERGE-SORT(A, 1, A.length)
The algorithm merges pairs of 1-item sequences → sorted sequences of length 2 → length 4 → ... → final sorted sequence of length n.
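For the 0-indexed Python version, the initial call uses 0 and len(A) - 1 as the endpoints. A self-contained run, with both procedures restated so the snippet works on its own:

```python
def merge(A, p, q, r):
    # Merge sorted runs A[p..q] and A[q+1..r] using inf sentinels.
    L = A[p:q + 1] + [float('inf')]
    R = A[q + 1:r + 1] + [float('inf')]
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        else:
            A[k] = R[j]
            j += 1

def merge_sort(A, p, r):
    if p < r:
        q = (p + r) // 2
        merge_sort(A, p, q)
        merge_sort(A, q + 1, r)
        merge(A, p, q, r)

A = [5, 2, 4, 7, 1, 3, 2, 6]
merge_sort(A, 0, len(A) - 1)   # 0-indexed initial call
print(A)  # → [1, 2, 2, 3, 4, 5, 6, 7]
```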
2.3.2 Analyzing Divide-and-Conquer Algorithms
When an algorithm contains a recursive call, its running time is described by a recurrence.
General divide-and-conquer recurrence:
T(n) = { Θ(1) if n ≤ c
{ aT(n/b) + D(n) + C(n) otherwise
Where:
- a = number of subproblems
- n/b = size of each subproblem
- D(n) = time to divide
- C(n) = time to combine
Analysis of Merge Sort
For n > 1:
- Divide: Compute middle of subarray → D(n) = Θ(1)
- Conquer: Recursively solve 2 subproblems of size n/2 → 2T(n/2)
- Combine: MERGE on n elements → C(n) = Θ(n)
Recurrence:
T(n) = { Θ(1) if n = 1
{ 2T(n/2) + Θ(n) if n > 1
Simplified (using constant c for both base case and per-element work):
T(n) = { c if n = 1
{ 2T(n/2) + cn if n > 1
Solving with a Recursion Tree
- Top level: cost cn
- Level 1: two nodes each costing cn/2 → total cn
- Level 2: four nodes each costing cn/4 → total cn
- Level i: 2ⁱ nodes each costing c(n/2ⁱ) → total cn
- Bottom level: n nodes each costing c → total cn
Number of levels: lg(n) + 1
Total cost: cn · (lg n + 1) = cn·lg(n) + cn = Θ(n lg n)
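For n a power of two, the closed form can be checked numerically against the recurrence itself; c = 1 below is an arbitrary choice:

```python
from math import log2

def T(n, c=1):
    # T(n) = 2·T(n/2) + c·n, with T(1) = c; n must be a power of two.
    return c if n == 1 else 2 * T(n // 2, c) + c * n

# Closed form from the recursion tree: c·n·lg n + c·n.
for n in (1, 2, 8, 1024):
    assert T(n) == n * log2(n) + n
```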
Comparison: Insertion Sort vs. Merge Sort
| | Insertion Sort | Merge Sort |
|---|---|---|
| Best case | Θ(n) | Θ(n lg n) |
| Worst case | Θ(n²) | Θ(n lg n) |
| In-place? | Yes | No (needs auxiliary arrays) |
| Approach | Incremental | Divide-and-conquer |
Since lg(n) grows more slowly than any linear function, for large enough inputs, merge sort with Θ(n lg n) outperforms insertion sort with Θ(n²) in the worst case.
Key Takeaways
- Insertion sort is efficient for small inputs; Θ(n) best case, Θ(n²) worst case
- Loop invariants prove algorithm correctness (initialization, maintenance, termination)
- RAM model provides a framework for analyzing algorithms independent of hardware
- Order of growth (Θ-notation) captures the essential efficiency of an algorithm
- Worst-case analysis is preferred: provides guarantees, often occurs in practice, average case is often similar
- Divide-and-conquer breaks problems into smaller subproblems, solves recursively, combines results
- Merge sort achieves Θ(n lg n) worst-case time by dividing in half and merging in linear time
- Recurrences describe running times of recursive algorithms; recursion trees help solve them