๐Ÿ“š Heaps & Priority Queues

A Comprehensive Study Guide for CS210

1. Introduction to Collections

Fred Brooks: "Show me your code and conceal your data structures, and I shall continue to be mystified. Show me your data structures, and I won't usually need your code; it'll be obvious."

A collection is a data type that stores groups of items. Different collections have different key operations and are implemented with different data structures.

Common Collections

Data Type Key Operations Data Structure
Stack PUSH, POP Linked list, resizing array
Queue ENQUEUE, DEQUEUE Linked list, resizing array
Priority Queue INSERT, DELETE-MAX Binary heap
Symbol Table PUT, GET, DELETE BST, hash table
Set ADD, CONTAINS, DELETE BST, hash table

2. Priority Queue ADT

Definition

A Priority Queue is a collection where you insert and delete items, but unlike a regular queue, you remove the largest (or smallest) item, not the item that was added first or last.

Key Differences from Other Collections

  • Stack: Remove the item most recently added (LIFO)
  • Queue: Remove the item least recently added (FIFO)
  • Randomized Queue: Remove a random item
  • Priority Queue: Remove the largest (or smallest) item

Priority Queue API (Java)

public class MaxPQ<Key extends Comparable<Key>> { // Create an empty priority queue MaxPQ() // Create a priority queue with given keys MaxPQ(Key[] a) // Insert a key into the priority queue void insert(Key v) // Return and remove the largest key Key delMax() // Is the priority queue empty? boolean isEmpty() // Return the largest key (without removing) Key max() // Number of entries in the priority queue int size() }
โš ๏ธ Important: The generic type Key must implement Comparable<Key> so that items can be ordered.

Example Operations

Example: Priority Queue Operations

Operation Argument Return Value Contents (unordered)
insert P - P
insert Q - P Q
insert E - P Q E
removeMax - Q P E
insert X - P E X
insert A - P E X A
insert M - P E X A M
removeMax - X P E M A

3. Priority Queue Applications

Priority queues generalize stacks, queues, and randomized queues, and have numerous real-world applications:

๐ŸŒŸ Real-World Applications

  • Event-driven simulation: Customers in a line, colliding particles
  • Numerical computation: Reducing roundoff error
  • Data compression: Huffman codes
  • Graph searching: Dijkstra's algorithm, Prim's algorithm
  • Number theory: Sum of powers
  • Artificial intelligence: A* search
  • Statistics: Online median in data stream
  • Operating systems: Load balancing, interrupt handling
  • Computer networks: Web cache
  • Discrete optimization: Bin packing, scheduling
  • Spam filtering: Bayesian spam filter

4. Complete Binary Trees

Binary Tree

A binary tree is either empty or a node with links to left and right binary trees (each node has at most 2 children).

Complete Binary Tree

A complete tree is perfectly balanced, except possibly for the bottom level. All levels are completely filled except possibly the last level, which is filled from left to right.

Complete Binary Tree Example (N = 16, height = 4)

                    โ—‹
                   / \
                  โ—‹   โ—‹
                 / \ / \
                โ—‹ โ—‹ โ—‹ โ—‹
               /\\/\\/\\/\
              โ—‹โ—‹โ—‹โ—‹โ—‹โ—‹โ—‹โ—‹โ—‹
                    

๐Ÿ“ Important Property

Height of a complete tree with N nodes is โŒŠlg NโŒ‹

The height increases only when N is a power of 2.

This logarithmic height is what makes heaps efficient!

Why Complete Binary Trees?

โœ… Advantages

  • Guaranteed logarithmic height
  • Can be stored efficiently in an array
  • No explicit links needed
  • Easy to navigate using array indices

โš ๏ธ Constraints

  • Must maintain completeness property
  • Insertions must go at the end
  • Deletions require restructuring

5. Heap Definition & Properties

Binary Heap

A binary heap is an array representation of a heap-ordered complete binary tree.

Two Key Properties

1. Heap-Order Property

For every internal node v other than the root:

key(v) โ‰ฅ key(parent(v))

This means: Parent's key is no smaller than children's keys

Or equivalently: Each node's-key-must-be-less-than-or-equal-to-its-parent's key.

2. Complete Binary Tree Property

The tree must be a complete binary tree (as defined above).

Types of Heaps

Heap Type Property Root Contains
Max Heap Parent โ‰ฅ Children Maximum element
Min Heap Parent โ‰ค Children Minimum element
๐Ÿ“ Note: By default, when we say "heap" we typically mean a max heap. This study guide focuses on max heaps, but the concepts apply equally to min heaps with reversed comparisons.

Array Representation

Array Indices for Navigation

In a heap stored in array a[] starting at index 1:

  • Root: a[1]
  • Parent of node at k: a[k/2]
  • Left child of node at k: a[2k]
  • Right child of node at k: a[2k+1]

Heap Visualization (Max Heap)

Tree representation:

           T (1)
          / \
        S(2) R(3)
       / \   / \
     P(4)N(5)O(6)A(7)
     /\  /\
   E I  H G
   8 9 10 11
                    

Array representation:

0
-
1
T
2
S
3
R
4
P
5
N
6
O
7
A
8
E
9
I
10
H
11
G

Note: Index 0 is not used (we start at index 1)

๐ŸŽฏ Key Properties

  • The largest key is always at a[1] (the root)
  • We can navigate the tree using simple arithmetic on array indices
  • No explicit links (pointers) are needed!
  • Space efficient: only need array of size N+1

6. Heap Operations

6.1 Insert Operation (Swim Up)

Insert Algorithm

  1. Add the new key at the end of the array (position N+1)
  2. The new node becomes the new "last node" of the complete tree
  3. Swim up: Compare with parent and swap if necessary
  4. Repeat until heap order is restored

Swim Helper Method

private void swim(int k) { // While not at root AND child > parent while (k > 1 && less(k/2, k)) { exch(k, k/2); // Swap with parent k = k/2; // Move up to parent } }

Insert Method

public void insert(Key x) { pq[++N] = x; // Add at end and increment size swim(N); // Restore heap order }

Example: Insert 'S' into heap

Before:              After adding:         After swim:
     T                    T                    S
    / \                  / \                  / \
   P   R                P   R                T   R
  / \                  / \ /                / \ /
 N   H                N  H S              N  H P
    
Heap: T P R N H    โ†’ T P R N H S  โ†’  S T R N H P
                     (violates)       (restored)
                    

โฑ๏ธ Time Complexity of Insert

At most 1 + lg N compares

Why? In the worst case, we swim from the bottom to the top of the tree, and the height is โŒŠlg NโŒ‹.


6.2 Delete Maximum Operation (Sink Down)

Delete Max Algorithm

  1. Save the root value (maximum) to return later
  2. Exchange root with the last node (at position N)
  3. Decrement N (remove last position)
  4. Sink down: Compare with children and swap with larger child if necessary
  5. Repeat until heap order is restored
  6. Return the saved maximum value

Sink Helper Method

private void sink(int k) { while (2*k <= N) { // While has at least one child int j = 2*k; // Left child // Choose larger child if (j < N && less(j, j+1)) j++; // If parent >= larger child, done if (!less(k, j)) break; exch(k, j); // Swap with larger child k = j; // Move down } }

Delete Max Method

public Key delMax() { Key max = pq[1]; // Get root (maximum) exch(1, N--); // Exchange with last, decrement N sink(1); // Restore heap order pq[N+1] = null; // Prevent loitering return max; }

Example: Delete maximum from heap

Before:              After exchange:      After sink:
     T                    H                    S
    / \                  / \                  / \
   S   R                S   R                P   R
  / \ / \              / \ /                / \ /
 P  N O  A            P  N O  A            N  H O  A

Remove T  โ†’  Exchange with H  โ†’  Sink H down
(return T)   (violates order)    (restored)
                    

โ“ Why swap with the larger child?

If we swapped with the smaller child, we might end up with a parent smaller than its other child, violating heap order. By swapping with the larger child, we ensure both children will be smaller than the promoted parent.

โฑ๏ธ Time Complexity of Delete Max

At most 2 lg N compares

Why 2? At each level, we compare the two children (1 compare) and then compare with parent (1 more compare). Height is โŒŠlg NโŒ‹.

๐Ÿ’ก "Peter Principle": In swim, a node is promoted to its level of incompetence (can't go higher). In sink, there's a "power struggle" where the better subordinate gets promoted!

7. Complete Java Implementation

public class MaxPQ<Key extends Comparable<Key>> { private Key[] pq; // Array to store heap private int N; // Number of elements // Constructor public MaxPQ(int capacity) { pq = (Key[]) new Comparable[capacity + 1]; } // Is the priority queue empty? public boolean isEmpty() { return N == 0; } // Number of elements public int size() { return N; } // Insert a key public void insert(Key key) { pq[++N] = key; swim(N); } // Delete and return maximum public Key delMax() { Key max = pq[1]; exch(1, N--); sink(1); pq[N+1] = null; // Prevent loitering return max; } /******************************************* * Helper functions for heap operations *******************************************/ private void swim(int k) { while (k > 1 && less(k/2, k)) { exch(k, k/2); k = k/2; } } private void sink(int k) { while (2*k <= N) { int j = 2*k; if (j < N && less(j, j+1)) j++; if (!less(k, j)) break; exch(k, j); k = j; } } /******************************************* * Helper functions for comparisons and swaps *******************************************/ private boolean less(int i, int j) { return pq[i].compareTo(pq[j]) < 0; } private void exch(int i, int j) { Key t = pq[i]; pq[i] = pq[j]; pq[j] = t; } }
๐Ÿ“ Implementation Notes:
  • Array indices start at 1 (index 0 is not used)
  • For simplicity, this uses fixed capacity (could be made resizing)
  • Generic type Key must implement Comparable
  • Setting pq[N+1] = null prevents loitering (allows garbage collection)

8. Time Complexity Analysis

Elementary Implementations vs Binary Heap

Implementation Insert Delete Max Get Max
Unordered Array 1 N N
Ordered Array N 1 1
Binary Heap log N log N 1

๐ŸŽฏ Why Binary Heap is Optimal

  • Logarithmic operations: Both insert and delete max are O(log N)
  • Constant time max: Getting the maximum is O(1)
  • Space efficient: Only O(N) space needed
  • Simple implementation: Just an array with two helper functions

Detailed Analysis

Insert Operation

  • Best case: O(1) - element already in correct position
  • Worst case: O(log N) - swim from bottom to top
  • Average case: O(log N)

Delete Max Operation

  • Best case: O(log N) - still need to check children
  • Worst case: O(log N) - sink from top to bottom
  • Average case: O(log N)
๐Ÿ’ก Why log N? Because the height of a complete binary tree with N nodes is โŒŠlogโ‚‚ NโŒ‹, and in the worst case, we traverse from one end to the other.

9. Heap Sort

Heapsort Algorithm

Heapsort is a comparison-based sorting algorithm that uses a binary heap to sort an array in O(N log N) time using O(1) extra space.

Algorithm Overview

  1. Build heap: Rearrange array into a heap
  2. Sortdown: Repeatedly remove the maximum (root) and place at end

Using Priority Queue for Sorting

// Simple approach using priority queue public static void sort(Comparable[] a) { int N = a.length; MaxPQ pq = new MaxPQ(N); // Phase 1: Insert all elements for (int i = 0; i < N; i++) pq.insert(a[i]); // Phase 2: Remove in sorted order for (int i = N-1; i >= 0; i--) a[i] = pq.delMax(); }

Time Complexity

โฑ๏ธ Runtime Analysis

  • Building heap: O(N log N) with repeated inserts
  • Removing all elements: O(N log N)
  • Total: O(N log N)
  • Space: O(N) for the heap

Comparison with Other Sorts

Algorithm Time Complexity Space Stable?
Insertion Sort O(Nยฒ) O(1) โœ… Yes
Selection Sort O(Nยฒ) O(1) โŒ No
Merge Sort O(N log N) O(N) โœ… Yes
Heap Sort O(N log N) O(N) โŒ No
Quick Sort O(N log N)* O(log N) โŒ No

* Average case; worst case is O(Nยฒ)

โœ… Advantages of Heapsort

  • Guaranteed O(N log N) performance
  • In-place sorting (no extra array needed)
  • No worst-case quadratic behavior
  • Simple implementation

โš ๏ธ Disadvantages

  • Not stable
  • Poor cache performance
  • Slower than quicksort in practice
  • Not adaptive (doesn't benefit from partial order)

10. Practice Problems & Key Concepts

Practice Problem 1: Build a Max Heap

Problem

Insert the following elements into an empty max heap in order: 2, 5, 16, 4, 10, 23, 39, 18, 26, 15

Solution Steps

The solution requires building up the heap one element at a time, swimming up after each insertion. The final heap should be:

           39
          /  \
        26    23
       / \   / \
     18  15 16  5
     / \ /
    2  4 10
                        

Array: [-, 39, 26, 23, 18, 15, 16, 5, 2, 4, 10]

Practice Problem 2: Delete Operations

Problem

From the heap above, perform three deleteMax() operations. Show the heap after each deletion.

Hint

Remember: Exchange root with last element, remove last, then sink down.

Practice Problem 3: Is this a Heap?

Problem

Determine if the following binary tree is a valid max heap:

        8
       / \
      30  12
     / \  / \
    60 40 50 60
                    

Answer

NO - The smallest element in a heap should always be found in the root node. Here, 8 is the root but it's smaller than its children (30 and 12), violating the heap-order property.

๐ŸŽฏ Key Concepts to Remember

  • Complete Binary Tree: All levels filled except possibly last, which fills left-to-right
  • Heap-Order Property: Parent โ‰ฅ Children (max heap) or Parent โ‰ค Children (min heap)
  • Array Representation: Parent at k/2, children at 2k and 2k+1
  • Root is at index 1: Makes math simpler (parent = k/2)
  • Maximum always at root: O(1) access to max element
  • Height is โŒŠlog NโŒ‹: Guarantees logarithmic operations
  • Swim up: Used after insert, compares with parent
  • Sink down: Used after delete max, compares with larger child
  • Both operations: O(log N) time complexity
  • No explicit links needed: Array arithmetic handles navigation

๐Ÿ“š Study Tips

  • Draw it out: Visualize heap operations by drawing trees
  • Trace the code: Follow swim and sink operations step-by-step
  • Understand the why: Know why we swap with larger child in sink
  • Practice array indexing: Get comfortable with k/2, 2k, 2k+1
  • Memorize complexities: Insert O(log N), Delete Max O(log N), Get Max O(1)
  • Compare implementations: Understand trade-offs vs other data structures

Common Mistakes to Avoid

  • โŒ Starting array at index 0 (makes parent/child math harder)
  • โŒ Forgetting to set deleted elements to null (memory leak)
  • โŒ Swapping with smaller child in sink (violates heap order)
  • โŒ Not checking array bounds in sink
  • โŒ Confusing max heap with min heap properties
  • โŒ Thinking heap is sorted (it's only partially ordered)

๐ŸŽ“ Practice Questions

Question 1: Heap vs. Sorted Array for Priority Queue

Q: Compare using an unordered array, an ordered array, and a binary heap to implement a priority queue. When would you choose each?

A: - Unordered array: Insert is O(1) (append), but deleteMax is O(N) (must scan). Use only when insertions vastly outnumber deletions.
- Ordered array: DeleteMax is O(1) (pop from end), but insert is O(N) (shift elements). Use when deletions vastly outnumber insertions and N is small.
- Binary heap: Both insert and deleteMax are O(log N). This is the standard choice for general-purpose priority queues when both operations are frequent, as it balances the trade-off optimally.

Question 2: Array Navigation Arithmetic

Q: In a 1-indexed heap array, node k has its parent, left child, and right child at which indices? Verify with the node at index 5.

A: For a node at index k (using 1-based indexing):
- Parent: k/2 (integer division)
- Left child: 2k
- Right child: 2k + 1
For k = 5: parent is at index 2 (5/2 = 2), left child is at index 10, right child is at index 11. This elegant arithmetic is why index 0 is deliberately left unused โ€” it makes the formulas work cleanly with integer division.

Question 3: Trace swim() After Insert

Q: Given the max-heap array [-, T, S, R, P, N, O, A, E, I, H, G] (1-indexed), insert key 'U'. Trace each swap in the swim operation.

A: Append U at index 12. Array becomes [-, T, S, R, P, N, O, A, E, I, H, G, U].
- k=12, parent=6 (O). U > O โ†’ swap. Array: [..., U at 6, ..., O at 12]. k=6.
- k=6, parent=3 (R). U > R โ†’ swap. Array: [..., U at 3, ..., R at 6]. k=3.
- k=3, parent=1 (T). U > T โ†’ swap. Array: [-, S, ..., U at 1, ..., T at 3]. k=1.
- k=1, root reached. Stop.
Total: 3 swaps, 3 compares. U correctly rises to the root as the new maximum.

Question 4: Why Sink Swaps with the Larger Child

Q: During sink(), why must we swap with the LARGER child rather than just the first child that is greater than the current node?

A: The heap-order property requires that a parent is greater than or equal to BOTH its children. If we swap the current node with the smaller child, the promoted (smaller) child might still be less than the other child, violating the heap property at that position. By swapping with the LARGER child, we promote the biggest value upward, which guarantees it is at least as large as its sibling. This maintains the heap property at the swapped position and allows the sinking to continue correctly. In code: if (j < N && less(j, j+1)) j++; selects the larger child before comparing.

Question 5: Is This Array a Valid Max-Heap?

Q: Is the array [-, 90, 80, 85, 70, 60, 75, 50, 20, 30, 40, 55, 65] a valid max-heap? If not, identify the first violation.

A: Check every parent-child pair. Parent at index 5 is 60. Its children are at 10 (40) and 11 (55). 60 > 55 and 60 > 40 โ€” OK. Parent at index 6 is 75. Its children are at 12 (65) and 13 (none). 75 > 65 โ€” OK. Check index 2 (80): children at 4 (70) and 5 (60) โ€” OK. All checks pass. Yes, this is a valid max-heap. The largest element (90) is at the root (index 1), and every parent is greater than or equal to both its children.

Question 6: Exam Trap โ€” Heap Is NOT Sorted

Q: After building a max-heap from the array [3, 1, 4, 1, 5, 9, 2, 6], a classmate claims the resulting heap array will be in sorted order. Are they correct? Explain.

A: The classmate is wrong. A heap guarantees only the partial order property: every parent is โ‰ฅ its children. It does NOT guarantee a total sorted order. For example, a left child at index 4 could be smaller than the right child at index 5 โ€” there is no relationship between siblings. The heap tells you the maximum (root), and you can extract elements in sorted order by repeatedly calling deleteMax (which is essentially heapsort), but the raw heap array itself is not sorted. A sorted array satisfies stricter constraints than a heap.

Question 7: heapify() in O(n) vs. O(n log n)

Q: Building a heap by inserting N elements one at a time takes O(N log N). But the bottom-up heapify (sink from the middle down to root) takes only O(N). Intuitively, why is bottom-up faster?

A: When inserting one by one, each insertion swims from the bottom, potentially traversing the full height O(log N), and doing this N times gives O(N log N). In bottom-up heapify, each node is sunk rather than swum. Nodes near the bottom of the tree (the majority) have very short sink paths โ€” a node at height h can sink at most h levels. Most nodes are at height 0 (leaves) and do no work at all. Summing the work across all levels: there are N/2 nodes at height 0, N/4 at height 1, N/8 at height 2, etc. The total work is N/2ยท0 + N/4ยท1 + N/8ยท2 + ... = O(N) by geometric series analysis.

Question 8: Min-Heap from Max-Heap Code

Q: The implementations shown use a max-heap. What is the minimal change needed to make a min-heap? What does the root of a min-heap contain?

A: The only change needed is to reverse the comparison in the less() helper โ€” replace pq[i].compareTo(pq[j]) < 0 with pq[i].compareTo(pq[j]) > 0, effectively making "less" mean "greater." Alternatively, pass a reversed Comparator at construction. The root of a min-heap contains the minimum element. Min-heaps are used in Dijkstra's shortest-path algorithm and Prim's MST algorithm, where you always want to process the smallest-priority element next.