Data Structures Complete Cheat Sheet

Algorithm Complexity & Big-O Notation

Definition: Big-O notation describes the worst-case time complexity of an algorithm as input size grows. It provides an upper bound on the growth rate.

Key Points:

  • O(1) - Constant time: Operation doesn't depend on input size
  • O(log n) - Logarithmic: Divides problem in half each step (binary search)
  • O(n) - Linear: Processes each element once
  • O(n log n) - Linearithmic: Efficient sorting algorithms
  • O(n²) - Quadratic: Nested loops over input
  • O(2ⁿ) - Exponential: Doubles with each input addition
Extracted from past exam

Example: Analyze Code Complexity

int a = 0; 
for (i = 0; i < n; i++) { 
    for (j = 0; j < n; j++) { 
        a = a + i + j; 
    } 
}

Solution:

Step 1: Identify the loops - outer loop runs n times
Step 2: Inner loop also runs n times for each outer iteration
Step 3: Total operations: n × n = n²
Answer: O(n²)
Operation Array Linked List Hash Table BST (Balanced)
Access/Search O(1) / O(n) O(n) O(1) avg O(log n)
Insert O(n) O(1) at head O(1) avg O(log n)
Delete O(n) O(n) O(1) avg O(log n)

Linked Lists

Definition: A linear data structure where elements are stored in nodes, each containing data and a reference to the next node.

Key Points:

  • Singly Linked List: Each node points to the next node only
  • Doubly Linked List: Each node has references to both next and previous nodes
  • Circular Linked List: Last node points back to the first node
  • Dynamic size - can grow/shrink during runtime
  • Efficient insertion/deletion at any position (O(1) if position known)
  • No random access - must traverse from head (O(n))
  • Extra memory overhead for storing pointers
Extracted from past exam

Example: Insert in Sorted List

void insertSorted(int X) {
    // Insert X in a sorted singly linked list
}

Solution:

Step 1: Create new node with value X
Step 2: If list empty or X < head.value, insert at beginning
Step 3: Otherwise, traverse to find position where current.value < X < current.next.value
Step 4: Insert new node at found position
Time Complexity: O(n) worst case
Generated example

Example: Reverse a Linked List

Given list: 1 -> 2 -> 3 -> 4 -> null
Reverse it to: 4 -> 3 -> 2 -> 1 -> null

Solution:

Step 1: Initialize three pointers: prev = null, current = head, next = null
Step 2: While current is not null:
  - Store next = current.next
  - Reverse link: current.next = prev
  - Move pointers: prev = current, current = next
Step 3: Return prev as new head
Time Complexity: O(n), Space: O(1)

Stacks

Definition: A Last-In-First-Out (LIFO) data structure where elements are added and removed from the same end (top).

Key Points:

  • Push: Add element to top - O(1)
  • Pop: Remove element from top - O(1)
  • Peek/Top: View top element without removing - O(1)
  • Applications: Function call stack, undo operations, expression evaluation, backtracking
  • Can be implemented using arrays or linked lists
  • Array implementation: Fixed size but cache-friendly
  • Linked list implementation: Dynamic size but extra pointer overhead
Extracted from past exam

Example: Using Stack to Print Names in Reverse

Problem: Read a list of names and print them in opposite order.

Solution:

Step 1: Create an empty stack
Step 2: Push each name onto the stack as it's read
Step 3: Pop and print each name from stack until empty
Why it works: Stack's LIFO property ensures last name entered is first printed
Generated example

Example: Balance Parentheses

Check if parentheses are balanced: "((a+b)*(c-d))"

Solution:

Step 1: Initialize empty stack
Step 2: For each character:
  - If '(', push to stack
  - If ')', pop from stack (if empty, return false)
Step 3: After processing, stack should be empty
Result: Given string is balanced ✓

Queues

Definition: A First-In-First-Out (FIFO) data structure where elements are added at the rear and removed from the front.

Key Points:

  • Enqueue: Add element to rear - O(1)
  • Dequeue: Remove element from front - O(1)
  • Front: View front element without removing - O(1)
  • Circular Queue: Efficient array implementation using modulo arithmetic
  • Applications: Process scheduling, BFS, printer queues, message buffers
  • Overflow condition in circular queue: (rear + 1) % size == front
  • Empty condition: front == -1 or front == rear (after dequeue)
Extracted from past exam

Example: Reverse Queue Using Stack

Queue: [1, 2, 3, 4, 5] -> Reverse to: [5, 4, 3, 2, 1]

Solution:

Step 1: Create empty stack
Step 2: While queue not empty:
  stack.push(queue.dequeue())
Step 3: While stack not empty:
  queue.enqueue(stack.pop())
Result: Queue is now reversed

Recursion

Definition: A programming technique where a function calls itself to solve smaller instances of the same problem.

Key Points:

  • Base Case: Condition to stop recursion (prevents infinite calls)
  • Recursive Case: Function calls itself with modified parameters
  • Each call adds a frame to the call stack
  • Can lead to stack overflow if too deep
  • Often cleaner but may be less efficient than iteration
  • Common patterns: Divide & conquer, tree/graph traversal, backtracking
  • Time complexity: Analyze using recurrence relations
Extracted from past exam

Example: Analyze Recursive Function

function(int n) { 
    if (n==1) 
        return 0; 
    else
        return 1 + function(n-1); 
}

Solution:

Step 1: Identify base case: n==1, returns 0
Step 2: Recursive case: calls function(n-1)
Step 3: Number of calls: n-1 recursive calls
Step 4: Time complexity: O(n)
Function purpose: Returns n-1

Trees & Binary Trees

Definition: A hierarchical data structure with nodes connected by edges, where each node has at most one parent and any number of children.

Key Points:

  • Root: Top node with no parent
  • Leaf: Node with no children
  • Height: Longest path from root to leaf
  • Depth: Path length from root to a node
  • Binary Tree: Each node has at most 2 children
  • Complete Binary Tree: All levels filled except possibly last
  • Array representation: Node at index i has:
      - Left child at 2i+1
      - Right child at 2i+2
      - Parent at ⌊(i-1)/2⌋
  • Min height: ⌈log₂(n+1)⌉ - 1, Max height: n-1

Tree Traversals

Traversal Order Use Case
Pre-order Root → Left → Right Copy tree, prefix expression
In-order Left → Root → Right Sorted sequence in BST
Post-order Left → Right → Root Delete tree, postfix expression
Level-order Level by level (BFS) Breadth-first search
Extracted from past exam

Example: Check if Binary Tree is Complete

boolean isComplete(Node root, int index, int number_nodes) {
    if (root == null) return true;
    if (index >= number_nodes) return false;
    return isComplete(root.left, 2*index+1, number_nodes)
        && isComplete(root.right, 2*index+2, number_nodes);
}

Solution:

Step 1: Function checks if tree is complete binary tree
Step 2: Uses array indexing property
Step 3: If any node's index ≥ total nodes, tree not complete
Step 4: Recursively checks all nodes

Binary Search Trees (BST)

Definition: A binary tree where for each node, all values in left subtree are smaller and all values in right subtree are larger.

Key Points:

  • Search: O(h) where h is height - O(log n) balanced, O(n) skewed
  • Insert: O(h) - always insert as leaf
  • Delete: O(h) - three cases:
      1. Leaf node: Simply remove
      2. One child: Replace with child
      3. Two children: Replace with inorder successor/predecessor
  • Inorder traversal: Gives sorted sequence
  • Min value: Leftmost node
  • Max value: Rightmost node
  • Can become skewed (unbalanced) leading to O(n) operations
Extracted from past exam

Example: Delete Node with Two Children

Delete node 60 from BST
Original BST has 60 with left child 40 and right child 80

Solution:

Step 1: Node has two children - use inorder successor method
Step 2: Find minimum in right subtree (leftmost of 80's subtree)
Step 3: Replace 60's value with successor's value
Step 4: Delete the successor node (has at most one child)

AVL Trees

Definition: A self-balancing BST where the height difference between left and right subtrees is at most 1 for every node.

Key Points:

  • Balance Factor: height(left) - height(right) ∈ {-1, 0, 1}
  • Height: Always O(log n)
  • Operations: All O(log n) guaranteed
  • Rotations: Used to maintain balance
      - Single Right (LL case)
      - Single Left (RR case)
      - Left-Right (LR case)
      - Right-Left (RL case)
  • Insert: O(log n) - may require one rotation
  • Delete: O(log n) - may require multiple rotations
  • More balanced than regular BST but more complex
Extracted from past exam

Example: AVL Rotation Cost

What is the runtime cost of a double rotation in AVL tree?

Solution:

Step 1: Double rotation = two single rotations
Step 2: Each rotation involves constant pointer updates
Step 3: No traversal needed, just local changes
Answer: O(1)

Heaps & Priority Queues

Definition: A complete binary tree satisfying heap property - parent is greater (max-heap) or smaller (min-heap) than children.

Key Points:

  • Complete Binary Tree: All levels filled except possibly last
  • Array Implementation: Efficient, no pointers needed
  • Insert: O(log n) - add at end, bubble up
  • Extract Max/Min: O(log n) - replace root with last, bubble down
  • Get Max/Min: O(1) - just return root
  • Build Heap: O(n) using bottom-up approach
  • Heap Sort: O(n log n) time, O(1) space
  • Applications: Priority queues, Dijkstra's algorithm, scheduling
Extracted from past exam

Example: Priority Queue for Hospital Emergency

Arrange patients by injury severity using appropriate data structure.

Solution:

Step 1: Use Max-Heap based Priority Queue
Step 2: Priority = severity score
Step 3: Insert patient: O(log n)
Step 4: Get most severe: O(1)
Step 5: Remove after treatment: O(log n)
Generated example

Example: Insert 15 into Max-Heap

Heap array: [20, 18, 10, 12, 9, 8, 3]

Solution:

Step 1: Add 15 at end: [20, 18, 10, 12, 9, 8, 3, 15]
Step 2: Find parent: index 7's parent = (7-1)/2 = 3 (value 12)
Step 3: 15 > 12, swap: [20, 18, 10, 15, 9, 8, 3, 12]
Step 4: Check new parent: index 3's parent = 1 (value 18)
Step 5: 15 < 18, stop. Final: [20, 18, 10, 15, 9, 8, 3, 12]

Sorting Algorithms

Definition: Algorithms that arrange elements in a specific order (ascending/descending).

Algorithm Best Average Worst Space Stable Key Feature
Selection Sort O(n²) O(n²) O(n²) O(1) No n-1 swaps always
Insertion Sort O(n) O(n²) O(n²) O(1) Yes Best for nearly sorted
Merge Sort O(n log n) O(n log n) O(n log n) O(n) Yes Divide & conquer
Quick Sort O(n log n) O(n log n) O(n²) O(log n) No Fastest average case
Heap Sort O(n log n) O(n log n) O(n log n) O(1) No In-place, consistent
Extracted from past exam

Example: Best Algorithm for Sorted Data

Given sorted array {2, 3, 5, 6, 9, 11, 15}, which takes O(n) comparisons?

Solution:

Step 1: Analyze each algorithm on sorted input
Step 2: Insertion Sort: Each element compared once with previous
Step 3: Selection Sort: Still O(n²) comparisons
Step 4: Merge/Heap Sort: Still O(n log n)
Answer: Insertion Sort - O(n) on sorted data
Generated example

Example: Quick Sort Partition

Array: [5, 2, 8, 3, 9, 1, 7] - Pivot = 5

Solution:

Step 1: Initialize i = -1, j = 0
Step 2: Compare each element with pivot (5):
  2 < 5: swap with position 0
  3 < 5: swap with position 1
  1 < 5: swap with position 2
Step 3: Place pivot at correct position
Result: [2, 3, 1, 5, 9, 8, 7]

Hash Tables

Definition: A data structure that maps keys to values using a hash function for fast access.

Key Points:

  • Hash Function: Maps key to array index (h(k) = k mod size)
  • Collision: When two keys hash to same index
  • Load Factor: n/m (items/table size) - affects performance
  • Average Case: O(1) for insert, delete, search
  • Worst Case: O(n) when all keys hash to same index

Collision Resolution

Method Description Pros Cons
Chaining Linked list at each index Simple, handles high load Extra memory for pointers
Linear Probing Check next slot: (h+i) mod m Cache friendly Clustering problem
Quadratic Probing Check: (h+i²) mod m Reduces clustering May not probe all slots
Double Hashing Use second hash function Best distribution More computation
Extracted from past exam

Example: Linear Probing

Insert: 12, 5, 19, 2, 23 into table size 7
Hash function: h(k) = k mod 7

Solution:

Step 1: 12 mod 7 = 5 → slot[5] = 12
Step 2: 5 mod 7 = 5 → collision! Try 6 → slot[6] = 5
Step 3: 19 mod 7 = 5 → collision! Try 6, 7=0 → slot[0] = 19
Step 4: 2 mod 7 = 2 → slot[2] = 2
Step 5: 23 mod 7 = 2 → collision! Try 3 → slot[3] = 23
Total probes: 1+2+3+1+2 = 9

Graphs

Definition: A collection of vertices (nodes) connected by edges, representing relationships between objects.

Key Points:

  • Directed Graph: Edges have direction (A→B)
  • Undirected Graph: Edges bidirectional (A—B)
  • Weighted Graph: Edges have associated costs/weights
  • Connected: Path exists between any two vertices
  • Cycle: Path that starts and ends at same vertex
  • DAG: Directed Acyclic Graph - no cycles

Graph Representations

Representation Space Add Edge Find Edge Best For
Adjacency Matrix O(V²) O(1) O(1) Dense graphs
Adjacency List O(V+E) O(1) O(degree) Sparse graphs
Edge List O(E) O(1) O(E) Simple operations

Graph Algorithms

  • DFS (Depth-First Search): O(V+E) - uses stack/recursion
  • BFS (Breadth-First Search): O(V+E) - uses queue
  • Topological Sort: O(V+E) - ordering of DAG vertices
  • Dijkstra's: O((V+E)log V) - shortest path
  • Applications: Social networks, maps, dependencies, circuits
Extracted from past exam

Example: Choose Graph Storage

Find edge (source-destination) in constant time, may or may not exist.

Solution:

Step 1: Need O(1) edge lookup
Step 2: Adjacency Matrix: matrix[i][j] - O(1) ✓
Step 3: Adjacency List: must search list - O(degree)
Step 4: Edge List: must search all edges - O(E)
Answer: Adjacency Matrix
Generated example

Example: BFS Traversal

Graph: A-B, A-C, B-D, C-D, C-E. Start from A.

Solution:

Step 1: Initialize: queue=[A], visited=[A]
Step 2: Process A: Add neighbors B,C → queue=[B,C]
Step 3: Process B: Add neighbor D → queue=[C,D]
Step 4: Process C: D already queued, add E → queue=[D,E]
Step 5: Process D,E: No new neighbors
BFS Order: A → B → C → D → E