Data Structures Complete Cheat Sheet
Algorithm Complexity & Big-O Notation
Definition: Big-O notation describes the worst-case time complexity of an algorithm as input size grows. It provides an upper bound on the growth rate.
Key Points:
- O(1) - Constant time: Operation doesn't depend on input size
- O(log n) - Logarithmic: Divides problem in half each step (binary search)
- O(n) - Linear: Processes each element once
- O(n log n) - Linearithmic: Efficient sorting algorithms
- O(n²) - Quadratic: Nested loops over input
- O(2ⁿ) - Exponential: Work doubles with each additional input element
Example: Analyze Code Complexity
int a = 0;
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        a = a + i + j;
    }
}
Solution: The statement inside the nested loops executes n × n times, so the complexity is O(n²).
Common Operation Complexities
| Operation | Array | Linked List | Hash Table | BST (Balanced) |
|---|---|---|---|---|
| Access/Search | O(1) / O(n) | O(n) | O(1) avg | O(log n) |
| Insert | O(n) | O(1) at head | O(1) avg | O(log n) |
| Delete | O(n) | O(n) | O(1) avg | O(log n) |
Linked Lists
Definition: A linear data structure where elements are stored in nodes, each containing data and a reference to the next node.
Key Points:
- Singly Linked List: Each node points to the next node only
- Doubly Linked List: Each node has references to both next and previous nodes
- Circular Linked List: Last node points back to the first node
- Dynamic size - can grow/shrink during runtime
- Efficient insertion/deletion - O(1) once the position (predecessor node) is known
- No random access - must traverse from head (O(n))
- Extra memory overhead for storing pointers
Example: Insert in Sorted List
void insertSorted(int X) {
// Insert X in a sorted singly linked list
}
Solution:
- Traverse from the head until reaching the last node whose value is less than X
- Link the new node between that node and its successor
- Handle the special case where X is smaller than the head (new node becomes the head)
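The steps above can be sketched as follows, using a hypothetical Node class with int data and next fields:

```java
public class SortedList {
    static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    // Insert x into an ascending sorted list; returns the (possibly new) head.
    static Node insertSorted(Node head, int x) {
        Node node = new Node(x);
        // New smallest element becomes the new head.
        if (head == null || x < head.data) {
            node.next = head;
            return node;
        }
        // Walk to the last node whose value is still below x.
        Node cur = head;
        while (cur.next != null && cur.next.data < x) {
            cur = cur.next;
        }
        node.next = cur.next;
        cur.next = node;
        return head;
    }
}
```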
Example: Reverse a Linked List
Given list: 1 -> 2 -> 3 -> 4 -> null. Reverse it to: 4 -> 3 -> 2 -> 1 -> null.
Solution:
- Store next = current.next
- Reverse link: current.next = prev
- Move pointers: prev = current, current = next
- Repeat until current == null; prev is then the new head
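The pointer dance above can be sketched as follows (Node is a hypothetical singly linked node class):

```java
public class ListReverse {
    static class Node {
        int data;
        Node next;
        Node(int data) { this.data = data; }
    }

    // Iteratively reverse the list and return the new head.
    static Node reverse(Node head) {
        Node prev = null;
        Node current = head;
        while (current != null) {
            Node next = current.next;  // store next
            current.next = prev;       // reverse link
            prev = current;            // move pointers
            current = next;
        }
        return prev;  // prev is the new head
    }
}
```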
Stacks
Definition: A Last-In-First-Out (LIFO) data structure where elements are added and removed from the same end (top).
Key Points:
- Push: Add element to top - O(1)
- Pop: Remove element from top - O(1)
- Peek/Top: View top element without removing - O(1)
- Applications: Function call stack, undo operations, expression evaluation, backtracking
- Can be implemented using arrays or linked lists
- Array implementation: Fixed size but cache-friendly
- Linked list implementation: Dynamic size but extra pointer overhead
Example: Using Stack to Print Names in Reverse
Problem: Read a list of names and print them in opposite order.
Solution: Push each name onto a stack as it is read; when input ends, pop and print until the stack is empty. The LIFO order reverses the input order.
Example: Balance Parentheses
Check if parentheses are balanced: "((a+b)*(c-d))"
Solution:
- If '(', push to stack
- If ')', pop from stack (if empty, return false)
- At the end, the stack must be empty (a leftover '(' means unbalanced)
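The check above, sketched with Java's ArrayDeque used as a stack:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class ParenChecker {
    // Returns true when every '(' has a matching ')'.
    static boolean isBalanced(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            if (c == '(') {
                stack.push(c);
            } else if (c == ')') {
                if (stack.isEmpty()) return false;  // ')' with no match
                stack.pop();
            }
        }
        return stack.isEmpty();  // leftover '(' means unbalanced
    }
}
```

For input with multiple bracket types, push the expected closer instead and compare it on pop.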
Queues
Definition: A First-In-First-Out (FIFO) data structure where elements are added at the rear and removed from the front.
Key Points:
- Enqueue: Add element to rear - O(1)
- Dequeue: Remove element from front - O(1)
- Front: View front element without removing - O(1)
- Circular Queue: Efficient array implementation using modulo arithmetic
- Applications: Process scheduling, BFS, printer queues, message buffers
- Overflow condition in circular queue: (rear + 1) % size == front
- Empty condition: front == -1 or front == rear (after dequeue)
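A minimal circular-queue sketch, using the common convention that keeps one array slot unused so that full is (rear + 1) % size == front and empty is front == rear:

```java
public class CircularQueue {
    private final int[] data;
    private int front = 0, rear = 0;  // rear points at the next free slot

    CircularQueue(int capacity) {
        data = new int[capacity + 1];  // one slot stays unused
    }

    boolean isEmpty() { return front == rear; }

    boolean isFull() { return (rear + 1) % data.length == front; }

    boolean enqueue(int x) {
        if (isFull()) return false;       // overflow
        data[rear] = x;
        rear = (rear + 1) % data.length;  // wrap around with modulo
        return true;
    }

    Integer dequeue() {
        if (isEmpty()) return null;       // underflow
        int x = data[front];
        front = (front + 1) % data.length;
        return x;
    }
}
```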
Example: Reverse Queue Using Stack
Queue: [1, 2, 3, 4, 5] -> Reverse to: [5, 4, 3, 2, 1]
Solution:
while (!queue.isEmpty())
    stack.push(queue.dequeue());
while (!stack.isEmpty())
    queue.enqueue(stack.pop());
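A runnable version of the same idea, using java.util.Queue and a Deque as the stack:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Queue;

public class QueueReverse {
    // Drain the queue into a stack, then pop everything back:
    // the last element dequeued is the first re-enqueued.
    static void reverse(Queue<Integer> queue) {
        Deque<Integer> stack = new ArrayDeque<>();
        while (!queue.isEmpty()) stack.push(queue.remove());
        while (!stack.isEmpty()) queue.add(stack.pop());
    }
}
```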
Recursion
Definition: A programming technique where a function calls itself to solve smaller instances of the same problem.
Key Points:
- Base Case: Condition to stop recursion (prevents infinite calls)
- Recursive Case: Function calls itself with modified parameters
- Each call adds a frame to the call stack
- Can lead to stack overflow if too deep
- Often cleaner but may be less efficient than iteration
- Common patterns: Divide & conquer, tree/graph traversal, backtracking
- Time complexity: Analyze using recurrence relations
Example: Analyze Recursive Function
int function(int n) {
    if (n == 1)
        return 0;
    else
        return 1 + function(n - 1);
}
Solution: The recurrence is T(n) = T(n-1) + O(1) with T(1) = O(1), which solves to O(n). The function returns n - 1.
Trees & Binary Trees
Definition: A hierarchical data structure with nodes connected by edges, where each node has at most one parent and any number of children.
Key Points:
- Root: Top node with no parent
- Leaf: Node with no children
- Height: Longest path from root to leaf
- Depth: Path length from root to a node
- Binary Tree: Each node has at most 2 children
- Complete Binary Tree: All levels filled except possibly last
- Array representation: Node at index i has:
- Left child at 2i+1
- Right child at 2i+2
- Parent at ⌊(i-1)/2⌋
- Min height: ⌈log₂(n+1)⌉ - 1, Max height: n-1
Tree Traversals
| Traversal | Order | Use Case |
|---|---|---|
| Pre-order | Root → Left → Right | Copy tree, prefix expression |
| In-order | Left → Root → Right | Sorted sequence in BST |
| Post-order | Left → Right → Root | Delete tree, postfix expression |
| Level-order | Level by level (BFS) | Breadth-first search |
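Two of the traversals from the table, sketched recursively with a hypothetical Node class:

```java
import java.util.ArrayList;
import java.util.List;

public class Traversals {
    static class Node {
        int val;
        Node left, right;
        Node(int val) { this.val = val; }
    }

    // In-order: Left -> Root -> Right (yields sorted order for a BST).
    static void inorder(Node n, List<Integer> out) {
        if (n == null) return;
        inorder(n.left, out);
        out.add(n.val);
        inorder(n.right, out);
    }

    // Pre-order: Root -> Left -> Right (useful for copying a tree).
    static void preorder(Node n, List<Integer> out) {
        if (n == null) return;
        out.add(n.val);
        preorder(n.left, out);
        preorder(n.right, out);
    }
}
```

Post-order follows the same pattern with the root visited last; level-order instead uses a queue.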
Example: Check if Binary Tree is Complete
boolean isComplete(Node root, int index, int number_nodes) {
    if (root == null) return true;
    if (index >= number_nodes) return false;
    return isComplete(root.left, 2*index + 1, number_nodes)
        && isComplete(root.right, 2*index + 2, number_nodes);
}
Solution: In a complete binary tree stored in array form, every node's index must be less than the total node count. The recursion checks this for every node, starting with index 0 for the root, in O(n) time.
Binary Search Trees (BST)
Definition: A binary tree where for each node, all values in left subtree are smaller and all values in right subtree are larger.
Key Points:
- Search: O(h) where h is height - O(log n) balanced, O(n) skewed
- Insert: O(h) - always insert as leaf
- Delete: O(h) - three cases:
1. Leaf node: Simply remove
2. One child: Replace with child
  3. Two children: Replace with inorder successor/predecessor
- Inorder traversal: Gives sorted sequence
- Min value: Leftmost node
- Max value: Rightmost node
- Can become skewed (unbalanced) leading to O(n) operations
Example: Delete Node with Two Children
Delete node 60 from a BST in which 60 has left child 40 and right child 80.
Solution: Replace 60 with its inorder successor, the leftmost node of the right subtree (here the subtree rooted at 80), then delete that successor from its original position. Using the inorder predecessor (rightmost node of the left subtree) works symmetrically.
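All three deletion cases, sketched with a hypothetical Node class and deletion by inorder successor:

```java
public class BstDelete {
    static class Node {
        int val;
        Node left, right;
        Node(int val) { this.val = val; }
    }

    static Node delete(Node root, int key) {
        if (root == null) return null;
        if (key < root.val) root.left = delete(root.left, key);
        else if (key > root.val) root.right = delete(root.right, key);
        else {
            // Cases 1 and 2: zero or one child.
            if (root.left == null) return root.right;
            if (root.right == null) return root.left;
            // Case 3: copy the inorder successor (leftmost of right subtree),
            // then delete that successor from the right subtree.
            Node succ = root.right;
            while (succ.left != null) succ = succ.left;
            root.val = succ.val;
            root.right = delete(root.right, succ.val);
        }
        return root;
    }
}
```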
AVL Trees
Definition: A self-balancing BST where the height difference between left and right subtrees is at most 1 for every node.
Key Points:
- Balance Factor: height(left) - height(right) ∈ {-1, 0, 1}
- Height: Always O(log n)
- Operations: All O(log n) guaranteed
- Rotations: Used to maintain balance
- Single Right (LL case)
- Single Left (RR case)
- Left-Right (LR case)
- Right-Left (RL case)
- Insert: O(log n) - may require one rotation
- Delete: O(log n) - may require multiple rotations
- More balanced than regular BST but more complex
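A single right rotation (the LL case), sketched with a hypothetical height-tracking Node class; the other rotations are mirror images or pairs of these:

```java
public class AvlRotation {
    static class Node {
        int val, height;
        Node left, right;
        Node(int val) { this.val = val; this.height = 1; }
    }

    static int height(Node n) { return n == null ? 0 : n.height; }

    static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Single right rotation for the LL case: y's left child x becomes the root.
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;   // x's right subtree moves under y
        x.right = y;
        update(y);          // y is now below x, so update it first
        update(x);
        return x;           // new subtree root
    }
}
```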
Example: AVL Rotation Cost
What is the runtime cost of a double rotation in AVL tree?
Solution: O(1). A double rotation is two single rotations, each of which updates only a constant number of pointers. The O(log n) cost of an AVL insert comes from finding the insertion point, not from the rotation itself.
Heaps & Priority Queues
Definition: A complete binary tree satisfying heap property - parent is greater (max-heap) or smaller (min-heap) than children.
Key Points:
- Complete Binary Tree: All levels filled except possibly last
- Array Implementation: Efficient, no pointers needed
- Insert: O(log n) - add at end, bubble up
- Extract Max/Min: O(log n) - replace root with last, bubble down
- Get Max/Min: O(1) - just return root
- Build Heap: O(n) using bottom-up approach
- Heap Sort: O(n log n) time, O(1) space
- Applications: Priority queues, Dijkstra's algorithm, scheduling
Example: Priority Queue for Hospital Emergency
Arrange patients by injury severity using appropriate data structure.
Solution: Use a priority queue implemented as a max-heap keyed on injury severity. The most severe patient is always at the root and can be extracted in O(log n), and new arrivals are inserted in O(log n).
Example: Insert 15 into Max-Heap
Heap array: [20, 18, 10, 12, 9, 8, 3]
Solution: Append 15 at index 7; its parent is 12 at index 3. Since 15 > 12, swap them. The new parent is 18 at index 1; since 15 < 18, stop. Final array: [20, 18, 10, 15, 9, 8, 3, 12].
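The bubble-up step above, sketched on a List-backed max-heap:

```java
import java.util.List;

public class MaxHeap {
    // Append x, then bubble it up while it exceeds its parent.
    static void insert(List<Integer> heap, int x) {
        heap.add(x);
        int i = heap.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;               // array-parent formula
            if (heap.get(i) <= heap.get(parent)) break;  // heap property holds
            // Swap child and parent, continue from the parent's index.
            int tmp = heap.get(i);
            heap.set(i, heap.get(parent));
            heap.set(parent, tmp);
            i = parent;
        }
    }
}
```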
Sorting Algorithms
Definition: Algorithms that arrange elements in a specific order (ascending/descending).
| Algorithm | Best | Average | Worst | Space | Stable | Key Feature |
|---|---|---|---|---|---|---|
| Selection Sort | O(n²) | O(n²) | O(n²) | O(1) | No | n-1 swaps always |
| Insertion Sort | O(n) | O(n²) | O(n²) | O(1) | Yes | Best for nearly sorted |
| Merge Sort | O(n log n) | O(n log n) | O(n log n) | O(n) | Yes | Divide & conquer |
| Quick Sort | O(n log n) | O(n log n) | O(n²) | O(log n) | No | Fastest average case |
| Heap Sort | O(n log n) | O(n log n) | O(n log n) | O(1) | No | In-place, consistent |
Example: Best Algorithm for Sorted Data
Given the sorted array {2, 3, 5, 6, 9, 11, 15}, which sorting algorithm takes O(n) comparisons?
Solution: Insertion sort. On already-sorted input, each element is compared once with its predecessor and no shifting occurs, giving n - 1 comparisons.
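The early-exit comparison that gives insertion sort its O(n) best case:

```java
public class InsertionSort {
    // On already-sorted input the inner while-condition fails immediately
    // every time, so the algorithm makes only n - 1 comparisons: O(n).
    static void sort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {  // shift larger elements right
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }
}
```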
Example: Quick Sort Partition
Array: [5, 2, 8, 3, 9, 1, 7] - Pivot = 5 (first element)
Solution (Lomuto-style scan, collecting smaller elements just after the pivot):
- 2 < 5: already on the small side (boundary moves to index 1)
- 3 < 5: swap into index 2 -> [5, 2, 3, 8, 9, 1, 7]
- 1 < 5: swap into index 3 -> [5, 2, 3, 1, 9, 8, 7]
- Swap the pivot with the last small element -> [1, 2, 3, 5, 9, 8, 7], pivot fixed at index 3
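A partition sketch matching this trace (Lomuto scheme with the first element as pivot; classic quicksort texts often pivot on the last element instead):

```java
public class Partition {
    // Collect elements smaller than the pivot at indices lo+1..i,
    // then swap the pivot into its final position. Returns that position.
    static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int i = lo;  // boundary of the "smaller than pivot" region
        for (int j = lo + 1; j <= hi; j++) {
            if (a[j] < pivot) {
                i++;
                int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
            }
        }
        int tmp = a[lo]; a[lo] = a[i]; a[i] = tmp;  // pivot into place
        return i;
    }
}
```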
Hash Tables
Definition: A data structure that maps keys to values using a hash function for fast access.
Key Points:
- Hash Function: Maps key to array index (h(k) = k mod size)
- Collision: When two keys hash to same index
- Load Factor: n/m (items/table size) - affects performance
- Average Case: O(1) for insert, delete, search
- Worst Case: O(n) when all keys hash to same index
Collision Resolution
| Method | Description | Pros | Cons |
|---|---|---|---|
| Chaining | Linked list at each index | Simple, handles high load | Extra memory for pointers |
| Linear Probing | Check next slot: (h+i) mod m | Cache friendly | Clustering problem |
| Quadratic Probing | Check: (h+i²) mod m | Reduces clustering | May not probe all slots |
| Double Hashing | Use second hash function | Best distribution | More computation |
Example: Linear Probing
Insert 12, 5, 19, 2, 23 into a table of size 7 with hash function h(k) = k mod 7.
Solution:
- 12 mod 7 = 5 -> slot 5
- 5 mod 7 = 5 -> collision, probe to slot 6
- 19 mod 7 = 5 -> collisions at 5 and 6, wrap around to slot 0
- 2 mod 7 = 2 -> slot 2
- 23 mod 7 = 2 -> collision, probe to slot 3
Final table: [19, _, 2, 23, _, 12, 5]
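The probe loop above, sketched as a small helper (null marks an empty slot; deletion and table-full handling are omitted):

```java
public class LinearProbing {
    // Insert keys using h(k) = k mod size, probing (h + i) mod size on collision.
    static Integer[] build(int size, int... keys) {
        Integer[] table = new Integer[size];
        for (int k : keys) {
            int slot = k % size;
            while (table[slot] != null) {
                slot = (slot + 1) % size;  // linear probe to the next slot
            }
            table[slot] = k;
        }
        return table;
    }
}
```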
Graphs
Definition: A collection of vertices (nodes) connected by edges, representing relationships between objects.
Key Points:
- Directed Graph: Edges have direction (A→B)
- Undirected Graph: Edges bidirectional (A—B)
- Weighted Graph: Edges have associated costs/weights
- Connected: Path exists between any two vertices
- Cycle: Path that starts and ends at same vertex
- DAG: Directed Acyclic Graph - no cycles
Graph Representations
| Representation | Space | Add Edge | Find Edge | Best For |
|---|---|---|---|---|
| Adjacency Matrix | O(V²) | O(1) | O(1) | Dense graphs |
| Adjacency List | O(V+E) | O(1) | O(degree) | Sparse graphs |
| Edge List | O(E) | O(1) | O(E) | Simple operations |
Graph Algorithms
- DFS (Depth-First Search): O(V+E) - uses stack/recursion
- BFS (Breadth-First Search): O(V+E) - uses queue
- Topological Sort: O(V+E) - ordering of DAG vertices
- Dijkstra's: O((V+E)log V) - shortest path
- Applications: Social networks, maps, dependencies, circuits
Example: Choose Graph Storage
Choose a representation that can check in constant time whether an edge (source, destination) exists; the edge may or may not be present.
Solution: Use an adjacency matrix - checking matrix[source][destination] is O(1). The trade-off is O(V²) space, which suits dense graphs.
Example: BFS Traversal
Graph: A-B, A-C, B-D, C-D, C-E. Start from A.
Solution: Visit A and enqueue its neighbors B and C; dequeue B and enqueue D; dequeue C and enqueue E (D is already marked). Order: A, B, C, D, E.
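A BFS sketch over an adjacency-list map; vertex names and neighbor ordering below are taken from this example:

```java
import java.util.*;

public class Bfs {
    // Breadth-first traversal, recording visit order.
    static List<String> bfs(Map<String, List<String>> adj, String start) {
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        Queue<String> queue = new ArrayDeque<>();
        visited.add(start);
        queue.add(start);
        while (!queue.isEmpty()) {
            String v = queue.remove();
            order.add(v);
            for (String w : adj.getOrDefault(v, List.of())) {
                if (visited.add(w)) queue.add(w);  // enqueue unseen neighbors
            }
        }
        return order;
    }
}
```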