Data Structures Complete Cheat Sheet

Algorithm Complexity & Big-O Notation

Definition: Big-O notation describes the worst-case time complexity of an algorithm as input size grows. It provides an upper bound on the growth rate.

Key Points:

O(1) - Constant time: Operation doesn't depend on input size
O(log n) - Logarithmic: Divides problem in half each step (binary search)
O(n) - Linear: Processes each element once
O(n log n) - Linearithmic: Efficient sorting algorithms
O(n²) - Quadratic: Nested loops over input
O(2ⁿ) - Exponential: Doubles with each input addition

Extracted from past exam

Example: Analyze Code Complexity

int a = 0; 
for (i = 0; i < n; i++) { 
    for (j = 0; j < n; j++) { 
        a = a + i + j; 
    } 
}

Solution:

Step 1: Identify the loops - outer loop runs n times

Step 2: Inner loop also runs n times for each outer iteration

Step 3: Total operations: n × n = n²

Answer: O(n²)

Operation	Array	Linked List	Hash Table	BST (Balanced)
Access/Search	O(1) / O(n)	O(n)	O(1) avg	O(log n)
Insert	O(n)	O(1) at head	O(1) avg	O(log n)
Delete	O(n)	O(n)	O(1) avg	O(log n)

Linked Lists

Definition: A linear data structure where elements are stored in nodes, each containing data and a reference to the next node.

Key Points:

Singly Linked List: Each node points to the next node only
Doubly Linked List: Each node has references to both next and previous nodes
Circular Linked List: Last node points back to the first node
Dynamic size - can grow/shrink during runtime
Efficient insertion/deletion at any position (O(1) if position known)
No random access - must traverse from head (O(n))
Extra memory overhead for storing pointers

Extracted from past exam

Example: Insert in Sorted List

void insertSorted(int X) {
    // Insert X in a sorted singly linked list
}

Solution:

Step 1: Create new node with value X

Step 2: If list empty or X < head.value, insert at beginning

Step 3: Otherwise, traverse to find position where current.value < X < current.next.value

Step 4: Insert new node at found position

Time Complexity: O(n) worst case

Generated example

Example: Reverse a Linked List

Given list: 1 -> 2 -> 3 -> 4 -> null
Reverse it to: 4 -> 3 -> 2 -> 1 -> null

Solution:

Step 1: Initialize three pointers: prev = null, current = head, next = null

Step 2: While current is not null:
  - Store next = current.next
  - Reverse link: current.next = prev
  - Move pointers: prev = current, current = next

Step 3: Return prev as new head

Time Complexity: O(n), Space: O(1)

Stacks

Definition: A Last-In-First-Out (LIFO) data structure where elements are added and removed from the same end (top).

Key Points:

Push: Add element to top - O(1)
Pop: Remove element from top - O(1)
Peek/Top: View top element without removing - O(1)
Applications: Function call stack, undo operations, expression evaluation, backtracking
Can be implemented using arrays or linked lists
Array implementation: Fixed size but cache-friendly
Linked list implementation: Dynamic size but extra pointer overhead

Extracted from past exam

Example: Using Stack to Print Names in Reverse

Problem: Read a list of names and print them in opposite order.

Solution:

Step 1: Create an empty stack

Step 2: Push each name onto the stack as it's read

Step 3: Pop and print each name from stack until empty

Why it works: Stack's LIFO property ensures last name entered is first printed

Generated example

Example: Balance Parentheses

Check if parentheses are balanced: "((a+b)*(c-d))"

Solution:

Step 1: Initialize empty stack

Step 2: For each character:
- If '(', push to stack
- If ')', pop from stack (if empty, return false)

Step 3: After processing, stack should be empty

Result: Given string is balanced ✓

Queues

Definition: A First-In-First-Out (FIFO) data structure where elements are added at the rear and removed from the front.

Key Points:

Enqueue: Add element to rear - O(1)
Dequeue: Remove element from front - O(1)
Front: View front element without removing - O(1)
Circular Queue: Efficient array implementation using modulo arithmetic
Applications: Process scheduling, BFS, printer queues, message buffers
Overflow condition in circular queue: (rear + 1) % size == front
Empty condition: front == -1 or front == rear (after dequeue)

Extracted from past exam

Example: Reverse Queue Using Stack

Queue: [1, 2, 3, 4, 5] -> Reverse to: [5, 4, 3, 2, 1]

Solution:

Step 1: Create empty stack

Step 2: While queue not empty:
stack.push(queue.dequeue())

Step 3: While stack not empty:
queue.enqueue(stack.pop())

Result: Queue is now reversed

Recursion

Definition: A programming technique where a function calls itself to solve smaller instances of the same problem.

Key Points:

Base Case: Condition to stop recursion (prevents infinite calls)
Recursive Case: Function calls itself with modified parameters
Each call adds a frame to the call stack
Can lead to stack overflow if too deep
Often cleaner but may be less efficient than iteration
Common patterns: Divide & conquer, tree/graph traversal, backtracking
Time complexity: Analyze using recurrence relations

Extracted from past exam

Example: Analyze Recursive Function

function(int n) { 
    if (n==1) 
        return 0; 
    else
        return 1 + function(n-1); 
}

Solution:

Step 1: Identify base case: n==1, returns 0

Step 2: Recursive case: calls function(n-1)

Step 3: Number of calls: n-1 recursive calls

Step 4: Time complexity: O(n)

Function purpose: Returns n-1

Trees & Binary Trees

Definition: A hierarchical data structure with nodes connected by edges, where each node has at most one parent and any number of children.

Key Points:

Root: Top node with no parent
Leaf: Node with no children
Height: Longest path from root to leaf
Depth: Path length from root to a node
Binary Tree: Each node has at most 2 children
Complete Binary Tree: All levels filled except possibly last
Array representation: Node at index i has:
  - Left child at 2i+1
  - Right child at 2i+2
  - Parent at ⌊(i-1)/2⌋
Min height: ⌈log₂(n+1)⌉ - 1, Max height: n-1

Tree Traversals

Traversal	Order	Use Case
Pre-order	Root → Left → Right	Copy tree, prefix expression
In-order	Left → Root → Right	Sorted sequence in BST
Post-order	Left → Right → Root	Delete tree, postfix expression
Level-order	Level by level (BFS)	Breadth-first search

Extracted from past exam

Example: Check if Binary Tree is Complete

boolean isComplete(Node root, int index, int number_nodes) {
    if (root == null) return true;
    if (index >= number_nodes) return false;
    return isComplete(root.left, 2*index+1, number_nodes)
        && isComplete(root.right, 2*index+2, number_nodes);
}

Solution:

Step 1: Function checks if tree is complete binary tree

Step 2: Uses array indexing property

Step 3: If any node's index ≥ total nodes, tree not complete

Step 4: Recursively checks all nodes

Binary Search Trees (BST)

Definition: A binary tree where for each node, all values in left subtree are smaller and all values in right subtree are larger.

Key Points:

Search: O(h) where h is height - O(log n) balanced, O(n) skewed
Insert: O(h) - always insert as leaf
Delete: O(h) - three cases:
  1. Leaf node: Simply remove
  2. One child: Replace with child
  3. Two children: Replace with inorder successor/predecessor
Inorder traversal: Gives sorted sequence
Min value: Leftmost node
Max value: Rightmost node
Can become skewed (unbalanced) leading to O(n) operations

Extracted from past exam

Example: Delete Node with Two Children

Delete node 60 from BST
Original BST has 60 with left child 40 and right child 80

Solution:

Step 1: Node has two children - use inorder successor method

Step 2: Find minimum in right subtree (leftmost of 80's subtree)

Step 3: Replace 60's value with successor's value

Step 4: Delete the successor node (has at most one child)

AVL Trees

Definition: A self-balancing BST where the height difference between left and right subtrees is at most 1 for every node.

Key Points:

Balance Factor: height(left) - height(right) ∈ {-1, 0, 1}
Height: Always O(log n)
Operations: All O(log n) guaranteed
Rotations: Used to maintain balance
  - Single Right (LL case)
  - Single Left (RR case)
  - Left-Right (LR case)
  - Right-Left (RL case)
Insert: O(log n) - may require one rotation
Delete: O(log n) - may require multiple rotations
More balanced than regular BST but more complex

Extracted from past exam

Example: AVL Rotation Cost

What is the runtime cost of a double rotation in AVL tree?

Solution:

Step 1: Double rotation = two single rotations

Step 2: Each rotation involves constant pointer updates

Step 3: No traversal needed, just local changes

Answer: O(1)

Heaps & Priority Queues

Definition: A complete binary tree satisfying heap property - parent is greater (max-heap) or smaller (min-heap) than children.

Key Points:

Complete Binary Tree: All levels filled except possibly last
Array Implementation: Efficient, no pointers needed
Insert: O(log n) - add at end, bubble up
Extract Max/Min: O(log n) - replace root with last, bubble down
Get Max/Min: O(1) - just return root
Build Heap: O(n) using bottom-up approach
Heap Sort: O(n log n) time, O(1) space
Applications: Priority queues, Dijkstra's algorithm, scheduling

Extracted from past exam

Example: Priority Queue for Hospital Emergency

Arrange patients by injury severity using appropriate data structure.

Solution:

Step 1: Use Max-Heap based Priority Queue

Step 2: Priority = severity score

Step 3: Insert patient: O(log n)

Step 4: Get most severe: O(1)

Step 5: Remove after treatment: O(log n)

Generated example

Example: Insert 15 into Max-Heap

Heap array: [20, 18, 10, 12, 9, 8, 3]

Solution:

Step 1: Add 15 at end: [20, 18, 10, 12, 9, 8, 3, 15]

Step 2: Find parent: index 7's parent = (7-1)/2 = 3 (value 12)

Step 3: 15 > 12, swap: [20, 18, 10, 15, 9, 8, 3, 12]

Step 4: Check new parent: index 3's parent = 1 (value 18)

Step 5: 15 < 18, stop. Final: [20, 18, 10, 15, 9, 8, 3, 12]

Sorting Algorithms

Definition: Algorithms that arrange elements in a specific order (ascending/descending).

Algorithm	Best	Average	Worst	Space	Stable	Key Feature
Selection Sort	O(n²)	O(n²)	O(n²)	O(1)	No	n-1 swaps always
Insertion Sort	O(n)	O(n²)	O(n²)	O(1)	Yes	Best for nearly sorted
Merge Sort	O(n log n)	O(n log n)	O(n log n)	O(n)	Yes	Divide & conquer
Quick Sort	O(n log n)	O(n log n)	O(n²)	O(log n)	No	Fastest average case
Heap Sort	O(n log n)	O(n log n)	O(n log n)	O(1)	No	In-place, consistent

Extracted from past exam

Example: Best Algorithm for Sorted Data

Given sorted array {2, 3, 5, 6, 9, 11, 15}, which takes O(n) comparisons?

Solution:

Step 1: Analyze each algorithm on sorted input

Step 2: Insertion Sort: Each element compared once with previous

Step 3: Selection Sort: Still O(n²) comparisons

Step 4: Merge/Heap Sort: Still O(n log n)

Answer: Insertion Sort - O(n) on sorted data

Generated example

Example: Quick Sort Partition

Array: [5, 2, 8, 3, 9, 1, 7] - Pivot = 5

Solution:

Step 1: Initialize i = -1, j = 0

Step 2: Compare each element with pivot (5):
  2 < 5: swap with position 0
  3 < 5: swap with position 1
  1 < 5: swap with position 2

Step 3: Place pivot at correct position

Result: [2, 3, 1, 5, 9, 8, 7]

Hash Tables

Definition: A data structure that maps keys to values using a hash function for fast access.

Key Points:

Hash Function: Maps key to array index (h(k) = k mod size)
Collision: When two keys hash to same index
Load Factor: n/m (items/table size) - affects performance
Average Case: O(1) for insert, delete, search
Worst Case: O(n) when all keys hash to same index

Collision Resolution

Method	Description	Pros	Cons
Chaining	Linked list at each index	Simple, handles high load	Extra memory for pointers
Linear Probing	Check next slot: (h+i) mod m	Cache friendly	Clustering problem
Quadratic Probing	Check: (h+i²) mod m	Reduces clustering	May not probe all slots
Double Hashing	Use second hash function	Best distribution	More computation

Extracted from past exam

Example: Linear Probing

Insert: 12, 5, 19, 2, 23 into table size 7
Hash function: h(k) = k mod 7

Solution:

Step 1: 12 mod 7 = 5 → slot[5] = 12

Step 2: 5 mod 7 = 5 → collision! Try 6 → slot[6] = 5

Step 3: 19 mod 7 = 5 → collision! Try 6, 7=0 → slot[0] = 19

Step 4: 2 mod 7 = 2 → slot[2] = 2

Step 5: 23 mod 7 = 2 → collision! Try 3 → slot[3] = 23

Total probes: 1+2+3+1+2 = 9

Graphs

Definition: A collection of vertices (nodes) connected by edges, representing relationships between objects.

Key Points:

Directed Graph: Edges have direction (A→B)
Undirected Graph: Edges bidirectional (A—B)
Weighted Graph: Edges have associated costs/weights
Connected: Path exists between any two vertices
Cycle: Path that starts and ends at same vertex
DAG: Directed Acyclic Graph - no cycles

Graph Representations

Representation	Space	Add Edge	Find Edge	Best For
Adjacency Matrix	O(V²)	O(1)	O(1)	Dense graphs
Adjacency List	O(V+E)	O(1)	O(degree)	Sparse graphs
Edge List	O(E)	O(1)	O(E)	Simple operations

Graph Algorithms

DFS (Depth-First Search): O(V+E) - uses stack/recursion
BFS (Breadth-First Search): O(V+E) - uses queue
Topological Sort: O(V+E) - ordering of DAG vertices
Dijkstra's: O((V+E)log V) - shortest path
Applications: Social networks, maps, dependencies, circuits

Extracted from past exam

Example: Choose Graph Storage

Find edge (source-destination) in constant time, may or may not exist.

Solution:

Step 1: Need O(1) edge lookup

Step 2: Adjacency Matrix: matrix[i][j] - O(1) ✓

Step 3: Adjacency List: must search list - O(degree)

Step 4: Edge List: must search all edges - O(E)

Answer: Adjacency Matrix

Generated example

Example: BFS Traversal

Graph: A-B, A-C, B-D, C-D, C-E. Start from A.

Solution:

Step 1: Initialize: queue=[A], visited=[A]

Step 2: Process A: Add neighbors B,C → queue=[B,C]

Step 3: Process B: Add neighbor D → queue=[C,D]

Step 4: Process C: D already queued, add E → queue=[D,E]

Step 5: Process D,E: No new neighbors

BFS Order: A → B → C → D → E