๐Ÿ“š Complete Study Guide: Trees

Data Structures and Algorithms - CS210

1. What is a Tree? ๐ŸŒณ

Definition: In computer science, a tree is an abstract model of a hierarchical structure consisting of nodes with a parent-child relationship.

Key Characteristics

  • Non-linear data structure (unlike arrays and linked lists)
  • Hierarchical organization of data
  • Each node can have multiple children
  • One root node at the top
  • No cycles (unlike graphs)

Real-World Applications

Common Uses:
  • Organization Charts: Company hierarchy
  • File Systems: Folders and files
  • Programming Environments: Abstract syntax trees
  • Database Indexing: B-trees and B+ trees
  • Decision Making: Decision trees in AI

2. Tree Terminology ๐Ÿ“–

Essential Terms to Master:

Root: The topmost node without a parent (e.g., node A)

Internal Node: A node with at least one child (nodes A, B, C, F)

External Node (Leaf): A node without children (nodes E, I, J, K, G, H, D)

Parent: A node that has one or more children

Child: A node that has a parent

Siblings: Nodes that share the same parent

Ancestors: Parent, grandparent, great-grandparent, etc.

Descendants: Child, grandchild, great-grandchild, etc.

Depth of a Node: Number of ancestors (distance from root)

Height of a Tree: Maximum depth of any node

Subtree: A tree consisting of a node and all its descendants

Example Tree Structure:

        A (depth 0, root)
       / \
      B   C (depth 1)
     / \   \
    E   F   G (depth 2)
       / \   \
      I   J   K (depth 3)
                    
  • Height of tree = 3
  • Internal nodes: A, B, C, F
  • Leaf nodes: E, I, J, G, K
  • Siblings: B and C; I and J

3. Tree ADT (Abstract Data Type) ๐Ÿ”ง

Trees use positions to abstract nodes. A position is a way to refer to a node without exposing its internal structure.

Generic Methods

Method Description Return Type
size() Returns the number of nodes in the tree integer
isEmpty() Checks if tree is empty boolean
iterator() Returns an iterator over tree elements Iterator
positions() Returns an iterable collection of all positions Iterable

Accessor Methods

Method Description Return Type
root() Returns the root position Position
parent(p) Returns the parent of position p Position
children(p) Returns an iterable of children of p Iterable
numChildren(p) Returns the number of children of p Integer

Query Methods

Method Description Return Type
isInternal(p) Checks if p has at least one child boolean
isExternal(p) Checks if p has no children (is a leaf) boolean
isRoot(p) Checks if p is the root boolean

4. Tree Traversals ๐Ÿšถ

A traversal is a systematic way of visiting all nodes in a tree exactly once.

Preorder Traversal (NLR - Node, Left, Right)

Algorithm:

Algorithm preOrder(v) visit(v) // Visit node first for each child w of v preOrder(w) // Then visit children

Visit Order: Root โ†’ Left subtree โ†’ Right subtree

Application: Printing structured documents, creating a copy of the tree

Postorder Traversal (LRN - Left, Right, Node)

Algorithm:

Algorithm postOrder(v) for each child w of v postOrder(w) // Visit children first visit(v) // Then visit node

Visit Order: Left subtree โ†’ Right subtree โ†’ Root

Application: Computing disk space, deleting a tree, evaluating expressions

Example:

For the tree structure:

        A
       /|\
      B C D
                    
  • Preorder: A, B, C, D
  • Postorder: B, C, D, A

5. Binary Trees ๐ŸŒฒ

A binary tree is a tree where each internal node has at most two children (exactly two for proper binary trees). The children are an ordered pair: left child and right child.

Properties of Binary Trees

  • Each node has at most 2 children
  • Children are ordered (left vs right matters)
  • Can be empty
  • Recursive definition: A binary tree is either:
    • A tree with a single node, OR
    • A tree whose root has an ordered pair of children, each of which is a binary tree

Properties of Proper Binary Trees

Important Formulas:

Let: n = number of nodes, e = external nodes, i = internal nodes, h = height

e = i + 1

Number of external nodes = internal nodes + 1

n = 2e - 1

Total nodes = 2 ร— external nodes - 1

h โ‰ค i

Height โ‰ค internal nodes

h โ‰ค (n - 1)/2

Height โ‰ค (total nodes - 1)/2

e โ‰ค 2h

External nodes โ‰ค 2 to the power of height

h โ‰ฅ logโ‚‚(e)

Height โ‰ฅ log base 2 of external nodes

Binary Tree ADT

The BinaryTree ADT extends the Tree ADT with additional methods:

  • left(p): Returns the left child of position p
  • right(p): Returns the right child of position p
  • sibling(p): Returns the sibling of position p

These methods return null when there is no left, right, or sibling.

6. Binary Tree Traversals ๐Ÿ”„

Inorder Traversal (LNR - Left, Node, Right)

Algorithm:

Algorithm inOrder(v) if left(v) โ‰  null inOrder(left(v)) // Visit left subtree visit(v) // Visit node if right(v) โ‰  null inOrder(right(v)) // Visit right subtree

Visit Order: Left subtree โ†’ Node โ†’ Right subtree

Application: Drawing a binary tree, producing sorted output from BST

Traversal Summary for Binary Trees

Traversal Type Order Mnemonic When to Use
Preorder Node โ†’ Left โ†’ Right NLR Copy tree, prefix expressions
Inorder Left โ†’ Node โ†’ Right LNR BST sorted output, infix expressions
Postorder Left โ†’ Right โ†’ Node LRN Delete tree, postfix expressions

Complete Example:

        A
       / \
      B   C
                    
  • Inorder: B, A, C
  • Preorder: A, B, C
  • Postorder: B, C, A

Euler Tour Traversal

A generic traversal that visits each node three times:
  • On the left (L): Before processing left subtree (preorder)
  • From below (B): Between left and right subtrees (inorder)
  • On the right (R): After processing right subtree (postorder)

Euler tour includes preorder, inorder, and postorder as special cases.

7. Applications of Binary Trees ๐Ÿ’ก

1. Arithmetic Expression Trees

Structure:

  • Internal nodes: operators (+, -, ร—, รท)
  • External nodes: operands (numbers, variables)

Example: (2 ร— (a - 1)) + (3 ร— b)

         +
        / \
       ร—   ร—
      / \ / \
     2  - 3  b
       / \
      a   1
                    

Printing Expressions (Inorder):

Algorithm printExpression(v) if left(v) โ‰  null print("(") inOrder(left(v)) print(v.element()) if right(v) โ‰  null inOrder(right(v)) print(")")

Output: ((2 ร— (a - 1)) + (3 ร— b))

Evaluating Expressions (Postorder):

Algorithm evalExpr(v) if isExternal(v) return v.element() else x โ† evalExpr(left(v)) y โ† evalExpr(right(v)) โ—Š โ† operator stored at v return x โ—Š y

2. Decision Trees

Structure:

  • Internal nodes: questions with yes/no answers
  • External nodes: decisions/outcomes

Applications: Machine learning, game AI, diagnostic systems

8. Tree Implementation Strategies ๐Ÿ”จ

Linked Structure for Trees

Node Structure:

  • Element (data)
  • Reference to parent node
  • Sequence/list of children nodes

This is the most flexible implementation for general trees.

Linked Structure for Binary Trees

Node Structure (simpler than general trees):

  • Element (data)
  • Reference to parent node
  • Reference to left child
  • Reference to right child

Array-Based Representation

Formula for node placement:

  • rank(root) = 0
  • If node is left child: rank(node) = 2 ร— rank(parent) + 1
  • If node is right child: rank(node) = 2 ร— rank(parent) + 2

Example Mapping:

      A (0)
     / \
    B   D
   (1) (2)
   / \   \
  E   F   J
 (3) (4) (6)
                        

Array: [A, B, D, E, F, _, J, ...]

Index: 0, 1, 2, 3, 4, 5, 6

Disadvantage: Can waste space for non-complete trees (many null entries).
Advantage: Fast access to parent and children using arithmetic.

9. Complexity Analysis โฑ๏ธ

Linked Structure Implementation

Operation Time Complexity Explanation
size(), isEmpty() O(1) Stored as instance variable
root(), parent(), left(), right() O(1) Direct reference access
isInternal(), isExternal(), isRoot() O(1) Simple checks
iterator(), positions() O(n) Must visit all n nodes
replace() O(1) Direct update of element
children(p) O(1) Return reference to list

Traversal Complexity

All traversals (preorder, inorder, postorder): O(n)

  • Each node is visited exactly once
  • Work done at each node is O(1)
  • Total time = n ร— O(1) = O(n)

Space Complexity

  • Linked Structure: O(n) - one node object per element
  • Array-Based: O(2h) - where h is height (can waste space)
  • Recursive Traversal: O(h) - call stack depth equals height

๐Ÿ“Œ Key Takeaways

Essential Concepts:

  1. Trees are hierarchical: Root at top, leaves at bottom, parent-child relationships
  2. Three main traversals: Preorder (NLR), Inorder (LNR), Postorder (LRN)
  3. Binary trees are special: At most 2 children, ordered left and right
  4. Height matters: Determines efficiency of many operations
  5. Proper binary trees: e = i + 1 (external = internal + 1)
  6. Implementation choices: Linked (flexible) vs Array (compact for complete trees)

Study Tips:

  • Practice drawing trees and performing traversals by hand
  • Memorize the formulas for proper binary trees
  • Understand when to use each traversal type
  • Know the time complexity of basic operations
  • Practice implementing tree operations recursively

๐ŸŽ“ Practice Questions

Question 1: Core Terminology

Q: Given the tree below, identify: (a) the root, (b) all leaf nodes, (c) the depth of node F, (d) the height of the tree, (e) the ancestors of node I.

        A (depth 0)
       / \
      B   C (depth 1)
     / \   \
    E   F   G (depth 2)
       / \
      I   J (depth 3)
                    

A: (a) Root: A (the only node with no parent).
(b) Leaf nodes (external): E, I, J, G โ€” nodes with no children.
(c) Depth of F: 2 (ancestors are B and A, so 2 edges to the root).
(d) Height of tree: 3 (maximum depth is that of I or J).
(e) Ancestors of I: F, B, A (every node on the path from I to the root, excluding I itself).

Question 2: Preorder vs. Postorder Traversal

Q: For the tree in Question 1, give the preorder and postorder traversal sequences. Explain intuitively why postorder is used to compute disk space usage of a file system.

A: Preorder (NLR): A, B, E, F, I, J, C, G
Postorder (LRN): E, I, J, F, B, G, C, A
Why postorder for disk space: To compute the size of a directory, you must first know the sizes of all its subdirectories and files (children). Postorder visits children before their parent, so by the time you reach a directory node, all its children's sizes have already been computed. You can then sum them up and add the directory's own overhead to get the total. Preorder would try to compute a directory's size before knowing its children's sizes โ€” impossible.

Question 3: Inorder Traversal and BST Property

Q: Given a BST with nodes inserted in the order 5, 3, 7, 1, 4, 6, 8, draw the tree and give the inorder traversal. What property of the inorder traversal makes it especially useful for BSTs?

A: The BST after all insertions:

        5
       / \
      3   7
     / \ / \
    1  4 6  8
                    
Inorder traversal (LNR): 1, 3, 4, 5, 6, 7, 8.
Key property: Inorder traversal of a BST always produces elements in sorted (ascending) order. This is because the BST property guarantees all keys in the left subtree are smaller than the node's key, which is smaller than all keys in the right subtree. Visiting left โ†’ node โ†’ right therefore visits values in increasing order.

Question 4: Proper Binary Tree Formulas

Q: A proper binary tree has 7 internal nodes. How many external (leaf) nodes does it have? What is the total number of nodes? Verify using the formula e = i + 1.

A: Using the formula e = i + 1 (external nodes = internal nodes + 1):
e = 7 + 1 = 8 external nodes.
Total nodes n = i + e = 7 + 8 = 15 nodes.
Alternatively, n = 2e - 1 = 2(8) - 1 = 15. โœ“
Intuition: Every time you convert a leaf into an internal node in a proper binary tree, you add 2 new leaves. Starting with 1 leaf (just the root), each internal node you add gives a net gain of +1 external node, so after i internal nodes you have 1 + i = i + 1 external nodes.

Question 5: Array-Based Representation

Q: In the array-based binary tree representation (0-indexed), node at index 3 has its parent, left child, and right child at which indices? What is the main disadvantage of this representation for unbalanced trees?

A: For a node at index k (0-indexed):
- Parent: (k - 1) / 2 = (3 - 1) / 2 = 1
- Left child: 2k + 1 = 2(3) + 1 = 7
- Right child: 2k + 2 = 2(3) + 2 = 8
Main disadvantage: For a skewed or sparse tree, many array positions are null (unused). In the worst case, a right-skewed tree with n nodes requires an array of size 2^n - 1, resulting in O(2^n) space even though only n cells are used. A linked structure uses exactly O(n) space regardless of tree shape.

Question 6: Traversal for Expression Evaluation

Q: The expression tree for (4 + 2) * 3 is shown below. Give the preorder, inorder, and postorder traversal sequences. Which corresponds to prefix, infix, and postfix notation?

        *
       / \
      +   3
     / \
    4   2
                    

A: Preorder: *, +, 4, 2, 3 โ†’ prefix notation (operator before operands).
Inorder: 4, +, 2, *, 3 โ†’ infix notation (operator between operands; parentheses needed to avoid ambiguity).
Postorder: 4, 2, +, 3, * โ†’ postfix (Reverse Polish) notation (operator after operands; evaluatable with a stack without parentheses).
Postfix is especially useful for calculators: push 4, push 2, see + โ†’ pop both, push 6, push 3, see * โ†’ pop both, push 18.

Question 7: Tree vs. Graph โ€” Edges and Cycles

Q: What structural property distinguishes a tree from a general graph? How many edges does a tree with n nodes have, and why?

A: A tree is a connected, acyclic graph. There is exactly one path between any two nodes and no cycles. A general graph may have multiple paths between nodes and may contain cycles.
A tree with n nodes has exactly n - 1 edges. Proof: Start with n isolated nodes (0 edges). Each edge you add connects exactly one new node to the growing tree โ€” adding a cycle would be the (n)th edge creating a second path between two already-connected nodes. After n-1 edges, all n nodes are connected. Adding a further edge would create a cycle. Therefore a tree always has exactly n - 1 edges.

Question 8: Choosing a Traversal for a Task

Q: Match each task to the appropriate traversal (preorder, inorder, postorder) and briefly justify: (a) print a structured document with section headings, (b) produce a sorted listing from a BST, (c) delete all nodes of a binary tree safely, (d) generate a postfix expression from an expression tree.

A: (a) Print structured document โ†’ Preorder. You want to process a section heading (parent) before its subsections (children), matching the document's top-down reading order.
(b) Sorted listing from BST โ†’ Inorder. Left subtree (smaller keys) โ†’ node โ†’ right subtree (larger keys) produces ascending order by the BST property.
(c) Delete all nodes โ†’ Postorder. A node should only be deleted after its children are deleted (to avoid losing references to children). Postorder processes children before the parent.
(d) Postfix expression โ†’ Postorder. In postfix notation, operators follow their operands. Postorder visits leaves (operands) before internal nodes (operators), generating operand operand operator โ€” the postfix structure.