CH · 10

Mass Storage Structure

CS330 · Operating Systems · Prince Sultan University

Chapter 10: Mass Storage Structure

How operating systems manage physical disks - from spinning platters to scheduling algorithms that minimize head movement.

📀 Magnetic Disks ⚡ SSDs 📐 Disk Scheduling 🛡 RAID 📊 Access Time
🗺
Chapter Overview
🎯 BIG PICTURE This chapter answers: "How does the OS talk to physical storage?" - it covers the hardware structure of disks, how access time is calculated, and how the OS decides the order to serve disk requests to minimize wasted movement.

🗂 What You'll Learn

  • Physical structure of magnetic disks & SSDs
  • How access time is calculated (seek + rotation + transfer)
  • Disk scheduling algorithms (FCFS, SSTF, SCAN, C-SCAN, LOOK, C-LOOK)
  • RAID levels and why they exist
  • Boot block & bad block management

🔗 Connections to Other Chapters

  • Ch 8 - 9 (Memory/VM): Swap space lives on disk
  • Ch 11 - 12 (File Systems): Files are stored on disk; allocation methods use disk blocks
  • Ch 3 (Processes): I/O-bound vs CPU-bound processes - I/O often means disk
💿
Magnetic Disk Structure

Anatomy of a Hard Disk (HDD)

Understanding the physical layout is key to understanding why access time has multiple components.

Track Head Spindle Sector Cylinder

Conceptual view of a magnetic disk platter

📐 Key Terms

  • Platter: Circular disk coated with magnetic material
  • Track: Concentric ring on a platter surface
  • Sector: Smallest addressable unit (usually 512B or 4KB)
  • Cylinder: All tracks at the same radius across all platters
  • Read/Write Head: Reads/writes data as platter spins under it
  • Disk Arm: Moves head radially across platters

🖥️ OS + User View

  • User sees: Files and folders
  • OS sees: Array of logical blocks
  • Disk sees: Sectors on tracks on cylinders
Logical Block → (Cylinder, Track, Sector)

The disk controller maps logical block numbers to physical locations.

💡 MEMORY TIP - "Cylinder = Tower" Think of a cylinder as a tower of tracks - all the tracks directly above and below each other on every platter. When we say "seek to cylinder 50," we mean move the arm so all heads are at track 50 on every platter simultaneously.
SSDs vs Magnetic Disks
Feature Magnetic HDD SSD (Solid-State)
Moving Parts Yes (arm, platters) No
Access Time ~3 - 12 ms ~0.1 ms
Volatile? No No
Fragmentation Issue Yes (seek time) No (random access same speed)
Price per GB Cheaper More expensive
Lifespan Limited by mechanical wear Limited write cycles
Suitable for Scheduling? Yes - head movement matters Less critical - no seek time
⚠ EXAM TRAP SSDs are faster and nonvolatile - this exact combination came up in Quiz 1! Don't confuse "nonvolatile" with "slow" - that's RAM vs disk confusion. SSDs are fast AND nonvolatile.
Disk Access Time

The Three Components of Access Time

Every disk access has three time costs. Understanding each component is critical for exam calculations.

① Seek Time

Time to move the disk arm to the correct cylinder (track). This is the dominant and most variable cost. Disk scheduling algorithms focus on minimizing this.

② Rotational Latency

Time for the desired sector to rotate under the read/write head. On average, this is half of one full rotation.

Avg Latency = ½ × (1/RPM) × 60,000 ms

③ Transfer Time

Time to actually read/write the data once the head is in position. Usually the fastest component.

Transfer Time = Data Size / Transfer Rate
Total Access Time = Seek Time + Rotational Latency + Transfer Time
✅ EXAMPLE (common exam style) If a disk spins at 7200 RPM: Average rotational latency = ½ × (1/7200) × 60,000 ≈ 4.17 ms. A typical seek time might be 5 - 10 ms, making seek + rotation the main bottleneck.

🧠 Memory Aid

S · R · T

Seek → Rotate → Transfer - like a journey: travel to the city → wait for the right street → pick up the package.

🗓
Disk Scheduling - Why It Matters

The Problem

Multiple processes request disk I/O simultaneously. The OS must decide the order to serve them. Since seek time is the biggest cost, a smart order can save huge amounts of time.

💡 ANALOGY Imagine an elevator with people requesting different floors. A naive elevator serves requests in arrival order (FCFS) - a smart one sweeps up then down (SCAN). Disk scheduling works exactly like an elevator!

Setting the Stage

All algorithms are evaluated on:

  • Total head movement (in cylinders) - lower is better
  • Throughput - requests served per time unit
  • Fairness - no request waits forever

📌 We always track: current head position, queue of pending requests, and sometimes direction of movement.

Disk Scheduling Algorithms
⚠ EXAM FOCUS You must be able to: (1) draw the head movement path, (2) calculate total cylinders traversed, for ALL algorithms. Practice with the examples below!

📋 Reference Example (used throughout)

Queue: 98, 183, 37, 122, 14, 124, 65, 67

Initial head position: 53 | Disk: 0 - 199

This is the classic textbook example from Silberschatz Ch.10

1. FCFS - First-Come, First-Served

Rule: Serve requests in the order they arrive. No reordering.

53 → 9818337122141246567

Total movement: |98−53| + |183−98| + |37−183| + ... = 640 cylinders

  • ✅ Simple, fair (no starvation)
  • ❌ Worst performance - head zigzags across disk

🧠 Remember

FCFS = First Come First Served = no optimization at all = like serving a restaurant table by table in order of who called first, even if they're all the way across town.

2. SSTF - Shortest Seek Time First

Rule: Always serve the request closest to the current head position.

53 → 6567371498122124183

Total movement: 236 cylinders

  • ✅ Much better than FCFS
  • Starvation possible - requests far from head may never get served
  • ❌ Not optimal (greedy approach)
⚠ KEY EXAM POINT SSTF can cause starvation - a request at cylinder 5 might wait forever if new requests keep arriving near cylinder 50. This is SSTF's critical flaw.

🧠 SSTF ≈ SJF for disks

Just like SJF scheduling picks the shortest CPU burst, SSTF picks the shortest seek distance. Same concept, same starvation problem!

3. SCAN - The Elevator Algorithm

Rule: Head moves in one direction, serving all requests along the way, then reverses direction at the end (or last request) and sweeps back.

53 → 656798122124183 → [199] → 3714

Total movement: ~236 cylinders (much less than FCFS)

  • ✅ No starvation - head eventually reaches every cylinder
  • ✅ Better throughput than FCFS
  • ❌ Requests just behind the head wait for a full sweep

🧠 Think: Elevator

An elevator goes UP serving all floors, reaches top, comes back DOWN. The disk head does the same across cylinders 0 - 199.

4. C-SCAN - Circular SCAN

Rule: Like SCAN, but when the head reaches the end, it immediately jumps back to the beginning without serving requests on the return trip. Only serves one direction.

53 → 656798122124183 → [199] → [0] → 1437
  • ✅ More uniform waiting time than SCAN
  • ✅ No starvation
  • ❌ Jump from end to beginning counts as head movement

🧠 C-SCAN = One-Way Street

SCAN = elevator going up AND down. C-SCAN = only going up, teleports back to ground floor. More fair because requests near cylinder 0 aren't always last.

5. LOOK & C-LOOK

LOOK: Like SCAN but doesn't go all the way to the end of the disk - only goes as far as the last request in that direction, then reverses.

C-LOOK: Like C-SCAN but only goes to the last request, then jumps to the first request (not cylinder 0).

  • ✅ More efficient than SCAN/C-SCAN - no unnecessary travel to disk edge
  • ✅ Commonly used in practice

🧠 LOOK vs SCAN - "Look Before You Go"

SCAN goes to the wall. LOOK looks ahead and stops at the last request. Saves travel time at the edges.

📊 Algorithm Comparison Table
Algorithm Strategy Starvation? Performance Best For
FCFS Arrival order No Worst Light loads only
SSTF Nearest cylinder first ⚠ Yes Good Low-latency if load is clustered
SCAN Elevator sweep, reverses at edge No Good Heavy loads
C-SCAN One-way sweep, jump to start No Good, uniform Uniform waiting time needed
LOOK Like SCAN, stops at last request No Better than SCAN General purpose
C-LOOK Like C-SCAN, jumps to first request No Best practical General purpose - most used
✅ SELECTION RULE (exam answer) The best algorithm depends on number and distribution of requests. For heavy disk I/O, SCAN/C-SCAN/LOOK are preferred. FCFS is only acceptable for very light loads where simplicity matters more.
💾
Other Storage Technologies

🔌 Swap Space (Swap Partition)

  • Section of disk used as extension of RAM
  • Pages swapped out to swap space when memory is full
  • Much slower than RAM - causes thrashing if overused
  • OS manages its own swap-space allocation separately from the file system

🔴 Boot Block (Bootstrap)

  • Located in a fixed location on disk (usually first sector)
  • Contains the bootstrap loader - loads the OS
  • ROM/EPROM on motherboard points CPU to boot block
  • Also called Master Boot Record (MBR)

⚠ Bad Blocks

  • Physical sectors that are damaged and can't reliably store data
  • Simple approach: Flag bad blocks in FAT so OS avoids them
  • Advanced approach: Sector sparing - disk controller secretly remaps bad blocks to spare sectors
  • Slip sparing - reassign all sectors to skip the bad one

📼 Tertiary Storage

  • Tapes, optical disks (CD/DVD), removable drives
  • Very slow but cheap and portable
  • Used for archival/backup storage
  • Accessed sequentially (tapes) or near-randomly (optical)
🛡
RAID - Redundant Array of Independent Disks

Why RAID?

A single disk can fail. RAID spreads data across multiple disks to improve reliability and/or performance. Two key techniques: redundancy (copies) and striping (parallelism).

RAID Level Name Technique Fault Tolerance Performance
RAID 0 Striping Data spread across disks, no redundancy ❌ None ⚡ Best (parallel reads/writes)
RAID 1 Mirroring Full copy on each disk ✅ Excellent (N-1 failures) Read fast, write same as 1 disk
RAID 4 Block-level striping + dedicated parity One disk stores parity blocks ✅ 1 failure Parity disk is bottleneck
RAID 5 Distributed parity Parity distributed across all disks ✅ 1 failure ✅ Good balance
RAID 6 Double parity Two parity blocks per stripe ✅ 2 failures Good, more overhead
RAID 10 Stripe of Mirrors RAID 0 + RAID 1 combined ✅ Excellent ⚡ Excellent
💡 RAID 0 TRICK RAID 0 is all about speed, zero protection. "0 protection." If one disk fails, ALL data is lost. Great for scratch disks, terrible for important data.
✅ RAID 5 = The Sweet Spot RAID 5 is the most commonly used in practice. It balances storage efficiency, fault tolerance, and performance. Distributed parity = no single bottleneck disk.

🧠 RAID Levels Mnemonic

0·1·5·6·10

0 = Zero protection (speed only)  |  1 = One full mirror  |  5 = Five-star balance  |  6 = Six is safer (2 parity)  |  10 = Best of both worlds (stripe + mirror)

Study Tips & Tricks

🎯 How to Tackle Disk Scheduling Problems

  1. Write out the request queue and starting head position clearly
  2. For SCAN/C-SCAN/LOOK/C-LOOK: establish the initial direction
  3. Draw the sequence of head movements step by step
  4. Compute |difference| between consecutive positions and sum them
  5. Remember: only SSTF can cause starvation

🧠 The "Elevator" Map

FCFS → No Order
SSTF → Nearest First
SCAN → ↑ then ↓ (to edge)
C-SCAN → ↑ always, jump ↩
LOOK → ↑ then ↓ (to last req)
C-LOOK → ↑ always, jump to first req

🧠 Disk Structure ABC

Sector = smallest unit
Track = ring on platter
Cylinder = stack of tracks
Platter = the disk
Arm = moves head radially

What OSes Use in Practice

Linux uses a variant of C-LOOK (deadline scheduler) with a real-time deadline to prevent starvation. Windows uses similar priority-based disk scheduling.

SSD Scheduling

SSDs don't need seek optimization since there's no moving head. FCFS or simple FIFO is often used with SSDs, because random access costs the same as sequential.

Access Time Formula - Always Draw It

In exams, if asked about access time, write: Total = Seek + Rotation + Transfer. Identify which value you're given and what you need to calculate.

RAID Questions

Know RAID 0 (no fault tolerance, max speed), RAID 1 (mirroring), RAID 5 (distributed parity, best balance). These three appear most in exams.

📌 CONNECT TO FINAL EXAM The Final Exam sample (fe_Solution_211.pdf) asked to evaluate FCFS, SSTF, SCAN, and C-LOOK on the same disk queue. Be ready to do all four on any given queue. The most common asks: (1) write the order, (2) total cylinders moved, (3) which is best.
📝
Past Exam Questions & Answers
💡 HOW TO USE THIS SECTION Try to answer each question before clicking to reveal the answer. These are drawn from actual PSU CS330 past exams and final exam samples.
Q1. Which statement about SSDs compared to magnetic disks is correct?
(a) Faster but volatile   (b) Faster and nonvolatile   (c) Slower but cheaper   (d) Larger and volatile
Quiz 1
✅ ANSWER: (b) Faster and nonvolatile SSDs have no moving parts so access time is ~0.1ms vs ~5 - 10ms for HDDs. They retain data without power (nonvolatile) because they use flash memory. They are more expensive per GB than HDDs, not cheaper.
Q2. In disk scheduling, what is the "Convoy Effect" and which algorithm suffers from it? Concept
✅ ANSWER: FCFS suffers from the convoy effect When a request at a far cylinder arrives first, all subsequent requests (even those very close to the current head) must wait. The head "convoys" all the way to the distant cylinder first. This is why FCFS has the worst performance for disk scheduling - similar to the convoy effect in CPU scheduling (FCFS, long job first).
Q3. Given disk queue: 69, 1212, 296, 1800, 669, 1618, 356, 723, 1965, 1681. Head at 1150, previous at 1805. Using FCFS - which direction does the algorithm move first? What is the first request served? Final Sample
✅ ANSWER: FCFS serves 69 first (first in queue) FCFS ignores the current head position or previous position entirely. It simply serves requests in arrival order: 69 → 1212 → 296 → 1800 → 669 → 1618 → 356 → 723 → 1965 → 1681. The head must travel all over the disk - worst performance.
Q4. For the same queue (Head at 1150, moving toward higher cylinders), apply SSTF. What is the order of service? Why might this cause a problem? Final Sample
✅ ANSWER (SSTF): From 1150, nearest is 1212, then 1618, then 1681, then 1800, then 1965, then back to 723, 669, 356, 296, 69. SSTF greedy approach gives good performance but can starve requests at far cylinders if new nearby requests keep arriving. In this example, cylinder 69 waits a long time.
Q5. Apply SCAN to: Queue 69, 1212, 296, 1800, 669, 1618, 356, 723, 1965, 1681. Head at 1150, moving toward higher numbers. Write the sequence and explain when direction reverses. Final Sample
✅ ANSWER (SCAN): Moving toward higher: 1150 → 1212 → 1618 → 1681 → 1800 → 1965 → [1999, edge] → reverses → 723 → 669 → 356 → 296 → 69. The head reaches disk edge (1999) then sweeps back. SCAN prevents starvation - every request is served within one full sweep. Much better total movement than FCFS.
Q6. What is the difference between SCAN and C-SCAN? Between LOOK and C-LOOK? Concept
✅ ANSWER: SCAN vs C-SCAN: SCAN reverses direction at the disk edge and serves requests on the way back. C-SCAN goes one direction only - when it reaches the end, it jumps back to the beginning (cylinder 0) without serving requests on the return. C-SCAN gives more uniform wait times.

LOOK vs C-LOOK: LOOK is like SCAN but stops at the last request (doesn't go all the way to the disk edge). C-LOOK is like C-SCAN but jumps to the first pending request instead of cylinder 0. Both LOOK variants are more efficient than SCAN/C-SCAN.
Q7. A disk has 2000 cylinders (0 - 1999). Head is at cylinder 1150, previous was at 1805. Queue: 69, 1212, 296, 1800, 669, 1618, 356, 723, 1965, 1681. Which algorithm gives the best (minimum) total head movement? Final Sample
✅ ANSWER: C-LOOK typically gives the least total movement in practice. Approximate totals: FCFS ≈ 7081 | SSTF ≈ 1745 | SCAN ≈ 1745 (similar) | C-LOOK ≈ similar to SCAN but slightly less since it doesn't go all the way to 0. In general, SSTF, SCAN, and C-LOOK all dramatically outperform FCFS. The "best" is C-LOOK in practice for balanced loads.
Q8. What are the three components of disk access time? Which one is the focus of disk scheduling algorithms? Concept
✅ ANSWER: The three components are: (1) Seek time - moving arm to the correct cylinder, (2) Rotational latency - waiting for the sector to rotate under the head, (3) Transfer time - reading/writing the actual data.

Disk scheduling algorithms focus on minimizing seek time because it is the largest and most variable component. Rotational latency and transfer time are fixed by disk hardware and cannot be controlled by the OS.
Q9. Which RAID level provides both striping and mirroring for maximum performance and fault tolerance? Which RAID level provides NO fault tolerance? Concept
✅ ANSWER: RAID 10 (or RAID 1+0) combines striping (RAID 0) with mirroring (RAID 1) for both maximum performance and excellent fault tolerance. It requires 2× the disk space.

RAID 0 provides NO fault tolerance - it only uses striping. If one disk fails, ALL data is lost. "RAID 0 = Zero protection."
Q10. A file consists of 30 disk blocks (0 to 29). A block is inserted in the middle (as the 15th block). Using linked allocation, how many disk read and write I/O operations are needed? Final Exam Sample
✅ ANSWER: 15 reads + 2 writes In linked allocation, to insert as the 15th block, you must follow the chain from block 0 all the way to block 14 - that's 15 reads (blocks 0 through 14). Then you need 2 writes: one to update the pointer in block 14 to point to the new block, and one to write the new block itself with its pointer to the former 15th block. This shows a major disadvantage of linked allocation - sequential traversal is required.
Q11. What are the disadvantages of contiguous disk allocation? (Give 3) Final Exam Sample
✅ ANSWER: (1) External fragmentation: Over time, free space becomes scattered in small holes too small for new files. Compaction is needed but expensive.
(2) File size estimation: Must know file size in advance to allocate contiguous space. Hard for output files whose size isn't known ahead.
(3) Files cannot grow: If adjacent space is taken, the file must be moved entirely to find a larger contiguous area.
Q12. Why do disk scheduling algorithms matter less for SSDs than HDDs? Concept
✅ ANSWER: SSDs have no moving mechanical parts - there is no disk arm to seek and no spinning platter to wait for rotation. Therefore, seek time and rotational latency are essentially zero. Random access to any location takes approximately the same time as sequential access. Since seek time minimization is the entire point of algorithms like SCAN and SSTF, those algorithms offer little benefit for SSDs. Simple FIFO/FCFS is often used instead.

📚 CHAPTER SUMMARY Chapter 10 bridges the gap between how users see storage (files & folders) and how it's physically organized (cylinders, tracks, sectors). The key insight is that seek time dominates disk access, so intelligent scheduling of disk requests dramatically improves performance. For modern SSDs, this physical concern disappears - but understanding it is fundamental to OS design.