CS330 · Chapter 10 · Mass Storage Structure

🗺

Chapter Overview

🎯 BIG PICTURE This chapter answers: "How does the OS talk to physical storage?" - it covers the hardware structure of disks, how access time is calculated, and how the OS decides the order to serve disk requests to minimize wasted movement.

🗂 What You'll Learn

Physical structure of magnetic disks & SSDs
How access time is calculated (seek + rotation + transfer)
Disk scheduling algorithms (FCFS, SSTF, SCAN, C-SCAN, LOOK, C-LOOK)
RAID levels and why they exist
Boot block & bad block management

🔗 Connections to Other Chapters

Ch 8 - 9 (Memory/VM): Swap space lives on disk
Ch 11 - 12 (File Systems): Files are stored on disk; allocation methods use disk blocks
Ch 3 (Processes): I/O-bound vs CPU-bound processes - I/O often means disk

💿

Magnetic Disk Structure

Anatomy of a Hard Disk (HDD)

Understanding the physical layout is key to understanding why access time has multiple components.

Conceptual view of a magnetic disk platter

📐 Key Terms

Platter: Circular disk coated with magnetic material
Track: Concentric ring on a platter surface
Sector: Smallest addressable unit (usually 512B or 4KB)
Cylinder: All tracks at the same radius across all platters
Read/Write Head: Reads/writes data as platter spins under it
Disk Arm: Moves head radially across platters

🖥️ OS + User View

User sees: Files and folders
OS sees: Array of logical blocks
Disk sees: Sectors on tracks on cylinders

Logical Block → (Cylinder, Track, Sector)

The disk controller maps logical block numbers to physical locations.

💡 MEMORY TIP - "Cylinder = Tower" Think of a cylinder as a tower of tracks - all the tracks directly above and below each other on every platter. When we say "seek to cylinder 50," we mean move the arm so all heads are at track 50 on every platter simultaneously.

⚡

SSDs vs Magnetic Disks

Feature	Magnetic HDD	SSD (Solid-State)
Moving Parts	Yes (arm, platters)	No
Access Time	~3 - 12 ms	~0.1 ms
Volatile?	No	No
Fragmentation Issue	Yes (seek time)	No (random access same speed)
Price per GB	Cheaper	More expensive
Lifespan	Limited by mechanical wear	Limited write cycles
Suitable for Scheduling?	Yes - head movement matters	Less critical - no seek time

⚠ EXAM TRAP SSDs are faster and nonvolatile - this exact combination came up in Quiz 1! Don't confuse "nonvolatile" with "slow" - that's RAM vs disk confusion. SSDs are fast AND nonvolatile.

⏱

Disk Access Time

The Three Components of Access Time

Every disk access has three time costs. Understanding each component is critical for exam calculations.

① Seek Time

Time to move the disk arm to the correct cylinder (track). This is the dominant and most variable cost. Disk scheduling algorithms focus on minimizing this.

② Rotational Latency

Time for the desired sector to rotate under the read/write head. On average, this is half of one full rotation.

Avg Latency = ½ × (1/RPM) × 60,000 ms

③ Transfer Time

Time to actually read/write the data once the head is in position. Usually the fastest component.

Transfer Time = Data Size / Transfer Rate

Total Access Time = Seek Time + Rotational Latency + Transfer Time

✅ EXAMPLE (common exam style) If a disk spins at 7200 RPM: Average rotational latency = ½ × (1/7200) × 60,000 ≈ 4.17 ms. A typical seek time might be 5 - 10 ms, making seek + rotation the main bottleneck.

🧠 Memory Aid

S · R · T

Seek → Rotate → Transfer - like a journey: travel to the city → wait for the right street → pick up the package.

🗓

Disk Scheduling - Why It Matters

The Problem

Multiple processes request disk I/O simultaneously. The OS must decide the order to serve them. Since seek time is the biggest cost, a smart order can save huge amounts of time.

💡 ANALOGY Imagine an elevator with people requesting different floors. A naive elevator serves requests in arrival order (FCFS) - a smart one sweeps up then down (SCAN). Disk scheduling works exactly like an elevator!

Setting the Stage

All algorithms are evaluated on:

Total head movement (in cylinders) - lower is better
Throughput - requests served per time unit
Fairness - no request waits forever

📌 We always track: current head position, queue of pending requests, and sometimes direction of movement.

⚙

Disk Scheduling Algorithms

⚠ EXAM FOCUS You must be able to: (1) draw the head movement path, (2) calculate total cylinders traversed, for ALL algorithms. Practice with the examples below!

📋 Reference Example (used throughout)

Queue: 98, 183, 37, 122, 14, 124, 65, 67

Initial head position: 53 | Disk: 0 - 199

This is the classic textbook example from Silberschatz Ch.10

1. FCFS - First-Come, First-Served

Rule: Serve requests in the order they arrive. No reordering.

53 → 98 → 183 → 37 → 122 → 14 → 124 → 65 → 67

Total movement: |98−53| + |183−98| + |37−183| + ... = 640 cylinders

✅ Simple, fair (no starvation)
❌ Worst performance - head zigzags across disk

🧠 Remember

FCFS = First Come First Served = no optimization at all = like serving a restaurant table by table in order of who called first, even if they're all the way across town.

2. SSTF - Shortest Seek Time First

Rule: Always serve the request closest to the current head position.

53 → 65 → 67 → 37 → 14 → 98 → 122 → 124 → 183

Total movement: 236 cylinders

✅ Much better than FCFS
❌ Starvation possible - requests far from head may never get served
❌ Not optimal (greedy approach)

⚠ KEY EXAM POINT SSTF can cause starvation - a request at cylinder 5 might wait forever if new requests keep arriving near cylinder 50. This is SSTF's critical flaw.

🧠 SSTF ≈ SJF for disks

Just like SJF scheduling picks the shortest CPU burst, SSTF picks the shortest seek distance. Same concept, same starvation problem!

3. SCAN - The Elevator Algorithm

Rule: Head moves in one direction, serving all requests along the way, then reverses direction at the end (or last request) and sweeps back.

53 → 65 → 67 → 98 → 122 → 124 → 183 → [199] → 37 → 14

Total movement: ~236 cylinders (much less than FCFS)

✅ No starvation - head eventually reaches every cylinder
✅ Better throughput than FCFS
❌ Requests just behind the head wait for a full sweep

🧠 Think: Elevator

An elevator goes UP serving all floors, reaches top, comes back DOWN. The disk head does the same across cylinders 0 - 199.

4. C-SCAN - Circular SCAN

Rule: Like SCAN, but when the head reaches the end, it immediately jumps back to the beginning without serving requests on the return trip. Only serves one direction.

53 → 65 → 67 → 98 → 122 → 124 → 183 → [199] → [0] → 14 → 37

✅ More uniform waiting time than SCAN
✅ No starvation
❌ Jump from end to beginning counts as head movement

🧠 C-SCAN = One-Way Street

SCAN = elevator going up AND down. C-SCAN = only going up, teleports back to ground floor. More fair because requests near cylinder 0 aren't always last.

5. LOOK & C-LOOK

LOOK: Like SCAN but doesn't go all the way to the end of the disk - only goes as far as the last request in that direction, then reverses.

C-LOOK: Like C-SCAN but only goes to the last request, then jumps to the first request (not cylinder 0).

✅ More efficient than SCAN/C-SCAN - no unnecessary travel to disk edge
✅ Commonly used in practice

🧠 LOOK vs SCAN - "Look Before You Go"

SCAN goes to the wall. LOOK looks ahead and stops at the last request. Saves travel time at the edges.

📊 Algorithm Comparison Table

Algorithm	Strategy	Starvation?	Performance	Best For
`FCFS`	Arrival order	No	Worst	Light loads only
`SSTF`	Nearest cylinder first	⚠ Yes	Good	Low-latency if load is clustered
`SCAN`	Elevator sweep, reverses at edge	No	Good	Heavy loads
`C-SCAN`	One-way sweep, jump to start	No	Good, uniform	Uniform waiting time needed
`LOOK`	Like SCAN, stops at last request	No	Better than SCAN	General purpose
`C-LOOK`	Like C-SCAN, jumps to first request	No	Best practical	General purpose - most used

✅ SELECTION RULE (exam answer) The best algorithm depends on number and distribution of requests. For heavy disk I/O, SCAN/C-SCAN/LOOK are preferred. FCFS is only acceptable for very light loads where simplicity matters more.

💾

Other Storage Technologies

🔌 Swap Space (Swap Partition)

Section of disk used as extension of RAM
Pages swapped out to swap space when memory is full
Much slower than RAM - causes thrashing if overused
OS manages its own swap-space allocation separately from the file system

🔴 Boot Block (Bootstrap)

Located in a fixed location on disk (usually first sector)
Contains the bootstrap loader - loads the OS
ROM/EPROM on motherboard points CPU to boot block
Also called Master Boot Record (MBR)

⚠ Bad Blocks

Physical sectors that are damaged and can't reliably store data
Simple approach: Flag bad blocks in FAT so OS avoids them
Advanced approach: Sector sparing - disk controller secretly remaps bad blocks to spare sectors
Slip sparing - reassign all sectors to skip the bad one

📼 Tertiary Storage

Tapes, optical disks (CD/DVD), removable drives
Very slow but cheap and portable
Used for archival/backup storage
Accessed sequentially (tapes) or near-randomly (optical)

🛡

RAID - Redundant Array of Independent Disks

Why RAID?

A single disk can fail. RAID spreads data across multiple disks to improve reliability and/or performance. Two key techniques: redundancy (copies) and striping (parallelism).

RAID Level	Name	Technique	Fault Tolerance	Performance
RAID 0	Striping	Data spread across disks, no redundancy	❌ None	⚡ Best (parallel reads/writes)
RAID 1	Mirroring	Full copy on each disk	✅ Excellent (N-1 failures)	Read fast, write same as 1 disk
RAID 4	Block-level striping + dedicated parity	One disk stores parity blocks	✅ 1 failure	Parity disk is bottleneck
RAID 5	Distributed parity	Parity distributed across all disks	✅ 1 failure	✅ Good balance
RAID 6	Double parity	Two parity blocks per stripe	✅ 2 failures	Good, more overhead
RAID 10	Stripe of Mirrors	RAID 0 + RAID 1 combined	✅ Excellent	⚡ Excellent

💡 RAID 0 TRICK RAID 0 is all about speed, zero protection. "0 protection." If one disk fails, ALL data is lost. Great for scratch disks, terrible for important data.

✅ RAID 5 = The Sweet Spot RAID 5 is the most commonly used in practice. It balances storage efficiency, fault tolerance, and performance. Distributed parity = no single bottleneck disk.

🧠 RAID Levels Mnemonic

0·1·5·6·10

0 = Zero protection (speed only) | 1 = One full mirror | 5 = Five-star balance | 6 = Six is safer (2 parity) | 10 = Best of both worlds (stripe + mirror)

✨

Study Tips & Tricks

🎯 How to Tackle Disk Scheduling Problems

Write out the request queue and starting head position clearly
For SCAN/C-SCAN/LOOK/C-LOOK: establish the initial direction
Draw the sequence of head movements step by step
Compute |difference| between consecutive positions and sum them
Remember: only SSTF can cause starvation

🧠 The "Elevator" Map

            FCFS → No Order

            SSTF → Nearest First

            SCAN → ↑ then ↓ (to edge)

            C-SCAN → ↑ always, jump ↩

            LOOK → ↑ then ↓ (to last req)

            C-LOOK → ↑ always, jump to first req

🧠 Disk Structure ABC

            Sector = smallest unit

            Track = ring on platter

            Cylinder = stack of tracks

            Platter = the disk

            Arm = moves head radially

What OSes Use in Practice

Linux uses a variant of C-LOOK (deadline scheduler) with a real-time deadline to prevent starvation. Windows uses similar priority-based disk scheduling.

SSD Scheduling

SSDs don't need seek optimization since there's no moving head. FCFS or simple FIFO is often used with SSDs, because random access costs the same as sequential.

Access Time Formula - Always Draw It

In exams, if asked about access time, write: Total = Seek + Rotation + Transfer. Identify which value you're given and what you need to calculate.

RAID Questions

Know RAID 0 (no fault tolerance, max speed), RAID 1 (mirroring), RAID 5 (distributed parity, best balance). These three appear most in exams.

📌 CONNECT TO FINAL EXAM The Final Exam sample (fe_Solution_211.pdf) asked to evaluate FCFS, SSTF, SCAN, and C-LOOK on the same disk queue. Be ready to do all four on any given queue. The most common asks: (1) write the order, (2) total cylinders moved, (3) which is best.

📝

Past Exam Questions & Answers

💡 HOW TO USE THIS SECTION Try to answer each question before clicking to reveal the answer. These are drawn from actual PSU CS330 past exams and final exam samples.

Q1. Which statement about SSDs compared to magnetic disks is correct?
(a) Faster but volatile (b) Faster and nonvolatile (c) Slower but cheaper (d) Larger and volatile Quiz 1

✅ ANSWER: (b) Faster and nonvolatile SSDs have no moving parts so access time is ~0.1ms vs ~5 - 10ms for HDDs. They retain data without power (nonvolatile) because they use flash memory. They are more expensive per GB than HDDs, not cheaper.

Q2. In disk scheduling, what is the "Convoy Effect" and which algorithm suffers from it? Concept

✅ ANSWER: FCFS suffers from the convoy effect When a request at a far cylinder arrives first, all subsequent requests (even those very close to the current head) must wait. The head "convoys" all the way to the distant cylinder first. This is why FCFS has the worst performance for disk scheduling - similar to the convoy effect in CPU scheduling (FCFS, long job first).

Q3. Given disk queue: 69, 1212, 296, 1800, 669, 1618, 356, 723, 1965, 1681. Head at 1150, previous at 1805. Using FCFS - which direction does the algorithm move first? What is the first request served? Final Sample

✅ ANSWER: FCFS serves 69 first (first in queue) FCFS ignores the current head position or previous position entirely. It simply serves requests in arrival order: 69 → 1212 → 296 → 1800 → 669 → 1618 → 356 → 723 → 1965 → 1681. The head must travel all over the disk - worst performance.

Q4. For the same queue (Head at 1150, moving toward higher cylinders), apply SSTF. What is the order of service? Why might this cause a problem? Final Sample

✅ ANSWER (SSTF): From 1150, nearest is 1212, then 1618, then 1681, then 1800, then 1965, then back to 723, 669, 356, 296, 69. SSTF greedy approach gives good performance but can starve requests at far cylinders if new nearby requests keep arriving. In this example, cylinder 69 waits a long time.

Q5. Apply SCAN to: Queue 69, 1212, 296, 1800, 669, 1618, 356, 723, 1965, 1681. Head at 1150, moving toward higher numbers. Write the sequence and explain when direction reverses. Final Sample

✅ ANSWER (SCAN): Moving toward higher: 1150 → 1212 → 1618 → 1681 → 1800 → 1965 → [1999, edge] → reverses → 723 → 669 → 356 → 296 → 69. The head reaches disk edge (1999) then sweeps back. SCAN prevents starvation - every request is served within one full sweep. Much better total movement than FCFS.

Q6. What is the difference between SCAN and C-SCAN? Between LOOK and C-LOOK? Concept

✅ ANSWER: SCAN vs C-SCAN: SCAN reverses direction at the disk edge and serves requests on the way back. C-SCAN goes one direction only - when it reaches the end, it jumps back to the beginning (cylinder 0) without serving requests on the return. C-SCAN gives more uniform wait times.

LOOK vs C-LOOK: LOOK is like SCAN but stops at the last request (doesn't go all the way to the disk edge). C-LOOK is like C-SCAN but jumps to the first pending request instead of cylinder 0. Both LOOK variants are more efficient than SCAN/C-SCAN.

Q7. A disk has 2000 cylinders (0 - 1999). Head is at cylinder 1150, previous was at 1805. Queue: 69, 1212, 296, 1800, 669, 1618, 356, 723, 1965, 1681. Which algorithm gives the best (minimum) total head movement? Final Sample

✅ ANSWER: C-LOOK typically gives the least total movement in practice. Approximate totals: FCFS ≈ 7081 | SSTF ≈ 1745 | SCAN ≈ 1745 (similar) | C-LOOK ≈ similar to SCAN but slightly less since it doesn't go all the way to 0. In general, SSTF, SCAN, and C-LOOK all dramatically outperform FCFS. The "best" is C-LOOK in practice for balanced loads.

Q8. What are the three components of disk access time? Which one is the focus of disk scheduling algorithms? Concept

✅ ANSWER: The three components are: (1) Seek time - moving arm to the correct cylinder, (2) Rotational latency - waiting for the sector to rotate under the head, (3) Transfer time - reading/writing the actual data.

Disk scheduling algorithms focus on minimizing seek time because it is the largest and most variable component. Rotational latency and transfer time are fixed by disk hardware and cannot be controlled by the OS.

Q9. Which RAID level provides both striping and mirroring for maximum performance and fault tolerance? Which RAID level provides NO fault tolerance? Concept

✅ ANSWER: RAID 10 (or RAID 1+0) combines striping (RAID 0) with mirroring (RAID 1) for both maximum performance and excellent fault tolerance. It requires 2× the disk space.

RAID 0 provides NO fault tolerance - it only uses striping. If one disk fails, ALL data is lost. "RAID 0 = Zero protection."

Q10. A file consists of 30 disk blocks (0 to 29). A block is inserted in the middle (as the 15th block). Using linked allocation, how many disk read and write I/O operations are needed? Final Exam Sample

✅ ANSWER: 15 reads + 2 writes In linked allocation, to insert as the 15th block, you must follow the chain from block 0 all the way to block 14 - that's 15 reads (blocks 0 through 14). Then you need 2 writes: one to update the pointer in block 14 to point to the new block, and one to write the new block itself with its pointer to the former 15th block. This shows a major disadvantage of linked allocation - sequential traversal is required.

Q11. What are the disadvantages of contiguous disk allocation? (Give 3) Final Exam Sample

✅ ANSWER: (1) External fragmentation: Over time, free space becomes scattered in small holes too small for new files. Compaction is needed but expensive.
(2) File size estimation: Must know file size in advance to allocate contiguous space. Hard for output files whose size isn't known ahead.
(3) Files cannot grow: If adjacent space is taken, the file must be moved entirely to find a larger contiguous area.

Q12. Why do disk scheduling algorithms matter less for SSDs than HDDs? Concept

✅ ANSWER: SSDs have no moving mechanical parts - there is no disk arm to seek and no spinning platter to wait for rotation. Therefore, seek time and rotational latency are essentially zero. Random access to any location takes approximately the same time as sequential access. Since seek time minimization is the entire point of algorithms like SCAN and SSTF, those algorithms offer little benefit for SSDs. Simple FIFO/FCFS is often used instead.

📚 CHAPTER SUMMARY Chapter 10 bridges the gap between how users see storage (files & folders) and how it's physically organized (cylinders, tracks, sectors). The key insight is that seek time dominates disk access, so intelligent scheduling of disk requests dramatically improves performance. For modern SSDs, this physical concern disappears - but understanding it is fundamental to OS design.

Chapter 10: Mass Storage Structure

🗂 What You'll Learn

🔗 Connections to Other Chapters

Anatomy of a Hard Disk (HDD)

📐 Key Terms

🖥️ OS + User View

The Three Components of Access Time

① Seek Time

② Rotational Latency

③ Transfer Time

🧠 Memory Aid

The Problem

Setting the Stage

📋 Reference Example (used throughout)

1. FCFS - First-Come, First-Served

🧠 Remember

2. SSTF - Shortest Seek Time First

🧠 SSTF ≈ SJF for disks

3. SCAN - The Elevator Algorithm

🧠 Think: Elevator

4. C-SCAN - Circular SCAN

🧠 C-SCAN = One-Way Street

5. LOOK & C-LOOK

🧠 LOOK vs SCAN - "Look Before You Go"

🔌 Swap Space (Swap Partition)

🔴 Boot Block (Bootstrap)

⚠ Bad Blocks

📼 Tertiary Storage

Why RAID?

🧠 RAID Levels Mnemonic

🎯 How to Tackle Disk Scheduling Problems

🧠 The "Elevator" Map

🧠 Disk Structure ABC

What OSes Use in Practice

SSD Scheduling

Access Time Formula - Always Draw It

RAID Questions