What You'll Master
- On-disk structures: boot block, superblock, inode, FCB
- In-memory structures: mount table, open-file tables, buffers
- Three allocation methods: Contiguous, Linked, Indexed
- Four free-space methods: Bit vector, Linked list, Grouping, Counting
- Calculating logical → physical addresses for each method
Exam Focus
- Know the names and jobs of the on-disk and in-memory structures
- Be able to translate a logical address for contiguous, linked, and indexed allocation
- Compare methods clearly: fragmentation, random access, growth, and overhead
- Recognize free-space methods quickly: bitmap, linked list, grouping, counting
The Disk Layout - Layers from Outer to Inner
Think of a disk as a book. The on-disk structures are the table of contents, chapters, and index - all physically stored on the disk, surviving power-off.
Disk Layout (Conceptual)
← Low address on disk High address →
① Boot Control Block
- First block of the volume (if it's bootable)
- Contains info needed to boot the OS from that partition
- Called Boot Sector (NTFS) or Boot Block (UFS)
- Empty on non-boot volumes
② Volume Control Block (Superblock)
- Stores partition-level metadata:
- Total number of blocks
- Number of free blocks
- Block size
- Free block pointers / free inode or FCB information
- Called Superblock in UNIX-like systems
③ Directory Structure
- Organises all files on the partition
- In UNIX: maps file names → inode numbers
- Starts at the root directory
- Stored on disk - survives reboot
④ Per-File FCB (inode)
- File Control Block - one per file
- Contains: unique ID, type, size, protection, timestamps, owner, and pointers to data blocks
- Called inode in most UNIX systems
- In UNIX, the file name is stored in the directory entry, not inside the inode itself
- Created at file creation; deleted when file is deleted
Boot block → Superblock → Directory → FCB/inode → Data blocks
Think: "Big Systems Directly
Find Data"
Why In-Memory Structures?
Disk access is slow. The OS therefore keeps frequently used file-system metadata in RAM so each file operation does not have to re-read the disk. Think of memory structures as the working copies of the permanent on-disk structures.
Mount Table
- Lists all currently mounted file system volumes
- Each entry: device, mount point, FS type, options
- When you run
mount /dev/sdb1 /mnt/usb, an entry is added here
Directory Structure Cache
- Caches recently accessed directory entries in memory
- Avoids re-reading directories from disk on every path lookup
- Called the dentry cache (dcache) in Linux
System-Wide Open-File Table
- One entry per open file in the entire system (not per process)
- Contains: a memory copy/reference of the FCB or inode plus status information such as open count
- When open count hits 0 → entry is removed
Per-Process Open-File Table
- One table per running process
- Contains: pointer into the system-wide table plus process-specific access information
- This is the table indexed by the file descriptor returned to the process
📖 How open() Works - Step by Step
📕 How close() Works
The Core Problem
A file is a sequence of logical blocks. The disk is a sequence of physical blocks. How do we map one to the other? Three strategies - each with distinct trade-offs on fragmentation, speed, and random access.
① Contiguous Allocation
Each file occupies a set of consecutive (contiguous) blocks on disk. The directory entry stores only the starting block address and the length (number of blocks).
Offset = LA mod BlockSize
Physical Block = StartAddress + LBN | Displacement = Offset
Q = 2500 ÷ 1024 = 2 | R = 2500 mod 1024 = 452
Physical Block = 6 + 2 = 8 | Displacement = 452
- ✅ Simple - only start + length needed
- ✅ Excellent for sequential AND random access
- ❌ External fragmentation - holes appear as files are deleted
- ❌ Files cannot grow - no adjacent space may exist
- ❌ Must know file size in advance
Contiguous = parking lot with one long reserved spot. Easy to find your car (start + length). Problem: can't grow, leaves gaps when cars leave.
② Linked Allocation
Each file is a linked list of disk blocks. Each block contains a pointer to the next block. Blocks can be anywhere on the disk. The directory stores the first block pointer only.
→10
→3
→nil
Logical Block Number = LA ÷ (BlockSize − PointerSize)
Offset = LA mod (BlockSize − PointerSize)
Then follow the chain from the first block for that many links
- ✅ No external fragmentation
- ✅ Files can grow easily - just add any free block
- ✅ No need to declare file size upfront
- ❌ Sequential access only - to reach block i, must traverse from block 0
- ❌ Each block wastes space on a pointer
- ❌ Poor reliability - one broken pointer loses the rest of the file
Linked = a treasure hunt - each clue leads to the next. Can't skip to clue 7 without following clues 1 - 6 first. Great for exploring in order, terrible for jumping around.
③ Indexed Allocation
Each file has its own dedicated index block - an array of pointers to all its data blocks. The directory stores the address of the index block. Supports both sequential and direct (random) access.
block
Offset = LA mod BlockSize
Physical Block = IndexBlock[Logical Block Number] | Displacement = Offset
Q = 5000 ÷ 2048 = 2 (truncated) | R = 5000 mod 2048 = 904
Physical Block = IndexBlock[2] (look up the 3rd pointer) | Displacement = 904
- ✅ No external fragmentation
- ✅ Supports random (direct) access efficiently
- ✅ Files can grow (add entries to index block)
- ❌ Index block takes up space even for tiny files
- ❌ Large files may need multi-level index blocks
Indexed = a book's table of contents. The TOC (index block) lists exactly where each chapter (data block) starts. You can jump directly to any chapter without reading the others.
UNIX UFS Combined Scheme (Inode)
UNIX handles large files by using a combined indexing scheme inside each inode:
| Pointer Type | How It Works | Covers |
|---|---|---|
| Direct (12 pointers) | Points directly to a data block | Small files (up to 12 × BlockSize) |
| Single Indirect | Points to a block of pointers → each points to data | Medium files |
| Double Indirect | Points to block of pointers → each points to block of pointers → data | Large files |
| Triple Indirect | 3 levels of indirection → data | Very large files |
"Direct, Single, Double, Triple" - like floors in a building: ground floor = direct access, floor 1 = one elevator ride, floor 2 = two rides, floor 3 = three rides. Higher floor = more overhead but reaches higher "addresses".
| Feature | Contiguous | Linked | Indexed |
|---|---|---|---|
| External Fragmentation | Yes ❌ | No ✅ | No ✅ |
| Main Space Cost | Possible unused bytes in last block | Pointer field inside each allocated block | Separate index block(s) + possible last-block waste |
| Sequential Access | Excellent ✅ | Good ✅ | Good ✅ |
| Random (Direct) Access | Excellent ✅ | Poor ❌ (must traverse) | Excellent ✅ |
| File Can Grow? | Hard ❌ | Easy ✅ | Easy ✅ |
| Disk I/O for access | 1 access | N accesses (chain walk) | 2 accesses (index + data) |
| Pointer Overhead | None | Per block | Per file (index block) |
| Classic Example | CD-ROM, old systems | Linked-list style files or FAT-style variants | UNIX UFS, ext3/4 |
The Problem
When a file is deleted, its blocks become free. The OS must track which blocks are available so it can allocate them to new files. Four strategies - each with different speed/space trade-offs.
① Bit Vector (Bitmap)
One bit per disk block. 1 = free, 0 = allocated.
A "0-value word" = all bits = 0 (all blocks in that word are allocated)
Word 0:
00000000 (0-value) |
Word 1: 00000000 (0-value) |
Word 2: 00010000 (has a free block at position 3)Block number = 8 × 2 + 3 = 19 → first free block is block 19.
Number of blocks = 1 TB ÷ 4 KB = 268,435,456 blocks.
Bitmap size = 268,435,456 bits ÷ 8 = 33,554,432 bytes = 32 MB
- ✅ Simple, fast to find first free block using CPU bit operations
- ✅ Easy to find contiguous free blocks
- ❌ Inefficient unless entire bitmap kept in memory
- ❌ Requires extra space proportional to disk size
② Linked List (Free List)
All free blocks are linked together. A special pointer (in the superblock) points to the first free block. Each free block contains a pointer to the next free block.
used
→7
→9
→5
→10
→nil
- ✅ No wasted space - pointers stored inside free blocks themselves
- ❌ Cannot efficiently find contiguous free space
- ❌ Traversing entire list requires many disk reads (I/O intensive)
- ❌ Poor performance for bulk allocation
Linked free list = chain of empty rooms in a hotel where each room's key tells you which room is free next. Finding 5 rooms in a row requires walking the whole chain.
③ Grouping
A modification of the linked list. The first free block stores the addresses of n free blocks. The last of these n blocks stores the addresses of the next n free blocks, and so on.
- ✅ Large number of free block addresses found quickly
- ✅ Better than simple linked list for bulk operations
- ❌ More complex than bitmap or linked list
④ Counting
Instead of listing every free block, store the address of the first free block and a count of how many consecutive free blocks follow it.
start
free
start
free
free
start
- ✅ Compact list - especially useful when free space is largely contiguous
- ✅ Each entry covers multiple blocks with (address, count)
- ❌ Each entry larger than a simple address
- ❌ Less efficient when free blocks are scattered
Counting = "I have 3 parking spots starting at row 7." Much more compact than listing spots 7, 8, 9 separately - especially in a mostly-empty lot.
| Method | Storage Structure | Find Free Block | Find Contiguous? | Best For |
|---|---|---|---|---|
| Bit Vector | 1 bit per block array | Scan bitmap - fast with CPU ops | Yes ✅ | General purpose - most common |
| Linked List | Chain of free blocks | Get head pointer - O(1) | No ❌ | Simple systems, low memory |
| Grouping | First free block holds n addresses | Read one block → n addresses | Partial | When many free blocks needed fast |
| Counting | (address, count) pairs | Scan short list | Yes ✅ | When free space is mostly contiguous |
Contiguous - fast, fragmented
Linked - flexible, no random access
Indexed - best of both, overhead
Bitmap - 1 bit per block
Linked - chain of free blocks
Grouping - chunks of n addresses
Counting - (start, count) pairs
🎯 How to Solve Address Translation Problems
- Identify the allocation method - contiguous, linked, or indexed?
- Write the correct formula for that method (Q and R)
- For Contiguous: Q = LA ÷ BlockSize, Physical = Start + Q, Offset = R
- For Linked: Q = LA ÷ (BlockSize − PtrSize), walk Q steps in chain
- For Indexed: Q = LA ÷ BlockSize, lookup IndexBlock[Q], Offset = R
- Check validity: Is the resulting block within the file's bounds?
🎯 How to Solve Bitmap Problems
- Identify word size (e.g. 8 bits, 16 bits, 32 bits)
- Count 0-value words (all bits = 0) from the start
- Find the first word that is NOT all zeros
- Within that word, find the offset (position) of the first 1 bit - count from the left (bit 0 = leftmost)
- Block number = (bits per word × number of 0-value words) + offset
If a question says "no external fragmentation" → Linked or Indexed.
Contiguous = external fragmentation. Indexed = index block overhead.
This is a classic example of why linked allocation has poor performance for insertions - sequential traversal is mandatory.
- External fragmentation: Free space breaks into scattered small holes. Requires compaction/defragmentation which is expensive.
- File size must be known in advance: Hard to estimate for output files. If you underestimate, the file can't grow.
- Files cannot grow: If there's no adjacent free space, the entire file must be relocated to a larger hole.
Q = 2500 ÷ 1024 = 2 (integer division)
R = 2500 mod 1024 = 2500 − (2 × 1024) = 452
Physical disk block = Start + Q = 6 + 2 = Block 8
Displacement within block = 452 bytes
- Boot Control Block (Boot Sector): Contains info needed to boot the OS from that partition. Located at the start of the volume.
- Volume Control Block (Superblock/MFT): Stores partition metadata - total blocks, free blocks, block size, free block pointers.
- Directory Structure: Organises file names and their associated inode/FCB numbers, starting at the root.
- Per-file FCB (inode): Stores file metadata such as type, size, timestamps, owner, protection, and pointers to data blocks.
The per-process open-file table has one entry per file per process. It contains a pointer into the system-wide table, the current file offset for that process, and the access mode (read/write). This is why two processes reading the same file have independent file positions.
Word 0: 00000000 | Word 1: 00000000 | Word 2: 00010000
What is the first free block number? Chapter 12 Slides
Step 1: Words 0 and 1 are 0-value (all bits = 0) → 2 zero-value words.
Step 2: Word 2 = 00010000 → first 1 bit is at position 3 (counting from left, index 3).
Step 3: Block = (bits per word × zero-value words) + offset = (8 × 2) + 3 = 19
Number of blocks = 1 TB ÷ 4 KB = 1,099,511,627,776 ÷ 4,096 = 268,435,456 blocks
Bitmap size = 268,435,456 bits ÷ 8 = 33,554,432 bytes = 32 MB
This shows a practical limitation of the bitmap approach - for large disks, the bitmap itself requires significant memory to keep it loaded in RAM.
FAT solves this by moving all the pointers out of the data blocks and into a separate table at the start of the disk. The FAT has one entry per disk block; each entry is the block number of the next block in the file chain (or a special EOF marker). The data blocks themselves contain only file data.
Advantages over standard linked: The entire FAT can be cached in memory for speed; data blocks are full-sized powers of 2; supports faster random access since you can walk the FAT in memory rather than reading disk blocks to find pointers.
- 12 direct pointers: Each points directly to a data block - efficient for small files
- 1 single indirect pointer: Points to a block of pointers (one level of indirection)
- 1 double indirect pointer: Points to a block of pointers, each pointing to another block of pointers
- 1 triple indirect pointer: Three levels of indirection - for very large files
In grouping, the first free block stores the addresses of n free blocks. The last of those n entries points to another group of n free blocks. Reading one disk block gives you n−1 usable free block addresses immediately - far fewer disk accesses for bulk allocation.
R = 1200 mod 512 = 1200 − 1024 = 176
Physical Block = IndexBlock[2] (read the 3rd entry, index 2, from the index block at disk block 25)
Displacement = 176 bytes into that block.
Note: You must first read disk block 25 to get the index table, then use the value at position [2] to get the actual data block address. This is why indexed allocation requires 2 disk accesses (1 for index + 1 for data).
- (a) External fragmentation: Contiguous = YES (big problem). Linked = No. Indexed = No.
- (b) Main space overhead: Contiguous = possible unused bytes in the last block. Linked = pointer field stored in each allocated block. Indexed = index block overhead (and possible unused bytes in the last data block).
- (c) Random/direct access: Contiguous = Excellent (O(1), start + Q). Linked = Poor (must traverse chain from start, O(n)). Indexed = Excellent (O(1), look up IndexBlock[Q]).
- Mount table: Info about each mounted volume/device currently attached to the system.
- Directory structure cache (dentry cache): Caches recently accessed directory entries to avoid repeated disk reads for path lookups.
- System-wide open-file table: One entry per open file in the system; contains FCB copy and open count.
- Per-process open-file table: One per process; contains pointer to system-wide entry, current file offset, and access mode.
- (Also: Buffers that hold file-system blocks being read/written to disk.)