CS330 · Operating Systems · Chapter 11

File-System Interface

File concepts, access methods, directory structures, and file protection - everything you need to master the FS interface layer.

01

File Concept

A file is a collection of related data, normally stored on secondary storage. It is the smallest unit of nameable storage from the user's perspective. The OS provides a uniform logical view of stored information regardless of the physical medium.

Key idea: Files hide the complexity of physical storage. Users and programs work with logical files; the OS maps them to physical blocks on disk.

File Attributes Memorize

Attribute Description
Name The only info stored in human-readable form
Type Needed for systems that support different file types
Location Pointer to file location on the device
Size Current file size in bytes, words, or blocks
Protection Controls who can read, write, execute
Time, date, user ID For protection, security, and usage monitoring

🧠 Mnemonic - File Attributes

N-T-L-S-P-T

Name · Type · Location · Size · Protection · Time/User

Think: "Navigating The Library Saves Plenty of Time"

File Types

Data Files
  • Numeric data
  • Character data
  • Binary data
Program Files
  • Source files (.c, .java)
  • Object files (.o)
  • Executable files (.exe)
💡 Remember
File type is indicated by extension (e.g. .txt, .exe). The OS uses extensions to decide what to do with a file (e.g. run it, open it with an editor).

File Operations Exam Favourite

The OS must provide 6 basic operations on files:

Create
Write
Read
Seek
Delete
Truncate

Two additional important operations:

Open(Fᵢ)
Close(Fᵢ)
Open vs Close: Open(Fᵢ) searches the directory on disk and moves the FCB (file control block) entry into memory. Close(Fᵢ) moves the in-memory FCB entry back to disk.

🧠 Mnemonic - 6 Basic Operations

C-W-R-S-D-T

Create · Write · Read · Seek · Delete · Truncate

"Clever Writers Really Seek Dramatic Titles"

Open Files - Key Data Structures Important

To avoid repeated directory searches, the OS keeps an open-file table in memory.

Data Item Purpose
Open-file table Contains info about all currently open files
File pointer Tracks last read/write position - per process
File-open count Counts how many processes have file open; when 0, file is removed from table
Disk location Location of file on disk - kept in memory to avoid repeated disk access
Access rights Per-process access mode (read, write, etc.)
⚠️ Common Mistake
The file pointer is per-process. Multiple processes can have the same file open simultaneously with different file pointers.

File Structure

Files can have different internal structures that the OS must understand:

None (sequence of bytes)
Simple record (lines, fixed/variable length)
Complex Structures (formatted docs, relocatable load files)
💡 Trick
The OS requires executable files to have a specific structure so it knows where to load them in memory and where the first instruction is. Other file structures are left to the programmer.
02

Access Methods

How can information stored in files be accessed and read into memory? There are two primary methods and one derived method:

Sequential Access

Information is processed in order - one record after another, like a tape.

Most common

Used by: compilers, editors, word processors.

read → read → read → read ← rewind ←

Direct (Random) Access

Fixed-length logical records. Programs can read/write records in any order with no restrictions.

Databases

File = numbered sequence of blocks.

read(block 5) → read(block 1) → write(block 9)

Other Access Methods - Index-Based

Built on top of direct access by creating an index for the file.

  • Index file stores pointers into the main data file.
  • For large files, a primary index points to secondary indices, which point to actual data.
  • Example: IBM's ISAM (Indexed Sequential Access Method).
Primary Index → Secondary Index → Data Items [small] [medium] [large file]
💡 Analogy
Think of it like a book's Table of Contents → Chapter headings → Actual text. Two-level indexing allows fast lookup even for massive files.
03

Directory Structure

A directory is a collection of nodes containing information about all files. Both the directory structure and the files reside on disk.

File-System Structure: Disk → Partitions → Each partition has its own directory → Files within. A typical partition includes a boot block, volume info, directory, and file data.

Operations on Directories

Search
Create
Delete
List
Rename
Traverse

Directory Organization Goals

Efficiency
Locate a file quickly.
Naming
Two users can have same name for different files. Same file can have several names.
Grouping
Logical grouping by properties (e.g. all Java programs).
Convenience
Easy for users to navigate and find files.

Types of Directory Structures Exam Favourite

① Single-Level Directory

All files are in the same directory. Simple to support and understand.

[ Dir: file1, file2, file3, ... fileN ]
⚠️ Limitations
  • Naming problem: All files need unique names (conflicts when users share the system).
  • Grouping problem: No way to logically group related files.

② Two-Level Directory

Separate directory for each user. A Master File Directory (MFD) is searched at login; each user has their own User File Directory (UFD).

MFD ├── User1: [file_a, file_b] ├── User2: [file_a, file_c] └── User3: [file_x]
✓ Same filename for different users
✓ Efficient searching
⚠️ Limitation
Isolates users - problem when users want to cooperate and share files. No grouping capability.

③ Tree-Structured Directories

Most common. A tree has a root directory and every file has a unique path name.

/ ├── spell/ │ └── mail/ │ ├── prog │ ├── copy │ └── count └── docs/ └── report.txt
✓ Efficient searching
✓ Grouping capability
Path Type Description
Absolute path Begins at root → full path down → file
Relative path Defines path from the current directory
💡 Remember
Current directory: Where most files of current interest live. Commands like mkdir and rm operate relative to current dir. In UNIX, rm -r removes an entire subtree!

④ Acyclic-Graph Directories

Allows directories to share subdirectories and files. Changes made by one user are immediately visible to the other.

User1/dict ──────────────────┐ shared_file User2/spell ─────────────────┘

Two implementation approaches:

Links (symbolic pointers to real path)
Duplicate file information
⚠️ Problems
  • Aliasing problem: A file can have multiple absolute path names.
  • Dangling pointer: If someone deletes the shared file, others' links break.
  • Solution: Delete only the link, OR preserve file until all references are deleted (reference count = 0 → then delete).

⑤ General Graph Directory

If links to subdirectories (not just files) are allowed, cycles can form. This complicates deletion.

⚠️ Problems with Cycles
  • Reference count may never reach 0 even when no external reference exists.
  • Requires garbage collection to find and free truly unreachable files.
  • Or use a cycle detection algorithm every time a new link is added.
💡 Simple Solution
Allow links only to files, not subdirectories. This keeps the structure acyclic.

🧠 Mnemonic - Directory Types in Order

S-T-T-A-G

Single · Two-level · Tree · Acyclic · Graph

"Students Tend To Ace General exams" - each adds more power but more complexity!

Type Naming ✓ Grouping ✓ Sharing ✓
Single-level
Two-level Limited
Tree
Acyclic graph
General graph
04

File Protection

Protection is needed so that users cannot accidentally or deliberately destroy someone else's information. The OS must control which operations each user can perform on each file.

Controlled operations include: Read · Write · Execute · Append · Delete · List

Access Lists and Groups Exam Favourite

An Access-Control List (ACL) specifies username + type of access allowed for each user.

⚠️ Problems with Full ACL
  • Length: Must list ALL users with access.
  • List construction is tedious.
  • Variable-size directory entries are complicated to manage.

UNIX-Style Three-Class Solution

Group all users into three classes. Much simpler - only 9 bits needed per file.

Class Who Permissions (RWX)
Owner The file's creator rwx = 111
Group Users in the same group rw- = 110
Universe All other users r-- = 100
Example UNIX permissions: rwxr-xr-- → Owner: rwx (can read, write, execute) → Group: r-x (can read and execute) → Others: r-- (can only read)

🧠 Mnemonic - UNIX Permission Classes

O-G-U

Owner · Group · Universe (World)

"Only Good Users" can access (in decreasing privilege).

Remember: R=4, W=2, X=1. So rwx = 7, rw- = 6, r-- = 4. File chmod 755 = Owner: 7 (rwx), Group: 5 (r-x), Others: 5 (r-x).

05

Tips, Tricks & Exam Strategy

🎯 Most Tested Concepts

Directory Types & Differences
Access Methods (Sequential vs Direct)
UNIX RWX Permissions
Open File Table entries
Dangling pointer problem
File Operations list
💡 Directory Shortcuts
  • Single-level = Naming problem (no duplicate names)
  • Two-level = Isolation problem (can't cooperate)
  • Tree = Current working directory concept
  • Acyclic = Dangling pointer + reference count
  • General graph = Garbage collection needed
💡 Access Method Rules
  • Sequential = tapes, compilers, editors
  • Direct = databases, large immediate lookups
  • Indexed = built on top of direct access
  • Direct access requires fixed-length records
  • Sequential access = one record after another
⚠️ Don't Confuse
Delete vs Truncate: Delete removes the file AND its entry from the directory. Truncate erases the file contents but keeps the file and its attributes (name, protection, etc.). The file still exists - just empty!
💡 File Pointer Facts
The file pointer is stored in the per-process open-file table, NOT in the system-wide open-file table. This is because different processes reading the same file need their own positions - like different people reading different pages of the same book.

📊 Quick Comparison Table

Feature Sequential Direct Indexed
Record size Variable Fixed Variable or fixed
Access order In order only Any order Any order (via index)
Use case Compilers, editors Databases Large structured files
Speed Slow (must scan) Fast Very fast

🧠 Big Picture Memory Map

FILE SYSTEM INTERFACE │ ├── FILE CONCEPT │ ├── Attributes: N-T-L-S-P-T │ ├── Operations: C-W-R-S-D-T + Open/Close │ └── Open-file table (per-process + system-wide) │ ├── ACCESS METHODS │ ├── Sequential (tape model, in-order) │ ├── Direct (random, fixed-length records) │ └── Indexed (built on direct, for large files) │ ├── DIRECTORY STRUCTURE │ ├── Single → Two → Tree → Acyclic → Graph │ └── Goals: Efficiency, Naming, Grouping │ └── PROTECTION ├── ACL (full list - complex) └── UNIX: Owner | Group | Universe (RWX)
06

Past Exam Questions & Answers

These questions are drawn from past CS330 exams and quizzes. Click any question to reveal the answer and explanation.

Multiple Choice Questions

MCQ Which operation removes all file contents but keeps its attributes?
✓ ANSWER: Truncate

Explanation: Truncate erases the contents of a file but keeps all its attributes (name, location, protection, etc.) in the directory entry. The file still exists - it just becomes empty.

  • Delete = removes file AND its directory entry entirely
  • Close = moves FCB from memory back to disk
  • Seek = repositions the read/write pointer
MCQ Which access method reads data in order, record after record?
✓ ANSWER: Sequential Access

Explanation: Sequential access processes information in order - one record after another, like a tape. It is the most common access method used by compilers and editors.

Direct access (random access) allows programs to read/write records in any order.

MCQ Which directory structure allows the same file to appear under multiple path names?
✓ ANSWER: Acyclic Graph Directory

Explanation: In an acyclic-graph directory, files can be shared via links. A single file can appear under multiple path names because different directories can point to the same inode/FCB. This is known as the aliasing problem.

  • Tree directories: each file has exactly one path (no sharing)
  • Acyclic graphs: sharing is possible via links → multiple paths
MCQ Which directory structure requires all file names to be unique (globally)?
✓ ANSWER: Single-Level Directory

Explanation: In a single-level directory, all files are in the same directory. Since there is no separation between users, every file must have a globally unique name. This is the naming problem of single-level directories.

MCQ Which type of users does the UNIX protection model define?
✓ ANSWER: Owner, Group, Universe

Explanation: UNIX uses a 3-class model instead of a full ACL for simplicity:

  • Owner: The user who created/owns the file
  • Group: Users belonging to the same group as the file
  • Universe (World/Others): All remaining users

Each class gets 3 permission bits: R (read=4), W (write=2), X (execute=1).

MCQ Which file operation repositions the read/write pointer within the file?
✓ ANSWER: Seek

Explanation: The Seek operation searches the directory for the appropriate entry and repositions the current-file-position pointer to a given value. It does not transfer any data.

MCQ When the file-open count reaches zero, the OS does what?
✓ ANSWER: Removes the entry from the open-file table

Explanation: The file-open count tracks how many processes currently have a file open. When the last process closes the file, the count drops to 0 and the OS removes the file's entry from the open-file table. The file is not deleted - it just no longer needs an in-memory entry.

Conceptual / Short Answer

Short Answer What is the main difference between Single-Level and Two-Level directory structures? What problem does the Two-Level design solve?
✓ Model Answer

Single-level: All files reside in one shared directory. Problem: All file names must be globally unique (naming problem), and there is no way to group related files (grouping problem).

Two-level: Each user has their own User File Directory (UFD), managed by a Master File Directory (MFD) at the top. This solves the naming problem - different users can have files with the same name because they live in separate UFDs. It also provides efficient searching. However, it still lacks grouping capability within a user's directory and isolates users (hard to share files between them).

Short Answer What is a dangling pointer in the context of acyclic-graph directories? How does the OS solve this problem?
✓ Model Answer

A dangling pointer occurs when a shared file is deleted (one user removes it), but other users/directories still have links pointing to it. Those links now point to a non-existent file - they "dangle."

Solutions:

  • Delete only the link, not the actual file. The original file remains until all links are removed.
  • Reference counting: Maintain a count of how many links point to the file. Only physically delete the file when the reference count reaches 0.
Short Answer What does the open-file table contain? Why is a per-process table needed in addition to the system-wide table?
✓ Model Answer

System-wide open-file table: Contains a copy of the FCB (file control block) of every open file, the open-count, and the disk location of the file. Shared across all processes.

Per-process open-file table: Contains the file pointer (current read/write position), access rights, and a pointer into the system-wide table. This is per-process because different processes reading the same file need their own independent position pointer.

The two-table design avoids redundancy (FCB stored once) while supporting independent file positions per process.

Short Answer Describe Sequential Access and Direct Access. Give one real-world use case for each.
✓ Model Answer

Sequential Access: Information is processed in order - one record after another. You can read forward or rewind to the beginning, but you cannot jump to an arbitrary record. Like a tape. Use case: Compilers reading source code, log files being processed.

Direct (Random) Access: The file consists of fixed-length logical records. A program can read or write any record in any order without restriction - you can jump directly to record N. Use case: Database systems needing immediate access to specific records (e.g. look up employee #5000 directly).

Short Answer A file has UNIX permissions rwxr-x--x. What can the Owner, Group, and Others do?
✓ Model Answer
Class Bits Permissions
Owner rwx Can read, write, AND execute
Group r-x Can read and execute, but NOT write
Others --x Can ONLY execute - cannot read or write

In octal: Owner=7, Group=5, Others=1 → chmod 751

Short Answer What are the six basic file operations the OS must support?
✓ Model Answer
  1. Create: Find space in the file system and make a directory entry for the new file.
  2. Write: System call specifying the file name and info to be written.
  3. Read: System call specifying the file name and where to put the next block.
  4. Seek: Reposition the current-file-position pointer within the file.
  5. Delete: Search directory for named file, erase the directory entry, release file space.
  6. Truncate: Erase contents of file but keep its attributes (file still exists, just empty).

Plus two critical operations: Open(Fᵢ) and Close(Fᵢ) - used before/after accessing files to manage the open-file table.

Short Answer Why are ACLs problematic in practice, and what simpler solution does UNIX use?
✓ Model Answer

ACL Problems:

  • The list can be very long (must list every user who can access the file).
  • The set of users may not be known in advance.
  • Variable-size directory entries complicate space management.

UNIX Solution: Condense all users into three classes - Owner, Group, and Universe - and assign three permission bits (R, W, X) to each class. This gives 9 bits total per file, which is compact and efficient.