File Concept
A file is a collection of related data, normally stored on secondary storage. It is the smallest unit of nameable storage from the user's perspective. The OS provides a uniform logical view of stored information regardless of the physical medium.
File Attributes Memorize
| Attribute | Description |
|---|---|
| Name | The only info stored in human-readable form |
| Type | Needed for systems that support different file types |
| Location | Pointer to file location on the device |
| Size | Current file size in bytes, words, or blocks |
| Protection | Controls who can read, write, execute |
| Time, date, user ID | For protection, security, and usage monitoring |
🧠 Mnemonic - File Attributes
Name · Type · Location · Size · Protection · Time/User
Think: "Navigating The Library Saves Plenty of Time"
File Types
- Numeric data
- Character data
- Binary data
- Source files (.c, .java)
- Object files (.o)
- Executable files (.exe)
.txt, .exe). The OS
uses extensions to decide what to do with a file (e.g. run it, open it with an editor).
File Operations Exam Favourite
The OS must provide 6 basic operations on files:
Two additional important operations:
Open(Fᵢ) searches the directory on disk and moves the
FCB (file control block) entry into memory. Close(Fᵢ) moves the in-memory FCB entry
back to disk.
🧠 Mnemonic - 6 Basic Operations
Create · Write · Read · Seek · Delete · Truncate
"Clever Writers Really Seek Dramatic Titles"
Open Files - Key Data Structures Important
To avoid repeated directory searches, the OS keeps an open-file table in memory.
| Data Item | Purpose |
|---|---|
| Open-file table | Contains info about all currently open files |
| File pointer | Tracks last read/write position - per process |
| File-open count | Counts how many processes have file open; when 0, file is removed from table |
| Disk location | Location of file on disk - kept in memory to avoid repeated disk access |
| Access rights | Per-process access mode (read, write, etc.) |
File Structure
Files can have different internal structures that the OS must understand:
Access Methods
How can information stored in files be accessed and read into memory? There are two primary methods and one derived method:
Sequential Access
Information is processed in order - one record after another, like a tape.
Used by: compilers, editors, word processors.
Direct (Random) Access
Fixed-length logical records. Programs can read/write records in any order with no restrictions.
File = numbered sequence of blocks.
Other Access Methods - Index-Based
Built on top of direct access by creating an index for the file.
- Index file stores pointers into the main data file.
- For large files, a primary index points to secondary indices, which point to actual data.
- Example: IBM's ISAM (Indexed Sequential Access Method).
Directory Structure
A directory is a collection of nodes containing information about all files. Both the directory structure and the files reside on disk.
Operations on Directories
Directory Organization Goals
Locate a file quickly.
Two users can have same name for different files. Same file can have several names.
Logical grouping by properties (e.g. all Java programs).
Easy for users to navigate and find files.
Types of Directory Structures Exam Favourite
① Single-Level Directory
All files are in the same directory. Simple to support and understand.
- Naming problem: All files need unique names (conflicts when users share the system).
- Grouping problem: No way to logically group related files.
② Two-Level Directory
Separate directory for each user. A Master File Directory (MFD) is searched at login; each user has their own User File Directory (UFD).
③ Tree-Structured Directories
Most common. A tree has a root directory and every file has a unique path name.
| Path Type | Description |
|---|---|
| Absolute path | Begins at root → full path down → file |
| Relative path | Defines path from the current directory |
mkdir and rm operate relative to current dir. In UNIX, rm -r
removes an entire subtree!
④ Acyclic-Graph Directories
Allows directories to share subdirectories and files. Changes made by one user are immediately visible to the other.
Two implementation approaches:
- Aliasing problem: A file can have multiple absolute path names.
- Dangling pointer: If someone deletes the shared file, others' links break.
- Solution: Delete only the link, OR preserve file until all references are deleted (reference count = 0 → then delete).
⑤ General Graph Directory
If links to subdirectories (not just files) are allowed, cycles can form. This complicates deletion.
- Reference count may never reach 0 even when no external reference exists.
- Requires garbage collection to find and free truly unreachable files.
- Or use a cycle detection algorithm every time a new link is added.
🧠 Mnemonic - Directory Types in Order
Single · Two-level · Tree · Acyclic · Graph
"Students Tend To Ace General exams" - each adds more power but more complexity!
| Type | Naming ✓ | Grouping ✓ | Sharing ✓ |
|---|---|---|---|
| Single-level | ✗ | ✗ | ✗ |
| Two-level | ✓ | ✗ | Limited |
| Tree | ✓ | ✓ | ✗ |
| Acyclic graph | ✓ | ✓ | ✓ |
| General graph | ✓ | ✓ | ✓ |
File Protection
Protection is needed so that users cannot accidentally or deliberately destroy someone else's information. The OS must control which operations each user can perform on each file.
Access Lists and Groups Exam Favourite
An Access-Control List (ACL) specifies username + type of access allowed for each user.
- Length: Must list ALL users with access.
- List construction is tedious.
- Variable-size directory entries are complicated to manage.
UNIX-Style Three-Class Solution
Group all users into three classes. Much simpler - only 9 bits needed per file.
| Class | Who | Permissions (RWX) |
|---|---|---|
| Owner | The file's creator | rwx = 111 |
| Group | Users in the same group | rw- = 110 |
| Universe | All other users | r-- = 100 |
🧠 Mnemonic - UNIX Permission Classes
Owner · Group · Universe (World)
"Only Good Users" can access (in decreasing privilege).
Remember: R=4, W=2, X=1. So rwx = 7,
rw- = 6, r-- = 4. File chmod 755 = Owner: 7 (rwx), Group:
5 (r-x), Others: 5 (r-x).
Tips, Tricks & Exam Strategy
🎯 Most Tested Concepts
- Single-level = Naming problem (no duplicate names)
- Two-level = Isolation problem (can't cooperate)
- Tree = Current working directory concept
- Acyclic = Dangling pointer + reference count
- General graph = Garbage collection needed
- Sequential = tapes, compilers, editors
- Direct = databases, large immediate lookups
- Indexed = built on top of direct access
- Direct access requires fixed-length records
- Sequential access = one record after another
Delete removes the file AND its entry from the
directory. Truncate erases the file contents but keeps the file and its
attributes (name, protection, etc.). The file still exists - just empty!
📊 Quick Comparison Table
| Feature | Sequential | Direct | Indexed |
|---|---|---|---|
| Record size | Variable | Fixed | Variable or fixed |
| Access order | In order only | Any order | Any order (via index) |
| Use case | Compilers, editors | Databases | Large structured files |
| Speed | Slow (must scan) | Fast | Very fast |
🧠 Big Picture Memory Map
Past Exam Questions & Answers
These questions are drawn from past CS330 exams and quizzes. Click any question to reveal the answer and explanation.
Multiple Choice Questions
MCQ Which operation removes all file contents but keeps its attributes?
Explanation: Truncate erases the contents of a file but keeps all
its attributes (name, location, protection, etc.) in the directory entry. The file still exists
- it just becomes empty.
- Delete = removes file AND its directory entry entirely
- Close = moves FCB from memory back to disk
- Seek = repositions the read/write pointer
MCQ Which access method reads data in order, record after record?
Explanation: Sequential access processes information in order - one record after another, like a tape. It is the most common access method used by compilers and editors.
Direct access (random access) allows programs to read/write records in any order.
MCQ Which directory structure allows the same file to appear under multiple path names?
Explanation: In an acyclic-graph directory, files can be shared via links. A single file can appear under multiple path names because different directories can point to the same inode/FCB. This is known as the aliasing problem.
- Tree directories: each file has exactly one path (no sharing)
- Acyclic graphs: sharing is possible via links → multiple paths
MCQ Which directory structure requires all file names to be unique (globally)?
Explanation: In a single-level directory, all files are in the same directory. Since there is no separation between users, every file must have a globally unique name. This is the naming problem of single-level directories.
MCQ Which type of users does the UNIX protection model define?
Explanation: UNIX uses a 3-class model instead of a full ACL for simplicity:
- Owner: The user who created/owns the file
- Group: Users belonging to the same group as the file
- Universe (World/Others): All remaining users
Each class gets 3 permission bits: R (read=4), W (write=2), X (execute=1).
MCQ Which file operation repositions the read/write pointer within the file?
Explanation: The Seek operation searches the directory for the
appropriate entry and repositions the current-file-position pointer to a given value. It does
not transfer any data.
MCQ When the file-open count reaches zero, the OS does what?
Explanation: The file-open count tracks how many processes currently have a file open. When the last process closes the file, the count drops to 0 and the OS removes the file's entry from the open-file table. The file is not deleted - it just no longer needs an in-memory entry.
Conceptual / Short Answer
Short Answer What is the main difference between Single-Level and Two-Level directory structures? What problem does the Two-Level design solve?
Single-level: All files reside in one shared directory. Problem: All file names must be globally unique (naming problem), and there is no way to group related files (grouping problem).
Two-level: Each user has their own User File Directory (UFD), managed by a Master File Directory (MFD) at the top. This solves the naming problem - different users can have files with the same name because they live in separate UFDs. It also provides efficient searching. However, it still lacks grouping capability within a user's directory and isolates users (hard to share files between them).
Short Answer What is a dangling pointer in the context of acyclic-graph directories? How does the OS solve this problem?
A dangling pointer occurs when a shared file is deleted (one user removes it), but other users/directories still have links pointing to it. Those links now point to a non-existent file - they "dangle."
Solutions:
- Delete only the link, not the actual file. The original file remains until all links are removed.
- Reference counting: Maintain a count of how many links point to the file. Only physically delete the file when the reference count reaches 0.
Short Answer What does the open-file table contain? Why is a per-process table needed in addition to the system-wide table?
System-wide open-file table: Contains a copy of the FCB (file control block) of every open file, the open-count, and the disk location of the file. Shared across all processes.
Per-process open-file table: Contains the file pointer (current read/write position), access rights, and a pointer into the system-wide table. This is per-process because different processes reading the same file need their own independent position pointer.
The two-table design avoids redundancy (FCB stored once) while supporting independent file positions per process.
Short Answer Describe Sequential Access and Direct Access. Give one real-world use case for each.
Sequential Access: Information is processed in order - one record after another. You can read forward or rewind to the beginning, but you cannot jump to an arbitrary record. Like a tape. Use case: Compilers reading source code, log files being processed.
Direct (Random) Access: The file consists of fixed-length logical records. A program can read or write any record in any order without restriction - you can jump directly to record N. Use case: Database systems needing immediate access to specific records (e.g. look up employee #5000 directly).
Short Answer A file has UNIX permissions rwxr-x--x. What can the Owner, Group, and Others do?
| Class | Bits | Permissions |
|---|---|---|
| Owner | rwx | Can read, write, AND execute |
| Group | r-x | Can read and execute, but NOT write |
| Others | --x | Can ONLY execute - cannot read or write |
In octal: Owner=7, Group=5, Others=1 → chmod 751
Short Answer What are the six basic file operations the OS must support?
- Create: Find space in the file system and make a directory entry for the new file.
- Write: System call specifying the file name and info to be written.
- Read: System call specifying the file name and where to put the next block.
- Seek: Reposition the current-file-position pointer within the file.
- Delete: Search directory for named file, erase the directory entry, release file space.
- Truncate: Erase contents of file but keep its attributes (file still exists, just empty).
Plus two critical operations: Open(Fᵢ) and Close(Fᵢ) - used before/after accessing files to manage the open-file table.
Short Answer Why are ACLs problematic in practice, and what simpler solution does UNIX use?
ACL Problems:
- The list can be very long (must list every user who can access the file).
- The set of users may not be known in advance.
- Variable-size directory entries complicate space management.
UNIX Solution: Condense all users into three classes - Owner, Group, and Universe - and assign three permission bits (R, W, X) to each class. This gives 9 bits total per file, which is compact and efficient.