Chapter 01

Introduction to Database Systems

CS 340: Foundations of Modern Data Management

Learning Objectives

Fundamental Definitions

Database An organized collection of related data, stored on disk, sometimes accessed by multiple people.
Examples of Databases:
  • University Database: Contains data on courses, students, faculty, and other resources
  • Movies Database: Contains data on movies, genres, actors, and directors
  • E-commerce Database: Contains products, customers, orders, and inventory information
Database Management System (DBMS) A software system that manages databases. It allows accessing, modifying, backing up, and recovering the database.
Popular DBMS Examples:
  • Oracle: Enterprise-level relational database system
  • PostgreSQL: Open-source relational database
  • MySQL: Widely used open-source database
  • MongoDB: NoSQL document-oriented database

Why Use a Database Management System?

Databases store all the data that a system would need in a reliable, secure, shareable, and efficient manner. However, they need to be designed carefully.

The File-Based System Problem

Prior to DBMS, the file system provided by the operating system was used to store information. In a file-based system, we have a collection of application programs that perform services for end users, where each program defines and manages its own data.

🔒 Security Issues

How do you prevent unauthorized access to files? File systems lack sophisticated access control mechanisms.

👥 Concurrent Access

What happens when multiple people access the same file simultaneously? Data corruption risks increase.

💾 Recovery Problems

What if the hard drive fails or power goes off? File systems don't guarantee data recovery.

⚡ Performance

Can you guarantee access at certain speeds? File systems lack optimized lookup mechanisms.

🔁 Data Redundancy

Student information repeated across multiple files leads to inconsistencies and wasted storage.

❌ File-Based Systems

  • Data redundancy and inconsistency
  • Difficulty accessing data
  • Data isolation
  • Integrity problems
  • Atomicity problems
  • Concurrent access anomalies
  • Security problems

✅ Database Management Systems

  • Centralized data management
  • Easy data access via queries
  • Data integration and relationships
  • Built-in integrity constraints
  • ACID transaction support
  • Concurrency control mechanisms
  • Authentication and authorization

Simplified DBMS Architecture

How Users Interact with a DBMS

Users and programmers interact with the database through application programs and queries. The DBMS software processes these requests and accesses the stored database on disk, managing both the actual data and the database definition (meta-data).

DBMS Components:

  • Application Programs/Queries: User interface layer
  • Software to Process Queries: Query processor and optimizer
  • Software to Access Stored Data: Storage manager
  • Stored Database: Actual data files
  • Stored Database Definition (Meta-Data): System catalog

Key Functions of a DBMS

📋 Define Database

Structure, data types, and constraints of a database

🔍 Query Data

Retrieve information based on stored data

✏️ Modify Data

Insert, update, or delete data

🔐 Security

Control access through authentication and authorization

👥 Concurrent Access

Handle multiple users simultaneously

🔧 Maintenance

Backup, recovery, and database optimization

Queries and Transactions

Applications issue two main types of requests to the database:

Queries Requests that generate a result based on the stored data. Queries retrieve information without modifying it.
Example Query:
SELECT s.name
FROM Student s
WHERE s.major = "CS"

Plain English: Find all students who are majoring in CS

Transactions Collections of operations that must be executed as a single unit to maintain database consistency.

Objectives of Database Systems

📊 Data Availability

Data are made available to a wide variety of users in a meaningful format at reasonable cost, ensuring easy access.

✅ Data Integrity

Data available in the database is reliable and correct, maintaining accuracy and consistency.

🔒 Data Security

Only authorized users can access the data. Security can be enforced through passwords and access control.

🔓 Data Independence

DBMS allows users to store, update, and retrieve data efficiently while hiding implementation details.

Main Characteristics of the Database Approach

1. Self-Describing Nature

A database system stores not only the database but also a complete definition or description of the database, known as the System Catalog. The description stored in the System Catalog is called meta-data.
Database Catalog Example:
Relation Name Number of Columns
STUDENT 4
COURSE 4
SECTION 5
GRADE_REPORT 3
Column Name Data Type Belongs to Relation
Name Character (30) STUDENT
Student_number Character (4) STUDENT
Class Integer (1) STUDENT
Major Major_type STUDENT

This allows the DBMS software to work with different database applications.

2. Program-Data Independence

The structure of data files is stored in the DBMS catalog separately from the access programs. DBMS allows changing data structures and storage organization without having to change the DBMS access programs.

3. Data Abstraction

A DBMS provides users with a conceptual representation of data that does not include many details of how data is stored or how operations are implemented. Programs refer to data model constructs rather than data storage details.

4. Support for Multiple Views

Each user may see a different view of the database, which describes only the data of interest to that user.
Example Views:

TRANSCRIPT View: Shows student names with their course numbers, grades, semesters, years, and section IDs.

COURSE_PREREQUISITES View: Shows course names with their course numbers and prerequisites.

Different users (students, faculty, administrators) can see different views of the same underlying data.

5. Multi-User Transaction Processing

Allows a set of concurrent users to retrieve from and update the database. Concurrency control within the DBMS guarantees that each transaction is correctly executed or aborted.

ACID Properties

Any DBMS must support the ACID concept to ensure that transactions are processed reliably:

A
Atomicity

The entire transaction takes place at once or doesn't happen at all. Every transaction is a single unit.

C
Consistency

The database must be consistent before and after a transaction. Data integrity constraints must be followed.

I
Isolation

Multiple transactions occur independently without interference. Concurrent transactions don't affect each other.

D
Durability

The changes of a successful transaction occur even if the system failure occurs (power off, crash, etc.).

How to Design and Implement a Database

There are multiple steps that align with the contents of this course:

Step 1
Conceptual Modeling

Create Entity-Relationship (ER) diagrams to represent real-world objects (entities) and their relationships. Entities are described by attributes.

Example:

Entity: Student (described by ID, Name, Address)

Relationship: Student "Takes" Course

Step 2
Logical Modeling

Use the relational model - the most used data model in the world today. Convert ER diagrams into tables.

Example: Student Table
Student ID Name Major
123 Ahmad CS
124 Ali SE

Note: The logical view is separate from the physical view (how data is stored). Applications refer to logical model tables rather than where data is stored.

Step 3
Optimization

Carefully design databases to capture all needed information, minimize space usage, and ensure easy maintenance. Focus on normalization and different normal forms.

Step 4
Querying the Data

Use Structured Query Language (SQL) to query data. SQL is a declarative language where statements describe the desired result rather than the computational steps.

SQL Query Example:
SELECT s.name
FROM Student s
WHERE s.major = "CS"

Plain English: Find all students who are majoring in CS

Step 5
Other Database Applications
  • Data Warehousing & OLAP: Organizations collect and organize current and historical data to identify patterns
  • OLTP: Online Transaction Processing ensures ACID properties
  • Data Mining: Exploration and analysis of large data to find knowledge in the form of patterns
Step 6
Create Your Own Application

Build complete applications using Java with JDBC (Java Database Connectivity) or other technologies like Python or NodeJS. Create full applications with GUI interfaces.

Database Users

👨‍💼 Database Administrator (DBA)

Handles authorizing access for users, monitors database use, and controls and manages the efficiency of the database.

🎨 Database Designers

Define the content, structure, and constraints of the database. Responsible for the conceptual and logical design.

👥 End-Users

Can be application developers or application users. They issue queries, ask for reports, and can update the database content.

When NOT to Use a Database

Short answer: Not always. Designing a DBMS is a long process that incurs a lot of overhead.

Scenarios where a database might not be necessary:

📄 Static Data

If your data is static and never changing, the overhead of a DBMS may not be justified.

⚙️ Hardware Limitations

Embedded systems with limited resources may not support full DBMS functionality.

👤 Single User

If there's no multiple user access requirement, simpler file storage might suffice.

Chapter Summary

In this chapter, we explored the fundamental concepts of databases and database management systems:

Next Steps:

In the following chapters, we will continue to look at all of these steps in detail, starting with conceptual modeling using Entity-Relationship diagrams.