Chapter 02

Conceptual Design

Entity Relationship Model

Course: CS 340: Introduction to Database Systems

Learning Objectives

Database Design Review

We learned that databases are useful because they store a lot of data for applications, give desirable guarantees for transactions, and are efficient and reliable. Before any of this is possible, we need to carefully design a database.

💾 Data Storage

Databases provide centralized, organized storage for application data

🔒 ACID Guarantees

Transactions ensure atomicity, consistency, isolation, and durability

⚡ Efficiency

Optimized query processing and data retrieval mechanisms

📐 Careful Design

Proper conceptual modeling is the foundation of a successful database

Data Abstraction and Architecture

Preliminaries

Database systems provide an abstract view of the data. The three levels of abstraction are as follows:

External Level (Views)
Shows different parts of the database to different users. Example: Student view is different from registrar or database admin.
Logical Level
What data are stored in the database and what relationships exist among those data.
Physical Level
How data are actually stored on disk (storage structures, file organization, indexing).
Data Abstraction The suppression of details of data organization and storage while highlighting essential features for improved understanding of data. This is one of the key advantages of using relational databases.

Schemas and Instances

Schema The logical structure of the database. It describes the database design and changes very infrequently.

📋 Logical Schema

Describes the database design at the logical level (e.g., Student takes Course)

💾 Physical Schema

Describes the database design at the physical storage level

👁️ External Schema (Subschema)

Describes different views of the database for various external users

Instance The data in the database at a particular moment in time. The database state changes every time the database is updated.
Schema vs. Instance Example:

Schema: Student(RollNo, Name, Address, PhoneNo, DOB, email, CourseNo, DNo)

Instance: The actual rows of data in the Student table at any given time

Name Student_number Class Major
Smith 17 1 CS
Brown 8 2 CS

Data Independence

Data Independence The protection of user applications from changes made in the definition and organization of data. Application programs should not be exposed to details of data representation and storage.

Types of Data Independence

📊 Logical Data Independence

The capacity to change the conceptual schema without having to change external schemas and their associated application programs.

Example: If user B adds a new column (e.g., salary) to their table, it will not affect the external view for user A, though the internal schema has been changed for both users.

💾 Physical Data Independence

The capacity to change the internal schema without having to change the conceptual schema.

Example: Changing file organization, storage structures, storage devices, or indexing strategy without affecting the conceptual or external schemas.

Important Note:

Logical data independence is more difficult to achieve than physical data independence, since application programs are heavily dependent on the logical structure of the data they access.

Conceptual Design Process

We have our enterprise we intend to model. What are the entities and relationships in this enterprise?

Entities (Usually) nouns that represent things in the enterprise (e.g., Student, Course, etc.)
Relationships (Usually) verbs that represent statements about 2 or more objects (e.g., Student takes Course)
Design Questions:
  • What information should we store about those entities and relationships?
  • What rules do we need to make sure our data is sound? (Integrity constraints)

The encoding in the relational database realm is referred to as an Entity-Relationship (ER) diagram.

ER Components: Entities and Attributes

Entity A real-world object distinguishable from other objects. Described by a set of attributes.
Entity Set A collection of similar entities (e.g., Employee is an entity set for all employees). All entities in an entity set have the same set of attributes.

🔑 Key Attribute

Each entity set has a key attribute that uniquely identifies each entity

📦 Domain

Each attribute has a domain (e.g., float, int, date, string)

🎯 Distinguishable

Each entity can be uniquely identified by its attributes

Types of Attributes

📌
Simple Attribute

Each entity has a single atomic value for the attribute.

Examples: SSN, Sex, Age

🧩
Composite Attribute

The attribute may be divided into smaller parts.

Examples:
Address(Number, Street, City, Country)
Name(FirstName, MiddleName, LastName)

📚
Multi-valued Attribute

A set of values for the same entity.

Examples:
{Color} 1...3 for a car
{CollegeDegree} for a person

💾
Stored Attribute

Value stored directly in the database.

Example: BirthDate

🔄
Derived Attribute

Value can be obtained from other attributes.

Example: Age (derived from BirthDate and current date)

🌀
Complex Attribute

Composite and multi-valued attributes nested arbitrarily.

Example: {PreviousDegrees(College, Year, Degree, Field)}

Design Note:

Complex attributes are very rare and not a good design technique. They should be avoided in most cases.

Keys in ER Diagrams

Key Attribute An attribute of an entity type for which each entity must have a unique value.
Key Attribute Examples:
  • EMPLOYEE: SSN (Social Security Number)
  • STUDENT: Student ID
  • CAR: VehicleIdentificationNumber (VIN)

Key Characteristics

🔗 Composite Keys

A key attribute may be composite

VehicleTagNumber(Number, State) - license plate

🎯 Multiple Keys

An entity type may have more than one key

CAR: VIN or VehicleTagNumber

📝 Notation

Each key is underlined in ER diagrams

All candidate keys are underlined

ER Diagram Notation

In ER diagrams, we use specific symbols to represent different components:

ENTITY
Entity

Represented by a rectangle. Contains the entity type name.

WEAK
WEAK
Weak Entity

Represented by a double rectangle. Does not have a key attribute.

RELATES
Relationship

Represented by a diamond. Contains the relationship name.

Attr
Attribute

Represented by an oval connected to its entity or relationship.

Key
Key Attribute

Represented by an oval with the attribute name underlined.

Multi
Multi
Multi-valued Attribute

Represented by a double oval.

Derived
Derived Attribute

Represented by a dashed oval.

Sub1
Main
Composite Attribute

Shown as an oval connected to other ovals representing component attributes.

Relationships in ER Diagrams

Relationship A relationship relates two or more distinct entities with a specific meaning.
Relationship Examples:
  • EMPLOYEE John Smith works on the ProductX PROJECT
  • EMPLOYEE Franklin Wong manages the Research DEPARTMENT
Relationship Type Relationships of the same type are grouped into a relationship type. All relationships in a type have the same participating entity types.

Relationship Degree (Arity)

Unary (1)
Recursive Relationship

A relationship type between the same entity type in distinct roles.

Example: EMPLOYEE supervises EMPLOYEE

Binary (2)
Most Common

A relationship between two different entity types.

Examples: EMPLOYEE works for DEPARTMENT, STUDENT takes COURSE

Ternary (3)
Three-way Relationship

A relationship among three entity types.

Example: SUPPLIER supplies PART to PROJECT

N-ary (N)
General Case

A relationship among N entity types.

N > 3 relationships are rare in practice

Relationship Cardinality Constraints

Cardinality Ratio Specifies the number of relationship instances that an entity can participate in.

Types of Cardinality

1:1
One-to-One

An entity in A is associated with at most one entity in B, and vice versa.

Example: An employee manages at most one department. A department is managed by at most one employee.

1:N
One-to-Many

An entity in A is associated with any number of entities in B. An entity in B is associated with at most one entity in A.

Example: A department has many employees. An employee works for one department.

N:1
Many-to-One

An entity in A is associated with at most one entity in B. An entity in B can be associated with any number of entities in A.

Example: Many employees work for one department.

M:N
Many-to-Many

An entity in A is associated with any number of entities in B, and vice versa.

Example: An employee works on many projects. A project has many employees working on it.

Participation Constraints

Participation Constraint Specifies whether the existence of an entity depends on its being related to another entity via the relationship type.
Partial Participation

Not all entities in the entity set participate in the relationship.

Represented by a single line

Example: Not all employees manage departments

Total Participation

Every entity in the entity set must participate in at least one relationship instance.

Represented by a double line

Example: Every employee must work for a department

(Min, Max) Notation

(min, max) Constraint Specifies that each entity participates in at least min and at most max relationship instances.

📏 Default Values

Default: min=0, max=n (no limit)

Must have min ≤ max, min ≥ 0, max ≥ 1

🎯 Reading (min, max)

Read the numbers next to an entity type while looking AWAY from that entity

EMPLOYEE (1,1) - WORKS_FOR - (1,N) DEPARTMENT

Each employee works for exactly one department; each department has 1 to N employees

📊 Common Patterns

  • (0,1): Optional, at most one
  • (1,1): Mandatory, exactly one
  • (0,N): Optional, any number
  • (1,N): Mandatory, at least one

Weak Entity Types

Weak Entity Type An entity that does not have a key attribute and is identification-dependent on another entity type (the owner or identifying entity type).

🔗 Identifying Relationship

A weak entity must participate in an identifying relationship type with an owner entity

🔑 Partial Key

Weak entities have a partial key that is unique only within entities related to the same owner

👥 Identification

Weak entities are identified by combining their partial key with the key of the owner entity

Weak Entity Example:

Scenario: DEPENDENT entities and EMPLOYEE entities

  • Owner Entity: EMPLOYEE (has SSN as key)
  • Weak Entity: DEPENDENT (no key attribute)
  • Partial Key: Dependent_name
  • Identifying Relationship: DEPENDENT_OF
  • Identification: A DEPENDENT is identified by (Employee_SSN, Dependent_name)

Example: "John Smith's dependent Mary" is different from "Alice Brown's dependent Mary"

Weak Entity Notation

A weak entity is shown with a double rectangle

The identifying relationship is shown with a double diamond

The partial key is underlined with a dashed line

Attributes of Relationships

Relationship Attributes A relationship type can have attributes that describe properties of the relationship instance.
Relationship Attribute Example:

Relationship: WORKS_ON (between EMPLOYEE and PROJECT)

Attribute: HoursPerWeek

The value of HoursPerWeek depends on a particular (employee, project) combination. It describes how many hours per week that specific employee works on that specific project.

Design Note:

Most relationship attributes are used with M:N relationships. In 1:N relationships, they can often be transferred to the entity type on the N-side of the relationship.

Example: COMPANY Database

Let's design a database schema for a company based on the following requirements:

Requirements:
  • The company is organized into DEPARTMENTs. Each department has a name, number, and an employee who manages it. We keep track of the start date of the department manager. A department may have several locations.
  • Each department controls a number of PROJECTs. Each project has a unique name, unique number, and is located at a single location.
  • The database stores each EMPLOYEE's social security number, address, salary, sex, and birthdate.
  • Each employee works for one department but may work on several projects. The database keeps track of the number of hours per week that an employee works on each project.
  • We keep track of the direct supervisor of each employee.
  • Each employee may have a number of DEPENDENTs. For each dependent, we keep a record of name, sex, birthdate, and relationship to the employee.

Identified Entity Types

🏢 DEPARTMENT

Attributes: Name, Number, Locations, Manager, Manager_start_date

📋 PROJECT

Attributes: Name, Number, Location, Controlling_department

👤 EMPLOYEE

Attributes: SSN, Name(Fname, Minit, Lname), Address, Salary, Sex, Birth_date

👨‍👩‍👧 DEPENDENT

Attributes: Name, Sex, Birth_date, Relationship, Employee

Identified Relationships

WORKS_FOR

Between EMPLOYEE and DEPARTMENT (N:1)

MANAGES

Between EMPLOYEE and DEPARTMENT (1:1)

Attribute: Start_date

CONTROLS

Between DEPARTMENT and PROJECT (1:N)

WORKS_ON

Between EMPLOYEE and PROJECT (M:N)

Attribute: Hours

SUPERVISION

Between EMPLOYEE and EMPLOYEE (1:N)

Recursive relationship

DEPENDENTS_OF

Between EMPLOYEE and DEPENDENT (1:N)

Identifying relationship

Practice Activity: SBS Bank

🎯 Design Challenge

Scenario: We have a bank called SBS with the following business requirements:

  • The bank has multiple branches. Each branch is located in a specific city and has its own unique name. Every year the branch is assigned a yearly budget and a rating (out of 10).
  • A customer is identified by their name and phone number. The bank also keeps track of their address.
  • The bank offers two services: accounts and loans to customers. Each account and loan has a unique number and is created and maintained by a single branch.
  • Each account is assigned to one or more customers and the balance cannot be negative.
  • A loan is always assigned to a single customer, has an interest rate, and a balance that cannot be negative.

Sample Solution

Identified Entities and Relationships:

🏦 BRANCH

Attributes: Name (key), City, Budget, Rating

👥 CUSTOMER

Attributes: Name, Phone (key), Address

💰 ACCOUNT

Attributes: ID (key), Balance

💵 LOAN

Attributes: ID (key), Balance, Interest

Opened_by

BRANCH (1) - ACCOUNT (N)

Given_by

BRANCH (1) - LOAN (N)

Owns

CUSTOMER (M) - ACCOUNT (N)

Takes

CUSTOMER (1) - LOAN (N)

ER Diagram Design Conventions

📝 Naming

Choose names that convey meanings: nouns for entities and attributes, verbs for relationships

📖 Readability

Design diagrams to be readable from left to right and top to bottom

🔤 Capitalization

Entity and relationship type names in UPPERCASE, attributes Capitalized, role names in lowercase

🎨 Consistency

Use consistent notation and styling throughout the diagram

Chapter Summary

In this chapter, we explored the Entity-Relationship (ER) model for conceptual database design:

Key Takeaway:

The ER model provides a high-level conceptual view of the database structure. It is the first step in database design, helping us understand and model the real-world domain before implementing it in a specific DBMS.

Next Steps:

In the following chapters, we will learn how to convert ER diagrams into relational schemas, understand normalization, and write SQL queries to interact with our designed databases.