CS 221: Database Systems › Lesson 1 of 10

Relational Model & Keys

Lesson 1 · OKSTEM College · AS Computer Science

The Relational Model

A relational database organizes data into tables (called relations). Each table has named columns (attributes) and rows (tuples). The relational model was proposed by Edgar Codd in 1970 and remains the foundation of most production databases.

TermMath termMeaning
TableRelationA set of rows sharing the same structure
ColumnAttributeA named field with a data type and domain
RowTupleA single record in the table
SchemaRelation schemaThe table's name + list of attributes

Keys

Keys uniquely identify rows and establish relationships between tables.

Primary Key (PK)

A column (or minimal set of columns) whose values are unique and never NULL. Every table should have exactly one primary key.

Candidate Key

Any minimal set of attributes that could serve as a primary key. The chosen one is the PK; the rest are alternate keys.

Foreign Key (FK)

A column whose values match the primary key of another table, creating a referential integrity constraint.

Worked Example — Student / Enrollment Schema

Students table: student_id (PK), name, email.
student_id is unique and non-null → qualifies as PK.
Courses table: course_id (PK), title, credits.
Enrollments table: student_id (FK → Students), course_id (FK → Courses), grade.
The combination (student_id, course_id) is the composite PK — a student can't enroll in the same course twice.
Foreign key constraint: every value in Enrollments.student_id must exist in Students.student_id → prevents orphan records.

Superkey vs. Candidate Key

A superkey is any set of attributes that uniquely identifies rows — including redundant ones. A candidate key is a minimal superkey (no attribute can be removed and still guarantee uniqueness).

Example: In Students, both {student_id} and {student_id, name} are superkeys. Only {student_id} is a candidate key — adding name is redundant.

Relational Integrity Constraints

Practice Problems

Problem 1 — Identify Keys

Table Orders: (order_id, customer_id, product_id, order_date, quantity). Assume order_id is globally unique and each row is one line item. What is a reasonable PK? What foreign keys exist?

PK: order_id (unique, non-null). FKs: customer_id → Customers.customer_id; product_id → Products.product_id. These enforce that you can't place an order for a non-existent customer or product.

Problem 2 — Superkey vs. Candidate Key

In a Flights table, attributes include: flight_number, departure_date, origin, destination. The combination (flight_number, departure_date) is unique. Is {flight_number, departure_date, origin} a candidate key?

No — it is a superkey but not a candidate key. You can remove origin and still have uniqueness: {flight_number, departure_date} already uniquely identifies each row. A candidate key must be minimal.

🗄️ Schema Diagram Builder

Knowledge Check

Which property must a primary key always satisfy?

A primary key can NEVER be NULL — this is the entity integrity rule.
Correct — a PK uniquely identifies every row and cannot be NULL.
PKs can be strings, UUIDs, or any type — not limited to integers.
Composite PKs (multiple columns together) are valid and common.
📖 Quick Recap

Entity integrity rule: PK columns must be NOT NULL. Even for 'new' rows, the PK must have a value before insertion is committed.

📖 Quick Recap

Auto-incrementing integers are common PKs, but UUIDs, natural keys (email), and compound keys are also used.

📖 Quick Recap

Example: Enrollments(student_id, course_id) uses a composite PK where neither column alone is unique.

A foreign key ensures:

Uniqueness is a PK/UNIQUE constraint, not what FKs enforce.
Correct — an FK value must match a PK in the parent table.
FKs allow NULL by default (meaning 'no relationship'); it's up to the schema designer.
Sorting is done by ORDER BY in queries, not by FKs.
📖 Quick Recap

Foreign keys enforce referential integrity: the value must exist in the referenced table (or be NULL). Uniqueness is a separate constraint.

📖 Quick Recap

A NULL FK means 'this record has no parent' — often valid. The constraint only fires for non-NULL values.

📖 Quick Recap

FKs enforce referential integrity only. Sorting is a query-time operation.

A candidate key differs from a superkey in that a candidate key is:

Candidate keys are minimal, not larger.
Correct — candidate keys are minimal superkeys.
Composite candidate keys exist, e.g., (flight_number, departure_date).
Candidate keys are uniqueness constraints; foreign keys are referential constraints.
📖 Quick Recap

Adding attributes creates a superkey. Removing attributes until uniqueness would be lost gives a candidate key.

📖 Quick Recap

A candidate key can span multiple columns as long as the combination is minimal and unique.

📖 Quick Recap

A candidate key uniquely identifies rows in its own table. A foreign key references a key in another table.

In the Enrollments(student_id, course_id, grade) table, what is the most appropriate primary key?

Grade is not unique — multiple students can get the same grade.
One student enrolls in many courses — student_id alone would repeat.
Correct — together they uniquely identify each enrollment.
While valid technically, the natural composite key (student_id, course_id) also enforces the business rule that a student can't duplicate-enroll.
📖 Quick Recap

A PK must be unique per row. Many students share grades. A functional identifier like (student_id, course_id) is needed.

📖 Quick Recap

student_id is not unique in this table (a student enrolls in multiple courses). The composite (student_id, course_id) is unique.

📖 Quick Recap

An auto-id would work but wouldn't prevent duplicate enrollments by itself. The composite PK doubles as a uniqueness constraint on the business rule.

Which constraint is violated if you insert a row with a foreign key value that doesn't exist in the parent table?

Domain constraint is about valid values within a column's type/check, not cross-table references.
Entity integrity is about PKs being non-null — not about FK references.
Correct — the FK value must reference an existing PK in the parent table.
A null constraint prevents NULLs; this violation involves a non-null value that doesn't match any parent PK.
📖 Quick Recap

Domain constraint: e.g., grade must be A–F. Referential integrity is about FK values matching parent PKs.

📖 Quick Recap

Entity integrity: PK ≠ NULL. Referential integrity: FK must match an existing PK (or be NULL).

📖 Quick Recap

If the FK value is non-null but doesn't exist in the parent table, referential integrity is violated — not a null constraint.

Next →