A relational database organizes data into tables (called relations). Each table has named columns (attributes) and rows (tuples). The relational model was proposed by Edgar Codd in 1970 and remains the foundation of most production databases.
| Term | Math term | Meaning |
|---|---|---|
| Table | Relation | A set of rows sharing the same structure |
| Column | Attribute | A named field with a data type and domain |
| Row | Tuple | A single record in the table |
| Schema | Relation schema | The table's name + list of attributes |
Keys uniquely identify rows and establish relationships between tables.
A column (or minimal set of columns) whose values are unique and never NULL. Every table should have exactly one primary key.
Any minimal set of attributes that could serve as a primary key. The chosen one is the PK; the rest are alternate keys.
A column whose values match the primary key of another table, creating a referential integrity constraint.
student_id (PK), name, email. course_id (PK), title, credits.student_id (FK → Students), course_id (FK → Courses), grade. A superkey is any set of attributes that uniquely identifies rows — including redundant ones. A candidate key is a minimal superkey (no attribute can be removed and still guarantee uniqueness).
Example: In Students, both {student_id} and {student_id, name} are superkeys. Only {student_id} is a candidate key — adding name is redundant.
Table Orders: (order_id, customer_id, product_id, order_date, quantity). Assume order_id is globally unique and each row is one line item. What is a reasonable PK? What foreign keys exist?
In a Flights table, attributes include: flight_number, departure_date, origin, destination. The combination (flight_number, departure_date) is unique. Is {flight_number, departure_date, origin} a candidate key?
Which property must a primary key always satisfy?
Entity integrity rule: PK columns must be NOT NULL. Even for 'new' rows, the PK must have a value before insertion is committed.
Auto-incrementing integers are common PKs, but UUIDs, natural keys (email), and compound keys are also used.
Example: Enrollments(student_id, course_id) uses a composite PK where neither column alone is unique.
A foreign key ensures:
Foreign keys enforce referential integrity: the value must exist in the referenced table (or be NULL). Uniqueness is a separate constraint.
A NULL FK means 'this record has no parent' — often valid. The constraint only fires for non-NULL values.
FKs enforce referential integrity only. Sorting is a query-time operation.
A candidate key differs from a superkey in that a candidate key is:
Adding attributes creates a superkey. Removing attributes until uniqueness would be lost gives a candidate key.
A candidate key can span multiple columns as long as the combination is minimal and unique.
A candidate key uniquely identifies rows in its own table. A foreign key references a key in another table.
In the Enrollments(student_id, course_id, grade) table, what is the most appropriate primary key?
A PK must be unique per row. Many students share grades. A functional identifier like (student_id, course_id) is needed.
student_id is not unique in this table (a student enrolls in multiple courses). The composite (student_id, course_id) is unique.
An auto-id would work but wouldn't prevent duplicate enrollments by itself. The composite PK doubles as a uniqueness constraint on the business rule.
Which constraint is violated if you insert a row with a foreign key value that doesn't exist in the parent table?
Domain constraint: e.g., grade must be A–F. Referential integrity is about FK values matching parent PKs.
Entity integrity: PK ≠ NULL. Referential integrity: FK must match an existing PK (or be NULL).
If the FK value is non-null but doesn't exist in the parent table, referential integrity is violated — not a null constraint.