Database Design & Normalization

Design clean, efficient schemas by eliminating redundancy through normalization — the foundation of every real-world database.

Why Schema Design Matters

What Is Normalization?

Normalization is the process of organizing a database to reduce data redundancy and improve data integrity. A well-normalized database stores each fact exactly once, making updates, inserts, and deletes safe and consistent.

Edgar F. Codd introduced normalization theory in 1970. The process moves a table through a series of normal forms, each eliminating a specific type of anomaly.

The Three Anomalies
Without normalization you risk: Update anomalies (changing one fact requires updating many rows), Insert anomalies (you can't add data without unrelated data), and Delete anomalies (deleting a row removes unrelated facts).

First Normal Form (1NF)

A table is in 1NF when every column holds atomic (indivisible) values — no comma-separated lists, no repeating groups. Each row must be uniquely identifiable by a primary key.

Example violation: storing courses = "Math, Science, English" in one column. Fix: create a separate row for each course enrollment.

Second Normal Form (2NF)

A table is in 2NF when it is in 1NF and every non-key column is fully functionally dependent on the entire primary key — not just part of it. Partial dependencies only appear in tables with composite primary keys.

Example: in an enrollment(student_id, course_id, student_name, course_name) table, student_name depends only on student_id, not the full composite key. Fix: split into students and courses tables.

Third Normal Form (3NF)

A table is in 3NF when it is in 2NF and no non-key column is transitively dependent on the primary key through another non-key column.

Example: orders(order_id, customer_id, customer_city, customer_zip) — customer_city depends on customer_zip, not directly on order_id. Fix: move city/zip to a customers table.

Practical Rule: 3NF is the standard target for most production databases. Going beyond (BCNF, 4NF) is sometimes done for analytical systems, but 3NF provides the right balance of integrity and query performance for OLTP workloads.

Entity-Relationship (ER) Diagrams

ER diagrams are visual blueprints of a database schema. Key components:

Entities — rectangles representing tables (e.g., Student, Course)
Attributes — ovals or columns inside entities (e.g., student_id, name)
Relationships — diamonds or lines showing how entities relate
Cardinality — 1:1, 1:N, or M:N notation on relationship lines

Primary & Foreign Keys in Schema Design

Every table should have a primary key — a column (or composite) that uniquely identifies each row. Surrogate keys (id INT AUTO_INCREMENT) are often preferred over natural keys because they never change.

Foreign keys enforce referential integrity: a course_enrollments.student_id column must reference a valid row in students.id. The RDBMS rejects invalid inserts and can cascade updates/deletes.

Check Your Understanding

1. A table has the column phone_numbers = "555-1234, 555-5678". Which normal form does this violate?

1NF — values are not atomic

2NF — partial dependency exists

3NF — transitive dependency exists

It does not violate any normal form

2. In a table with composite key (order_id, product_id), the column product_name depends only on product_id. Which normal form is violated?

1NF

2NF — partial functional dependency

3NF — transitive dependency

No violation

3. A table has employee_id → department_id → department_budget. The column department_budget depends on department_id, not directly on employee_id. Which normal form is violated?

1NF

2NF

3NF — transitive dependency

No violation; this is acceptable design

4. Which statement about foreign keys is TRUE?

A foreign key must always be unique in the referencing table

Foreign keys are optional and have no effect on data integrity

A foreign key references a primary key in another table, enforcing referential integrity

Foreign keys automatically create indexes on the referenced column

Database Design & Normalization

Normalization Explorer

What Is Normalization?

First Normal Form (1NF)

Second Normal Form (2NF)

Third Normal Form (3NF)

Entity-Relationship (ER) Diagrams

Primary & Foreign Keys in Schema Design

Check Your Understanding