Previously, we introduced the concept of normalization vs. denormalization explaining how database designers utilize each approach. Now, let’s dive deeper into normal forms — the progressive stages of normalization that ensure data is structured logically, consistently, and without redundancy.
We’ll walk through 1NF, 2NF, 3NF, and BCNF (sometimes called 3.5NF) building upon the previous example from U.S. National Parks data.
First Normal Form (1NF)
Rule:
- Each row is unique
- Each column holds atomic (indivisible) values
- No repeating groups or arrays
Bad Example (Not in 1NF):
- The StateName column contains multiple values in a single cell.
- Columns are not atomic — violates 1NF rule.
- Makes querying and filtering by state difficult and error-prone.
| ParkID | ParkName | StateName |
|---|---|---|
| 1 | Yellowstone | Wyoming, Montana, Idaho |
| 2 | Great Smoky Mountains | Tennessee, North Carolina |
Corrected (1NF):
| ParkID | ParkName | StateName |
|---|---|---|
| 1 | Yellowstone | Wyoming |
| 1 | Yellowstone | Montana |
| 1 | Yellowstone | Idaho |
| 2 | Great Smoky Mountains | Tennessee |
| 2 | Great Smoky Mountains | North Carolina |
Now, each row has atomic values and no repeating groups.
Second Normal Form (2NF)
Rule:
- Meets all requirements of 1NF
- Every non-key column depends on the whole primary key, not just part of it
Bad Example (Not in 2NF):
Suppose we use (ParkID, StateName) as a composite primary key:
- ParkName depends only on
ParkID, not the full composite key(ParkID, StateName). - Region depends only on
StateName, not the full key. - Partial dependencies violate 2NF and can cause redundant or inconsistent data.
| ParkID | ParkName | StateName | Region |
|---|---|---|---|
| 1 | Yellowstone | Wyoming | Mountain |
| 1 | Yellowstone | Montana | Mountain |
| 1 | Yellowstone | Idaho | Mountain |
| 2 | Great Smoky Mountains | Tennessee | Southeast |
| 2 | Great Smoky Mountains | North Carolina | Southeast |
Corrected (2NF):
Parks Table
| ParkID | ParkName |
|---|---|
| 1 | Yellowstone |
States Table
| StateID | StateName | Region |
|---|---|---|
| WY | Wyoming | Mountain |
| MT | Montana | Mountain |
ParkStates Table
| ParkID | StateID |
|---|---|
| 1 | Wyoming |
| 1 | Montana |
Now, every non-key column depends on the entire key.
Third Normal Form (3NF)
Rule:
- Meets all requirements of 2NF
- No transitive dependencies (a non-key column depending on another non-key column)
Bad Example (Not in 3NF):
- Region depends on State, not directly on
ParkID. - Creates transitive dependency, which can cause update anomalies.
- Violates 3NF rules — changing a region requires updating multiple rows.
| ParkID | ParkName | State | Region |
|---|---|---|---|
| 1 | Yellowstone | Wyoming | Mountain |
Here, Region depends on State, not directly on ParkID.
Corrected (3NF):
Parks Table
| ParkID | ParkName | StateID |
|---|---|---|
| 1 | Yellowstone | 1 |
States Table
| StateID | StateName | Region |
|---|---|---|
| 1 | Wyoming | Mountain |
This removes the transitive dependency — Region is tied to StateID, not ParkID.
Boyce-Codd Normal Form (BCNF, 3.5NF)
Rule:
- A stronger version of 3NF
- For every dependency
X → Y,Xmust be a candidate key
Problem Example (Not in BCNF):
Suppose we track park rangers and assign them to regions:
Region → LeadRangerexists, butRegionis not a candidate key.- Multiple rows are repeated for the same region, creating redundancy.
- Changing a lead ranger’s name requires updating multiple rows — risk of inconsistency.
| RangerID | RangerName | Region | LeadRanger |
|---|---|---|---|
| 1 | Alice | Mountain | Bob |
| 2 | Charlie | Mountain | Bob |
Here, Region → LeadRanger, but Region is not a candidate key (multiple rangers belong to the same region). This creates anomalies — if Bob’s details change, they must be updated in multiple rows.
Corrected (BCNF):
We separate rangers from regional leadership assignments, and use IDs instead of names:
Rangers Table
| RangerID | RangerName | Region |
|---|---|---|
| 1 | Alice | Mountain |
| 2 | Bob | Mountain |
| 3 | Charlie | Mountain |
Regions Table
| Region | LeadRangerID |
|---|---|
| Mountain | 2 |
Now:
- Ranger details are stored in one place.
Region → LeadRangerIDis valid, andLeadRangerIDpoints to a candidate key (RangerID).- No anomalies remain.
Key Takeaway
- 1NF: Eliminate repeating groups and ensure atomic values.
- 2NF: Eliminate partial dependencies on part of a composite key.
- 3NF: Eliminate transitive dependencies.
- BCNF: Eliminate all anomalies by ensuring every determinant is a candidate key.
Most real-world databases aim for 3NF, but certain edge cases benefit from BCNF for maximum data integrity.

