Normal Forms: 1NF, 2NF, 3NF, and BCNF

Previously, we introduced the concept of normalization vs. denormalization explaining how database designers utilize each approach. Now, let’s dive deeper into normal forms — the progressive stages of normalization that ensure data is structured logically, consistently, and without redundancy.

We’ll walk through 1NF, 2NF, 3NF, and BCNF (sometimes called 3.5NF) building upon the previous example from U.S. National Parks data.

First Normal Form (1NF)

Rule:

  • Each row is unique
  • Each column holds atomic (indivisible) values
  • No repeating groups or arrays

Bad Example (Not in 1NF):

  • The StateName column contains multiple values in a single cell.
  • Columns are not atomic — violates 1NF rule.
  • Makes querying and filtering by state difficult and error-prone.
ParkIDParkNameStateName
1YellowstoneWyoming, Montana, Idaho
2Great Smoky MountainsTennessee, North Carolina

Corrected (1NF):

ParkIDParkNameStateName
1YellowstoneWyoming
1YellowstoneMontana
1YellowstoneIdaho
2Great Smoky MountainsTennessee
2Great Smoky MountainsNorth Carolina

Now, each row has atomic values and no repeating groups.

Second Normal Form (2NF)

Rule:

  • Meets all requirements of 1NF
  • Every non-key column depends on the whole primary key, not just part of it

Bad Example (Not in 2NF):
Suppose we use (ParkID, StateName) as a composite primary key:

  • ParkName depends only on ParkID, not the full composite key (ParkID, StateName).
  • Region depends only on StateName, not the full key.
  • Partial dependencies violate 2NF and can cause redundant or inconsistent data.
ParkIDParkNameStateNameRegion
1YellowstoneWyomingMountain
1YellowstoneMontanaMountain
1YellowstoneIdahoMountain
2Great Smoky MountainsTennesseeSoutheast
2Great Smoky MountainsNorth CarolinaSoutheast

Corrected (2NF):

Parks Table

ParkIDParkName
1Yellowstone

States Table

StateIDStateNameRegion
WYWyomingMountain
MTMontanaMountain

ParkStates Table

ParkIDStateID
1Wyoming
1Montana

Now, every non-key column depends on the entire key.

Third Normal Form (3NF)

Rule:

  • Meets all requirements of 2NF
  • No transitive dependencies (a non-key column depending on another non-key column)

Bad Example (Not in 3NF):

  • Region depends on State, not directly on ParkID.
  • Creates transitive dependency, which can cause update anomalies.
  • Violates 3NF rules — changing a region requires updating multiple rows.
ParkIDParkNameStateRegion
1YellowstoneWyomingMountain

Here, Region depends on State, not directly on ParkID.

Corrected (3NF):

Parks Table

ParkIDParkNameStateID
1Yellowstone1

States Table

StateIDStateNameRegion
1WyomingMountain

This removes the transitive dependency — Region is tied to StateID, not ParkID.

Boyce-Codd Normal Form (BCNF, 3.5NF)

Rule:

  • A stronger version of 3NF
  • For every dependency X → Y, X must be a candidate key

Problem Example (Not in BCNF):
Suppose we track park rangers and assign them to regions:

  • Region → LeadRanger exists, but Region is not a candidate key.
  • Multiple rows are repeated for the same region, creating redundancy.
  • Changing a lead ranger’s name requires updating multiple rows — risk of inconsistency.
RangerIDRangerNameRegionLeadRanger
1AliceMountainBob
2CharlieMountainBob

Here, Region → LeadRanger, but Region is not a candidate key (multiple rangers belong to the same region). This creates anomalies — if Bob’s details change, they must be updated in multiple rows.

Corrected (BCNF):

We separate rangers from regional leadership assignments, and use IDs instead of names:

Rangers Table

RangerIDRangerNameRegion
1AliceMountain
2BobMountain
3CharlieMountain

Regions Table

RegionLeadRangerID
Mountain2

Now:

  • Ranger details are stored in one place.
  • Region → LeadRangerID is valid, and LeadRangerID points to a candidate key (RangerID).
  • No anomalies remain.

Key Takeaway

  • 1NF: Eliminate repeating groups and ensure atomic values.
  • 2NF: Eliminate partial dependencies on part of a composite key.
  • 3NF: Eliminate transitive dependencies.
  • BCNF: Eliminate all anomalies by ensuring every determinant is a candidate key.

Most real-world databases aim for 3NF, but certain edge cases benefit from BCNF for maximum data integrity.

Scroll to Top