Learn by reading through in order

Table Design and Normalization (3-Tier Schema)

Use orders_flat with item1/item2 repeating columns to feel the update, insert, and delete anomalies firsthand, then split the table step by step through 1NF / 2NF / 3NF into the customer / order_record / order_line 3-tier schema.

The dataset for this article — orders_flat (an unnormalized example)

Normalization (a design technique that splits tables to reduce duplicated and inconsistent data) is the idea of breaking up a single giant table that holds everything into multiple tables that are easier to keep consistent.

In this article you'll see the problems that occur in an unnormalized table with repeating columns, then build up an understanding of 1NF / 2NF / 3NF and the 3-tier schema through hands-on design exercises.

The dataset is orders_flat (2 rows), which crams order data into a single table.

It's a classic unnormalized table where customer name, phone number, product 1, and product 2 are all lined up in one row — and you'll normalize it into the three tables customer / order_record / order_line.

Before jumping into the exercises, take a look at the column definitions and sample data of the orders_flat table.

① Run PRAGMA table_info(orders_flat); to check the column names and types. Notice how the product columns appear as repeating columns like item1 / item2.

② Run SELECT * FROM orders_flat; to preview all rows. Also observe how rows with only one product end up with an empty string or NULL in item2.

SQL Editor

Run a query to see results

Three anomalies in an unnormalized table

A design like orders_flat that crams everything into a single table easily produces contradictions when you add, change, or delete data.

The classic examples are the three: update anomaly / insert anomaly / delete anomaly.

An update anomaly happens when the same fact is duplicated across many rows — for example, if you want to change a phone number, you have to update every row without missing any.

An insert anomaly is when you can't register a customer who hasn't placed an order yet, because there's no order row to attach the customer information to.

A delete anomaly is when deleting a single order also wipes out information you wanted to keep, like the customer's phone number.

These happen because the design doesn't hold "one fact in one place".

Three anomalies in an unnormalized table
orders_flat(customer + order + product in one)Update anomalyInsert anomalyDelete anomalyPhone number duplicated across rowsMissing one update causes mismatchCan't register a customerwith no ordersDeleting an order alsowipes the customer info
Cramming customers, orders, and products into one table causes contradictions or information loss on update, insert, and delete. That's the motivation for splitting tables via normalization.
-- Delete anomaly example: deleting order 2 also wipes Bob's contact info
-- (observe the behavior on a throwaway table flat_demo so the real orders_flat is untouched)
DROP TABLE IF EXISTS flat_demo;
CREATE TABLE flat_demo(
  order_id INTEGER, customer TEXT, customer_tel TEXT, product TEXT
);
INSERT INTO flat_demo VALUES
  (1,'Alice','03-1111','Pen'),
  (2,'Bob','03-2222','Clip');

-- Bob has only one order. Deleting it also loses Bob's phone number 03-2222
DELETE FROM flat_demo WHERE order_id = 2;
SELECT * FROM flat_demo;
-- -> Bob's contact info is now gone from the database (delete anomaly)

Reproduce on a throwaway table the situation where "Alice's phone number has changed and you want to update it, but the same phone number is duplicated across multiple rows." This exercise stands on its own — you'll create a dedicated table flat_anomaly to keep it self-contained. (If you run it correctly, the explanation will appear.)

DROP TABLE IF EXISTS the flat_anomaly table, then CREATE it with 4 columns: order_id / customer / customer_tel / product.

② Insert 2 rows for Alice's orders (both with the same phone number 03-1111; products are Pen and Note).

③ Run an UPDATE that changes Alice's phone number to 03-9999 but only hits one row (e.g. with a condition like WHERE order_id = 1), then run SELECT * FROM flat_anomaly; and confirm that the two rows' phone numbers don't match.

SQL Editor

Run a query to see results

1NF through 3NF — split step by step

Tables are organized through normalization in stages.

1NF — get rid of repeating columns

Drop numbered repeating columns like item1 / item2.

Before — item1 / item2 repeating columns

order_idcustomertelitem1item1_qtyitem2item2_qty
1Alice03-1111Pen2Note1
2Bob03-2222Clip5(empty)NULL

After — one row per product, with the composite primary key `(order_id, product)`

order_id (PK)customertelproduct (PK)qty
1Alice03-1111Pen2
1Alice03-1111Note1
2Bob03-2222Clip5

2NF — move columns that depend on only part of the primary key into another table

The table from the previous step has the composite primary key (order_id, product). But look closely: customer and tel are determined by `order_id` alone (the same order_id always has the same customer).

Moving these columns that depend on only part of the primary key (partial dependency) into a separate table is 2NF. You split it into an order header order_header and order details order_line.

Before — customer and tel duplicated on every row

order_id (PK)customertelproduct (PK)qty
1Alice03-1111Pen2
1Alice03-1111Note1
2Bob03-2222Clip5

After — split into 2 tables

order_header (one row per order)

order_id (PK)customertel
1Alice03-1111
2Bob03-2222

order_line (composite primary key (order_id, product), where order_id is a foreign key into order_header)

order_id (PK,FK)product (PK)qty
1Pen2
1Note1
2Clip5

3NF — move columns determined by non-key columns into yet another table

Looking at the order_header from the previous step, tel is not determined by order_id but by `customer` (the same Alice always has 03-1111).

Eliminating dependencies that go through another non-key column instead of straight from the primary key (transitive dependency: order_id -> customer -> tel) is what 3NF is about.

You extract a customer master customer and leave order_record holding only the foreign key to the customer.

Before — tel depends on customer

order_id (PK)customertel
1Alice03-1111
2Bob03-2222

After — split into 3 tables

customer (customer master)

customer_id (PK)customer_namecustomer_tel
1Alice03-1111
2Bob03-2222

order_record (order header: references customer via foreign key)

order_id (PK)customer_id (FK)
11
22

order_line (order details: same as the previous step)

order_id (PK,FK)product (PK)qty
1Pen2
1Note1
2Clip5

Once split this far, each fact (a customer's phone number, the link between order and customer, the quantity of a product in an order) lives in exactly one place. That eliminates the update, insert, and delete anomalies.

Table structure before normalization vs. after 3NF
Before normalization1 tableAfter 3NF3 tablesorders_flatorder_id / customer / telitem1 / item1_qtyitem2 / item2_qtycustomercustomer_id (PK)customer_namecustomer_telorder_recordorder_id (PK)customer_id (FK)order_lineorder_id (FK), product (PK)qtyPK(order_id, product)splitFKFK
The orders_flat table crammed into one is broken up into three tables: customer master, order header, and order details. The original information can be reconstructed with a JOIN through the foreign keys.

The 3-tier schema and the final form — customer / order_record / order_line

The normalized schema can be organized as three tiers with different roles.

  • Master tier: entities that change infrequently and are referenced by others, like customer
  • Transaction tier: tables that record business events, like order_record
  • Detail tier: tables that hold the breakdown rows of one event, like order_line

By pointing the referencing side at the master with a foreign key, each fact (such as a customer's phone number) ends up in one place.

The final form is these three tables:

customer(customer_id primary key, customer_name, customer_tel), order_record(order_id primary key, customer_id -> customer), order_line(order_id -> order_record, product, qty, composite primary key (order_id, product)).

The phone number lives in just one row of customer, so an update is a single row; a customer with no orders can still be added to customer (insert anomaly resolved); and deleting an order leaves the customer in customer (delete anomaly resolved).

3-tier schema — master / transaction / detail
Master tiercustomer(customer_id PK, name, tel)Transaction tierorder_record(order_id PK, customer_id FK)Detail tierorder_line(order_id FK, product, qty, PK(order_id,product))order_record referencescustomerorder_line referencesorder_recordOne fact in one placefixes all 3 anomalies
Customers live in the master tier, orders in the transaction tier, and order details in the detail tier, linked together with foreign keys. Each fact ends up in one place and all three anomalies are resolved.
-- Benefit of normalization (read-only): a customer's phone number is in just one row of customer
-- In the orders_flat shape you'd have to fix every Alice order row = update anomaly
UPDATE customer SET customer_tel = '03-9999'
WHERE customer_name = 'Alice';

-- Fix one spot, and the new number flows into every order on JOIN
SELECT o.order_id, c.customer_name, c.customer_tel
FROM order_record o
JOIN customer c ON o.customer_id = c.customer_id
ORDER BY o.order_id;

Imagine the requirement: "Normalize orders_flat into three tables — customer, order, and order details." Design the final schema with foreign keys linking them, move the data over, and rebuild the original list with a JOIN.

① Turn on foreign key checks, then DROP TABLE IF EXISTS in the order order_line -> order_record -> customer (dropping the child first is safer).

CREATE the customer table (customer_id INTEGER primary key / customer_name TEXT NOT NULL / customer_tel TEXT).

CREATE the order_record table (order_id INTEGER primary key / customer_id INTEGER NOT NULL, foreign key into customer).

CREATE the order_line table (order_id INTEGER, foreign key into order_record / product TEXT NOT NULL / qty INTEGER NOT NULL / composite primary key (order_id, product)).

⑤ Insert Alice(1, 03-1111) and Bob(2, 03-2222) into customer; order 1 -> customer 1 and order 2 -> customer 2 into order_record; Pen(order 1, 2) / Note(order 1, 1) / Clip(order 2, 5) into order_line.

⑥ Join the 3 tables on order_id and customer_id and select order_id / customer_name / customer_tel / product / qty in ascending order of order_id and product to confirm you can reconstruct the original list.

SQL Editor

Run a query to see results

Using the customer / order_record / order_line tables from Exercise 2, confirm one by one that the update, insert, and delete anomalies that plagued the unnormalized table no longer happen. This is the last exercise of the article.

Update anomaly resolved: Change Alice's phone number to '03-9999'. Confirm it's a single-row UPDATE on the customer table.

Insert anomaly resolved: INSERT a new customer Charlie (customer_id 3 / customer_name 'Charlie' / customer_tel '03-3333') into customer. Confirm you can register a customer with no orders.

Delete anomaly resolved: Delete order 1. Delete from the child order_line first, then from order_record. Run SELECT * FROM customer ORDER BY customer_id; and confirm Alice's row is still in customer.

SQL Editor

Run a query to see results

Tips — Always put a primary key on every table

Whether or not you've normalized, design every table with a primary key (`PRIMARY KEY`). A primary key gives you three things:

- Unique row identification: no two rows can have the same key, so UPDATE / DELETE reliably targets exactly one row via WHERE pk = ...

- Referenceable from other tables: a foreign key needs a primary key (or UNIQUE) on the other side. It's the foundation for linking the tables you split via normalization

- Automatic indexing: primary key columns get a B-Tree index automatically, speeding up equality lookups and JOINs

If no single column is unique on its own, use a composite primary key like PRIMARY KEY (order_id, product). Skipping it with "I'll add it later" makes duplicate rows indistinguishable and gives up all three benefits above.

QUIZ

Knowledge Check

Answer each question one by one.

Q1A customer's phone number is duplicated across multiple order rows, so changing the phone number requires updating every row without missing any — otherwise the values disagree. What is this problem called?

Q2Which operation gets a table into 1NF?

Q3After normalization, orders_flat is split into the three tables customer / order_record / order_line. How do you get the original combined list back?