Learn by reading through in order

WHERE Deep Dive ③ — IS NULL and IN for NULL and Set Filtering

The third WHERE deep-dive article — filtering on NULL and on sets. Cover three-valued logic, IS NULL / IS NOT NULL, IN / NOT IN, and the NOT IN + NULL gotcha, all on a CSV-loaded customer dataset.

Data we'll use — the customer table (with NULLs)

The third WHERE deep-dive article covers NULL and set-based filtering. We'll work with a customer table (8 rows) loaded from CSV, where the age / email / country columns each have rows with NULL (unknown value). In real-world tables it's normal for non-required columns to mix in NULLs — "email not registered," "country left blank," and so on.

The five exercises in this article walk through, in order: the right way to fetch NULLs (IS NULL / IS NOT NULL), candidate-list matching (IN / NOT IN), and finally the easy-to-miss gotcha when using NOT IN against data that contains NULLs.

Before the exercises, confirm the customer table's column definitions and a sample of the data.

① Run PRAGMA table_info(customer); to check column names, types, and primary keys.

② Run SELECT * FROM customer LIMIT 5; to preview the first 5 rows. Notice the cells that are NULL.

SQL Editor

Run a query to see results

NULL and three-valued logic — true, false, unknown

In most programming languages a condition is either TRUE or FALSE — two choices. SQL adds one more state: NULL (unknown). This is called three-valued logic. NULL doesn't mean "the value is empty"; it means "the value is not known." Even NULL = NULL evaluates to NULL (unknown) — comparing two unknowns leaves the result unknown.

WHERE condition only keeps rows where the condition is TRUE. Both NULL and FALSE rows get excluded. Once that rule clicks, you'll see why WHERE email = NULL always comes back empty.

SQL three-valued logic
TRUEFALSENULL(unknown)
A condition's evaluation result is one of these three. WHERE keeps a row only when the result is TRUE — FALSE and NULL are both treated as "excluded."
Three-valued logic — TRUE / FALSE / NULL evaluation
ExpressionEvaluates toKept by WHERE?name = 'Alice'(when name is Alice)TRUE○ keptname = 'Bob'(when name is Alice)FALSE× excludedemail = NULL(when email is NULL)NULL× excluded(not TRUE)email IS NULL(when email is NULL)TRUE○ kept
WHERE keeps only rows where the condition is TRUE. Comparisons that involve NULL evaluate to NULL, and NULL is not TRUE, so the row is excluded.

Why `email = NULL` doesn't return anything

The comparison email = NULL always returns NULL (unknown) when either side is NULL. 'unknown' = NULL and NULL = NULL both evaluate to NULL. WHERE only keeps rows where the condition is TRUE, so NULL conditions get excluded, leaving 0 rows.

To fetch NULLs, you need the dedicated syntax column IS NULL / column IS NOT NULL. These don't do equality comparison — they directly check whether the value is NULL — and the result is always TRUE or FALSE.

"Zero customers haven't churned"?

When pulling "the count of customers who haven't churned (deleted_at IS NULL)," I once wrote SELECT COUNT(*) FROM user WHERE deleted_at = NULL; and the result came back as 0.

In reality there were thousands of active users, but deleted_at = NULL always evaluates to NULL (unknown) and gets excluded by WHERE, so no matter how many users existed the result was always 0. It's an easy mistake when you're starting out with SQL — make a habit of using `IS NULL` / `IS NOT NULL` for every NULL check.

IS NULL and IS NOT NULL — fetching unknown values

To fetch rows where a value is NULL (unknown), use `column IS NULL`. To fetch rows that are not NULL, use column IS NOT NULL. Just remember never use `= NULL` or `<> NULL` and you'll dodge the three-valued-logic trap.

As the code below shows, the syntax is just IS instead of = — but the result is completely different. = NULL returns nothing; IS NULL correctly returns the NULL rows.

How `= NULL` and `IS NULL` differ
WHERE email = NULLWHERE email IS NULLNULL = NULL→ NULLemail IS NULL→ TRUENULL is not TRUE,so always excludedTRUE rows are kept,so NULL rows come throughResult: 0 rowsResult: 4 rows(Bob / Dave / Frank / Henry)
= NULL always evaluates to NULL (unknown), so WHERE returns 0 rows. IS NULL directly checks for NULL and returns either TRUE or FALSE.
-- 1) NG: = NULL doesn't fetch anything (always NULL = excluded)
SELECT name, email FROM customer WHERE email = NULL;
-- Result: 0 rows

-- 2) OK: IS NULL fetches the NULL rows
SELECT name, email FROM customer WHERE email IS NULL;
-- Result: 4 rows (Bob / Dave / Frank / Henry)

-- 3) Fetch the non-NULL rows
SELECT name, email FROM customer WHERE email IS NOT NULL;
-- Result: 4 rows (Alice / Carol / Eve / Grace)

Imagine customer support wants "a list of customers without a registered email so we can send them a registration request." (Once you run it correctly, the explanation will appear.)

① From customer, pull the name and email columns.

② Filter to rows where `email` is NULL.

③ Confirm the result is 4 rows (Bob / Dave / Frank / Henry, all with NULL email).

SQL Editor

Run a query to see results

Imagine targeted-ad delivery wants "customers with both age and email registered," sorted by highest age first.

① From customer, pull the name, age, and email columns.

② Filter to rows where neither `age` nor `email` is NULL.

③ Sort by `age` descending.

④ Confirm the result is 3 rows (Eve 42 / Grace 35 / Alice 30).

SQL Editor

Run a query to see results

IN and NOT IN — matching against a candidate list

When you want to test whether a column's value is in a list of candidates — "country is Japan, US, or UK" — that's IN. Writing column IN (value1, value2, ...) returns rows where the column matches any of the candidates. It means the same as column = value1 OR column = value2 OR column = value3; people pick IN often just for readability.

Adding NOT IN returns rows that don't match any candidate.

How `IN` works — matching against a candidate list
country valueIN ('Japan', 'US')Kept by WHERE?JapanTRUE○ keptUSTRUE○ keptUKFALSE× excludedItalyFALSE× excludedNULLNULL(unknown)× excluded(NULL is not TRUE)
IN tests whether the value matches any of the candidates. Rows where the column itself is NULL evaluate to NULL under both IN and NOT IN, so they get excluded — for NULL-containing data, combine with IS NULL / IS NOT NULL.
-- 1) IN: matches any candidate
SELECT name, country FROM customer
WHERE country IN ('Japan', 'US');

-- Same meaning, written with OR
SELECT name, country FROM customer
WHERE country = 'Japan' OR country = 'US';

-- 2) NOT IN: matches none of the candidates
SELECT name, country FROM customer
WHERE country NOT IN ('Japan');

Imagine a need to "send an announcement email to customers in Japan or the US."

① From customer, pull the name and country columns.

② Filter to rows where `country` is `'Japan'` or `'US'` (use IN).

③ Confirm the result is 5 rows (Alice Japan / Bob US / Carol Japan / Frank Japan / Grace US).

SQL Editor

Run a query to see results

The `NOT IN` + NULL gotcha

NOT IN ('Japan') looks like it grabs "customers whose country is not Japan," but it also excludes rows where `country` is NULL. That's because country NOT IN ('Japan') internally evaluates as country <> 'Japan' (not equal), and NULL <> 'Japan' returns NULL (unknown). NULL isn't TRUE, so WHERE drops the row.

Taking it further, writing NOT IN ('Japan', NULL) — with a NULL inside the candidate list — causes every row to evaluate to NULL and disappear. When using NOT IN against a column that might contain NULL, always make NULL handling explicit (column NOT IN (...) AND column IS NOT NULL) or rewrite the query in terms of IN.

Imagine a need to "offer a premium perk to the 2 oldest non-Japan customers."

① From customer, pull the name, age, and country columns.

② Filter to rows where `country` is not `'Japan'` (NOT IN).

③ Sort by `age` descending, and take only the first 2 rows.

④ Confirm the result is 2 rows (Grace 35 US / Henry 29 Italy). The reason Eve (country NULL) is excluded will be explained in the result section.

SQL Editor

Run a query to see results

Building on the gotcha from Practice 4, imagine you want to grab both "customers confirmed to not live in Japan" and "customers whose country is unknown."

① From customer, pull the name, age, and country columns.

② Include rows where `country` is not `'Japan'` plus rows where `country` is NULL (joined with OR).

③ Sort by `age` descending.

④ Confirm the result is 5 rows (Eve 42 NULL / Grace 35 US / Henry 29 Italy / Dave 28 UK / Bob 25 US).

SQL Editor

Run a query to see results
QUIZ

Knowledge Check

Answer each question one by one.

Q1What does SELECT * FROM customer WHERE email = NULL; return?

Q2Which of the following correctly describes filtering with IN?

Q3Running WHERE country NOT IN ('Japan') on a customer table where the country column contains NULLs, how are the NULL-country rows handled?