Q1What does SELECT * FROM customer WHERE email = NULL; return?
WHERE Deep Dive ③ — IS NULL and IN for NULL and Set Filtering
The third WHERE deep-dive article — filtering on NULL and on sets. Cover three-valued logic, IS NULL / IS NOT NULL, IN / NOT IN, and the NOT IN + NULL gotcha, all on a CSV-loaded customer dataset.
Data we'll use — the customer table (with NULLs)
The third WHERE deep-dive article covers NULL and set-based filtering. We'll work with a customer table (8 rows) loaded from CSV, where the age / email / country columns each have rows with NULL (unknown value). In real-world tables it's normal for non-required columns to mix in NULLs — "email not registered," "country left blank," and so on.
The five exercises in this article walk through, in order: the right way to fetch NULLs (IS NULL / IS NOT NULL), candidate-list matching (IN / NOT IN), and finally the easy-to-miss gotcha when using NOT IN against data that contains NULLs.
NULL and three-valued logic — true, false, unknown
In most programming languages a condition is either TRUE or FALSE — two choices. SQL adds one more state: NULL (unknown). This is called three-valued logic. NULL doesn't mean "the value is empty"; it means "the value is not known." Even NULL = NULL evaluates to NULL (unknown) — comparing two unknowns leaves the result unknown.
WHERE condition only keeps rows where the condition is TRUE. Both NULL and FALSE rows get excluded. Once that rule clicks, you'll see why WHERE email = NULL always comes back empty.
Why `email = NULL` doesn't return anything
The comparison email = NULL always returns NULL (unknown) when either side is NULL. 'unknown' = NULL and NULL = NULL both evaluate to NULL. WHERE only keeps rows where the condition is TRUE, so NULL conditions get excluded, leaving 0 rows.
To fetch NULLs, you need the dedicated syntax column IS NULL / column IS NOT NULL. These don't do equality comparison — they directly check whether the value is NULL — and the result is always TRUE or FALSE.
"Zero customers haven't churned"?
When pulling "the count of customers who haven't churned (deleted_at IS NULL)," I once wrote SELECT COUNT(*) FROM user WHERE deleted_at = NULL; and the result came back as 0.
In reality there were thousands of active users, but deleted_at = NULL always evaluates to NULL (unknown) and gets excluded by WHERE, so no matter how many users existed the result was always 0. It's an easy mistake when you're starting out with SQL — make a habit of using `IS NULL` / `IS NOT NULL` for every NULL check.
IS NULL and IS NOT NULL — fetching unknown values
To fetch rows where a value is NULL (unknown), use `column IS NULL`. To fetch rows that are not NULL, use column IS NOT NULL. Just remember never use `= NULL` or `<> NULL` and you'll dodge the three-valued-logic trap.
As the code below shows, the syntax is just IS instead of = — but the result is completely different. = NULL returns nothing; IS NULL correctly returns the NULL rows.
-- 1) NG: = NULL doesn't fetch anything (always NULL = excluded)
SELECT name, email FROM customer WHERE email = NULL;
-- Result: 0 rows
-- 2) OK: IS NULL fetches the NULL rows
SELECT name, email FROM customer WHERE email IS NULL;
-- Result: 4 rows (Bob / Dave / Frank / Henry)
-- 3) Fetch the non-NULL rows
SELECT name, email FROM customer WHERE email IS NOT NULL;
-- Result: 4 rows (Alice / Carol / Eve / Grace)
IN and NOT IN — matching against a candidate list
When you want to test whether a column's value is in a list of candidates — "country is Japan, US, or UK" — that's IN. Writing column IN (value1, value2, ...) returns rows where the column matches any of the candidates. It means the same as column = value1 OR column = value2 OR column = value3; people pick IN often just for readability.
Adding NOT IN returns rows that don't match any candidate.
-- 1) IN: matches any candidate
SELECT name, country FROM customer
WHERE country IN ('Japan', 'US');
-- Same meaning, written with OR
SELECT name, country FROM customer
WHERE country = 'Japan' OR country = 'US';
-- 2) NOT IN: matches none of the candidates
SELECT name, country FROM customer
WHERE country NOT IN ('Japan');
The `NOT IN` + NULL gotcha
NOT IN ('Japan') looks like it grabs "customers whose country is not Japan," but it also excludes rows where `country` is NULL. That's because country NOT IN ('Japan') internally evaluates as country <> 'Japan' (not equal), and NULL <> 'Japan' returns NULL (unknown). NULL isn't TRUE, so WHERE drops the row.
Taking it further, writing NOT IN ('Japan', NULL) — with a NULL inside the candidate list — causes every row to evaluate to NULL and disappear. When using NOT IN against a column that might contain NULL, always make NULL handling explicit (column NOT IN (...) AND column IS NOT NULL) or rewrite the query in terms of IN.
Knowledge Check
Answer each question one by one.
Q2Which of the following correctly describes filtering with IN?
Q3Running WHERE country NOT IN ('Japan') on a customer table where the country column contains NULLs, how are the NULL-country rows handled?