Learn by reading through in order

EXISTS and Correlated Subqueries

Learn SQL EXISTS and correlated subqueries: EXISTS / NOT EXISTS, how to safely avoid the NOT IN + NULL pitfall with EXISTS, and when to pick EXISTS over IN — practiced on employee and sales data, all running live in your browser.

Data used in this article — employee / sales / department

EXISTS is a subquery that checks only whether at least one matching row exists, and you use it much like IN.

In this article you'll work through the basics of EXISTS / NOT EXISTS, how to safely avoid the NOT IN + NULL pitfall by writing it with EXISTS, and how to choose between EXISTS and IN based on readability.

The material is three employee-data tables.

Using employee (30 employees), sales (50 sales line items), and department (6 departments), you'll pull out "employees with / without sales" and "departments with no employees".

Before diving into the exercises, take a look at the column definitions and sample data of the three tables this article uses — employee / sales / department.

① Run PRAGMA table_info(...) to check the column definitions of the three tables.

② Run SELECT * FROM table_name LIMIT 5; to preview the first 5 rows of each table. The fact that employee.dept_id can be NULL also comes up in a later exercise.

SQL Editor

Run a query to see results

EXISTS / NOT EXISTS — just check whether a row exists

WHERE EXISTS (SELECT ... ) is true if the subquery returns even one row, and false if it returns no rows at all.

What the subquery selects doesn't affect the result, so by convention you write SELECT 1 (the value isn't used — you only care whether a row exists).

The basic form is a correlated subquery that references a column from the main query inside the EXISTS.

"Employees with sales" is WHERE EXISTS (SELECT 1 FROM sales sale WHERE sale.emp_id = emp.emp_id), and "employees with no sales" is the negation of that, NOT EXISTS.

EXISTS stops evaluating as soon as it finds one row, which makes it a good fit for existence checks.

-- EXISTS: employees with sales
SELECT emp.emp_id, emp.name
FROM employee emp
WHERE EXISTS (
  SELECT 1 FROM sales sale WHERE sale.emp_id = emp.emp_id
)
ORDER BY emp.emp_id;
How EXISTS / NOT EXISTS is evaluated
Employee from main querySales in subqueryVerdictBob(emp.emp_id=2)5 sales rowsfoundEXISTS true→ keptDave(emp.emp_id=4)0 sales rowsNOT EXISTS true→ kept
This is how the sample code above behaves. Every time the main query pulls out one employee, the subquery checks whether that employee has even one sales row (sale.emp_id = emp.emp_id). If there's at least one row, EXISTS is true; if there are no rows at all, NOT EXISTS is true.

Imagine the requirement "I want a list of employees who don't appear in the sales line items even once (employees who haven't booked any sales yet)." (Run it correctly and the explanation appears.)

① Give the employee table the alias e and pull out emp_id and name.

② Use NOT EXISTS to narrow to rows where that employee has no row at all in sales. Correlate the subquery with the main query in the form SELECT 1 FROM sales sale WHERE sale.emp_id = emp.emp_id.

③ Sort by emp_id in ascending order.

SQL Editor

Run a query to see results

The NOT IN + NULL pitfall — EXISTS is safe

If you write "negative filtering" with NOT IN, the result can come back as 0 rows when a NULL gets mixed into the subquery side.

x NOT IN (1, 2, NULL) is evaluated as "x is not 1, not 2, and not NULL," but x <> NULL never becomes true and is always NULL (unknown), so the whole NOT IN never becomes true.

The full reasoning is explained in Subqueries ① (the NOT IN + NULL pitfall).

Since employee.dept_id mixes in NULLs, writing "departments with no employees" as WHERE dept_id NOT IN (SELECT dept_id FROM employee) makes Legal (a department with no employees at all), which should be there, disappear and the result comes back as 0 rows.

NOT EXISTS only looks at whether a row exists and isn't affected by NULL, so in this situation NOT EXISTS returns the correct result.

The NOT IN × NULL pitfall and why NOT EXISTS is safe
NOT IN (weak to NULL)NOT EXISTS (safe)dept_id NOT IN(SELECT dept_id FROM employee)NOT EXISTS(SELECT 1 FROM employee emp WHERE emp.dept_id = d.dept_id)NULL mixed in0 rows(Legal disappears)Returns Legalcorrectly
When a NULL is mixed into the subquery, NOT IN never becomes true and the result comes back as 0 rows. NOT EXISTS only looks at whether a row exists, so it isn't affected by NULL.
-- Pitfall: a subquery containing NULL + NOT IN comes back as 0 rows
SELECT dept_id, dept_name
FROM department
WHERE dept_id NOT IN (SELECT dept_id FROM employee);
-- → 0 rows because employee.dept_id contains NULL

-- Safe: NOT EXISTS isn't affected by NULL
SELECT d.dept_id, d.dept_name
FROM department d
WHERE NOT EXISTS (
  SELECT 1 FROM employee emp WHERE emp.dept_id = d.dept_id
);

Imagine the requirement "I want to list departments with no employees assigned to them at all." Since employee.dept_id mixes in NULLs, write it safely with NOT EXISTS.

① Give the department table the alias d and pull out dept_id and dept_name.

② Use NOT EXISTS to narrow to rows where that department has no employee row at all in employee. Correlate the subquery in the form SELECT 1 FROM employee emp WHERE emp.dept_id = d.dept_id.

③ Sort by dept_id in ascending order.

SQL Editor

Run a query to see results

Check what happens when you write the previous exercise with NOT IN. This exercise is meant to let you experience the mistake firsthand, so run it as NOT IN and observe that the result comes back as 0 rows.

① Pull out dept_id and dept_name from the department table.

② Try to list "departments with no employees" using WHERE dept_id NOT IN (SELECT dept_id FROM employee).

③ Run it and confirm that the result is 0 rows (the Legal department doesn't come out). Think about why the result differs from the NOT EXISTS version in Practice 2.

SQL Editor

Run a query to see results

Choose EXISTS by readability

EXISTS only looks at whether a row exists, so it's safe even when NULL is present, but writing a correlated subquery can make it look longer than IN.

For a simple existence check there are situations where IN reads better, and in real work you choose based on context.

On the other hand, for a negation like NOT IN where NULL breaks the result, or when you want to reliably judge only whether a row exists, EXISTS / NOT EXISTS is the safe choice.

QUIZ

Knowledge Check

Answer each question one by one.

Q1Which is the correct description of WHERE EXISTS (SELECT 1 FROM sales sale WHERE sale.emp_id = emp.emp_id)?

Q2When a NULL can mix into the subquery side, what problem can occur if you write negative filtering with NOT IN?

Q3Which is an appropriate way to decide between EXISTS and IN?