Index Design — Composite Column Order and Conditions That Use the Index
Compare composite index column order, the leftmost prefix rule, the equality-then-range ordering, and how wrapping a column in a function falls back to SCAN, all read out of EXPLAIN QUERY PLAN.
The dataset for this article — perf_sales (50,000 rows)
From here on, the tuning articles use a table with many rows to observe how indexes actually behave.
This article covers the column order of a composite index (an index built across several columns in a specific order), the leftmost prefix rule, and how to write conditions that can use the index (also called SARGable conditions).
Before the practice, take a quick look at the column definitions and the size and a sample of the data for perf_sales. (Run the queries correctly and the explanation will be revealed.)
① Use PRAGMA table_info(perf_sales); to view the column names, types, and primary key.
② Use SELECT COUNT(*) FROM perf_sales; to confirm the row count.
③ Preview the first 5 rows with SELECT emp_id, region, product, amount, sale_date, status FROM perf_sales LIMIT 5; (since the table has 50,000 rows, never pull every row with SELECT * — use aggregation or LIMIT to peek).
SQL Editor
Run a query to see results
Schema
No tables
Composite index column order — the leftmost prefix rule
A composite index (such as CREATE INDEX ix ON table(col_A, col_B), listing several columns in a fixed order) stores its entries in exactly that order.
The index is sorted like a dictionary: by col_A ascending, and within the same col_A by col_B ascending.
Because of this, whether the index helps depends on whether your conditions cover the columns continuously from the leading one.
This is called the leftmost prefix rule (a composite index only helps conditions specified continuously from the leading column).
An index on (emp_id, sale_date) works for a condition on emp_id, or on both emp_id and sale_date together, but not for a condition on sale_date alone — the leading emp_id is missing, so the index isn't used.
Leftmost prefix rule — the (emp_id, sale_date) indexThe composite index (emp_id, sale_date) is sorted by emp_id first. Conditions that include the leading emp_id can use the index; skipping emp_id and filtering on sale_date alone leaves the leading column missing, so the index doesn't kick in.
-- Build a composite index on region and product, then compare-- the plans for a condition that includes the leading region-- with one that uses product aloneDROPINDEXIFEXISTS ix_demo;CREATEINDEXix_demoON perf_sales(region, product);-- Uses the leading column region -> ix_demo kicks inEXPLAIN QUERY PLANSELECTCOUNT(*) FROM perf_sales WHERE region ='East';-- Skips the leading region and filters by product only -> ix_demo doesn't kick inEXPLAIN QUERY PLANSELECTCOUNT(*) FROM perf_sales WHERE product ='P050';
Build a composite index on emp_id and sale_date over perf_sales and confirm the leftmost prefix rule by reading the execution plan. Recreate the index and show the plans inside a single console run so the steps are self-contained. (Run the queries correctly and the explanation will be revealed.)
① Start with DROP INDEX IF EXISTS so the index can be rebuilt even if a same-named one is left over.
② Create a composite index on perf_sales listing emp_id and sale_datein that order.
③ Use EXPLAIN QUERY PLAN to show the plan for an aggregate query filtered by emp_id alone.
④ Then show the plan for an aggregate query filtered by sale_date alone and compare how the plan changes from ③.
SQL Editor
Run a query to see results
Schema
No tables
List columns in equality-then-range order
There's a guideline for the order of columns in a composite index.
Put equality conditions (a = or IN that pins a column to one value) at the front, and range conditions (>, >=, <, BETWEEN, etc., that cover a span) at the back.
Pinning the leading column with equality keeps the matching entries in one contiguous block within the index, so you can carve that block down all at once.
A condition like WHERE emp_id = 7 AND sale_date >= '2024-01-01' is a natural fit for the (emp_id, sale_date) order.
How equality-first, range-second filtering worksThe (emp_id, sale_date) index is sorted by emp_id, and within the same emp_id by sale_date. (1) Pin the equality condition emp_id=7 to a contiguous block, then (2) narrow within that block using the range condition sale_date >= '2024-01-01'.Order the columns: equality, then rangePut the equality column emp_id first and the matches collapse into one contiguous block, which the range condition on sale_date can carve down further. Lead with the range column and the index opens up too wide, so the next column's filtering loses traction.
-- Index region (equality) and amount (range) together,-- with the equality column first and the range column secondDROPINDEXIFEXISTS ix_demo;CREATEINDEXix_demoON perf_sales(region, amount);EXPLAIN QUERY PLANSELECTCOUNT(*) FROM perf_salesWHERE region ='West'AND amount >=500000;
Picture the requirement: "We want to roll up sales for a specific rep from a given date onward." The conditions are equality on emp_id and a range on sale_date. Build a composite index over these two columns in equality-then-range order and confirm with the execution plan that it turns into an index seek.
① Run DROP INDEX IF EXISTS to drop any previous index.
② Create a composite index with the equality column first and the range column second.
③ Use EXPLAIN QUERY PLAN on an aggregate query that combines the equality on emp_id with the range on sale_date via AND, and confirm that it becomes an index seek.
SQL Editor
Run a query to see results
Schema
No tables
Conditions that can use the index — compare the column as-is, not wrapped
Even with an index in place, the way you write the condition decides whether it gets used.
A condition shaped so the index can narrow it down is called a condition that can use the index (in English this is also known as SARGable).
The basic rule: compare the indexed column as-is, without wrapping it in any computation.
Something like WHERE emp_id = 7, where the column is compared directly by equality or range, lets the index kick in.
On the other hand, WHERE emp_id + 0 = 7 (the column sits inside an expression) or WHERE substr(sale_date,1,4) = '2024' (the column is wrapped in a function) breaks the correspondence with the index's ordering and falls back to a full row scan.
Keep the column raw on the left side of the comparison, and if any adjustment is needed, apply it to the value on the right side.
Conditions that do and don't use the indexCompare the column as-is and the index can narrow it down. Wrap the column in an expression or function and it stops lining up with the index's ordering, falling back to a full row scan.
-- Same meaning, different shapes -> different plansDROPINDEXIFEXISTS ix_demo;CREATEINDEXix_demoON perf_sales(amount);-- Column compared as-is -> index kicks in, SEARCHEXPLAIN QUERY PLANSELECTCOUNT(*) FROM perf_sales WHERE amount >=800000;-- Column inside an expression -> index doesn't kick in, falls back to SCANEXPLAIN QUERY PLANSELECTCOUNT(*) FROM perf_sales WHERE amount + 0>=800000;
Observe how two conditions with the same meaning — one comparing the column as-is and one wrapping it in an expression — produce different execution plans. Build an index on emp_id and compare the plans for the two shapes.
① Run DROP INDEX IF EXISTS to drop the previous index.
② Create a single-column index on emp_id.
③ Show the plan for an aggregate query that compares emp_id as-is with equality.
④ Then show the plan for an aggregate query of the same meaning but with emp_id inside an expression, and compare how the plan changes from ③.
SQL Editor
Run a query to see results
Schema
No tables
QUIZ
Knowledge Check
Answer each question one by one.
Q1Which condition does the composite index (emp_id, sale_date) fail to help?
Q2When picking the column order for a composite index built for a query with both equality and range conditions, which ordering is appropriate?
Q3Which of the following describes a condition that can use the index?