Q1Two employees are tied at 5th place with the same total. What rank does the next person get under RANK() vs. DENSE_RANK()?
Window Functions ③ — Ranking, Analytic Functions, and Execution Order
Walk through how ROW_NUMBER / RANK / DENSE_RANK handle ties, how LAG / LEAD pull values from the previous / next row, how NTILE(4) splits rows into n groups, and why you have to use a subquery to filter on a rank — all with sales data.
Ranking — ROW_NUMBER / RANK / DENSE_RANK
The dedicated ranking functions are ROW_NUMBER() / RANK() / DENSE_RANK().
All three number rows in the order given by OVER (ORDER BY ...), but they handle ties (equal values) differently.
| Function | How ties (equal values) are handled |
|---|---|
ROW_NUMBER() | Forces consecutive numbering even on ties (always 1, 2, 3, ...) |
RANK() | Ties share the same rank, then the next rank skips by however many were tied (2 tied at 5th → next is 7) |
DENSE_RANK() | Ties share the same rank, but the next rank doesn't skip (the row after a tie at 5th is 6) |
The SQL below — and the diagram that visualizes its result — walks through a concrete case with tied totals row by row.
-- Compute each employee's total sales, then rank in descending order of total in 3 ways
SELECT emp_id,
SUM(amount) AS total,
ROW_NUMBER() OVER (ORDER BY SUM(amount) DESC) AS rn,
RANK() OVER (ORDER BY SUM(amount) DESC) AS rk,
DENSE_RANK() OVER (ORDER BY SUM(amount) DESC) AS dr
FROM sales
GROUP BY emp_id
ORDER BY total DESC;
-- You can first aggregate with GROUP BY and then rank those aggregates
LAG / LEAD — pull values from the previous / next row
LAG(column) pulls the value from the previous row onto the current row, and LEAD(column) pulls from the next row (lag = to fall behind, lead = to go ahead).
OVER (PARTITION BY ... ORDER BY ...) decides what "previous" and "next" mean by setting the order.
Use them for period-over-period diffs (vs. previous sale), vs.-last-month deltas, or comparisons against the next scheduled event.
LAG(amount) returns NULL on the first row of each partition (there's no previous row); similarly LEAD is NULL on the last row.
The optional second argument controls offset ("how many rows back") and the third sets a default value when the target row is missing — e.g., LAG(amount, 1, 0).
Subtract with amount - LAG(amount) OVER (...) to get the diff vs. the previous row directly.
-- Show previous sale (LAG) and the diff vs. the previous sale
SELECT emp_id, sale_date, amount,
LAG(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) AS prev_amount,
amount - LAG(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) AS diff_prev
FROM sales
ORDER BY emp_id, sale_date;
-- Use LEAD if you want to show the next sale alongside the current row
SELECT emp_id, sale_date, amount,
LEAD(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) AS next_amount
FROM sales
ORDER BY emp_id, sale_date;
NTILE — split ordered rows into n evenly sized groups
NTILE(n) splits the rows ordered by OVER (ORDER BY ...) into n as-evenly-sized-as-possible groups, labeling each row with a group number from 1 to n.
Use it for quartiles, quintiles, "top 25%", and similar bucketing.
When the row count isn't divisible by n, the earlier groups get one extra row each (e.g., splitting 5 rows into 4 groups gives 2 / 1 / 1 / 1).
Combine with PARTITION BY to do things like "split each employee's sales into upper / lower buckets within that employee."
The diagram below shows what happens when emp_id=11's 5 sales are sorted by amount desc and split with NTILE(4).
-- Per employee, split that employee's sales into 4 groups by amount desc
SELECT emp_id, sale_date, amount,
NTILE(4) OVER (PARTITION BY emp_id ORDER BY amount DESC) AS quartile
FROM sales
ORDER BY emp_id, amount DESC;
-- Split all sales into 4 buckets by amount (top 25% has quartile = 1)
SELECT sale_id, emp_id, amount,
NTILE(4) OVER (ORDER BY amount DESC) AS quartile
FROM sales
ORDER BY amount DESC;
SQL execution order — why you can't put a window function in WHERE
You might wonder, "can't I just filter on the rank I computed, like WHERE rk = 1?"
You'd get an error. The reason is SQL's logical execution order.
The order you write the clauses (SELECT → FROM → WHERE ...) is not the order they're evaluated.
Evaluation roughly goes: FROM / JOIN (assemble tables) → WHERE (filter rows) → GROUP BY (group) → HAVING (filter groups) → SELECT (compute columns — this is where window functions run) → ORDER BY (sort) → LIMIT (truncate).
Since WHERE runs before SELECT, it can't see the rank that SELECT is about to compute.
To filter on a rank, push the query that computes the window function into a subquery (or CTE) and apply WHERE on the outside.
The inner SELECT finishes computing the window function, and the outer query can then treat that result as just another column.
-- This errors: WHERE runs before SELECT, so rk doesn't exist yet
-- SELECT emp_id, RANK() OVER (ORDER BY SUM(amount) DESC) AS rk
-- FROM sales GROUP BY emp_id WHERE rk <= 3;
-- Correct: compute the rank in an inner subquery, filter on the outside
SELECT * FROM (
SELECT emp_id,
SUM(amount) AS total,
RANK() OVER (ORDER BY SUM(amount) DESC) AS rk
FROM sales
GROUP BY emp_id
) AS ranked
WHERE rk <= 3
ORDER BY rk, emp_id;
Knowledge Check
Answer each question one by one.
Q2What does LAG(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) return on the first sale row of each employee?
Q3Why does SELECT emp_id, RANK() OVER (ORDER BY SUM(amount) DESC) AS rk FROM sales GROUP BY emp_id WHERE rk <= 3; not work?