Q1When you add ORDER BY inside OVER, like SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date), what value ends up on each row?
Window Functions ② — ORDER BY and Frames (ROWS / RANGE)
Visualize running totals with OVER (... ORDER BY ...), moving sums with ROWS BETWEEN 2 PRECEDING, and the ROWS vs. RANGE difference on tied rows — all explored with sales data and diagrams.
Windows with ORDER BY — producing a running total
Add ORDER BY inside OVER and the aggregation target shifts to "from the start of the window up to the current row" — the value grows row by row, producing a running total (the sum from the start up through the current row).
The shape is SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date).
PARTITION BY emp_id splits the window per employee, ORDER BY sale_date arranges each employee's sales by date, and then the running total is computed per employee.
The diagram below traces how the running total grows for emp_id=29's 5 sales in date order.
-- Without ORDER BY: each row gets the employee's unchanging total
SELECT emp_id, sale_date, amount,
SUM(amount) OVER (PARTITION BY emp_id) AS emp_total
FROM sales
ORDER BY emp_id, sale_date;
-- With ORDER BY: per-employee running total from the start through the current row
SELECT emp_id, sale_date, amount,
SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) AS running_total
FROM sales
ORDER BY emp_id, sale_date;
The frame — which rows actually get aggregated inside the window
When you add ORDER BY to a window function, the range of rows that each row actually aggregates over is called the frame.
If you don't write the frame explicitly, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (from the start of the window through the current row).
That's exactly why adding ORDER BY in the previous exercise gave you a running total — this default frame is doing the work.
-- A: frame omitted (default = RANGE UNBOUNDED PRECEDING ... CURRENT ROW)
SELECT emp_id, sale_date, amount,
SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) AS run_default
FROM sales WHERE emp_id = 14
ORDER BY sale_date;
-- B: the default frame written explicitly (same result as A)
SELECT emp_id, sale_date, amount,
SUM(amount) OVER (
PARTITION BY emp_id ORDER BY sale_date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
) AS run_explicit
FROM sales WHERE emp_id = 14
ORDER BY sale_date;
Frames are built from these 5 keywords, used to say "from where" and "to where" relative to the current row.
| Keyword | Meaning |
|---|---|
UNBOUNDED PRECEDING | All the way back to the start of the window |
n PRECEDING | n rows before the current row |
CURRENT ROW | The current row itself |
n FOLLOWING | n rows after the current row |
UNBOUNDED FOLLOWING | All the way forward to the end of the window |
Default (with ORDER BY) | RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW |
ROWS BETWEEN n PRECEDING — moving sums and moving averages
Set the frame to ROWS BETWEEN n PRECEDING AND CURRENT ROW and the aggregation narrows to "the last n rows + the current row".
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW covers the most recent 3 rows (including the current one), so SUM becomes a moving sum and AVG becomes a moving average.
This is the bread-and-butter shape for smoothing out a recent trend.
Near the start of the window the frame has fewer rows to draw from (row 1 has 1 row, row 2 has 2 rows, row 3 onward has 3 rows).
The diagram below walks through emp_id=14's 5 sales row by row, showing what ends up in the frame at each "current row".
-- Moving average of the most recent 3 sales (prev 2 rows + current)
SELECT emp_id, sale_date, amount,
AVG(amount) OVER (
PARTITION BY emp_id ORDER BY sale_date
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
) AS moving_avg3
FROM sales WHERE emp_id = 2
ORDER BY sale_date;
-- Moving sum of the most recent 2 sales (prev 1 row + current) uses the same shape
SELECT emp_id, sale_date, amount,
SUM(amount) OVER (
PARTITION BY emp_id ORDER BY sale_date
ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
) AS moving_sum2
FROM sales WHERE emp_id = 2
ORDER BY sale_date;
ROWS vs. RANGE — how tied rows get handled
Frames come in two counting modes: ROWS and RANGE.
ROWS counts physical rows ("the previous 2 rows" means exactly 2 rows). Even with tied values, each row is counted separately, so the running total grows row by row.
RANGE groups rows that have the same ORDER BY value (ties) into a single bucket. Tied rows share a frame and end up with the same running total.
The two diagrams below show how ROWS and RANGE each handle the 3 rows where amount = 100000.
-- ORDER BY amount has 3 rows tied at 100000, so ROWS and RANGE diverge
SELECT sale_id, amount,
SUM(amount) OVER (ORDER BY amount ROWS UNBOUNDED PRECEDING) AS rows_run,
SUM(amount) OVER (ORDER BY amount RANGE UNBOUNDED PRECEDING) AS range_run
FROM sales
WHERE amount = 100000
ORDER BY sale_id;
-- ROWS counts one row at a time; RANGE bundles the 3 tied 100000 rows into one frame
Knowledge Check
Answer each question one by one.
Q2When a window function has ORDER BY and you don't write the frame explicitly, what's the default frame?
Q3What's the difference between ROWS and RANGE frames?