Learn by reading through in order

Window Functions ② — ORDER BY and Frames (ROWS / RANGE)

Visualize running totals with OVER (... ORDER BY ...), moving sums with ROWS BETWEEN 2 PRECEDING, and the ROWS vs. RANGE difference on tied rows — all explored with sales data and diagrams.

Windows with ORDER BY — producing a running total

Add ORDER BY inside OVER and the aggregation target shifts to "from the start of the window up to the current row" — the value grows row by row, producing a running total (the sum from the start up through the current row).

The shape is SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date).

PARTITION BY emp_id splits the window per employee, ORDER BY sale_date arranges each employee's sales by date, and then the running total is computed per employee.

The diagram below traces how the running total grows for emp_id=29's 5 sales in date order.

Adding ORDER BY produces a running total
sale_dateamountrunning_total(SUM OVER ORDER BY date)01-07310000310000(= 310000)02-14295000605000(= 310 + 295)03-21280000885000(= 605 + 280)04-052650001150000(= 885 + 265)06-233250001475000(= 1150 + 325)final row = the employee's total sales+295000+280000+265000+325000
Ordering emp_id=29's 5 sales by sale_date, running_total is "previous running total + current amount". It grows 310000 → 605000 → 885000 → 1150000 → 1475000, with the final row matching the employee's total sales.
-- Without ORDER BY: each row gets the employee's unchanging total
SELECT emp_id, sale_date, amount,
  SUM(amount) OVER (PARTITION BY emp_id) AS emp_total
FROM sales
ORDER BY emp_id, sale_date;

-- With ORDER BY: per-employee running total from the start through the current row
SELECT emp_id, sale_date, amount,
  SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) AS running_total
FROM sales
ORDER BY emp_id, sale_date;

Imagine you want a running total of sales per employee, ordered by date. (If your query runs correctly, an explanation will appear.)

① Select emp_id, sale_date, and amount from sales.

② Compute SUM(amount) over a window partitioned per employee and ordered by date, and add it as a 4th column aliased running_total.

③ The grader checks row order, so end with ORDER BY emp_id, sale_date.

SQL Editor

Run a query to see results

The frame — which rows actually get aggregated inside the window

When you add ORDER BY to a window function, the range of rows that each row actually aggregates over is called the frame.

Frame — rows actually aggregated in the window
window (5 rows)frame (3 rows aggregated)window(PARTITION BY emp_id= one employee's rows)frame(rows for the current row's calc)row 1row 2row 3 ← current rowrow 4row 5row 1row 2row 3 (current)all rows for one employeerows aggregated to compute row 3(default: start through current row)
Within one window (say emp_id=14's 5 rows), the frame decides "from here to here, sum these rows" for the current row. Change the frame and the meaning of the calculation changes — running total, moving sum, average of neighbors, and so on.

If you don't write the frame explicitly, the default is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW (from the start of the window through the current row).

That's exactly why adding ORDER BY in the previous exercise gave you a running total — this default frame is doing the work.

-- A: frame omitted (default = RANGE UNBOUNDED PRECEDING ... CURRENT ROW)
SELECT emp_id, sale_date, amount,
  SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date) AS run_default
FROM sales WHERE emp_id = 14
ORDER BY sale_date;

-- B: the default frame written explicitly (same result as A)
SELECT emp_id, sale_date, amount,
  SUM(amount) OVER (
    PARTITION BY emp_id ORDER BY sale_date
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS run_explicit
FROM sales WHERE emp_id = 14
ORDER BY sale_date;

Frames are built from these 5 keywords, used to say "from where" and "to where" relative to the current row.

KeywordMeaning
UNBOUNDED PRECEDINGAll the way back to the start of the window
n PRECEDINGn rows before the current row
CURRENT ROWThe current row itself
n FOLLOWINGn rows after the current row
UNBOUNDED FOLLOWINGAll the way forward to the end of the window
Default (with ORDER BY)RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

Let's confirm the running-total behavior by writing the frame explicitly.

① Select emp_id, sale_date, and amount from sales.

② Over a window partitioned per employee and ordered by sale_date, compute SUM(amount) with the explicit frame ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, and add it as a 4th column aliased running_total.

③ The grader checks row order, so end with ORDER BY emp_id, sale_date.

SQL Editor

Run a query to see results

ROWS BETWEEN n PRECEDING — moving sums and moving averages

Set the frame to ROWS BETWEEN n PRECEDING AND CURRENT ROW and the aggregation narrows to "the last n rows + the current row".

ROWS BETWEEN 2 PRECEDING AND CURRENT ROW covers the most recent 3 rows (including the current one), so SUM becomes a moving sum and AVG becomes a moving average.

This is the bread-and-butter shape for smoothing out a recent trend.

Near the start of the window the frame has fewer rows to draw from (row 1 has 1 row, row 2 has 2 rows, row 3 onward has 3 rows).

The diagram below walks through emp_id=14's 5 sales row by row, showing what ends up in the frame at each "current row".

ROWS BETWEEN 2 PRECEDING — sliding 3-row frame
current row (amount)frame contents (recent 3 rows)moving sum (SUM)row 1490000[490000] just 1 row490000row 2100000[490000, 100000] 2 rows590000row 385000[490000, 100000, 85000] 3 rows675000row 470000[100000, 85000, 70000] 3 rows(490000 drops out)255000row 5505000[85000, 70000, 505000] 3 rows(100000 drops out)660000the "recent 3 rows" frame slides along
For emp_id=14's 5 sales (amounts 490000 / 100000 / 85000 / 70000 / 505000), the frame slides with the current row: "the last 2 rows + the current row". Near the start there aren't enough preceding rows, so the frame is 1 or 2 rows wide. Follow the arrows: current row → frame → moving sum.
-- Moving average of the most recent 3 sales (prev 2 rows + current)
SELECT emp_id, sale_date, amount,
  AVG(amount) OVER (
    PARTITION BY emp_id ORDER BY sale_date
    ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
  ) AS moving_avg3
FROM sales WHERE emp_id = 2
ORDER BY sale_date;

-- Moving sum of the most recent 2 sales (prev 1 row + current) uses the same shape
SELECT emp_id, sale_date, amount,
  SUM(amount) OVER (
    PARTITION BY emp_id ORDER BY sale_date
    ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
  ) AS moving_sum2
FROM sales WHERE emp_id = 2
ORDER BY sale_date;

Imagine you want a per-employee moving sum over the most recent 3 sales (the last 2 + the current one), ordered by date.

① Select emp_id, sale_date, and amount from sales.

② Over a window partitioned per employee and ordered by sale_date, compute SUM(amount) with the frame ROWS BETWEEN 2 PRECEDING AND CURRENT ROW, and add it as a 4th column aliased moving_sum3.

③ The grader checks row order, so end with ORDER BY emp_id, sale_date.

SQL Editor

Run a query to see results

ROWS vs. RANGE — how tied rows get handled

Frames come in two counting modes: ROWS and RANGE.

ROWS counts physical rows ("the previous 2 rows" means exactly 2 rows). Even with tied values, each row is counted separately, so the running total grows row by row.

RANGE groups rows that have the same ORDER BY value (ties) into a single bucket. Tied rows share a frame and end up with the same running total.

The two diagrams below show how ROWS and RANGE each handle the 3 rows where amount = 100000.

ROWS — ties counted one at a time, total grows
current rowROWS frame (start → current row)rows_runsale_id 21100000[21]1 row100000sale_id 24100000[21, 24]2 rows200000sale_id 37100000[21, 24, 37]3 rows300000
For the 3 rows with amount = 100000 (ordered by amount), the ROWS frame counts physical rows up through the current row. Row 1 has [21] (1 row), row 2 has [21,24] (2 rows), row 3 has [21,24,37] (3 rows) — the running total goes 100000 → 200000 → 300000.
RANGE — tied rows share one frame and one total
3 tied rows form one framecurrent rowRANGE frame (ties bundled)range_runsale_id 21100000[21, 24, 37]3 tied rows300000sale_id 24100000[21, 24, 37]3 tied rows300000sale_id 37100000[21, 24, 37]3 tied rows300000
RANGE bundles rows with the same ORDER BY value (ties) together. The 3 rows with amount = 100000 all land in the same frame [21, 24, 37], and the running total is 300000 on every row.
-- ORDER BY amount has 3 rows tied at 100000, so ROWS and RANGE diverge
SELECT sale_id, amount,
  SUM(amount) OVER (ORDER BY amount ROWS  UNBOUNDED PRECEDING) AS rows_run,
  SUM(amount) OVER (ORDER BY amount RANGE UNBOUNDED PRECEDING) AS range_run
FROM sales
WHERE amount = 100000
ORDER BY sale_id;
-- ROWS counts one row at a time; RANGE bundles the 3 tied 100000 rows into one frame

Imagine you want to see, in a single result, how a running total changes between ROWS and RANGE when several rows share the same value. There are 3 sales where amount = 100000.

① Select sale_id and amount from sales.

② Over a window ordered by amount, add SUM(amount) with frame ROWS UNBOUNDED PRECEDING aliased rows_run, and another SUM(amount) with frame RANGE UNBOUNDED PRECEDING aliased range_run.

③ Filter to just the rows where amount = 100000.

④ The grader checks row order, so end with ORDER BY sale_id.

SQL Editor

Run a query to see results
QUIZ

Knowledge Check

Answer each question one by one.

Q1When you add ORDER BY inside OVER, like SUM(amount) OVER (PARTITION BY emp_id ORDER BY sale_date), what value ends up on each row?

Q2When a window function has ORDER BY and you don't write the frame explicitly, what's the default frame?

Q3What's the difference between ROWS and RANGE frames?