Q1When you specify GROUP BY name, how many rows does the result have (score table / 10 distinct names)?
GROUP BY and HAVING — Grouped Aggregates
Learn SQL GROUP BY and HAVING: per-group aggregates, filtering groups with HAVING, and how it differs from WHERE — practiced on CSV score data, live in your browser.
The data used in this article — the score table
In the previous article you collapsed an entire table into a single value. This article's `GROUP BY` (grouping — collecting rows that share the same value into one group) runs aggregates per group and returns one aggregate row per group. On top of that, `HAVING` lets you filter the aggregate results, like "only groups whose average is 80 or higher."
The subject is the same score table as last time (30 rows = 10 people × 3 subjects). Grouping by name gives 10 groups, and grouping by subject gives 3 groups.
GROUP BY — aggregate per group
When you write GROUP BY column, rows that share the same value in that column are collected into one group, and aggregate functions are computed once per group. With SELECT name, AVG(score) FROM score GROUP BY name;, the 3 rows with the same name (one per subject) form one group, and you get one average row per name. The only "plain columns" you can put in SELECT are the columns named in `GROUP BY`; any other column has to be wrapped in an aggregate function (because its value isn't single within a group).
AVG(score) is computed per group. With 10 people, the result is 10 rows.-- Count / average / top score per subject (3 groups by subject)
SELECT
subject,
COUNT(*) AS row_count,
ROUND(AVG(score), 1) AS avg_score,
MAX(score) AS top_score
FROM score
GROUP BY subject;
HAVING — filter the aggregate results
After a group aggregate, you sometimes want to filter on the aggregate value, like "only show groups whose average is 80 or higher." You can't do this with WHERE. WHERE is a clause that judges individual rows before grouping, so a group aggregate like AVG(score) hasn't been computed yet. That's where `HAVING` comes in. HAVING is written after GROUP BY and judges its condition against the group aggregate results. Writing HAVING AVG(score) >= 80 keeps only the groups whose average is 80 or higher.
GROUP BY aggregates into 10 groups, then HAVING AVG(score) >= 80 judges the already-aggregated groups and keeps only those that satisfy the condition.-- Only names whose average is above 85 (HAVING judges the aggregate value)
SELECT
name,
ROUND(AVG(score), 2) AS avg_score
FROM score
GROUP BY name
HAVING AVG(score) > 85;
WHERE vs. HAVING — filter rows, or filter groups
WHERE and HAVING are both filters, but they work at different times. WHERE judges individual rows and drops them before grouping. HAVING judges per group and drops them after grouping and aggregating. The execution order is FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY.
You can also use both at once. In that case the flow is "narrow the target rows with WHERE, then group, then filter the aggregate results with HAVING." For example, to get "people whose per-name average is 80 or higher, considering only Math and English," you narrow rows with WHERE subject IN ('Math','English') and narrow groups with HAVING AVG(score) >= 80.
WHERE judges rows on plain columns and drops them before grouping, while HAVING judges groups on aggregate values (AVG, etc.) and drops them after grouping. The key point is where things get dropped. Example: WHERE subject='Math' (filters rows) and HAVING AVG(score)>=80 (filters groups).-- Use WHERE and HAVING together
-- Exclude Science, keep only names whose average is 80 or higher
SELECT
name,
ROUND(AVG(score), 2) AS avg_two
FROM score
WHERE subject <> 'Science'
GROUP BY name
HAVING AVG(score) >= 80;
Knowledge Check
Answer each question one by one.
Q2When you want to extract "only people whose per-name average score is 80 or higher," which clause does the 80-or-higher condition go in?
Q3Which statement correctly describes the difference between WHERE and HAVING?