Q1When you write just one of the aggregate functions COUNT / SUM / AVG / MIN / MAX in a SELECT and run it, how many rows does the result have, basically?
Aggregate Functions — COUNT / SUM / AVG / MIN / MAX
Learn the SQL aggregate functions COUNT / SUM / AVG / MIN / MAX hands-on — from whole-table aggregates to combining them with WHERE — using CSV score data, all live in your browser.
The data we'll use — the score table
In this article we'll work with aggregate functions (functions that fold many rows into a single value). How many rows are there, what's the total, the average, the minimum, the maximum — these all compress many rows down to a single number. The five basics are COUNT / SUM / AVG / MIN / MAX, and almost every summary figure in a report can be built from just these.
The dataset is the score table (30 rows = 10 people × the 3 subjects Math / English / Science). Targeting the points in the score column, you'll work through whole-table aggregates first, then aggregates after narrowing the rows with WHERE. Per-group aggregates (like the average per subject) are covered in the next article, so here we focus on folding the whole table — or the whole range narrowed by WHERE — into a single value.
Aggregate functions: "many rows → one value"
Aggregate functions go in the column position of a SELECT.
The thing to watch out for here is not mixing a plain column and an aggregate function in the same `SELECT`. A plain column like name has a different value per row, whereas an aggregate function like COUNT(*) folds the whole table (or the whole set narrowed by `WHERE`) into a single value. If you write both, as in SELECT name, COUNT(*) FROM score;, the aggregate side is a single value for the whole set while name has no defined answer for "which row's value should I show?" In most databases this is an error.
score as input, COUNT returns the row count, SUM the total, AVG the average, and MIN / MAX the minimum and maximum — each as a single value.-- Get the count, distinct count, total, and average together
SELECT
COUNT(*) AS row_count,
COUNT(DISTINCT name) AS name_kinds,
SUM(score) AS total,
AVG(score) AS avg_raw
FROM score
WHERE subject = 'English';
The difference between COUNT(*) and COUNT(column)
COUNT(*) counts the row count itself. COUNT(column) counts only the rows where that column is not NULL. The score table has no NULLs, so both come out to the same 30, but for a column that contains NULLs you'll see a difference here (we'll confirm this later using a table with NULLs). On top of that, writing COUNT(DISTINCT column) counts the number of distinct kinds with duplicates removed (use it on subject and you get 3, the number of distinct subjects).
Tip — aggregate functions and NULL
SUM / AVG / MIN / MAX don't include NULL rows in the calculation (they ignore them). In particular, AVG is computed as the sum of the non-NULL values ÷ the count of non-NULL values. So for a column containing NULLs, SUM(column) / COUNT(*) (denominator = all rows) and AVG(column) (denominator = non-NULL count) come out different. When you want to average treating NULL as 0, spell it out explicitly with something like AVG(COALESCE(column, 0)).
Narrow the target with WHERE before aggregating
Adding WHERE makes only the rows matching the condition the input to the aggregate functions. The execution order is "FROM reads all rows → WHERE narrows the rows → the remaining rows are folded by the aggregate functions." In other words, the aggregate functions only see the rows that passed WHERE, and the sum or average is computed after the filtering. Add WHERE subject = 'Math' and only the 10 Math rows are the target, so the average becomes the average of Math alone.
FROM reads 30 rows, WHERE subject = 'Math' narrows to 10 rows, and AVG(score) folds just those 10 rows. The aggregation happens after the filtering.-- Count / average / top score for just the 10 Science rows
SELECT
COUNT(*) AS row_count,
ROUND(AVG(score), 1) AS avg_science,
MAX(score) AS top_science
FROM score
WHERE subject = 'Science';
COUNT(column) and NULL, and rounding AVG
The score table so far had no NULLs. Let's confirm how aggregation changes depending on whether NULLs are present, using the customer table that has gaps (8 people, with NULLs in age / email / country). COUNT(*) is the row count, so it stays 8, but COUNT(email) counts only the rows where email is filled in, so it's 4. Running COUNT(column) on a column containing NULLs comes out smaller than COUNT(*) like this.
Also, since AVG is division, it produces decimals like 31.5. In a report it's easier to handle if you round with ROUND(expression, digits), as in ROUND(AVG(age), 2).
customer (8 people) which has NULLs, COUNT(*) stays 8 while COUNT(email) counts only the 4 rows where email is filled in. For a column containing NULLs, COUNT(column) comes out smaller than COUNT(*).-- How COUNT and AVG change with NULLs present
SELECT
COUNT(*) AS rows_all,
COUNT(country) AS with_country,
ROUND(AVG(age), 2) AS avg_age
FROM customer;
Knowledge Check
Answer each question one by one.
Q2Which is a correct explanation of the difference between COUNT(*) and COUNT(column)?
Q3When you add WHERE subject = 'Math' and run AVG(score), over what range is the average computed?