Learn by reading through in order

Subqueries ② — Derived Tables, SELECT Subqueries, CREATE / INSERT SELECT

Q: Which of these is required when you write a subquery (derived table) in the FROM clause?

Giving the derived table an alias

Q: Which is a correct description of CREATE TABLE top_seller AS SELECT ...?

It creates a new table whose contents are the SELECT result

This article is part of the SQL Course, where you master practical SQL skills from scratch, from the fundamentals through to complex queries and SQL tuning.
Learn SQL derived tables, SELECT-clause subqueries, pairing with CASE, and CREATE / INSERT SELECT — hands-on with employee and sales data in your browser.

Data used in this article — employee / department / sales

Last time you used a subquery in WHERE.

In this article you'll work through four applications in order: putting a subquery in the FROM clause (a derived table), writing one in a SELECT column, combining it with CASE, and building a new table from a subquery's result.

The material is three employee-data tables.

Using employee (30 employees), department (6 departments), and sales (50 sales line items), you'll build a per-department aggregate table and line up each employee's total sales as a column.

Before starting the exercises, take a look at the column definitions and sample data of the three tables this article uses — employee / department / sales.

① Run PRAGMA table_info(...) to check the column definitions of all three tables.

② Run SELECT * FROM table_name LIMIT 5; to preview the first 5 rows of each table.

SQL Editor

Run a query to see results

Derived tables in FROM — treat a subquery's result as a single table

A derived table is a way of writing a subquery in the FROM clause so that its result is treated as a single virtual table.

You can build "the average salary per department" first as an aggregate table, then apply further JOINs or WHEREs to that table.

It's handy when you want to narrow down an aggregated result one more step.

The key rule for derived tables is that you must give it an alias.

Write AS alias like FROM (SELECT ...) AS ds (SQLite also lets you drop AS) — without it you get an error.

Through that alias you reference the derived table's columns, like ds.avg_salary.

Derived table in FROM — reuse an aggregate result as a table

The inner subquery builds a per-department aggregate table, and the outer query gives it an alias and JOINs it with department. With a derived table you can aggregate first, then JOIN and filter.

-- Derived table in FROM: aggregate headcount and avg salary per department, then join with department
SELECT d.dept_name, ds.headcount, ds.avg_salary
FROM (
  SELECT dept_id, COUNT(*) AS headcount, AVG(salary) AS avg_salary
  FROM employee
  WHERE dept_id IS NOT NULL
  GROUP BY dept_id
) AS ds
JOIN department d ON d.dept_id = ds.dept_id
ORDER BY ds.avg_salary DESC;

Picture this requirement: "aggregate the average salary per department, and list only the departments whose average is higher than 5,900,000, with their department names." (Run it correctly and the explanation appears.)

① In a subquery, aggregate employee by dept_id and build a derived table that has dept_id and the average salary (alias avg_salary). Exclude employees whose dept_id is NULL from the aggregation.

② Give the derived table an alias, join it with the department table on dept_id, and pull out the department name.

③ In the outer WHERE, narrow down to departments whose avg_salary is greater than 5900000, and order by avg_salary descending.

SQL Editor

Run a query to see results

SELECT-clause subqueries — embed a single value as a column

When you write a scalar subquery in the SELECT column list, you can add a column whose value is computed one per row.

Put (SELECT SUM(sale.amount) FROM sales sale WHERE sale.emp_id = emp.emp_id) in a column, and you can line up "that employee's total sales" as one column per employee.

This is a correlated subquery that references the outer emp.emp_id.

-- SELECT-clause subquery: add each employee's total sales as one column
SELECT emp.name,
  COALESCE(
    (SELECT SUM(sale.amount) FROM sales sale WHERE sale.emp_id = emp.emp_id),
    0
  ) AS total_amount
FROM employee emp
ORDER BY emp.emp_id
LIMIT 6;

An employee with no sales makes the subquery return NULL, so replacing it with 0 using COALESCE(..., 0) makes the table easier to read.

A SELECT-clause subquery also has to fit into one row and one column.

SELECT-clause subquery — compute one value per row

For each employee in the outer query, the inner subquery computes that employee's total sales, and the result lines up as a new column. Employees with zero sales are tidied up to 0 with COALESCE.

Picture this requirement: "alongside a list of all employees, show that employee's total sales as one extra column, displaying 0 for employees with no sales."

① Give the employee table the alias emp and pull out name.

② In the SELECT columns, add a subquery that computes that employee's sales total, aliased as total_amount. Correlate it with sale.emp_id = emp.emp_id, and replace the total with 0 for employees whose total comes out NULL.

③ Order by total_amount descending, breaking ties by emp_id ascending, and limit to the first 8 rows.

SQL Editor

Run a query to see results

Combining with CASE — attach a branching label based on a subquery's value

A SELECT-clause subquery works just as well as a condition in CASE.

You can build a rank column like "if total sales is 1,500,000 or more then High, if greater than 0 then Mid, otherwise None" by judging the subquery's value with CASE WHEN.

Note that you have to repeat the same subquery in each CASE WHEN (if readability comes first, there's also a way to write it just once with WITH (Common Table Expressions), which you'll learn in a later chapter).

In this article you'll cover the basic form of combining a subquery and CASE directly.

-- Classify employees by their number of sales
SELECT emp.name,
  (SELECT COUNT(*) FROM sales sale WHERE sale.emp_id = emp.emp_id) AS sale_count,
  CASE
    WHEN (SELECT COUNT(*) FROM sales sale WHERE sale.emp_id = emp.emp_id) >= 4 THEN 'Frequent'
    WHEN (SELECT COUNT(*) FROM sales sale WHERE sale.emp_id = emp.emp_id) >= 1 THEN 'Occasional'
    ELSE 'None'
  END AS activity
FROM employee emp
ORDER BY sale_count DESC, emp.emp_id
LIMIT 8;

Combining with CASE — branch on a subquery's value

The CASE WHEN clauses evaluate the single value the subquery computed from top to bottom, and the label of the first matching branch is attached. If nothing matches, you get the ELSE.

Picture this requirement: "for every employee, show the total sales and a rank (High / Mid / None) side by side."

① Give employee the alias emp and pull out name and that employee's total sales (NULL replaced with 0, alias total_amount).

② With CASE, add a rank column aliased grade where the total sales 1500000 or more is 'High', greater than 0 and less than 1500000 is 'Mid', and otherwise (0) is 'None'. Use the same total-sales subquery for the judgment too.

③ Order by total_amount descending, breaking ties by emp_id ascending, and limit to the first 8 rows.

SQL Editor

Run a query to see results

CREATE / INSERT SELECT — turn a subquery's result into a table

Writing CREATE TABLE new_table AS SELECT ... lets you save the SELECT result directly as a new table (CTAS: Create Table As Select).

It's often used when you want to keep an aggregated result as a snapshot.

When you want to add rows to a table that already exists, use INSERT INTO existing_table SELECT ... to bulk-add the SELECT result.

These are write operations that create a table or add rows.

In this article you'll use a disposable working table top_seller for the aggregation.

So that you get the same result no matter how many times you run it, you'll clean up the working table with DROP TABLE IF EXISTS before recreating it.

The flow of CREATE / INSERT SELECT

SELECT builds the aggregated result, and CREATE TABLE AS saves it into a new table. Later you can add rows for another condition with INSERT INTO ... SELECT.

Bulk INSERT is faster

INSERT INTO table SELECT ... inserts the target rows all at once in a single statement.

Compared with repeating INSERT INTO table VALUES (...) for each row, the SQL parsing, index updates, and transaction handling happen just once, so the more rows you insert, the faster it gets.

When copying or moving another table or an aggregated result, prefer the bulk INSERT ... SELECT over repeating INSERT one row at a time.

-- Save a per-department aggregate into a snapshot table
DROP TABLE IF EXISTS dept_summary;
CREATE TABLE dept_summary AS
  SELECT dept_id, COUNT(*) AS headcount, AVG(salary) AS avg_salary
  FROM employee
  WHERE dept_id IS NOT NULL
  GROUP BY dept_id;

SELECT * FROM dept_summary ORDER BY avg_salary DESC;

Picture this requirement: "collect employees whose total sales is 1,500,000 or more into an aggregate table with 3 columns — employee ID, name, and total sales." Since this is a write operation, structure it so the working table is recreated and won't break on re-run.

① First, clean up the working table if it exists with DROP TABLE IF EXISTS top_seller;.

② Join employee and sales, aggregate total sales per employee, write a SELECT that keeps only employees whose total is 1500000 or more, and from that result build the top_seller table with CREATE TABLE top_seller AS SELECT .... Make the columns the 3: emp_id, name, and total_amount (total sales).

③ Finally, check the contents with SELECT * FROM top_seller ORDER BY total_amount DESC;.

SQL Editor

Run a query to see results

Picture this requirement: "to the aggregate table from Practice 4, also add the mid-tier employees whose total sales is 900,000 or more but less than 1,500,000." Use INSERT INTO ... SELECT to bulk-add the rows. So it won't break on re-run, structure it to recreate the table before adding.

① Clean up the working table with DROP TABLE IF EXISTS top_seller;, then recreate it with the same CREATE TABLE top_seller AS SELECT ... (1.5M or more) as Practice 4.

② With INSERT INTO top_seller SELECT ..., add employees whose total sales is 900000 or more and less than 1500000 (3 columns: emp_id, name, total sales). Make the column order match top_seller.

③ Check all rows with SELECT * FROM top_seller ORDER BY total_amount DESC;.

SQL Editor

Run a query to see results

Answer each question one by one.

Q1Which of these is required when you write a subquery (derived table) in the FROM clause?

Q2In SELECT emp.name, (SELECT SUM(sale.amount) FROM sales sale WHERE sale.emp_id = emp.emp_id) AS total FROM employee emp, what happens to the total column for an employee with no sales at all?

Q3Which is a correct description of CREATE TABLE top_seller AS SELECT ...?

Back to SQL Intermediate

Subqueries ② — Derived Tables, SELECT Subqueries, CREATE / INSERT SELECT

Data used in this article — employee / department / sales

SQL Editor

Derived tables in FROM — treat a subquery's result as a single table

SQL Editor

SELECT-clause subqueries — embed a single value as a column

SQL Editor

Combining with CASE — attach a branching label based on a subquery's value

SQL Editor

CREATE / INSERT SELECT — turn a subquery's result into a table

SQL Editor

SQL Editor

Knowledge Check