Section 6: Aggregates & Grouping

Summarize Queries

This section shows how to aggregate rows into summary statistics and group data into meaningful buckets. You will learn the five core aggregate functions, how to use GROUP BY to create per-group summaries, and how HAVING filters those groups after they are formed.

§1 Aggregate Functions #

Aggregate functions take many rows as input and return a single computed value, letting you summarize large datasets with a handful of numbers. They are indispensable for reports, dashboards, and any task that needs totals, averages, or counts instead of raw rows.

Imagine an online shop: rather than listing every individual order, an aggregate can instantly tell you “we sold X items items for a total of $Y in September 2025.”

You can view the full table with SELECT * from sales;


Aggregate functions collapse many rows into one value. They ignore NULL unless stated otherwise.

FunctionPurposeExample
COUNT(*)# of input rowsCOUNT(*) AS orders
COUNT(column)# of non-NULL values in that columnCOUNT(email) AS with_email
COUNT(DISTINCT column)# of distinct non-NULL valuesCOUNT(DISTINCT dept)
SUM(column)Total of numeric columnSUM(amount) AS total_sales
AVG(column)Mean valueAVG(score) AS mean_score
MIN(column)Smallest valueMIN(birth_date) AS oldest
MAX(column)Largest valueMAX(temp) AS max_temp

All aggregates can be combined with DISTINCT inside the parentheses. e.g. SUM(DISTINCT price).

§1.1 COUNT #

COUNT returns how many rows match the query. COUNT(*) counts every row, while COUNT(column) counts only the non-NULL values in that column.

Count the number of customers. (update query to SELECT * FROM customers; to view full table)

§1.2 SUM #

SUM adds up every value in the column. SUM(column) ignores NULLs.

Total number of items sold. (update query to SELECT * FROM order_items; to view full table)

§1.3 AVG #

AVG computes the mean of all non-NULL values in the column.

Average sale price across all orders. (update query to SELECT * FROM order_items; to view full table)

§1.4 MIN / MAX #

MIN returns the smallest value; MAX returns the largest. Both ignore NULLs.

Get dates of first and last orders. (update query to SELECT * FROM orders; to view full table)

§2 Aggregates and NULL values #

Any aggregate function that receives 0 rows/values will always return NULL

Example where an aggregate itself returns NULL:

§3 GROUP BY Clause #

The GROUP BY clause collapses identical values in the listed columns into groups, then runs aggregates per group.

The rules for the GROUP BY clause are:

Syntax:

SELECT column_name(s), aggregate_function(column_name)
FROM table_name
WHERE condition
GROUP BY column_name(s);

GROUP BY Example:

Number of orders per status:

Full table:


Another GROUP BY Example:

Number of orders per status:

Full table:

§4 HAVING Clause #

The HAVING clause keeps (or discards) entire groups based on an aggregate condition.

The rules for the HAVING clause are:

Syntax:

SELECT column_name(s), aggregate_function(column)
FROM table_name
WHERE condition (optional with HAVING)
GROUP BY column_name(s)
HAVING column_name(s);

HAVING Example:

only categories with an average price below $1.50:
Try removing the HAVING clause

Full table:

Another HAVING Example:

Categories whose total quantity sold is 5 or more::

Full table:

§5 Order of Execution #

The database engine performs clauses in this order:

  1. FROM → 2. WHERE → 3. GROUP BY → 4. HAVING → 5. SELECT → 6. ORDER BY

Keep this in mind when mixing WHERE and HAVING.

§6 More Resources #

§7 Questions #

Please number each response to directly correspond with the question. Provide clear, concise answers that fully address each question without unnecessary elaboration or essay-style writing. Provide only the SQL query unless otherwise specified.

  1. Which aggregate function counts every row, including those that contain NULL values?
  1. Will COUNT(column) and COUNT(*) always return the same number for any table? Explain why.
  1. What does this query calculate: COUNT(DISTINCT price)
  1. If you need the smallest non-NULL date value stored in column order_date, which aggregate should you use?
  1. Which clause filters individual rows before any grouping occurs?
  1. Which clause filters entire groups after the grouping and aggregation have been performed?
  1. In a query that contains both WHERE, GROUP BY, and HAVING, which of these three clauses is executed first by the SQL engine?
  1. A valid SELECT list in a query with GROUP BY may contain (pick one):
    a) Any columns
    b) Only columns that appear in the GROUP BY list
    c) Columns in GROUP BY plus aggregates
    Explain why.
  1. Explain in one sentence what the return value of this query represents. (This is not asking what the query terms do):
SELECT   product
FROM     sales
GROUP BY product
HAVING   SUM(qty) >= 100;
  1. What will AVG(score) return if every value in the score column is NULL?
  1. True or False: MAX and MIN can be used on date and string columns as well as numeric ones.
  1. Which aggregate function would you use to find how many different categories exist in the sales table?
  1. Explain in one sentence why HAVING cannot replace WHERE in a query that has no GROUP BY.

The remaining questions will use the database below, use the text box to test queries for the questions.
For these questions, I do not want the SQL outputs, only the SQL query.

Use the (view schema) link to view the list of tables and their structure.

  1. Write a query that finds how many products have a unit_price greater than $15.
  1. Write a query that Lists the number of orders per customer (display customer_id and order_count).
  1. Write a query that shows statuses that appear on more than 1 order. Show status and count.
  1. Write a query that finds the minimum and maximum unit price among all products.
  1. Provide a query that returns how many unique product names exist.
  1. Provide a query that produces a list of customers who have placed at least two orders. Show customer_id and order count.
  1. Provide a query that calculates the total revenue (items sold * price) from the order_items table. Name the result total_revenue.
  1. Provide a query that finds the most common product price. Return the price and the number of items with that price.
  1. Write a query that finds the three cheapest products in the Electronics category. Show all information for each product.
  1. Write a query that returns the average unit price of products in the Home category?