Summarizing Data Using GROUP BY and Aggregate Functions

So far, we’ve learned how to read data with SELECT, filter rows with WHERE, and sort results with ORDER BY. But what if you don’t just want to see every row — what if you want to summarize data? That’s where GROUP BY and aggregate functions come in.

Basic Aggregate Function Syntax

SQL
SELECT column1,
	   AGGREGATE_FUNCTION(column2)
FROM schema_name.table_name
GROUP BY column1;
  • GROUP BY takes rows that share the same value in one or more columns and groups them together.
  • Aggregate functions like COUNT, SUM, or AVG are then applied to each group.

Universal Aggregate Functions

The functions listed below are supported by all major relational databases. Some platforms also provide additional aggregate functions (like STRING_AGG in PostgreSQL or GROUP_CONCAT in MySQL), but the core aggregation functions shown here will work everywhere.

  • COUNT() → number of rows
    • COUNT(*) or COUNT(1) counts all rows
    • COUNT(column_name) counts only non-NULL values
  • SUM() → total of numeric values
    • Must be applied to a numeric column type
    • Ignores NULL values
  • AVG() → average of numeric values
    • Must be applied to a numeric column type
    • Ignores NULL values, which may skew results if not handled properly
  • MIN() → smallest value
    • Can be applied to numeric, text, or date columns
  • MAX() → largest value
    • Can be applied to numeric, text, or date columns

It doesn’t make much sense to group on unique columns, that will return a single record for each row. For these examples we’ve expanded the paintings table used in previous posts to properly demonstrate aggregate functions.

Table: Gallery.Paintings

PaintingIDTitleArtistYearCreatedCountryOfOrigin
1The Great Wave of KanagawaKatsushika Hokusai1831Japan
2The ScreamEdvard Munch1893Norway
3The Persistence of MemorySalvador Dalí1931Spain
4Mona LisaLeonardo da Vinci1503Italy
5Girl with a Pearl EarringJohannes Vermeer1665Netherlands
6NighthawksEdward Hopper1942United States
7The Starry NightVincent van Gogh1889Netherlands
8Fine Wind, Clear MorningKatsushika Hokusai1830Japan
9Thunderstorm Beneath the SummitKatsushika Hokusai1830Japan
10The Sick ChildEdvard Munch1885Norway
11The Dance of LifeEdvard Munch1899Norway
12Swans Reflecting ElephantsSalvador Dalí1937Spain
13Metamorphosis of NarcissusSalvador Dalí1937Spain
14The Last SupperLeonardo da Vinci1498Italy
15The MilkmaidJohannes Vermeer1658Netherlands
16View of DelftJohannes Vermeer1661Netherlands
17SunflowersVincent van Gogh1888Netherlands
18Café Terrace at NightVincent van Gogh1888Netherlands
19AutomatEdward Hopper1927United States
20Early Sunday MorningEdward Hopper1930United States

Example: Count paintings per country

SQL
SELECT CountryOfOrigin,
	   COUNT(1) AS PaintingCount
FROM Gallery.Paintings
GROUP BY CountryOfOrigin
ORDER BY PaintingCount DESC,
		 CountryOfOrigin ASC;

Result:

CountryOfOriginPaintingCount
Netherlands6
Japan3
Norway3
Spain3
United States3
Italy2

Making Results Easier to Read with Aliases

Notice the AS PaintingCount in the example above? The AS keyword is used to define a column alias.

An alias is a temporary name you give to a column or calculation in your query results. It doesn’t change the underlying table—it just assigns the column a clear, descriptive identifier, making your output easier to read.

Without an alias, the column name would appear as a generic name determined by the database platform such as (No column name), which isn’t very user-friendly.

💡Tip: In most databases, AS is optional. You could simply write COUNT(*) OrderCount and get the same result. However, using AS is considered best practice because it makes your queries easier for someone else to follow.

Aliases with Spaces

Best practice is to keep aliases simple, without spaces. However, sometimes reporting requirements ask for more “friendly” names, like Painting Count instead of PaintingCount.

You can do this by enclosing the alias in either double quotes (" ") or square brackets ([ ]), depending on your database platform.

  • SQL Server → square brackets [ ] or double quotes " "
SQL
SELECT CountryOfOrigin,
	   COUNT(1) AS [Painting Count]
FROM Gallery.Paintings
GROUP BY CountryOfOrigin;
  • Oracle / PostgreSQL / MySQL / SQLite → double quotes " "
SQL
SELECT CountryOfOrigin,
	   COUNT(1) AS "Painting Count"
FROM Gallery.Paintings
GROUP BY CountryOfOrigin;

Result:

CountryOfOriginPainting Count
Italy2
Japan3
Netherlands6
Norway3
Spain3
United States3

📌 Important Note: Using spaces can make queries harder to maintain, because you’ll need to quote the alias again if you reference it in an outer query. Best practice is to stick with simple names unless there’s a strong reporting need.

What About Single Quotes?

Single quotes (' ') are not valid for identifiers like aliases. In SQL, single quotes always represent string literals, not names.

For example, if you write:

SQL
SELECT CountryOfOrigin,
	   COUNT(1) AS 'Painting Count'
FROM Gallery.Paintings
GROUP BY CountryOfOrigin;

Some tools will still display the output column heading as Painting Count, but what’s really happening is the database is treating 'Painting Count' as a literal string. It looks like it worked, but technically you just selected a constant string and renamed the count column with no real alias.

This can cause errors in more complex queries or when porting code between platforms, and should be avoided.

Using Multiple Aggregations

You can use more than one aggregation at a time, and aliases make the output much easier to understand.

Example: Works per artist with earliest and latest year

SQL
SELECT Artist,
	   COUNT(1) AS PaintingCount,
	   MIN(YearCreated) AS EarliestWork,
	   MAX(YearCreated) AS LatestWork
FROM Gallery.Paintings
GROUP BY Artist
ORDER BY PaintingCount DESC,
		 Artist ASC;

Result:

ArtistPaintingCountEarliestWorkLatestWork
Edward Hopper319271942
Edvard Munch318851899
Johannes Vermeer316581665
Katsushika Hokusai318301831
Salvador Dalí319311937
Vincent van Gogh318881889
Leonardo da Vinci214981503

Adding More Columns

Sometimes you’ll want to group multiple columns to provide more context in results.

Keep in mind, if a column appears in the SELECT list and isn’t aggregated, it must be listed in GROUP BY. This ensures that the database knows exactly how to group rows before calculating the summary values.

Example: Paintings by country and artist

SQL
SELECT CountryOfOrigin,
	   Artist,
	   COUNT(1) AS PaintingCount
FROM Gallery.Paintings
GROUP BY CountryOfOrigin,
		 Artist
ORDER BY CountryOfOrigin ASC,
		 PaintingCount DESC,
		 Artist ASC;

Result:

CountryOfOriginArtistPaintingCount
ItalyLeonardo da Vinci2
JapanKatsushika Hokusai3
NetherlandsJohannes Vermeer3
NetherlandsVincent van Gogh3
NorwayEdvard Munch3
SpainSalvador Dalí3
United StatesEdward Hopper3

Filtering Groups with HAVING

You already know WHERE filters rows before grouping. But what if you want to filter after the grouping? That’s where HAVING comes in.

  • WHERE filters rows before grouping.
  • HAVING filters groups after aggregation.

Example: Only countries with more than 3 paintings

SQL
SELECT CountryOfOrigin,
	   COUNT(1) AS PaintingCount
FROM Gallery.Paintings
GROUP BY CountryOfOrigin
HAVING COUNT(1) > 3
ORDER BY PaintingCount DESC,
		 CountryOfOrigin;

Result:

CountryOfOriginPaintingCount
Netherlands6

Key Takeaways

  • Use aggregate functions (COUNT, SUM, AVG, MIN, MAX) to summarize data.
  • GROUP BY forms groups; aggregates compute summaries per group.
  • Every selected column must be aggregated or listed in GROUP BY.
  • Aliases (AS) make results readable and maintainable.
  • WHERE filters rows before grouping; HAVING filters groups after aggregation.

Once you’re comfortable grouping rows by columns, you’ll be ready to explore more powerful options. In the next post, we’ll look at more ways to group data by building multiple levels of summaries in a single query.

Scroll to Top