SQL GROUP BY and HAVING help in filtering the datasets. Today, analysts use these widely for various types of analysis . However, they can often be confusing too. Beginners often gent confused and wonder which one to use in which scenario. Let’s understand this in detail.
As we know, SQL is a powerful language for communicating with databases and analyzing records. Aggregation is an important concept in SQL, which helps us in summarizing data. SQL GROUP BY and HAVING help us in this aspect.
For example , If you we want to find the total sales per store, or the average salary per department, or the number of students in each grade, then mastering SQL GROUP BY and HAVING can be extremely useful.
In this article, we will explain with examples what SQL GROUP BY and HAVING do, how they execute, and provide easy-to-understand real world scenarios so you can become the champion of these SQL features quickly.
๐ What is GROUP BY in SQL?
We use GROUP BY clause in SQL to combine rows into groups based on one or more columns. Generally, we use it with aggregate functions as below –
-
COUNT()
– For counting the number of rows in each group -
SUM()
– For getting the total sum of values in each group -
AVG()
– Used for finding the average value in each group -
MAX()
– To get the highest value in each group -
MIN()
–ย To get the lowest value in each group
We can use these functions with GROUP BY to summarize data for reporting and analysis. Therefore, analysts use these aggregate functions are frequently. Additionally, we will provide practical examples on these in the article.
GROUP BY Syntax –
Let’s look at below construct to understand the basic syntax of GROUP BY –
"SELECT column_name, AGGREGATE_FUNCTION(column_name)
FROM table_name
GROUP BY column_name;"
๐ GROUP BY Example 1: Counting Orders per Customer
Let’s understand the above syntax with the help of examples. Suppose we have a table named Orders. We have 3 customers – Alice, Bob and Charlie who placed one or more orders of different amounts.
OrderID | CustomerName | Amount |
---|---|---|
1 | Alice | 200 |
2 | Bob | 150 |
3 | Alice | 300 |
4 | Charlie | 100 |
5 | Bob | 250 |
Now, we want to calculate how many orders each customer has placed. Like, Alice and Bob placed 2 orders and Charlie placed 1.
โ SQL Query Using GROUP BY
We can write following query using the syntax provided above –
"SELECT CustomerName, COUNT(OrderID) AS TotalOrders
FROM Orders
GROUP BY CustomerName;"
๐ฏ Output:
CustomerName | TotalOrders |
---|---|
Alice | 2 |
Bob | 2 |
Charlie | 1 |
Explanation:
-
The COUNT(OrderID) function counted the number of orders per customer.
-
The GROUP BY CustomerName ensured that we count orders separately for each customer.
๐ฐ GROUP BY Example 2: Total Sales Per Customer
Now, let’s suppose that we want to calculate the total amount spent by each customer.
โ SQL Query Using SUM()
Below query, in the same syntax is using SUM function to thereby calculate total amount spent by each customer –
SELECT CustomerName, SUM(Amount) AS TotalSpent
FROM Orders
GROUP BY CustomerName;"
๐ฏ Output:
CustomerName | TotalSpent |
---|---|
Alice | 500 |
Bob | 400 |
Charlie | 100 |
Explanation:
- Thus, all the values in the Amount column were added up for each customer by SUM(Amount).
๐จ What is the HAVING Clause in SQL?
In SQL GROUP BY and Having, we use the HAVING clause to filter the results of GROUP BY based on conditions we applied to aggregate functions.
Why do we need HAVING?
-
First because we can not use the
WHERE
clause with aggregate functions. -
HAVING
is used after GROUP BY is written to filter grouped results.
HAVING Syntaxย
Following is the basic syntax for HAVING cllause –
"SELECT column_name, AGGREGATE_FUNCTION(column_name)
FROM table_name
GROUP BY column_name
HAVING condition;"
๐ฏ HAVING Example: Customers Who Spent More Than โน300
Let’s use the previous example and retrieve only those customers who spent more than โน300.
โ SQL Query Using HAVINGย
Using the basic syntax from above, we can write the below query to perform this analysis –
"SELECT CustomerName, SUM(Amount) AS TotalSpent
FROM Orders
GROUP BY CustomerName
HAVING SUM(Amount) > 300;"
๐ฏ Output:
CustomerName | TotalSpent |
---|---|
Alice | 500 |
Bob | 400 |
Explanation:
-
The customers who spent โน300 or less got filtered out by HAVING SUM(Amount) > 300.
-
To be specific, Charlie got excluded since he spent only โน100.
๐ Difference Between WHERE and HAVING
Feature | WHERE | HAVING |
---|---|---|
Used for | Filtering individual rows | Filtering grouped data |
Works with | Regular columns | Aggregate functions |
Can be used before GROUP BY? | Yes | No |
Example:
In nutshell, WHERE filters data before grouping and Having filters data after grouping. Therefore, try to visualize this with below examples –
"SELECT CustomerName, Amount
FROM Orders
WHERE Amount > 200;"
"SELECT CustomerName, SUM(Amount) AS TotalSpent
FROM Orders
GROUP BY CustomerName
HAVING SUM(Amount) > 300;"
๐ Combining WHERE, GROUP BY, and HAVING
We can combine WHERE with SQL GROUP BY and HAVING in the same query!
๐ฏ Example:
Let’s find out customers who made at least two orders and spent more than โน300.
"SELECT CustomerName, COUNT(OrderID) AS TotalOrders, SUM(Amount) AS TotalSpent
FROM Orders
WHERE Amount > 100
GROUP BY CustomerName
HAVING COUNT(OrderID) >= 2 AND SUM(Amount) > 300;"
๐ฏ Output:
CustomerName | TotalOrders | TotalSpent |
---|---|---|
Alice | 2 | 500 |
Bob | 2 | 400 |
Explanation:
-
WHERE Amount > 100 โ Filtered out orders with
Amount โค 100
before grouping. -
GROUP BY CustomerName โ Grouped orders by customer.
-
HAVING COUNT(OrderID) >= 2 โ Only customers with 2+ orders are included with the help of HAVING here.
-
HAVING SUM(Amount) > 300 โ Only the customers who spent more than โน300 are included in this case.
๐ฅ Real-Life Use Cases of SQL GROUP BY and HAVING
- E-commerce: As E-commerce sites get lot of website traffic, they deploy data analysis teams to understand customer behavior. Hence, think of use cases like calculating total revenue per product category.
- Banking: Among many inactive and low balance accounts, a manager will straightaway want to find customers with high account balances.
- Healthcare: Counting the number of patients per doctor.
- Social Media: Find users with more than 1000 followers.
๐ฏ Summary
Feature | Purpose | Example |
---|---|---|
GROUP BY | Groups rows based on a column | Total sales per customer |
HAVING | Filters grouped data | Show only customers who spent โน300+ |
WHERE | Filters individual rows before grouping | Show only orders above โน100 |
Evidently, it is very important to champion SQL GROUP BY and HAVING to analyze big datasets effectively. Thus, practice 4-5 times using the queries and scenarios discussed above and see the results yourself.๐
Hope you found this guide helpful! Let me know if you need further clarification. Happy coding!
Also checkout these articles on SQL –