When working with more advanced SQL it can be unclear when it makes sense to use a WHERE versus a HAVING clause.
Though it appears that both clauses do the same thing, they do it in different ways. In fact, their functions complement each other.
- A WHERE clause is used is filter records from a result. The filter occurs before any groupings are made.
- A HAVING clause is used to filter values from a group.
Before we go any further let’s review the format of an SQL Statement. It is
SELECT FROM WHERE GROUP BY HAVING
To help keep things straight I like to think of the order of execution of SQL statements as from top to bottom. That means the WHERE clause is first applied to the result and then, the remaining rows summarized according to the GROUP BY.
WHERE clause
The WHERE clause is used to filer rows from a results. For instance
SELECT COUNT(SalesOrderID) FROM Sales.SalesOrderDetail
Returns 121,317 as the count, whereas, the query
SELECT COUNT(SalesOrderID) FROM Sales.SalesOrderDetail WHERE UnitPrice > 200
Returns 48,159 as the count. This is because the WHERE clause filters out the 73,158 SalesOrderDetails whose UnitPrice is less than or equal to 200 from the results.
HAVING Clause
The HAVING clause is used to filter values in a GROUP BY. You can use them to filter out groups such as
SELECT SalesOrderID, SUM(UnitPrice * OrderQty) AS TotalPrice FROM Sales.SalesOrderDetail GROUP BY SalesOrderID HAVING SalesOrderID > 50000
But their true power lies in their ability to compare and filter based on aggregate function results. For instance, you can select all orders totaling more than $10,000
SELECT SalesOrderID, SUM(UnitPrice * OrderQty) AS TotalPrice FROM Sales.SalesOrderDetail GROUP BY SalesOrderID HAVING SUM(UnitPrice * OrderQty) > 10000
Since the WHERE clause’s visibility is one row at a time, there isn’t a way for it to evaluate the SUM across all SalesOrderID’s. The HAVING clause is evaluated after the grouping is created.
Combing the two: WHERE and HAVING
When SQL statements have both a WHERE clause and HAVING clause, keep in mind the WHERE clause is applied first, then the results grouped, and finally the groups filtered according to the HAVING clause.
In many cases you can place the WHERE condition in the HAVING clause, such as
SELECT SalesOrderID, SUM(UnitPrice * OrderQty) AS TotalPrice FROM Sales.SalesOrderDetail GROUP BY SalesOrderID HAVING SUM(UnitPrice * OrderQty) > 10000 AND SalesOrderID > 50000
Versus
SELECT SalesOrderID, SUM(UnitPrice * OrderQty) AS TotalPrice FROM Sales.SalesOrderDetail WHERE SalesOrderID > 50000 GROUP BY SalesOrderID HAVING SUM(UnitPrice * OrderQty) > 10000
If you can put condition from the where clause in the having clause then why even worry about the WHERE? Can I just use this query?
SELECT SalesOrderID, SUM(UnitPrice * OrderQty) AS TotalPrice FROM Sales.SalesOrderDetail GROUP BY SalesOrderID HAVING SUM(UnitPrice * OrderQty) > 10000 AND LineTotal > 10
Actually that query generates an error. The column LineTotal is not part of the group by field list nor the result of an aggregate total.
To be valid the having clause can only compare results of aggregated functions or column part of the group by.
To be valid the query has to be rewritten as
SELECT SalesOrderID, SUM(UnitPrice * OrderQty) AS TotalPrice FROM Sales.SalesOrderDetail WHERE LineTotal > 100 GROUP BY SalesOrderID HAVING SUM(UnitPrice * OrderQty) > 10000
To summarize the difference between WHERE and HAVING:
- WHERE is used to filter records before any groupings take place.
- HAVING is used to filter values after they have been groups. Only columns or expression in the group can be included in the HAVING clause’s conditions..
No comments:
Post a Comment