
Wednesday 12 April 2017

Difference Between SQL HAVING and SQL WHERE


When working with more advanced SQL, it can be unclear when to use a WHERE clause and when to use a HAVING clause.
Though it appears that both clauses do the same thing, they do it in different ways.  In fact, their functions complement each other.
  • A WHERE clause is used to filter records from a result.  The filter occurs before any groupings are made.
  • A HAVING clause is used to filter values from a group.
Before we go any further, let’s review the format of an SQL statement.  It is:
SELECT
FROM
WHERE
GROUP BY
HAVING
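To help keep things straight, I like to think of the order of execution of an SQL statement as running from top to bottom.  That means the WHERE clause is applied to the rows first, and then the remaining rows are summarized according to the GROUP BY.  (Strictly speaking, the SELECT list is evaluated after HAVING, but the top-to-bottom picture holds for WHERE, GROUP BY, and HAVING.)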

WHERE clause

The WHERE clause is used to filter rows from a result.  For instance,
SELECT   COUNT(SalesOrderID)
FROM     Sales.SalesOrderDetail
returns 121,317 as the count, whereas the query
SELECT   COUNT(SalesOrderID)
FROM     Sales.SalesOrderDetail
WHERE    UnitPrice > 200
returns 48,159 as the count.  This is because the WHERE clause filters out the 73,158 SalesOrderDetail rows whose UnitPrice is less than or equal to 200.
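The same effect can be reproduced on a small scale.  The sketch below is only an illustration: it uses Python's built-in sqlite3 module and a hypothetical five-row stand-in for Sales.SalesOrderDetail (the table contents are made up, not AdventureWorks data).  It shows that the rows kept by the WHERE clause plus the rows it filters out add up to the unfiltered count.

```python
import sqlite3

# Hypothetical toy table standing in for Sales.SalesOrderDetail.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail "
    "(SalesOrderID INTEGER, UnitPrice REAL, OrderQty INTEGER)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(1, 150.0, 2), (1, 250.0, 1), (2, 200.0, 3), (2, 999.0, 1), (3, 50.0, 10)],
)

# Count every row, with no filter.
total = conn.execute(
    "SELECT COUNT(SalesOrderID) FROM SalesOrderDetail"
).fetchone()[0]

# Count only the rows that pass the WHERE filter.
kept = conn.execute(
    "SELECT COUNT(SalesOrderID) FROM SalesOrderDetail WHERE UnitPrice > 200"
).fetchone()[0]

# Count the rows the filter removes (the complementary condition).
dropped = conn.execute(
    "SELECT COUNT(SalesOrderID) FROM SalesOrderDetail WHERE UnitPrice <= 200"
).fetchone()[0]

print(total, kept, dropped)
```

Note that a row with a UnitPrice of exactly 200 is filtered out, because the condition is strictly greater than 200.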

HAVING Clause

The HAVING clause is used to filter values in a GROUP BY.  You can use it to filter out groups, such as
SELECT   SalesOrderID,
         SUM(UnitPrice * OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING   SalesOrderID > 50000
But its true power lies in its ability to compare and filter based on aggregate function results.  For instance, you can select all orders totaling more than $10,000:
SELECT   SalesOrderID,
         SUM(UnitPrice * OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING   SUM(UnitPrice * OrderQty) > 10000
Since the WHERE clause sees only one row at a time, it has no way to evaluate the SUM across all the rows of a SalesOrderID.  The HAVING clause is evaluated after the groups are created, so it can.
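Here is a minimal sketch of that difference, again assuming Python's sqlite3 module and a made-up toy table, with the threshold scaled down from 10,000 to 1,000 to suit the toy data.  The HAVING query filters on the per-order total, while the attempt to put the same aggregate in a WHERE clause is rejected by the database engine.

```python
import sqlite3

# Hypothetical toy table standing in for Sales.SalesOrderDetail.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail "
    "(SalesOrderID INTEGER, UnitPrice REAL, OrderQty INTEGER)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(1, 100.0, 2), (1, 50.0, 1), (2, 500.0, 3), (3, 400.0, 2), (3, 300.0, 1)],
)

# HAVING is evaluated per group, so it can compare the aggregated total.
big_orders = conn.execute(
    """
    SELECT   SalesOrderID,
             SUM(UnitPrice * OrderQty) AS TotalPrice
    FROM     SalesOrderDetail
    GROUP BY SalesOrderID
    HAVING   SUM(UnitPrice * OrderQty) > 1000
    ORDER BY SalesOrderID
    """
).fetchall()

# WHERE is evaluated per row, so an aggregate inside it is an error.
try:
    conn.execute(
        "SELECT SalesOrderID FROM SalesOrderDetail "
        "WHERE SUM(UnitPrice * OrderQty) > 1000 GROUP BY SalesOrderID"
    )
    where_rejected = False
except sqlite3.Error:
    where_rejected = True

print(big_orders, where_rejected)
```

Order 1 totals 250 and is dropped by the HAVING clause, while orders 2 and 3 (1,500 and 1,100) are kept.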

Combining the two: WHERE and HAVING

When an SQL statement has both a WHERE clause and a HAVING clause, keep in mind that the WHERE clause is applied first, then the results are grouped, and finally the groups are filtered according to the HAVING clause.
In many cases you can place the WHERE condition in the HAVING clause, such as
SELECT   SalesOrderID,
         SUM(UnitPrice * OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING   SUM(UnitPrice * OrderQty) > 10000 
         AND SalesOrderID > 50000
Versus
SELECT   SalesOrderID,
         SUM(UnitPrice * OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
WHERE    SalesOrderID > 50000
GROUP BY SalesOrderID
HAVING   SUM(UnitPrice * OrderQty) > 10000
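As a quick check that the two forms agree, the sketch below runs both against a hypothetical toy table using Python's sqlite3 module.  The data and the scaled-down thresholds (1,000 and order ID 2) are assumptions made for illustration only.

```python
import sqlite3

# Hypothetical toy table standing in for Sales.SalesOrderDetail.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail "
    "(SalesOrderID INTEGER, UnitPrice REAL, OrderQty INTEGER)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(1, 100.0, 2), (1, 50.0, 1), (2, 500.0, 3), (3, 400.0, 2), (3, 300.0, 1)],
)

# Form 1: both conditions in the HAVING clause.
having_only = conn.execute(
    """
    SELECT   SalesOrderID,
             SUM(UnitPrice * OrderQty) AS TotalPrice
    FROM     SalesOrderDetail
    GROUP BY SalesOrderID
    HAVING   SUM(UnitPrice * OrderQty) > 1000
             AND SalesOrderID > 2
    ORDER BY SalesOrderID
    """
).fetchall()

# Form 2: the SalesOrderID condition moved into the WHERE clause.
where_then_having = conn.execute(
    """
    SELECT   SalesOrderID,
             SUM(UnitPrice * OrderQty) AS TotalPrice
    FROM     SalesOrderDetail
    WHERE    SalesOrderID > 2
    GROUP BY SalesOrderID
    HAVING   SUM(UnitPrice * OrderQty) > 1000
    ORDER BY SalesOrderID
    """
).fetchall()

print(having_only, where_then_having)
```

The condition can move freely between the two clauses here because SalesOrderID is the grouping column: every row in a group shares the same value, so filtering rows before grouping or groups after grouping leaves the same orders.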
If you can put a condition from the WHERE clause in the HAVING clause, then why even worry about the WHERE?  Can I just use this query?
SELECT   SalesOrderID,
         SUM(UnitPrice * OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
GROUP BY SalesOrderID
HAVING   SUM(UnitPrice * OrderQty) > 10000 AND LineTotal > 10
Actually, that query generates an error.  The column LineTotal is neither part of the GROUP BY column list nor the result of an aggregate function.
To be valid, the HAVING clause can only compare the results of aggregate functions or columns that are part of the GROUP BY.
The query has to be rewritten as
SELECT   SalesOrderID,
         SUM(UnitPrice * OrderQty) AS TotalPrice
FROM     Sales.SalesOrderDetail
WHERE    LineTotal > 10
GROUP BY SalesOrderID
HAVING   SUM(UnitPrice * OrderQty) > 10000
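Moving the row-level condition into the WHERE clause also changes what gets summed, since rows are discarded before the groups are built.  The sketch below illustrates this with Python's sqlite3 module; the table contents, the LineTotal values, and the scaled-down threshold of 1,000 are all hypothetical.  It runs only the valid WHERE form, because SQLite is more permissive than SQL Server about ungrouped columns and would not reproduce the error described above.

```python
import sqlite3

# Hypothetical toy table; LineTotal is stored as a plain column here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail "
    "(SalesOrderID INTEGER, UnitPrice REAL, OrderQty INTEGER, LineTotal REAL)"
)
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?, ?)",
    [
        (1, 5.0, 1, 5.0),
        (1, 600.0, 2, 1200.0),
        (2, 10.0, 1, 10.0),
        (2, 995.0, 1, 995.0),
    ],
)

# No row filter: every line counts toward its order total.
without_where = conn.execute(
    """
    SELECT   SalesOrderID,
             SUM(UnitPrice * OrderQty) AS TotalPrice
    FROM     SalesOrderDetail
    GROUP BY SalesOrderID
    HAVING   SUM(UnitPrice * OrderQty) > 1000
    ORDER BY SalesOrderID
    """
).fetchall()

# Row filter first: small lines are removed before the totals are computed.
with_where = conn.execute(
    """
    SELECT   SalesOrderID,
             SUM(UnitPrice * OrderQty) AS TotalPrice
    FROM     SalesOrderDetail
    WHERE    LineTotal > 10
    GROUP BY SalesOrderID
    HAVING   SUM(UnitPrice * OrderQty) > 1000
    ORDER BY SalesOrderID
    """
).fetchall()

print(without_where, with_where)
```

With the WHERE clause in place, order 2 loses its 10-unit line, its total falls to 995, and it no longer passes the HAVING condition.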
To summarize the difference between WHERE and HAVING:
  • WHERE is used to filter records before any groupings take place.
  • HAVING is used to filter values after they have been grouped.  Only aggregate expressions or columns that are part of the GROUP BY can be included in the HAVING clause’s conditions.
