GROUP BY (Transact-SQL)
Groups a selected set of rows into a set of summary rows by the values of one or more columns or expressions. One row is returned for each group. Aggregate functions in the SELECT clause <select> list provide information about each group instead of individual rows.
The GROUP BY clause has an ISO-compliant syntax and a non-ISO-compliant syntax. Only one syntax style can be used in a single SELECT statement. Use the ISO compliant syntax for all new work. The non-ISO compliant syntax is provided for backward compatibility.
In this topic, a GROUP BY clause can be described as general or simple:
A general GROUP BY clause includes GROUPING SETS, CUBE, ROLLUP, WITH CUBE, or WITH ROLLUP.
A simple GROUP BY clause does not include GROUPING SETS, CUBE, ROLLUP, WITH CUBE, or WITH ROLLUP. GROUP BY (), grand total, is considered a simple GROUP BY.
ISO-Compliant Syntax
GROUP BY <group by spec>
<group by spec> ::=
<group by item> [ ,...n ]
<group by item> ::=
<simple group by item>
| <rollup spec>
| <cube spec>
| <grouping sets spec>
| <grand total>
<simple group by item> ::=
<column_expression>
<rollup spec> ::=
ROLLUP ( <composite element list> )<cube spec> ::=
CUBE ( <composite element list> )<composite element list> ::=
<composite element> [ ,...n ]
<composite element> ::=
<simple group by item>
| ( <simple group by item list> )<simple group by item list> ::=
<simple group by item> [ ,...n ]
<grouping sets spec> ::=
GROUPING SETS ( <grouping set list> )<grouping set list> ::=
<grouping set> [ ,...n ]
<grouping set> ::=
<grand total>
| <grouping set item>
| ( <grouping set item list> )<empty group> ::=()<grouping set item> ::=
<simple group by item>
| <rollup spec>
| <cube spec>
<grouping set item list> ::=
<grouping set item> [ ,...n ]
Non-ISO-Compliant Syntax
[ GROUP BY [ ALL ] group_by_expression [ ,...n ]
[ WITH { CUBE | ROLLUP } ]
]
Expressions in the GROUP BY clause can contain columns of the tables, derived tables or views in the FROM clause. The columns are not required to appear in the SELECT clause <select> list.
Each table or view column in any nonaggregate expression in the <select> list must be included in the GROUP BY list:
The following statements are allowed:
SELECT ColumnA, ColumnB FROM T GROUP BY ColumnA, ColumnB SELECT ColumnA + ColumnB FROM T GROUP BY ColumnA, ColumnB SELECT ColumnA + ColumnB FROM T GROUP BY ColumnA + ColumnB SELECT ColumnA + ColumnB + constant FROM T GROUP BY ColumnA, ColumnB
The following statements are not allowed:
SELECT ColumnA, ColumnB FROM T GROUP BY ColumnA + ColumnB SELECT ColumnA + constant + ColumnB FROM T GROUP BY ColumnA + ColumnB.
If aggregate functions are included in the SELECT clause <select list>, GROUP BY calculates a summary value for each group. These are known as vector aggregates.
Rows that do not meet the conditions in the WHERE clause are removed before any grouping operation is performed.
The HAVING clause is used with the GROUP BY clause to filter groups in the result set.
The GROUP BY clause does not order the result set. Use the ORDER BY clause to order the result set.
If a grouping column contains null values, all null values are considered equal, and they are put into a single group.
You cannot use GROUP BY with an alias to replace a column name in the AS clause unless the alias replaces a column name in a derived table in the FROM clause.
Duplicate grouping sets in a GROUPING SETS list are not eliminated. Duplicate grouping sets can be generated by specifying a column expression more than one time or by listing a column expression also generated by a CUBE or ROLLUP in the GROUPING SETS list.
Distinct aggregates, for example, AVG (DISTINCT column_name), COUNT (DISTINCT column_name), and SUM (DISTINCT column_name) are supported with ROLLUP, CUBE, and GROUPING SETS.
ROLLUP, CUBE, and GROUPING SETS cannot be specified in an indexed view.
GROUP BY or HAVING cannot be used directly on columns of ntext, text, or image. These columns can be used as arguments in functions that return a value of another data type, such as SUBSTRING() and CAST().
xml data type methods cannot be specified directly in a <column_expression>. Instead, refer to a user-defined function that uses xml data type methods inside it, or refer to a computed column that uses them.
GROUP BY Limitations for GROUPING SETS, ROLLUP, and CUBE
Syntax Limitations
GROUPING SETS are not allowed in the GROUP BY clause unless they are part of a GROUPING SETS list. For example, GROUP BY C1, (C2,..., Cn) is not allowed but GROUP BY GROUPING SETS (C1, (C2, ..., Cn)) is allowed.
GROUPING SETS are not allowed inside GROUPING SETS. For example, GROUP BY GROUPING SETS (C1, GROUPING SETS (C2, C3)) is not allowed.
The non-ISO ALL, WITH CUBE, and WITH ROLLUP keywords are not allowed in a GROUP BY clause with the ROLLUP, CUBE or GROUPING SETS keywords.
Size Limitations
For simple GROUP BY, there is no limit on the number of expressions.
For a GROUP BY clause that uses ROLLUP, CUBE, or GROUPING SETS, the maximum number of expressions is 32, and the maximum number of grouping sets that can be generated is 4096 (212). The following examples fail because the GROUP BY clause is too complex:
The following examples generate 8192 (213) grouping sets.
GROUP BY CUBE (a1, ..., a13) GROUP BY a1, ..., a13 WITH CUBE
The following example generates 4097 (212 + 1) grouping sets.
GROUP BY GROUPING SETS( CUBE(a1, ..., a12), b )
The following example also generates 4097 (212 + 1) grouping sets. Both CUBE () and the () grouping set produce a grand total row and duplicate grouping sets are not eliminated.
GROUP BY GROUPING SETS( CUBE(a1, ..., a12), ())
Support for ISO and ANSI SQL-2006 GROUP BY Features
SQL Server 2008 supports all GROUP BY features that are included in the SQL-2006 standard with the following syntax exceptions:
Grouping sets are not allowed in the GROUP BY clause unless they are part of an explicit GROUPING SETS list. For example, GROUP BY Column1, (Column2, ...ColumnN) is allowed in the standard but not in SQL Server. GROUP BY C1, GROUPING SETS ((Column2, ...ColumnN)) or GROUP BY Column1, Column2, ... ColumnN is allowed. These are semantically equivalent to the previous GROUP BY example. This is to avoid the possibility that GROUP BY Column1, (Column2, ...ColumnN) might be misinterpreted as GROUP BY C1, GROUPING SETS ((Column2, ...ColumnN)). This is not semantically equivalent.
Grouping sets are not allowed inside grouping sets. For example, GROUP BY GROUPING SETS (A1, A2,…An, GROUPING SETS (C1, C2, ...Cn)) is allowed in the SQL-2006 standard but not in SQL Server. SQL Server 2008 allows GROUP BY GROUPING SETS( A1, A2,...An, C1, C2, ...Cn ) or GROUP BY GROUPING SETS( (A1), (A2), ... (An), (C1), (C2), ... (Cn) ). These examples are semantically equivalent to the first GROUP BY example and have a clearer syntax.
GROUP BY [ALL/DISTINCT] is not allowed in a general GROUP BY clause or with the GROUPING SETS, ROLLUP, CUBE, WITH CUBE or WITH ROLLUP constructs. ALL is the default and is implicit.
Comparison of Supported GROUP BY Features
The following table describes the GROUP BY features that are supported based upon the version of SQL Server and the database compatibility level.
Feature | SQL Server 2005 Integration Services | SQL Server 2008 compatibility level 100 | SQL Server 2008 compatibility level 90 or earlier |
|---|---|---|---|
DISTINCT aggregates | Not supported for WITH CUBE or WITH ROLLUP. | Supported for WITH CUBE, WITH ROLLUP, GROUPING SETS, CUBE, or ROLLUP. | Same as SQL Server 2008 compatibility level 100. |
User-defined function with CUBE or ROLLUP name in the GROUP BY clause | User-defined function dbo.cube(arg1,...argN) or dbo.rollup(arg1,...argN) in the GROUP BY clause is allowed. For example: SELECT SUM (x) FROM T GROUP BY dbo.cube(y) | User-defined function dbo.cube (arg1,...argN) or dbo.rollup(arg1,...argN) in the GROUP BY clause is not allowed. For example: SELECT SUM (x) FROM T GROUP BY dbo.cube(y) The following error message is returned: "Incorrect syntax near the keyword 'cube'|'rollup'." To avoid this problem, replace dbo.cube with [dbo].[cube] or dbo.rollup with [dbo].[rollup]. The following example is allowed: SELECT SUM (x) FROM T GROUP BY [dbo].[cube](y) | User-defined function dbo.cube (arg1,...argN) or dbo.rollup(arg1,...argN) in the GROUP BY clause is allowed For example: SELECT SUM (x) FROM T GROUP BY dbo.cube(y) |
GROUPING SETS | Not supported | Supported | Supported |
CUBE | Not supported | Supported | Not supported |
ROLLUP | Not supported | Supported | Not supported |
Grand total, such as GROUP BY () | Not supported | Supported | Supported |
GROUPING_ID function | Not supported | Supported | Supported |
GROUPING function | Supported | Supported | Supported |
WITH CUBE | Supported | Supported | Supported |
WITH ROLLUP | Supported | Supported | Supported |
WITH CUBE or WITH ROLLUP "duplicate" grouping removal | Supported | Supported | Supported |
For examples that use GROUPING SETS, ROLLUP, and CUBE, see Using GROUP BY with ROLLUP, CUBE, and GROUPING SETS.
A. Using a simple GROUP BY clause
The following example retrieves the total for each SalesOrderID from the SalesOrderDetail table.
USE AdventureWorks; GO SELECT SalesOrderID, SUM(LineTotal) AS SubTotal FROM Sales.SalesOrderDetail sod GROUP BY SalesOrderID ORDER BY SalesOrderID;
B. Using a GROUP BY clause with multiple tables
The following example retrieves the number of employees for each City from the Address table joined to the EmployeeAddress table.
USE AdventureWorks;
GO
SELECT a.City, COUNT(ea.AddressID) EmployeeCount
FROM HumanResources.EmployeeAddress ea
INNER JOIN Person.Address a
ON ea.AddressID = a.AddressID
GROUP BY a.City
ORDER BY a.City;
C. Using a GROUP BY clause with an expression
The following example retrieves the total sales for each year by using the DATEPART function. The same expression must be present in both the SELECT list and GROUP BY clause.
USE AdventureWorks;
GO
SELECT DATEPART(yyyy,OrderDate) AS N'Year'
,SUM(TotalDue) AS N'Total Order Amount'
FROM Sales.SalesOrderHeader
GROUP BY DATEPART(yyyy,OrderDate)
ORDER BY DATEPART(yyyy,OrderDate);
D. Using a GROUP BY clause with a HAVING clause
The following example uses the HAVING clause to specify which of the groups generated in the GROUP BY clause should be included in the result set.
USE AdventureWorks;
GO
SELECT DATEPART(yyyy,OrderDate) AS N'Year'
,SUM(TotalDue) AS N'Total Order Amount'
FROM Sales.SalesOrderHeader
GROUP BY DATEPART(yyyy,OrderDate)
HAVING DATEPART(yyyy,OrderDate) >= N'2003'
ORDER BY DATEPART(yyyy,OrderDate);
