Parts of a SELECT Statement

The full syntax of the SELECT statement is complex, but the main clauses can be summarized as:

SELECT select_list

[ INTO new_table_name ]

FROM table_list

[ WHERE search_conditions ]

[ GROUP BY group_by_list ]

[ HAVING search_conditions ]

[ ORDER BY order_list [ ASC | DESC ] ]

  • select_list
    Describes the columns of the result set. It is a comma-separated list of expressions. Each expression defines both the format (data type and size) and the source of the data for the result set column. Each select list expression is typically a reference to a column in the source table or view the data is coming from, but can be any other expression, such as a constant or a Transact-SQL function. Using the * expression in a select list specifies that all columns in the source table are returned.

  • INTO new_table_name
    Specifies that the result set is used to create a new table. new_table_name specifies the name of the new table.

  • FROM table_list
    Contains a list of the tables from which the result set data is retrieved. These sources can be:

    • Base tables in the local server running SQL Server.

    • Views in the local instance of SQL Server. SQL Server internally resolves a view reference to references against the base tables that make up the view.

    • Linked tables. These are tables in OLE DB data sources that can be accessed by SQL Server. This is referred to as a distributed query. OLE DB data sources can be accessed from SQL Server by linking them as a linked server, or by referencing the data source in an OPENROWSET or OPENQUERY function.

    The FROM clause can also contain join specifications. These define the specific path SQL Server is to use in navigating from one table to another.

    The FROM clause is also used on the DELETE and UPDATE statements to define the tables that are modified.

  • WHERE search_conditions
    The WHERE clause is a filter that defines the conditions each row in the source tables must meet to qualify for the SELECT. Only rows that meet the conditions contribute data to the result set. Data from rows that do not meet the conditions is not used.

    The WHERE clause is also used on the DELETE and UPDATE statements to define the rows in the target tables that are modified.

  • GROUP BY group_by_list
    The GROUP BY clause partitions the result set into groups based on the values in the columns of the group_by_list. For example, the AdventureWorks Sales.SalesOrderHeader table has ten values in TerritoryID. A GROUP BY TerritoryID clause partitions the result set into 10 groups, one for each value of TerritoryID.

  • HAVING search_conditions
    The HAVING clause is an additional filter that is applied to the result set. Logically, the HAVING clause filters rows from the intermediate result set built from applying any FROM, WHERE, or GROUP BY clauses in the SELECT statement. HAVING clauses are typically used with a GROUP BY clause, although a GROUP BY clause is not required before a HAVING clause.

  • ORDER BY order_list[ ASC | DESC ]
    The ORDER BY clause defines the order in which the rows in the result set are sorted. order_list specifies the result set columns that make up the sort list. The ASC and DESC keywords are used to specify if the rows are sorted in an ascending or descending sequence.

    ORDER BY is important because relational theory specifies that the rows in a result set cannot be assumed to have any sequence unless ORDER BY is specified. ORDER BY must be used in any SELECT statement for which the order of the result set rows is important.

The clauses in a SELECT statement must be specified in the appropriate order.

Each reference to a database object must be unambiguous. Ambiguity can come from the following sources:

  • There may be multiple objects that have the same name in a system. For example, both Schema1 and Schema2 may have a table named TableX. To resolve the ambiguity and specify the TableX owned by Schema1, qualify the table name with at least the schema name:

    SELECT *
    FROM Schema1.TableX
    
  • The database in which the object resides may not always be the current database when the SELECT statement is executed. To make sure that the correct object is always used, regardless of the current database setting, qualify the object name with the database and schema:

    SELECT *
    FROM AdventureWorks.Purchasing.ShipMethod
    
  • The tables and views specified in the FROM clause may have duplicate column names. Frequently, foreign keys will have the same column name as their related primary key. To resolve the ambiguity between duplicate names, the column name must be qualified with the table or view name:

    SELECT DISTINCT Sales.Customer.CustomerID, Sales.Store.Name
    FROM Sales.Customer JOIN Sales.Store ON
           ( Sales.Customer.CustomerID = Sales.Store.CustomerID)
    WHERE Sales.Customer.TerritoryID = 1
    

    This syntax becomes cumbersome when the table and view names must both be fully qualified. You can resolve this problem by assigning a correlation name, also known as a range variable or alias, to the table, by using the AS keyword in the FROM clause. The fully qualified table or view name has to be specified only in the FROM clause. All other table or view references can then use the correlation name. Applying correlation names and fully qualifying the tables in the previous sample creates this SELECT statement:

    SELECT DISTINCT c.CustomerID, s.Name
    FROM Sales.Customer AS c
       JOIN
         Sales.Store AS s
       ON ( c.CustomerID = s.CustomerID)
    WHERE c.TerritoryID = 1
    

For more information about object qualification, see Identifiers.

In SQL Server Books Online, many Transact-SQL examples are simplified by not using qualified names. Although these elements are left out of the examples to help readability, we recommend that you use qualified names in Transact-SQL statements in production systems.