Access Subquery Techniques
This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.
Mike Gunderloy
Even some experienced Access developers shy away from writing SQL directly.
That's a shame, because, unless you're willing to write SQL, you can't use
subqueries, which are a powerful tool for solving some especially thorny data
retrieval problems. In this article, Mike Gunderloy introduces subqueries and
shows how you can use them in Access.
So what's a subquery? That's easy: A subquery is an SQL SELECT statement
that's nested inside of another SQL statement. You can use subqueries as part of
a SELECT, SELECT INTO, INSERT INTO, DELETE, or UPDATE statement; in this
article, I'll only cover subqueries in SELECT statements. But the big question
is, "Why would you want to use a subquery?" I'll start out by looking
at a practical example.
Using a subquery to calculate intervals
Suppose you'd like to know how frequently your customers place orders
(this example, like all of the others in this article, will use the Access 2002
version of the Northwind sample database for its data). You can create a query
to provide this information by following these steps:
- Create a new query based on the Orders table.
- Choose to show the CustomerID and OrderDate columns.
- In the Field row of the third column of the query, type this expression:
PreviousOrderDate: (SELECT MAX(OrderDate)
FROM Orders AS Orders1
WHERE Orders1.OrderDate < Orders.OrderDate
AND Orders1.CustomerID = Orders.CustomerID)
- In the Field row of the fourth column of the query, type this
expression:
OrderInterval: [OrderDate]-[PreviousOrderDate]
- Set the CustomerID field to sort ascending, and the OrderDate field to
sort descending. Now run the query.
Figure 1
shows the results. For each order you can see the date of the previous order
as well as the number of days between the two orders. If you look at this
query in SQL view, here's what you'll find:
SELECT Orders.CustomerID, Orders.OrderDate,
(SELECT MAX(OrderDate)
FROM Orders AS Orders1
WHERE Orders1.OrderDate < Orders.OrderDate
AND Orders1.CustomerID = Orders.CustomerID)
AS PreviousOrderDate,
[OrderDate]-[PreviousOrderDate] AS OrderInterval
FROM Orders
ORDER BY Orders.CustomerID, Orders.OrderDate DESC;
The embedded SELECT statement in parentheses is a subquery. To be more
precise, this particular example is a correlated subquery: one that uses a field
from the main table as a part of the WHERE clause in the subquery.
Subquery syntax
A subquery can appear in the field list (as in the preceding example)
or in a WHERE or HAVING clause, where it provides a set of one or more values to
evaluate. In a field list, the subquery must return a single value, which is
normally assured by using an aggregation such as MAX or SUM. In the WHERE or
HAVING clause there are three basic forms for a subquery:
comparison [ANY|ALL|SOME] (subquery)
expression [NOT] IN (subquery)
[NOT] EXISTS (subquery)
The first form, using the ANY, SOME, or ALL keywords, allows you to filter a
query based on the results of another query. For example, consider this query:
SELECT TOP 5 UnitPrice
FROM Products
ORDER BY UnitPrice DESC
That gives a result set with the five highest prices in the Products table:
Unit Price
$263.50
$123.79
$97.00
$81.00
$62.50
Now, I'll use that as a subquery. First, here it is with the ANY keyword:
SELECT UnitPrice FROM PRODUCTS
WHERE UnitPrice < ANY
(SELECT TOP 5 UnitPrice
FROM Products
ORDER BY UnitPrice DESC)
ORDER BY UnitPrice DESC
The result set for that query starts off:
Unit Price
$123.79
$97.00
$81.00
$62.50
$55.00
$53.00 (more rows omitted)
Note that the $123.79 unit price is included, because it's less than any one
of the rows returned by the subquery. The result might be more clear if you
think of this using the equivalent SOME keyword (this returns exactly the same
results as the previous example):
SELECT UnitPrice FROM PRODUCTS
WHERE UnitPrice < SOME
(SELECT TOP 5 UnitPrice
FROM Products
ORDER BY UnitPrice DESC)
ORDER BY UnitPrice DESC
The alternative is to use ALL for the subquery comparison, which returns
different results. The query is:
SELECT UnitPrice FROM PRODUCTS
WHERE UnitPrice < ALL
(SELECT TOP 5 UnitPrice
FROM Products
ORDER BY UnitPrice DESC)
ORDER BY UnitPrice DESC
And the results start off:
Unit Price
$55.00
$53.00
$49.30
$46.00
$45.60
$43.90 (more rows omitted)
Now, the $123.79 unit price is removed from the results (along with several
others); the query returns only rows in the main table that are less than all of
the rows returned by the subquery.
The second form of subquery syntax, using In or Not In, allows you to use a
set of values (rather than a single value) in a WHERE clause. Suppose, for
example, that you want to see the names of all employees who've sold anything to
a particular customer. You can accomplish that with this subquery in the WHERE
clause:
SELECT FirstName, LastName
FROM Employees
WHERE EmployeeID IN
(SELECT EmployeeID
FROM Orders
WHERE CustomerID = 'ALFKI')
To see the employees who haven't sold anything to this customer, just replace
In with Not In. In case you didn't know, In and Not In work perfectly well
without subqueries as well. For example, this is a valid query:
SELECT FirstName, LastName
FROM Employees
WHERE EmployeeID IN (3,5,6)
Finally, the EXISTS and NOT EXISTS keywords let you make decisions based on
whether there are any records at all in a subquery. For example, to find all
products that have been ordered by a customer, you could use this query:
SELECT ProductName
FROM Products
WHERE EXISTS
(SELECT Orders.OrderID
FROM Orders INNER JOIN [Order Details]
ON Orders.OrderID = [Order Details].OrderID
WHERE Products.ProductID =
[Order Details].ProductID
AND CustomerID = 'ALFKI')
More subquery solutions
Let's look at some other querying problems that are easily solved
with the use of subqueries. One of these is the problem of getting a "top
per group" result set. For example, suppose you'd like to see the most
recent three order dates for each customer in the database. You can accomplish
this task with this query:
SELECT CompanyName, OrderDate
FROM Customers INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID
WHERE OrderDate IN
(SELECT TOP 3 OrderDate
FROM Orders
WHERE Orders.CustomerID = Customers.CustomerID
ORDER BY OrderDate DESC)
ORDER BY CompanyName ASC, OrderDate DESC
Note that the WHERE clause in the subquery joins a field from the subquery
with a field from the main query. This is what makes the subquery return
different results for each customer in the main query.
Another use for subqueries is to add rankings to a totals query. For example,
you might want to know the sales rank of each product. This is most easily done
with two queries. The first is a totals query that collects the total sales for
each product:
SELECT ProductName,
SUM(Quantity) AS TotalSales
FROM [Order Details] INNER JOIN Products
ON [Order Details].ProductID =
Products.ProductID
GROUP BY ProductName
ORDER BY SUM(Quantity) DESC
After that query has been saved as qryProductTotals, you can use this query
with a subquery to generate the sales rankings:
SELECT ProductName, TotalSales,
(SELECT COUNT(*)
FROM qryProductTotals AS QPT
WHERE qryProductTotals.TotalSales
<= QPT.TotalSales) AS Rank
FROM qryProductTotals
ORDER BY TotalSales DESC
Figure 2 shows the results of running the second
query. Note how duplicate sales figures are handled by assigning a tie rank. The
query works by looking at a second copy of the source query and counting the
number of rows that have a total equal to or greater than that of the current
row. If you run this query, you'll discover that it's extremely slow, because it
needs to run the nested totals query once for every row in the result set.
Finally, subqueries are very useful for answering "above average"
questions. For example, which products cost more than the average product?
Here's a query with a subquery that gives the answer:
SELECT ProductName, UnitPrice
FROM Products
WHERE UnitPrice >
(SELECT AVG(UnitPrice)
FROM Products)
ORDER BY UnitPrice DESC
The path to SQL enlightenment
Subqueries are perhaps the simplest queries in Access that absolutely
require you to write some SQL. Even if you use the QBE grid to construct your
overall query (as I did for the first example in this article), you can't avoid
writing an SQL statement for the subquery, either in a field definition or in a
WHERE or HAVING clause. The requirement to write SQL makes many beginning Access
developers shy away from using subqueries. That's a pity, because some problems
(for example, the ranking query or the top per group query) are most easily
solved by subqueries.
Rather than avoiding subqueries, I urge you to embrace them. Learning enough
SQL to write subqueries will help you gain confidence in writing SQL statements,
and ultimately you'll find that you can use this knowledge to write other types
of queries directly in SQL. Access is practically unique as a product in letting
you switch easily from a graphical view of a query to an SQL view. By learning
how to make this switch on your own, you'll develop SQL skills that will serve
you well in other less flexible products, as well as in writing VBA code that
uses SQL statements.
To find out more about Smart Access and Pinnacle Publishing, visit their website at
http://www.pinpub.com/html/main.isx?sub=57
Note: This is not a Microsoft Corporation website. Microsoft is not responsible for its content.
This article is reproduced from the October 2001 issue of Smart Access. Copyright 2001, by Pinnacle Publishing, Inc., unless otherwise noted. All rights are reserved. Smart Access is an independently produced publication of Pinnacle Publishing, Inc. No part of this article may be used or reproduced in any fashion (except in brief quotations used in critical articles and reviews) without prior consent of Pinnacle Publishing, Inc. To contact Pinnacle Publishing, Inc., please call 1-800-493-4867 x4209.