Creating useful indexes is one of the most important ways to achieve better query performance. Useful indexes help you find data with fewer disk I/O operations and less system resource usage.
To create useful indexes, you much understand how the data is used, the types of queries and the frequencies they are run, and how the query processor can use indexes to find your data quickly.
When you choose what indexes to create, examine your critical queries, the performance of which will affect user experience most. Create indexes to specifically aid these queries. After adding an index, rerun the query to see if performance is improved. If it is not, remove the index.
As with most performance optimization techniques, there are tradeoffs. For example, with more indexes, SELECT queries will potentially run faster. However, DML (INSERT, UPDATE, and DELETE) operations will slow down significantly because more indexes must be maintained with each operation. Therefore, if your queries are mostly SELECT statements, more indexes can be helpful. If your application performs many DML operations, you should be conservative with the number of indexes you create.
SQL Server Compact 3.5 includes support for showplans, which help assess and optimize queries. SQL Server Compact 3.5 uses the same showplan schema as SQL Server 2008 except SQL Server Compact 3.5 uses a subset of the operators. For more information, see the Microsoft Showplan Schema at http://schemas.microsoft.com/sqlserver/2004/07/showplan/.
The next few sections provide additional information about creating useful indexes.
Create Highly-Selective Indexes
Indexing on columns used in the WHERE clause of your critical queries frequently improves performance. However, this depends on how selective the index is. Selectivity is the ratio of qualifying rows to total rows. If the ratio is low, the index is highly selective. It can get rid of most of the rows and greatly reduce the size of the result set. It is therefore a useful index to create. By contrast, an index that is not selective is not as useful.
A unique index has the greatest selectivity. Only one row can match, which makes it most helpful for queries that intend to return exactly one row. For example, an index on a unique ID column will help you find a particular row quickly.
You can evaluate the selectivity of an index by running the sp_show_statistics stored procedures on SQL Server Compact 3.5 tables. For example, if you are evaluating the selectivity of two columns, "Customer ID" and "Ship Via", you can run the following stored procedures:
sp_show_statistics_steps 'orders', 'customer id';
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS
------------------------------------------------------------
ALFKI 0 7 0
ANATR 0 4 0
ANTON 0 13 0
AROUT 0 14 0
BERGS 0 23 0
BLAUS 0 8 0
BLONP 0 14 0
BOLID 0 7 0
BONAP 0 19 0
BOTTM 0 20 0
BSBEV 0 12 0
CACTU 0 6 0
CENTC 0 3 0
CHOPS 0 12 0
COMMI 0 5 0
CONSH 0 4 0
DRACD 0 9 0
DUMON 0 8 0
EASTC 0 13 0
ERNSH 0 33 0
(90 rows affected)
And
sp_show_statistics_steps 'orders', 'reference3';
RANGE_HI_KEY RANGE_ROWS EQ_ROWS DISTINCT_RANGE_ROWS
------------------------------------------------------------
1 0 320 0
2 0 425 0
3 0 333 0
(3 rows affected)
The results show that the "Customer ID" column has a much lower degree of duplication. This means an index on it will be more selective than an index on the "Ship Via" column.
For more information about using these stored procedures, see sp_show_statistics (SQL Server Compact 3.5), sp_show_statistics_steps (SQL Server Compact 3.5), and sp_show_statistics_columns (SQL Server Compact).
Create Multiple-Column Indexes
Multiple-column indexes are natural extensions of single-column indexes. Multiple-column indexes are useful for evaluating filter expressions that match a prefix set of key columns. For example, the composite index CREATE INDEX Idx_Emp_Name ON Employees ("Last Name" ASC, "First Name" ASC) helps evaluate the following queries:
-
... WHERE "Last Name" = 'Doe'
-
... WHERE "Last Name" = 'Doe' AND "First Name" = 'John'
-
... WHERE "First Name" = 'John' AND "Last Name" = 'Doe'
However, it is not useful for this query:
-
... WHERE "First Name" = 'John'
When you create a multiple-column index, you should put the most selective columns leftmost in the key. This makes the index more selective when matching several expressions.
Avoid Indexing Small Tables
A small table is one whose contents fit in one or just a few data pages. Avoid indexing very small tables because it is typically more efficient to do a table scan. This saves the cost of loading and processing index pages. By not creating an index on very small tables, you remove the chance of the optimizer selecting one.
SQL Server Compact 3.5 stores data in 4 Kb pages. The page count can be approximated by using the following formula, although the actual count might be slightly larger because of the storage engine overhead.
<sum of sizes of columns in bytes> * <# of rows>
<# of pages> = -----------------------------------------------------------------
4096
For example, suppose a table has the following schema:
|
Column Name
|
Type (size)
|
|---|
|
Order ID
|
INTEGER (4 bytes)
|
|
Product ID
|
INTEGER (4 bytes)
|
|
Unit Price
|
MONEY (8 bytes)
|
|
Quantity
|
SMALLINT (2 bytes)
|
|
Discount
|
REAL (4 bytes)
|
The table has 2820 rows. According to the formula, it takes about 16 pages to store its data:
<# of pages> = ((4 + 4 + 8 + 2 + 4) * 2820) / 4096 = 15.15 pages