A normalized database prevents functional dependencies in the data so that updating the database is easy and efficient. However, querying the database might require many joins of tables to combine information. As the number of join tables increases, the query running time increases significantly. Therefore, a normalized database might not always be the best choice. A database with the appropriate amount of denormalization reduces the number of tables that have to join together, without adding too much complication to the updating process. This is frequently a good compromise.
Note: |
|---|
|
In general, if a significant number of your queries require joins of more than five or six tables, you should consider denormalization.
|
There are other kinds of database denormalization, also. For example, suppose you have two tables in a database: Orders and Order Details. The Orders table contains information about a customer's order. The individual products in each order are contained in the Order Details table. Suppose you want to query the total dollar amount for each order. First you have to determine the dollar amount for each product (units * unit price – applicable discount). Then you have to group the amounts by order. Here is what the query looks like:
SELECT "Order ID", SUM("Unit Price" * Quantity * (1.0 - Discount))
AS Total FROM "Order Details"
GROUP BY "Order ID"
Order ID Total
----------------------------------------
10000 108
10001 1363.15000915527
10002 731.800003051758
10003 498.180023193359
10004 3194.19999694824
10005 173.400009155273
10006 87.2000007629395
10007 1405
10008 1171
10009 1530
10010 470
... ...
(1078 rows affected)
The calculation for this query is not trivial. For a large set of orders, the query can take a long time to run. The alternative is to calculate the dollar amount of the order at the time it is placed, and then store that amount in a column within the Orders table. With this approach, you have to query only the pre-computed column to return the information that you need:
SELECT "Order ID", "Order Total" AS Total FROM Orders
By creating a pre-computed column, you can save lots of query time, but this approach requires that you maintain an additional column in the table.