The main cause for reduced full-text indexing performance is hardware-resource limits:
-
If CPU usage by the MSFTESQL service and SQL Server are close to 100 percent, the CPU is the bottleneck.
-
If the average disk-waiting queue length is more than two times the number of disk heads, there is a bottleneck on the disk. The primary workaround is to create full-text catalogs that are separate from the SQL Server database files and logs. Put the logs, database files, and full-text catalogs on separate disks. Buying faster disks and using RAID can also help improve indexing performance.
-
If there is a shortage of physical memory (3-GB limit) and the service-pause message appeared in the crawl log, memory is the bottleneck. For information about how to resolve this problem see Paused Microsoft Full-Text Engine for SQL Server (MSFTESQL) Service. The MSFTESQL service does not support AWE. Adding more than 4 GB of memory only helps SQL Server.
If hardware bottlenecks are not observed on the system, you could fine-tune the system to maximize hardware capacity. When no hardware bottlenecks are present, the indexing performance of Full-Text Search in Microsoft SQL Server 2005 mostly depends on the following:
-
How long it takes SQL Server to create full-text batches.
-
How quickly the MSFTESQL service can consume those batches.
The interaction between SQL Server and the MSFTESQL service must be tuned to reach optimal performance. If SQL Server produces more batches than the MSFTESQL service can consume, the service pauses and generates a crawl-log message to indicate this paused state. For information about how to resolve this problem, see Paused Microsoft Full-Text Engine for SQL Server (MSFTESQL) Service.
If, on the other hand, SQL Server does not produce enough full-text batches to keep the MSFTESQL service busy, the service is idle. This state is not optimal. In fact, this is the most common reason for indexing that runs slowly. To make sure that MSFTESQL service is optimally occupied, you must track and tune the following counters:
-
Batches in progress counter: Microsoft Full-Text Engine Filter Daemon (MSFTELFD)
This counter should be either equal to or double the number of CPUs in the system. Values of 0, 1 or 2 with low CPU usage indicate that SQL Server is not performing well. For example, if you have a four-way computer, this number should be between 4 and 8.
-
Batches in ready queue: MSFTESQL service
The value should be close to 10 times the crawl range count. To determine how many ranges are used to index the table, run a query from sys.dm_fts_population_ranges.
If the counter is low, you can improve it in the following ways:
-
Make sure that the table has a multicrawl range. To verify, query sys.dm_fts_population_ranges. Ideally, the crawl-range count should be two times the number of CPUs. The crawl range is limited by the number of rows in the table, the number of CPUs, and the max full-text crawl range configuration option. You must restart the crawl operation for the option to take effect.
Note: |
|---|
|
This only applies to a full crawl.
|
-
Make sure that the base table has a clustered index. Use an integer data type for the first column of the clustered index. Avoid using GUIDs in the first column of the clustered index. A multirange crawl on a clustered index can produce the highest crawl speed.
-
Update the statistics of the base table by using the UPDATE STATISTICS statement. More important, update the statistics on the clustered index or the full-text key for a full crawl. This helps a multirange crawl to generate good partitions on the table.
-
Build a secondary index on a timestamp column if you want to improve the performance of incremental population.
Note: |
|---|
|
Unlike full crawl, incremental, manual, and auto change tracking population are not designed to maximize hardware resources to achieve faster speed. Therefore, these tuning suggestions may not enhance performance for full-text indexing.
|