Using CHKSGFILES in a Multithreaded Application
The CHKSGFILES API in Microsoft Exchange Server 2010 enables you to check database pages by using multithreaded applications. Note, however, that using CHKSGFILES in a multithreaded application requires your application to use a specific order of execution. Some parts of the API cannot be run simultaneously, whereas other parts can be. This topic describes how to create multithreaded applications that use CHKSGFILES.
CHKSGFILES is used by applications that back up Exchange databases to ensure that the databases to be stored are fundamentally sound and not corrupted. Although corruption is much less likely with Exchange 2010, it’s safer for your backup and restore application to verify the data before making the backup.
To use CHKSGFILES in a multithreaded application, you must adhere to the following rules, which are described in more detail later in this topic.
In the CHKSGFILES API, the New, ErrInit, ErrCheckDbHeaders, ErrTerm, and Delete functions can only be called from a single thread, must be called in the proper sequence, and can only be called once each.
The ErrCheckDbPages and ErrCheckLogs functions can be called in a multithreaded manner, but only after ErrCheckDbHeaders is called and before ErrTerm is called.
The ErrCheckLogs function must only be called once for all of the databases specified in the ErrInit function. However, ErrCheckLogs can be called concurrently with calls to ErrCheckDbPages.
If you’re backing up multiple Exchange 2010 databases, you must obtain a separate CHKSGFILES object for each database, and the sequencing and concurrency rules still apply for each object.
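The rules above can be sketched as a small state machine that an application might keep alongside each CHKSGFILES instance. This is only an illustration and not part of the API: `SequenceGuard`, `Phase`, and the `On...` method names are hypothetical, and the guard tracks call order without calling CHKSGFILES itself. The `Created` phase assumes that New has already returned an instance.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical phases of one consistency check (not defined by the API).
enum class Phase { Created, Initialized, HeadersChecked, Terminated };

// Hypothetical guard: the application would consult it before each call
// into CHKSGFILES. It only records call order; it never calls the API.
class SequenceGuard {
public:
    // Each method returns true if the call is allowed at this point.
    bool OnErrInit()           { return Advance(Phase::Created, Phase::Initialized); }
    bool OnErrCheckDbHeaders() { return Advance(Phase::Initialized, Phase::HeadersChecked); }
    bool OnErrTerm()           { return Advance(Phase::HeadersChecked, Phase::Terminated); }

    // Page checks may run from many threads, but only after
    // ErrCheckDbHeaders and before ErrTerm.
    bool OnErrCheckDbPages() {
        std::lock_guard<std::mutex> lock(m_);
        return phase_ == Phase::HeadersChecked;
    }

    // The log check uses the same window and is allowed only once.
    bool OnErrCheckLogs() {
        std::lock_guard<std::mutex> lock(m_);
        if (phase_ != Phase::HeadersChecked || logsChecked_) return false;
        logsChecked_ = true;
        return true;
    }

private:
    // Moves to the next phase only if the current phase is the expected one.
    bool Advance(Phase expected, Phase next) {
        std::lock_guard<std::mutex> lock(m_);
        if (phase_ != expected) return false;
        phase_ = next;
        return true;
    }

    std::mutex m_;
    Phase phase_ = Phase::Created;
    bool logsChecked_ = false;
};
```

A guard like this would let the application reject an out-of-sequence call before it reaches the DLL, instead of discovering the problem from a failed consistency check.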
By combining single-threaded and multithreaded calls, and by obeying the rules described in this topic and the rest of the SDK, your backup application can check Exchange databases more quickly than it could in a purely single-threaded manner. Remember, however, that the application described in this topic is only an example, intended to illustrate how each part of the CHKSGFILES API must be handled.
This example application entails two main processes and a set of worker threads. The first process (orange) handles the overall backup job, whereas the second process (blue) handles queue requests by creating worker threads to verify the database pages and log files. Central to this system is a request and completion-status queue.
The example backup job engine process block (orange) in the diagram indicates the parts of the database consistency check that must be run in a single-threaded manner.
Your application must never allow more than one backup job engine process to operate at the same time. The CHKSGFILES APIs shown in that part of the diagram should never be called concurrently. Your application should only call CHKSGFILES consistency checks in sequence. The CHKSGFILES DLL does not support out-of-sequence or simultaneous calls to the New, ErrInit, ErrCheckDbHeaders, ErrTerm, and Delete functions. Your application should call those APIs only once for each consistency check. Only after the sequence shown in the backup job engine part of the diagram has been completed can that sequence then be restarted.
When the backup job engine section starts, it should initialize whatever request queuing mechanism is being used. This example uses a queue that manages the requests, the return status of those requests, and a separate process that scans for entries in the request queue. Because the purpose of running database consistency checks is to find errors, the queue needs to return success/failure information to the main part of the program.
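One possible shape for such a queue is sketched below. The SDK does not prescribe a queuing mechanism, so every name here (`RequestQueue`, `Request`, `RequestKind`, `Status`) is a hypothetical choice, and an in-process, mutex-protected queue is assumed only for simplicity.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

enum class RequestKind { DbPages, Logs };
enum class Status { Pending, Succeeded, Failed };

struct Request {
    RequestKind kind;
    std::size_t dbIndex;  // index of the database; unused for Logs
    Status status;
};

// Hypothetical request and completion-status queue.
class RequestQueue {
public:
    // Called by the backup job engine; returns the id of the new request.
    std::size_t Add(RequestKind kind, std::size_t dbIndex) {
        std::lock_guard<std::mutex> lock(m_);
        requests_.push_back(Request{kind, dbIndex, Status::Pending});
        unclaimed_.push_back(requests_.size() - 1);
        return requests_.size() - 1;
    }

    // Called by the queue servicing side; false if nothing is waiting.
    bool Claim(std::size_t& id, Request& out) {
        std::lock_guard<std::mutex> lock(m_);
        if (unclaimed_.empty()) return false;
        id = unclaimed_.front();
        unclaimed_.pop_front();
        out = requests_[id];
        return true;
    }

    // Called by a worker to record success or failure for its request.
    void Complete(std::size_t id, bool ok) {
        std::lock_guard<std::mutex> lock(m_);
        assert(id < requests_.size());
        requests_[id].status = ok ? Status::Succeeded : Status::Failed;
    }

    // True once every request has a completion status.
    bool AllDone() const {
        std::lock_guard<std::mutex> lock(m_);
        for (const Request& r : requests_)
            if (r.status == Status::Pending) return false;
        return true;
    }

    // True if at least one completed request reported a failure.
    bool AnyFailed() const {
        std::lock_guard<std::mutex> lock(m_);
        for (const Request& r : requests_)
            if (r.status == Status::Failed) return true;
        return false;
    }

private:
    mutable std::mutex m_;
    std::vector<Request> requests_;      // every request, indexed by id
    std::deque<std::size_t> unclaimed_;  // ids not yet handed to a worker
};
```

Keeping completed requests in the queue, rather than removing them, is what lets the backup job engine read the success/failure information afterward.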
After the queue is initialized, the backup job can use Windows VSS to take the backup, or “snapshot”. After the snapshot is successfully obtained and made available, the backup application can call the CHKSGFILES New function to obtain an instance of the API. The backup job must then call ErrInit, specifying which databases are to be checked, the log-file path, and other parameters.
Then the backup job process calls the ErrCheckDbHeaders function, to verify that all of the databases have the proper header information. It’s very important that ErrCheckDbHeaders be called only once to check all of the databases that were specified in ErrInit. For Exchange 2007, this will likely be all of the databases in a specified storage group. For Exchange 2010, this will probably be only a single database, because ErrInit accepts only a single log file path.
In this example, the single-threaded backup job engine then adds requests to the queue for the database pages in each database and adds a single request to check the log files.
The backup job engine then waits until the request queue contains completion status for all of the requests. At that point, the backup job engine calls the ErrTerm function. As with ErrCheckDbHeaders, your application must call ErrTerm only once. It’s up to the application to ensure that it has tried to check all of the database pages and log files. Don’t try to use ErrTerm to keep track of the progress: if you call ErrTerm and not all of the pages and logs have been checked, it will return an error and invalidate all checks that have been done on those databases. In this example, the backup job engine process uses the queue entries to keep track of the database pages that have already been checked.
Finally, the backup job engine can call the Delete function to dispose of the CHKSGFILES API instance. Then, based on the results, the backup application can copy the snapshot contents to the backup media and continue processing the backup job.
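The sequence followed by the backup job engine can be sketched against a stand-in object. `FakeChecker` and `RunConsistencyCheck` are hypothetical: the method names echo the functions named in this topic, but the parameters are simplified assumptions and are not the real CHKSGFILES signatures. Constructing and destroying the `FakeChecker` stands in for New and Delete, the VSS snapshot step is omitted, and a single page-check call per database is a simplification.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a CHKSGFILES instance. Signatures are
// simplified assumptions, NOT the real API.
struct FakeChecker {
    std::vector<std::string> calls;  // single-threaded calls, in order
    std::atomic<int> pageChecks{0};  // concurrent calls are only counted
    std::atomic<int> logChecks{0};

    bool ErrInit(const std::vector<std::string>&) { calls.push_back("ErrInit"); return true; }
    bool ErrCheckDbHeaders() { calls.push_back("ErrCheckDbHeaders"); return true; }
    bool ErrCheckDbPages(std::size_t /*dbIndex*/) { ++pageChecks; return true; }
    bool ErrCheckLogs() { ++logChecks; return true; }
    bool ErrTerm() { calls.push_back("ErrTerm"); return true; }
};

// Single-threaded setup, concurrent page and log checks, then
// single-threaded teardown. Error handling is simplified.
bool RunConsistencyCheck(FakeChecker& checker, const std::vector<std::string>& dbs) {
    if (!checker.ErrInit(dbs)) return false;
    if (!checker.ErrCheckDbHeaders()) return false;  // once, for all databases

    std::atomic<bool> allOk{true};
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < dbs.size(); ++i) {
        workers.emplace_back([&checker, &allOk, i] {
            if (!checker.ErrCheckDbPages(i)) allOk = false;
        });
    }
    // Exactly one log check, running alongside the page checks.
    workers.emplace_back([&checker, &allOk] {
        if (!checker.ErrCheckLogs()) allOk = false;
    });

    // Wait for every check to finish before terminating.
    for (auto& t : workers) t.join();

    bool termOk = checker.ErrTerm();  // once, after all checks were attempted
    return termOk && allOk.load();
}
```

In this sketch the job engine spawns the threads itself to keep the example short; the example application in this topic delegates that work to a separate queue servicing process.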
When CHKSGFILES was introduced in Microsoft Exchange Server 2007, the Exchange storage architecture included storage groups, which are collections of databases that you can manage as units. With Exchange 2007 servers, your application passes a list of the databases in a storage group to the ErrInit function. Because Exchange 2010 does not have storage groups, each database and log file set is kept separate. So, for checking Exchange 2010 databases, your application will very likely pass a single-element array with one database and log file path to the CHKSGFILES ErrInit function. Remember, ErrInit requires that the databases be specified in an array, even when there is only one database to be checked.
For an individual database, you can run the page checks concurrently to speed up processing. But keep in mind that to check multiple Exchange 2010 databases, your application must call the New and ErrInit functions separately for each database, and it must handle the different instances of the CHKSGFILES API separately. Just as with a single set of databases sent to ErrInit, you must follow the single-threading and sequencing rules for the API for each instance of the CHKSGFILES API.
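As a rough illustration of the one-instance-per-database requirement, the following sketch creates a separate stand-in object for each Exchange 2010 database and passes each one a single-element array. `FakeChecker`, `CheckerPtr`, `DatabaseTarget`, and `CreateCheckers` are hypothetical, and the `ErrInit` parameters shown are simplified assumptions rather than the real signature. Pairing New with Delete through a `std::unique_ptr` deleter is a design choice of this sketch, not something the SDK requires.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a CHKSGFILES instance. It counts live
// instances so that the New/Delete pairing can be observed.
struct FakeChecker {
    static int liveInstances;
    static FakeChecker* New() { ++liveInstances; return new FakeChecker(); }
    void Delete() { --liveInstances; delete this; }

    // Simplified assumption: an array of databases plus one log path.
    bool ErrInit(const std::vector<std::string>& databases, const std::string& logPath) {
        databases_ = databases;
        logPath_ = logPath;
        return !databases.empty();
    }

    std::vector<std::string> databases_;
    std::string logPath_;
};
int FakeChecker::liveInstances = 0;

// Ensures that Delete is called exactly once for each instance.
struct CheckerDeleter {
    void operator()(FakeChecker* p) const { p->Delete(); }
};
using CheckerPtr = std::unique_ptr<FakeChecker, CheckerDeleter>;

struct DatabaseTarget {
    std::string dbPath;
    std::string logPath;
};

// Creates and initializes one instance per database.
std::vector<CheckerPtr> CreateCheckers(const std::vector<DatabaseTarget>& targets) {
    std::vector<CheckerPtr> checkers;
    for (const DatabaseTarget& t : targets) {
        CheckerPtr checker(FakeChecker::New());
        // Still an array, even though it holds only one database.
        std::vector<std::string> single(1, t.dbPath);
        if (!checker->ErrInit(single, t.logPath)) continue;  // real code would report this
        checkers.push_back(std::move(checker));
    }
    return checkers;
}
```

Each returned instance would then go through its own header check, page and log checks, and termination, independently of the others.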
The majority of functions in the CHKSGFILES API must be run in a single thread, and no out-of-sequence calls can be made. As shown in the example application in this topic, the CHKSGFILES functions that can be run multithreaded are handled in the Queue Servicing process (blue) and in the database page and log file worker threads (green).
The CHKSGFILES API supports checking the log files by using the ErrCheckLogs function, in the same way that this API uses the ErrCheckDbPages function to check the database pages. However, your application must call ErrCheckLogs only once for each set of databases that was passed to ErrInit. Calling ErrCheckLogs more than once causes the API to return an error, and the entire consistency check will be considered to have failed, even if there are no actual database or log file errors.
When the queue servicing process starts, it begins checking for new entries in the request queue. When it sees a new request, it starts a new worker thread to service that request.
When it starts, the worker thread (green) should obtain the request information from the queue (or directly from the request queue servicing process) and then perform the check. In this example application, it is up to the worker thread to process the request appropriately: database page requests use the ErrCheckDbPages function, whereas log file requests use the ErrCheckLogs function. If the backup job engine process is running properly, there should never be more than one request for log file checks for each set of databases passed to the ErrInit function. When the check has completed, the worker thread should update the request status information in the queue, and then the thread should exit.
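The worker-thread behavior described here might look like the following sketch, which starts one thread per request and lets each thread choose the check that matches its request type. All names (`Request`, `CheckFunctions`, `ServiceRequest`, `ServiceAll`) are hypothetical, and the two callbacks stand in for application code that would call ErrCheckDbPages and ErrCheckLogs with their real parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

enum class RequestKind { DbPages, Logs };
enum class Status { Pending, Succeeded, Failed };

struct Request {
    RequestKind kind;
    std::size_t dbIndex;  // index of the database; unused for Logs
    Status status;
};

// Hypothetical callbacks standing in for the application's calls to
// ErrCheckDbPages and ErrCheckLogs.
struct CheckFunctions {
    std::function<bool(std::size_t)> checkDbPages;
    std::function<bool()> checkLogs;
};

// Body of one worker thread: run the matching check, record the result.
void ServiceRequest(Request& req, const CheckFunctions& fns) {
    bool ok = false;
    switch (req.kind) {
        case RequestKind::DbPages: ok = fns.checkDbPages(req.dbIndex); break;
        case RequestKind::Logs:    ok = fns.checkLogs();               break;
    }
    req.status = ok ? Status::Succeeded : Status::Failed;
}

// Queue servicing: start one worker per request, then wait for all of
// them to exit. Each worker writes only to its own request, and the
// results are read only after join(), so no extra locking is needed.
void ServiceAll(std::vector<Request>& requests, const CheckFunctions& fns) {
    std::vector<std::thread> workers;
    for (Request& req : requests)
        workers.emplace_back(ServiceRequest, std::ref(req), std::cref(fns));
    for (std::thread& t : workers) t.join();
}
```

This sketch assumes that the caller has queued at most one `Logs` request for each set of databases; the worker itself does not enforce that.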
When all of the dispatched threads have exited, the queue processing service can signal the backup job engine process by way of the queue. Alternatively, the backup job engine process can detect when there are no more unprocessed requests in the queue.
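If the queue servicing code and the backup job engine run in the same process, one way to signal completion is a countdown protected by a condition variable. The following is a sketch under that assumption; `CompletionTracker` and `DemoWait` are hypothetical names, and a cross-process design would need a different signaling mechanism.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical tracker: workers report completion, and the backup job
// engine blocks until every outstanding request has reported.
class CompletionTracker {
public:
    explicit CompletionTracker(std::size_t outstanding) : outstanding_(outstanding) {}

    // Called by a worker thread when its check has finished.
    void MarkDone(bool ok) {
        std::lock_guard<std::mutex> lock(m_);
        if (!ok) anyFailed_ = true;
        if (outstanding_ > 0 && --outstanding_ == 0) cv_.notify_all();
    }

    // Called by the backup job engine before it calls ErrTerm.
    // Returns true only if every request reported success.
    bool WaitForAll() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return outstanding_ == 0; });
        return !anyFailed_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::size_t outstanding_;
    bool anyFailed_ = false;
};

// Example use: three simulated workers, one of which reports a failure.
bool DemoWait() {
    CompletionTracker tracker(3);
    std::vector<std::thread> workers;
    for (int i = 0; i < 3; ++i)
        workers.emplace_back([&tracker, i] { tracker.MarkDone(i != 1); });
    bool ok = tracker.WaitForAll();
    for (std::thread& t : workers) t.join();
    return ok;
}
```

Blocking on a condition variable avoids polling the queue, at the cost of requiring every worker to report exactly once, even when its check fails.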