Performing a Parametric Sweep using the Job Submission Sample

SharePoint 2007

The Excel Services and Excel 2007 Windows Compute Cluster Server (CCS) Job Submission Sample Developer Guide shows you how to build an integrated solution using Windows CCS 2003 and Microsoft Office SharePoint Server 2007 to improve the performance and reliability of Microsoft Office Excel 2007 workbooks.

This topic provides an introduction to the parametric sweep scenario and an architectural overview of the Excel Services and Excel 2007 Windows CSS 2003 Job Submission sample.


This developer guide assumes that you are familiar with Excel Services and Windows CCS 2003. For more information about both, see Additional Resources.

Through a sample, we demonstrate how to perform a parametric sweep. Using an Excel 2007 workbook, you can create a model and use a set of parameters to iterate through it to create an aggregate of results by "sweeping" through the list of parameters, such as is performed in a Monte Carlo simulation. A parametric sweep is often performed on a single workstation, which can be a long-running and compute-intensive operation that results in calculations that take minutes, hours, or even days to complete. By moving computation to a server farm of 64-bit servers running Office SharePoint Server 2007 and Excel Services within a compute cluster, you can perform calculations in parallel. This can dramatically reduce the time to complete the calculation.

We also demonstrate how to create a custom Excel task pane add-in, and create a Web form that an end user can use to access the compute cluster server via a SharePoint document library custom shortcut menu. The task pane—the Excel Services and Excel 2007 Windows Compute Cluster Server Job Submission task pane, or "Job Submission task pane"—is a custom interface hosted within Excel 2007 that provides end users with the ability to perform immediate or scheduled Excel calculations to occur across the compute cluster. The Web form is a standard ASP.NET 2.0 application that uses the Microsoft Compute Cluster Pack (CCP) API to provide functionality similar to the task pane from a Web client.

The Excel Services and Excel 2007 Windows CCS Job Submission sample allows rapid calculation of workbooks with hundreds of inputs, and thousands of parameters per input. Using the compute cluster, you can significantly reduce the time to return results that can take hours to calculate on a desktop computer.

Architecture Overview

Figure 1 shows the architecture of the Excel Services and Excel 2007 Windows CCS Job Submission sample.

Figure 1. Excel Services and Excel 2007 Windows Submission sample architecture
CCS Job Submission sample architecture

To understand the scenario and architecture, we explore how the solution works, starting with the Excel Services and Excel 2007 CCS Job Submission task pane.

The solution contains three distinct layers and two job types, which are used for executing calls to these layers. The two job types are scheduled and immediate. The client layer provides the user with an interface to access the compute cluster and to schedule jobs or run them immediately through either the Excel 2007 task pane or the custom Web form. The task pane enables the user to prepare the input information for performing a parametric sweep, whereas the Web form is simply used for scheduling an existing set of parameters.

The head node performs both the execution of immediate jobs and the centralized job scheduling for all time-based jobs. For jobs that should be executed immediately, the head node receives job requests from the Excel task pane add-in, or from the Web form, and schedules a cluster job and manages job success or failure. The head node performs the same for a scheduled execution; it executes a Windows–based task that the Web form scheduled on the head node.

Each compute node runs Windows Compute Cluster Edition, Office SharePoint Server 2007 with Excel Services enabled, and the ExecutionEngine program. The ExecutionEngine program is a custom executable file that resides on each of the compute nodes. This program performs the workbook calculation by calling the Excel Services API programmatically. The compute nodes receive jobs from the head node, perform calculations based on a subset of the overall parameter list, and then store the results within a share located on the head node for further processing.

How the Excel Services and Excel 2007 Submission Sample Works

A user can submit a workbook to a compute cluster through the Job Submission task pane with a few clicks of the mouse. Behind the scenes, however, a complex process takes the workbook from the client computer to the final result, as shown in Figure 2.

Figure 2. Job submission process
Job submission process

Following is an ordered summary of the required steps to perform a parametric sweep using an Excel workbook across a compute cluster. For additional information and descriptions, see Job Submission Task Pane Add-in. The Excel 2007 workbook is stored in a SharePoint document library, which is configured as a trusted file location. This allows the workbook to be accessible to all SharePoint servers running as compute nodes.

Stage 1a

When a user clicks the Submit Job Now button on the Job Submission task pane, a custom XML file, Workbook Job XML, is created. This file contains all input parameters to use during the parametric sweep. The task pane is used to create jobs that are scheduled for immediate execution on the compute cluster. As each calculation is completed, the compute node results are returned to the client computer running Excel 2007.

Stage 1b

Jobs can also be scheduled through a custom Web form hosted on Office SharePoint Server, which is accessed through the workbook's shortcut (context) menu. Scheduled jobs are set to calculate on the compute cluster at a fixed or on a recurring basis. The Workbook Job XML file is stored on a share, typically on the head node named SOURCESHARE.

Stage 2a

We determine how the work is divided among the compute nodes by using a divisor. The divisor can be fixed and set by the user submitting the job, or set to be distributed automatically and evenly across all available compute nodes.

Stage 2b

The Workbook Job XML file is used in combination with the divisor to divide the work into individual job tasks. After division, an individual Workbook Task XML file is created containing only those inputs to calculate for an individual job task. The XML file is stored on the HEADNODESHARE share located on the cluster's head node.

Stage 3a

A CCS Job XML file is created that describes the job and tasks within it that the cluster performs. The CCS Job XML file is stored on the HEADNODESHARE share for access from the compute cluster nodes during calculation. For the immediate jobs, the CCS Job XML file creates a cluster job that is executed immediately by the job scheduler running on the head node. Results from each of the compute nodes are collected programmatically by using the available API events; collection is performed within the Excel task pane. Therefore, no additional CCS result collection task is necessary in the CCS Job XML file. Scheduled jobs contain an additional work task that collects the calculation results performed by the preceding tasks. This additional CCS task is the last one to execute within the CCS Job XML file.

Stage 4

Execution of the calculation tasks occurs across all available compute nodes. The ExecutionEngine program opens the Workbook Job XML file stored on the head node. Using ExecutionEngine, the compute node opens the workbook from the SharePoint document library by calling the Excel Web Services API. The program provides each individual input in an input row of data to the Web service, performs the Excel Services calculation, and then collects and stores the calculated results in the Workbook Job XML file located on the head node share.

Stage 5

For immediate jobs, the results are collected by using the Excel Calculation Services on Compute Cluster Server API and placed at the results location within the workbook as calculations are completed.

Stage 6

For scheduled jobs, the additional result collection CCS Task is executed, and results are collected and placed within the template workbook residing in the SharePoint document library.

At the end of a job, the end user has access to an Excel 2007 workbook that contains the completed parametric sweep results. By performing these computations as a set of tasks running in parallel across a set of servers running Office SharePoint Server in a compute cluster, results can be returned to the end user more quickly. This relieves the desktop computer of long-running and compute-intensive calculations, expedites the timeliness of results, and provides fault tolerance and scalability.

See Also

Community Additions