Profiling HPC Applications
Maxim Goldin, Senior Developer, Visual Studio Diagnostics, Microsoft Corporation
This white paper introduces profiling of High Performance Computing (HPC) applications and explains how to configure this profiling mode and obtain the results for analysis. The paper does not discuss HPC technology, terminology, APIs, or specific HPC applications; it assumes that the reader is already familiar with those topics.
A new feature of Visual Studio 2010 Profiling Tools is the ability to collect performance information from High Performance Computing (HPC) servers and applications. The Microsoft HPC Pack can be used to schedule jobs on servers running Windows HPC Server. Several machines are connected into a logical cluster and can be used to simultaneously execute a distributed application. Optimizing these applications to efficiently utilize the cluster is a crucial part of the development lifecycle.
HPC profiling automatically deploys the target application onto the cluster nodes and runs the application in this distributed environment. The application runs on several nodes, but it is being profiled on only one of them. When the application has terminated, the profiling data file is copied from the profiled machine to the development machine and is opened in Visual Studio 2010 for analysis.
A prerequisite for profiling cluster nodes is that Microsoft .NET Framework 4, available on the Microsoft Download Center, and the Visual Studio 2010 Performance Tools are installed on the target cluster nodes. For more information, see How to: Install the Stand-Alone Profiler.
HPC profiling currently works only in the Sampling profiling mode, and only a single compute node can be profiled at any given time.
HPC Performance Wizard
To configure HPC profiling, open your HPC project and then select "Launch HPC Performance Wizard" from the Analyze menu of Visual Studio 2010, or click the "Launch HPC Performance Wizard" button in the Performance Explorer. This option is available only if the HPC Pack is installed on your development machine.
Click Next on the screen that appears.
Select one of the projects that are open in Visual Studio 2010 to profile, or select "An executable (.EXE) file" to profile a stand-alone application, and then click Next.
The next wizard screen asks you to set the Remote working directory and Deployment location.
The Deployment location represents the file share where all necessary files are copied prior to execution. This includes the executable to be profiled, any runtime dependencies, and all additional data files included in the deployment. This directory must be accessible by all nodes of the cluster, because the files can either be copied locally to the nodes or executed from the share.
Setting the Deployment location is optional. For remote execution, the default value of the Deployment location is set to \\$(HeadNode)\CcpSpoolDir\$(UserName)\$(ProjectName).
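As an illustration, the default value can be read as a template in which each $(Name) macro is replaced by a value from the session. The following Python sketch expands such a template. The function name expand_macros and the sample values (headnode01, alice) are hypothetical; this is not how Visual Studio or the HPC Pack necessarily performs the substitution.

```python
import re

# Default template quoted from the text above.
DEFAULT_DEPLOYMENT_LOCATION = r"\\$(HeadNode)\CcpSpoolDir\$(UserName)\$(ProjectName)"


def expand_macros(template, values):
    """Replace each $(Name) macro in template with values[Name].

    Hypothetical helper for illustration; unknown macros are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))

    return re.sub(r"\$\((\w+)\)", substitute, template)


if __name__ == "__main__":
    sample = {
        "HeadNode": "headnode01",    # assumed head node name
        "UserName": "alice",         # assumed user name
        "ProjectName": "ComputePI",  # project name used in the sample output
    }
    print(expand_macros(DEFAULT_DEPLOYMENT_LOCATION, sample))
```

Leaving unknown macros untouched, rather than raising an error, makes it easy to spot a value that was never supplied.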
The Remote working directory specifies a local directory on the node machine to which the files in the Deployment location are copied. If the value of the Remote working directory is explicitly set, the deployment phase of the profiling session copies the files from the Deployment location to the working directory of the profiling node, and the profiling node executes the application in this local directory. Otherwise, the working directory is considered to be the same as the Deployment location, and no additional copy takes place.
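The rule for choosing the directory in which the application runs can be summarized in a few lines. The Python sketch below is only a restatement of the paragraph above; the function and parameter names are hypothetical and do not correspond to any Visual Studio API.

```python
def resolve_execution_directory(deployment_location, remote_working_directory=None):
    """Return (execution_directory, copy_needed).

    If a remote working directory is set, files are copied there and the
    application runs in it. Otherwise the application runs directly from
    the deployment location and no extra copy is needed.
    """
    if remote_working_directory:
        return remote_working_directory, True
    return deployment_location, False


if __name__ == "__main__":
    share = r"\\headnode01\CcpSpoolDir\alice\ComputePI"  # assumed sample share
    print(resolve_execution_directory(share))
    print(resolve_execution_directory(share, r"C:\Work\ComputePI"))
```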
The next page of the wizard provides additional configuration options. First, a head node must be selected. You can either type the node name or select it from a drop-down list, which is populated with the names of the available HPC cluster head nodes. The "Number of processes" is the number of processes that the job scheduler starts for the target application. The job scheduler then determines how the processes are assigned to resources on the cluster, so the number of processes is ultimately limited by the cluster resources.
Additional profiling options on the page allow choosing between profiling by rank and profiling on a specific node. Click Next to step to the final page of the wizard, which will notify you that you provided all necessary information. Click Finish on the last page to start profiling.
After you click Finish, a Performance Session is created in the Visual Studio Performance Explorer, and profiling begins.
HPC Advanced Properties of Performance Session
The Performance Wizard allows you to specify the required values of your session configuration. More advanced options of the session can be configured through the session property pages. To set advanced profiling options before profiling starts, clear the "Launch profiling after the wizard finishes" check box in the wizard.
After a performance session is created, you can specify advanced configuration options on the performance session property pages. Right-click the session name in Performance Explorer and then click Properties.
The Property Pages dialog box displays two HPC-related tabs. On the "HPC Launch Properties" tab, you can modify the options that are provided through the Performance Wizard.
On the "HPC Advanced Properties" tab, you can specify options that provide more detailed control over the performance session and the execution of the target application.
On this tab, you can modify the following values:
Profiling starts when you click the Launch button in Performance Explorer. The steps that the profiler executes differ slightly from those in profiling scenarios that do not target HPC applications. Because the application is executed and profiled on a different machine, it is necessary to deploy the application, profile it remotely, and copy the results of the profiling run back to the local machine. The primary purpose of Visual Studio 2010 running on the development machine is to serve as a controller of the profiling process.
Depending on the configuration of the performance session, the complete HPC profiling scenario consists of the following steps:
1. Execute the Pre-Profile script on the development machine where Visual Studio is installed (optional).
2. Validate the profiling configuration (for example, at least one compute node of the cluster must be available in order to successfully run and profile an HPC application).
3. Deploy all specified files from the local machine to the Deployment share.
4. Copy all files from the Deployment share to the specified working directory on the profiling node (optional).
5. Run the application. Multiple distributed processes are created, and only one of them runs under the profiler. To achieve this, MPIExec.exe is used to launch the application processes on all nodes.
6. Clean up the working directory of the profiling node (optional).
7. Copy the results from the execution directory (either the Deployment share or the working directory) to the local development machine.
8. Clean up the Deployment share.
9. Execute the Post-Profile script on the development machine where Visual Studio is installed (optional).
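To make the optional steps concrete, the sequence above can be sketched as a function that returns the steps that apply to a given session configuration. This is an illustrative Python model only: the SessionConfig class, its field names, and the plan_steps function are hypothetical and are not part of the Visual Studio object model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SessionConfig:
    """Hypothetical session settings; field names are illustrative only."""
    deployment_location: str
    remote_working_directory: Optional[str] = None
    pre_profile_script: Optional[str] = None
    post_profile_script: Optional[str] = None
    clean_working_directory: bool = False


def plan_steps(config: SessionConfig) -> List[str]:
    """Return the ordered steps that apply to the given configuration."""
    uses_workdir = bool(config.remote_working_directory)
    # The application runs in the working directory if one is set,
    # otherwise directly in the deployment location.
    execution_dir = config.remote_working_directory or config.deployment_location

    steps = []
    if config.pre_profile_script:
        steps.append(f"Run pre-profile script {config.pre_profile_script}")
    steps.append("Validate the profiling configuration")
    steps.append(f"Deploy files to {config.deployment_location}")
    if uses_workdir:
        steps.append(f"Copy files to working directory {execution_dir}")
    steps.append("Run the application; one process runs under the profiler")
    if uses_workdir and config.clean_working_directory:
        steps.append(f"Clean up working directory {execution_dir}")
    steps.append(f"Copy results from {execution_dir} to the development machine")
    steps.append(f"Clean up deployment share {config.deployment_location}")
    if config.post_profile_script:
        steps.append(f"Run post-profile script {config.post_profile_script}")
    return steps


if __name__ == "__main__":
    config = SessionConfig(
        deployment_location=r"\\headnode01\CcpSpoolDir\alice\ComputePI",  # assumed
        remote_working_directory=r"C:\Work\ComputePI",                    # assumed
        clean_working_directory=True,
    )
    for number, step in enumerate(plan_steps(config), start=1):
        print(f"{number}. {step}")
```

With every optional setting supplied, the function returns all nine steps in the order listed above; with only a deployment location, it returns the five mandatory ones.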
After the profiling run, the profiler data file is added to the Reports list of the Performance Session and the file is opened by Visual Studio 2010 for analysis. The file is opened from the local staging directory, where it was copied in step 7 above. The file contains sampling profiling data from one process running on one of the cluster nodes.
The Output window of Visual Studio 2010 reports the progress of the scenario.
After you start profiling, you see the following messages in the Output window:
Profiling started.
Starting Pre-script command C:\projects\ComputePI\PreProfilingScript.cmd
Starting deployment to \\<computername>\CcpSpoolDir\<username>\ComputePI
Starting remote run "\\<computername>\CcpSpoolDir\<username>\ComputePI\deploy_to_workdir.bat"
Starting remote run mpiexec.exe -n 1 "\\<computername>\CcpSpoolDir\<username>\ComputePI\HpcProfile.bat"
You might get a request for your credentials in order to validate your access to the cluster.
After your access is granted, the profiling process continues. (Tip: Select "Remember my password" to avoid being prompted again when you work with the same cluster.) When the target application terminates, Visual Studio 2010 is notified and collects the profiling results.
Starting remote run "\\<computername>\CcpSpoolDir\<username>\ComputePI\delete_from_workdir.bat"
Starting copying *.* files from C:\projects\ComputePI\obj\Debug to C:\Users\<username>\AppData\Local\Temp\HpcProfiling
Starting moving file \\<computername>\CcpSpoolDir\<username>\ComputePI\ComputePI100311.vsp to C:\Users\<username>\AppData\Local\Temp\HpcProfiling\ComputePI100311.vsp
Cleaning deployment at \\<computername>\CcpSpoolDir\<username>\ComputePI
Starting Post-script command C:\projects\ComputePI\PostProfilingScript.cmd
Profiling finished.
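If you save the Output window text, the local path of the collected .vsp file can be recovered from the "Starting moving file" message. The Python sketch below assumes that the message has exactly the form shown in the sample output above; the actual wording may differ between versions, and the function name find_report_path is hypothetical.

```python
import re
from typing import Optional

# Pattern derived from the sample message above (an assumption, not a
# documented format): "Starting moving file <source>.vsp to <target>.vsp"
MOVE_PATTERN = re.compile(
    r"^Starting moving file (?P<source>.+?\.vsp) to (?P<target>.+\.vsp)$"
)


def find_report_path(output_text: str) -> Optional[str]:
    """Return the local target path of the moved .vsp file, or None."""
    for line in output_text.splitlines():
        match = MOVE_PATTERN.match(line.strip())
        if match:
            return match.group("target")
    return None


if __name__ == "__main__":
    # Sample text with assumed computer and user names.
    sample = "\n".join([
        r"Starting moving file \\node01\CcpSpoolDir\alice\ComputePI\ComputePI100311.vsp"
        r" to C:\Users\alice\AppData\Local\Temp\HpcProfiling\ComputePI100311.vsp",
        "Profiling finished.",
    ])
    print(find_report_path(sample))
```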
You now have the basic skills needed to profile HPC servers and applications using the Visual Studio 2010 Profiling Tools.