Improving performance in AdventureWorks Shopper (Windows Store business apps using C#, XAML, and Prism)
Users of Windows Store apps expect their apps to remain responsive and feel natural when they use them. In this article we discuss general performance best practices for the AdventureWorks Shopper reference implementation.
After you download the code, see Getting started with AdventureWorks Shopper for instructions on how to compile and run the reference implementation, and for an overview of the Microsoft Visual Studio solution structure.
You will learn
- The differences between performance and perceived performance.
- Guidelines that help to create a well-performing, responsive app.
- Recommended strategies for profiling an app.
Applies to
- Windows Runtime for Windows 8
- Extensible Application Markup Language (XAML)
Users have a number of expectations for apps. They want immediate responses to touch, clicks, and key presses. They expect animations to be smooth. They expect that they'll never have to wait for the app to catch up with them. Performance problems show up in various ways. They can reduce battery life, cause panning and scrolling to lag behind the user's finger, or make the app appear unresponsive for a period of time. The following list summarizes the decisions to make when planning a well-performing, responsive app:
- Should I optimize actual app performance or perceived app performance?
- What performance tools should I use to discover performance-related issues?
- When should I take performance measurements?
- What devices should I take performance measurements on?
- Do I need to completely understand the platform to determine where to improve app performance?
Optimizing performance is more than just implementing efficient algorithms. Another way to think about performance is to consider the user's perception of app performance. The user's app experience can be separated into three categories—perception, tolerance, and responsiveness.
- Perception. User perception of performance can be defined as how favorably they recall the time it took to perform their tasks within the app. This perception doesn't always match reality. Perceived performance can be improved by reducing the amount of time between activities that the user needs to perform to accomplish a task, and by allowing computationally intensive operations to execute without blocking the user from performing other activities.
- Tolerance. A user's tolerance for delay depends on how long the user expects an operation to take. For example, a user might find sending data to a web service intolerable if the app becomes unresponsive during this process, even for a few seconds. You can increase a user's tolerance for delay by identifying tasks in your app that require substantial processing time and limiting or eliminating user uncertainty during those tasks by providing a visual indication of progress. And you can use async APIs to avoid making the app appear frozen.
- Responsiveness. Responsiveness of an app is relative to the activity being performed. To measure and rate the performance of an activity, you must have a time interval to compare it against. We used the guideline that if an activity takes longer than 500 milliseconds, the app might need to provide feedback to the user in the form of a visual indication of progress.
Therefore, both actual app performance and perceived app performance should be optimized in order to deliver a well-performing, responsive app.
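The 500-millisecond guideline mentioned above can be sketched in C# as follows. This is a minimal illustration and not code from AdventureWorks Shopper: the ProgressGuideline class, the RunWithFeedbackAsync method, and the showProgress and hideProgress callbacks are hypothetical names. The sketch shows a progress indicator only when the operation is still running after the threshold, so fast operations never flash an indicator.

```csharp
using System;
using System.Threading.Tasks;

public static class ProgressGuideline
{
    // Threshold taken from the guideline in the text.
    public static readonly TimeSpan FeedbackThreshold = TimeSpan.FromMilliseconds(500);

    // Awaits 'work'. If it is still running after the threshold, calls
    // 'showProgress' once, and calls 'hideProgress' when the work ends.
    public static async Task<T> RunWithFeedbackAsync<T>(
        Task<T> work, Action showProgress, Action hideProgress)
    {
        Task finishedFirst = await Task.WhenAny(work, Task.Delay(FeedbackThreshold));
        if (finishedFirst == work)
        {
            return await work; // Fast path: no indicator needed.
        }

        showProgress();
        try
        {
            return await work;
        }
        finally
        {
            hideProgress(); // Runs whether the work succeeded or failed.
        }
    }
}
```

In a page or view model, the callbacks would typically toggle a property that a progress ring is bound to.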
One technique for determining where code optimizations have the greatest effect in reducing performance problems is to perform app profiling. The profiling tools for Windows Store apps enable you to measure, evaluate, and find performance-related issues in your code. The profiler collects timing information for apps by using a sampling method that collects CPU call stack information at regular intervals. Profiling reports display information about the performance of your app and help you navigate through the execution paths of your code and the execution cost of your functions so that you can find the best opportunities for optimization. For more info see How to profile Visual C++, Visual C#, and Visual Basic code in Windows Store apps on a local machine. To learn how to analyze the data returned from the profiler see Analyzing performance data for Visual C++, Visual C#, and Visual Basic code in Windows Store apps.
In addition to using profiling tools to measure app performance, we also used PerfView and Windows Performance Analyzer (WPA). PerfView is a performance analysis tool that helps isolate CPU and memory-related performance issues. WPA is a set of performance monitoring tools used to produce performance profiles of apps. We used both of these tools for a general diagnosis of the app’s performance. For more info about PerfView see PerfView Tutorial. For more info about WPA see Windows Performance Analyzer.
Measuring your app's performance during the early stages of development can add enormous value to your project. We recommend that you measure performance as soon as you have code that performs meaningful work. Early measurements give you a good idea of where the high costs in your app are, and can inform design decisions. It can be very costly to change design decisions later on in the project. Measuring performance late in the product cycle can result in last-minute changes and poor performance. For more info see General best practices for performance.
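For early, coarse measurements before a full profiling session, a small timing helper can be enough. The following C# sketch is an assumption about how such a helper might look, not part of the reference implementation; the Timing class and MeasureAverage method are hypothetical names. It uses System.Diagnostics.Stopwatch and performs one warm-up run so that first-call costs are not counted.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Runs 'action' once as a warm-up, then returns the average elapsed
    // time over 'iterations' further runs.
    public static TimeSpan MeasureAverage(Action action, int iterations)
    {
        if (iterations <= 0)
        {
            throw new ArgumentOutOfRangeException("iterations");
        }

        action(); // Warm-up run, so first-call costs such as JIT compilation are excluded.

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        stopwatch.Stop();

        return TimeSpan.FromTicks(stopwatch.Elapsed.Ticks / iterations);
    }
}
```

A helper like this only gives a rough figure for one code path; the sampling profiler remains the better tool for finding where time is spent across the whole app.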
At a minimum, take performance measurements on hardware that has the lowest anticipated specifications. Windows 8 runs on a wide variety of devices, and taking performance measurements on one type of device won't always show the performance characteristics of other form factors.
You don't need to completely understand the platform to determine where you might need to improve performance. By knowing what parts of your code execute most frequently, you can determine the best places to optimize your app.
A well-performing app responds to user actions quickly, and with no noticeable delay. We spent much time learning what works and what doesn't work when creating a responsive Windows Store app. Here are some things to remember:
- Limit the startup time.
- Emphasize responsiveness.
- Trim resource dictionaries.
- Optimize the element count.
- Reuse identical brushes.
- Use independent animations.
- Minimize the communication between the app and the web service.
- Limit the amount of data downloaded from the web service.
- Use UI virtualization.
- Avoid unnecessary termination.
- Keep your app's memory usage low when it's suspended.
- Reduce battery consumption.
- Minimize the amount of resources that your app uses.
- Limit the time spent in transition between managed and native code.
- Reduce garbage collection time.
It's important to limit how much time the user spends waiting while your app starts. There are a number of techniques you can use to do this:
- You can dramatically improve the loading time of an app by packaging its contents locally, including XAML, images, and any other important resources. If an app needs a particular file at initialization, you can reduce the overall startup time by loading it from disk instead of retrieving it remotely.
- You should only reference assemblies that are necessary to the launch of your app in startup code so that the common language runtime (CLR) doesn't load unnecessary modules.
- Defer loading large in-memory objects while the app is activating. If you have large tasks to complete, provide a custom splash screen so that your app can accomplish these tasks in the background.
In addition, apps have different startup performance at first install and at steady state. When your app is first installed on a user's machine, it is executed using the CLR's just-in-time (JIT) compiler. This means that the first time a method is executed, it has to wait to be compiled. Later, a pre-compilation service pre-compiles all of the modules that have been loaded on a user's machine, typically within 24 hours. After this service has run, most methods no longer need to be JIT compiled, and your app benefits from improved startup performance. For more info see Minimize startup time.
Don't block your app with synchronous APIs, because if you do, the app can't respond to new events while the API is executing. Instead, use asynchronous APIs that execute in the background and inform the app when they've completed by raising an event. For more info see Keep the UI thread responsive.
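In C#, the usual way to consume an asynchronous API is the await keyword, which resumes the calling method when the operation completes. The following sketch illustrates the idea for CPU-bound work; the OrderTotals class and its methods are hypothetical and are not taken from AdventureWorks Shopper.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class OrderTotals
{
    // CPU-bound work that would freeze the UI if a large input were
    // processed directly on the UI thread.
    public static decimal SumPrices(decimal[] prices)
    {
        return prices.Sum();
    }

    // Asynchronous wrapper: Task.Run moves the calculation to a
    // thread-pool thread, so the calling thread stays free while it runs.
    public static Task<decimal> SumPricesAsync(decimal[] prices)
    {
        return Task.Run(() => SumPrices(prices));
    }
}
```

An event handler marked async could then write `decimal total = await OrderTotals.SumPricesAsync(prices);` and continue to process input while the sum is computed.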
App-wide resources should be stored in the Application object to avoid duplication, but if you use a resource in a single page that is not the initial page, put the resource in the resource dictionary of that page. This reduces the amount of XAML the framework parses when the app starts. For more info see Optimize loading XAML.
The XAML framework is designed to display thousands of objects, but reducing the number of elements on a page will make your app render faster. You can reduce a page’s element count by avoiding unnecessary elements, and collapsing elements that aren't visible. For more info see Optimize loading XAML.
Create commonly used brushes as root elements in a resource dictionary, and then refer to those objects in templates as needed. XAML will be able to use the same objects across the different templates and memory consumption will be less than if the brushes were duplicated in templates. For more info see Optimize loading XAML.
An independent animation runs independently from the UI thread. Many of the animation types used in XAML are composed by a composition engine that runs on a separate thread, with the engine’s work being offloaded from the CPU to the graphics processing unit (GPU). Moving animation composition to a non-UI thread means that the animation won’t jitter or be blocked by the app working on the UI thread. Composing the animation on the GPU greatly improves performance, allowing animations to run at a smooth and consistent frame rate.
You don’t need additional markup to make your animations independent. The system determines when it's possible to compose the animation independently, but there are some limitations for independent animations. For more info see Make animations smooth.
In order to reduce the interaction between the AdventureWorks Shopper reference implementation and its web service, as much data as possible is retrieved in a single call. For example, instead of retrieving product categories in one web service call, and then retrieving products for a category in a second web service call, AdventureWorks Shopper retrieves a category and its products in a single web service call.
In addition, the AdventureWorks Shopper reference implementation uses the TemporaryFolderCacheService class to cache data from the web service to the temporary app data store. This helps to minimize the communication between the app and the web service, provided that the cached data isn't stale. For more info see Caching data.
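The idea of serving cached data until it becomes stale can be sketched as follows. This is a simplified in-memory illustration under stated assumptions, not the TemporaryFolderCacheService class: the ExpiringCache type, its GetOrFetchAsync method, and the injectable clock are all hypothetical, and the sketch is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical in-memory cache; not the TemporaryFolderCacheService class.
public sealed class ExpiringCache<T>
{
    private sealed class Entry
    {
        public T Value;
        public DateTimeOffset StoredAt;
    }

    private readonly Dictionary<string, Entry> _entries = new Dictionary<string, Entry>();
    private readonly TimeSpan _maxAge;
    private readonly Func<DateTimeOffset> _now;

    // 'now' lets callers supply a clock; it defaults to the system clock.
    public ExpiringCache(TimeSpan maxAge, Func<DateTimeOffset> now = null)
    {
        _maxAge = maxAge;
        _now = now ?? (() => DateTimeOffset.UtcNow);
    }

    // Returns the cached value if it is younger than the maximum age.
    // Otherwise calls 'fetch' (for example, a web service request) and
    // stores the result for later calls.
    public async Task<T> GetOrFetchAsync(string key, Func<Task<T>> fetch)
    {
        Entry entry;
        if (_entries.TryGetValue(key, out entry) && _now() - entry.StoredAt < _maxAge)
        {
            return entry.Value;
        }

        T value = await fetch();
        _entries[key] = new Entry { Value = value, StoredAt = _now() };
        return value;
    }
}
```

The maximum age is the design decision that matters here: a longer value saves more web service calls but increases the chance of showing outdated data.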
The GetRootCategoriesAsync method in the ProductCatalogRepository class retrieves data for display on the HubPage.
The call to the GetRootCategoriesAsync method specifies the maximum number of products to be returned for each category. This parameter can be used to limit the amount of data downloaded from the web service, by avoiding returning an indeterminate number of products for each category. For more info see Consuming the data.
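To illustrate the effect of a per-category limit, the following C# sketch uses a hypothetical in-memory stand-in for the web service. It is not the AdventureWorks Shopper code: the Category and InMemoryCatalog types are invented for this example, and only the GetRootCategoriesAsync method name and the idea of a maximum product count come from the text.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical data type for this sketch.
public sealed class Category
{
    public string Title { get; set; }
    public IReadOnlyList<string> Products { get; set; }
}

// Hypothetical stand-in for the web service.
public sealed class InMemoryCatalog
{
    private readonly Dictionary<string, string[]> _data;

    public InMemoryCatalog(Dictionary<string, string[]> data)
    {
        _data = data;
    }

    // Returns every category, but at most 'maxProductsPerCategory'
    // products for each one, so the response size has a known upper bound.
    public Task<IReadOnlyList<Category>> GetRootCategoriesAsync(int maxProductsPerCategory)
    {
        IReadOnlyList<Category> result = _data
            .Select(pair => new Category
            {
                Title = pair.Key,
                Products = pair.Value.Take(maxProductsPerCategory).ToList()
            })
            .ToList();
        return Task.FromResult(result);
    }
}
```

With a real web service, the limit would normally be sent as part of the request so that the trimming happens on the server and the extra products are never transmitted.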
UI virtualization enables controls that derive from ItemsControl (that is, controls that can be used to present a collection of items) to only load into memory those UI elements that are near the viewport, or visible region of the control. As the user pans through the collection, elements that were previously near the viewport are unloaded from memory and new elements are loaded.
Controls that derive from ItemsControl, such as ListView and GridView, perform UI virtualization by default. XAML generates the UI for the item and holds it in memory when the item is close to being visible on screen. When the item is no longer being displayed, the control reuses that memory for another item that is close to being displayed.
If you restyle an ItemsControl to use a panel other than its default panel, the control continues to support UI virtualization as long as it uses a virtualizing panel. Standard virtualizing panels include WrapGrid and VirtualizingStackPanel. Using standard non-virtualizing panels, which include VariableSizedWrapGrid and StackPanel, disables UI virtualization for that control.
UI virtualization is not supported for grouped data. If performance is an issue, limit the size of your groups or, if a group contains many items, use another display strategy for group detail views, such as SemanticZoom.
In addition, make sure that the UI objects that are created are not overly complex. As items come into view, the framework must update the elements in cached item templates with the data of the items coming onto the screen. Reducing the complexity of those XAML trees can pay off both in the amount of memory needed to store the elements and the time it takes to data bind and propagate the individual properties within the template. This reduces the amount of work that the UI thread must perform, which helps to ensure that items appear immediately in a collection that a user pans through. For more info see Load, store, and display large sets of data efficiently.
An app can be suspended when the user moves it to the background or when the system enters a low power state. When the app is being suspended, it raises the Suspending event and has up to 5 seconds to save its data. If the app's Suspending event handler doesn't complete within 5 seconds, the system assumes that the app has stopped responding and terminates it. A terminated app has to go through the startup process again instead of being immediately loaded into memory when a user switches to it.
The AdventureWorks Shopper reference implementation saves page state while navigating away from a page, rather than saving all page state on suspension. This reduces the amount of time that it takes to suspend the app, and hence reduces the chance of the system terminating the app during suspension. In addition, AdventureWorks Shopper does not use page caching. This prevents views that are not currently active from consuming memory, which would increase the chance of termination when suspended. For more info see Minimize suspend/resume time and Handling suspend, resume and activation.
When your app resumes from suspension, it reappears nearly instantly. But when your app restarts after being closed, it might take longer to appear. So preventing your app from being closed when it's suspended can help to manage the user's perception and tolerance of app responsiveness.
When your app begins the suspension process, it should free any large objects that can be easily rebuilt when it resumes. Doing so helps to keep your app's memory footprint low, and reduces the likelihood that Windows will terminate your app after suspension. For more info see Minimize suspend/resume time and Handling suspend, resume and activation.
The CPU is a major consumer of battery power on devices, even at low utilization. Windows 8 tries to keep the CPU in a low power state when it is idle, but activates it as required. While most of the performance tuning that you undertake will naturally reduce the amount of power that your app consumes, you can further reduce your app's consumption of battery power by ensuring that it doesn't unnecessarily poll for data from web services and sensors. For more info see General best practices for performance.
Windows has to accommodate the resource needs of all Windows Store apps by using the Process Lifetime Management (PLM) system to determine which apps to close in order to allow other apps to run. A side effect of this is that if your app requests a large amount of memory, other apps might be closed, even if your app then frees that memory soon after requesting it. Minimize the amount of resources that your app uses so that the user doesn't begin to attribute any perceived slowness in the system to your app. For more info see Improve garbage collection performance and Garbage Collection and Performance.
Most of the Windows Runtime APIs are implemented in native code. This has an implication for Windows Store apps written in managed code, because any Windows Runtime invocation requires that the CLR transitions from a managed stack frame to a native stack frame and marshals function parameters to representations accessible by native code. While this overhead is negligible for most apps, if you make many calls to Windows Runtime APIs in the critical path of an app, this cost can become noticeable. Therefore, you should try to ensure that the time spent in transition between languages is small relative to the execution of the rest of your code.
The .NET for Windows Store apps types don't incur this interop cost. You can assume that types in namespaces that begin with "Windows." are part of the Windows Runtime, and types in namespaces that begin with "System." are .NET types.
If your app is slow because of interop overhead, you can improve its performance by reducing calls to Windows Runtime APIs on critical code paths. For example, if a collection is frequently accessed, then it is more efficient to use a collection from the System.Collections namespace, rather than a collection from the Windows.Foundation.Collections namespace. For more info see Keep your app fast when you use interop.
Windows Store apps written in managed code get automatic memory management from the .NET garbage collector. The garbage collector determines when to run by balancing the memory consumption of the managed heap with the amount of work a garbage collection needs to do. Frequent garbage collections can contribute to increased CPU consumption, and therefore increased power consumption, longer loading times, and decreased frame rates in your app.
If you have an app with a managed heap size that's substantially larger than 100 MB, you should attempt to reduce the amount of memory you allocate directly in order to reduce the frequency of garbage collections. For more info see Improve garbage collection performance.
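One way to see whether a code path allocates heavily is to count the garbage collections that occur while it runs. The following C# sketch is illustrative only; the AllocationCheck class and its methods are hypothetical. It compares repeated string concatenation, which allocates a new string on every iteration, with a StringBuilder, which appends into a growing buffer. The actual collection counts depend on the runtime and the machine, so the sketch makes no claim about specific numbers.

```csharp
using System;
using System.Text;

public static class AllocationCheck
{
    // Returns how many generation 0 collections ran while 'action' executed.
    public static int CountGen0Collections(Action action)
    {
        int before = GC.CollectionCount(0);
        action();
        return GC.CollectionCount(0) - before;
    }

    // Allocates a new, longer string on every iteration.
    public static string JoinWithConcat(int count)
    {
        string text = string.Empty;
        for (int i = 0; i < count; i++)
        {
            text += i.ToString();
        }
        return text;
    }

    // Appends into one growing buffer instead of creating a string per step.
    public static string JoinWithBuilder(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i);
        }
        return builder.ToString();
    }
}
```

Calling `AllocationCheck.CountGen0Collections(() => AllocationCheck.JoinWithConcat(20000))` and the same for JoinWithBuilder gives a rough comparison of the two approaches on a given device.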
When profiling your app, follow these guidelines to ensure that reliable and repeatable performance measurements are taken:
- Profile the app on the device that's capturing performance measurements both when it is plugged in and when it is running on battery power. Many systems conserve power when running on a battery, and so operate differently.
- Make sure that the total memory use on the system is less than 50 percent. If it's higher, close apps until you reach 50 percent to make sure that you're measuring the impact of your app, rather than that of other processes.
- When you remotely profile an app, we recommend that you interact with the app directly on the remote device. Although you can interact with an app via Remote Desktop Connection, doing so can significantly alter the performance of the app and the performance data that you collect. For more info, see How to profile Visual C++, Visual C#, and Visual Basic code in Windows Store apps on a remote device.
- To collect the most accurate performance results, profile a release build of your app. See How to: Set Debug and Release Configurations.
- Avoid profiling your app in the simulator because the simulator can distort the performance of your app.
Build date: 10/12/2013