How to: Use combinable to Improve Performance

This example shows how to use the Concurrency::combinable class to compute the sum of the numbers in a std::array object that are prime. The combinable class improves performance by eliminating shared state.

Example

The following example uses the std::accumulate function to compute the sum of the elements in an array that are prime. In this example, a is an array object and the is_prime function determines whether its input value is prime.

prime_sum = accumulate(a.begin(), a.end(), 0, [&](int acc, int i) {
   return acc + (is_prime(i) ? i : 0);
});

The following example shows a naïve way to parallelize the previous example. This example uses the Concurrency::parallel_for_each algorithm to process the array in parallel and a Concurrency::critical_section object to synchronize access to the prime_sum variable. This example does not scale because each thread must wait for the shared resource to become available.

critical_section cs;
prime_sum = 0;
parallel_for_each(a.begin(), a.end(), [&](int i) {
   cs.lock();
   prime_sum += (is_prime(i) ? i : 0);
   cs.unlock();
});

The following example uses a combinable object to improve the performance of the previous example. This example eliminates the need for synchronization objects; it scales because the combinable object enables each thread to perform its task independently.

A combinable object is typically used in two steps. First, produce a series of fine-grained computations by performing work in parallel. Next, combine (or reduce) the computations into a final result. This example uses the Concurrency::combinable::local method to obtain a reference to the local sum. It then uses the Concurrency::combinable::combine method and a std::plus object to combine the local computations into the final result.

combinable<int> sum;
parallel_for_each(a.begin(), a.end(), [&](int i) {
   sum.local() += (is_prime(i) ? i : 0);
});
prime_sum = sum.combine(plus<int>());

The following complete example computes the sum of prime numbers both serially and in parallel. The example prints to the console the time that is required to perform both computations.

// parallel-sum-of-primes.cpp
// compile with: /EHsc
#include <windows.h>
#include <ppl.h>
#include <array>
#include <numeric>
#include <iostream>

using namespace Concurrency;
using namespace std;

// Calls the provided work function and returns the number of milliseconds 
// that it takes to call that function.
template <class Function>
__int64 time_call(Function&& f)
{
   __int64 begin = GetTickCount();
   f();
   return GetTickCount() - begin;
}

// Determines whether the input value is prime.
bool is_prime(int n)
{
   if (n < 2)
      return false;
   for (int i = 2; i < n; ++i)
   {
      if ((n % i) == 0)
         return false;
   }
   return true;
}

int wmain()
{   
   // Create an array object that contains 200000 integers.
   array<int, 200000> a;

   // Initialize the array such that a[i] == i.
   int n = 0;
   generate(a.begin(), a.end(), [&] {
      return n++;
   });

   int prime_sum;
   __int64 elapsed;

   // Compute the sum of the numbers in the array that are prime.
   elapsed = time_call([&] {
      prime_sum = accumulate(a.begin(), a.end(), 0, [&](int acc, int i) {
         return acc + (is_prime(i) ? i : 0);
      });
   });   
   wcout << prime_sum << endl;   
   wcout << L"serial time: " << elapsed << L" ms" << endl << endl;

   // Now perform the same task in parallel.
   elapsed = time_call([&] {
      combinable<int> sum;
      parallel_for_each(a.begin(), a.end(), [&](int i) {
         sum.local() += (is_prime(i) ? i : 0);
      });
      prime_sum = sum.combine(plus<int>());
   });
   wcout << prime_sum << endl;
   wcout << L"parallel time: " << elapsed << L" ms" << endl << endl;
}

The following sample output is for a computer that has four processors.

1709600813
serial time: 6178 ms

1709600813
parallel time: 1638 ms

Compiling the Code

To compile the code, copy it and then paste it in a Visual Studio project, or paste it in a file that is named parallel-sum-of-primes.cpp and then run the following command in a Visual Studio Command Prompt window.

cl.exe /EHsc parallel-sum-of-primes.cpp

See Also

Reference

combinable Class

critical_section Class

Concepts

Parallel Containers and Objects