Share via


How to: Write a parallel_for_each Loop

This example shows how to use the concurrency::parallel_for_each algorithm to compute the count of prime numbers in a std::array object in parallel.

Example

The following example computes the count of prime numbers in an array two times. The example first uses the std::for_each algorithm to compute the count serially. The example then uses the parallel_for_each algorithm to perform the same task in parallel. The example also prints to the console the time that is required to perform both computations.

// parallel-count-primes.cpp 
// compile with: /EHsc
#include <windows.h>
#include <ppl.h>
#include <iostream>
#include <algorithm>
#include <array>

using namespace concurrency;
using namespace std;

// Calls the provided work function and returns the number of milliseconds  
// that it takes to call that function. 
template <class Function>
__int64 time_call(Function&& f)
{
   __int64 begin = GetTickCount();
   f();
   return GetTickCount() - begin;
}

// Determines whether the input value is prime. 
bool is_prime(int n)
{
   if (n < 2)
      return false;
   for (int i = 2; i < n; ++i)
   {
      if ((n % i) == 0)
         return false;
   }
   return true;
}

int wmain()
{
   // Create an array object that contains 200000 integers. 
   array<int, 200000> a;

   // Initialize the array such that a[i] == i. 
   int n = 0;
   generate(begin(a), end(a), [&] {
      return n++;
   });

   LONG prime_count;
   __int64 elapsed;

   // Use the for_each algorithm to count the number of prime numbers 
   // in the array serially.
   prime_count = 0L;
   elapsed = time_call([&] {
      for_each (begin(a), end(a), [&](int n ) { 
         if (is_prime(n))
            ++prime_count;
      });
   });
   wcout << L"serial version: " << endl
         << L"found " << prime_count << L" prime numbers" << endl
         << L"took " << elapsed << L" ms" << endl << endl;

   // Use the parallel_for_each algorithm to count the number of prime numbers 
   // in the array in parallel.
   prime_count = 0L;
   elapsed = time_call([&] {
      parallel_for_each (begin(a), end(a), [&](int n ) { 
         if (is_prime(n))
            InterlockedIncrement(&prime_count);
      });
   });
   wcout << L"parallel version: " << endl
         << L"found " << prime_count << L" prime numbers" << endl
         << L"took " << elapsed << L" ms" << endl << endl;
}

The following sample output is for a computer that has four processors.

serial version:
found 17984 prime numbers
took 6115 ms

parallel version:
found 17984 prime numbers
took 1653 ms

Compiling the Code

To compile the code, copy it and then paste it in a Visual Studio project, or paste it in a file that is named parallel-count-primes.cpp and then run the following command in a Visual Studio Command Prompt window.

cl.exe /EHsc parallel-count-primes.cpp

Robust Programming

The lambda expression that the example passes to the parallel_for_each algorithm uses the InterlockedIncrement function to enable parallel iterations of the loop to increment the counter simultaneously. If you use functions such as InterlockedIncrement to synchronize access to shared resources, you can present performance bottlenecks in your code. You can use a lock-free synchronization mechanism, for example, the concurrency::combinable class, to eliminate simultaneous access to shared resources. For an example that uses the combinable class in this manner, see How to: Use combinable to Improve Performance.

See Also

Reference

parallel_for_each Function

Concepts

Parallel Algorithms