How to: Write a parallel_for_each Loop

This example shows how to use the concurrency::parallel_for_each algorithm to compute the count of prime numbers in a std::array object in parallel.

Example

The following example computes the count of prime numbers in an array two times. The example first uses the std::for_each algorithm to compute the count serially. The example then uses the parallel_for_each algorithm to perform the same task in parallel. The example also prints to the console the time that is required to perform both computations.

// parallel-count-primes.cpp
// compile with: /EHsc
#include <windows.h>
#include <ppl.h>
#include <iostream>
#include <algorithm>
#include <array>

using namespace concurrency;
using namespace std;

// Returns the number of milliseconds that it takes to call the passed in function.
template <class Function>
__int64 time_call(Function&& f)
{
    __int64 begin = GetTickCount();
    f();
    return GetTickCount() - begin;
}

// Determines whether the input is a prime.
bool is_prime(int n)
{
    if (n < 2)
    {
        return false;
    }

    for (int i = 2; i < int(std::sqrt(n)) + 1; ++i)
    {
        if (n % i == 0)
        {
            return false;
        }
    }
    return true;
}

int wmain()
{
    // Create an array object that contains 200000 integers.
    array<int, 200000> a;

    // Initialize the array such that a[i] == i.
    int n = 0;
    generate(begin(a), end(a), [&]
        {
            return n++;
        });

    // Use the for_each algorithm to count, serially, the number
    // of prime numbers in the array.
    LONG prime_count = 0L;
    __int64 elapsed = time_call([&]
        {
            for_each(begin(a), end(a), [&](int n)
            {
                if (is_prime(n))
                {
                    ++prime_count;
                }
            });
        });
    
    wcout << L"serial version: " << endl
        << L"found " << prime_count << L" prime numbers" << endl
        << L"took " << elapsed << L" ms" << endl << endl;

    // Use the parallel_for_each algorithm to count, in parallel, the number
    // of prime numbers in the array.
    prime_count = 0L;
    elapsed = time_call([&]
        {
            parallel_for_each(begin(a), end(a), [&](int n)
                {
                    if (is_prime(n))
                    {
                        InterlockedIncrement(&prime_count);
                    }
                });
        });

    wcout << L"parallel version: " << endl
        << L"found " << prime_count << L" prime numbers" << endl
        << L"took " << elapsed << L" ms" << endl << endl;
}

The following sample output is for a computer that has four cores.

serial version:
found 17984 prime numbers
took 125 ms

parallel version:
found 17984 prime numbers
took 63 ms

Compiling the Code

To compile the code, copy it and then paste it in a Visual Studio project, or paste it in a file that is named parallel-count-primes.cpp and then run the following command in a Visual Studio Command Prompt window.

cl.exe /EHsc parallel-count-primes.cpp

Robust Programming

The lambda expression that the example passes to the parallel_for_each algorithm uses the InterlockedIncrement function to enable parallel iterations of the loop to increment the counter simultaneously. If you use functions such as InterlockedIncrement to synchronize access to shared resources, you can present performance bottlenecks in your code. You can use a lock-free synchronization mechanism, for example, the concurrency::combinable class, to eliminate simultaneous access to shared resources. For an example that uses the combinable class in this manner, see How to: Use combinable to improve performance.

See also

Parallel algorithms
parallel_for_each Function