方法: 並列呼び出しを使用して並列操作を実行する

[アーティクル]
08/09/2011

この例では、Concurrency::parallel_invoke アルゴリズムを使用して、共有データソースに対して複数の操作を実行するプログラムのパフォーマンスを向上させる方法について説明します。操作によってソースが変更されるわけではないため、簡単に並列実行できます。

使用例

次のコード例について考えます。このコードは、MyDataType 型の変数を作成し、関数を呼び出してその変数を初期化し、そのデータに対して時間のかかる複数の操作を実行します。

MyDataType data;
initialize_data(data);

lengthy_operation1(data);
lengthy_operation2(data);
lengthy_operation3(data);

lengthy_operation1、lengthy_operation2、および lengthy_operation3 の各関数が MyDataType 変数を変更しない場合は、追加の修正を行わなくても、これらの関数を並列に実行できます。

次の例では、並列で実行するように前の例を修正します。 parallel_invoke アルゴリズムは、各タスクを並列に実行し、すべてのタスクが終了した後に制御を戻します。

MyDataType data;
initialize_data(data);

Concurrency::parallel_invoke(
   [&data] { lengthy_operation1(data); },
   [&data] { lengthy_operation2(data); },
   [&data] { lengthy_operation3(data); }
);

次の例では、gutenberg.org から Homer の The Iliad をダウンロードし、そのファイルに対して複数の操作を実行します。この例では、これらの操作を逐次的に実行してから、同じ操作を並列に実行します。

// parallel-word-mining.cpp
// compile with: /EHsc /MD /DUNICODE /D_AFXDLL
#define _WIN32_WINNT 0x0501
#include <afxinet.h>
#include <ppl.h>
#include <string>
#include <iostream>
#include <vector>
#include <map>
#include <algorithm>

using namespace Concurrency;
using namespace std;

// Calls the provided work function and returns the number of milliseconds 
// that it takes to call that function.
template <class Function>
__int64 time_call(Function&& f)
{
   __int64 begin = GetTickCount();
   f();
   return GetTickCount() - begin;
}

// Downloads the file at the given URL.
CString get_http_file(CInternetSession& session, const CString& url);

// Adds each word in the provided string to the provided vector of strings.
void make_word_list(const wstring& text, vector<wstring>& words);

// Finds the most common words whose length are greater than or equal to the 
// provided minimum. 
vector<pair<wstring, size_t>> find_common_words(const vector<wstring>& words, 
   size_t min_length, size_t count);

// Finds the longest sequence of words that have the same first letter.
vector<wstring> find_longest_sequence(const vector<wstring>& words);

// Finds all pairs of palindromes that appear in the provided collection
// of words.
vector<pair<wstring, wstring>> find_palindromes(const vector<wstring>& words,
   size_t min_length);

int wmain()
{  
   // Manages the network connection.
   CInternetSession session(L"Microsoft Internet Browser");

   // Download 'The Iliad' from gutenberg.org.
   wcout << L"Downloading 'The Iliad'..." << endl;
   wstring file = get_http_file(session, L"http://www.gutenberg.org/files/6130/6130-0.txt");
   wcout << endl;

   // Convert the text to a list of individual words.
   vector<wstring> words;
   make_word_list(file, words);

   // Compare the time that it takes to perform several operations on the data
   // serially and in parallel.
   __int64 elapsed;

   vector<pair<wstring, size_t>> common_words;
   vector<wstring> longest_sequence;   
   vector<pair<wstring, wstring>> palindromes;

   wcout << L"Running serial version...";
   elapsed = time_call([&] {
      common_words = find_common_words(words, 5, 9);
      longest_sequence = find_longest_sequence(words);
      palindromes = find_palindromes(words, 5);
   });
   wcout << L" took " << elapsed << L" ms." << endl;

   wcout << L"Running parallel version...";
   elapsed = time_call([&] {
      parallel_invoke(         
         [&] { common_words = find_common_words(words, 5, 9); },
         [&] { longest_sequence = find_longest_sequence(words); },
         [&] { palindromes = find_palindromes(words, 5); }
      );
   });
   wcout << L" took " << elapsed << L" ms." << endl;
   wcout << endl;

   // Print results.

   wcout << L"The most common words that have five or more letters are:" 
         << endl;
   for_each(common_words.begin(), common_words.end(), 
      [](const pair<wstring, size_t>& p) {
         wcout << L"   " << p.first << L" (" << p.second << L")" << endl; 
      });

   wcout << L"The longest sequence of words that have the same first letter is:" 
         << endl << L"   ";
   for_each(longest_sequence.begin(), longest_sequence.end(), 
      [](const wstring& s) {
         wcout << s << L' '; 
      });
   wcout << endl;

   wcout << L"The following palindromes appear in the text:" << endl;
   for_each(palindromes.begin(), palindromes.end(), 
      [](const pair<wstring, wstring>& p) {
         wcout << L"   "  << p.first << L" " << p.second << endl;
      });
}

// Downloads the file at the given URL.
CString get_http_file(CInternetSession& session, const CString& url)
{
   CString result;

   // Reads data from an HTTP server.
   CHttpFile* http_file = NULL;

   try
   {
      // Open URL.
      http_file = reinterpret_cast<CHttpFile*>(session.OpenURL(url, 1));

      // Read the file.
      if(http_file != NULL)
      {           
         UINT bytes_read;
         do
         {
            char buffer[10000];
            bytes_read = http_file->Read(buffer, sizeof(buffer));
            result += buffer;
         }
         while (bytes_read > 0);
      }
    }
   catch (CInternetException)
   {
      // TODO: Handle exception
   }

   // Clean up and return.
   delete http_file;

   return result;
}

// Adds each word in the provided string to the provided vector of strings.
void make_word_list(const wstring& text, vector<wstring>& words)
{
   // Add continuous sequences of alphanumeric characters to the 
   // string vector. 
   wstring current_word;
   for_each(text.begin(), text.end(), [&](wchar_t ch) {
      if (!iswalnum(ch))
      {
         if (current_word.length() > 0)
         {
            words.push_back(current_word);
            current_word.clear();
         }
      }
      else
      {
         current_word += ch;
      }
   });
}

// Finds the most common words whose length are greater than or equal to the 
// provided minimum. 
vector<pair<wstring, size_t>> find_common_words(const vector<wstring>& words, 
   size_t min_length, size_t count)
{
   typedef pair<wstring, size_t> pair;

   // Counds the occurences of each word.
   map<wstring, size_t> counts;

   for_each(words.begin(), words.end(), [&](const wstring& word) {
      // Increment the count of words that are at least the minimum length.
      if (word.length() >= min_length)
      {
         auto find = counts.find(word);
         if (find != counts.end())
            find->second++;
         else
            counts.insert(make_pair(word, 1));
      }
   });

   // Copy the contents of the map to a vector and sort the vector by
   // the number of occurences of each word.
   vector<pair> wordvector;
   copy(counts.begin(), counts.end(), back_inserter(wordvector));

   sort(wordvector.begin(), wordvector.end(), [](const pair& x, const pair& y) {
      return x.second > y.second;
   });

   size_t size = min(wordvector.size(), count);
   wordvector.erase(wordvector.begin() + size, wordvector.end());

   return wordvector;
}

// Finds the longest sequence of words that have the same first letter.
vector<wstring> find_longest_sequence(const vector<wstring>& words)
{
   // The current sequence of words that have the same first letter.
   vector<wstring> candidate_list;
   // The longest sequence of words that have the same first letter.
   vector<wstring> longest_run;

   for_each(words.begin(), words.end(), [&](const wstring& word) {
      // Initialize the candidate list if it is empty.
      if (candidate_list.size() == 0)
      {
         candidate_list.push_back(word);
      }
      // Add the word to the candidate sequence if the first letter
      // of the word is the same as each word in the sequence.
      else if (word[0] == candidate_list[0][0])
      {
         candidate_list.push_back(word);
      }
      // The initial letter has changed; reset the candidate list.
      else 
      {
         // Update the longest sequence if needed.
         if (candidate_list.size() > longest_run.size())
            longest_run = candidate_list;

         candidate_list.clear();
         candidate_list.push_back(word);         
      }
   });

   return longest_run;
}

// Finds all pairs of palindromes that appear in the provided collection
// of words.
vector<pair<wstring, wstring>> find_palindromes(const vector<wstring>& words, 
   size_t min_length)
{
   typedef pair<wstring, wstring> pair;
   vector<pair> result;

   // Copy the words to a new vector object and sort that vector.
   vector<wstring> wordvector;
   copy(words.begin(), words.end(), back_inserter(wordvector));
   sort(wordvector.begin(), wordvector.end());

   // Add each word in the original collection to the result whose palindrome 
   // also exists in the collection. 
   for_each(words.begin(), words.end(), [&](const wstring& word) {
      if (word.length() >= min_length)
      {
         wstring rev = word;
         reverse(rev.begin(), rev.end());

         if (rev != word && binary_search(wordvector.begin(), wordvector.end(), rev))
         {
            auto candidate1 = make_pair(word, rev);
            auto candidate2 = make_pair(rev, word);
            if (find(result.begin(), result.end(), candidate1) == result.end() &&
                find(result.begin(), result.end(), candidate2) == result.end())
               result.push_back(candidate1);
         }
      }
   });

   return result;
}

この例では、次のサンプル出力が生成されます。

Downloading 'The Iliad'...

Running serial version... took 953 ms.
Running parallel version... took 656 ms.

The most common words that have five or more letters are:
   their (953)
   shall (444)
   which (431)
   great (398)
   Hector (349)
   Achilles (309)
   through (301)
   these (268)
   chief (259)
The longest sequence of words that have the same first letter is:
   through the tempest to the tented
The following palindromes appear in the text:
   spots stops
   speed deeps
   keels sleek

この例では、parallel_invoke アルゴリズムを使用して、同じデータソースに対して作用する複数の関数を呼び出します。 parallel_invoke アルゴリズムを使用すると、同じデータソースに対して作用する関数だけでなく、任意の関数セットを並列に呼び出すことができます。

parallel_invoke アルゴリズムでは各処理関数を並列で呼び出すため、そのパフォーマンスは (ランタイムが各関数を別々のプロセッサで処理する場合に) 最も時間がかかる関数によって制限されます。この例で、使用できるプロセッサの数よりも多くのタスクを並列で実行すると、各プロセッサで複数のタスクを実行できます。その場合のパフォーマンスは、最も時間のかかるタスクのグループによって制限されます。

この例では 3 つのタスクを並列に実行するので、4 つ以上のプロセッサを搭載したコンピューターではパフォーマンスの向上を期待できません。パフォーマンスを向上させるには、最も時間のかかるタスクを小さなタスクに分割し、それらのタスクを並列に実行します。

取り消し処理のサポートが不要であれば、Concurrency::task_group クラスと Concurrency::structured_task_group クラスの代わりに、parallel_invoke アルゴリズムを使用できます。 parallel_invoke アルゴリズムを使用した場合とタスクグループを使用した場合の違いを示す例については、「方法: 並列呼び出しを使用して並列並べ替えルーチンを記述する」を参照してください。

コードのコンパイル

このコードをコンパイルするには、コードをコピーし、Visual Studio プロジェクトに貼り付けるか、parallel-word-mining.cpp という名前のファイルに貼り付け、Visual Studio のコマンドプロンプトウィンドウで次のコマンドを実行します。

cl.exe /EHsc /MD /DUNICODE /D_AFXDLL parallel-word-mining.cpp

参照

parallel_invoke 関数

概念

並列アルゴリズム

方法: 並列呼び出しを使用して並列操作を実行する

使用例

コードのコンパイル

参照

参照

概念

その他のリソース