방법: parallel_invoke를 사용하여 병렬 작업 실행

아티클
08/09/2011

이 예제에서는 Concurrency::parallel_invoke 알고리즘을 사용하여 공유 데이터 소스에 대해 여러 작업을 수행하는 프로그램의 성능을 향상시키는 방법을 보여 줍니다. 어느 작업도 소스를 수정하지는 않으므로 간단한 방법으로 작업을 병렬로 실행할 수 있습니다.

예제

MyDataType 형식의 변수를 만들고 함수를 호출하여 해당 변수를 초기화한 다음 해당 데이터에 대해 여러 개의 긴 작업을 수행하는 다음 코드 예제를 살펴봅니다.

MyDataType data;
initialize_data(data);

lengthy_operation1(data);
lengthy_operation2(data);
lengthy_operation3(data);

lengthy_operation1, lengthy_operation2 및 lengthy_operation3 함수가 MyDataType 변수를 수정하지 않는 경우 이러한 함수를 추가로 수정하지 않고 병렬로 실행할 수 있습니다.

다음 예제에서는 이전 예제를 병렬로 실행하도록 수정합니다. parallel_invoke 알고리즘은 각 작업을 병렬로 실행하고 모든 작업이 끝난 후 반환됩니다.

MyDataType data;
initialize_data(data);

Concurrency::parallel_invoke(
   [&data] { lengthy_operation1(data); },
   [&data] { lengthy_operation2(data); },
   [&data] { lengthy_operation3(data); }
);

다음 예제에서는 gutenberg.org의 Homer가 만든 The Iliad를 다운로드하고 해당 파일에 대해 여러 작업을 수행합니다. 이 예제에서는 먼저 이러한 작업을 연속으로 수행한 다음 동일한 작업을 병렬로 수행합니다.

// parallel-word-mining.cpp
// compile with: /EHsc /MD /DUNICODE /D_AFXDLL
#define _WIN32_WINNT 0x0501
#include <afxinet.h>
#include <ppl.h>
#include <string>
#include <iostream>
#include <vector>
#include <map>
#include <algorithm>

using namespace Concurrency;
using namespace std;

// Calls the provided work function and returns the number of milliseconds 
// that it takes to call that function.
template <class Function>
__int64 time_call(Function&& f)
{
   __int64 begin = GetTickCount();
   f();
   return GetTickCount() - begin;
}

// Downloads the file at the given URL.
CString get_http_file(CInternetSession& session, const CString& url);

// Adds each word in the provided string to the provided vector of strings.
void make_word_list(const wstring& text, vector<wstring>& words);

// Finds the most common words whose length are greater than or equal to the 
// provided minimum. 
vector<pair<wstring, size_t>> find_common_words(const vector<wstring>& words, 
   size_t min_length, size_t count);

// Finds the longest sequence of words that have the same first letter.
vector<wstring> find_longest_sequence(const vector<wstring>& words);

// Finds all pairs of palindromes that appear in the provided collection
// of words.
vector<pair<wstring, wstring>> find_palindromes(const vector<wstring>& words,
   size_t min_length);

int wmain()
{  
   // Manages the network connection.
   CInternetSession session(L"Microsoft Internet Browser");

   // Download 'The Iliad' from gutenberg.org.
   wcout << L"Downloading 'The Iliad'..." << endl;
   wstring file = get_http_file(session, L"http://www.gutenberg.org/files/6130/6130-0.txt");
   wcout << endl;

   // Convert the text to a list of individual words.
   vector<wstring> words;
   make_word_list(file, words);

   // Compare the time that it takes to perform several operations on the data
   // serially and in parallel.
   __int64 elapsed;

   vector<pair<wstring, size_t>> common_words;
   vector<wstring> longest_sequence;   
   vector<pair<wstring, wstring>> palindromes;

   wcout << L"Running serial version...";
   elapsed = time_call([&] {
      common_words = find_common_words(words, 5, 9);
      longest_sequence = find_longest_sequence(words);
      palindromes = find_palindromes(words, 5);
   });
   wcout << L" took " << elapsed << L" ms." << endl;

   wcout << L"Running parallel version...";
   elapsed = time_call([&] {
      parallel_invoke(         
         [&] { common_words = find_common_words(words, 5, 9); },
         [&] { longest_sequence = find_longest_sequence(words); },
         [&] { palindromes = find_palindromes(words, 5); }
      );
   });
   wcout << L" took " << elapsed << L" ms." << endl;
   wcout << endl;

   // Print results.

   wcout << L"The most common words that have five or more letters are:" 
         << endl;
   for_each(common_words.begin(), common_words.end(), 
      [](const pair<wstring, size_t>& p) {
         wcout << L"   " << p.first << L" (" << p.second << L")" << endl; 
      });

   wcout << L"The longest sequence of words that have the same first letter is:" 
         << endl << L"   ";
   for_each(longest_sequence.begin(), longest_sequence.end(), 
      [](const wstring& s) {
         wcout << s << L' '; 
      });
   wcout << endl;

   wcout << L"The following palindromes appear in the text:" << endl;
   for_each(palindromes.begin(), palindromes.end(), 
      [](const pair<wstring, wstring>& p) {
         wcout << L"   "  << p.first << L" " << p.second << endl;
      });
}

// Downloads the file at the given URL.
CString get_http_file(CInternetSession& session, const CString& url)
{
   CString result;

   // Reads data from an HTTP server.
   CHttpFile* http_file = NULL;

   try
   {
      // Open URL.
      http_file = reinterpret_cast<CHttpFile*>(session.OpenURL(url, 1));

      // Read the file.
      if(http_file != NULL)
      {           
         UINT bytes_read;
         do
         {
            char buffer[10000];
            bytes_read = http_file->Read(buffer, sizeof(buffer));
            result += buffer;
         }
         while (bytes_read > 0);
      }
    }
   catch (CInternetException)
   {
      // TODO: Handle exception
   }

   // Clean up and return.
   delete http_file;

   return result;
}

// Adds each word in the provided string to the provided vector of strings.
void make_word_list(const wstring& text, vector<wstring>& words)
{
   // Add continuous sequences of alphanumeric characters to the 
   // string vector. 
   wstring current_word;
   for_each(text.begin(), text.end(), [&](wchar_t ch) {
      if (!iswalnum(ch))
      {
         if (current_word.length() > 0)
         {
            words.push_back(current_word);
            current_word.clear();
         }
      }
      else
      {
         current_word += ch;
      }
   });
}

// Finds the most common words whose length are greater than or equal to the 
// provided minimum. 
vector<pair<wstring, size_t>> find_common_words(const vector<wstring>& words, 
   size_t min_length, size_t count)
{
   typedef pair<wstring, size_t> pair;

   // Counds the occurences of each word.
   map<wstring, size_t> counts;

   for_each(words.begin(), words.end(), [&](const wstring& word) {
      // Increment the count of words that are at least the minimum length.
      if (word.length() >= min_length)
      {
         auto find = counts.find(word);
         if (find != counts.end())
            find->second++;
         else
            counts.insert(make_pair(word, 1));
      }
   });

   // Copy the contents of the map to a vector and sort the vector by
   // the number of occurences of each word.
   vector<pair> wordvector;
   copy(counts.begin(), counts.end(), back_inserter(wordvector));

   sort(wordvector.begin(), wordvector.end(), [](const pair& x, const pair& y) {
      return x.second > y.second;
   });

   size_t size = min(wordvector.size(), count);
   wordvector.erase(wordvector.begin() + size, wordvector.end());

   return wordvector;
}

// Finds the longest sequence of words that have the same first letter.
vector<wstring> find_longest_sequence(const vector<wstring>& words)
{
   // The current sequence of words that have the same first letter.
   vector<wstring> candidate_list;
   // The longest sequence of words that have the same first letter.
   vector<wstring> longest_run;

   for_each(words.begin(), words.end(), [&](const wstring& word) {
      // Initialize the candidate list if it is empty.
      if (candidate_list.size() == 0)
      {
         candidate_list.push_back(word);
      }
      // Add the word to the candidate sequence if the first letter
      // of the word is the same as each word in the sequence.
      else if (word[0] == candidate_list[0][0])
      {
         candidate_list.push_back(word);
      }
      // The initial letter has changed; reset the candidate list.
      else 
      {
         // Update the longest sequence if needed.
         if (candidate_list.size() > longest_run.size())
            longest_run = candidate_list;

         candidate_list.clear();
         candidate_list.push_back(word);         
      }
   });

   return longest_run;
}

// Finds all pairs of palindromes that appear in the provided collection
// of words.
vector<pair<wstring, wstring>> find_palindromes(const vector<wstring>& words, 
   size_t min_length)
{
   typedef pair<wstring, wstring> pair;
   vector<pair> result;

   // Copy the words to a new vector object and sort that vector.
   vector<wstring> wordvector;
   copy(words.begin(), words.end(), back_inserter(wordvector));
   sort(wordvector.begin(), wordvector.end());

   // Add each word in the original collection to the result whose palindrome 
   // also exists in the collection. 
   for_each(words.begin(), words.end(), [&](const wstring& word) {
      if (word.length() >= min_length)
      {
         wstring rev = word;
         reverse(rev.begin(), rev.end());

         if (rev != word && binary_search(wordvector.begin(), wordvector.end(), rev))
         {
            auto candidate1 = make_pair(word, rev);
            auto candidate2 = make_pair(rev, word);
            if (find(result.begin(), result.end(), candidate1) == result.end() &&
                find(result.begin(), result.end(), candidate2) == result.end())
               result.push_back(candidate1);
         }
      }
   });

   return result;
}

이 예제를 실행하면 다음과 같은 샘플 결과가 출력됩니다.

Downloading 'The Iliad'...

Running serial version... took 953 ms.
Running parallel version... took 656 ms.

The most common words that have five or more letters are:
   their (953)
   shall (444)
   which (431)
   great (398)
   Hector (349)
   Achilles (309)
   through (301)
   these (268)
   chief (259)
The longest sequence of words that have the same first letter is:
   through the tempest to the tented
The following palindromes appear in the text:
   spots stops
   speed deeps
   keels sleek

이 예제에서는 parallel_invoke 알고리즘을 사용하여 동일한 데이터 소스에 대해 동작하는 여러 함수를 호출합니다. parallel_invoke 알고리즘을 사용하여 동일한 데이터에 대해 동작하는 함수뿐 아니라 임의의 함수 집합을 병렬로 호출할 수 있습니다.

parallel_invoke 알고리즘은 각 작업 함수를 병렬로 호출하므로 완료하는 데 가장 오랜 시간이 소요되는 함수에 의해 성능이 정해집니다. 즉, 런타임에서는 각 함수를 별도의 프로세서에서 처리합니다. 이 예제에서 사용 가능한 프로세서 수보다 많은 작업을 병렬로 수행하는 경우 각 프로세서에서 여러 작업을 실행할 수 있습니다. 이 경우에는 완료하는 데 가장 오랜 시간이 소요되는 작업 그룹에 의해 성능이 정해집니다.

이 예제에서는 세 개의 작업을 병렬로 수행하기 때문에 프로세서가 세 개를 초과하는 컴퓨터에서는 성능이 향상되지 않습니다. 성능을 향상시키기 위해서는 최장 실행 작업을 작은 작업으로 분할한 다음 해당 작업을 병렬로 실행하면 됩니다.

취소에 대한 지원이 필요하지 않은 경우에는 Concurrency::task_group 및 Concurrency::structured_task_group 클래스 대신 parallel_invoke 알고리즘을 사용할 수 있습니다. parallel_invoke 알고리즘과 작업 그룹의 사용법을 비교하는 예제를 보려면 방법: parallel_invoke를 사용하여 병렬 정렬 루틴 작성을 참조하십시오.

코드 컴파일

코드를 컴파일하려면 해당 코드를 복사하여 Visual Studio 프로젝트 또는 parallel-word-mining.cpp 파일에 붙여 넣고 Visual Studio 명령 프롬프트 창에서 다음 명령을 실행합니다.

cl.exe /EHsc /MD /DUNICODE /D_AFXDLL parallel-word-mining.cpp

참고 항목

참조

parallel_invoke 함수

개념

병렬 알고리즘

방법: parallel_invoke를 사용하여 병렬 작업 실행

예제

코드 컴파일

참고 항목

참조

개념

추가 리소스