Extract Key Phrases from Text

 

Updated: February 23, 2017

Extracts key phrases from a given text.

Category: Text Analytics

Use the Extract Key Phrases from Text module to pre-process a text column and extract meaningful phrases. A phrase can be either a single meaningful word, or a compound. The extracted phrase or phrases are potentially meaningful in the context of the sentence for various reasons:

  • The phrase captures the topic of the sentence
  • The phrase contains a combination of modifier and noun that indicates sentiment

For example, assume the sentence analyzed is: "It was a wonderful hotel to stay at, with unique decor and friendly staff."

The Extract Key Phrases from Text module might return these key phrases:

  • wonderful hotel
  • friendly staff
  • unique decor
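
The idea of a "modifier plus noun" phrase can be illustrated with a toy sketch. This is not the module's algorithm, which is not documented here; the function name `toy_key_phrases` and the hand-picked `MODIFIERS` set are hypothetical and exist only to reproduce the example above.

```python
import re

# Hypothetical, hand-picked modifier list for illustration only;
# the real module does not expose its vocabulary or its algorithm.
MODIFIERS = {"wonderful", "unique", "friendly"}


def toy_key_phrases(sentence):
    """Return each known modifier paired with the word that follows it."""
    words = re.findall(r"[a-z']+", sentence.lower())
    phrases = []
    for current, following in zip(words, words[1:]):
        if current in MODIFIERS:
            phrases.append(f"{current} {following}")
    return phrases


sentence = ("It was a wonderful hotel to stay at, "
            "with unique decor and friendly staff.")
print(toy_key_phrases(sentence))
# ['wonderful hotel', 'unique decor', 'friendly staff']
```

A real extractor would rely on linguistic analysis rather than a fixed word list, so treat this only as a picture of the kind of output to expect.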

To extract key phrases, you must connect a dataset that has a column of text.

  1. Add the Extract Key Phrases from Text module to your experiment, and connect a dataset that has at least one full-text column.

  2. Use the Column Selector to select the column from which to extract key phrases.

  3. For Language, select a language to use when analyzing phrases. If you specify the language, only phrases in the target language will be output.

  4. If the text column contains phrases in multiple languages, choose the option Language identified in columns. A new column selector is displayed that lets you select a column in your dataset that contains a language identifier. The language identifier can be either the language name or the ISO 639-1 language code. For example, both "English" and "en" are acceptable.

    Tip

    Before running Extract Key Phrases from Text, use the Detect Languages module to identify the language in each row and generate the identifier for you.
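
If you populate the language column yourself, it may help to check the identifiers before running the experiment. The following is a minimal sketch, assuming the identifiers should be either a language name or an ISO 639-1 code from the supported-language list in this article. The helper `normalize_language` is hypothetical and is not part of Azure Machine Learning; whether the module itself tolerates differences in case or whitespace is not stated here.

```python
# Languages listed as supported in this article, keyed by ISO 639-1 code.
SUPPORTED = {
    "nl": "Dutch",
    "en": "English",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "es": "Spanish",
}
_NAME_TO_CODE = {name.lower(): code for code, name in SUPPORTED.items()}


def normalize_language(identifier):
    """Map a language name or ISO 639-1 code to its code.

    Returns None when the identifier is not in the supported list.
    """
    key = identifier.strip().lower()
    if key in SUPPORTED:
        return key
    return _NAME_TO_CODE.get(key)


print(normalize_language("English"))   # en
print(normalize_language("en"))        # en
print(normalize_language("Japanese"))  # None
```

Rows that map to None could be filtered out or reviewed before the text is sent to the module.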

The following example demonstrates how to use this module to extract key phrases and then build a word cloud from them: Extract Key Phrases and Show Word Cloud
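
A word cloud is driven by how often each phrase occurs. As a rough sketch of that counting step, and not a description of the linked experiment, the code below assumes each output row holds its key phrases as one comma-separated string. That output format, the sample rows, and the function name `phrase_frequencies` are assumptions made for illustration.

```python
from collections import Counter


def phrase_frequencies(rows):
    """Count how often each key phrase occurs across all rows."""
    counts = Counter()
    for row in rows:
        # Assumed format: phrases separated by commas within a row.
        for phrase in row.split(","):
            phrase = phrase.strip().lower()
            if phrase:
                counts[phrase] += 1
    return counts


# Made-up sample rows, for illustration only.
rows = [
    "wonderful hotel, friendly staff, unique decor",
    "friendly staff, great location",
    "",
]
freq = phrase_frequencies(rows)
print(freq.most_common(1))
# [('friendly staff', 2)]
```

The resulting counts can be passed to whatever plotting tool you use to size the phrases in the cloud.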

Please see the Cortana Intelligence Gallery for more examples of text processing using Azure Machine Learning.

This module currently supports the following languages:

  • Dutch
  • English
  • French
  • German
  • Italian
  • Spanish

Support for additional languages will be added in the future.

| Name | Type | Description |
|---|---|---|
| Dataset | Data Table | The table containing the text to be processed. |

| Name | Type | Range | Optional | Default | Description |
|---|---|---|---|---|---|
| Culture-language column | ColumnSelection | | language:Column contains language | | Name or one-based index of the column containing the culture-language information. |
| Text column | ColumnSelection | | Required | | Name or one-based index of the text column. |
| Language | T_Language | English, Spanish, French, Dutch, German, Italian, Column contains language | Required | English | Select the language of the text to be processed. |

| Name | Type | Description |
|---|---|---|
| Results dataset | Data Table | The extracted key phrases. |

| Exception | Description |
|---|---|
| Error 0003 | Exception occurs if one or more inputs are null or empty. |
| Error 0010 | Exception occurs if input datasets have column names that should match but do not. |
| Error 0016 | Exception occurs if input datasets passed to the module should have compatible column types but do not. |
| Error 0008 | Exception occurs if a parameter is not in range. |

Text Analytics
A-Z Module List
