Extract Key Phrases from Text
Updated: February 23, 2017
Extracts key phrases from given text
Category: Text Analytics
Use the Extract Key Phrases from Text module to pre-process a text column and extract meaningful phrases. A phrase can be either a single meaningful word or a multi-word compound. The extracted phrases are potentially meaningful in the context of the sentence for reasons such as these:
- The phrase captures the topic of the sentence
- The phrase contains a combination of modifier and noun that indicates sentiment
For example, assume the sentence analyzed is: "It was a wonderful hotel to stay at, with unique decor and friendly staff."
The Extract Key Phrases from Text module might return these key phrases:
- wonderful hotel
- friendly staff
- unique decor
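To make the modifier-plus-noun idea concrete, here is a toy sketch in Python. It is not the module's algorithm, which this article does not describe. The function name `toy_key_phrases` and the two small word lists are hypothetical and hand-written to fit the example sentence only.

```python
import re

# Hypothetical mini-lexicons for illustration only. The real module does not
# expose its vocabulary or its algorithm in this article.
MODIFIERS = {"wonderful", "unique", "friendly"}
NOUNS = {"hotel", "decor", "staff"}


def toy_key_phrases(sentence):
    """Return adjacent modifier+noun pairs found in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    phrases = []
    # Walk over each pair of neighboring words.
    for first, second in zip(words, words[1:]):
        if first in MODIFIERS and second in NOUNS:
            phrases.append(f"{first} {second}")
    return phrases


sentence = ("It was a wonderful hotel to stay at, "
            "with unique decor and friendly staff.")
print(toy_key_phrases(sentence))
# ['wonderful hotel', 'unique decor', 'friendly staff']
```

A real extractor would need part-of-speech information and language-specific rules rather than fixed word lists, which is why the module asks you to specify a language.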
To extract key phrases, you must connect a dataset that has a column of text.
Add the Extract Key Phrases from Text module to your experiment, and connect a dataset that has at least one full-text column.
Use the Column Selector to select the column from which to extract key phrases.
For Language, select a language to use when analyzing phrases. If you specify the language, only phrases in the target language will be output.
If the text column contains phrases in multiple languages, choose the option Language identified in columns. A new column selector is displayed that lets you select a column in your dataset that contains a language identifier. The language identifier can be either the language name or the two-letter ISO 639-1 language code. For example, either "English" or "en" is acceptable.
Tip: Before running Extract Key Phrases from Text, use the Detect Languages module to identify the language in each row and generate the identifier for you.
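If you build the language column yourself, it may help to check the values before connecting the dataset. The following Python sketch maps a language name or two-letter code to a code for the languages listed as supported in this article. The helper name `normalize_language` is hypothetical, and the trimming and lowercasing are assumptions made for convenience; this article does not state whether the module itself tolerates differences in case or whitespace.

```python
# Languages listed as supported in this article, keyed by ISO 639-1 code.
SUPPORTED = {
    "nl": "Dutch",
    "en": "English",
    "fr": "French",
    "de": "German",
    "it": "Italian",
    "es": "Spanish",
}
# Reverse lookup from lowercase language name to code.
NAME_TO_CODE = {name.lower(): code for code, name in SUPPORTED.items()}


def normalize_language(value):
    """Map a language name or two-letter code to its code.

    Returns None when the value is not one of the supported languages.
    """
    key = str(value).strip().lower()
    if key in SUPPORTED:
        return key
    return NAME_TO_CODE.get(key)


rows = ["English", "en", " FR ", "Deutsch", "Italian"]
print([normalize_language(r) for r in rows])
# ['en', 'en', 'fr', None, 'it']
```

Rows that come back as `None` are candidates for cleanup or removal before the text reaches the module.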
The following example demonstrates how to use this module to extract key phrases and then build a word cloud from them: Extract Key Phrases and Show Word Cloud
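A word cloud is driven by phrase frequencies, so the counting step can be sketched in a few lines of Python. This sketch assumes that each output row holds its key phrases as one comma-separated string; that layout is an assumption, not something this article states, so adjust the separator to match what you see in your results dataset. The function name `phrase_frequencies` and the sample rows are hypothetical.

```python
from collections import Counter


def phrase_frequencies(rows, separator=","):
    """Count how often each key phrase appears across all rows.

    Assumes each row is a single string of phrases joined by `separator`.
    """
    counts = Counter()
    for row in rows:
        for phrase in row.split(separator):
            phrase = phrase.strip().lower()
            if phrase:  # skip empty fragments, such as those from blank rows
                counts[phrase] += 1
    return counts


rows = [
    "wonderful hotel, friendly staff, unique decor",
    "friendly staff, great location",
    "",
]
freq = phrase_frequencies(rows)
print(freq.most_common(1))
# [('friendly staff', 2)]
```

The resulting counts can be passed to whichever plotting or word cloud tool you prefer.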
Please see the Cortana Intelligence Gallery for more examples of text processing using Azure Machine Learning.
This module currently supports the following languages:
- Dutch
- English
- French
- German
- Italian
- Spanish
Support for additional languages will be added in the future.
| Name | Type | Description |
|---|---|---|
| Dataset | Data Table | The table containing the text to be processed. |
| Name | Type | Range | Optional | Default | Description |
|---|---|---|---|---|---|
| Culture-language column | ColumnSelection | language:Column contains language | | | Name or one-based index of the column containing the culture-language information. |
| Text column | ColumnSelection | | Required | | Name or one-based index of the text column. |
| Language | T_Language | English, Spanish, French, Dutch, German, Italian, Column contains language | Required | English | Select the language of the text to be processed. |
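The column parameters accept either a column name or a one-based index, which means the first column is 1 rather than 0. The following Python sketch shows how such a selector maps to a column. The helper `resolve_column` and the column names are hypothetical and only illustrate the indexing convention; they are not part of the module.

```python
def resolve_column(columns, selector):
    """Return the column name for a selector given as a name or a one-based index."""
    if isinstance(selector, int):
        if not 1 <= selector <= len(columns):
            raise IndexError(
                f"column index {selector} is out of range 1..{len(columns)}"
            )
        # Convert the one-based index to Python's zero-based indexing.
        return columns[selector - 1]
    if selector not in columns:
        raise KeyError(f"no column named {selector!r}")
    return selector


columns = ["review_id", "review_text", "language"]
print(resolve_column(columns, 2))           # review_text
print(resolve_column(columns, "language"))  # language
```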
| Name | Type | Description |
|---|---|---|
| Results dataset | Data Table | The extracted key phrases |
| Exception | Description |
|---|---|
| Error 0003 | Exception occurs if one or more inputs are null or empty. |
| Error 0010 | Exception occurs if input datasets have column names that should match but do not. |
| Error 0016 | Exception occurs if input datasets passed to the module should have compatible column types but do not. |
| Error 0008 | Exception occurs if a parameter is not in range. |