Text Analytics
Updated: August 10, 2016
Azure Machine Learning provides specialized tools for helping you work with both structured and unstructured text:
Extensive options for preprocessing text
Language detection for input text
Feature creation from text, using customizable n-gram dictionaries
Feature hashing, to efficiently analyze text without preprocessing or advanced linguistic analysis
Vowpal Wabbit, for very fast machine learning on text, including feature hashing, topic modeling (LDA), classification, and more
Named entity recognition, to extract the names of people, places, and organizations from unstructured text
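To make the n-gram and feature hashing items above concrete, here is a minimal sketch in plain Python. It is an illustration of the general technique, not the implementation used by the Azure Machine Learning modules: the function names `ngrams` and `hash_features`, the parameters `num_bits` and `max_n`, the simple regex tokenizer, and the use of MD5 as the hash function are all assumptions made for this example (Vowpal Wabbit uses its own hash function).

```python
import hashlib
import re


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hash_features(text, num_bits=10, max_n=2):
    """Map text to a fixed-length count vector of hashed 1..max_n-grams.

    Hypothetical helper: the vector has 2**num_bits slots, and each n-gram
    increments the slot selected by its hash. No dictionary is stored, so
    distinct n-grams may collide in the same slot.
    """
    size = 1 << num_bits
    vector = [0] * size
    # Assumed tokenizer: lowercase runs of letters and digits.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            # MD5 is used only as a stable stand-in hash for this sketch.
            digest = hashlib.md5(gram.encode("utf-8")).digest()
            index = int.from_bytes(digest[:4], "little") % size
            vector[index] += 1
    return vector


features = hash_features("The cat sat on the mat")
print(len(features), sum(features))
```

Because the output length depends only on `num_bits`, every document maps to a vector of the same size regardless of vocabulary, which is what lets hashing skip the dictionary-building step that an n-gram dictionary approach requires.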
For examples of text analytics using Azure Machine Learning, see these sample experiments in the Model Gallery:
The News Categorization sample uses feature hashing to classify articles into a predefined list of categories.
The Find similar companies sample uses the text of Wikipedia articles to categorize companies.
In the five-part Text Classification sample, text from Twitter messages is used to perform sentiment analysis.
The Text Analytics category includes the following modules:
| Module | Description |
|---|---|
| Detect Languages (New) | Detects the language of each line in the input file |
| Extract N-Gram Features from Text (New) | Creates N-Gram dictionary features and does feature selection on them |
| Feature Hashing | Converts text data to integer-encoded features using the Vowpal Wabbit library |
| Latent Dirichlet Allocation (New) | Performs topic modeling using the Vowpal Wabbit library for LDA |
| Named Entity Recognition | Recognizes named entities in a text column |
| Preprocess Text (New) | Performs cleaning operations on text |
| Score Vowpal Wabbit 7-4 Model | Scores input from Azure using version 7-4 of the Vowpal Wabbit machine learning system |
| Score Vowpal Wabbit 7-10 Model | Scores input from Azure using version 7-10 of the Vowpal Wabbit machine learning system |
| Score Vowpal Wabbit Version 8 Model | Scores input from Azure using version 8 of the Vowpal Wabbit machine learning system |
| Train Vowpal Wabbit 7-4 Model | Trains a model using version 7-4 of the Vowpal Wabbit machine learning system |
| Train Vowpal Wabbit 7-10 Model | Trains a model using version 7-10 of the Vowpal Wabbit machine learning system |
| Train Vowpal Wabbit Version 8 Model | Trains a model using version 8 of the Vowpal Wabbit machine learning system |
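
The Vowpal Wabbit training and scoring modules work with data in Vowpal Wabbit's own plain-text format, in which each example is a label followed by `|` and a list of features. As a rough sketch of how raw text might be prepared for that format, the hypothetical helper below (`to_vw_line` is a name invented for this example, not part of Azure Machine Learning or Vowpal Wabbit) lowercases the text and removes the `:` and `|` characters, which have special meaning in the format.

```python
def to_vw_line(label, text):
    """Format one labeled text example as a Vowpal Wabbit input line.

    Hypothetical helper for illustration. ':' and '|' are replaced with
    spaces because Vowpal Wabbit treats them as value and namespace
    separators; whitespace is then collapsed to single spaces.
    """
    cleaned = text.lower().replace(":", " ").replace("|", " ")
    tokens = cleaned.split()
    return f"{label} | {' '.join(tokens)}"


print(to_vw_line(1, "Great: value | fast"))
print(to_vw_line(-1, "Arrived   late and broken"))
```

Each resulting line is one training example with the text tokens as features in the default namespace. The choice of labels (here `1` and `-1`) depends on the learning task and loss function configured for training.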