Text Analytics

 

Updated: August 10, 2016

Azure Machine Learning provides specialized tools for working with both structured and unstructured text:

  • Extensive options for preprocessing text

  • Language detection for input text

  • Feature creation from text using customizable n-gram dictionaries

  • Feature hashing, to efficiently analyze text without preprocessing or advanced linguistic analysis

  • Vowpal Wabbit, for very fast machine learning on text, including feature hashing, topic modeling (LDA), classification, and more

  • Named entity recognition, to extract the names of people, places, and organizations from unstructured text
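To make the feature hashing idea from the list above concrete, the following is a minimal sketch in plain Python, not the Feature Hashing module itself. It assumes a simple lowercase tokenizer, unigrams plus bigrams, and SHA-256 as a stand-in hash (the Vowpal Wabbit library uses its own hash function). The names tokenize, ngrams, hash_features, num_bits, and max_n are hypothetical and chosen only for illustration.

```python
import hashlib
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hash_features(text, num_bits=10, max_n=2):
    """Map the 1..max_n-grams of `text` into a count vector of length 2**num_bits.

    Assumption: SHA-256 is used only as a convenient, deterministic stand-in
    for the hash function a real feature hashing library would use.
    """
    size = 1 << num_bits
    vector = [0] * size
    tokens = tokenize(text)
    for n in range(1, max_n + 1):
        for gram in ngrams(tokens, n):
            digest = hashlib.sha256(gram.encode("utf-8")).digest()
            # Use the first 4 bytes of the digest as the bucket index.
            index = int.from_bytes(digest[:4], "little") % size
            vector[index] += 1
    return vector


if __name__ == "__main__":
    features = hash_features("The quick brown fox")
    print(len(features), sum(features))
```

No dictionary is built or stored: every n-gram is mapped directly to one of a fixed number of buckets, which is why the approach needs no preprocessing pass over the corpus. The trade-off is that distinct n-grams can collide in the same bucket, and a larger num_bits reduces that risk at the cost of a wider feature vector.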

For examples of text analytics using Azure Machine Learning, see these sample experiments in the Model Gallery:

  • The News Categorization sample uses feature hashing to classify articles into a predefined list of categories.

  • The Find similar companies sample uses the text of Wikipedia articles to categorize companies.

  • In the five-part Text Classification sample, text from Twitter messages is used to perform sentiment analysis.

The Text Analytics category includes the following modules:

Module                                    Description
Detect Languages (New)                    Detects the language of each line in the input file
Extract N-Gram Features from Text (New)   Creates N-Gram dictionary features and does feature selection on them
Feature Hashing                           Converts text data to integer-encoded features using the Vowpal Wabbit library
Latent Dirichlet Allocation (New)         Performs topic modeling using the Vowpal Wabbit library for LDA
Named Entity Recognition                  Recognizes named entities in a text column
Preprocess Text (New)                     Performs cleaning operations on text
Score Vowpal Wabbit Version 7-4 Model     Scores input from Azure using version 7-4 of the Vowpal Wabbit machine learning system
Score Vowpal Wabbit Version 7-10 Model    Scores input from Azure using version 7-10 of the Vowpal Wabbit machine learning system
Score Vowpal Wabbit Version 8 Model       Scores input from Azure using version 8 of the Vowpal Wabbit machine learning system
Train Vowpal Wabbit Version 7-4 Model     Trains a model using version 7-4 of the Vowpal Wabbit machine learning system
Train Vowpal Wabbit Version 7-10 Model    Trains a model using version 7-10 of the Vowpal Wabbit machine learning system
Train Vowpal Wabbit Version 8 Model       Trains a model using version 8 of the Vowpal Wabbit machine learning system
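The description of Extract N-Gram Features from Text (building an n-gram dictionary, then selecting features from it) can be illustrated with a short sketch. This is not the module's implementation: it assumes a basic lowercase tokenizer and uses a minimum document frequency as a deliberately simple stand-in for feature selection. The names extract_ngrams, build_dictionary, featurize, max_n, and min_doc_freq are hypothetical.

```python
import re
from collections import Counter


def extract_ngrams(text, max_n=2):
    """Return all 1..max_n-grams of the text as space-joined strings."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    ]


def build_dictionary(documents, max_n=2, min_doc_freq=2):
    """Build an n-gram -> column index dictionary.

    Assumption: feature selection here is just a document-frequency
    threshold; n-grams seen in fewer than `min_doc_freq` documents are dropped.
    """
    doc_freq = Counter()
    for doc in documents:
        # Count each n-gram at most once per document.
        doc_freq.update(set(extract_ngrams(doc, max_n)))
    kept = sorted(g for g, df in doc_freq.items() if df >= min_doc_freq)
    return {gram: index for index, gram in enumerate(kept)}


def featurize(text, dictionary, max_n=2):
    """Turn a text into a count vector aligned with the dictionary's indices."""
    counts = Counter(g for g in extract_ngrams(text, max_n) if g in dictionary)
    vector = [0] * len(dictionary)
    for gram, index in dictionary.items():
        vector[index] = counts[gram]
    return vector


if __name__ == "__main__":
    docs = ["good movie", "good plot", "bad movie"]
    vocab = build_dictionary(docs)
    print(vocab)
    print(featurize("good good movie", vocab))
```

Unlike feature hashing, this approach keeps an explicit dictionary, so each feature column can be traced back to the n-gram that produced it, and the same dictionary has to be reused when featurizing new text.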
