Score Vowpal Wabbit Version 8 Model

 

Updated: March 28, 2016

Scores data using the Vowpal Wabbit machine learning system from the command-line interface.

Category: Text Analytics

You can use the Score Vowpal Wabbit Version 8 Model module to generate scores for a set of input data, using an existing trained Vowpal Wabbit model.

Note

This module provides the latest version of the Vowpal Wabbit framework, version 8. Use this module to score data using a trained model saved in the version 8 format.

If you have existing models created using an earlier version, use these modules: Train Vowpal Wabbit Version 7-4 Model, Score Vowpal Wabbit Version 7-4 Model.

  1. Add the Score Vowpal Wabbit Version 8 Model module to your experiment.

  2. Add a trained Vowpal Wabbit model and connect it to the left-hand input port. You can use a trained model created in the same experiment, or locate a saved model in the Trained Models group of Studio’s left navigation pane. However, the model must be available in Azure Machine Learning Studio; you cannot directly load a model from Azure storage.

    Note

    Only Vowpal Wabbit 8 models are supported; you cannot connect saved models that were trained by using other algorithms, and you cannot use models that were trained using earlier versions.

  3. In the VW arguments text box, type a set of valid command-line arguments to the Vowpal Wabbit executable.

    For information about which Vowpal Wabbit arguments are supported and unsupported in Azure Machine Learning, see the Technical Notes section.

  4. Click Specify data type, and select one of the supported data types from the list.

    Scoring requires a single column of VW-compatible data.

    If you have an existing file that was created in the SVMLight or VW formats, you can load it into the Azure ML workspace as a new dataset in one of these formats: Generic CSV without header, TSV without header.

    The VW option requires that a label be present, but it is not used in scoring except for comparison.

  5. Add an Import Data module and connect it to the right-hand input port of Score Vowpal Wabbit Version 8 Model. Configure the Import Data module to access the input data.

    The input data for scoring must have been prepared ahead of time in one of the supported formats and stored in Azure blob storage.

  6. Select the option, Include an extra column containing labels, if you want to output labels together with the scores.

    Typically, when handling text data, Vowpal Wabbit does not require labels, and will return only the scores for each row of data.

  7. Select the option, Include an extra column containing raw scores, if you want to output raw scores together with the results.

    This option is new for Vowpal Wabbit Version 8.

  8. Select the option, Use cached results, if you want to re-use results from a previous run, assuming the following conditions are met:

    • A valid cache exists from a previous run.

    • The input data and parameter settings of the module have not changed since the previous run.

    Otherwise, this module will be executed each time the experiment runs.

  9. Run the experiment, and right-click the output of the Score Vowpal Wabbit Version 8 Model module to visualize the results.

    The output indicates a prediction score normalized from 0 to 1.
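
Step 4 above requires a single column of VW-compatible data, and notes that the VW option expects a label even though scoring does not use it. The following Python sketch shows one way to prepare such a column from raw text before uploading it. The helper name to_vw_line, the namespace name "text", and the sample rows are hypothetical and chosen only for illustration; the sketch assumes the basic Vowpal Wabbit line layout of a label followed by a pipe, a namespace, and space-separated features.

```python
def to_vw_line(label, text, namespace="text"):
    """Format one example as a single VW-format line: '<label> |<namespace> <tokens>'."""
    # ':' and '|' have special meaning in the VW input format,
    # so remove them from the raw tokens (illustrative cleanup only).
    tokens = [t.replace(":", "").replace("|", "") for t in text.lower().split()]
    tokens = [t for t in tokens if t]
    return "{} |{} {}".format(label, namespace, " ".join(tokens))


# Hypothetical sample rows: (label, raw text).
rows = [(1, "Great product: works well"), (-1, "Broke after one day")]

# One VW-format string per row; written out one per line, this gives
# a single-column file with no header.
lines = [to_vw_line(label, text) for label, text in rows]
print("\n".join(lines))
```

Each resulting string occupies one row of the single input column. Real data may need more careful tokenization and escaping than this sketch performs.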

The following video provides a walkthrough of the training and scoring process for Vowpal Wabbit:

https://azure.microsoft.com/en-us/documentation/videos/text-analytics-and-vowpal-wabbit-in-azure-ml-studio/

Vowpal Wabbit has many command-line options for choosing and tuning algorithms. A full discussion of these options is not possible here; we recommend that you view the Vowpal Wabbit wiki page.

Note that some options are not supported in Azure Machine Learning Studio.

Not supported

  • The input/output options specified in https://github.com/JohnLangford/vowpal_wabbit/wiki/Command-line-arguments

    These properties are already configured automatically by the module.

  • Additionally, any option that generates multiple outputs or takes multiple inputs is disallowed. These include --cbt, --lda, and --wap.

  • Only supervised learning algorithms are supported. This disallows options such as --active, --rank, and --search.

Supported

All arguments other than those described above are allowed.
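
Because the restrictions above apply to the string typed into the VW arguments text box, it can be useful to check that string before running an experiment. The following Python sketch flags the options that this article names as unsupported, together with -i, -p, and -t, which the module sets itself. The function name find_disallowed and the DISALLOWED set are hypothetical; the set is illustrative rather than exhaustive, and this is not the validation that the module performs.

```python
import shlex

# Options named in this article as unsupported or reserved by the module.
# Illustrative only: the input/output options on the VW wiki are not all listed here.
DISALLOWED = {"-i", "-p", "-t", "--cbt", "--lda", "--wap",
              "--active", "--rank", "--search"}


def find_disallowed(vw_arguments):
    """Return the disallowed option names found in a VW argument string."""
    found = []
    for token in shlex.split(vw_arguments):
        # Handle the '--option=value' form by comparing only the option name.
        name = token.split("=", 1)[0]
        if name in DISALLOWED and name not in found:
            found.append(name)
    return found


print(find_disallowed("--hash all --lda 10 -t"))
```

An empty list means that none of the listed options were found; it does not guarantee that the module accepts every remaining argument.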

| Name          | Type               | Description          |
|---------------|--------------------|----------------------|
| Trained model | ILearner interface | Trained learner      |
| Dataset       | Data Table         | Dataset to be scored |

| Name | Range | Type | Default | Description |
|------|-------|------|---------|-------------|
| Specify data type | VW, SVMLight | DataType | VW | Indicate whether the file type is SVMLight or Vowpal Wabbit |
| VW arguments | any | String | none | Type Vowpal Wabbit arguments. Do not include -i, -p, or -t |
| Include an extra column containing labels | True/False | Boolean | false | Specify whether the zipped file should include labels with the predictions |
| Include an extra column containing raw scores | True/False | Boolean | false | Specify whether the result should include an additional column containing the raw scores (corresponding to --raw_predictions) |

| Name            | Type       | Description                         |
|-----------------|------------|-------------------------------------|
| Results dataset | Data Table | Dataset with the prediction results |

| Exception  | Description |
|------------|-------------|
| Error 0001 | Exception occurs if one or more specified columns of the data set cannot be found. |
| Error 0003 | Exception occurs if one or more inputs are null or empty. |
| Error 0004 | Exception occurs if a parameter is less than or equal to a specific value. |
| Error 0017 | Exception occurs if one or more specified columns have a type that is unsupported by the current module. |
