Two-Class Bayes Point Machine

 

Updated: March 10, 2016

Creates a Bayes point machine binary classification model

You can use the Two-Class Bayes Point Machine module to create an untrained binary classification model.

After you have configured the model, you must train the model on a labeled dataset using the Train Model module.

The Bayes Point Machine is a Bayesian approach to linear classification. It efficiently approximates the theoretically optimal Bayesian average of linear classifiers (in terms of generalization performance) by choosing one "average" classifier, the Bayes Point. Because the Bayes Point Machine is a Bayesian classification model, it is not prone to overfitting to the training data.
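The idea of a single "average" classifier can be illustrated with a small sketch. The code below is not the module's implementation. It is a toy approximation that assumes linearly separable data, samples random unit-length weight vectors, keeps those that classify every training example correctly, and averages them. The function names (`predict`, `approximate_bayes_point`) and the toy data are hypothetical and exist only for this illustration.

```python
import random

def sign(v):
    return 1 if v >= 0 else -1

def predict(w, x):
    # Linear classifier through the origin: the label is the sign of w . x
    return sign(sum(wi * xi for wi, xi in zip(w, x)))

def approximate_bayes_point(data, n_samples=20000, seed=0):
    """Average random unit-length weight vectors that fit all training data.

    This is plain rejection sampling, used here only to illustrate the
    "average of consistent classifiers" idea (an assumption of this sketch).
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    total = [0.0] * dim
    kept = 0
    for _ in range(n_samples):
        w = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = sum(wi * wi for wi in w) ** 0.5
        w = [wi / norm for wi in w]
        # Keep only classifiers that make no mistakes on the training data
        if all(predict(w, x) == y for x, y in data):
            total = [t + wi for t, wi in zip(total, w)]
            kept += 1
    if kept == 0:
        raise ValueError("no sampled classifier was consistent with the data")
    return [t / kept for t in total], kept

# Toy, linearly separable data: (features, label) with labels in {-1, +1}
train = [([2.0, 1.0], 1), ([1.0, 2.0], 1), ([-1.0, -2.0], -1), ([-2.0, -0.5], -1)]
bayes_point, n_consistent = approximate_bayes_point(train)
```

Because the set of weight vectors that classify this data correctly is convex, their average also classifies it correctly, which is why one averaged classifier can stand in for the whole set.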

For more information, read the original research paper: Bayes Point Machines.

This implementation improves on the original algorithm in several ways:

These improvements make the Bayes Point Machine classification model more robust and easier to use, and they let you bypass the time-consuming step of parameter tuning.

For more information, see Chris Bishop's post on the Microsoft Machine Learning blog: Embracing Uncertainty - Probabilistic Inference.

  1. Add the Two-Class Bayes Point Machine module to the experiment.

  2. For Number of training iterations, type a number to specify how many times the message-passing algorithm iterates over the training data. A higher number of training iterations generally yields more accurate predictions, but training is slower.

    For most datasets, the default setting of 30 training iterations is sufficient for the algorithm to make accurate predictions. Sometimes accurate predictions can be made by using fewer iterations. For datasets with highly correlated features, you might benefit from more training iterations. Typically, the number of iterations should be set to a value in the range 5 – 100.

  3. Select the Include bias option if you want a constant feature, or bias, to be added to each instance in training and prediction.

    Including a bias is necessary when the data does not already contain a constant feature.

  4. Select the Allow unknown values in categorical features option to create a group for unknown values.

    If you select this option, the model might be less precise for known values, but it can provide better predictions for new (unknown) values. If you deselect it, the model can accept only the values that are contained in the training data.

  5. Add an instance of the Train Model module, connect the untrained model and a labeled dataset to it, and choose a single label column.

  6. Run the experiment.

  7. When the model is trained, right-click the output of the Train Model module and select Visualize to see a summary of the model's parameters, together with the feature weights learned from training.

  8. To make predictions, pass the trained model to the Score Model module. Alternatively, pass the untrained model to Cross-Validate Model for cross-validation against a labeled dataset.
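The effect of the Include bias option in step 3 can be pictured with a short sketch. It assumes a plain linear classifier and hand-picked weights rather than weights learned by the module, and the helper names (`add_bias`, `linear_score`, `classify`) are hypothetical. The point is only that a constant feature gives the classifier an offset, so the decision boundary no longer has to pass through the origin.

```python
def add_bias(instance):
    """Append a constant feature of 1.0 so the last weight acts as an offset."""
    return list(instance) + [1.0]

def linear_score(weights, instance):
    return sum(w * x for w, x in zip(weights, instance))

def classify(weights, instance):
    return 1 if linear_score(weights, instance) >= 0 else -1

# Two one-dimensional instances that should fall on opposite sides of x = 5
low, high = [3.0], [7.0]

# Without a constant feature, a single weight cannot separate two positive inputs
no_bias_weights = [1.0]  # hand-picked for illustration, not learned
labels_without_bias = [classify(no_bias_weights, x) for x in (low, high)]

# With a constant feature, the extra weight (-5.0) shifts the boundary to x = 5
bias_weights = [1.0, -5.0]  # hand-picked for illustration, not learned
labels_with_bias = [classify(bias_weights, add_bias(x)) for x in (low, high)]
```

If the dataset already contains a column that is constant for every row, that column plays the same role as the appended feature.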

To see how the Two-Class Bayes Point Machine is used in machine learning, see these sample experiments in the Model Gallery:

| Name | Range | Type | Default | Description |
|---|---|---|---|---|
| Number of training iterations | >=1 | Integer | 30 | Specify the number of iterations to use when training |
| Include bias | Any | Boolean | True | Indicate whether a constant feature or bias should be added to each instance |
| Allow unknown values in categorical features | Any | Boolean | True | If True, creates an additional level for each categorical column. Any levels in the test dataset that are not available in the training dataset are mapped to this additional level. |
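The mapping of unseen categorical levels to an additional level can be sketched as follows. This is an illustration under stated assumptions, not the module's code: it assumes a simple one-hot encoding, and the names `UNKNOWN`, `build_levels`, and `one_hot` are hypothetical.

```python
UNKNOWN = "<unknown>"  # hypothetical name for the additional level

def build_levels(training_values, allow_unknown=True):
    """Collect the levels seen in training, plus one extra level if allowed."""
    levels = sorted(set(training_values))
    if allow_unknown:
        levels.append(UNKNOWN)
    return levels

def one_hot(value, levels):
    """Encode a value; unseen values go to the extra level when it exists."""
    if value not in levels:
        if UNKNOWN not in levels:
            raise ValueError(f"value {value!r} was not seen in training")
        value = UNKNOWN
    return [1.0 if level == value else 0.0 for level in levels]

# Levels learned from a training column
levels = build_levels(["red", "green", "red"])
seen = one_hot("red", levels)      # a level that appeared in training
unseen = one_hot("blue", levels)   # mapped to the additional level

# With the option turned off, only training values are accepted
strict_levels = build_levels(["red", "green", "red"], allow_unknown=False)
```

With `allow_unknown=False`, calling `one_hot("blue", strict_levels)` raises a `ValueError`, which mirrors the behavior described for the deselected option: only values present in the training data are accepted.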

| Name | Type | Description |
|---|---|---|
| Untrained model | ILearner interface | An untrained binary classification model |
