Two-Class Bayes Point Machine
Updated: March 10, 2016
Creates a Bayes point machine binary classification model
You can use the Two-Class Bayes Point Machine module to create an untrained binary classification model.
After you have configured the model, you must train the model on a labeled dataset using the Train Model module.
The Bayes Point Machine is a Bayesian approach to linear classification. It efficiently approximates the theoretically optimal Bayesian average of linear classifiers (in terms of generalization performance) by choosing one "average" classifier, the Bayes Point. Because the Bayes Point Machine is a Bayesian classification model, it is not prone to overfitting to the training data.
For more information, read the original research paper: Bayes Point Machines.
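To make the idea of an "average" classifier concrete, the following minimal sketch approximates a Bayes point by Monte Carlo rejection sampling: it draws random linear classifiers, keeps those that classify every training point correctly, and averages them. This is an illustration only and is not the expectation propagation method that the module uses. The function name `approximate_bayes_point`, the toy data, and the absence of a bias term are all assumptions made for the example.

```python
import numpy as np

def approximate_bayes_point(X, y, n_samples=20000, seed=0):
    """Average the sampled unit-norm weight vectors that classify
    every training point correctly (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    # Draw candidate weight vectors uniformly from the unit sphere.
    W = rng.standard_normal((n_samples, X.shape[1]))
    W /= np.linalg.norm(W, axis=1, keepdims=True)
    # Signed margins of every point under every candidate: (n_points, n_samples).
    margins = (X @ W.T) * y[:, None]
    # Keep only candidates consistent with all labels (the "version space").
    consistent = W[(margins > 0).all(axis=0)]
    if len(consistent) == 0:
        raise ValueError("no sampled classifier separates the data")
    # The centre of mass of the consistent classifiers approximates the Bayes point.
    w = consistent.mean(axis=0)
    return w / np.linalg.norm(w)

# Hypothetical, linearly separable toy data with labels in {-1, +1}.
X = np.array([[2.0, 1.0], [1.0, 2.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])

w_bp = approximate_bayes_point(X, y)
predictions = np.sign(X @ w_bp)
```

Rejection sampling is practical only for small, separable problems like this one; it is used here because it shows the averaging step directly.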
This implementation improves on the original algorithm in several ways:
It uses the expectation propagation message-passing algorithm. For more information, see A family of algorithms for approximate Bayesian inference.
It does not require parameter sweeping.
It does not require data to be normalized.
These improvements make the Bayes Point Machine classification model more robust and easier to use, and they let you bypass the time-consuming step of parameter tuning.
For more information, see Chris Bishop's post on the Microsoft Machine Learning blog: Embracing Uncertainty - Probabilistic Inference.
Add the Two-Class Bayes Point Machine module to the experiment.
For Number of training iterations, type a number to specify how many times the message-passing algorithm iterates over the training data. More iterations generally yield more accurate predictions, but training is slower.
For most datasets, the default setting of 30 training iterations is sufficient for the algorithm to make accurate predictions. Sometimes accurate predictions can be made by using fewer iterations. For datasets with highly correlated features, you might benefit from more training iterations. Typically, the number of iterations should be set to a value in the range 5 – 100.
Select the option, Include bias, if you want a constant feature or bias to be added to each instance in training and prediction.
Including a bias is necessary when the data does not already contain a constant feature.
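As an illustration of what a constant feature means for a linear classifier, the sketch below appends a column of ones to a feature matrix, so that the last weight acts as an offset. The helper `add_bias_feature`, the data, and the weights are hypothetical; the module's internal handling of the bias is not documented here.

```python
import numpy as np

def add_bias_feature(X):
    """Append a constant 1.0 column so a linear model can learn an offset."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

# Hypothetical feature matrix with two instances and two features.
X = np.array([[0.5, 2.0], [1.5, -1.0]])
X_b = add_bias_feature(X)

# With weights [w1, w2, b], the score X_b @ w equals X @ [w1, w2] + b.
w = np.array([1.0, -1.0, 0.25])
scores = X_b @ w
```

Without the constant column, every decision boundary of a linear classifier is forced to pass through the origin.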
Select the option, Allow unknown values, to create a group for unknown values.
If you deselect this option, the model accepts only the values that are contained in the training data. If you select it, the model might be less precise for known values, but it can provide better predictions for new (unknown) values.
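The effect of an extra group for unknown values can be sketched with a small categorical encoder. This is an assumption about the general technique, not the module's actual code; `fit_levels`, `encode`, and the `UNKNOWN` marker are hypothetical names.

```python
UNKNOWN = "<unknown>"  # hypothetical marker for the additional level

def fit_levels(values, allow_unknown=True):
    """Assign an integer index to each level seen in the training data."""
    levels = {v: i for i, v in enumerate(sorted(set(values)))}
    if allow_unknown:
        # Reserve one extra level for values never seen during training.
        levels[UNKNOWN] = len(levels)
    return levels

def encode(value, levels):
    """Map a value to its level index, falling back to the unknown level."""
    if value in levels:
        return levels[value]
    if UNKNOWN in levels:
        return levels[UNKNOWN]
    raise KeyError(f"value {value!r} was not present in the training data")

train_colors = ["red", "green", "blue", "green"]
with_unknown = fit_levels(train_colors, allow_unknown=True)
without_unknown = fit_levels(train_colors, allow_unknown=False)

known_index = encode("green", with_unknown)
unseen_index = encode("purple", with_unknown)  # mapped to the extra level
```

Calling `encode("purple", without_unknown)` raises a `KeyError`, which mirrors a model that accepts only values contained in the training data.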
Add an instance of the Train Model module, connect the untrained model and a training dataset to it, and choose a single label column.
Run the experiment.
When the model is trained, right-click the output of the Train Model module and select Visualize to see a summary of the model's parameters, together with the feature weights learned from training.
You can pass the trained model to the Score Model module to make predictions. Alternatively, you can pass the untrained model to Cross-Validate Model for cross-validation against a labeled dataset.
To see how the Two-Class Bayes Point Machine is used in machine learning, see these sample experiments in the Model Gallery:
The Compare Binary Classifiers sample demonstrates the use of various two-class classifiers.
| Name | Range | Type | Default | Description |
|---|---|---|---|---|
| Number of training iterations | >=1 | Integer | 30 | Specify the number of iterations to use when training |
| Include bias | Any | Boolean | True | Indicate whether a constant feature or bias should be added to each instance |
| Allow unknown values in categorical features | Any | Boolean | True | If True, creates an additional level for each categorical column. Any levels in the test dataset that are not available in the training dataset are mapped to this additional level. |
| Name | Type | Description |
|---|---|---|
| Untrained model | | An untrained binary classification model |