Train Model
Updated: August 22, 2017
Trains a classification or regression model in a supervised manner
Category: Machine Learning / Train
This article describes how to use the Train Model module in Azure Machine Learning to train a classification or regression model.
Training a classification or regression model is a kind of supervised machine learning. That means you must provide a dataset that contains historical data from which to learn patterns. The data should contain both the outcome you are trying to predict and the related factors (variables). The training algorithm uses the data to extract statistical patterns and build a model.
You must also connect an already configured model, such as a regression algorithm, decision tree model, or other machine learning module. For more information, see classification or regression.
After the model has been trained by processing all the data, the output is a trained model that you can evaluate, or use to create predictions.
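To make the idea of supervised training concrete outside Studio, the following is a minimal sketch in plain Python. It fits a one-feature regression line to labeled historical data and then uses the result to predict. The `fit_line` and `predict` functions and the sample values are hypothetical illustrations, not part of Azure Machine Learning, and the algorithms you connect to Train Model are far more general than this.

```python
from statistics import mean


def fit_line(features, labels):
    """Fit y = slope * x + intercept by ordinary least squares (one feature)."""
    x_bar, y_bar = mean(features), mean(labels)
    sxx = sum((x - x_bar) ** 2 for x in features)
    if sxx == 0:
        raise ValueError("feature column has no variance; cannot fit a line")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(features, labels))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical historical data: one feature column and one label column.
hours_in_service = [1.0, 2.0, 3.0, 4.0]
wear_score = [3.0, 5.0, 7.0, 9.0]

# "Training" turns the labeled data into model parameters.
slope, intercept = fit_line(hours_in_service, wear_score)


def predict(x):
    """Use the trained parameters to predict the label for a new case."""
    return slope * x + intercept
```

In this sketch the label column (`wear_score`) plays the role of the outcome, and the trained parameters play the role of the trained model that the module outputs.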
New to machine learning? This tutorial walks you through the process of getting data, configuring an algorithm, training, and then using a model: Create your first machine learning experiment
Related Tasks
Create a custom model using R, or import a trained model from another R package and then do scoring or testing using R script.
Write your own Python script to train a model. Azure Machine Learning supports Anaconda 2 with Python 2.7, or Anaconda 4 with Python 2.7 or 3.5.
Depending on the type of model you are creating, you might need to use one of these specialized training modules:
How to use Train Model
Add the Train Model module to the experiment. You can find this module in Azure Machine Learning Studio under the Machine Learning category. Expand Train, and then drag the Train Model module into your experiment.
On the left input, attach one of the classification or regression models provided in Azure Machine Learning Studio.
Tip: You can also train a custom model created by using Create R Model.
Attach a training dataset to the right-hand input of Train Model.
For Label column, you must identify a single column that contains outcomes the model can use for training. Click Launch column selector, and choose the column in the dataset that contains the values you want to predict.
For classification problems, the label column must contain either categorical or discrete values. Some examples might be a yes/no rating, a disease classification code or name, or an income group. If you pick a noncategorical column, the module will return an error during training.
For regression problems, the label column must contain numeric data that represents the response variable. Some examples might be a credit risk score, the projected time to failure for a hard drive, or the forecasted number of calls to a call center on a given day or time. If you do not choose a numeric column, you might get an error.
If you do not specify which label column to use, Azure Machine Learning tries to infer the appropriate label column from the metadata of the dataset. If it picks the wrong column, launch the column selector and choose a different column.
Tip: If you have trouble using the Column Selector, see the article Select Columns in Dataset. It describes some common scenarios and gives tips for using the WITH RULES and BY NAME options.
Run the experiment. If you have a lot of data, this can take a while.
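The label column requirements described in the steps above can be checked before you upload data. The following is a rough sketch in plain Python, assuming the label column has been read into a list. The function name `check_label_column`, the task names, and the specific rules are hypothetical and only approximate the guidance above; they do not reproduce the module's own validation.

```python
from numbers import Real


def check_label_column(values, task):
    """Return a list of problems found in a label column.

    task is "classification" or "regression" (names chosen for this sketch).
    """
    present = [v for v in values if v is not None]
    if not present:
        return ["label column is empty"]

    problems = []
    if task == "regression":
        # Regression labels should be numeric; bool is excluded on purpose.
        non_numeric = [
            v for v in present if isinstance(v, bool) or not isinstance(v, Real)
        ]
        if non_numeric:
            problems.append(f"{len(non_numeric)} non-numeric label value(s)")
    elif task == "classification":
        # Classification labels should be categorical or discrete.
        if len(set(present)) < 2:
            problems.append("fewer than two distinct classes")
        if any(isinstance(v, float) and not v.is_integer() for v in present):
            problems.append("continuous values found; expected discrete categories")
    else:
        raise ValueError(f"unknown task: {task!r}")
    return problems
```

For example, `check_label_column(["yes", "no", "yes"], "classification")` returns an empty list, while a column of fractional values passed with `"classification"` is flagged as continuous.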
Results
When the model is trained, right-click the output and select Visualize to view the model parameters and feature weights.
You can also save the trained model to use in other experiments, or connect it to the Score Model module to predict values for new cases.
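As a loose analogy for saving a trained model and scoring new cases with it, the sketch below serializes the parameters of a simple linear model to text, restores them, and applies them to new feature values. The model dictionary, the `score` function, and the values are hypothetical. Studio stores trained models in its own format, which this example does not represent.

```python
import json

# Hypothetical trained model: parameters of a fitted line y = slope * x + intercept.
trained_model = {"kind": "linear", "slope": 2.0, "intercept": 1.0}

# "Save" the model as text; in practice this string would be written to a file.
saved = json.dumps(trained_model)


def score(model, new_cases):
    """Apply a linear model to new feature values and return the predictions."""
    return [model["slope"] * x + model["intercept"] for x in new_cases]


# Later, possibly in a different program: restore the model and score new cases.
restored = json.loads(saved)
predictions = score(restored, [5.0, 10.0])
```

The point of the sketch is the separation of concerns: training produces parameters once, and scoring reuses them on data the model has not seen.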
For examples of how the Train Model module is used in machine learning experiments, see these experiments in the Model Gallery:
The Retail Forecasting sample demonstrates how to build, train, and compare multiple models.
The Flight Delay Prediction sample demonstrates how to train multiple related classification models.

Expected inputs

| Name | Type | Description |
|---|---|---|
| Untrained model | ILearner interface | Untrained learner |
| Dataset | Data Table | Training data |

Module parameters

| Name | Range | Type | Default | Description |
|---|---|---|---|---|
| Label column | any | ColumnSelection | | Select the column that contains the label or outcome column |

Outputs

| Name | Type | Description |
|---|---|---|
| Trained model | ILearner interface | Trained learner |

Exceptions

For a list of all module errors, see Module Error Codes.

| Exception | Description |
|---|---|
| Error 0032 | Exception occurs if an argument is not a number. |
| Error 0033 | Exception occurs if an argument is Infinity. |
| Error 0083 | Exception occurs if the dataset used for training cannot be used with the specific type of learner. |
| Error 0035 | Exception occurs if no features were provided for a given user or item. |
| Error 0003 | Exception occurs if one or more inputs are null or empty. |
| Error 0020 | Exception occurs if the number of columns in some of the datasets passed to the module is too small. |
| Error 0021 | Exception occurs if the number of rows in some of the datasets passed to the module is too small. |
| Error 0013 | Exception occurs if the learner passed to the module has an invalid type. |
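
Several of the conditions in the table above (values that are not a number, infinite values, empty inputs, too few columns) can be caught with a quick check before training. The sketch below is one way to do that in plain Python, assuming the data has been loaded as a list of rows. The `preflight_checks` function and its messages are hypothetical and only loosely modeled on the error descriptions; they are not the module's actual validation logic and do not return Studio error codes.

```python
import math


def preflight_checks(rows):
    """Return human-readable warnings for a training table given as a list of rows."""
    if not rows:
        return ["dataset has no rows"]

    warnings = []
    if len(rows[0]) < 2:
        warnings.append("dataset needs at least one feature column plus a label column")

    for r, row in enumerate(rows):
        for c, value in enumerate(row):
            if value is None or value == "":
                warnings.append(f"row {r}, column {c}: missing value")
            elif isinstance(value, float) and math.isnan(value):
                warnings.append(f"row {r}, column {c}: not a number (NaN)")
            elif isinstance(value, float) and math.isinf(value):
                warnings.append(f"row {r}, column {c}: infinite value")
    return warnings
```

An empty list of warnings does not guarantee that training succeeds; it only rules out the simple data problems checked here.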