Tune Model Hyperparameters


Updated: July 21, 2017

Performs a parameter sweep on a model to determine the optimum parameter settings.

Category: Machine Learning / Train

You can use the Tune Model Hyperparameters module to build and test models using different combinations of settings, in order to determine the optimum hyperparameters for the given prediction task and data.

You can think of the model's parameters as the actual values applied in a specific decision tree, regression model, or other model generated by an algorithm. The model's hyperparameters are the settings and values you use when configuring and testing the model. The purpose of this module is to help you find the best combination: a process called tuning, or a parameter sweep.

The module supports two methods for finding the optimum settings for a model:

  • Integrated train and tune: You configure the set of parameters to experiment with, and then let the module iterate over multiple combinations, measuring accuracy until it finds a "best" model. With most learner modules, you can choose which parameters should be changed during the training process, and which should remain fixed.

    You can configure the tuning process to exhaustively test all combinations, or to establish a grid of parameter combinations and test some random set of these.

  • Cross validation with tuning: With this option, you divide your data into some number of folds and then build and test models on each fold. This method provides the best accuracy and can help find problems with the dataset; however, it takes longer to train.

Both methods generate a trained model that you can save for re-use.

It's impossible to know the best parameters for a given machine learning task without considerable experimentation. The tuning process can include multiple steps:

  • Perform feature selection to determine the columns or variables that have the highest information value. For more information, see Feature Selection.

  • When you've removed unnecessary columns, use the Tune Model Hyperparameters module to train a model while automatically finding the best parameters. This process of finding the ideal combination of settings is sometimes called a parameter sweep.

  • Cross-validation is another important part of testing your models. You can combine cross-validation with a parameter sweep as described in this topic.

  • If you are building a clustering model, use Sweep Clustering to train a model and automatically determine the optimum number of clusters and other parameters.

How to train a model with a parameter sweep

This section describes how to perform a parameter sweep using the Tune Model Hyperparameters module.

  1. Add the Tune Model Hyperparameters module to your experiment.

  2. Connect an untrained learner to the leftmost input.

  3. Set the Create trainer mode option to Parameter Range and use the Range Builder to specify a range of values to use in the parameter sweep.

    Almost all of the classification and regression modules support an integrated parameter sweep. For learners that do not support configuring a parameter range, a default range of allowed parameter values is tested instead.

    Tip

    You can fix one or more parameters at a certain value manually and then sweep over the remaining parameters. This might save some time.

  4. Add the dataset you want to use for training and connect it to the middle input of Tune Model Hyperparameters.

    Optionally, if you have a tagged dataset, you can connect it to the rightmost input port (Optional validation dataset) to use in measuring accuracy.

  5. In the Properties pane of Tune Model Hyperparameters, choose a value for Parameter sweeping mode. This option controls how the parameters are selected.

    • Entire grid: When you select this option, the module loops over a grid predefined by the system to try different combinations and identify the best learner.

      This option is useful for cases where you don't know what the best parameter settings might be and want to try many parameters.

      There are two types of grid sweeps: one that trains a model over every possible combination of values, and one that builds the matrix of all possible combinations and then randomly samples from it. Research has shown that a random grid sweep is computationally more efficient and yields essentially the same results.

    • Random sweep: When you select this option, the module will randomly select parameter values over a system-defined range. You must specify the maximum number of runs that you want the module to execute.

      This option is useful for cases where you want to increase model performance using the metrics of your choice but still conserve computing resources.

  6. For Label column, launch the column selector to choose a single label column.

  7. Choose a single metric to use when ranking the models.

    When you run a parameter sweep, all applicable metrics for the model type are calculated and are returned in the Sweep results report. Separate metrics are used for regression and classification models.

    However, the metric you choose determines how the models are ranked. Only the top model, as ranked by the chosen metric, is output as a trained model to use for scoring.

  8. For Random seed, type a number to use when initializing the parameter sweep. If you are training a model that supports an integrated parameter sweep, you can also set a range of seed values to use and iterate over the random seeds as well. This is optional, but can be useful for avoiding bias introduced by seed selection.

  9. Run the experiment.
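If it helps to see the same idea expressed in code, here is a minimal sketch using scikit-learn rather than the Studio module: it fixes one hyperparameter, sweeps a grid over the others, ranks candidates by a single chosen metric, and keeps the best model. The learner, parameter names, and dataset are illustrative assumptions, not values taken from this module.

    # A hedged scikit-learn analogy to the integrated train-and-tune sweep.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.model_selection import GridSearchCV, train_test_split

    X, y = make_classification(n_samples=500, n_features=10, random_state=0)
    X_train, X_valid, y_train, y_valid = train_test_split(X, y, random_state=0)

    # Sweep some parameters and hold others fixed, mirroring a Parameter Range
    # plus fixed settings on the learner.
    param_grid = {
        "n_estimators": [50, 100, 200],   # swept
        "learning_rate": [0.05, 0.1],     # swept
        "max_depth": [3],                 # held fixed
    }

    # "Entire grid" sweep, ranked by a single chosen metric (AUC here).
    sweep = GridSearchCV(
        GradientBoostingClassifier(random_state=0),
        param_grid,
        scoring="roc_auc",
        cv=5,
    )
    sweep.fit(X_train, y_train)

    print(sweep.best_params_)             # settings of the "best" model
    print(sweep.score(X_valid, y_valid))  # chosen metric on the validation data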

Results

  • To view a set of accuracy metrics for the best model, right-click the module, select Sweep results, and then select Visualize.

    All accuracy metrics applicable to the model type are output, but the metric that you selected for ranking determines which model is considered "best". Metrics are generated only for the top-ranked model.

  • To view the settings derived for the "best" model, right-click the module, select Trained best model, and then click Visualize. The report includes parameter settings and feature weights for the input columns.

  • To use the model for scoring in other experiments, without having to repeat the tuning process, right-click the model output and select Save as Trained Model.

How to perform cross-validation with a parameter sweep

This section describes how to combine a parameter sweep with cross-validation and a custom number of folds. This process takes longer, but you get the maximum amount of information about your dataset and the possible models.

  1. Add the Partition and Sample module, and connect your dataset.

  2. Choose the Assign to Folds option and specify some number of folds to divide the data into. If you don't specify a number, 10 folds will be used during cross-validation, and rows will be selected randomly without replacement.

    If you want to balance the sampling on some column, set Stratified split to TRUE, and then select the strata column. For example, if you have an imbalanced dataset, you might want to stratify on the label column.

  3. Add the Tune Model Hyperparameters module to your experiment.

  4. Connect one of the machine learning modules in this category to the left-hand input of Tune Model Hyperparameters.

  5. In the Properties pane for the learner, set the Create trainer mode option to Parameter Range and use the Range Builder to specify a range of values to use in the parameter sweep.

    Almost all learners in Azure Machine Learning support cross-validation with an integrated parameter sweep, which lets you choose the parameters to experiment with.

    Tip

    You don’t need to specify a range for all values. You can fix one or more parameters at a certain value manually and then sweep over the remaining parameters. This might save some computation time.

    If the learner doesn't support setting a range of values, you can still use it in cross-validation, and some range of allowed values will be selected. Learners that don't support specifying a range are listed in the Technical Notes section.

  6. Connect the output of Partition and Sample to the labeled Training dataset input of Tune Model Hyperparameters.

  7. You don’t need to connect a validation dataset to the rightmost input of Tune Model Hyperparameters – for cross-validation you just need a training dataset.

  8. In the Properties pane of Tune Model Hyperparameters, indicate whether you want to perform a random sweep or a grid sweep. You can also limit the number of iterations. See the Options section for details.

  9. Choose a single label column, and a metric to use in ranking the model.

  10. For Random seed, type a number to use when initializing the parameter sweep.

    If you are training a model that supports an integrated parameter sweep, you can also set a range of seed values to use and iterate over the random seeds as well. This is optional, but can be useful for avoiding bias introduced by seed selection.

  11. Add the Cross-Validate Model module. Connect the output of Partition and Sample to the Dataset input, and connect the output of Tune Model Hyperparameters to the Untrained model input.

  12. Select the class column, and set a random seed value if desired.

  13. Run the experiment.
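As a rough code analogy to this pipeline, the sketch below uses scikit-learn to assign rows to stratified folds, run an inner parameter sweep, and report one metric per fold, similar to what Cross-Validate Model returns as Evaluation results by fold. The learner, fold count, and parameter values are assumptions made for illustration.

    # A hedged scikit-learn analogy: parameter sweep nested inside cross-validation.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

    # An imbalanced toy dataset; stratifying on the label keeps class ratios per fold.
    X, y = make_classification(n_samples=600, weights=[0.8, 0.2], random_state=1)
    folds = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)

    # Inner sweep over a parameter range for the learner, ranked by F-score.
    sweep = GridSearchCV(
        LogisticRegression(max_iter=1000),
        {"C": [0.01, 0.1, 1.0, 10.0]},
        scoring="f1",
        cv=folds,
    )

    # Outer cross-validation: one score per fold for the tuned learner.
    per_fold = cross_val_score(sweep, X, y, scoring="f1", cv=folds)
    print(per_fold)         # per-fold metrics
    print(per_fold.mean())  # average across folds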

Results

  • To view the evaluation results, right-click the module, select Evaluation results by fold, and then select Visualize.

    The accuracy metrics are calculated from the cross-validation pass, and may vary slightly depending on how many folds you selected.

  • To see how the dataset was divided, and how the "best" model would score each row in the dataset, right-click the module, select Scored results, and then select Visualize.

    If you save this dataset for later re-use, the fold assignments are preserved. For example:

    Fold assignments | Class | Age (1st feature column)
    2                | 0     | 35
    1                | 1     | 17
    3                | 0     | 62
  • To get the parameter settings for the "best" model, right-click the Tune Model Hyperparameters module, select Trained best model, and then select Visualize.

How the parameter sweep options work

This section describes some of the more important options, and how they interact.

Parameter sweeping mode

When you set up a parameter sweep, you define the scope of your search, to use either a finite number of parameters selected randomly, or an exhaustive search over a parameter space you define.

  • The Random sweep option trains a model using a set number of iterations. You specify a range of values to iterate over, and the module uses a randomly chosen subset of those values. Values are chosen with replacement, meaning that numbers previously chosen at random are not removed from the pool of available numbers. Thus, the chance of any value being selected remains the same across all passes.

  • The Grid sweep option creates a matrix that includes every combination of the parameters in the value range you specify, and then trains multiple models using these parameters.

    You can use either the entire grid or a random selection from the grid. With the Entire grid option, every combination is tested. This is the most thorough option, but it requires the most time. However, recent research has shown that random sweeps can perform better than grid sweeps.

    If you select the Random grid option, the matrix of all combinations is calculated and values are sampled from the matrix, over the number of iterations you specified.
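As a rough illustration of how candidate settings are generated under each mode (this is not the module's internal code), the scikit-learn sketch below enumerates an entire grid, samples a limited number of cells from that grid, and draws values from a distribution for a purely random sweep; the parameter names and ranges are assumptions.

    # Entire grid vs. random grid vs. random sweep, sketched with scikit-learn helpers.
    from scipy.stats import uniform
    from sklearn.model_selection import ParameterGrid, ParameterSampler

    grid = {"learning_rate": [0.01, 0.1, 0.2], "n_estimators": [50, 100, 200]}

    # Entire grid: every combination in the matrix is tested (9 runs here).
    all_combinations = list(ParameterGrid(grid))

    # Random grid: build the same matrix, then sample a limited number of cells.
    random_grid = list(ParameterSampler(grid, n_iter=4, random_state=0))

    # Random sweep: draw values from a distribution, so any value in the range
    # can be selected on every pass.
    random_sweep = list(ParameterSampler(
        {"learning_rate": uniform(0.01, 0.2), "n_estimators": [50, 100, 200]},
        n_iter=4,
        random_state=0,
    ))

    print(len(all_combinations), random_grid[0], random_sweep[0])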

Controlling the length and complexity of training

You can limit the number of iterations used to test a model, or limit the parameter space, or both.

  • Maximum number of runs on random sweep

    If you use one of the random sweep options, you can specify how many times the model should be trained on a random combination of parameter values.

  • Maximum number of runs on random grid

    This option also controls the number of iterations over a random sampling of parameter values, but the values are not generated randomly from the specified range; instead, a matrix is created of all possible combinations of parameter values, and a random sampling is taken over the matrix. This is more efficient and less prone to regional oversampling or undersampling.

Choosing an evaluation metric

A uniform set of metrics is used for all classification models, and another set is used for all regression models. A report containing the accuracy for each model is presented at the end, so that you can review all of the metric results.

However, you must choose a single metric to use in ranking the models that are generated during the tuning process. You might find that the best metric varies, depending on your business problem, and the cost of false positives and false negatives.

For more information, see How to evaluate model performance in Azure Machine Learning.

Metric for measuring performance for classification

  • Accuracy: The proportion of true results to total cases.

  • Precision: The proportion of true results to positive results.

  • Recall: The proportion of actual positive cases that are correctly identified.

  • F-score: A measure that balances precision and recall.

  • AUC: A value that represents the area under the curve when false positives are plotted on the x-axis and true positives are plotted on the y-axis.

  • Average Log Loss: The difference between two probability distributions: the true one, and the one in the model.

  • Train Log Loss: The improvement provided by the model over a random prediction.
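For reference, the toy sketch below shows how the listed classification metrics can be computed with scikit-learn; the labels and scores are invented, and Train Log Loss, which is specific to this module's report, is omitted.

    # Classification metrics on a toy set of predictions.
    from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                                 f1_score, roc_auc_score, log_loss)

    y_true = [0, 0, 1, 1, 1]            # actual labels
    y_pred = [0, 1, 1, 1, 0]            # hard class predictions
    y_prob = [0.2, 0.6, 0.9, 0.7, 0.4]  # predicted probability of the positive class

    print(accuracy_score(y_true, y_pred))   # Accuracy
    print(precision_score(y_true, y_pred))  # Precision
    print(recall_score(y_true, y_pred))     # Recall
    print(f1_score(y_true, y_pred))         # F-score
    print(roc_auc_score(y_true, y_prob))    # AUC
    print(log_loss(y_true, y_prob))         # Average log loss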

Metric for measuring performance for regression

  • Mean absolute error: Averages all the errors in the model, where error means the distance of the predicted value from the true value. Often abbreviated as MAE.

  • Root of mean squared error: Measures the average of the squares of the errors, and then takes the root of that value. Often abbreviated as RMSE.

  • Relative absolute error: Represents the error as a percentage of the true value.

  • Relative squared error: Normalizes the total squared error by dividing by the total squared error of the predicted values.

  • Coefficient of determination: A single number that indicates how well data fits a model. A value of 1 means that the model exactly matches the data; a value of 0 means that the data is random or otherwise cannot be fit to the model. Often referred to as r2, R2, or r-squared.
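Similarly, the regression metrics can be reproduced with a few lines of NumPy and scikit-learn. The relative errors below follow the common convention of normalizing against a predictor that always returns the mean of the true values; treat that as an assumption about the exact formula rather than a statement of the module's implementation.

    # Regression metrics on a toy set of predictions.
    import numpy as np
    from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

    y_true = np.array([3.0, 5.0, 2.5, 7.0])
    y_pred = np.array([2.5, 5.0, 4.0, 8.0])

    mae = mean_absolute_error(y_true, y_pred)           # Mean absolute error
    rmse = np.sqrt(mean_squared_error(y_true, y_pred))  # Root of mean squared error

    baseline = np.abs(y_true - y_true.mean())
    rae = np.abs(y_true - y_pred).sum() / baseline.sum()          # Relative absolute error
    rse = ((y_true - y_pred) ** 2).sum() / (baseline ** 2).sum()  # Relative squared error
    r2 = r2_score(y_true, y_pred)                       # Coefficient of determination

    print(mae, rmse, rae, rse, r2)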


Technical notes

The following learners do not support setting a range of values to use in a parameter sweep:

  • Two-Class Bayes Point Machine
  • Bayesian Linear Regression
Expected inputs

Name               | Type               | Description
Untrained model    | ILearner interface | Untrained model for parameter sweep
Training dataset   | Data Table         | Input dataset for training
Validation dataset | Data Table         | Input dataset for validation (for Train/Test validation mode). This input is optional.
Module parameters

Name                                                | Range     | Type                              | Default             | Description
Specify parameter sweeping mode                     | List      | Sweep Methods                     | Random sweep        | Sweep the entire grid on the parameter space, or sweep using a limited number of sample runs
Maximum number of runs on random sweep              | [1;10000] | Integer                           | 5                   | Execute maximum number of runs using random sweep
Random seed                                         | any       | Integer                           | 0                   | Provide a value to seed the random number generator
Label column                                        | any       | ColumnSelection                   |                     | Label column
Metric for measuring performance for classification | List      | Binary Classification Metric Type | Accuracy            | Select the metric used for evaluating classification models
Metric for measuring performance for regression     | List      | Regression Metric Type            | Mean absolute error | Select the metric used for evaluating regression models
Outputs

Name               | Type               | Description
Sweep results      | Data Table         | Results metric for parameter sweep runs
Trained best model | ILearner interface | Model with best performance on the training dataset

See also

A-Z Module List
Train
Cross-Validate Model
