Data Transformation - Scale and Reduce

 

Updated: October 6, 2017

This article describes the modules in Azure Machine Learning Studio that help you work with numerical data. In machine learning, common data tasks include clipping, binning, and normalizing numerical values. Other modules in this group support dimensionality reduction.

Tasks such as normalizing, binning, or redistributing numerical variables are an important part of data preparation. The modules in this group support the following data preparation tasks:

  • Grouping data into bins of varying sizes or distributions

  • Removing outliers or changing their values

  • Normalizing a set of numeric values into a specific range

  • Creating a compact set of feature columns from a high-dimension dataset
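
To make the first three tasks concrete, here is a minimal sketch in plain Python. It does not show how the Studio modules are implemented. The function names (`clip_values`, `min_max_normalize`, `equal_width_bins`) and the sample `ages` data are hypothetical, and equal-width binning and min-max scaling are each only one of several possible strategies.

```python
def clip_values(values, lower, upper):
    """Clamp each value into [lower, upper]; one simple way to handle outliers."""
    return [min(max(v, lower), upper) for v in values]


def min_max_normalize(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value a bin index in 0..n_bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so cap it at the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical sample data: 250 is an obvious outlier.
ages = [22, 25, 31, 38, 45, 52, 250]

clipped = clip_values(ages, 0, 100)
normalized = min_max_normalize(clipped)
bins = equal_width_bins(clipped, 4)
```

Clipping comes first in this sketch because a single extreme value would otherwise stretch the range and compress every other value toward zero during normalization.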

Related Tasks

In addition to these modules, you might find the following related tools useful for transforming numeric data:

This category includes the following modules:

Module                         Description
Clip Values                    Detects outliers and clips or replaces their values
Group Data into Bins           Puts numerical data into bins
Normalize Data                 Rescales numeric data to constrain dataset values to a standard range
Principal Component Analysis   Computes a set of features with reduced dimensionality for more efficient learning
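
As a rough illustration of what dimensionality reduction means here, the following sketch finds the first principal component of a small dataset using power iteration on the covariance matrix. This is an assumption-laden teaching example in plain Python, not the algorithm the Principal Component Analysis module uses. The function name `first_principal_component` and the sample `points` are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, projections) for the top principal component."""
    n, d = len(rows), len(rows[0])

    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Power iteration: repeated multiplication by the covariance matrix
    # converges toward the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # Degenerate data with zero variance.
        v = [x / norm for x in w]

    # Project every centered row onto the direction: one feature instead of d.
    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections


# Hypothetical data lying exactly on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, scores = first_principal_component(points)
```

In this example the two input columns are perfectly correlated, so the single `scores` column retains all of the variance in the original data.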

