Data Transformation / Scale and Reduce

 

Updated: April 11, 2016

The modules in this group help you clip, bin, and normalize numerical values, as well as reduce the number of columns in the dataset.

Normalizing, binning, or redistributing numerical variables is an important part of data preparation for many machine learning tasks. The modules in this group help you perform these data preparation tasks:

  • Grouping data into bins of varying sizes or distributions

  • Removing outliers or changing their values

  • Normalizing a set of numeric values into a specific range

  • Creating a compact set of feature columns from a high-dimension dataset
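The first three tasks in this list can be illustrated with a minimal sketch in plain Python. This is not how the modules are implemented. The function names (`clip_values`, `equal_width_bins`, `min_max_normalize`), the fixed clipping bounds, the equal-width binning strategy, and the [0, 1] target range are all assumptions chosen for illustration, and the modules themselves may offer other modes.

```python
from typing import List, Sequence


def clip_values(values: Sequence[float], lower: float, upper: float) -> List[float]:
    """Replace values outside [lower, upper] with the nearest bound."""
    return [min(max(v, lower), upper) for v in values]


def equal_width_bins(values: Sequence[float], n_bins: int) -> List[int]:
    """Assign each value a bin index from 0 to n_bins - 1 using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would otherwise land in bin n_bins, so cap the index.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def min_max_normalize(values: Sequence[float]) -> List[float]:
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sample column; 100.0 stands in for an outlier.
readings = [2.0, 4.0, 6.0, 8.0, 100.0]

clipped = clip_values(readings, lower=0.0, upper=10.0)
bins = equal_width_bins(clipped, n_bins=4)
scaled = min_max_normalize(clipped)

print(clipped)  # [2.0, 4.0, 6.0, 8.0, 10.0]
print(bins)     # [0, 1, 2, 3, 3]
print(scaled)   # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Clipping before normalizing matters in this sketch: if the outlier were left in place, min-max scaling would compress the other four values into a narrow band near 0.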

Related Tasks

In addition to these modules, you might find the following related tools useful for transforming numeric data:

The category Data Transformation / Scale and Reduce includes these modules:

Module                         Description
Clip Values                    Detects outliers and clips or replaces their values
Group Data into Bins           Puts numerical data into bins
Normalize Data                 Rescales numeric data to constrain dataset values to a standard range
Principal Component Analysis   Computes a set of features with reduced dimensionality for more efficient learning
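To make "features with reduced dimensionality" more concrete, the following rough sketch finds the first principal component of a small dataset and projects two columns down to one. It uses power iteration on the sample covariance matrix, which is one simple way to approximate the leading component and is not necessarily the method the module uses. The function names `first_principal_component` and `project`, and the sample data, are hypothetical.

```python
import math
from typing import List


def column_means(rows: List[List[float]]) -> List[float]:
    """Mean of each column."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def first_principal_component(rows: List[List[float]], iterations: int = 200) -> List[float]:
    """Approximate the unit vector along the direction of greatest variance."""
    n, d = len(rows), len(rows[0])
    means = column_means(rows)
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeatedly multiply by the covariance matrix and
    # renormalize. Assumes the start vector is not orthogonal to the answer.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    return vec


def project(rows: List[List[float]], component: List[float]) -> List[float]:
    """Reduce each centered row to a single value along the component."""
    means = column_means(rows)
    return [
        sum((r[j] - means[j]) * component[j] for j in range(len(component)))
        for r in rows
    ]


# Hypothetical data: two perfectly correlated columns (second = 2 * first).
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]

component = first_principal_component(data)
reduced = project(data, component)

print(component)  # approximately [0.447, 0.894], i.e. (1, 2) / sqrt(5)
print(reduced)    # one value per row instead of two
```

Because the two columns in this sample are perfectly correlated, a single derived column retains all of the variance. Real datasets usually need several components, and a full implementation would compute them together rather than one at a time.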
