Quantize (deprecated)

 

Updated: July 2, 2015

Puts numerical data into bins

You can use the Quantize module to bin numbers.

Warning

This module is provided for backward compatibility with experiments created using the pre-release version of Azure Machine Learning, and will soon be deprecated. We recommend that you modify your experiments to use Quantize Data instead.

This module is useful for binning and flattening the distribution of continuous data.

During binning, each input element is mapped to a bin by comparing its value against the positions of the bin edges. For example, if the value is 1.5 and the bin edges are 1, 2, and 3, the element is mapped to bin number 2. The value 0.5 is mapped to bin number 1 (the underflow bin), and the value 3.5 is mapped to bin number 4 (the overflow bin).
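The edge comparison described above can be sketched in Python. This is a minimal illustration, not the module's implementation: the helper name `bin_index` is hypothetical, and placing a value that falls exactly on an edge into the higher bin is an assumption, because the text does not say how ties are handled.

```python
from bisect import bisect_right

def bin_index(value, edges):
    """Return the 1-based bin index of value, given ascending bin edges.

    Hypothetical helper for illustration only. Values below the first edge
    land in bin 1 (underflow); values above the last edge land in bin
    len(edges) + 1 (overflow). Assumption: a value equal to an edge is
    placed in the higher bin.
    """
    # bisect_right counts how many edges are <= value; add 1 for 1-based bins.
    return bisect_right(edges, value) + 1

edges = [1, 2, 3]
for v in (0.5, 1.5, 3.5):
    print(v, "->", bin_index(v, edges))
```

With the edges 1, 2, 3 from the example, the three values map to bins 1, 2, and 4 respectively.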

The help for this module was based on an early release version and will not be updated because this module is deprecated. For up-to-date information about how quantization works, see the help topic for Quantize Data.

  • The option Tag columns as categorical can be used to control whether the quantized columns become categorical variables.

  • To apply different quantization rules to different columns, chain together multiple instances of the Quantize module, and in each instance select a subset of columns to quantize.

  • The same binning rule is applied to all columns specified in the list of columns to quantize. The output mode specifies how the quantized values are returned. Choices include appending the results to the input table, overwriting columns in the input table, and returning the result columns only.

  • Input columns must be numeric, and for quantile binning there must be a sufficient range of data points to determine the quantiles. Otherwise, an error or a NaN result may occur.

The bin indices are 1-based. This is the natural convention for quantiles (1st quantile, 2nd quantile, and so on). The only exception is the case when the column to bin is sparse.

All NaNs and missing values are propagated from the input column to the output column. The only exception is the case when the module returns quantile indices. In this case, all NaNs are promoted to missing values.
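The propagation rule can be sketched as follows. This is only an illustration under stated assumptions: the function `propagate` and the flag `returns_indices` are hypothetical names, and a missing value is represented here by Python's `None`.

```python
import math

def propagate(value, bin_fn, returns_indices=False):
    """Apply bin_fn to value, passing NaN and missing values through.

    Hypothetical helper for illustration only. Missing values are modeled
    as None. When returns_indices is True, NaN is promoted to missing,
    mirroring the exception described in the text.
    """
    if value is None:
        return None  # missing stays missing
    if isinstance(value, float) and math.isnan(value):
        # NaN is propagated, unless quantile indices are being returned.
        return None if returns_indices else math.nan
    return bin_fn(value)
```

Here `bin_fn` stands for whatever binning rule is in effect; it is called only for ordinary numeric values.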

If the column to bin (quantize) is sparse, then a bin index offset (quantile offset) is used when the resulting column is populated. The offset is chosen so that a sparse 0 always goes to the bin with index 0 (the quantile with value 0). As a result, sparse zeros are propagated from the input column to the output column. Note that processing a dense column always produces a result with a minimum bin index equal to 1 (in other words, the minimum quantile value equals the minimum value in the column). Processing a sparse column produces a result in which the minimum bin index (minimum quantile value) varies.
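One way to read the offset rule is sketched below. It is an interpretation, not the module's documented algorithm: it assumes the offset equals the 1-based index of the bin that zero would fall into, so subtracting it sends zero to bin 0. The name `sparse_bin_index` is hypothetical, and values equal to an edge are assumed to go to the higher bin.

```python
from bisect import bisect_right

def sparse_bin_index(value, edges):
    """Return an offset bin index such that the value 0 maps to bin 0.

    Hypothetical helper for illustration only. Assumption: the offset is
    the 1-based index of the bin that contains zero.
    """
    one_based = bisect_right(edges, value) + 1  # ordinary 1-based bin index
    offset = bisect_right(edges, 0) + 1         # 1-based bin that holds zero
    return one_based - offset

# Zero always maps to bin 0, but the minimum index depends on the edges.
print(sparse_bin_index(0, [1, 2, 3]), sparse_bin_index(1.5, [1, 2, 3]))
print(sparse_bin_index(0, [-2, -1]), sparse_bin_index(-3, [-2, -1]))
```

Under this reading, values below zero's bin receive negative indices, which is one way the minimum bin index can differ from column to column.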

| Name | Type | Description |
|---|---|---|
| Dataset | Data Table | Dataset to be analyzed |

| Name | Range | Type | Default | Description |
|---|---|---|---|---|
| Binning mode | any | QuantizationMode | Quantiles | Choose a binning method |
| Columns to bin | any | ColumnSelection | NumericAll | Choose columns for quantization |
| Output mode | any | OutputTo | | Indicate how quantized columns should be output |
| Tag columns as categorical | any | Boolean | true | Indicate whether output columns should be tagged as categorical |

| Name | Type | Description |
|---|---|---|
| Quantized dataset | Data Table | Dataset with quantized columns |
| Binning function | Function | Function that applies quantization to the dataset |

| Exception | Description |
|---|---|
| Error 0003 | Exception occurs if one or more inputs are null or empty. |
| Error 0004 | Exception occurs if a parameter is less than or equal to a specific value. |
| Error 0011 | Exception occurs if the passed column set argument does not apply to any of the dataset columns. |
| Error 0021 | Exception occurs if the number of rows in some of the datasets passed to the module is too small. |
| Error 0024 | Exception occurs if the dataset does not contain a label column. |
| Error 0020 | Exception occurs if the number of columns in some of the datasets passed to the module is too small. |
| Error 0038 | Exception occurs if the number of elements is expected to be an exact value, but is not. |
| Error 0005 | Exception occurs if a parameter is less than a specific value. |
| Error 0002 | Exception occurs if one or more parameters could not be parsed or converted from the specified type into the type required by the target method. |
| Error 0019 | Exception occurs if a column is expected to contain sorted values, but it does not. |
| Error 0039 | Exception occurs if an operation has failed. |
