Merge Count Transform

 

Updated: April 12, 2016

Creates a set of features based on a counts table

You can use the Merge Count Transform module to combine two sets of count-based features. By merging two sets of related counts and features, you can potentially improve the coverage and distribution of the features.

In general, learning from counts is particularly useful in large data sets with high-cardinality features. Therefore, the ability to combine multiple datasets into count-based feature sets without having to reprocess the datasets makes it easier to gather statistics on very large datasets and apply them to new datasets. For example, count tables can be used to collect information over terabytes of data. You can re-use those statistics to improve the accuracy of predictive models on small data sets.
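To make the idea of reusing count statistics concrete, the following is a minimal sketch, not the module's implementation. It assumes a count table is simply a mapping from a feature value to per-class counts; the names `build_count_table` and `featurize` are hypothetical and exist only for this illustration.

```python
from collections import Counter, defaultdict


def build_count_table(rows):
    """Count how often each feature value occurs with each class label.

    `rows` is an iterable of (feature_value, label) pairs. This layout is an
    assumption made for the sketch, not the module's internal format.
    """
    table = defaultdict(Counter)
    for value, label in rows:
        table[value][label] += 1
    return table


def featurize(table, value, labels=(0, 1)):
    """Look up the per-class counts for one feature value (0 if unseen)."""
    counts = table.get(value, {})
    return [counts.get(label, 0) for label in labels]


# Statistics gathered once on a (notionally) large dataset ...
large_dataset = [("US", 1), ("US", 0), ("US", 0), ("DE", 1)]
table = build_count_table(large_dataset)

# ... and reused to build features for rows of a smaller dataset.
us_features = featurize(table, "US")
fr_features = featurize(table, "FR")
```

The counts are computed once and then only looked up, which is why the statistics can be reused on new data without reprocessing the original dataset.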

To merge two sets of count-based features, the features must have been created using tables that have the same schema: that is, both sets must use the same columns, with the same names and data types.
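The schema requirement can be pictured as a check like the one below. This is a sketch under stated assumptions: a schema is represented as a plain dictionary from column name to type name, and `schema_mismatches` is a hypothetical helper, not part of the module or of any Azure Machine Learning API.

```python
def schema_mismatches(schema_a, schema_b):
    """Return a list of human-readable differences between two schemas.

    Each schema is assumed to be a dict of {column_name: type_name}.
    An empty list means both tables use the same columns, with the same
    names and data types.
    """
    problems = []
    for name in sorted(set(schema_a) | set(schema_b)):
        if name not in schema_a or name not in schema_b:
            problems.append(f"column {name!r} is missing from one table")
        elif schema_a[name] != schema_b[name]:
            problems.append(
                f"column {name!r} has types {schema_a[name]} and {schema_b[name]}"
            )
    return problems


first = {"country": "string", "clicks": "int"}
second = {"country": "string", "clicks": "float", "device": "string"}
issues = schema_mismatches(first, second)
```

Here `issues` reports the type difference on `clicks` and the extra `device` column, so these two tables would not qualify for merging.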

  1. To use Merge Count Transform, you must have created at least one count-based transformation and made it available in your workspace. For saved count-based transformations, look in the Transforms group. For transformations in the current experiment, see the outputs of the following modules:

  2. Add the Merge Count Transform module to the experiment, and connect a transformation to each input.

    Tip

    The second transformation is an optional input – you can connect the same transformation twice, or connect nothing on the second input port.

  3. If you do not want the second dataset to be weighted equally with the first, specify a value for Decay factor. The value that you type indicates how the set of features from the second transformation should be weighted.

    For example, the default value of 1 weights both sets of features equally. A value of 0.5 means that the features in the second set have half the weight of those in the first set.

  4. Optionally, add an instance of the Apply Transformation module, and apply the transformation to a dataset.
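The effect of Decay factor in step 3 can be sketched as follows. This is an illustration only: it assumes that merging adds the counts of the second transformation, multiplied by the decay factor, to the counts of the first, and that a count table is a nested dictionary of per-class counts. The function `merge_counts` is hypothetical and is not the module's actual code.

```python
from collections import defaultdict


def merge_counts(previous, new, decay_factor=1.0):
    """Merge two count tables as previous + decay_factor * new.

    Both tables are assumed to map feature_value -> {label: count}.
    """
    merged = defaultdict(lambda: defaultdict(float))
    for table, weight in ((previous, 1.0), (new, decay_factor)):
        for value, class_counts in table.items():
            for label, count in class_counts.items():
                merged[value][label] += weight * count
    # Convert back to plain dicts for easier inspection.
    return {value: dict(counts) for value, counts in merged.items()}


previous = {"US": {0: 80, 1: 20}, "DE": {0: 10, 1: 5}}
new = {"US": {0: 40, 1: 10}, "FR": {0: 6, 1: 2}}

# With a decay factor of 0.5, counts from the second table carry half weight.
merged = merge_counts(previous, new, decay_factor=0.5)
```

In this sketch, values that appear in only one table (such as `"DE"` or `"FR"`) still appear in the merged result, which is how merging can improve feature coverage.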

You can see examples of how this module is used by exploring these sample experiments in the Model Gallery:

| Name | Type | Description |
| --- | --- | --- |
| Previous counting transform | ITransform interface | The counting transform to edit. |
| New counting transform | ITransform interface | The counting transform to add. |

  

| Name | ToHide | Type | Range | Optional | Default | Description |
| --- | --- | --- | --- | --- | --- | --- |
| Decay factor | dampingFactor | Float | | Required | 1.0f | The decay factor by which the counting transform at the right input port is multiplied. |

| Name | Type | Description |
| --- | --- | --- |
| Merged counting transform | ITransform interface | The merged transform. |

| Exception | Description |
| --- | --- |
| Error 0003 | An exception occurs if one or more inputs are null or empty. |
| Error 0086 | An exception occurs when a counting transform is invalid. |
