Add Columns

 

Updated: July 6, 2016

Adds a set of columns from one dataset to another.

You can use the Add Columns module to concatenate two datasets column-wise. The module combines all columns from the two input datasets into a single dataset. If you need to concatenate more than two datasets, use several instances of Add Columns.
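The chaining idea can be illustrated outside the module with a short pandas sketch. This is only an analogy, not the module's implementation: the helper `add_two` and the sample data are hypothetical, and rows are assumed to be matched by position.

```python
from functools import reduce

import pandas as pd


def add_two(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for one Add Columns instance.

    Places every column of `right` after every column of `left`,
    matching rows by position rather than by any key.
    """
    return pd.concat(
        [left.reset_index(drop=True), right.reset_index(drop=True)],
        axis=1,
    )


# Three small illustrative datasets with the same number of rows.
ids = pd.DataFrame({"id": [1, 2, 3]})
scores = pd.DataFrame({"score": [0.2, 0.5, 0.9]})
labels = pd.DataFrame({"label": ["a", "b", "c"]})

# Chaining: each step combines the running result with one more dataset,
# much like wiring several Add Columns instances in sequence.
combined = reduce(add_two, [ids, scores, labels])
print(list(combined.columns))
```

Each call takes exactly two inputs, so combining three datasets takes two calls, mirroring the two chained module instances you would need in an experiment.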

Tip

When combining two datasets that contain a different number of rows, use the Join Data module, which supports outer joins on a common key column.
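To make the contrast concrete, here is a minimal pandas sketch of an outer join on a common key column. It illustrates the general technique only and does not show how Join Data is implemented; the column names and values are made up for the example.

```python
import pandas as pd

# Two illustrative datasets with different row counts and a shared key column.
left = pd.DataFrame({"key": [1, 2, 3], "age": [34, 51, 27]})
right = pd.DataFrame(
    {"key": [2, 3, 4, 5], "city": ["Oslo", "Lima", "Pune", "Kyiv"]}
)

# An outer join keeps every key found in either dataset and fills the
# columns that have no matching row with missing values.
joined = left.merge(right, on="key", how="outer")
print(joined)
```

Rows are matched by the value in `key`, not by position, so a dataset with extra or missing rows does not shift the other dataset's values onto the wrong row.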

  1. Add the Add Columns module to your experiment.

  2. Connect the two datasets that you want to concatenate. If you want to combine more than two datasets, you can chain together several instances of Add Columns.

    • It is possible to combine two datasets that have a different number of rows. In that case, the columns from the smaller dataset contain nulls for every row that the smaller dataset is missing.

    • You cannot choose individual columns to add. All the columns from each dataset are concatenated when you use Add Columns. Therefore, if you want to add only a subset of the columns, use Select Columns in Dataset to create a dataset with the columns you want.

  3. Run the experiment.

    You can right-click the output of Add Columns and select View Results to see the first rows of the new dataset, or you can select Save as Dataset to save and name the concatenated dataset.

    • The number of columns in the new dataset equals the sum of the columns of both input datasets.

    • If both input datasets contain a column with the same name, a numeric suffix is added to the name of the column from the dataset connected to the right input. For example, if both datasets contain a column named TargetOutcome, the column from the right dataset is renamed TargetOutcome (1).
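The behaviors described in the steps above (null fill for unequal row counts, column count equal to the sum of the inputs, and a numeric suffix on duplicate names) can be sketched in pandas as follows. This is an approximation under stated assumptions, not the module's code: `add_columns` is a hypothetical helper, rows are assumed to be matched by position, and the suffix format simply follows the TargetOutcome (1) example.

```python
import pandas as pd


def add_columns(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical pandas approximation of the documented behavior."""
    left = left.reset_index(drop=True)
    right = right.reset_index(drop=True).copy()

    # Rename right-hand columns whose names are already taken,
    # appending " (1)", " (2)", ... until the name is unique.
    taken = set(left.columns)
    new_names = []
    for name in right.columns:
        candidate, n = name, 0
        while candidate in taken:
            n += 1
            candidate = f"{name} ({n})"
        taken.add(candidate)
        new_names.append(candidate)
    right.columns = new_names

    # Positional, column-wise concatenation; rows missing from the
    # shorter dataset are filled with missing values.
    return pd.concat([left, right], axis=1)


left = pd.DataFrame({"TargetOutcome": [1, 0, 1], "Age": [34, 51, 27]})
right = pd.DataFrame({"TargetOutcome": [0, 1]})  # one row fewer than left

combined = add_columns(left, right)
print(combined)
```

In this sketch the right-hand column comes out as TargetOutcome (1), and its third row is missing because the right dataset has only two rows.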

For examples of how Add Columns is used in an experiment, see the sample experiments in the Model Gallery.

Expected inputs

Name              Type          Description
Left dataset      Data Table    Left dataset
Right dataset     Data Table    Right dataset

Outputs

Name              Type          Description
Combined dataset  Data Table    Combined dataset

For a list of all exceptions, see Machine Learning Module Error Codes.

Exception     Description
Error 0003    An exception occurs if one or more input datasets is null or empty.
Error 0017    An exception occurs if one or more specified columns has a type that is unsupported by the current module.
