Data Transformation
Updated: September 22, 2015
The modules in this group support a wide range of operations necessary to prepare data for use in machine learning algorithms.
Filter: Digital signal filters can be applied to numeric data to support machine learning tasks such as image recognition, voice recognition, and waveform analysis.
Learning with Counts: Count-based featurization modules help you develop compact features for use in machine learning.
Manipulate: Tools to support data preparation for machine learning, including tasks such as merging datasets, cleaning missing values, grouping and summarizing data, changing column names and data types, and indicating which column is a label or feature.
Sample and Split: Divide your data into training and test sets, split datasets by percentage or by a filter condition, or perform sampling.
Scale and Reduce: Prepare numerical data for analysis by applying normalization or scaling, binning data into groups, removing or replacing outliers, or performing principal component analysis (PCA).
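The modules in these categories are configured in the product rather than written as code. As a rough illustration of what a few of the operations do, here is a minimal Python sketch using only the standard library. The function names (`split_rows`, `min_max_scale`, `count_features`) and their parameters are hypothetical and chosen for this example; they are not part of the modules and do not reflect how the modules are implemented.

```python
import random
from collections import Counter


def split_rows(rows, train_fraction=0.7, seed=0):
    """Shuffle rows and split them by percentage into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def min_max_scale(values):
    """Rescale numeric values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def count_features(categories, labels):
    """Count how often each category value co-occurs with each label."""
    return dict(Counter(zip(categories, labels)))


rows = list(range(10))
train, test = split_rows(rows, train_fraction=0.7, seed=42)
scaled = min_max_scale([2.0, 4.0, 6.0])
counts = count_features(["a", "b", "a"], [1, 0, 1])
```

The split keeps every row exactly once across the two sets, the scaling maps the smallest value to 0 and the largest to 1, and the counts give a compact per-value summary that could stand in for a high-cardinality categorical column.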
Click the links in the table to see a complete list of the Data Transformation modules in each category:
| Category |
|---|