Statistical Functions
Updated: September 21, 2017
This article describes a group of modules in Azure Machine Learning that support mathematical and statistical operations critical for machine learning. When added to a machine learning experiment, modules in the Statistical Functions group perform tasks such as these:
- Perform ad hoc computations on column values, including rounding.
- Compute means, logarithms, and other statistics commonly used in machine learning.
- Calculate correlations among columns.
- Generate probability scores for column values.
- Calculate z scores for multiple test types and sample sizes.
- Test the values in a column against a variety of statistical distributions.
- Generate statistical reports that summarize a set of columns or a complete dataset.
For example, the Summarize Data module generates a report for an entire dataset that includes standard statistical measures such as mean and standard deviation.
Using the Compute Elementary Statistics module, you can generate additional descriptive statistics, such as sample skewness or interquartile distance.
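To make these measures concrete, the following is a minimal Python sketch of how mean, standard deviation, sample skewness, and interquartile distance could be computed outside the modules. It is not the modules' implementation. The function name `describe`, the quartile convention, and the choice of the adjusted Fisher-Pearson skewness formula are assumptions made for illustration; the modules may use different definitions.

```python
import statistics


def describe(values):
    """Return a few descriptive statistics for a sequence of numbers.

    Hypothetical helper for illustration; not part of Azure Machine Learning.
    """
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values for sample skewness")

    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)

    # Quartiles via the standard library's default ("exclusive") method;
    # other quartile conventions give slightly different results.
    q1, _median, q3 = statistics.quantiles(values, n=4)

    # Adjusted Fisher-Pearson sample skewness (one of several definitions).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined for constant data")
    skewness = (m3 / m2 ** 1.5) * (n * (n - 1)) ** 0.5 / (n - 2)

    return {"mean": mean, "stdev": stdev, "skewness": skewness, "iqr": q3 - q1}
```

Calling `describe` on a symmetric column such as `[1, 2, 3, 4, 5]` yields a skewness of zero, which is a quick sanity check on the formula.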
The Statistical Functions category includes the following modules:
| Module | Description |
|---|---|
| Apply Math Operation | Applies a mathematical operation to column values |
| Compute Elementary Statistics | Calculates specified summary statistics for selected dataset columns |
| Compute Linear Correlation | Calculates the linear correlation between column values in a dataset |
| Evaluate Probability Function | Fits a specified probability distribution function to a dataset |
| Replace Discrete Values | Replaces discrete values from one column with numeric values based on another column |
| Summarize Data | Generates a basic descriptive statistics report for the columns in a dataset |
| Test Hypothesis Using t-Test | Compares means from two datasets using a t-test |
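
As a rough illustration of the kind of computation behind two of the modules in the table, the sketch below computes a Pearson linear correlation between two columns and a two-sample t statistic. The function names are hypothetical, and the t-test shown is Welch's unequal-variance variant, chosen here as an assumption; the Test Hypothesis Using t-Test module may offer other variants and also reports values not computed here.

```python
import math
import statistics


def pearson_correlation(xs, ys):
    """Pearson linear correlation coefficient between two equal-length columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("columns must have the same length (at least 2)")
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant column")
    return sxy / math.sqrt(sxx * syy)


def welch_t(a, b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    # Per-sample variance of the mean: sample variance divided by sample size.
    va = statistics.variance(a) / len(a)
    vb = statistics.variance(b) / len(b)
    t = (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

Turning the t statistic into a p-value requires the Student's t distribution, which the Python standard library does not provide, so that step is left out of this sketch.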