Statistical Functions

 

Updated: September 21, 2017

This article describes a group of modules in Azure Machine Learning that support mathematical and statistical operations critical for machine learning. When added to a machine learning experiment, modules in the Statistical Functions group perform tasks such as these:

  • Perform ad hoc computations on column values, including rounding. Compute means, logarithms, and other statistics commonly used in machine learning.

  • Calculate correlations among columns.

  • Generate probability scores for column values.

  • Calculate z scores for multiple test types and sample sizes.

  • Test the values in a column against a variety of statistical distributions.

  • Generate statistical reports that summarize a set of columns or a complete dataset.

    For example, the Summarize Data module generates a report for an entire dataset that includes standard statistical measures such as mean and standard deviation.

    Using the Compute Elementary Statistics module, you can generate additional descriptive statistics, such as sample skewness or interquartile distance.
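For illustration, the kinds of computations listed above can be approximated outside of Azure Machine Learning. The following Python sketch uses only the standard library. The function names (`apply_math`, `summarize`, `linear_correlation`) are hypothetical and are not part of any Azure Machine Learning module or API; the modules themselves may use different definitions, for example for quartiles.

```python
import math
import statistics


def apply_math(values, digits=2):
    """Ad hoc computation on a column: natural log, then rounding."""
    return [round(math.log(v), digits) for v in values]


def summarize(values):
    """Basic descriptive statistics for one numeric column."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "iqr": q3 - q1,                     # interquartile distance
    }


def linear_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length columns."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


if __name__ == "__main__":
    column_a = [1.0, 2.0, 3.0, 4.0, 5.0]
    column_b = [2.0, 4.0, 6.0, 8.0, 10.0]
    print(apply_math(column_a))
    print(summarize(column_a))
    print(linear_correlation(column_a, column_b))
```

In an experiment, the corresponding modules operate on selected dataset columns rather than on Python lists, so this sketch only shows the underlying arithmetic.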

The Statistical Functions category includes the following modules:

Module                           Description
Apply Math Operation             Applies a mathematical operation to column values
Compute Elementary Statistics    Calculates specified summary statistics for selected dataset columns
Compute Linear Correlation       Calculates the linear correlation between column values in a dataset
Evaluate Probability Function    Fits a specified probability distribution function to a dataset
Replace Discrete Values          Replaces discrete values from one column with numeric values based on another column
Summarize Data                   Generates a basic descriptive statistics report for the columns in a dataset
Test Hypothesis Using t-Test     Compares means from two datasets using a t-test
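
As a rough illustration of what comparing two means with a t-test involves, the following Python sketch computes a two-sample t statistic and its approximate degrees of freedom. It assumes Welch's unequal-variance form, which this article does not specify; the function name `welch_t` is hypothetical, and the sketch is not the implementation used by the Test Hypothesis Using t-Test module.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for two independent samples.

    Assumption: Welch's form, which does not require equal variances.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance
    var_b = statistics.variance(sample_b)
    # Squared standard error of the difference between the two means.
    se_squared = var_a / n_a + var_b / n_b
    t_stat = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se_squared)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = se_squared ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t_stat, dof


if __name__ == "__main__":
    t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
    print(t_stat, dof)
```

Turning the statistic into a p-value requires the cumulative distribution function of the t distribution, which the Python standard library does not provide, so that step is left out here.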

See also

Scale and Reduce
Feature Selection
Learning with Counts
Module Categories and Descriptions
A-Z Module List
