SystemGetCrossValidationResults (Analysis Services - Data Mining)
Applies To: SQL Server 2016
Partitions the mining structure into the specified number of cross-sections, trains a model for each partition, and then returns accuracy metrics for each partition.
SystemGetCrossValidationResults( <mining structure> [, <mining model list>] ,<fold count> ,<max cases> ,<target attribute> [,<target state>] [,<target threshold>] [,<test list>])
Name of a mining structure in the current database.
mining model list
Comma-separated list of mining models to validate.
If a model name contains any characters that are not valid in the name of an identifier, the name must be enclosed in brackets.
If a list of mining models is not specified, cross-validation is performed against all models that are associated with the specified structure and that contain a predictable attribute.
Integer that specifies the number of partitions into which to separate the data set. The minimum value is 2. The maximum number of folds is maximum integer or the number of cases, whichever is lower.
Each partition will contain roughly this number of cases: max cases/fold count.
There is no default value.
Integer that specifies the maximum number of cases that can be tested across all folds.
A value of 0 indicates that all the cases in the data source will be used.
If you specify a value that is greater than the actual number of cases in the data set, all cases in the data source will be used.
There is no default value.
String that contains the name of the predictable attribute. A predictable attribute can be a column, nested table column, or nested table key column of a mining model.
Formula that specifies the value to predict. If a target value is specified, metrics are collected for the specified value only.
If a value is not specified or is null, the metrics are computed for the most probable state for each prediction.
The default is null.
An error is raised during validation if the specified value is not valid for the specified attribute, or if the formula is not the correct type for the specified attribute.
Double greater than 0 and less than 1. Indicates the minimum probability score that must be obtained for the prediction of the specified target state to be counted as correct.
A prediction that has a probability less than or equal to this value is considered incorrect.
If no value is specified or is null, the most probable state is used, regardless of its probability score.
The default is null.
A string that specifies testing options.
Note This parameter is reserved for future use.
The rowset that is returned contains scores for each partition in each model.
The following table describes the columns in the rowset.
|ModelName||The name of the model that was tested.|
|AttributeName||The name of the predictable column.|
|AttributeState||A specified target value in the predictable column. If this value is null, the most probable prediction was used.|
If this column contains a value, the accuracy of the model is assessed against this value only.
|PartitionIndex||An 1-based index that identifies to which partition the results apply.|
|PartitionSize||An integer that indicates how many cases were included in each partition.|
|Test||Category of the test that was performed. For a description of the categories and the tests that are included in each category, see Measures in the Cross-Validation Report.|
|Measure||The name of the measure returned by the test. Measures for each model depend on the type of the predictable value. For a definition of each measure, see Cross-Validation (Analysis Services - Data Mining).|
For a list of measures returned for each predictable type, see Measures in the Cross-Validation Report.
|Value||The value of the specified test measure.|
To return accuracy metrics for the complete data set, use SystemGetAccuracyResults (Analysis Services - Data Mining).
If the mining model has already been partitioned into folds, you can bypass processing and return only the results of cross-validation by using SystemGetAccuracyResults (Analysis Services - Data Mining).
The following example demonstrates how to partition a mining structure for cross-validation into two folds, and then test two mining models that are associated with the mining structure,
[v Target Mail].
Line three of the code lists the mining models that you want to test. If you do not specify the list, all non-clustering models associated with the structure are used. Line four of the code specifies the number of partitions. Because no value is specified for max cases, all cases in the mining structure are used and distributed evenly across the partitions.
Line five specifies the predictable attribute, Bike Buyer, and line six specifies the value to predict, 1 (meaning "yes, will buy").
The NULL value in line seven indicates that there is no minimum probability bar that must be met. Therefore, the first prediction that has a non-zero probability will be used in assessing accuracy.
CALL SystemGetCrossValidationResults( [v Target Mail], [Target Mail DT], [Target Mail NB], 2, 'Bike Buyer', 1, NULL )
|Target Mail DT||Bike Buyer||1||1||500||Classification||True Positive||144|
|Target Mail DT||Bike Buyer||1||1||500||Classification||False Positive||105|
|Target Mail DT||Bike Buyer||1||1||500||Classification||True Negative||186|
|Target Mail DT||Bike Buyer||1||1||500||Classification||False Negative||65|
|Target Mail DT||Bike Buyer||1||1||500||Likelihood||Log Score||-0.619042807138345|
|Target Mail DT||Bike Buyer||1||1||500||Likelihood||Lift||0.0740963734002671|
|Target Mail DT||Bike Buyer||1||1||500||Likelihood||Root Mean Square Error||0.346946279977653|
|Target Mail DT||Bike Buyer||1||2||500||Classification||True Positive||162|
|Target Mail DT||Bike Buyer||1||2||500||Classification||False Positive||86|
|Target Mail DT||Bike Buyer||1||2||500||Classification||True Negative||165|
|Target Mail DT||Bike Buyer||1||2||500||Classification||False Negative||87|
|Target Mail DT||Bike Buyer||1||2||500||Likelihood||Log Score||-0.654117781086519|
|Target Mail DT||Bike Buyer||1||2||500||Likelihood||Lift||0.038997399132084|
|Target Mail DT||Bike Buyer||1||2||500||Likelihood||Root Mean Square Error||0.342721344892651|
Cross-validation is available only in SQL Server Enterprise beginning with SQL Server 2008.
SystemGetAccuracyResults (Analysis Services - Data Mining)
SystemGetClusterCrossValidationResults (Analysis Services - Data Mining)
SystemGetClusterAccuracyResults (Analysis Services - Data Mining)