Updated:
30 April 2009
Used only with clustering models. This function returns the likelihood that an input case will fit in the existing model.
PredictCaseLikelihood([NORMALIZED|NONNORMALIZED])
- NORMALIZED
  The return value contains the probability of the case within the model divided by the probability of the case without the model.
- NONNORMALIZED
  The return value contains the raw probability of the case, which is the product of the probabilities of the case attributes.
Applies to models that are built by using the Microsoft Clustering and Microsoft Sequence Clustering algorithms.
Returns a double-precision floating-point number between 0 and 1. A number closer to 1 indicates that the case has a higher probability of occurring in this model. A number closer to 0 indicates that the case is less likely to occur in this model.
By default, the result of the PredictCaseLikelihood function is normalized. Normalized values are typically more useful as the number of attributes in a case increases and the differences between the raw probabilities of any two cases become much smaller.
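To see why raw probabilities become hard to compare as cases gain attributes, the following minimal Python sketch multiplies per-attribute probabilities, as the NONNORMALIZED description states. The function name `raw_likelihood` and the probability value 0.5 are hypothetical and chosen only for illustration; this is not the Analysis Services implementation.

```python
import math

def raw_likelihood(attribute_probabilities):
    """Return the product of per-attribute probabilities.

    Illustrative only: mirrors the description of the raw
    (NONNORMALIZED) value as a product of attribute probabilities.
    """
    return math.prod(attribute_probabilities)

# Hypothetical cases in which every attribute has probability 0.5.
few_attributes = raw_likelihood([0.5] * 5)    # 5 attributes
many_attributes = raw_likelihood([0.5] * 50)  # 50 attributes

print(few_attributes, many_attributes)
```

With 50 attributes the product is already smaller than 1E-15, so the raw values of any two cases both sit very close to zero.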
The following equations are used to calculate the normalized value, given x and y:
- x = likelihood of the case based on the clustering model
- y = marginal case likelihood, calculated as the log likelihood of the case based on counting the training cases
z = Exp(Log(x) - Log(y))
Normalized = z / (1 + z)
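The normalization can be sketched in Python as follows. This is a minimal illustration under one assumption: both x and y are passed in as plain positive probabilities. If y is already available as a log value, the `math.log(y)` call would not apply. The function name `normalized_likelihood` is hypothetical.

```python
import math

def normalized_likelihood(x: float, y: float) -> float:
    """Normalize a case likelihood using the equations in the text.

    x: likelihood of the case based on the clustering model
    y: marginal case likelihood (assumed here to be a plain probability)
    """
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be positive")
    z = math.exp(math.log(x) - math.log(y))  # algebraically x / y
    return z / (1 + z)                       # algebraically x / (x + y)
```

When x equals y, z is 1 and the normalized value is 0.5. When the model likelihood is three times the marginal likelihood, the normalized value is approximately 0.75.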
The following example returns the likelihood that the specified case will occur within the clustering model that was created in the Basic Data Mining Tutorial.
SELECT
PredictCaseLikelihood() AS Default_Likelihood,
PredictCaseLikelihood(NORMALIZED) AS Normalized_Likelihood,
PredictCaseLikelihood(NONNORMALIZED) AS Raw_Likelihood
FROM
[TM Clustering]
NATURAL PREDICTION JOIN
(SELECT 28 AS [Age],
'2-5 Miles' AS [Commute Distance],
'Graduate Degree' AS [Education],
0 AS [Number Cars Owned],
0 AS [Number Children At Home]) AS t
Expected results:
| Default_Likelihood | Normalized_Likelihood | Raw_Likelihood |
|---|---|---|
| 6.30672792729321E-08 | 6.30672792729321E-08 | 9.5824454056846E-48 |
The difference between the normalized and raw results demonstrates the effect of normalization.
Change History
| Updated content |
|---|
| Fixed sample to accurately show differences between normalized and nonnormalized (raw) probabilities. |
Reference
Data Mining Extensions (DMX) Function Reference
Functions (DMX)
Mapping Functions to Query Types (DMX)
Other Resources
Data Mining Algorithms (Analysis Services - Data Mining)