Mining Model Content for Logistic Regression Models (Analysis Services - Data Mining)
Applies To: SQL Server 2016
This topic describes mining model content that is specific to models that use the Microsoft Logistic Regression algorithm. For an explanation of how to interpret statistics and structure shared by all model types, and general definitions of terms related to mining model content, see Mining Model Content (Analysis Services - Data Mining).
A logistic regression model is created by using the Microsoft Neural Network algorithm with parameters that constrain the model to eliminate the hidden node. Therefore, the overall structure of a logistic regression model is almost identical to that of a neural network: each model has a single parent node that represents the model and its metadata, and a special marginal statistics node (NODE_TYPE = 24) that provides descriptive statistics about the inputs used in the model.
Additionally, the model contains a subnetwork (NODE_TYPE = 17) for each predictable attribute. As in a neural network model, each subnetwork always contains two branches: one for the input layer, and another branch that contains the hidden layer (NODE_TYPE = 19) and the output layer (NODE_TYPE = 20) for the network. The same subnetwork may be used for multiple attributes if they are specified as predict-only. Predictable attributes that are also inputs cannot appear in the same subnetwork.
However, in a logistic regression model, the node that represents the hidden layer is empty, and has no children. Therefore the model contains nodes that represent individual outputs (NODE_TYPE = 23) and individual inputs (NODE_TYPE = 21) but no individual hidden nodes.
By default, a logistic regression model is displayed in the Microsoft Neural Network Viewer. With this custom viewer, you can filter on input attributes and their values, and graphically see how they affect the outputs. The tooltips in the viewer show you the probability and lift associated with each pair of inputs and output values. For more information, see Browse a Model Using the Microsoft Neural Network Viewer.
To explore the structure of the inputs and subnetworks, and to see detailed statistics, you can use the Microsoft Generic Content Tree Viewer. You can click any node to expand it and see its child nodes, or view the weights and other statistics contained in the node.
This section provides detail and examples only for those columns in the mining model content that have particular relevance for logistic regression. The model content is almost identical to that of a neural network model, but descriptions that apply to neural network models may be repeated in this table for convenience.
For information about general-purpose columns in the schema rowset, such as MODEL_CATALOG and MODEL_NAME, that are not described here, or for explanations of mining model terminology, see Mining Model Content (Analysis Services - Data Mining).
MODEL_CATALOG
Name of the database where the model is stored.
MODEL_NAME
Name of the model.
ATTRIBUTE_NAME
The name of the attribute that corresponds to this node.
Node  Content 

Model root  Blank 
Marginal statistics  Blank 
Input layer  Blank 
Input node  Input attribute name 
Hidden layer  Blank 
Output layer  Blank 
Output node  Output attribute name 
NODE_NAME
The name of the node. Currently, this column contains the same value as NODE_UNIQUE_NAME, though this may change in future releases.
NODE_UNIQUE_NAME
The unique name of the node.
For more information about how the names and IDs provide structural information about the model, see the section, Using Node Names and IDs.
NODE_TYPE
A logistic regression model outputs the following node types:
Node Type ID  Description 

1  Model. 
17  Organizer node for the subnetwork. 
18  Organizer node for the input layer. 
19  Organizer node for the hidden layer. The hidden layer is empty. 
20  Organizer node for the output layer. 
21  Input attribute node. 
23  Output attribute node. 
24  Marginal statistics node. 
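When rows from the model content are post-processed outside Analysis Services, it can help to give the numeric node types readable names. The following is a minimal Python sketch of the table above; the names `LogisticNodeType` and `describe_node_type` are illustrative and are not part of any Analysis Services API.

```python
from enum import IntEnum


class LogisticNodeType(IntEnum):
    """NODE_TYPE values listed for logistic regression models."""
    MODEL = 1
    SUBNETWORK = 17
    INPUT_LAYER = 18
    HIDDEN_LAYER = 19  # present as an organizer node, but always empty
    OUTPUT_LAYER = 20
    INPUT_NODE = 21
    OUTPUT_NODE = 23
    MARGINAL_STATISTICS = 24


def describe_node_type(node_type: int) -> str:
    """Return a readable label, or flag a value that the table does not list."""
    try:
        return LogisticNodeType(node_type).name
    except ValueError:
        return f"UNEXPECTED({node_type})"
```

A value that is flagged as unexpected suggests that the row did not come from a logistic regression model.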
NODE_CAPTION
A label or a caption associated with the node. In logistic regression models, always blank.
CHILDREN_CARDINALITY
An estimate of the number of children that the node has.
Node  Content 

Model root  Indicates the count of child nodes, which includes at least 1 subnetwork, 1 required marginal statistics node, and 1 required input layer. For example, if the value is 5, there are 3 subnetworks (5 minus the marginal statistics node and the input layer). 
Marginal statistics  Always 0. 
Input layer  Indicates the number of input attribute-value pairs that were used by the model. 
Input node  Always 0. 
Hidden layer  In a logistic regression model, always 0. 
Output layer  Indicates the number of output values. 
Output node  Always 0. 
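The arithmetic behind the model root row can be sketched in a few lines of Python. The function name `subnetwork_count` is hypothetical; the only rule it encodes is the one stated above, that the root always has one marginal statistics node and one input layer in addition to its subnetworks.

```python
def subnetwork_count(root_children_cardinality: int) -> int:
    """Derive the number of subnetworks from the root's CHILDREN_CARDINALITY."""
    # The root always has one marginal statistics node and one input layer;
    # every remaining child is a subnetwork.
    fixed_children = 2
    count = root_children_cardinality - fixed_children
    if count < 1:
        raise ValueError("a model root is expected to have at least 3 children")
    return count


print(subnetwork_count(5))  # 3
```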
PARENT_UNIQUE_NAME
The unique name of the node's parent. NULL is returned for any nodes at the root level.
For more information about how the names and IDs provide structural information about the model, see the section, Using Node Names and IDs.
NODE_DESCRIPTION
A user-friendly description of the node.
Node  Content 

Model root  Blank 
Marginal statistics  Blank 
Input layer  Blank 
Input node  Input attribute name 
Hidden layer  Blank 
Output layer  Blank 
Output node  If the output attribute is continuous, contains the name of the output attribute. If the output attribute is discrete or discretized, contains the name of the attribute and the value. 
NODE_RULE
An XML description of the rule that is embedded in the node.
Node  Content 

Model root  Blank 
Marginal statistics  Blank 
Input layer  Blank 
Input node  An XML fragment containing the same information as the NODE_DESCRIPTION column. 
Hidden layer  Blank 
Output layer  Blank 
Output node  An XML fragment containing the same information as the NODE_DESCRIPTION column. 
MARGINAL_RULE
For logistic regression models, always blank.
NODE_PROBABILITY
The probability associated with this node. For logistic regression models, always 0.
MARGINAL_PROBABILITY
The probability of reaching the node from the parent node. For logistic regression models, always 0.
NODE_DISTRIBUTION
A nested table that contains statistical information for the node. For detailed information about the contents of this table for each node type, see the section, Understanding the NODE_DISTRIBUTION Table, in Mining Model Content for Neural Network Models (Analysis Services - Data Mining).
NODE_SUPPORT
For logistic regression models, always 0.
To get information about the support in the training cases for specific values, see the marginal statistics node.
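One way to reach the marginal statistics node is a DMX content query filtered on NODE_TYPE = 24. The Python sketch below only builds the query text; it does not connect to a server. The helper name `marginal_statistics_query` and the model name `Call Center LR` are made up for illustration.

```python
MARGINAL_STATISTICS_NODE_TYPE = 24


def marginal_statistics_query(model_name: str) -> str:
    """Build a DMX content query that returns the marginal statistics node."""
    # Keep the sketch simple: refuse names that would need bracket escaping.
    if "[" in model_name or "]" in model_name:
        raise ValueError("model name must not contain square brackets")
    return (
        "SELECT FLATTENED NODE_DISTRIBUTION "
        f"FROM [{model_name}].CONTENT "
        f"WHERE NODE_TYPE = {MARGINAL_STATISTICS_NODE_TYPE}"
    )


print(marginal_statistics_query("Call Center LR"))
```

The FLATTENED keyword returns the nested NODE_DISTRIBUTION table as ordinary rows, which is usually easier to read in a client that does not support nested rowsets.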
MSOLAP_MODEL_COLUMN
Node  Content 

Model root  Blank 
Marginal statistics  Blank 
Input layer  Blank 
Input node  Input attribute name. 
Hidden layer  Blank 
Output layer  Blank 
Output node  Output attribute name. 
MSOLAP_NODE_SCORE
In logistic regression models, always 0.
MSOLAP_NODE_SHORT_CAPTION
In logistic regression models, always blank.
Using Node Names and IDs
The naming of the nodes in a logistic regression model provides additional information about the relationships between nodes in the model. The following table shows the conventions for the IDs that are assigned to nodes in each layer.
Node Type  Convention for node ID 

Model root (1)  00000000000000000 
Marginal statistics node (24)  10000000000000000 
Input layer (18)  30000000000000000 
Input node (21)  Starts at 60000000000000000 
Subnetwork (17)  20000000000000000 
Hidden layer (19)  40000000000000000 
Output layer (20)  50000000000000000 
Output node (23)  Starts at 80000000000000000 
You can use these IDs to determine how output attributes are related to specific input layer attributes, by viewing the NODE_DISTRIBUTION table of the output node. Each row in that table contains an ID that points back to a specific input attribute node. The NODE_DISTRIBUTION table also contains the coefficient for that input-output pair.
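The ID conventions lend themselves to a small lookup. The Python sketch below classifies a node by the leading digit of its 17-digit ID and then resolves the input node IDs found in an output node's NODE_DISTRIBUTION rows to attribute names. It rests on two assumptions that are not stated in this topic: that input and output node IDs keep their leading digit as they count up from the starting value, and that each distribution row can be reduced to an (input node ID, coefficient) pair. All function names and sample values are hypothetical.

```python
# Leading digit of the 17-digit node ID -> layer, per the table above.
LAYER_BY_LEADING_DIGIT = {
    "0": "model root",
    "1": "marginal statistics",
    "2": "subnetwork",
    "3": "input layer",
    "4": "hidden layer",
    "5": "output layer",
    "6": "input node",
    "8": "output node",
}


def layer_of(node_id: str) -> str:
    """Classify a node by its ID (assumes the leading digit identifies the layer)."""
    if len(node_id) != 17 or not node_id.isdigit():
        raise ValueError(f"expected a 17-digit node ID, got {node_id!r}")
    return LAYER_BY_LEADING_DIGIT.get(node_id[0], "unknown")


def coefficients_by_input(distribution_rows, input_names):
    """Map input attribute names to coefficients.

    distribution_rows: (input_node_id, coefficient) pairs taken from an output
        node's NODE_DISTRIBUTION table (simplified, hypothetical shape).
    input_names: input node ID -> ATTRIBUTE_NAME of that input node.
    """
    result = {}
    for node_id, coefficient in distribution_rows:
        if layer_of(node_id) != "input node":
            continue  # skip rows that do not point at an input node
        result[input_names[node_id]] = coefficient
    return result


# Made-up sample values, for illustration only.
rows = [("60000000000000000", 0.25), ("60000000000000001", -1.5)]
names = {"60000000000000000": "Age", "60000000000000001": "Income"}
print(coefficients_by_input(rows, names))
```

Check the actual column layout of NODE_DISTRIBUTION in the neural network content topic before relying on the simplified row shape used here.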
Microsoft Logistic Regression Algorithm
Mining Model Content for Neural Network Models (Analysis Services - Data Mining)
Logistic Regression Model Query Examples
Microsoft Logistic Regression Algorithm Technical Reference