Building a New Analysis Model

You can build multiple analysis models for each site. The model build process is resource intensive, so build an analysis model only when the impact on your computer will be minimal.

Before you build an analysis model, you set up a model configuration using the Predictor Model Configuration Setup Wizard. After you have set up the model configuration, you can build the analysis model using the Predictor resource in Commerce Server Manager, and then use the Model Builder DTS task to schedule automatic rebuilding of the analysis model. For information about the Model Builder DTS task, see Running the Model Builder DTS Task.

Important

  • If you are using an analysis model for real-time online predictions, it is highly recommended that you copy your analysis model onto the Web server. This way you can secure your Data Warehouse behind a firewall and still make use of the prediction model. For more information, see Deploying Predictor and Securing a Predictor Deployment.

The time required to build an analysis model depends on the size of the input data (total number of cases), the sample size, the number of properties in the data, and the hardware and topology of your servers.

Important

  • You can build only one analysis model at a time.

To build a new analysis model

  1. Expand Commerce Server Manager, expand Global Resources, expand Predictor on <server name>, expand Predictor Service, and then click Model Configurations.
  2. In the details screen, right-click the model configuration you want to use to build an analysis model, and then click Build.
  3. In the Model Build Properties dialog box, do the following:
    Use this | To do this
    Name | Type a name for the analysis model.
    Model type | Select either prediction or segment for this model from the drop-down list.
  4. Click Next.
  5. In the second screen of the Model Build Properties dialog box, do the following:
    Use this | To do this
    Sample size | Type the number of cases to use to build the analysis model. You can change the default value if necessary.

    It is recommended that you use no fewer than 10,000 cases.

    For example, if you have a large table with 200,000 cases, you may want to specify that the Predictor resource use only 20,000 cases to build the analysis model.

    The default is -1 for fewer than 20,000 cases, and 20,000 for more than 20,000 cases.

    Measured accuracy sample fraction | Type the fraction of the sample data you want to use to automatically score the accuracy of the model, as a number between 0.0 and 1.0.

    For example, if you type 0.0 as the value of the Measured accuracy sample fraction option, the model will not be scored.

    If you type 0.4, 40 percent of the sample data will be used to score the model.

    (The remaining 60 percent will be used to build the model.)

    Measured accuracy maximum predictions | Type the maximum number of recommendations to be presented on your site (used to compute the Recommendation Score).

    The default is 10 properties.

    Input property fraction | Type the fraction of properties to be used as input to the predictions, as a number between 0.0 and 1.0.

    For example, specifying an input property fraction of 0.05 selects the most significant 5 percent of input properties.

    The default value is 1.0, which includes all properties as inputs for prediction.

    Values less than 1.0 are recommended if the number of properties is very large, such as product recommendations for a catalog with over 1,000 products.

    Output property fraction | Type the fraction of properties to be predicted, as a number between 0.0 and 1.0.

    For example, specifying an output property fraction of 0.05 results in decision trees being built for the most significant 5 percent of properties.

    (If the output properties are products, this will return trees for the 5 percent most popular products.)

    The default value is 1.0, which produces trees for all properties.

    Values less than 1.0 are recommended if the number of properties is very large, such as product recommendations for a catalog with over 1,000 products.

    Number of Segments | Type the maximum number of segments into which to partition the users. This value provides a rough estimate for the algorithm, which may find fewer significant segments than this value.

    This value is available only if you are building a Segment model.

    Buffer size | Type the size of the buffer that will be used to read cases during segmentation.

    The default is 1 megabyte.

    The buffer size can affect the build time and quality of the model.

    For example, if the model contains many properties, you should set a large buffer size.

    The system resources of the computer running the Predictor resource determine buffer size limitations.

    This value is available only if you are building a Segment model.

  6. Click Finish.

The status of the build process appears in the details screen.

During the build process, the status is Building. When the build process is finished, the status is Idle. If the build process is unsuccessful, a message that describes the problem appears in the details screen and is written to the Commerce Server 2002 Application Log. You can use Event Viewer to view the Application Log.
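
Before you start a build, it can help to work out what the options in step 5 imply for your data. The following Python sketch is only an illustration of that arithmetic and is not part of Commerce Server: the names BuildPlan, default_sample_size, and plan_build are hypothetical. It assumes that a sample size of -1 means "use every case", that every fraction lies between 0.0 and 1.0 (consistent with the stated default of 1.0), and that fractional counts are rounded to the nearest whole number. The Predictor resource may resolve these details differently.

```python
from dataclasses import dataclass

DEFAULT_SAMPLE_CAP = 20_000  # default sample size the text gives for large tables
USE_ALL_CASES = -1           # default for small tables; read here as "use every case" (assumption)


@dataclass(frozen=True)
class BuildPlan:
    """Hypothetical summary of what the step 5 options imply."""
    sample_size: int        # cases drawn from the input table
    scoring_cases: int      # cases held out to score the model
    training_cases: int     # cases left to build the model
    input_properties: int   # properties kept as prediction inputs
    output_properties: int  # properties that get a decision tree


def default_sample_size(total_cases: int) -> int:
    """Default from the text: -1 below 20,000 cases, otherwise 20,000."""
    return USE_ALL_CASES if total_cases < DEFAULT_SAMPLE_CAP else DEFAULT_SAMPLE_CAP


def _check_fraction(name: str, value: float) -> None:
    """Reject fractions outside the 0.0 to 1.0 range."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"{name} must be between 0.0 and 1.0, got {value}")


def plan_build(total_cases, property_count, sample_size=None,
               accuracy_fraction=0.0, input_fraction=1.0, output_fraction=1.0):
    """Estimate case and property counts for one build (illustrative only)."""
    _check_fraction("accuracy_fraction", accuracy_fraction)
    _check_fraction("input_fraction", input_fraction)
    _check_fraction("output_fraction", output_fraction)

    if sample_size is None:
        sample_size = default_sample_size(total_cases)

    # Assumption: -1 samples the whole table; otherwise cap at the table size.
    if sample_size == USE_ALL_CASES:
        effective = total_cases
    else:
        effective = min(sample_size, total_cases)

    # The accuracy fraction is held out for scoring; the rest builds the model.
    scoring = round(effective * accuracy_fraction)
    return BuildPlan(
        sample_size=effective,
        scoring_cases=scoring,
        training_cases=effective - scoring,
        input_properties=round(property_count * input_fraction),
        output_properties=round(property_count * output_fraction),
    )


if __name__ == "__main__":
    # The 200,000-case table from the text, with a 0.4 scoring fraction
    # and 0.05 property fractions over a 1,000-product catalog.
    print(plan_build(200_000, 1_000, accuracy_fraction=0.4,
                     input_fraction=0.05, output_fraction=0.05))
```

Under these assumptions, the 200,000-case example samples 20,000 cases, holds out 8,000 of them for scoring, builds the model from the remaining 12,000, and keeps 50 input and 50 output properties out of 1,000.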

See Also

Business Analysis Objects

Deploying Predictor

Securing a Predictor Deployment

Viewing Analysis Model Configuration Tables

Best Practices for the Predictor Resource

Saving and Importing Models

Copyright © 2005 Microsoft Corporation.
All rights reserved.