Data Mining Client for Excel (SQL Server Data Mining Add-ins)
SQL Server 2014
The Data Mining Client for Excel is a set of tools that let you perform common data mining tasks, from data cleansing to model building and prediction queries. You can use data in Excel tables or ranges, or access external data sources.
Load your data into Excel, cleanse the data, check for outliers, and create statistical summaries. You can also perform different kinds of sampling, profile the data, and test models using external data. The Data Mining Client is the easiest way to prepare data for analysis without complex scripts or ETL processes.
These tools provide wizard interfaces to well-known, empirically tested data mining algorithms, including clustering (K-means and EM), association analysis, time series analysis, and decision trees. Advanced modeling options for each wizard let you choose different algorithms, such as the Naïve Bayes or neural networks, and customize behavior such as the cluster seed or initial sampling size.
All data mining algorithms are hosted in an instance of Analysis Services, giving you more power to build complex models.
The Data Mining Client provides industry-standard tools for testing models, including lift charts and cross-validation. The wizards provided make it easy to test the validity of the data set and its accuracy. The query wizard builds queries to use the models for prediction and scoring.
The Data Mining Client for Excel maintains an active connection to the server, so you can save your data mining model to the server, to use in further testing, or to deploy to a production server for greater scalability.
The Data Preparation group contains the following wizards that help you review and clean data in preparation for data mining tasks. Most wizards also let you separate data into training and testing sets.
Creating the right kind of data for training and testing your models is an important part of data mining, but one that can be tedious without the right tools. The Sample Data wizard makes it easy to divide the data used for a model into two groups, one for building the model and one for testing it. You can use random sampling or oversampling.
The Remove Outliers wizard gives you several tools to identify and appropriately handle outliers. It shows you the distribution of values and the relationship of the outliers to other data, and lets you decided whether to remove or change outliers.
The Relabel wizard helps you create new labels for data to make it easier to understand the results of analysis. For example, you can rename a range of data with a more descriptive name, or you can choose a representative value from the list.
The options on the Data Modeling section of the toolbar let you derive patterns from data; group rows of data based on attributes, or explore associations. The wizards in this tool ribbon are based on the powerful data mining algorithms available in Analysis Services. Unlike the similar tools in the Table Analysis Tools for Excel, these wizards let you customize the behavior of the algorithm and use a variety of data sources.
The Classify wizard helps you build a classification model based on existing data in an Excel table, an Excel range, or an external data source. A classification model extracts patterns in your data that indicate similarities and helps you make predictions based on groupings of values. For example, a classification model might be used to predict risk based on income or spending patterns.
The Classify wizard supports use of these Microsoft data mining algorithms: Decision Trees algorithm, Logistic Regression, Naïve Bayes, neural Networks.
The Estimate wizard helps you create an estimation model. An estimation model extracts patterns from data and uses the patterns to predict a numeric outcome, such as a currency, sales amount, date, or time.
The Estimate wizard uses these Microsoft data mining algorithms: Decision Trees, Linear Regression, Logistic Regression, and Neural Networks.
The Associate wizard helps you create a data mining model using the Microsoft Association Rules algorithm, which detects frequently co-occurring items or events. Such association models are particularly useful for making recommendations.
The Associate wizard uses the Microsoft Association Rules algorithm.
The Forecast wizard helps you predict values in a time series. Typically data that you use in a forecasting contains some kind of time series, either a date stamp or some sequence ID, and you use it to derive patterns for use in predicting future values.
The Forecast wizard uses the Microsoft Time Series algorithm.
Models you create are automatically opened for browsing. However, you can also browse models on the server and generate new visualizations. Use the Visio shapes to export model diagrams to a customizable canvas.
Click Trace to view an ongoing capture of the interaction between the Excel client and the server. All activity is stored as DMX or XMLA statements, so that you can troubleshoot your data mining session or save the information for later use.
To use Excel as a client for data mining, you must establish a connection to an instance of Analysis Services. The connection provides you with access to the Analysis Services engine. If you have permissions, the connection also lets you store any patterns you have discovered, and modify existing data mining objects.
The Connections toolbar provides wizards for managing connections to an instance of Analysis Services. You must define a connection to an instance of Analysis Services to use the data mining tools and algorithms. You can create the connection when you install the add-in, or you can add a connection later.
Click the Getting Started button to start a configuration wizard that walks you through the process of creating a connection to an instance of Analysis Services, and obtaining the permissions needed to do data mining.
The Help dropdown menu provides links to online help, Web sites, and a configuration wizard to help you complete setup and start data mining.
The Help page also links to resources on line, including the Help for the add-in, and additional videos, demos, and samples.