Introducing SQL Server R Services
SQL Server R Services provides a platform for developing intelligent applications that uncover new insights. You can use the rich and powerful R language and the many open source packages to create models and generate predictions using your SQL Server data.
SQL Server R Services is an installation of R that runs alongside SQL Server (operating as a SQL Server service) and communicates securely with SQL Server. During the installation process, Microsoft R Open and ScaleR libraries are installed onto SQL Server so that you can integrate your server data wth SQL Server and Microsoft’s BI tools. Because SQL Server R Services integrates the R language with SQL Server, you can keep analytics close to the data and eliminate the costs and security risks associated with data movement.
SQL Server R Services combines R with a comprehensive set of SQL Server tools and technologies that offer superior performance, security, reliability and manageability. You can deploy R solutions using convenient, familiar tools, and your production applications can call the R runtime and retrieve predictions and visuals using Transact-SQL. With Enterprise Edition, you also get the ScaleR libraries to overcome R’s inherent performance and scale limitations.
The RevoScaleR and MicrosoftML packages are included when you install SQL Server R Services on an instance of SQL Server 2016. RevoScaleR and MicrosoftML are also included in all installations of Microsoft R Server.
How to Work with R in SQL Server
Getting and Moving SQL Server Data
To work with SQL Server data in R, create a SQL Server data source in your R code using the
RxSqlServerData function, which defines the connection string and the data you want to use.
The data source is saved as a variable that points to the data. The data is not actually loaded or moved until you use a function that uses the data, such as
rxDataStep, which can work with multiple data sources, or
The ability to save the data source object in a variable makes it very easy to re-use connection information with different functions, create multiple variables for different tables in a single source, and so forth. Most of the RevoScaleR functions, also referred to as rx functions, support moving or analyzing data in chunks.
Additionally, with SQL Server data you can stream data, sending a predefined number of rows in each batch, or, if the function supports it, process the data in parallel. In contrast, when you read data from a database using RODBC, all the data is read into memory, making it difficult to work with large datasets.
Using SQL Server as a Compute Context
A defining feature of the RevoScaleR package is that most of its functions support execution of R code in different compute contexts. To run your R code on the SQL Server computer:
- Define a compute context that points to the database server.
- Change the active compute context to use the SQL Server computer.
- To switch back to local execution, reset the compute context to the default.
If you run R code within T-SQL code, the server is always used as the compute context. In this scenario, your code will call the R libraries installed on the SQL Server instance, and use secure connections to get data from the database. You can also save models in a database table, load models from a table to use for scoring, and save your results ot a databse table, all without leaving the context of SQL Server.
For more information, see SQL Server 2016 Books Online:
Using Development Tools
You can develop your R solutions in your preferred R IDE. We recommend that you use a standalone R IDE when testing your code and exploring your data, rather than trying to write R code inside of T-SQL. For R Services in particular, R Tools for Visual Studio is a good choice, because it has strong support for both R code and integration with SQL Server, but you can use any IDE that supports database connectivity.
To run R code within the context of SQL Server, you have two options:
- Run code from your laptop or any remote workstation, but set the compute context to SQL Server.
- Copy R code into the @script argument of a special SQL stored procedure, sp_execute_external_script. You can provide input data as a parameter of the stored procedure, call the stored procedure from any application that supports SQL calls, and easily export results to any application that can consume SQL data.
Deploying, Managing, and Optimizing Solutions
One of the primary goals of providing R Services in SQL Server is to make it easier to deploy R code in production. Typically you'll deploy your R code to production by wrapping it in a stored procedure. Stored procedures in SQL Server support parameterization, making it easy to pass in a SQL query as input, and save the results to the database.
Because the syntax for calling stored procedures is supported by many applications, you do not need to write any extra code to call your R code from an external application -- just pass in the data and handle the results that are returned.
You can also generate visualizations and archive them locally, export them to other applications such as Reporting Services or Power BI, or send them back to your local workstation for review.
Finally, because R Services is integrated with SQL Server, you can use database server tools for monitoring code execution and managing and balancing resources.
For more information, see SQL Server 2016 Books Online:
- Operationalizing Your R Code
- Resource Governance for R Services
- SQL Server R Services Performance Tuning
Learn more about SQL Server R Services here:
- In the SQL Server Books Online site, which also includes a tutorial for SQL Server R Services.
- Data Science End-to-End Walkthrough: Load, explore, and analyze the New York City taxi dataset. Build models and deploy them to SQL Server for production.
- Using the RevoScaleR Packages: Deep dive on the analytical functions provided by the ScaleR package. Demonstrates how to create and use compute contexts, how to move data between local and server compute contexts using XDF files, and creates a simulation using R code that runs in SQL Server.
- Data Science Scenarios and Solution Templates: Includes all the R and T-SQL code you need for fraud detection, churn analysis, predictive maintenance, and demand forecasting.
- In-Database Advanced Analytics for SQL Developers (Tutorial)
Learn more about the RevoScaleR package and its function here.