sp_execute_external_script (Transact-SQL)

 

Updated: June 24, 2016

THIS TOPIC APPLIES TO: yesSQL Server (starting with 2016)noAzure SQL DatabasenoAzure SQL Data Warehouse noParallel Data Warehouse

Executes the script provided as argument at an external location. The script must be written in a supported and registered language. The query tree is controlled by SQL Server and users cannot perform arbitrary operations on the query. To execute sp_execute_external_script you must first enable external scripts by using the sp_configure 'external scripts enabled', 1; statement.

Topic link icon Transact-SQL Syntax Conventions

sp_execute_external_script   
    @language = N'language' ,   
    @script = N'script',  
  
    @input_data_1 = ] 'input_data_1'   
    [ , @input_data_1_name = ] N'input_data_1_name' ]   
    [ , @output_data_1_name = 'output_data_1_name' ]  
    [ , @parallel = 0 | 1 ]  
    [ , @params = ] N'@parameter_name data_type [ OUT | OUTPUT ] [ ,...n ]'  
    [ , @parameter1 = ] 'value1' [ OUT | OUTPUT ] [ ,...n ]  
    [ WITH <execute_option> ]  
[;]  
  
<execute_option>::=  
{  
      { RESULT SETS UNDEFINED }   
    | { RESULT SETS NONE }   
    | { RESULT SETS ( <result_sets_definition> ) }  
}  
  
<result_sets_definition> ::=   
{  
    (  
         { column_name   
           data_type   
         [ COLLATE collation_name ]   
         [ NULL | NOT NULL ] }  
         [,...n ]  
    )  
    | AS OBJECT   
        [ db_name . [ schema_name ] . | schema_name . ]   
        {table_name | view_name | table_valued_function_name }  
    | AS TYPE [ schema_name.]table_type_name  
}  

[ @language =
] 'language'
Can be any supported language that has been registered by using xxx. language is sysname.

Valid values are 'R'.

[ @script = ]
'script'
External language script specified as a literal or variable input. script is nvarchar(max).

[ @input_data_1_name = ] 'input_data_1_name'
Specifies the name 1 query. The data type of the variable in the external script depends on the language. In case of R, this variable will be a data frame. input_data_1_name is sysname.

Default value is "InputDataSet".

[ @input_data_1 = ] 'input_data_1'
Specifies the input data for the external script in the form of a Transact-SQL query. input_data_1 is nvarchar(max).

[ @output_data_1_name = ] 'output_data_1_name'
Specifies the name of the variable in the external script that will contain the data to be returned to SQL Server upon completion of the stored procedure call. The data type of the variable in the external script depends on the language. In case of R, this variable will be a data frame. output_data_1_name is sysname.

Default value is "OutputDataSet".

[ @parallel = ] 0 | 1
Enable parallel execution of R scripts by setting the @parallel parameter to 1. The default for this parameter is 0 (no parallelism).

For R scripts that do not use RevoScaleR functions, using @parallel parameter can be beneficial for processing large datasets, if the script can be trivially parallelized. For example, when the R predict function with a model to generate new predictions, set @parallel = 1 as a hint to the query engine. If the query can be parallelized, rows will be distributed according to the MAXDOP setting.

If @parallel = 1 and the output is being streamed directly to the client machine, then the WITH RESULTS SETS clause is required and output schema must be specified.

For R scripts that use RevoScaleR functions, parallel processing is handled automatically and you should not specify @parallel = 1 to the sp_execute_external_script call.

[ @params = ] N'@parameter_name data_type [ OUT | OUTPUT ] [ ,...n ]'
A list of input parameter declarations that will be used in the external script.

[ @parameter1 = ] 'value1' [ OUT | OUTPUT ] [ ,...n ]
A list of values for the input parameters used by the external script.

Use sp_execute_external_script to execute scripts written in a supported language such as R. In SQL Server, R Services (In-database) is comprised of a server component installed with SQL Server, and a set of workstation tools and connectivity libraries that connect the data scientist to the high-performance environment of SQL Server. Install R Services (In-Database) during SQL Server setup to enable the execution of R scripts. For more information, see Set up SQL Server R Services (In-Database).

You can control the resources used by an external script by configuring an external resource pool. For more information, see CREATE EXTERNAL RESOURCE POOL (Transact-SQL). Information about the workload can be obtained from the resource governor catalog views, DMV's, and counters. For more information, see Resource Governor Catalog Views (Transact-SQL), Resource Governor Related Dynamic Management Views (Transact-SQL), and SQL Server, External Scripts Object.

Monitor script execution using sys.dm_external_script_requests and sys.dm_external_script_execution_stats.

In addition to returning a result set, you can return scalar values from R script to SQL Server using OUTPUT parameters. The following example shows the use of OUTPUT parameter:

DECLARE @model varbinary(max);  
EXEC sp_execute_external_script  
        @language = N'R'  
        , @script = N'  
    # build classification model to predict tipped or not  
    logitObj <- glm(tipped ~ passenger_count + trip_distance + trip_time_in_secs + direct_distance, data = featureDataSource, family = binomial(link=logit));  
  
    # First, serialize a model and put it into a database table  
    modelbin <- serialize(logitObj, NULL);  
    '  
        , @input_data_1 = N'  
SELECT tipped, passenger_count, trip_time_in_secs, trip_distance, d.direct_distance  
  FROM dbo.nyctaxi_sample TABLESAMPLE (1 PERCENT) REPEATABLE (98074)  
  CROSS APPLY [CalculateDistance](pickup_latitude, pickup_longitude,  dropoff_latitude, dropoff_longitude) as d  
'  
        , @input_data_1_name = N'featureDataSource'  
        , @params = N'@modelbin varbinary(max) OUTPUT'  
        , @modelbin = @model OUTPUT;  

Streaming execution of R script is supported by specifying @r_rowsPerRead int parameter in @params collection. Streaming allows R scripts to work with data that doesn’t fit in memory. For example, if there are billion rows to score using predict function the new @r_rowsPerRead parameter can be used to split the execution into one stream at a time. Because this parameter controls the number of rows that are sent to the R processes, it can also be used to throttle the number of rows being read at one time. This might be useful to mitigate server performance issues if, for example, a large model is being trained. Note that this parameter can only be used in cases where the output of the R script doesn’t depend on reading or looking at the entire set of rows.

Both the @r_rowsPerRead parameter for streaming and the @parallel argument should be considered hints. For the hint to be applied, it must be possible to generate a SQL query plan that includes parallel processing. If this is not possible, parallel processing cannot be enabled.

System_CAPS_ICON_note.jpg Note


Streaming and parallel processing are supported only in Enterprise Edition. You can include the parameters in your queries in Standard Edition without raising an error, but the parameters will have no effect and R scripts will run in a single process.

Data types: The following data types are not supported when used in the input query or parameters of sp_execute_external_script procedure, and will return an unsupported type error.

As a workaround, CAST the column or value to a supported type in Transact-SQL and send it to R.

  • cursor

  • timestamp

  • datetime2, datetimeoffset, time

  • sql_variant

  • text, image

  • xml

  • hierarchyid, geometry, geography

  • CLR user-defined types

WITH RESULTS SETS clause is mandatory if you are returning a result set from R . The specified column data types need to match the types supported in R (bit, int, float, datetime, varchar)

datetime values in the input will be converted to NA on the R side for values that do not fit the permissible range of values in R. this is required because SQL Server permits a larger range of values than is supported in the R language.

Float values (for example, +Inf, -Inf, NaN) are not supported in SQL Server even though both languages use IEEE 754. Current behavior just sends the values to SQL Server directly and as a result sqlclient in Management Studio throws error. We should convert these values to NULL.

When using the WITH RESULTS SET clause, an error will be raised in the following conditions:

  • The number of columns doesn’t match the number of columns in the R data frame.

  • Any Transact-SQL data type that cannot be mapped to an R data type will transfer as NULL. Any R result set that cannot be mapped to a Transact-SQL data type will transfer as NULL.

Requires EXECUTE ANY EXTERNAL SCRIPT database permission.

This section contains examples of how this stored procedure can be used to execute R scripts using Transact-SQL.

A. Return a data set from R to SQL Server

The following example create a stored procedure that uses sp_execute_external_script to return an iris dataset from R to SQL Server.

DROP PROC IF EXISTS get_iris_dataset;  
go  
CREATE PROC get_iris_dataset  
AS  
BEGIN  
 EXEC   sp_execute_external_script  
       @language = N'R'  
     , @script = N'iris_data <- iris;'  
     , @input_data_1 = N''  
     , @output_data_1_name = N'iris_data'  
     WITH RESULT SETS (("Sepal.Length" float not null,   
           "Sepal.Width" float not null,  
        "Petal.Length" float not null,   
        "Petal.Width" float not null, "Species" varchar(100)));  
END;  
go  

B. Generate a model based on data from SQL Server

The following example create a stored procedure that uses sp_execute_external_script to generate an iris model and return the model to SQL Server.

System_CAPS_ICON_note.jpg Note


This example requires installing the e1071 package. For more information, see Install Additional R Packages on SQL Server.

DROP PROC IF EXISTS generate_iris_model;  
go  
  
CREATE PROC generate_iris_model  
AS  
BEGIN  
 EXEC sp_execute_external_script  
      @language = N'R'  
     , @script = N'  
          library(e1071);  
          irismodel <-naiveBayes(iris_data[,1:4], iris_data[,5]);  
          trained_model <- data.frame(payload = as.raw(serialize(irismodel, connection=NULL)));  
'  
     , @input_data_1 = N'select "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species" from iris_data'  
     , @input_data_1_name = N'iris_data'  
     , @output_data_1_name = N'trained_model'  
    WITH RESULT SETS ((model varbinary(max)));  
END;  
go  

System Stored Procedures (Transact-SQL)
sp_prepare (Transact SQL)
sp_configure (Transact-SQL)
external scripts enabled Server Configuration Option
SERVERPROPERTY (Transact-SQL)
SQL Server R Services
Known Issues for SQL Server R Services
CREATE EXTERNAL RESOURCE POOL (Transact-SQL)
sys.resource_governor_external_resource_pools (Transact-SQL)
sys.dm_resource_governor_external_resource_pool_affinity (Transact-SQL)
SQL Server, External Scripts Object
sys.dm_external_script_requests
sys.dm_external_script_execution_stats

Show: