Table of contents
TOC
Collapse the table of content
Expand the table of content

CREATE FUNCTION (U-SQL): Table-valued Function

Michael Rys|Last Updated: 5/15/2018
|
4 Contributors

The CREATE FUNCTION statement allows to create a table-valued function (often referred to as TVF).

Syntax
Create_TV_Function_Statement :=                                                                          
    'CREATE' 'FUNCTION' ['IF' 'NOT' 'EXISTS'] IdentifierTVF_Signature  
    ['AS'] 'BEGIN'  
    TVF_Statement_List   
    'END'.

Semantics of Syntax Elements

This statement creates the function with the specified identifier and function signature based on the provided statement list.

  • Identifier
    Specifies the name of the table-valued function. If the Identifier is a three-part identifier, the function will be created in the specified database and schema. If it is a two-part identifier, then the function will be created in the specified schema of the current database context. If the identifier is a simple identifier, then the function will be created in the current database and schema context.

    If an object of the given name already exists in the specified database and schema context or the user has no permissions to create a function, an error is raised.

  • IF NOT EXISTS
    If the optional IF NOT EXISTS is specified, then the statement creates the function if it does not already exist, or succeeds without changes if the function already exists and the user has permission to at least enumerate all existing functions.

  • TVF_Signature
    The function signature provides the arguments and their types and optional default values and the return type of the function:

    Syntax
    TVF_Signature :=                                                                                    
        '(' [Parameter_List] ')' TVF_Returns.
    Parameter_List := Parameter {',' Parameter}.
    • Parameter
      A parameter defines the name of the parameter in form of a variable local to the function body. Its type is either a built-in U-SQL type, which optionally can be initialized with a default value, or a named or anonymous table type:

      Syntax
      Parameter :=                                                                                   
          User_Variable ( Type_Name [Default_Parameter_Value]
                        | Table_Type
                        | Anonymous_Table_Type).
      Default_Parameter_Value := '=' function_argument_expression.
      Table_Type := Identifier.
      Anonymous_Table_Type := 'TABLE' '(' Column_Definition_List ')'.
    • TVF_Returns
      The return of a TVF is specified by either a single return rowset or a list of return rowsets. Each returned rowset is specified by a rowset variable name and its return type that is either specified as a reference to a registered U-SQL table type or an anonymous table type.

      Syntax
      TVF_Returns :=
          'RETURNS' (Return_Rowset | '(' Return_Rowset_List ')').
      Return_Rowset := Rowset_Variable Return_Table_Type.
      Return_Table_Type := Anonymous_Table_Type | Table_Type.
      Return_Rowset_List := Return_Rowset {',' Return_Rowset}.
    Tip

    In a future refresh, table-valued functions will disallow a result variable from having the same name as any of the function's parameters.

  • TVF_Statement_List
    The resulting values of a TVF are being calculated inside the table-valued function’s body. The function body consists of a sequence of U-SQL query statements. The last statement for a given return rowset variable will be returned as the resulting rowset. If the statement’s inferred type is not the same as the specified return type, a compilation error will be reported.

    Syntax
    TVF_Statement_List :=                                                                               
        { [TVF_Statement] ';' }.
    TVF_Statement := TVF_Statement_Body [Optimizer_Hint_Clause].
    TVF_Statement_Body := Use_Statement | Declare_Variable_Statement | Import_Package_Statement | Reference_Assembly_Statement | Deploy_Resource_Statement | Query_Statement.

    Please follow the links for more on the general nature of the statements. Note that in order to avoid side-effects that cannot inlined into a query expression, you cannot call INSERT or OUTPUT statements or call procedures that may have side-effects.

    Setting the context with a USE statement, declaring variables or referencing assembly statements inside a function body will only affect the static context of the table-valued function’s body and will not be visible in the calling context or the static context of the definition of an object called within (e.g., another TVF).

    The function body’s own static context is not affected by the calling environment’s static context. E.g., a USE DATABASE statement in the script that is calling the function is not affecting the function’s default static database context and variables defined outside a function body will not be visible.

    Assemblies referenced in the function body will however be visible in the calling environment’s dynamic context and will be visible in any of the called contexts and the function’s own dynamic context. In addition, the function will inherit the loaded assemblies from the calling environment.

    Deploying a resource inside a function will become visible to the whole script at runtime, similar to referencing an assembly and any resources deployed by the calling environment will be visible in the function’s dynamic context.

    The compiler does not allow more than 50 nested function calls to prevent stack overflow. Please make sure that functions are called with less nesting. See Recursive TVF example, below.

Important

In the Public Preview, only a limited grammar check will be performed during creation of a table-valued function. Instead the grammar will be checked when the TVF gets compiled for usage. This behavior is subject to change in future releases.

Examples

Basic Syntax - tvf_SearchLog

CREATE DATABASE IF NOT EXISTS TestReferenceDB; 
USE DATABASE TestReferenceDB; 

DROP FUNCTION IF EXISTS tvf_SearchLog;
CREATE FUNCTION tvf_SearchLog()
RETURNS @searchlog TABLE(UserId int, Start DateTime, Region string, Query string, Duration int?, Urls string, ClickedUrls string)
AS
BEGIN
    @searchlog =
        EXTRACT UserId int,
                Start DateTime,
                Region string,
                Query string,
                Duration int?,
                Urls string,
                ClickedUrls string
        FROM "/Samples/Data/SearchLog.tsv"
        USING Extractors.Tsv();
RETURN;
END;

TVF with a parameter
The function declaration defines the function with one argument that allows to parameterize the contained query statements. The parameter @Region is of type string and initialized with the default en-gb. The function declaration also declares the resulting table type which will be checked against the resulting query expression’s type. The query body is an extraction statement followed by a selection that applies the filter.

DROP FUNCTION IF EXISTS TestReferenceDB.dbo.SearchLogPerRegion;
CREATE FUNCTION TestReferenceDB.dbo.SearchLogPerRegion(@Region string = "en-gb")
RETURNS @res TABLE(UserId int, Start DateTime, Region string, Query string, Duration int, Urls string, ClickedUrls string)
AS
BEGIN
    @sl =
        EXTRACT UserId int,
                Start DateTime,
                Region string,
                Query string,
                Duration int,
                Urls string,
                ClickedUrls string
        FROM "/Samples/Data/SearchLog.tsv"
        USING Extractors.Tsv();

    @res =
        SELECT *
        FROM @sl
        WHERE Region == @Region;  
END;   

TVF that returns multiple-rowsets
Similar to above function except this TVF will 1) return a rowset that contains all values for the defined region, and then 2) return a second rowset that contains value for the remaining regions.

DROP FUNCTION IF EXISTS TestReferenceDB.dbo.SearchLogPerRegionWithRemainder;
CREATE FUNCTION TestReferenceDB.dbo.SearchLogPerRegionWithRemainder(@Region string = "en-gb")
RETURNS (@definedRegion, @remainderRegions)

/* alternative header
RETURNS (
@definedRegion TABLE (UserId int, Start DateTime, Region string, Query string, Duration int, Urls string, ClickedUrls string),
@remainderRegions TABLE (UserId int, Start DateTime, Region string, Query string, Duration int, Urls string, ClickedUrls string)
)
*/

AS
BEGIN
    @searchlog  =
        EXTRACT UserId int,
                Start DateTime,
                Region string,
                Query string,
                Duration int,
                Urls string,
                ClickedUrls string
        FROM "/Samples/Data/SearchLog.tsv"
        USING Extractors.Tsv();

    @definedRegion =
        SELECT *
        FROM @searchlog 
        WHERE Region == @Region;  

    @remainderRegions =
        SELECT *
        FROM @searchlog 
        WHERE Region != @Region;  
END;  

Recursive TVF
The TVF below calls itself and increments a value by one. The recursive calls continue based on the value passed to the parameter @MaxIterations. A value greater than 50 to @MaxIterations will cause the function to error. Consider the options to avoid such an error. Here, you could add code to the call to prevent a value greater than 50 to be passed to @MaxIterations. The function definition could also be modified to alter the behavior when @CurrentIteration exceeds 50 and/or when @MaxIterations is passed a value greater than 50.

USE TestReferenceDB;

// Function definition
USE TestReferenceDB;

DROP FUNCTION IF EXISTS dbo.SimpleRecursion;
CREATE FUNCTION dbo.SimpleRecursion(
        @baseValues TABLE(id1 int, id2 int), 
        @CurrentIteration int = 0, @MaxIterations int = 5)
RETURNS @result TABLE(id1 int, id2 int)
AS
BEGIN
    IF (@CurrentIteration >= @MaxIterations) THEN
        @result = SELECT * FROM @baseValues;
    ELSE
        @baseValues = SELECT * FROM @baseValues UNION SELECT @CurrentIteration + 1 AS id1, @MaxIterations AS id2 FROM @baseValues;
        @result = dbo.SimpleRecursion(@baseValues, @CurrentIteration + 1, @MaxIterations);   
    END; 
END; 

// Function call
@baseValues =
    SELECT * FROM (
        VALUES
            (0, 0) 
            ) AS T(id1, id2);

OUTPUT dbo.SimpleRecursion(@baseValues, 0, 5)
TO "/Output/ReferenceGuide/DDL/Functions/callSimpleRecursion.txt"
USING Outputters.Tsv();  

Additional Examples

See Also

© 2018 Microsoft