Table of contents
TOC
Collapse the table of content
Expand the table of content

CROSS JOIN (U-SQL)

Michael Rys|Last Updated: 5/15/2018
|
4 Contributors

A cross join returns the Cartesian product of rows from the rowsets in the join. In other words, it will combine each row from the first rowset with each row from the second rowset.

Note that this is potentially an expensive and dangerous operation since it can lead to a large data explosion. It is best used in scenarios where a normal join cannot be used and very selective predicates are being used in the WHERE clause to limit the number of produced rows.

Examples

  • The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.
  • The scripts can be executed locally. An Azure subscription and Azure Data Lake Analytics account is not needed when executed locally.

Basic Example
Using the input rowsets

EmpNameDepID
Rafferty31
Jones33
Heisenberg33
Robinson34
Smith34
Williamsnull
DeptIDDepName
31Sales
33Engineering
34Clerical
35Marketing

the following cross join

@employees = SELECT *  
               FROM (VALUES   
                      ("Rafferty", (int?) 31)  
                    , ("Jones", (int?) 33)  
                    , ("Heisenberg", (int?) 33)  
                    , ("Robinson", (int?) 34)  
                    , ("Smith", (int?) 34)  
                    , ("Williams", (int?) null)) AS E(EmpName, DepID);  

@departments = SELECT *  
                FROM (VALUES  
                       ((int) 31, "Sales")  
                     , ((int) 33, "Engineering")  
                     , ((int) 34, "Clerical")  
                     , ((int) 35, "Marketing")) AS D(DepID, DepName);  

@rs_cross =   
    SELECT e.EmpName, d.DepName  
    FROM @employees AS e   
         CROSS JOIN @departments AS d  
    WHERE d.DepName == "Engineering";  

OUTPUT @rs_cross   
TO "/Output/ReferenceGuide/Joins/CrossJoins/ExampleA.csv"
USING Outputters.Csv();  

produces this rowset

EmpNameDepName
RaffertyEngineering
JonesEngineering
HeisenbergEngineering
RobinsonEngineering
SmithEngineering
WilliamsEngineering

Note that without the filter on the DepName, the resulting rowset would have produced 6x4=24 rows.

Additional Examples

@right = 
    SELECT * FROM 
        ( VALUES
        (0, 99, "Ford Motor Company", "Ford"),
        (100, 199, "Ford Motor Company", "Lincoln"),
        (200, 299, "Ford Motor Company", "Motorcraft"),
        (300, 399, "General Motors Company", "Buick"),
        (400, 499, "General Motors Company", "Chevrolet"),
        (500, 599, "General Motors Company", "Cadillac"),
        (600, 699, "General Motors Company", "GMC"),
        (700, 799, "Fiat Chrysler", "Chrysler"),
        (800, 899, "Fiat Chrysler", "Dodge"),
        (900, 999, "Fiat Chrysler", "Jeep"),
        (1000, 1099, "Fiat Chrysler", "Ram"),
        (1100, 1199, "Fiat Chrysler", "Mopar"),
        (1200, 1299, "Fiat Chrysler", "SRT")
        ) AS T(carIDstart, carIDend, Automaker, Division);

@left = 
    SELECT * FROM 
        ( VALUES
        (3, "Mustang", "1964"),
        (7, "Fiesta", "1976"),
        (133, "Navigator", "1998"),
        (160, "Continental", "2017"),
        (639, "Canyon", "2004"),
        (801, "Challenger", "2008"),
        (802, "Charger", "2006")
        ) AS T(carID, currentModel, introYear);

@result = 
    SELECT  l.currentModel,
            r.Division,
            l.introYear
    FROM    @left AS l
    CROSS JOIN    @right AS r
    WHERE   l.carID BETWEEN r.carIDstart AND r.carIDend;

OUTPUT @result
TO "/Output/ReferenceGuide/Joins/CrossJoins/ExampleB.csv"
USING Outputters.Csv(outputHeader: true);

See Also

© 2018 Microsoft