Extractors.Text()

 

Updated: March 28, 2017

The Text() extractor reads delimited text files, that is, files organized as rows and columns. Its parameters specify the row and column delimiters and control other aspects of parsing. The extractor produces a rowset whose columns match the schema declared in the EXTRACT expression.

See Extractor Parameters (U-SQL) for the supported parameters and their default values.

Examples


// Read a tab-delimited file and skip its first row.
@searchlog =
    EXTRACT UserId          int,
            Start           DateTime,
            Region          string,
            Query           string,
            Duration        int?,
            Urls            string,
            ClickedUrls     string
    FROM "/Samples/Data/SearchLog.tsv"
    USING Extractors.Text(delimiter: '\t', skipFirstNRows: 1);

OUTPUT @searchlog 
TO "/Output/ReferenceGuide/BuiltIn/UDOs/extractorText_SearchLog.csv" 
USING Outputters.Csv();

// Read a tab-delimited file and decode it as UTF-16 (Encoding.Unicode).
@Drivers =
    EXTRACT driver_id       int,
            name            string,
            street          string,
            city            string,
            region          string,
            zipcode         string,
            country         string,
            phone_numbers   string // Map
    FROM "/Samples/Data/AmbulanceData/Drivers.txt"
    USING Extractors.Text(delimiter: '\t', encoding: Encoding.Unicode);

OUTPUT @Drivers 
TO "/Output/ReferenceGuide/BuiltIn/UDOs/extractorText_Drivers.csv" 
USING Outputters.Csv();

// ASCII is a reserved U-SQL keyword, so quote it as [ASCII] when naming the encoding.
// No delimiter is specified here, so the extractor uses its default delimiter.
@Trips =
    EXTRACT date        DateTime,
            driver_id   int,
            vehicle_id  int,
            trips       string // Array
    FROM "/Samples/Data/AmbulanceData/DriverShiftTrips.csv"
    USING Extractors.Text(encoding: Encoding.[ASCII]);

OUTPUT @Trips 
TO "/Output/ReferenceGuide/BuiltIn/UDOs/extractorText_DriverShiftTrips.csv" 
USING Outputters.Csv();
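
Several parameters can be combined in one call. The following is a minimal sketch rather than one of the shipped samples: the input and output paths, the rowset name, and the column names are hypothetical, and it assumes a pipe-delimited UTF-8 file with two leading rows to skip and no quoted fields. The parameter names used (delimiter, encoding, skipFirstNRows, quoting) are the ones described in Extractor Parameters (U-SQL); consult that page for their exact behavior and defaults.

```
// Hypothetical input: pipe-delimited, UTF-8, two leading rows to skip, no quoted fields.
@orders =
    EXTRACT OrderId     int,
            Customer    string,
            Amount      decimal?,
            Placed      DateTime
    FROM "/Samples/Data/Orders.psv" // hypothetical path
    USING Extractors.Text(delimiter: '|',
                          encoding: Encoding.UTF8,
                          skipFirstNRows: 2,
                          quoting: false);

OUTPUT @orders
TO "/Output/ReferenceGuide/BuiltIn/UDOs/extractorText_Orders.csv" // hypothetical path
USING Outputters.Csv();
```

Declaring Amount as a nullable type (decimal?) follows the same pattern as Duration in the first example, where the column is declared as int?.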
