Extractors.Text()
Updated: March 28, 2017
The Text() extractor reads text files that follow a row/column format. Its delimiter parameters identify the row and column boundaries, and its other parameters control how the file is parsed. The extractor produces a rowset that matches the schema of the EXTRACT expression.
See Extractor Parameters (U-SQL) for the supported parameters and their default values.
Examples
- The examples can be executed in Visual Studio with the Azure Data Lake Tools plug-in.
- The examples below use the sample data provided with your Data Lake Analytics account. See Prepare source data for additional information.
// Read a tab-delimited file, skipping its first row.
@searchlog =
    EXTRACT UserId int,
            Start DateTime,
            Region string,
            Query string,
            Duration int?,
            Urls string,
            ClickedUrls string
    FROM "/Samples/Data/SearchLog.tsv"
    USING Extractors.Text(delimiter: '\t', skipFirstNRows: 1);

OUTPUT @searchlog
TO "/Output/ReferenceGuide/BuiltIn/UDOs/extractorText_SearchLog.csv"
USING Outputters.Csv();

// Read a tab-delimited file that is encoded as UTF-16 (Encoding.Unicode).
@Drivers =
    EXTRACT driver_id int,
            name string,
            street string,
            city string,
            region string,
            zipcode string,
            country string,
            phone_numbers string // Map
    FROM "/Samples/Data/AmbulanceData/Drivers.txt"
    USING Extractors.Text(delimiter: '\t', encoding: Encoding.Unicode);

OUTPUT @Drivers
TO "/Output/ReferenceGuide/BuiltIn/UDOs/extractorText_Drivers.csv"
USING Outputters.Csv();

// ASCII must be quoted with [] so that it is not read as a reserved U-SQL keyword.
// No delimiter is specified, so the extractor uses its default column delimiter (a comma).
@Trips =
    EXTRACT date DateTime,
            driver_id int,
            vehicle_id int,
            trips string // Array
    FROM "/Samples/Data/AmbulanceData/DriverShiftTrips.csv"
    USING Extractors.Text(encoding: Encoding.[ASCII]);

OUTPUT @Trips
TO "/Output/ReferenceGuide/BuiltIn/UDOs/extractorText_DriverShiftTrips.csv"
USING Outputters.Csv();
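
Several parameters can also be combined in a single call. The following is a minimal sketch rather than one of the documented samples: the input and output paths, the column names, and the pipe delimiter are hypothetical and do not refer to files in the sample data. It assumes the quoting and silent parameters behave as described in Extractor Parameters (U-SQL).

// Hypothetical input: a pipe-delimited file whose first row should be skipped.
@scores =
    EXTRACT id int,
            name string,
            score double?
    FROM "/Samples/Data/hypothetical_scores.txt" // hypothetical path
    USING Extractors.Text(delimiter: '|',
                          quoting: true,      // treat double-quoted values as single columns
                          skipFirstNRows: 1,  // skip the first row of the file
                          silent: false);     // raise an error on rows that do not match the schema

OUTPUT @scores
TO "/Output/hypothetical_scores.csv" // hypothetical path
USING Outputters.Csv();

Setting silent to false keeps malformed rows from being dropped unnoticed, which is usually the safer choice while a schema is still being worked out.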