I have a data file and a corresponding schema file stored in separate locations. I would like to load the data using the schema in the schema-file. I tried using
<
It's possible to load data with a schema file.
When you store your data with the '-schema' flag, a .pig_schema file holding the schema as JSON is written to the output path.
You can use it when loading data:
B = LOAD '<>' USING PigStorage(',','-schema');
You can see the schema by running
describe B;
Check this good post for more details.
This feature is available beginning with Pig 0.10.
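As a rough sketch of the round trip described above (the paths and field names here are made up for illustration, and I'm assuming the load runs as a separate step after the store has finished, so the schema file already exists):

```pig
-- Step 1: store with '-schema'.
-- 'input/data.csv', 'output/with_schema' and the field names are hypothetical.
A = LOAD 'input/data.csv' USING PigStorage(',')
    AS (id:long, name:chararray);

-- Writes the data plus a hidden .pig_schema file into the output directory.
STORE A INTO 'output/with_schema' USING PigStorage(',', '-schema');

-- Step 2: run once step 1 has completed (for example as a separate script).
-- No AS clause is given; the schema is read from .pig_schema in that directory.
B = LOAD 'output/with_schema' USING PigStorage(',', '-schema');
DESCRIBE B;
```

Note that the schema file is looked up in the directory being loaded, so a schema kept in a separate location would first have to be copied next to the data as .pig_schema.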
The AS clause is for specifying the schema directly, not the path to the schema file. The schema goes in parentheses, not in quotes:
A = LOAD '<file path>' USING PigStorage('\u0001') AS (type:long, id:chararray, nameformat:chararray);
Alternatively, a file named .pig_schema containing the schema and located in your input directory could work as well. I have never tried that, though. It must be a JSON file with the following syntax:
{"fields":[
{"name":"type","type":55,"description":"Fu","schema":null},
{"name":"id","type":15,"description":"Bar","schema":null},
{"name":"nameFormat","type":55,"description":"Xu","schema":null}
],"version":0,"sortKeys":[],"sortKeyOrders":[]}
This file is also generated if you specify the -schema option when storing with PigStorage.