Question
How can I reorder the columns of a CSV file in Apache NiFi?
Input: multiple files that contain the same columns, but in a different order in each file.
Output: a selected subset of those columns, always written in the same order.
Answer 1:
In my case, I know the columns are present in every CSV file, so I only need to reorder them. I use QueryRecord to do that.
For example, here are my CSV files:
\\file1
name, age, location, gender
Jack, 40, TW, M
Lisa, 30, CA, F
\\file2
name, location, gender, age
Mary, JP, F, 25
Kate, DE, F, 23
I'd like to reorder the columns to location,name,gender,age, so I add a new property named reorder_data to QueryRecord, with a value like:
SELECT location,name,gender,age FROM FLOWFILE
The data in the flowfile then becomes:
\\file1 - reordered
location, name, gender, age
TW, Jack, M, 40
CA, Lisa, F, 30
\\file2 - reordered
location, name, gender, age
JP, Mary, F, 25
DE, Kate, F, 23
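This way QueryRecord gives me both outputs: the reordered data goes to the reorder_data relationship (named after the property) and the unmodified input goes to the original relationship, which is very convenient.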
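To make the effect of the query concrete outside NiFi, here is a minimal Python sketch that reorders columns by header name, which is roughly what SELECT location,name,gender,age does for each flowfile. It is only an illustration: the function name reorder_columns is hypothetical, and it assumes every requested column appears in the header.

```python
import csv
import io


def reorder_columns(csv_text, column_order):
    """Return csv_text with its columns rearranged to match column_order."""
    # skipinitialspace drops the blank that follows each comma in the samples.
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    header = next(reader)
    # Look up where each requested column sits in this particular file.
    positions = [header.index(name) for name in column_order]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(column_order)
    for row in reader:
        writer.writerow([row[i] for i in positions])
    return out.getvalue()


file1 = "name, age, location, gender\nJack, 40, TW, M\nLisa, 30, CA, F\n"
file2 = "name, location, gender, age\nMary, JP, F, 25\nKate, DE, F, 23\n"
order = ["location", "name", "gender", "age"]

print(reorder_columns(file1, order))
print(reorder_columns(file2, order))
```

Because the lookup is by name rather than by position, both files come out in the same column order even though their input orders differ.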
By the way, you can also keep the column order in a process group variable or a flowfile attribute, which makes it easier to maintain:
//Group variable or attribute
column_order location,name,gender,age
//Property in QueryRecord
reorder_data SELECT ${column_order} FROM FLOWFILE
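The substitution itself can be pictured with a short Python sketch. This is an analogy only: NiFi evaluates ${column_order} with its own Expression Language, and Python's string.Template merely happens to accept the same ${name} placeholder syntax. The variables dictionary is a hypothetical stand-in for the group variable or attribute.

```python
from string import Template

# Hypothetical stand-in for a NiFi variable or flowfile attribute.
variables = {"column_order": "location,name,gender,age"}

# Same text as the reorder_data property value above.
query_template = Template("SELECT ${column_order} FROM FLOWFILE")

# Replace the placeholder with the configured column list.
query = query_template.substitute(variables)
print(query)
```

Changing the column order then means editing one value instead of every query that uses it.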
Answer 2:
You should be able to do this with ConvertRecord. Give the CSVRecordReader a schema that matches the columns of the input (in order), and give the CSVRecordSetWriter an output schema that lists only the selected columns, in the desired output order. I haven't tried this, but I believe that's how it works.
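The reader-schema and writer-schema idea can be sketched in plain Python as an analogy, not as NiFi configuration. The assumptions here are that the reader takes its field names from the header row and that the writer's field list plays the role of the output schema. The function name convert_record is hypothetical.

```python
import csv
import io


def convert_record(csv_text, writer_fields):
    """Read records by field name, then write only writer_fields, in that order."""
    # Reader side: field names come from the header row of the input.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)

    out = io.StringIO()
    # Writer side: the field list acts as the output schema; fields that are
    # not listed are silently dropped because of extrasaction="ignore".
    writer = csv.DictWriter(
        out,
        fieldnames=writer_fields,
        extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in reader:
        writer.writerow(record)
    return out.getvalue()


sample = "name, age, location, gender\nJack, 40, TW, M\n"
print(convert_record(sample, ["location", "name"]))
```

Unlike a plain reorder, this version also drops the columns that the output field list leaves out, which matches the "select some columns" part of the question.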
Source: https://stackoverflow.com/questions/58182118/how-to-reorder-csv-columns-in-apache-nifi