How to reorder CSV columns in Apache NiFi

Submitted by 限于喜欢 on 2019-12-11 17:23:48

Question


I want to reorder the columns of a CSV file in Apache NiFi.

Input: I have multiple files that have the same columns, but in a different order.

Output: Keep only some of the columns and store them in the same order for every file.


Answer 1:


In my case, I know those columns are present in every CSV file, so I only need to reorder them. I use QueryRecord to do that.

For example, here are my CSV files:

\\file1
name, age, location, gender
Jack, 40, TW, M
Lisa, 30, CA, F 

\\file2
name, location, gender, age
Mary, JP, F, 25
Kate, DE, F, 23

I'd like to reorder the columns to location,name,gender,age. To do that, I add a new property to QueryRecord named reorder_data, with a value like:

SELECT location,name,gender,age FROM FLOWFILE

Then the data in each flowfile becomes:

\\file1 - reordered
location, name, gender, age
TW, Jack, M, 40
CA, Lisa, F, 30

\\file2 - reordered
location, name, gender, age
JP, Mary, F, 25
DE, Kate, F, 23

This way, QueryRecord outputs the reordered data (on the relationship named after the property, reorder_data) as well as the original data, which is very convenient.
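
To make the effect of the SELECT concrete outside NiFi, here is a minimal Python sketch that reorders columns by header name. It is not how QueryRecord is implemented; it only reproduces the same result on the sample data. The function name reorder_columns is hypothetical, and the sketch assumes every file has a header row.

```python
import csv
import io


def reorder_columns(csv_text, column_order):
    """Return csv_text keeping only column_order, in that order (hypothetical helper)."""
    # skipinitialspace drops the blank that follows each comma in the sample data.
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    out = io.StringIO()
    # extrasaction="ignore" silently drops columns that are not in column_order.
    writer = csv.DictWriter(out, fieldnames=column_order,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


file1 = "name, age, location, gender\nJack, 40, TW, M\nLisa, 30, CA, F\n"
file2 = "name, location, gender, age\nMary, JP, F, 25\nKate, DE, F, 23\n"
order = ["location", "name", "gender", "age"]

print(reorder_columns(file1, order))
print(reorder_columns(file2, order))
```

Because rows are looked up by header name rather than by position, both input layouts produce the same output layout, which is the property the SELECT statement relies on.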

By the way, you can also keep the column order in a process group variable or a flowfile attribute, which makes it easier to maintain:

//Group variable or attribute
column_order   location,name,gender,age

//Property in QueryRecord
reorder_data   SELECT ${column_order} FROM FLOWFILE
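
The substitution above can be illustrated with a small Python sketch. It uses Python's string.Template only because it happens to share the ${name} placeholder syntax; it is an analogy for what the query looks like after the variable is resolved, not NiFi's Expression Language itself.

```python
from string import Template

# Stands in for the group variable or flowfile attribute (same name as above).
column_order = "location,name,gender,age"

# Stands in for the value of the reorder_data property.
query_template = Template("SELECT ${column_order} FROM FLOWFILE")

# Resolve the placeholder to obtain the SQL that would actually be run.
query = query_template.substitute(column_order=column_order)
print(query)  # SELECT location,name,gender,age FROM FLOWFILE
```

Changing the column order then means editing one variable instead of every query that uses it.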



Answer 2:


You should be able to do this with ConvertRecord. The schema for the CSVRecordReader would match the columns (in order) of the input, and the schema for the CSVRecordSetWriter would contain the selected columns in the desired output order. I haven't tried this, but I believe that's how it works.
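
As a rough sketch of what the writer schema could look like, the Python snippet below generates Avro schema JSON with the fields in the desired output order. This is an assumption about how one might set it up, not something the answer confirms: the record name "reordered" and the helper writer_schema are hypothetical, every field is typed as a nullable string for simplicity, and the result is meant to be pasted into the writer's schema text setting.

```python
import json


def writer_schema(record_name, column_order):
    """Build Avro schema JSON whose field order is column_order (hypothetical helper)."""
    # Assumption: treat every column as a nullable string.
    fields = [{"name": column, "type": ["null", "string"]} for column in column_order]
    schema = {"type": "record", "name": record_name, "fields": fields}
    return json.dumps(schema, indent=2)


schema_text = writer_schema("reordered", ["location", "name", "gender", "age"])
print(schema_text)
```

Since the input files use different column orders, the reader side would presumably need to take its field names from each file's header row rather than from one fixed schema.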



Source: https://stackoverflow.com/questions/58182118/how-to-reorder-csv-columns-in-apache-nifi
