Question
A CSV file is brought into the NiFi workflow using a GetFile processor. One column holds an "id", and each id stands for a certain string. There are around 3 ids. For example, if my CSV consists of
name,age,id
John,10,Y
Jake,55,N
Finn,23,C
I know that Y means York, N means Old and C means Cat. I want a new column with the header "nick" that holds the corresponding nick for each id:
name,age,id,nick
John,10,Y,York
Jake,55,N,Old
Finn,23,C,Cat
Finally, I want a CSV with the extra column and the appropriate data for each record. How is this possible using Apache NiFi? Please advise me on the processors that must be used and the configurations that must be changed in order to accomplish this task.
Answer 1:
Flow:
- add a new nick column
- copy over the id to the nick column
- look at each line and match the id with its corresponding value
- set this value into the current line in the nick column
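To make the four steps concrete outside of NiFi, here is a minimal Python sketch of the same logic. It is not NiFi configuration; the function name add_nick_column and the NICKS dictionary are made up for illustration, and the mapping values are the ones given in the question.

```python
import csv
import io

# Mapping taken from the question: id -> nick
NICKS = {"Y": "York", "N": "Old", "C": "Cat"}


def add_nick_column(csv_text: str) -> str:
    """Return csv_text with an extra 'nick' column derived from 'id'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # Step 1: add the new 'nick' column to the header
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames + ["nick"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Step 2: copy the id over to the nick column
        row["nick"] = row["id"]
        # Steps 3 and 4: look up the copied id and store its mapped value;
        # unknown ids are left unchanged
        row["nick"] = NICKS.get(row["nick"], row["nick"])
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    print(add_nick_column("name,age,id\nJohn,10,Y\nJake,55,N\nFinn,23,C\n"))
```

In the NiFi flow, the copy step corresponds to UpdateRecord and the lookup step to ReplaceText, as described next.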
You can achieve this using either ReplaceText or ReplaceTextWithMapping. I do it with ReplaceText:
UpdateRecord will parse the CSV file, add the new column and copy the id value:
Create a CSVReader and keep the default properties. Create a CSVRecordSetWriter and set Schema Access Strategy to Schema Text. Set the Schema Text property to
{
"type":"record",
"name":"foobar",
"namespace":"my.example",
"fields":[
{
"name":"name",
"type":"string"
},
{
"name":"age",
"type":"int"
},
{
"name":"id",
"type":"string"
},
{
"name":"nick",
"type":"string"
}
]
}
Notice that it has the new column. Finally replace the original values with the mapping:
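As a rough illustration of that final replacement step, the Python sketch below applies a regex to the last field of every data line, which is roughly the effect a regex-based text replacement has on the UpdateRecord output. The pattern, the helper name replace_nick and the sample input are assumptions for illustration, not the answer's actual ReplaceText settings.

```python
import re

# Mapping taken from the question: id -> nick
NICKS = {"Y": "York", "N": "Old", "C": "Cat"}

# Matches the last comma-separated field on each line (hypothetical pattern)
LAST_FIELD = re.compile(r",([^,\n]+)$", re.MULTILINE)


def replace_nick(csv_text: str) -> str:
    """Swap the copied id in the last column for its mapped nick."""
    # Keep the header line out of the replacement so 'nick' stays untouched
    header, _, body = csv_text.partition("\n")
    body = LAST_FIELD.sub(
        lambda m: "," + NICKS.get(m.group(1), m.group(1)), body
    )
    return header + "\n" + body


if __name__ == "__main__":
    copied = "name,age,id,nick\nJohn,10,Y,Y\nJake,55,N,N\nFinn,23,C,C\n"
    print(replace_nick(copied))
```

Only the last field is rewritten, so the original id column keeps its Y, N or C value.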
PS: I noticed you are new to SO, welcome! You have not accepted a single answer in any of your previous questions. Accept them, if they solve your problem, as it will help others to find solutions.
Source: https://stackoverflow.com/questions/58554652/apache-nifi-add-column-to-csv-using-mapped-values