Apache NiFi: Add column to csv using mapped values

Question

A CSV is brought into the NiFi workflow using a GetFile processor. It has a column named "id", and each id stands for a certain string. There are around 3 ids. For example, if my CSV consists of

name,age,id
John,10,Y
Jake,55,N
Finn,23,C

I am aware that Y means York, N means Old and C means Cat. I want a new column with a header named "nick" that holds the corresponding nick for each id.

name,age,id,nick
John,10,Y,York
Jake,55,N,Old
Finn,23,C,Cat

Finally, I want a CSV with the extra column and the appropriate data for each record. How is this possible using Apache NiFi? Please advise me on the processors that must be used and the configurations that must be changed in order to accomplish this task.

Answer 1:

Flow:

add a new nick column
copy over the id to the nick column
look at each line and match id with its corresponding value
set this value into current line in the nick column
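
As an illustration of the logic only (this is not NiFi configuration), the four steps above can be sketched in plain Python. The function name `add_nick_column` and the `NICKS` dictionary are made up for this sketch; the mapping values come from the question.

```python
import csv
import io

# Mapping taken from the question: id -> nick.
NICKS = {"Y": "York", "N": "Old", "C": "Cat"}

def add_nick_column(csv_text: str) -> str:
    """Return csv_text with an extra 'nick' column derived from 'id'."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    # Step 1: add a new nick column to the header.
    writer = csv.DictWriter(
        out,
        fieldnames=list(reader.fieldnames) + ["nick"],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        # Step 2: copy the id over to the nick column.
        row["nick"] = row["id"]
        # Steps 3 and 4: look up the id and store the mapped value.
        # Unknown ids are left unchanged (an assumption of this sketch).
        row["nick"] = NICKS.get(row["nick"], row["nick"])
        writer.writerow(row)
    return out.getvalue()

SAMPLE = "name,age,id\nJohn,10,Y\nJake,55,N\nFinn,23,C\n"

if __name__ == "__main__":
    print(add_nick_column(SAMPLE), end="")
```

In the NiFi flow, steps 1 and 2 correspond to UpdateRecord and steps 3 and 4 to ReplaceText.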

You can achieve this using either ReplaceText or ReplaceTextWithMapping. I do it with ReplaceText:

UpdateRecord will parse the csv file, add the new column and copy the id value:

Create a CSVReader and keep the default properties. Create a CSVRecordSetWriter and set Schema Access Strategy to Use 'Schema Text' Property. Set the Schema Text property to

{
   "type":"record",
   "name":"foobar",
   "namespace":"my.example",
   "fields":[
      {
         "name":"name",
         "type":"string"
      },
      {
         "name":"age",
         "type":"int"
      },
      {
         "name":"id",
         "type":"string"
      },
      {
         "name":"nick",
         "type":"string"
      }
   ]
}

Notice that it has the new column. Finally replace the original values with the mapping:
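
The idea behind this replacement step can be sketched in Python with a regular expression. This is only an illustration of the technique, not the ReplaceText configuration itself; it assumes that after UpdateRecord the nick column is the last field of each line and still holds the raw id, and that lines end with `\n`. The names `LAST_FIELD` and `replace_nick_ids` are hypothetical.

```python
import re

# Mapping taken from the question: id -> nick.
NICKS = {"Y": "York", "N": "Old", "C": "Cat"}

# Match the last field of a line, but only when it is one of the known ids.
# The lookbehind requires a preceding comma so only a whole field matches.
LAST_FIELD = re.compile(
    r"(?<=,)(" + "|".join(map(re.escape, NICKS)) + r")$",
    re.MULTILINE,
)

def replace_nick_ids(csv_text: str) -> str:
    """Replace the id stored in the trailing nick column with its mapped value."""
    return LAST_FIELD.sub(lambda m: NICKS[m.group(1)], csv_text)

# Assumed shape of the data after UpdateRecord has copied id into nick.
INTERMEDIATE = "name,age,id,nick\nJohn,10,Y,Y\nJake,55,N,N\nFinn,23,C,C\n"

if __name__ == "__main__":
    print(replace_nick_ids(INTERMEDIATE), end="")
```

Anchoring the pattern to the end of the line keeps the original id column untouched, since only the trailing field is rewritten.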

PS: I noticed you are new to SO, welcome! You have not accepted a single answer in any of your previous questions. Accept them, if they solve your problem, as it will help others to find solutions.

Source: https://stackoverflow.com/questions/58554652/apache-nifi-add-column-to-csv-using-mapped-values

Tags

apache-nifi

data-processing