How to validate data in one CSV file against another CSV file using Pentaho?

六月ゝ 毕业季﹏ submitted on 2019-12-24 20:11:13

Question


I have two CSV files. One file has 10 rows, and the other contains a list of data. I want to take the value of one field in the first CSV and check it against the other CSV file. How can I achieve this? Any help would be great.


Answer 1:


The step you are looking for is the Stream Lookup step.

Read your CSV file and the reference file, connect both flows to a Stream Lookup step, and set it up as follows:
a) Lookup step = the step that reads the reference file.
b) Keys / Field = the name of the field in your CSV that identifies the matching row in the reference file.
c) Keys / Lookup field = the name of the corresponding field in the reference file.
d) Field to retrieve = the name of the field in the reference file to return (it may be the identifier itself or any other field you need).
e) Field to retrieve / Type = the data type of the retrieved field. Do not forget to set it!

This adds a column from the reference file to the 10 rows of the CSV file. You can then filter out the rows that the lookup did not find by keeping only the rows where the new column is not null.
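
To make the behaviour concrete, here is a minimal sketch in plain Python of what the lookup-then-filter logic amounts to. It is not PDI code and does not reproduce how the Stream Lookup step is implemented. The sample data, the field names (`id`, `amount`, `name`, `ref_name`) and the helper functions are all hypothetical, and the sketch assumes the key is unique in the reference file.

```python
import csv
import io

# Hypothetical sample data standing in for the two CSV files.
MAIN_CSV = "id,amount\n1,10\n2,20\n3,30\n"
REFERENCE_CSV = "id,name\n1,alpha\n3,gamma\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def stream_lookup(main_rows, reference_rows, key, lookup_key, retrieve, new_field):
    """Return copies of main_rows with new_field taken from the matching reference row.

    Rows with no match get None in new_field (the equivalent of a null).
    Assumes lookup_key values are unique in reference_rows.
    """
    # Index the reference rows by their key for constant-time lookups.
    index = {ref[lookup_key]: ref[retrieve] for ref in reference_rows}
    return [dict(row, **{new_field: index.get(row[key])}) for row in main_rows]


rows = stream_lookup(
    read_rows(MAIN_CSV),
    read_rows(REFERENCE_CSV),
    key="id",          # b) Keys / Field
    lookup_key="id",   # c) Keys / Lookup field
    retrieve="name",   # d) Field to retrieve
    new_field="ref_name",
)

# Keep only the rows the lookup found (new column is not null).
matched = [row for row in rows if row["ref_name"] is not None]
# The remaining rows are the ones missing from the reference file.
unmatched = [row for row in rows if row["ref_name"] is None]
```

With this sample data, `matched` holds the rows with `id` 1 and 3, and `unmatched` holds the row with `id` 2, which has no counterpart in the reference data.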

Because PDI guides all of the above settings with drop-down lists, it should take you about 2 minutes.



Source: https://stackoverflow.com/questions/50017225/how-to-validate-one-csv-data-compare-with-another-csv-file-using-pentaho
