how to pass values dynamically in Apache NiFi from executeSQL to SelectHiveQL

大城市里の小女人 提交于 2019-12-25 08:09:41

问题


I have two tables one in mysql test.employee and other in hive default.dept I want to pass empid of test.employee table as a parameter to query in hive table and store data into HDFS

ExecuteSQL -> select empid from test.employee (gives 10 records)

SelectHiveQL -> SELECT * FROM default.dept where empid = ${empid} (should retrieve 10 records)

image description here


回答1:


You could do the following:

  1. ExecuteSQL - to retrieve the employee records
  2. ConvertAvroToJson - for later processing of empid
  3. SplitJson - to split into one flow file per row
  4. EvaluateJsonPath - to get the value of empid into a flow file attribute
  5. ReplaceText - to set the content to the HiveQL statement (using expression language as you have done above)
  6. SelectHiveQL - to get the department records

Note that this executes a Hive SELECT for each of the empid values, so each execution of SelectHiveQL will produce a single record. I'm not sure (given HiveQL semantics for the IN clause, for example) how to get a single HiveQL statement since it's kind of a join between a "table of constants" and the Hive table, not to mention the NiFi processing is more complex as you won't want the SplitJson and would likely have to process all the records at once (with ExecuteScript, e.g.)




回答2:


ExecuteSQL ->SplitAvro ->ConvertAvroToJson -> EvaluateJsonPath -> ReplaceText ->SelectHiveQL -> PutHDFS

ExecuteSQL       ==> Query -> select empid from employees
EvaluateHsonPath ==> destination ->flowfile-attribute
                 ==> empid = $.empid
ReplaceText      ==> ReplacementValue  -> ${empid}
SelectHiveQL   -> select * from dept where empid = ${empid}


来源:https://stackoverflow.com/questions/41057483/how-to-pass-values-dynamically-in-apache-nifi-from-executesql-to-selecthiveql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!