Hive execution hook

心在旅途 2021-01-01 09:45

I need to register a custom execution hook in Apache Hive. Could someone explain how to do it?

The current environment I am using is given below:

2 answers
  • 2021-01-01 09:52

    A good starting point: http://dharmeshkakadia.github.io/hive-hook/

    There are examples there.

    Note: when you run from the Hive CLI, the hook's messages are shown on the console. If you execute from Hue, add a logger and you can see the output in the HiveServer2 role log.

  • 2021-01-01 10:04

    There are several types of hooks, depending on the stage at which you want to inject your custom code:

    • Driver run hooks (Pre/Post)
    • Semantic analyzer hooks (Pre/Post)
    • Execution hooks (Pre/Failure/Post)
    • Client statistics publisher

    If you run a script, the processing flow looks as follows:

    1. Driver.run() takes the command
    2. HiveDriverRunHook.preDriverRun()
      (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
    3. Driver.compile() starts processing the command: creates the abstract syntax tree
    4. AbstractSemanticAnalyzerHook.preAnalyze()
      (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
    5. Semantic analysis
    6. AbstractSemanticAnalyzerHook.postAnalyze()
      (HiveConf.ConfVars.SEMANTIC_ANALYZER_HOOK)
    7. Create and validate the query plan (physical plan)
    8. Driver.execute() : ready to run the jobs
    9. ExecuteWithHookContext.run()
      (HiveConf.ConfVars.PREEXECHOOKS)
    10. ExecDriver.execute() runs all the jobs
    11. For each job at every HiveConf.ConfVars.HIVECOUNTERSPULLINTERVAL interval:
      ClientStatsPublisher.run() is called to publish statistics
      (HiveConf.ConfVars.CLIENTSTATSPUBLISHERS)
      If a task fails: ExecuteWithHookContext.run()
      (HiveConf.ConfVars.ONFAILUREHOOKS)
    12. Finish all the tasks
    13. ExecuteWithHookContext.run()
      (HiveConf.ConfVars.POSTEXECHOOKS)
    14. Before returning the result: HiveDriverRunHook.postDriverRun()
      (HiveConf.ConfVars.HIVE_DRIVER_RUN_HOOKS)
    15. Return the result.
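
The list above names the hooks by their HiveConf.ConfVars constants, while a script needs the property keys. The lookup below is a small sketch of that mapping. The keys are the ones I know from HiveConf, so check them against the HiveConf class of your Hive version. The class HiveHookKeys and the method setStatement are hypothetical helpers for illustration, not part of Hive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Lookup from the HiveConf.ConfVars names in the list above to property keys. */
class HiveHookKeys {
    // Assumption: keys as defined in HiveConf; verify for your Hive version.
    static final Map<String, String> KEYS = new LinkedHashMap<>();
    static {
        KEYS.put("HIVE_DRIVER_RUN_HOOKS", "hive.exec.driver.run.hooks");
        KEYS.put("SEMANTIC_ANALYZER_HOOK", "hive.semantic.analyzer.hook");
        KEYS.put("PREEXECHOOKS", "hive.exec.pre.hooks");
        KEYS.put("ONFAILUREHOOKS", "hive.exec.failure.hooks");
        KEYS.put("POSTEXECHOOKS", "hive.exec.post.hooks");
        KEYS.put("CLIENTSTATSPUBLISHERS", "hive.client.stats.publishers");
        KEYS.put("HIVECOUNTERSPULLINTERVAL", "hive.exec.counters.pull.interval");
    }

    /** Builds the "set key=value;" line for a ConfVars name (hypothetical helper). */
    static String setStatement(String confVar, String value) {
        String key = KEYS.get(confVar);
        if (key == null) {
            throw new IllegalArgumentException("Unknown ConfVars name: " + confVar);
        }
        return "set " + key + "=" + value + ";";
    }

    public static void main(String[] args) {
        // prints: set hive.exec.pre.hooks=com.example.MyPreHook;
        System.out.println(setStatement("PREEXECHOOKS", "com.example.MyPreHook"));
    }
}
```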

    For each hook I have indicated the interface you have to implement. The parentheses give the corresponding configuration property key, which you set at the beginning of the script to register your class. For example, to set the pre-execution hook (step 9 of the workflow):

    HiveConf.ConfVars.PREEXECHOOKS -> hive.exec.pre.hooks :
    set hive.exec.pre.hooks=com.example.MyPreHook;
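
To make the `set` line above concrete, here is a minimal sketch of what a class like MyPreHook could look like and how a class-name property can be turned into hook objects. It is not Hive's code: the ExecuteWithHookContext and HookContext types below are empty stand-ins so the sketch compiles without hive-exec, and HookLoaderDemo.loadHooks is a hypothetical helper. As far as I know the real property takes a comma-separated list of class names that Hive instantiates by name, so the hook needs a no-argument constructor. Treat that as an assumption to verify in the Driver class.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-ins so this compiles without hive-exec on the classpath.
// In a real hook, delete these two types and import
// org.apache.hadoop.hive.ql.hooks.ExecuteWithHookContext and HookContext.
interface ExecuteWithHookContext {
    void run(HookContext hookContext) throws Exception;
}

class HookContext {
    // Deliberately empty: the real class exposes the query plan, conf, etc.
}

/** Example hook; the source registers it as com.example.MyPreHook. */
class MyPreHook implements ExecuteWithHookContext {
    @Override
    public void run(HookContext hookContext) throws Exception {
        // Console output shows up in the Hive CLI; under HiveServer2 use a logger.
        System.out.println("MyPreHook fired before execution");
    }
}

class HookLoaderDemo {
    /** Hypothetical helper: instantiates every class in a comma-separated list. */
    static List<ExecuteWithHookContext> loadHooks(String csvClassNames) {
        List<ExecuteWithHookContext> hooks = new ArrayList<>();
        for (String name : csvClassNames.split(",")) {
            String trimmed = name.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas and whitespace
            }
            try {
                Object hook = Class.forName(trimmed).getDeclaredConstructor().newInstance();
                hooks.add((ExecuteWithHookContext) hook);
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Cannot load hook " + trimmed, e);
            }
        }
        return hooks;
    }

    public static void main(String[] args) throws Exception {
        // Uses the stand-in's own class name instead of com.example.MyPreHook.
        for (ExecuteWithHookContext hook : loadHooks(MyPreHook.class.getName())) {
            hook.run(new HookContext());
        }
    }
}
```

In a real deployment the compiled hook also has to be on Hive's classpath before the `set` statement takes effect. How you do that depends on your setup.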
    

    Unfortunately, these features aren't really documented, but you can always look into the Driver class to see the order in which the hooks are evaluated.

    Remark: I assumed Hive 0.11.0 here. I don't think the Cloudera distribution differs (much).
