How to get the jobId of a job submitted using a Dataproc Workflow Template

爱⌒轻易说出口 posted on 2019-12-24 00:53:18

Question


I submitted a Hive job through a Dataproc Workflow Template, using the Airflow operator DataprocWorkflowTemplateInstantiateInlineOperator from Python. Once the job is submitted, a generated name is assigned to it as its jobId (for example: job0-abc2def65gh12).

Since I could not retrieve the jobId, I tried passing a jobId as a parameter through the REST API, but that did not work.

Can I fetch the jobId or, if that is not possible, can I pass a jobId as a parameter?


Answer 1:


The jobId is available in the metadata field of the Operation object returned by the instantiate operation. See article [1] for how to work with that metadata.
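
As a rough illustration of reading that metadata, here is a minimal sketch that pulls job IDs out of an Operation given as a plain dict (the JSON form the REST API returns). It assumes the workflow metadata lists one node per step under graph.nodes, each with a stepId and, once the job has been submitted, a jobId. The helper name `job_ids_by_step` and the sample values are hypothetical.

```python
from typing import Any, Dict


def job_ids_by_step(operation: Dict[str, Any]) -> Dict[str, str]:
    """Map each workflow step ID to the job ID assigned to it, if any."""
    # Assumption: the workflow metadata keeps one node per step under graph.nodes.
    nodes = operation.get("metadata", {}).get("graph", {}).get("nodes", [])
    # Steps whose job has not been submitted yet carry no jobId, so skip them.
    return {node["stepId"]: node["jobId"] for node in nodes if "jobId" in node}


# Hypothetical Operation, trimmed to the fields used here.
sample_operation = {
    "name": "projects/my-project/regions/us-central1/operations/example-op",
    "done": False,
    "metadata": {
        "state": "RUNNING",
        "graph": {
            "nodes": [
                {"stepId": "job0", "jobId": "job0-abc2def65gh12", "state": "COMPLETED"},
                {"stepId": "job1", "state": "BLOCKED"},
            ]
        },
    },
}

print(job_ids_by_step(sample_operation))
```

Steps that are still waiting on a prerequisite are left out of the result, so the mapping is only complete once the workflow has finished.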

The Airflow operator only polls [2] the Operation and does not return the final Operation object. You could try adding a return statement to its execute method.

Another option is to use the Dataproc REST API [3] after the workflow finishes. Any labels assigned to the workflow itself are propagated to its clusters and jobs, so you can find the jobs with a list jobs call. For example, the filter parameter could look like: filter = labels.my-label=12345
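
To make the label approach concrete, here is a small sketch that builds the request URL for the jobs list endpoint in [3] with a label filter. The project, region, and label values are placeholders, the helper name `list_jobs_url` is hypothetical, and the filter is written with spaces around `=` as in the examples in the REST reference. Sending the request also needs an OAuth access token, which is left out here.

```python
from urllib.parse import urlencode


def list_jobs_url(project_id: str, region: str, label_key: str, label_value: str) -> str:
    """Build a jobs list URL that filters jobs by a single label."""
    base = (
        "https://dataproc.googleapis.com/v1/"
        f"projects/{project_id}/regions/{region}/jobs"
    )
    # urlencode percent-escapes the '=' and the spaces inside the filter expression.
    query = urlencode({"filter": f"labels.{label_key} = {label_value}"})
    return f"{base}?{query}"


# Placeholder values; substitute your own project, region, and workflow label.
print(list_jobs_url("my-project", "us-central1", "my-label", "12345"))
```

Each job in the response carries its job ID under reference.jobId, so a GET on this URL after the workflow finishes should return the IDs you are looking for.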

[1] https://cloud.google.com/dataproc/docs/concepts/workflows/debugging#using_workflowmetadata

[2] https://github.com/apache/airflow/blob/master/airflow/contrib/operators/dataproc_operator.py#L1376

[3] https://cloud.google.com/dataproc/docs/reference/rest/v1/projects.regions.jobs/list



Source: https://stackoverflow.com/questions/54550988/how-to-get-jobid-that-was-submitted-using-dataproc-workflow-template
