where are the individual dataproc spark logs?


Question


Where are the Dataproc Spark job logs located? I know the driver logs are available under the "Logging" section, but what about the executor nodes? Also, where are the detailed steps that Spark executes logged (I know I can see them in the Application Master)? I am attempting to debug a script that seems to hang, and Spark appears to freeze.


Answer 1:


The task (executor) logs are stored on each worker node under /tmp.

It is possible to collect them in one place via YARN log aggregation. Set these properties at cluster creation time (passed via --properties with the yarn: prefix; see the command sketch after this list):

  • yarn.log-aggregation-enable=true
  • yarn.nodemanager.remote-app-log-dir=gs://${LOG_BUCKET}/logs
  • yarn.log-aggregation.retain-seconds=-1
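
For example, a minimal sketch of the cluster-creation command (the cluster name and region are placeholders, not values from the original answer; LOG_BUCKET is assumed to be an existing GCS bucket you control):

  # Create a cluster whose YARN log aggregation writes to a GCS bucket.
  gcloud dataproc clusters create my-cluster \
      --region=us-central1 \
      --properties="yarn:yarn.log-aggregation-enable=true,yarn:yarn.nodemanager.remote-app-log-dir=gs://${LOG_BUCKET}/logs,yarn:yarn.log-aggregation.retain-seconds=-1"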

Here's an article that discusses log management:

https://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/
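
Once aggregation is enabled, the collected logs for a finished job can be printed from a cluster node with the standard YARN CLI (a sketch; the application ID below is a placeholder you would take in practice from the YARN ResourceManager UI):

  # Print all aggregated container logs for one application.
  yarn logs -applicationId application_1234567890123_0001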



Source: https://stackoverflow.com/questions/47342132/where-are-the-individual-dataproc-spark-logs
