Monitoring the Memory Usage of Spark Jobs

Submitted by 点点圈 on 2020-01-03 08:26:30

Question


How can we get the overall memory used by a Spark job? I have not been able to find the exact parameter to retrieve this. I have looked at the Spark UI but am not sure which field to use. In Ganglia, the available options are:
a) Memory Buffer
b) Cache Memory
c) Free Memory
d) Shared Memory
e) Free Swap Space

None of these corresponds to memory used. Does anyone have an idea how to get this?


Answer 1:


If you persist your RDDs, you can see how big they are in memory on the Storage tab of the Spark UI.
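As a minimal sketch of what that looks like in practice (the application name, RDD name, and data are just illustrative), the snippet below persists an RDD so its size appears on the Storage tab, and also reads the same storage and executor-memory information programmatically through SparkContext's developer APIs:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object MemoryCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("memory-check"))

    // Persist the RDD so its in-memory size shows up on the Storage tab of the Spark UI.
    val data = sc.parallelize(1 to 1000000).map(i => (i, i.toString))
    data.persist(StorageLevel.MEMORY_ONLY).setName("example-rdd")
    data.count()  // materialize the RDD so it is actually cached

    // Programmatic view of the same information (developer API):
    // in-memory and on-disk size of each cached RDD ...
    sc.getRDDStorageInfo.foreach { info =>
      println(s"${info.name}: memSize=${info.memSize} bytes, diskSize=${info.diskSize} bytes")
    }
    // ... and per-executor (maxMemory, remainingMemory) in bytes.
    sc.getExecutorMemoryStatus.foreach { case (executor, (max, remaining)) =>
      println(s"$executor: max=$max bytes, remaining=$remaining bytes")
    }

    sc.stop()
  }
}
```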

It's hard to get a picture of how much memory is being used for intermediate work (e.g. shuffles). Basically, Spark will use as much memory as it needs, given what's available. This means that if your RDDs take up more than 50% of the available resources, your application may slow down because fewer resources are left for execution.
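That 50% split corresponds to the knobs of Spark's unified memory manager. The sketch below just shows the relevant configuration keys; the values are the defaults and are illustrative, not tuning recommendations:

```scala
import org.apache.spark.SparkConf

// Configuration knobs behind the storage/execution split in Spark's unified memory manager.
val conf = new SparkConf()
  .setAppName("memory-tuning-sketch")
  // Fraction of (heap - reserved memory) shared by execution and storage (default 0.6).
  .set("spark.memory.fraction", "0.6")
  // Portion of that shared region protected for cached (storage) data (default 0.5);
  // execution can borrow the rest, and cached blocks beyond this may be evicted.
  .set("spark.memory.storageFraction", "0.5")
```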



Source: https://stackoverflow.com/questions/39615702/monitoring-the-memory-usage-of-spark-jobs
