Question
I'm testing the Apache Spark framework, and I need to monitor some aspects of my cluster, such as network traffic and resource usage.
Ganglia looks like a good option for what I need, and I found out that Spark has support for Ganglia.
The Spark monitoring page says: "To install the GangliaSink you’ll need to perform a custom build of Spark."
I found the directory "/extras/spark-ganglia-lgpl" in my Spark distribution, but I don't know how to install it.
How can I install Ganglia to monitor a Spark cluster? How do I do this custom build?
Thanks!
Answer 1:
Spark's Ganglia support is one of the Maven profiles of the Spark project, named "spark-ganglia-lgpl". To activate the profile, pass the "-Pspark-ganglia-lgpl" option to the mvn command when you build the project. For example, building Spark against Apache Hadoop 2.4.x with Ganglia support is done by
mvn -Pspark-ganglia-lgpl -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
For building the Spark project in general, please refer to the Building Spark with Maven documentation.
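Once you have a Ganglia-enabled build, you also need to enable the GangliaSink in Spark's metrics configuration (conf/metrics.properties, created by copying conf/metrics.properties.template on each node). Below is a minimal sketch; the host and port are the Ganglia multicast defaults and are assumptions for illustration, so replace them with the address of your own gmond daemon:

# Sketch of conf/metrics.properties enabling the GangliaSink for all instances.
# host/port below are assumed Ganglia defaults; point them at your gmond.
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.host=239.2.11.71
*.sink.ganglia.port=8649
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
*.sink.ganglia.ttl=1
*.sink.ganglia.mode=multicast

After distributing the file, restart the cluster (or pass a custom path via the spark.metrics.conf property) so the sink is picked up.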
Answer 2:
So if you're running the HDP stack, I would recommend updating to the latest version. It includes the Spark job tracker as well as the Spark client libraries to be deployed on the machines. It also integrates with Ambari Metrics, which is set to replace Ganglia and Nagios.
Source: https://stackoverflow.com/questions/26166398/spark-monitoring-with-ganglia