Metrics System not recognizing Custom Source/Sink in application jar

Submitted by 一个人想着一个人 on 2019-12-11 05:29:48

Question


Followup from here.

I've added a custom Source and Sink to my application jar and found a way to get a static, fixed metrics.properties onto the standalone cluster nodes. When I launch my application, I pass the static path: spark.metrics.conf="/fixed-path/to/metrics.properties". Even though my custom source/sink is in my code (the fat jar), I get a ClassNotFoundException on CustomSink.

My fat jar (with the custom Source/Sink code in it) is on HDFS and is readable by everyone.

Here is everything I have already tried setting (since the executors can't find the custom Source/Sink in my application fat jar):

  1. spark.executor.extraClassPath = hdfs://path/to/fat-jar
  2. spark.executor.extraClassPath = fat-jar-name.jar
  3. spark.executor.extraClassPath = ./fat-jar-name.jar
  4. spark.executor.extraClassPath = ./
  5. spark.executor.extraClassPath = /dir/on/cluster/* (the * does not reach file level because there are more nested directories, and I have no way of knowing the random application-id or driver-id needed for an absolute path before launching the app)

It seems like this is how executors get initialized in this case (please correct me if I am wrong):

  1. The driver passes along the jar location (hdfs://../fat-jar.jar) and some properties such as spark.executor.memory.
  2. N executors spin up on the cluster (depending on configuration).
  3. Each executor starts downloading hdfs://../fat-jar.jar but initializes the metrics system in the meantime (I am not sure about this step).
  4. The metrics system looks for the custom Sink/Source classes, since they are named in metrics.properties, before the fat jar that actually contains them has finished downloading (this is my hypothesis).
  5. ClassNotFoundException: CustomSink not found!
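
The failure mode in steps 4 and 5 can be reproduced outside Spark. Spark's metrics system instantiates the sink classes named in metrics.properties by name through reflection, so the class has to be visible to the class loader at the moment the lookup runs. The following is a minimal sketch of that lookup, not Spark's actual code. The class name com.example.metrics.CustomSink is a hypothetical placeholder that stands in for a class living in a jar that has not been added to the classpath.

```java
// Minimal sketch: a class that is only named in a config file is resolved by
// name at runtime, so it must already be visible to the class loader then.
// "com.example.metrics.CustomSink" is a hypothetical name, not from the question.
class SinkLoadingSketch {

    /** Tries to resolve a class by name, as a reflective plugin loader would. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // The jar that contains the class is not on the classpath (yet).
            return false;
        }
    }

    public static void main(String[] args) {
        // A JDK class is always visible to the class loader.
        System.out.println(isLoadable("java.util.Properties"));
        // A class from a jar that has not been added to the classpath is not.
        System.out.println(isLoadable("com.example.metrics.CustomSink"));
    }
}
```

If the hypothesis holds, it does not matter that the fat jar contains the class. What matters is whether the jar is already on the executor's classpath when the lookup happens.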

Is my understanding correct? Moreover, is there anything else I can try? If anyone has experience with custom source/sinks, any help would be appreciated.


Answer 1:


I ran into the same ClassNotFoundException when I needed to extend the existing GraphiteSink class, and here's how I was able to solve it.

First, I created a CustomGraphiteSink class in the org.apache.spark.metrics.sink package:

package org.apache.spark.metrics.sink;
public class CustomGraphiteSink extends GraphiteSink {}

Then I specified the class in metrics.properties:

*.sink.graphite.class=org.apache.spark.metrics.sink.CustomGraphiteSink

And passed this file to spark-submit via:

--conf spark.metrics.conf=metrics.properties




Answer 2:


To use a custom source/sink, distribute the jar that contains it with spark-submit --files and add it to the executor classpath via spark.executor.extraClassPath.



Source: https://stackoverflow.com/questions/39763364/metrics-system-not-recognizing-custom-source-sink-in-application-jar
