Spark Cluster + Spring XD


Question


I am trying to run a Spark processor on Spring XD for a streaming operation.

The Spark processor module on Spring XD works when Spark points to local. The processor fails to run when we point Spark to Spark standalone (running on the same machine) or to yarn-client. Is it possible to run a Spark processor on Spark standalone or YARN inside Spring XD, or is Spark local the only option here?

The processor module is defined as:

import java.util.Properties

import org.apache.spark.streaming.dstream.{DStream, ReceiverInputDStream}
import org.springframework.xd.spark.streaming.SparkConfig
import org.springframework.xd.spark.streaming.scala.Processor

class WordCount extends Processor[String, (String, Int)] {

  def process(input: ReceiverInputDStream[String]): DStream[(String, Int)] = {
      val words = input.flatMap(_.split(" "))
      val pairs = words.map(word => (word, 1))
      val wordCounts = pairs.reduceByKey(_ + _)
      wordCounts
  }

  @SparkConfig
  def properties : Properties = {
    val props = new Properties()
    // Any specific Spark configuration properties would go here.
    // These properties always get the highest precedence
    //props.setProperty("spark.master", "spark://a.b.c.d:7077")
    props.setProperty("spark.master", "spark://abcd.hadoop.ambari:7077")
    props
  }

}

The processor works fine when the config is given as local. Is there something that I am missing in the declarations?

Thanks!

EDIT: ERROR LOG

// Commands executed on xd-shell
===================================================================
spark/sbin/start-all.sh

module upload --file /opt/igc_services/SparkDev/XdWordCount/build/libs/spark-streaming-wordcount-scala-processor-0.1.0.jar  --name scala-word-count --type processor

stream create spark-streaming-word-count --definition "http | processor:scala-word-count | log" --deploy


// Error Log 
====================================================================
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'log' for stream 'spark-streaming-word-count'
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@6dbc4f81 moduleName = 'log', moduleLabel = 'log', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 2, type = sink, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06/spark-streaming-word-count.processor.processor.1, type=CHILD_ADDED
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'processor' for stream 'spark-streaming-word-count'
2015-09-16T14:28:48+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@5e16dafb moduleName = 'scala-word-count', moduleLabel = 'processor', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 1, type = processor, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:28:49+0530 1.2.0.RELEASE WARN DeploymentsPathChildrenCache-0 util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-09-16T14:28:49+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-3 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:09+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-4 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Path cache event: path=/deployments/modules/allocated/8d07cdba-557e-458a-9225-b90e5a5778ce/spark-streaming-word-count.source.http.1, type=CHILD_ADDED
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module 'http' for stream 'spark-streaming-word-count'
2015-09-16T14:29:18+0530 1.2.0.RELEASE INFO DeploymentsPathChildrenCache-0 container.DeploymentListener - Deploying module [ModuleDescriptor@610e43b0 moduleName = 'http', moduleLabel = 'http', group = 'spark-streaming-word-count', sourceChannelName = [null], sinkChannelName = [null], index = 0, type = source, parameters = map[[empty]], children = list[[empty]]]
2015-09-16T14:29:19+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ZKStreamDeploymentHandler - Deployment status for stream 'spark-streaming-word-count': DeploymentStatus{state=failed,error(s)=Deployment of module 'ModuleDeploymentKey{stream='spark-streaming-word-count', type=processor, label='processor'}' to container '4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06' timed out after 30000 ms}
2015-09-16T14:29:29+0530 1.2.0.RELEASE WARN sparkDriver-akka.actor.default-dispatcher-4 remote.ReliableDeliverySupervisor - Association with remote system [akka.tcp://sparkMaster@abcd.hadoop.ambari:7077] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
2015-09-16T14:29:49+0530 1.2.0.RELEASE ERROR sparkDriver-akka.actor.default-dispatcher-3 cluster.SparkDeploySchedulerBackend - Application has been killed. Reason: All masters are unresponsive! Giving up.
2015-09-16T14:29:49+0530 1.2.0.RELEASE WARN DeploymentsPathChildrenCache-0 cluster.SparkDeploySchedulerBackend - Application ID is not initialized yet.
2015-09-16T14:29:49+0530 1.2.0.RELEASE ERROR sparkDriver-akka.actor.default-dispatcher-3 scheduler.TaskSchedulerImpl - Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
2015-09-16T14:29:50+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ContainerListener - Path cache event: path=/containers/4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06, type=CHILD_REMOVED
2015-09-16T14:29:50+0530 1.2.0.RELEASE INFO DeploymentSupervisor-0 zk.ContainerListener - Container departed: Container{name='4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06', attributes={groups=, host=abcd.hadoop.ambari, id=4ff3ba84-e6ca-47dd-894f-aa92bdbb3e06, managementPort=54998, ip=a.b.c.d, pid=4597}}

Answer 1:


The error looks like a version conflict. Make sure to use Spark 1.2.1, the version that Spring XD supports out of the box.

If you need a different Spark version, you can still make it work by removing the 1.2.1 Spark dependency jars from XD_HOME/lib and replacing them with the jars from the Spark version you actually use.
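The jar swap described above can be sketched roughly as follows. The paths, the `XD_HOME`/`SPARK_HOME` locations, and the assembly-jar naming are assumptions about a typical installation, not part of the original answer; adjust them for your own setup, and stop and restart XD around the change.

```shell
#!/bin/sh
# Sketch (assumed paths): replace the Spark 1.2.1 jars bundled with
# Spring XD with the jars from the Spark version running on the cluster.
XD_LIB="${XD_HOME:-/opt/spring-xd}/xd/lib"       # assumption: XD install location
SPARK_LIB="${SPARK_HOME:-/opt/spark}/lib"        # assumption: Spark install location

if [ -d "$XD_LIB" ] && [ -d "$SPARK_LIB" ]; then
  # Remove the bundled Spark 1.2.1 dependency jars from XD's lib directory.
  rm -f "$XD_LIB"/spark-*1.2.1*.jar
  # Copy in the matching jars from the cluster's Spark distribution.
  cp "$SPARK_LIB"/spark-assembly-*.jar "$XD_LIB"/
fi
```

The key point is that the Spark version on the XD classpath must match the version the standalone master is running; a mismatch produces exactly the kind of "All masters are unresponsive" / "Disassociated" errors shown in the log.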



Source: https://stackoverflow.com/questions/32624037/spark-cluster-spring-xd
