Cannot connect from Spark Streaming to Kafka: org.apache.spark.SparkException: java.net.SocketTimeoutException

Submitted by 99封情书 on 2019-12-11 12:15:59

Question


I'm trying to read from a Kafka topic with a Spark Streaming direct stream, but I receive the following error:

INFO consumer.SimpleConsumer: Reconnect due to socket error: java.net.SocketTimeoutException
ERROR yarn.ApplicationMaster: User class threw exception: org.apache.spark.SparkException: java.net.SocketTimeoutException
java.net.SocketTimeoutException
org.apache.spark.SparkException: java.net.SocketTimeoutException
java.net.SocketTimeoutException
    at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
    at org.apache.spark.streaming.kafka.KafkaCluster$$anonfun$checkErrors$1.apply(KafkaCluster.scala:366)
    at scala.util.Either.fold(Either.scala:97)
    at org.apache.spark.streaming.kafka.KafkaCluster$.checkErrors(KafkaCluster.scala:365)
    at org.apache.spark.streaming.kafka.KafkaUtils$.createDirectStream(KafkaUtils.scala:422)

I have Kafka 0.7.1 and Spark 1.5.2.

I'm using the following code:

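  import kafka.serializer.DefaultDecoder
  import org.apache.spark.streaming.{Seconds, StreamingContext}
  import org.apache.spark.streaming.kafka.KafkaUtils

  // sparkContext is an existing SparkContext; the batch interval is 60 seconds
  val ssc: StreamingContext = new StreamingContext(sparkContext, Seconds(60))
  val topicsSet = Set("myTopic")
  // Brokers are given as a comma-separated host:port list
  val kafkaParams = Map[String, String](
    "metadata.broker.list" -> "mybrokerhostname1:9092,mybrokerhostname2:9092")

  val stream = KafkaUtils.createDirectStream[Array[Byte], Array[Byte], DefaultDecoder, DefaultDecoder](ssc, kafkaParams, topicsSet)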

I am sure that the topic already exists, because other applications read from it correctly.


Answer 1:


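Try not to use an older version of Kafka (0.7.1 in your case). If you have a strong reason to stay on 0.7.1, do let me know. Judging by the exception, the application is not able to connect to the Kafka brokers. Note that the Kafka integration shipped with Spark 1.5.x is built against the Kafka 0.8 client, and the 0.8 wire protocol is not compatible with 0.7 brokers.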
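To narrow down whether the brokers are reachable at all, a plain TCP check against each entry of metadata.broker.list can help. The following is a minimal sketch, not part of the original answer: the object name BrokerCheck, its helper functions, and the 3-second default timeout are hypothetical. It only tests network reachability from the machine it runs on (on YARN, that should be a cluster node where the driver or executors run), and a successful connection says nothing about protocol compatibility between the client and the broker.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, Socket}

// Hypothetical helper, not part of Spark or Kafka.
object BrokerCheck {
  // Splits a "host1:port1,host2:port2" string, the format used by metadata.broker.list.
  def parseBrokers(brokerList: String): Seq[(String, Int)] =
    brokerList.split(",").toSeq.map(_.trim).filter(_.nonEmpty).map { entry =>
      val idx = entry.lastIndexOf(':')
      (entry.substring(0, idx), entry.substring(idx + 1).toInt)
    }

  // Returns true if a plain TCP connection can be opened within timeoutMs.
  // Unknown hosts, refused connections, and timeouts all yield false.
  def canConnect(host: String, port: Int, timeoutMs: Int = 3000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Broker list copied from the question; replace with your own.
    val brokers = "mybrokerhostname1:9092,mybrokerhostname2:9092"
    parseBrokers(brokers).foreach { case (host, port) =>
      println(s"$host:$port reachable = ${canConnect(host, port)}")
    }
  }
}
```

If every broker is reachable and the SocketTimeoutException persists, a version mismatch between the client library and the brokers becomes the more likely explanation.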

I have used this direct stream API to read from Kafka 0.8.2; see this example: https://github.com/koeninger/kafka-exactly-once/blob/master/src/main/scala/example/TransactionalPerBatch.scala
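For reference, a build definition matching the Spark version in the question might look like the sketch below. It is an assumption on my part rather than something taken from the linked example: it presumes an sbt build and Scala 2.10 (the default for prebuilt Spark 1.5.x), and it relies on the spark-streaming-kafka artifact pulling in a Kafka 0.8.x client transitively, so the brokers would need to run Kafka 0.8.x as well.

```scala
// build.sbt (sketch; the Scala version and the "provided" scope are assumptions)
scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  // Supplied by the cluster at runtime, hence "provided"
  "org.apache.spark" %% "spark-streaming" % "1.5.2" % "provided",
  // Provides KafkaUtils.createDirectStream; brings in a Kafka 0.8.x client
  "org.apache.spark" %% "spark-streaming-kafka" % "1.5.2"
)
```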

Hope this solves your problem.

Thanks & Regards, Vikas Gite



Source: https://stackoverflow.com/questions/36788424/cannot-connect-from-spark-streaming-to-kafka-org-apache-spark-sparkexception-j
