spark-streaming-kafka

How to manually set group.id and commit Kafka offsets in Spark Structured Streaming?

♀尐吖头ヾ submitted on 2020-08-24 06:29:12
Question: I was going through the Spark Structured Streaming - Kafka integration guide here. That page states, under enable.auto.commit, that the Kafka source doesn't commit any offset. So how do I manually commit offsets once my Spark application has successfully processed each record?

Answer 1: Current situation (Spark 2.4.5): this feature is under discussion in the Spark community, see https://github.com/apache/spark/pull/24613. In that pull request you will also find a possible solution for this at https…
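Until something like that lands, a common workaround is to mirror Spark's progress back into Kafka yourself. Below is a minimal sketch, not an official API: after each micro-batch it reads the endOffset values that Spark reports in query.lastProgress and commits them through a separate kafka-python consumer (2.0-era API assumed), so the offsets appear under a group.id of your choosing, which is handy for lag monitoring. The broker, topic, and group names are placeholders.

```python
import json

from kafka import KafkaConsumer
from kafka.structs import TopicPartition, OffsetAndMetadata
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("manual-offset-commit").getOrCreate()

df = (spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")   # placeholder broker
      .option("subscribe", "my-topic")                       # placeholder topic
      .load())

query = (df.writeStream
         .format("console")
         .option("checkpointLocation", "/tmp/ckpt")          # Spark's own recovery point
         .start())

# A separate consumer whose only job is to own the group.id we commit under.
committer = KafkaConsumer(bootstrap_servers="localhost:9092",
                          group_id="my-manual-group",        # placeholder group.id
                          enable_auto_commit=False)

while query.isActive:
    query.awaitTermination(10)           # re-check roughly every 10 seconds
    progress = query.lastProgress        # dict describing the last micro-batch
    if not progress:
        continue
    for source in progress["sources"]:
        end = source["endOffset"]
        if isinstance(end, str):         # may arrive as a JSON string, depending on version
            end = json.loads(end)
        # endOffset has the shape {"my-topic": {"0": 42, "1": 17}}
        offsets = {TopicPartition(topic, int(part)): OffsetAndMetadata(off, "")
                   for topic, parts in end.items()
                   for part, off in parts.items()}
        committer.commit(offsets=offsets)
```

Note that Spark itself still recovers from its checkpoint, not from these commits; the external commit is purely informational, e.g. so that standard Kafka tooling can track consumer lag for the group.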

Does PySpark support the spark-streaming-kafka-0-10 lib?

空扰寡人 submitted on 2020-07-08 02:05:15
Question: My Kafka cluster version is 0.10.0.0, and I want to use a PySpark stream to read data from Kafka. But the Spark Streaming + Kafka Integration Guide, http://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html, has no Python code example. So can PySpark use spark-streaming-kafka-0-10 to integrate with Kafka? Thank you in advance for your help!

Answer 1: I also use Spark Streaming with a Kafka 0.10.0 cluster. After adding the following line to your code, you are good to go: spark.jars.packages org…
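For reference, a minimal sketch of one way to wire this up from a plain Python script, assuming Spark 2.4.x with Scala 2.11: in Spark 2.x only the spark-streaming-kafka-0-8 DStream connector ships Python bindings (the 0-10 connector has no Python API, but the 0-8 one also works against 0.10 brokers), so the package coordinate below uses it; adjust the versions to your deployment. Setting PYSPARK_SUBMIT_ARGS is one common way to pass --packages when you are not launching through spark-submit.

```python
import os

# Must be set before the SparkContext is created; the coordinate assumes
# Spark 2.4.5 / Scala 2.11 -- adjust to match your cluster.
os.environ["PYSPARK_SUBMIT_ARGS"] = (
    "--packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.5 pyspark-shell"
)

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="pyspark-kafka")
ssc = StreamingContext(sc, batchDuration=5)       # 5-second micro-batches

# Direct stream against the brokers; topic and broker names are placeholders.
stream = KafkaUtils.createDirectStream(
    ssc,
    topics=["my-topic"],
    kafkaParams={"metadata.broker.list": "localhost:9092"},
)
stream.map(lambda kv: kv[1]).pprint()             # print message values only

ssc.start()
ssc.awaitTermination()
```

If you are on Structured Streaming rather than DStreams, the package to add is spark-sql-kafka-0-10 instead, and the entry point is spark.readStream.format("kafka").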