I have just began with Spark Streaming and I am trying to build a sample application that counts words from a Kafka stream. Although it compiles with sbt package
Following build.sbt
worked for me. It requires you to also put the sbt-assembly
plugin in a file under the projects/
directory.
build.sbt
name := "NetworkStreaming" // https://github.com/sbt/sbt-assembly/blob/master/Migration.md#upgrading-with-bare-buildsbt
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-streaming_2.10" % "1.4.1",
"org.apache.spark" % "spark-streaming-kafka_2.10" % "1.4.1", // kafka
"org.apache.hbase" % "hbase" % "0.92.1",
"org.apache.hadoop" % "hadoop-core" % "1.0.2",
"org.apache.spark" % "spark-mllib_2.10" % "1.3.0"
)
mergeStrategy in assembly := {
case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
case "log4j.properties" => MergeStrategy.discard
case m if m.toLowerCase.startsWith("meta-inf/services/") => MergeStrategy.filterDistinctLines
case "reference.conf" => MergeStrategy.concat
case _ => MergeStrategy.first
}
project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")