I'm running a job on Spark on YARN and trying to emit messages to InfluxDB, but it crashes on an okio conflict:
22:17:54 ERROR ApplicationMaster - User class threw exception: java.lang.NoSuchMethodError: okio.BufferedSource.readUtf8LineStrict(J)Ljava/lang/String;
java.lang.NoSuchMethodError: okio.BufferedSource.readUtf8LineStrict(J)Ljava/lang/String;
at okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:212)
at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189)
Here are my dependencies:
val cdhVersion = "cdh5.12.2"
val sparkVersion = "2.2.0.cloudera2"
val parquetVersion = s"1.5.0-$cdhVersion"
val hadoopVersion = s"2.6.0-$cdhVersion"
val awsVersion = "1.11.295"
val log4jVersion = "1.2.17"
val slf4jVersion = "1.7.5"
lazy val sparkDependencies = Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-hive" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.hadoop" % "hadoop-common" % "2.2.0"
)
lazy val otherDependencies = Seq(
"org.apache.spark" %% "spark-streaming-kinesis-asl" % "2.2.0",
"org.clapper" %% "grizzled-slf4j" % "1.3.1",
"org.apache.logging.log4j" % "log4j-slf4j-impl" % "2.6.2" % "runtime",
"org.slf4j" % "slf4j-log4j12" % slf4jVersion,
"com.typesafe" % "config" % "1.3.1",
"org.rogach" %% "scallop" % "3.0.3",
"org.influxdb" % "influxdb-java" % "2.9"
)
libraryDependencies ++= sparkDependencies.map(_ % "provided" ) ++ otherDependencies
dependencyOverrides ++= Set("com.squareup.okio" % "okio" % "1.13.0")
Using the same jar, I can run a successful test that instantiates an InfluxDB instance in a non-Spark job, but trying to do the same from Spark throws the above error. It sounds like Spark must have its own version of okio that causes this conflict at runtime when I use spark-submit, yet it doesn't show up when I dump the dependency tree. Any advice on how I can get my desired version of okio (1.13.0) onto the runtime classpath of the Spark cluster?
(As I'm typing this, I'm thinking of trying shading, which I will do now.) Thanks.
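The shading idea mentioned above could look roughly like the sketch below. It assumes the fat jar is built with the sbt-assembly plugin (not shown in the question) and uses the older `in assembly` key syntax to match the sbt style of the build above; the `shaded.` package prefix is an arbitrary name chosen for illustration. The intent is to relocate the okio and okhttp3 classes bundled in the application jar so they can no longer collide with whatever copy the cluster puts on the classpath.

```scala
// build.sbt fragment (sketch): requires the sbt-assembly plugin in project/plugins.sbt.
// "shaded." is a hypothetical prefix; any package name that is unique to your jar works.
assemblyShadeRules in assembly := Seq(
  // Relocate okio classes and rewrite every reference to them in all bundled jars.
  ShadeRule.rename("okio.**" -> "shaded.okio.@1").inAll,
  // okhttp3 calls into okio, so relocating it as well keeps the pair consistent.
  ShadeRule.rename("okhttp3.**" -> "shaded.okhttp3.@1").inAll
)
```

After rebuilding, `jar tf` on the assembled jar should list the classes under `shaded/okio/` rather than `okio/`, which is a quick way to check that the rules were applied.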
In my case, using Apache Spark 1.6.3 with the Hadoop HDP distribution:
- I run
spark-shell
and check in the web UI which jars are used.
- Search for okhttp:
jar tf /usr/hdp/current/spark-client/lib/spark-assembly-1.6.3.2.6.3.0-235-hadoop2.7.3.2.6.3.0-235.jar | grep okhttp
- Extract the okhttp version:
jar xf /usr/hdp/current/spark-client/lib/spark-assembly-1.6.3.2.6.3.0-235-hadoop2.7.3.2.6.3.0-235.jar META-INF/maven/com.squareup.okhttp/okhttp/pom.xml
=> version 2.4.0
I have no idea what is providing this version.
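As a complement to inspecting the assembly jar by hand, the JVM itself can report where a class was loaded from. The following is a minimal sketch, not something from the original answer: `WhichJar` and `locationOf` are hypothetical names, and the two class names in `main` are simply the ones from the stack trace in the question. Run from inside the Spark job (for example, at the start of the driver's `main`), it prints the jar that actually wins on the classpath.

```scala
import scala.util.Try

object WhichJar {
  /** Location (usually a jar URL) a class was loaded from.
    * Returns None if the class is not on the classpath or has no code source
    * (as can happen for classes loaded by the bootstrap class loader). */
  def locationOf(className: String): Option[String] =
    Try(Class.forName(className)).toOption.flatMap { cls =>
      Option(cls.getProtectionDomain)
        .flatMap(pd => Option(pd.getCodeSource))
        .flatMap(cs => Option(cs.getLocation))
        .map(_.toString)
    }

  def main(args: Array[String]): Unit = {
    // Class names taken from the stack trace in the question.
    Seq("okio.BufferedSource", "okhttp3.OkHttpClient").foreach { name =>
      val where = locationOf(name).getOrElse("not found or no code source")
      println(s"$name -> $where")
    }
  }
}
```

If the printed location is the Spark assembly jar rather than your application jar, the cluster's copy is shadowing yours.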
I had the same problem on Spark 2.1.0.
Solution: I downgraded the influxdb-java dependency from version 2.11 to 2.1 (2.12 has an empty child dependency, which gave us problems when assembling the fat jar).
influxdb-java 2.1 has a different API, but it works in spark-submit applications.
Source: https://stackoverflow.com/questions/49481868/spark-and-influx-okio-conflict