I've got several .proto files that rely on `syntax = "proto3";`. I also have a Maven project that is used to build Hadoop/Spark jobs (Hadoop 2.7.1 and Spark 1.5.2). I'd like to generate data in Hadoop/Spark and then serialize it according to my proto3 files.
Using libprotoc 3.0.0, I generate Java sources which work fine within my Maven project as long as I have the following in my pom.xml:
```xml
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <version>3.0.0-beta-1</version>
</dependency>
```
Now, when I use my libprotoc-generated classes in a job that gets deployed to a cluster, I get hit with:

```
java.lang.VerifyError: class blah overrides final method mergeUnknownFields.(Lcom/google/protobuf/UnknownFieldSet;)Lcom/google/protobuf/GeneratedMessage$Builder;
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:760)
```
The ClassLoader failing seems reasonable, given that Hadoop/Spark depend on protobuf-java 2.5.0, which is incompatible with my 3.0.0-beta-1. I also noticed that protobuf classes (presumably from both versions) end up in my jar:
```
$ jar tf target/myjar-0.1-SNAPSHOT.jar | grep protobuf | grep '/$'
org/apache/hadoop/ipc/protobuf/
org/jboss/netty/handler/codec/protobuf/
META-INF/maven/com.google.protobuf/
META-INF/maven/com.google.protobuf/protobuf-java/
org/apache/mesos/protobuf/
io/netty/handler/codec/protobuf/
com/google/protobuf/
google/protobuf/
```
Is there something I can do (Maven Shade?) to sort this out?
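For example, would relocating my protobuf 3 classes with the Maven Shade Plugin work, so that my job uses its own renamed copy of protobuf-java 3 while Hadoop/Spark keep loading their 2.5.0? A rough sketch of what I have in mind (the `shaded.` package prefix is just a placeholder I made up):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.4.3</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <!-- Rewrite com.google.protobuf references in my classes
               (including the libprotoc-generated ones) to a private
               package, avoiding the clash with Hadoop/Spark's 2.5.0 -->
          <relocation>
            <pattern>com.google.protobuf</pattern>
            <shadedPattern>shaded.com.google.protobuf</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

My understanding is that shading would rewrite the bytecode of my own and generated classes to reference the relocated package, but I'm not sure whether that's the right fix here or whether it has pitfalls with Spark's classpath.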
Similar issue here: Spark java.lang.VerifyError