Question
I am getting the following error in my Spark application when it tries to serialize a protobuf field that is a map with String keys and float values. The application uses Kryo serialization.
Caused by: java.lang.NullPointerException
at com.google.protobuf.UnmodifiableLazyStringList.size(UnmodifiableLazyStringList.java:68)
at java.util.AbstractList.add(AbstractList.java:108)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:134)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:40)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731)
at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125)
... 71 more
Has anyone faced this issue before? Is there a way to resolve it?
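Answer 1:
You have to register a ProtobufSerializer with Kryo to serialize protobuf messages. Note that the snippet below uses Flink's StreamExecutionEnvironment; in Spark, the same serializer would instead be registered from a KryoRegistrator class named in the spark.kryo.registrator setting.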
StreamExecutionEnvironment.getExecutionEnvironment()
    .registerTypeWithKryoSerializer(YourProtobufClass.class, ProtobufSerializer.class);
Add the dependency below to get access to the ProtobufSerializer class.
<dependency>
    <groupId>de.javakaffee</groupId>
    <artifactId>kryo-serializers</artifactId>
    <version>0.45</version>
</dependency>
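The stack trace in the question hints at why a dedicated serializer is needed. The following is a minimal sketch of the likely mechanism, assuming that Kryo's generic CollectionSerializer created the protobuf list without running its constructor (so the wrapped list field stayed null) and then called add() on it. ReadOnlyWrapper is a hypothetical stand-in for UnmodifiableLazyStringList, not protobuf's actual code, and the null delegate is passed explicitly to simulate the skipped constructor.

```java
import java.util.AbstractList;
import java.util.Collections;
import java.util.List;

public class NullDelegateDemo {

    // Hypothetical stand-in for a read-only list wrapper that forwards to a delegate.
    public static final class ReadOnlyWrapper extends AbstractList<String> {
        private final List<String> delegate;

        public ReadOnlyWrapper(List<String> delegate) {
            this.delegate = delegate;
        }

        @Override
        public String get(int index) {
            return delegate.get(index);
        }

        @Override
        public int size() {
            return delegate.size(); // throws NullPointerException when delegate is null
        }
    }

    // Mimics what a generic collection serializer does on read: add elements to the instance.
    public static String populate(List<String> target) {
        try {
            target.add("x"); // AbstractList.add(E) calls size() first, then add(int, E)
            return "ok";
        } catch (NullPointerException e) {
            return "NullPointerException";
        } catch (UnsupportedOperationException e) {
            return "UnsupportedOperationException";
        }
    }

    public static void main(String[] args) {
        // Delegate never initialized, as if the constructor had been bypassed.
        System.out.println(populate(new ReadOnlyWrapper(null)));
        // Properly constructed wrapper: still not writable, so element-by-element reading cannot work.
        System.out.println(populate(new ReadOnlyWrapper(Collections.singletonList("a"))));
    }
}
```

Either way, a read-only protobuf collection cannot be rebuilt by adding elements one at a time, which is why the message as a whole should go through protobuf's own byte format.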
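Answer 2:
When Kryo encounters an object of a class that has not been registered, it falls back to its generic default serializers (such as the field-based and collection-based ones visible in the stack trace in the question).
It is possible to make Kryo throw an exception instead: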
final Kryo kryo = new Kryo();
kryo.setRegistrationRequired(true);
I decided to keep the registration requirement above because it helps avoid slow serialization of some classes, which could hurt performance.
To serialize protobuf-generated classes, I used the following class:
package com.juarezr.serialization;

import com.esotericsoftware.kryo.Kryo;
import com.esotericsoftware.kryo.Serializer;
import com.esotericsoftware.kryo.io.Input;
import com.esotericsoftware.kryo.io.Output;
import com.google.protobuf.AbstractMessage;

import java.io.Serializable;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Kryo serializer that delegates to protobuf's own wire format (toByteArray / parseFrom).
public class ProtobufSerializer<T extends AbstractMessage> extends Serializer<T> implements Serializable {

    static final long serialVersionUID = 1667386898559074449L;

    // Static parseFrom(byte[]) method of the generated message class, looked up once.
    // Note: java.lang.reflect.Method is not itself Serializable.
    protected final Method parser;

    public ProtobufSerializer(final Class<T> protoMessageClass) {
        try {
            this.parser = protoMessageClass.getDeclaredMethod("parseFrom", byte[].class);
            this.parser.setAccessible(true);
        } catch (SecurityException | NoSuchMethodException ex) {
            throw new IllegalArgumentException(protoMessageClass.toString() + " doesn't have a protobuf parser", ex);
        }
    }

    @Override
    public void write(final Kryo kryo, final Output output, final T protobufMessage) {
        if (protobufMessage == null) {
            // A single zero byte marks a null message.
            output.writeByte(Kryo.NULL);
            output.flush();
            return;
        }
        final byte[] bytes = protobufMessage.toByteArray();
        // Store length + 1 so that 0 (Kryo.NULL) stays reserved for null.
        output.writeInt(bytes.length + 1, true);
        output.writeBytes(bytes);
        output.flush();
    }

    @SuppressWarnings({"unchecked", "JavaReflectionInvocation"})
    @Override
    public T read(final Kryo kryo, final Input input, final Class<T> protoMessageClass) {
        final int length = input.readInt(true);
        if (length == Kryo.NULL) {
            return null;
        }
        // Declared as Object so the byte[] is passed as one argument to the varargs invoke().
        final Object bytesRead = input.readBytes(length - 1);
        try {
            // parseFrom is static, so the first argument to invoke() is ignored.
            final Object parsed = this.parser.invoke(protoMessageClass, bytesRead);
            return (T) parsed;
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new RuntimeException("Unable to deserialize protobuf for class: " + protoMessageClass.getName(), e);
        }
    }

    @Override
    public boolean getAcceptsNull() {
        // This serializer writes and reads the null marker itself.
        return true;
    }

    // Registers the class of rootMessage and every message class nested directly inside it.
    @SuppressWarnings("unchecked")
    public static <M extends AbstractMessage> void registerMessagesFrom(final M rootMessage, final Kryo kryo) {
        final Class<M> messageClass = (Class<M>) rootMessage.getClass();
        final ProtobufSerializer<M> serializer = new ProtobufSerializer<>(messageClass);
        kryo.register(messageClass, serializer);

        final Class<?>[] nestedClasses = messageClass.getDeclaredClasses();
        for (final Class<?> innerClass : nestedClasses) {
            if ((AbstractMessage.class).isAssignableFrom(innerClass)) {
                final Class<M> typedClass = (Class<M>) innerClass;
                final ProtobufSerializer<M> serializer2 = new ProtobufSerializer<>(typedClass);
                kryo.register(typedClass, serializer2);
            }
        }
    }
}
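The null handling in write() and read() above relies on storing length + 1, so that 0 can stand for null while an empty payload (which a protobuf message with only default values can produce) stays distinguishable. Here is a minimal standalone sketch of that framing idea, assuming a fixed 4-byte length prefix in place of Kryo's variable-length int; the class and method names are hypothetical and only for illustration.

```java
import java.nio.ByteBuffer;

public class LengthPlusOneFraming {

    // Plays the role of Kryo.NULL in the serializer above.
    static final int NULL_MARKER = 0;

    // Prefixes the payload with (length + 1); a null payload becomes just the marker.
    public static byte[] frame(byte[] payload) {
        if (payload == null) {
            return ByteBuffer.allocate(4).putInt(NULL_MARKER).array();
        }
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length + 1)
                .put(payload)
                .array();
    }

    // Reads the prefix back; 0 means null, anything else is (length + 1).
    public static byte[] unframe(byte[] framed) {
        ByteBuffer in = ByteBuffer.wrap(framed);
        int length = in.getInt();
        if (length == NULL_MARKER) {
            return null;
        }
        byte[] payload = new byte[length - 1];
        in.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        System.out.println(unframe(frame(null)) == null);               // prints true
        System.out.println(unframe(frame(new byte[0])).length);         // prints 0
        System.out.println(unframe(frame(new byte[] {1, 2, 3})).length); // prints 3
    }
}
```

Without the + 1 offset, an empty message and a null reference would both be written as a length of 0 and could not be told apart on read.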
You can configure the serialization with something like:
// ...
final Kryo kryo = new Kryo();
kryo.setRegistrationRequired(true);
// Add a registration for each generated file and top level class ...
ProtobufSerializer.registerMessagesFrom(MyProtoEnclosingClass.MyProtoTopLevelClass.getDefaultInstance(), kryo);
// Add a registration for each other Java/Scala class you would need...
Source: https://stackoverflow.com/questions/53109011/nullpointerexception-in-protobuf-when-kryo-serialization-is-used-with-spark