How Kryo serializer allocates buffer in Spark

vvladymyrov

In my case, the problem was using the wrong property name for the max buffer size.

Up to Spark version 1.3 the property name is spark.kryoserializer.buffer.max.mb - note the ".mb" at the end. But I used the property name from the Spark 1.4 docs - spark.kryoserializer.buffer.max.

As a result, the Spark app fell back to the default value of 64 MB, which was not enough for the amount of data I was processing.

After I fixed the property name to spark.kryoserializer.buffer.max.mb, my app worked fine.
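
For illustration, a minimal sketch of setting the version-appropriate key (512 MB is an arbitrary example value, not what I actually used):

import org.apache.spark.SparkConf

val conf = new SparkConf()
// Spark <= 1.3: the legacy key takes a plain number of megabytes
conf.set("spark.kryoserializer.buffer.max.mb", "512")
// Spark >= 1.4: the replacement key takes a size string with a unit suffix
// conf.set("spark.kryoserializer.buffer.max", "512m")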

lambzee

The solution is to set spark.kryoserializer.buffer.max to 1g in spark-defaults.conf and restart the Spark services.

This at least worked for me.
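
For reference, the corresponding line in spark-defaults.conf would be (using the 1g value from above; tune it to your workload):

spark.kryoserializer.buffer.max  1g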

Use conf.set("spark.kryoserializer.buffer.max.mb", "<value>") to set the Kryo serializer buffer, and keep in mind that the value must be less than 2048; otherwise you will get another error indicating the buffer should be less than 2048 MB.

Mayukh

I am using Spark 1.5.2 and had the same issue. Setting spark.kryoserializer.buffer.max.mb to 256 fixed it.

Note that spark.kryoserializer.buffer.max.mb is now deprecated:

WARN spark.SparkConf: The configuration key 'spark.kryoserializer.buffer.max.mb' has been deprecated as of Spark 1.4 and may be removed in the future. Please use the new key 'spark.kryoserializer.buffer.max' instead.

You should use the new key instead:

import org.apache.spark.SparkConf

val conf = new SparkConf()
// the new key takes a size string with a unit suffix, e.g. "512m" or "1g"
conf.set("spark.kryoserializer.buffer.max", "512m")