Spark java.lang.OutOfMemoryError : Java Heap space

这一生的挚爱 提交于 2019-12-03 23:06:13

Things I would try:

1) Removing spark.memory.offHeap.enabled=true and increasing driver memory to something like 90% of the available memory on the box. You probably are aware of this since you didn't set executor memory, but in local mode the driver and the executor all run in the same process which is controlled by driver-memory. I haven't tried it, but the offHeap feature sounds like it has limited value. Reference

2) An actual cluster instead of local mode. More nodes will obviously give you more RAM.

3a) If you want to stick with local mode, try using less cores. You can do this by specifying the number of cores to use in the master setting like --master local[4] instead of local[*] which uses all of them. Running with less threads simultaneously processing data will lead to less data in RAM at any given time.

3b) If you move to a cluster, you may also want to tweak the number of executors cores for the same reason as mentioned above. You can do this with the --executor-cores flag.

4) Try with more partitions. In your example code you repartitioned to 500 partitions, maybe try 1000, or 2000? More partitions means each partition is smaller and less memory pressure.

Usually, this error is thrown when there is insufficient space to allocate an object in the Java heap. In this case, The garbage collector cannot make space available to accommodate a new object, and the heap cannot be expanded further. Also, this error may be thrown when there is insufficient native memory to support the loading of a Java class. In a rare instance, a java.lang.OutOfMemoryError may be thrown when an excessive amount of time is being spent doing garbage collection and little memory is being freed.

How to fix error :

How to set Apache Spark Executor memory

Spark java.lang.OutOfMemoryError: Java heap space

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!