Question
I've been running some Spark LDA topic modeling jobs and keep hitting various issues (mainly executor disassociation errors at seemingly random intervals), which I suspect come down to insufficient memory allocation on my executors. That in turn seems to point to a problem with the automatic cluster configuration. My latest attempt uses n1-standard-8 machines (8 cores, 30 GB RAM) for both the master and the worker nodes (6 workers, so 48 cores total).
But when I look at /etc/spark/conf/spark-defaults.conf, I see this:
spark.master yarn-client
spark.eventLog.enabled true
spark.eventLog.dir hdfs://cluster-3-m/user/spark/eventlog
# Dynamic allocation on YARN
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.initialExecutors 100000
spark.dynamicAllocation.maxExecutors 100000
spark.shuffle.service.enabled true
spark.scheduler.minRegisteredResourcesRatio 0.0
spark.yarn.historyServer.address cluster-3-m:18080
spark.history.fs.logDirectory hdfs://cluster-3-m/user/spark/eventlog
spark.executor.cores 4
spark.executor.memory 9310m
spark.yarn.executor.memoryOverhead 930
# Overkill
spark.yarn.am.memory 9310m
spark.yarn.am.memoryOverhead 930
spark.driver.memory 7556m
spark.driver.maxResultSize 3778m
spark.akka.frameSize 512
# Add ALPN for Bigtable
spark.driver.extraJavaOptions -Xbootclasspath/p:/usr/local/share/google/alpn/alpn-boot-8.1.3.v20150130.jar
spark.executor.extraJavaOptions -Xbootclasspath/p:/usr/local/share/google/alpn/alpn-boot-8.1.3.v20150130.jar
But these values don't make much sense. Why use only 4 of the 8 executor cores? And only 9.3 of 30 GB RAM? My impression was that all of this configuration was supposed to be handled automatically, but even my attempts at manual tweaking aren't getting me anywhere.
For instance, I tried launching the shell with:
spark-shell --conf spark.executor.cores=8 --conf spark.executor.memory=24g
But then this failed with
java.lang.IllegalArgumentException: Required executor memory (24576+930 MB) is above the max threshold (22528 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'.
I tried changing the associated value in /etc/hadoop/conf/yarn-site.xml, to no effect. Even when I try a different cluster setup (e.g. using executors with 60+ GB RAM) I end up with the same problem: for some reason the max threshold remains at 22528 MB.
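For reference, the property I was editing looked roughly like this (the value here is just one example of what I tried; anything above 24576 + 930 = 25506 MB should in principle cover my request):

<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>26624</value>
</property>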
Is there something I'm doing wrong here, or is this a problem with Google's automatic configuration?
Answer 1:
There are some known issues with default memory configs in clusters where the master machine type is different from the worker machine type, though in your case that doesn't appear to be the main issue.
When you saw the following:
spark.executor.cores 4
spark.executor.memory 9310m
this actually means that each worker node will run 2 executors, and each executor will utilize 4 cores such that all 8 cores are indeed used up on each worker. This way, if we give the AppMaster half of one machine, the AppMaster can successfully be packed next to an executor.
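Concretely, the packing those defaults imply per worker works out roughly like this (using the numbers from your spark-defaults.conf):

2 executors x 4 cores                  = 8 cores, i.e. the whole n1-standard-8
2 executors x (9310m + 930m overhead)  = 20480m of YARN memory per worker
1 AppMaster container (9310m + 930m)   = 10240m, which fits next to a single executor on one node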
The amount of memory given to NodeManagers needs to leave some overhead for the NodeManager daemon itself, and misc. other daemon services such as the DataNode, so ~80% is left for NodeManagers. Additionally, allocations must be a multiple of the minimum YARN allocation, so after flooring to the nearest allocation multiple, that's where the 22528MB comes from for n1-standard-8.
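As a rough back-of-the-envelope for n1-standard-8 (the exact overhead accounting is Dataproc's, but the resulting ceiling matches your error message):

physical RAM:                          30720 MB
usable by YARN after daemon overhead:  22528 MB  (= 22 x 1024)
largest single container allowed:      22528 MB  <- the "max threshold" in your error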
If you add workers that have 60+ GB of RAM, then as long as the master node has the same memory size you should see a higher max threshold number.
Either way, if you're seeing OOM issues, it's not so much the memory per executor that matters most, but rather the memory per task. And if you increase spark.executor.cores at the same time as spark.executor.memory, the memory per task isn't actually being increased, so you won't really be giving your application logic any more headroom; Spark uses spark.executor.cores to determine how many concurrent tasks run in the same memory space.
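To put numbers on that (treating memory per task as roughly spark.executor.memory divided by spark.executor.cores):

default config:        9310m / 4 cores  ≈ 2327m per task
your attempted config: 24576m / 8 cores = 3072m per task (and rejected by YARN anyway)
2 cores instead:       9310m / 2 cores  ≈ 4655m per task, with no YARN changes needed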
To actually get more memory per task, you should mostly try:
1. Use n1-highmem-* machine types
2. Try reducing spark.executor.cores while leaving spark.executor.memory the same
3. Try increasing spark.executor.memory while leaving spark.executor.cores the same
If you do (2) or (3) above, then you'll indeed be leaving cores idle compared to the default config, which tries to occupy all cores, but that's really the only way to get more memory per task aside from going to highmem instances.
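For example, option (2) might look like this (illustrative values; it keeps the default executor memory and just halves the cores):

spark-shell --conf spark.executor.cores=2 --conf spark.executor.memory=9310m

Each executor then runs 2 concurrent tasks instead of 4, so each task gets roughly twice the memory.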
Source: https://stackoverflow.com/questions/34140667/google-cloud-dataproc-configuration-issues