Spark + EMR using Amazon's “maximizeResourceAllocation” setting does not use all cores/vcores

我寻月下人不归 · 2021-01-30 04:34

I'm running an EMR cluster (version emr-4.2.0) for Spark using the Amazon-specific maximizeResourceAllocation flag as documented here. According to those docs, "

3 Answers
  •  野趣味 (OP)
     2021-01-30 05:10

    Okay, after a lot of experimentation, I was able to track down the problem. I'm going to report my findings here to help people avoid frustration in the future.

    • While there is a discrepancy between the 8 cores asked for and the 16 VCores that YARN knows about, this doesn't seem to make a difference. YARN isn't using cgroups or anything fancy to limit how many CPUs the executor can actually use.
    • "Cores" on the executor is actually a bit of a misnomer. It is actually how many concurrent tasks the executor will willingly run at any one time; essentially boils down to how many threads are doing "work" on each executor.
    • When maximizeResourceAllocation is set and you run a Spark program, it sets the property spark.default.parallelism to the total number of instance cores (or "vCPUs") across all the non-master instances that were in the cluster at creation time. This is probably too small even in normal cases; I've heard it is recommended to set it to 4x the number of cores you will have available to run your jobs. This helps make sure that there are enough tasks available during any given stage to keep the CPUs busy on all executors.
    • When you have data that comes from different runs of different Spark programs, your data (in RDD or Parquet format or whatever) is quite likely to have been saved with a varying number of partitions. When running a Spark program, make sure you repartition the data either at load time or before a particularly CPU-intensive task. Since you have access to the spark.default.parallelism setting at runtime, this can be a convenient number to repartition to (a minimal sketch follows this list).
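
    As a concrete illustration of the last two points, here is a minimal sketch in Scala. It assumes a SparkContext named sc is already available (as in spark-shell on the cluster), and the S3 path is only a placeholder:

        // sc.defaultParallelism reflects spark.default.parallelism, i.e. whatever
        // maximizeResourceAllocation (or your explicit override) set it to.
        val parallelism = sc.defaultParallelism

        // The partition count of loaded data is decided by the source (file splits,
        // how the previous job wrote it, etc.), not by spark.default.parallelism.
        val lines = sc.textFile("s3://my-bucket/input")

        // Repartition before a CPU-intensive stage so every executor thread has work.
        val repartitioned = lines.repartition(parallelism)
        val totalChars = repartitioned.map(_.length.toLong).reduce(_ + _)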

    TL;DR

    1. maximizeResourceAllocation will do almost everything for you correctly except...
    2. You probably want to explicitly set spark.default.parallelism to 4x the number of instance cores you want the job to run on, on a per "step" (in EMR speak) / "application" (in YARN speak) basis, i.e. set it every time (a sketch follows this list), and...
    3. Make sure that, within your program, your data is appropriately partitioned (i.e. has enough partitions) to allow Spark to parallelize it properly.
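
    For point 2, here is a minimal sketch of setting the value explicitly when the application starts. The cluster shape (4 core nodes with 8 vCPUs each) and the application name are assumptions; adjust them to your own cluster:

        import org.apache.spark.{SparkConf, SparkContext}

        // Hypothetical cluster: 4 core nodes x 8 vCPUs each = 32 instance cores.
        val instanceCores = 4 * 8

        val conf = new SparkConf()
          .setAppName("my-emr-step") // placeholder name
          // 4x the instance cores, per the recommendation above.
          .set("spark.default.parallelism", (4 * instanceCores).toString)

        val sc = new SparkContext(conf)

    The same property can also be passed at submit time with spark-submit --conf spark.default.parallelism=128, which is convenient when the step arguments are generated per cluster.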
