Huge delays translating the DAG to tasks


Question


These are my steps:

  1. Submit the Spark app to an EMR cluster
  2. The driver starts and I can see the Spark UI (no stages have been created yet)
  3. The driver reads an ORC file with ~3000 parts from S3, makes some transformations, and saves it back to S3 (roughly the shape of the sketch below)
  4. Executing the save should create stages in the Spark UI, but the stages take a really long time to appear
  5. The stages appear and execution starts
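
Roughly, the job looks like the following sketch (bucket names, paths, and the transformation are placeholders, not my actual code):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("orc-round-trip").getOrCreate()

// Step 3: read an ORC dataset with ~3000 part files from S3
val df = spark.read.orc("s3a://some-bucket/input/")

// Placeholder for the real transformations
val transformed = df.filter(col("value").isNotNull)

// Step 4: this save is where the long pause happens before any stages appear
transformed.write.orc("s3a://some-bucket/output/")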

Why am I getting that huge delay in step 4? During this time the cluster is apparently waiting for something, and CPU usage is 0%.

Thanks


Answer 1:


Despite its merits, S3 is not a file system, and that makes it a suboptimal choice for working with complex binary formats, which are typically designed with an actual file system in mind. In many cases, secondary tasks (like reading metadata) are more expensive than the actual data fetching.
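
As a rough illustration, you can time the metadata phase on its own from a spark-shell (the path is a placeholder); with ~3000 parts, the listing alone can take a noticeable amount of time:

import org.apache.hadoop.fs.Path

val inputPath = new Path("s3a://some-bucket/input/")
val fs = inputPath.getFileSystem(spark.sparkContext.hadoopConfiguration)

// Each listing batch against S3 is an HTTP round trip, not a local lookup
val t0 = System.nanoTime()
val parts = fs.listStatus(inputPath)
println(s"listing ${parts.length} entries took ${(System.nanoTime() - t0) / 1e9} s")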




Answer 2:


It's probably the commit process between steps 3 and 4; the Hadoop MR and Spark committers assume that rename is an O(1) atomic operation, and rely on it to do atomic commits of work. On S3, rename is O(data) and non-atomic when multiple files in a directory are involved. The 0% CPU load is the giveaway: the client is just awaiting a response from S3, which is doing the COPY internally at 6-10 MB/s.
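
In simplified form, the rename-based commit amounts to something like this (a sketch, not the actual FileOutputCommitter code):

import org.apache.hadoop.fs.{FileSystem, Path}

// Each task writes under a _temporary attempt directory; "commit" then moves
// that directory into the final destination. On HDFS the rename is a cheap
// metadata operation; against S3 it expands into LIST + COPY + DELETE per file.
def commitTask(fs: FileSystem, attemptDir: Path, destDir: Path): Boolean =
  fs.rename(attemptDir, destDir)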

There's work underway in HADOOP-13345 to do a 0-rename commit in S3. For now, you can look for the famed-but-fails-in-interesting-ways Direct Committer from Databricks.
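
If you later move to a Hadoop release where that work has landed (it eventually shipped as the S3A committers in Hadoop 3.1+), selecting one looks roughly like the lines below; these property names come from that later release plus Spark's hadoop-cloud module, so treat them as an assumption relative to this answer:

spark.hadoop.fs.s3a.committer.name directory
spark.sql.sources.commitProtocolClass org.apache.spark.internal.io.cloud.PathOutputCommitProtocol
spark.sql.parquet.output.committer.class org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter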

One more thing: make sure you are using "algorithm 2" for committing, as algorithm 1 does a lot more renaming in the final job commit done by the job master. My full recommended settings for ORC/Parquet performance on Hadoop 2.7 (along with using s3a: URLs) are:

spark.sql.parquet.filterPushdown true
spark.sql.parquet.mergeSchema false
spark.hadoop.parquet.enable.summary-metadata false

spark.sql.orc.filterPushdown true
spark.sql.orc.splits.include.file.footer true
spark.sql.orc.cache.stripe.details.size 10000

spark.sql.hive.metastorePartitionPruning true
spark.speculation false
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version 2
spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped true
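
These can live in spark-defaults.conf or be passed as --conf flags at submit time; as a sketch, here is the same idea applied when building the session (property names as listed above; the app name is a placeholder):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("orc-on-s3a")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped", "true")
  .config("spark.sql.orc.filterPushdown", "true")
  .config("spark.sql.hive.metastorePartitionPruning", "true")
  .getOrCreate()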


Source: https://stackoverflow.com/questions/41558052/huge-delays-translating-the-dag-to-tasks
