AWS Glue takes a long time to finish

借酒劲吻你 2021-01-01 20:04

I am just running a very simple job, as follows:

from awsglue.context import GlueContext
from pyspark.context import SparkContext

glueContext = GlueContext(SparkContext.getOrCreate())
l_table = glueContext.create_dynamic_frame.from_catalog(

3 Answers
  • 2021-01-01 20:35

    Update as of May 2019:

    • Cold start time: 7-8 minutes

    • Warm pool is maintained for 10-15 minutes

  • 2021-01-01 20:53

    The time goes into setting up the environment that runs your code. I had the same issue and contacted the AWS Glue team, who were helpful. Glue builds an environment when you run the first job, and that environment stays alive for 1 hour. If you run the same script again, or any other script, within that hour, the next job takes significantly less time. They call the first run a cold start. My first job took 17 minutes; I ran the same job again right after it finished and it took only 3 minutes.
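
To check how much of a run is startup rather than your script, one rough option is to compare a run's wall-clock duration with its reported execution time. The sketch below assumes the record shape returned by boto3's Glue `get_job_runs` (`StartedOn`, `CompletedOn`, `ExecutionTime` in seconds) and assumes that `ExecutionTime` excludes provisioning time, which this thread does not confirm. The helper names and the sample figures (including the 120-second execution time) are made up for illustration.

```python
from datetime import datetime, timedelta


def startup_overhead_seconds(job_run):
    """Estimate seconds of a run that were not spent executing the script.

    `job_run` is a dict shaped like one entry of `JobRuns` from boto3's
    Glue `get_job_runs`: `StartedOn` and `CompletedOn` are datetimes and
    `ExecutionTime` is a number of seconds.
    """
    wall_clock = (job_run["CompletedOn"] - job_run["StartedOn"]).total_seconds()
    return max(0.0, wall_clock - job_run["ExecutionTime"])


def fetch_job_runs(job_name):
    """Fetch recent runs of a job (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works offline

    glue = boto3.client("glue")
    return glue.get_job_runs(JobName=job_name)["JobRuns"]


# Hypothetical sample data echoing the 17-minute and 3-minute runs above;
# the 120-second execution time is an assumption, not a measured value.
start = datetime(2021, 1, 1, 20, 0, 0)
cold_run = {
    "StartedOn": start,
    "CompletedOn": start + timedelta(minutes=17),
    "ExecutionTime": 120,
}
warm_run = {
    "StartedOn": start,
    "CompletedOn": start + timedelta(minutes=3),
    "ExecutionTime": 120,
}

cold_overhead = startup_overhead_seconds(cold_run)
warm_overhead = startup_overhead_seconds(warm_run)
print(f"cold start overhead: {cold_overhead:.0f}s, warm: {warm_overhead:.0f}s")
```

With real data you would pass each entry from `fetch_job_runs("your-job-name")` to `startup_overhead_seconds` and compare the first run of the day with runs started shortly afterwards.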

  • 2021-01-01 20:56

    When you edit a job, you can add more DPUs under the "Script libraries and job parameters (optional)" section. It helps somewhat, but in my experience you should not expect any major improvement.
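
If you start jobs from code rather than the console, the DPU count can also be set per run. This is only a sketch: it assumes boto3's Glue `start_job_run` with its `MaxCapacity` parameter (a DPU count), the helper names and the job name `my-simple-job` are hypothetical, and newer Glue versions may expect `WorkerType` and `NumberOfWorkers` instead.

```python
def build_start_job_run_kwargs(job_name, dpus):
    """Build keyword arguments for Glue `start_job_run` with a DPU count."""
    if dpus <= 0:
        raise ValueError("dpus must be positive")
    # MaxCapacity is expressed in DPUs and is passed as a float.
    return {"JobName": job_name, "MaxCapacity": float(dpus)}


def start_with_capacity(job_name, dpus):
    """Start a job run with the given DPUs (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works offline

    glue = boto3.client("glue")
    response = glue.start_job_run(**build_start_job_run_kwargs(job_name, dpus))
    return response["JobRunId"]


# Hypothetical job name, used only to show the resulting arguments.
example_kwargs = build_start_job_run_kwargs("my-simple-job", 10)
print(example_kwargs)
```

More DPUs speed up the Spark work itself; they are not expected to shorten the cold start described in the other answers.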
