AWS Athena concurrency limits: Number of submitted queries VS number of running queries

丶灬走出姿态 提交于 2019-12-19 03:56:27

问题


According to AWS Athena limitations you can submit up to 20 queries of the same type at a time, but it is a soft limit and can be increased on request. I use boto3 to interact with Athena and my script submits 16 CTAS queries each of which takes about 2 minutes to finish. In a AWS account, it is only me who is using Athena service. However, when I look at the state of queries through console I see that only a few of queries (5 on average) are actually being executed despite all of them being in state Running. Here is what would normally see in Athena hisotry tab:

I understand that, after I submit queries to Athena, it processes the queries by assigning resources based on the overall service load and the amount of incoming requests. But I tried to run them at different days and hours, still would get about 5 queries being executed at the same time.

So my question is this how it supposed to be? If it is then what is the point of being able to submit up to 20 queries if roughly 15 of them would be idling and waiting for available slots.

Update 2019-09-26

Just stumbled across HIVE CONNECTOR in presto documentation, which has a section AWS Glue Catalog Configuration Properties. There we can see

hive.metastore.glue.max-connections: Max number of concurrent connections to Glue (defaults to 5).

This got me wonder if it has something to do with my issue. As I understand, Athena is simply a Presto that runs on EMR cluster which is configured to use AWS Glue Data Catalog as the Metastore.

So what if my issue comes from the fact that EMR cluster for Athena simply uses default value for concurrent connections to Glue, which is 5 which and is exactly of how many concurrent queries are actually getting executed (on average) in my case.

Update 2019-11-27

The Athena team recently deployed a host of new functionality for Athena. although QUEUED has been in the state enum for some time is hasn't been used until now. So now I get, correct info about query state in a history tab, but everything else remains the same.

Also, another post was published with similar problem.

来源:https://stackoverflow.com/questions/57145967/aws-athena-concurrency-limits-number-of-submitted-queries-vs-number-of-running

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!