Google App Engine: Task queue performance

Posted by 感情迁移 on 2019-11-28 12:51:29
Dan Cornilescu

By placing the "if more" statement at the end of the addCompaniesToIndex() function you're practically serializing the task execution: the next deferred task is not created until the current deferred task has finished indexing its share of docs.

What you could do is move the if more statement right after the Company.query().fetch_page() call where you obtain (most of) the variables needed for the next deferred task execution.

This way the next deferred task would be created and enqueued (long) before the current one completes, so their processing can potentially be overlapping/staggered. You will need some other modifications as well, for example handling the n_entities variable which loses its current meaning in the updated scenario - but that's more or less cosmetic/informational, not essential to the actual doc indexing operation.
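The reordering described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: add_companies_to_index is a simplified stand-in for addCompaniesToIndex(), and fetch_page, index_docs and defer are hypothetical injected callables standing in for Company.query().fetch_page(), the search-index write and deferred.defer(), so that the sketch runs outside GAE.

```python
def add_companies_to_index(fetch_page, index_docs, defer,
                           cursor=None, page_size=100):
    """Index one page of companies, enqueueing the next page first.

    fetch_page, index_docs and defer are hypothetical stand-ins (see the
    text above); on GAE they would be the ndb query, the index write and
    deferred.defer respectively.
    """
    companies, next_cursor, more = fetch_page(page_size, cursor)

    # Enqueue the follow-up task *before* the slow indexing work, so the
    # next page can start while this one is still being indexed.
    if more:
        defer(add_companies_to_index, fetch_page, index_docs, defer,
              cursor=next_cursor, page_size=page_size)

    index_docs(companies)
    return len(companies)


# Demo outside GAE: an in-memory "datastore" and a fake task queue.
DATA = list(range(250))
events = []
pending = []

def fake_fetch_page(page_size, cursor):
    start = cursor or 0
    end = start + page_size
    return DATA[start:end], end, end < len(DATA)

def fake_index_docs(batch):
    events.append(("indexed", len(batch)))

def fake_defer(func, *args, **kwargs):
    events.append(("enqueued", kwargs["cursor"]))
    pending.append((func, args, kwargs))

add_companies_to_index(fake_fetch_page, fake_index_docs, fake_defer)
while pending:
    func, args, kwargs = pending.pop(0)
    func(*args, **kwargs)
```

In the recorded events, each "enqueued" entry appears before the "indexed" entry of the same task, which is the property that lets a real queue overlap the work.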

If the number of deferred tasks is very high, there is a risk of queueing too many of them simultaneously, which could cause an "explosion" in the number of instances GAE would spawn to handle them. If that is not desired, you can "throttle" the rate at which the deferred tasks are spawned by delaying their execution a bit, see https://stackoverflow.com/a/38958475/4495081.
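One way to delay a deferred task is the _countdown keyword of deferred.defer, which sets a start delay in seconds. The sketch below only shows how that keyword would be passed; enqueue_next_page, index_page and the 5 second delay are hypothetical, and defer is injected as a stand-in for deferred.defer so the example runs outside GAE. It is not necessarily the exact approach of the linked answer.

```python
def enqueue_next_page(defer, task, next_cursor, delay_seconds=5):
    """Enqueue the follow-up task, delayed by delay_seconds.

    defer is a stand-in for deferred.defer; _countdown is the keyword that
    deferred.defer forwards to the task queue as a start delay in seconds.
    """
    defer(task, cursor=next_cursor, _countdown=delay_seconds)


# Demo outside GAE with a fake defer that just records its arguments.
recorded = []

def fake_defer(func, *args, **kwargs):
    recorded.append((func.__name__, kwargs))

def index_page(cursor=None):  # hypothetical task body
    pass

enqueue_next_page(fake_defer, index_page,
                  next_cursor="cursor-2", delay_seconds=5)
```

A longer delay trades total indexing time for fewer tasks (and instances) running at once, so the value would need tuning against the actual workload.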

I think I finally managed to get around this issue by using two queues and the idea proposed in the previous answer.

  • On the first queue we only query the main entities (with keys_only) and launch another task on a second queue for those keys. The first task then relaunches itself on queue 1 with the next_cursor.
  • The second queue gets the entity keys and does all the queries and inserts on Full Text Search/BigQuery/PubSub. This is slow: roughly 15 s per group of 100 keys.
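The two-queue layout in the list above might look roughly like this. It is a sketch under assumptions: the queue names "scan-queue" and "processing-queue", the functions scan_keys and process_keys, and the injected fetch_keys_page and defer callables are all hypothetical; on GAE, defer would be deferred.defer, whose _queue keyword selects the target queue.

```python
processed_batches = []

def process_keys(keys):
    """Slow task (queue 2): the Full Text Search/BigQuery/PubSub writes
    would go here. The sketch only records the batch size."""
    processed_batches.append(len(keys))

def scan_keys(fetch_keys_page, defer, cursor=None, page_size=100):
    """Fast task (queue 1): fetch one page of keys, hand it to queue 2,
    then relaunch itself on queue 1 with the next cursor."""
    keys, next_cursor, more = fetch_keys_page(page_size, cursor)
    if keys:
        defer(process_keys, keys, _queue="processing-queue")
    if more:
        defer(scan_keys, fetch_keys_page, defer, cursor=next_cursor,
              page_size=page_size, _queue="scan-queue")


# Demo outside GAE: in-memory keys and a fake defer that logs the queue.
KEYS = list(range(250))
log = []
pending = []

def fake_fetch_keys_page(page_size, cursor):
    start = cursor or 0
    end = start + page_size
    return KEYS[start:end], end, end < len(KEYS)

def fake_defer(func, *args, **kwargs):
    queue = kwargs.pop("_queue")
    log.append((queue, func.__name__))
    pending.append((func, args, kwargs))

scan_keys(fake_fetch_keys_page, fake_defer)
while pending:
    func, args, kwargs = pending.pop(0)
    func(*args, **kwargs)
```

Keeping the fast key scan and the slow writes on separate queues means each queue can be given its own rate and concurrency settings.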

I tried using only one queue as well, but the processing throughput was not as good. I believe this may be because slow and fast tasks were running on the same queue, and the scheduler might not work as well in that case.
