How can I get result of Dask compute on a different machine than the one that submitted it?

Submitted by 时光总嘲笑我的痴心妄想 on 2020-01-15 03:22:05

Question


I am using Dask behind a Django server and the basic setup I have is summarised here: https://github.com/MoonVision/django-dask-demo/ where the Dask client can be found here: https://github.com/MoonVision/django-dask-demo/blob/master/demo/daskmanager/daskmanager.py

I want to be able to separate the saving of a task's result from the server that submitted it, for robustness and scalability. I would also like more detailed information about the processing status of the task: right now the future's status is always pending, even while the task is processing. Having a rough estimate of percent complete would also be great.

Right now, if the web server were to die, the client would get deleted and the task would stop as no client is still holding the future. I can get around this by using fire_and_forget but I then have no way to save the task status and result when it completes.

Ways I see to track the status and save the result after a fire_and_forget:

  1. I could have a scheduler plugin that sends all transitions to an AMQP server (RabbitMQ). I like the robustness, being able to subscribe to certain messages that are output by the scheduler, and knowing every message will be processed. I'm not sure how I could get the result itself with this method. I could manually add a node to the end of every graph to save the result, but I would rather have it happen behind the scenes.

  2. Run get_task_stream on a separate server, or use it in some other way. With this, it seems I could miss some messages if that server were to go down, so it looks like a worse version of option 1.

  3. Other option?
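
For option 1, a minimal sketch of a scheduler plugin that forwards every task state transition to a callback might look like the following. The class name TransitionForwarder and the publish callback are hypothetical; in a real setup publish would wrap a call to a RabbitMQ client, which is not shown here. This only covers status tracking and does not carry the task result.

```python
from distributed.diagnostics.plugin import SchedulerPlugin


class TransitionForwarder(SchedulerPlugin):
    """Forward each task state transition to a user-supplied callback.

    `publish` is a hypothetical hook: any callable that accepts a dict.
    It stands in for "send this message to an AMQP exchange".
    """

    def __init__(self, publish):
        self.publish = publish

    def transition(self, key, start, finish, *args, **kwargs):
        # The scheduler calls this hook whenever a task changes state,
        # for example waiting -> processing or processing -> memory.
        self.publish({"key": str(key), "start": start, "finish": finish})


if __name__ == "__main__":
    # Stand-alone demonstration: call the hook by hand and collect messages
    # in a list instead of sending them to a broker.
    messages = []
    forwarder = TransitionForwarder(messages.append)
    forwarder.transition("task-1", "waiting", "processing")
    forwarder.transition("task-1", "processing", "memory")
    for message in messages:
        print(message)
```

On a running scheduler, an instance would be attached with the scheduler's add_plugin method. The "processing" transition is the signal that the future status alone does not expose.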

What would be the best way to accomplish this?

Edit: I just tested this, and it seems that when the client that submitted a task shuts down, all futures it created are moved from processing to forgotten, even if fire_and_forget was called on them.


Answer 1:


You probably want to look at Dask's coordination primitives, such as Queues and Pub/Sub. My guess is that putting your futures into a queue would solve your problem.

https://docs.dask.org/en/latest/futures.html#coordination-primitives
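
A minimal sketch of the queue idea, assuming two clients connected to the same scheduler: one stands in for the web server that submits work, the other for a separate process that saves results. The helper names (submit_job, collect_result, square) and the queue name "results" are made up for illustration, and an in-process LocalCluster is used only so the example runs on a single machine.

```python
from dask.distributed import Client, LocalCluster, Queue


def square(x):
    return x * x


def submit_job(client, queue_name, func, *args):
    """Submitting side: start the task and put its future on a named queue."""
    future = client.submit(func, *args)
    Queue(queue_name, client=client).put(future)
    return future.key


def collect_result(client, queue_name, timeout=30):
    """Saving side: take the next future off the queue and wait for its result."""
    future = Queue(queue_name, client=client).get(timeout=timeout)
    return future.key, future.result(timeout=timeout)


if __name__ == "__main__":
    # Both clients talk to one scheduler. In a real deployment each would run
    # on its own machine and connect to the scheduler's address instead.
    with LocalCluster(processes=False, dashboard_address=None) as cluster:
        with Client(cluster) as web, Client(cluster) as saver:
            submitted_key = submit_job(web, "results", square, 7)
            key, value = collect_result(saver, "results")
            print(submitted_key == key, value)
```

Because the queue is looked up by name, the saving process does not need any object from the submitting process, only the scheduler address and the queue name.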



Source: https://stackoverflow.com/questions/55513623/how-can-i-get-result-of-dask-compute-on-a-different-machine-than-the-one-that-su
