Best practices to parallelize using async workflow

Posted by 半城伤御伤魂 on 2019-12-09 16:08:24

Question


Let's say I wanted to scrape a webpage and extract some data. I'd most likely write something like this:

open System.Net

// extractAllUrls is assumed to be defined elsewhere (string -> list of URLs).
let getAllHyperlinks(url:string) =
    async {  let req = WebRequest.Create(url)
             let! rsp = req.AsyncGetResponse()                // F# Async extension on WebRequest
             use stream = rsp.GetResponseStream()             // depends on rsp
             use reader = new System.IO.StreamReader(stream)  // depends on stream
             let! data = reader.ReadToEndAsync() |> Async.AwaitTask  // depends on reader
             return extractAllUrls(data) }                    // depends on data

The let! tells F# to execute the code in another thread, then bind the result to a variable, and continue processing. The sample above uses two let! statements: one to get the response, and one to read all the data, so it spawns at least two threads (please correct me if I'm wrong).

Although the workflow above spawns several threads, the order of execution is serial because each item in the workflow depends on the previous item. It's not really possible to evaluate any items further down the workflow until the other threads return.

Is there any benefit to having more than one let! in the code above?

If not, how would this code need to change to take advantage of multiple let! statements?


Answer 1:


The key is that we are not spawning any new threads. Over the whole course of the workflow, either one or zero threads from the ThreadPool are in use at any moment. (One exception: up until the first '!', the code runs on the user thread that called Async.Run, which is named Async.RunSynchronously in current F# releases.) A "let!" releases the thread while the async operation is in flight, then picks up a thread from the ThreadPool when the operation returns. The performance advantage is less pressure on the ThreadPool; the major advantage for the user is the simple programming model, which is a million times better than all the BeginFoo/EndFoo/callback code you would otherwise write.
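To make the point concrete, here is a minimal sketch rather than a definitive implementation. It assumes a hypothetical fakeFetch function that uses Async.Sleep as a stand-in for a real network call, and the example.com URLs are placeholders. It records the managed thread id before and after the wait, and it composes several independent workflows with Async.Parallel, which is one way to get actual parallelism out of async workflows: the steps inside a single workflow stay sequential, but the waits of separate workflows can overlap.

```fsharp
open System.Threading

// Hypothetical stand-in for a network call. Async.Sleep waits without
// holding on to a thread, much like an in-flight I/O request.
let fakeFetch (url: string) =
    async {
        let before = Thread.CurrentThread.ManagedThreadId
        do! Async.Sleep 100      // no thread is blocked during this wait
        let after = Thread.CurrentThread.ManagedThreadId
        return url, before, after
    }

// Placeholder URLs, used only for illustration.
let urls =
    [ "http://example.com/a"; "http://example.com/b"; "http://example.com/c" ]

// Each workflow is sequential internally; Async.Parallel starts all of
// them and returns the results as an array in input order.
let results =
    urls
    |> List.map fakeFetch
    |> Async.Parallel
    |> Async.RunSynchronously

for (url, before, after) in results do
    printfn "%s: started on thread %d, resumed on thread %d" url before after
```

The thread ids printed may or may not differ between runs, since the workflow resumes on whichever ThreadPool thread happens to be available.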

See also http://cs.hubfs.net/forums/thread/8262.aspx




Answer 2:


I was writing an answer but Brian beat me to it. I fully agree with him.

I'd like to add that if you want to parallelize synchronous code, the right tool is PLINQ, not async workflows, as Don Syme explains.
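As a rough illustration of parallelizing synchronous, CPU-bound code, the sketch below uses Array.Parallel from FSharp.Core instead of PLINQ itself. This is a swapped-in technique chosen because it needs no extra references; it serves the same data-parallel purpose. The isPrime function and the input range are hypothetical and exist only for the example.

```fsharp
// Hypothetical CPU-bound predicate, used only for illustration.
let isPrime (n: int) =
    n > 1 && Seq.forall (fun d -> n % d <> 0) (seq { 2 .. int (sqrt (float n)) })

// Array.Parallel.choose evaluates the function on multiple threads
// and keeps the surviving elements in their original order.
let primes =
    [| 1 .. 50 |]
    |> Array.Parallel.choose (fun n -> if isPrime n then Some n else None)

printfn "%A" primes
```

No async workflow is involved here: the work is pure computation, so there is no I/O wait for a let! to release a thread around.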



Source: https://stackoverflow.com/questions/496468/best-practices-to-parallelize-using-async-workflow
