Are Scala's actors similar to Go's coroutines?

时光说笑 2020-12-22 15:09

If I wanted to port a Go library that uses Goroutines, would Scala be a good choice because its inbox/akka framework is similar in nature to coroutines?

5 answers
  •  有刺的猬
    2020-12-22 15:50

    There are two questions here:

    • Is Scala a good choice to port goroutines?

    This is an easy question, since Scala is a general purpose language, which is no worse or better than many others you can choose to "port goroutines".

    There are of course many opinions on why Scala is better or worse as a language (e.g. here is mine), but these are just opinions, and don't let them stop you. Since Scala is general purpose, it "pretty much" comes down to: everything you can do in language X, you can do in Scala. If it sounds too broad.. how about continuations in Java :)

    • Are Scala actors similar to goroutines?

    The only similarity (nitpicking aside) is that they both have to do with concurrency and message passing. But that is where the similarity ends.

    Since Jamie's answer gave a good overview of Scala actors, I'll focus more on Goroutines/core.async, but with some actor model intro.

    Actors help things to be "worry free distributed"


    The "worry free" piece is usually associated with terms such as fault tolerance, resiliency, availability, etc.

    Without going into great detail about how actors work, in simple terms actors come down to two things:

    • Locality: each actor has an address/reference that other actors can use to send messages to
    • Behavior: a function that gets applied/called when a message arrives at an actor

    Think "talking processes" where each process has a reference and a function that gets called when a message arrives.
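To make "a reference plus a function" concrete, here is a minimal sketch in Go that emulates the idea with a goroutine and a mailbox channel. This is not Akka's or Erlang's API; `Actor`, `Spawn`, `Send`, `Ask` and `newGreeter` are names made up for illustration, and the mailbox size of 16 is an arbitrary assumption.

```go
package main

import "fmt"

// Message is a hypothetical message type; Reply lets the sender get an answer back.
type Message struct {
	Text  string
	Reply chan string
}

// Actor is the "locality" part: holding this value means holding the address
// that other code can send messages to.
type Actor struct {
	mailbox chan Message
}

// Spawn starts a goroutine that applies behavior to every message that arrives.
// behavior is the "behavior" part of the actor.
func Spawn(behavior func(Message)) *Actor {
	a := &Actor{mailbox: make(chan Message, 16)}
	go func() {
		for msg := range a.mailbox {
			behavior(msg)
		}
	}()
	return a
}

// Send puts a message in the actor's mailbox.
func (a *Actor) Send(msg Message) { a.mailbox <- msg }

// Ask sends text to the actor and waits for its reply.
func Ask(a *Actor, text string) string {
	reply := make(chan string, 1)
	a.Send(Message{Text: text, Reply: reply})
	return <-reply
}

// newGreeter spawns an actor whose behavior is to greet whoever writes to it.
func newGreeter() *Actor {
	return Spawn(func(m Message) {
		m.Reply <- fmt.Sprintf("hello, %s", m.Text)
	})
}

func main() {
	greeter := newGreeter()
	fmt.Println(Ask(greeter, "world")) // hello, world
}
```

Note that the sender only ever sees the `*Actor` reference; the function that handles the message stays private to the actor.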

    There is much more to it of course (e.g. check out Erlang OTP, or the Akka docs), but the above two are a good start.

    Where it gets interesting with actors is.. implementation. The two big ones, at the moment, are Erlang OTP and Scala Akka. While they both aim to solve the same problem, there are some differences. Let's look at a couple:

    • I intentionally avoid lingo such as "referential transparency", "idempotence", etc.; it does no good besides causing confusion, so let's just talk about immutability [a "can't change that" concept]. Erlang as a language is opinionated, and it leans towards strong immutability, while in Scala it is too easy to write actors that change/mutate their state when a message is received. It is not recommended, but mutability in Scala is right there in front of you, and people do use it.

    • Another interesting point that Joe Armstrong talks about is that Scala/Akka is limited by the JVM, which just wasn't really designed with "being distributed" in mind, while the Erlang VM was. It has to do with many things, such as process isolation, per-process vs. whole-VM garbage collection, class loading, process scheduling and others.

    The point of the above is not to say that one is better than the other, but it's to show that purity of the actor model as a concept depends on its implementation.

    Now to goroutines..

    Goroutines help to reason about concurrency sequentially


    As other answers already mentioned, goroutines have their roots in Communicating Sequential Processes, which is a "formal language for describing patterns of interaction in concurrent systems", which by definition can mean pretty much anything :)

    I am going to give examples based on core.async, since I know its internals better than those of Goroutines. But core.async was built after the Goroutines/CSP model, so there should not be too many differences conceptually.

    The main concurrency primitive in core.async/Goroutine is a channel. Think about a channel as a "queue on rocks". This channel is used to "pass" messages. Any process that would like to "participate in a game" creates or gets a reference to a channel and puts/takes (e.g. sends/receives) messages to/from it.
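In Go terms, putting to and taking from a channel looks roughly like the sketch below. `produce` and `consume` are illustrative names, not part of any library.

```go
package main

import "fmt"

// produce puts n messages on the channel and then closes it.
func produce(ch chan<- string, n int) {
	for i := 1; i <= n; i++ {
		ch <- fmt.Sprintf("msg-%d", i)
	}
	close(ch)
}

// consume takes messages until the channel is closed and returns them in order.
func consume(ch <-chan string) []string {
	var got []string
	for m := range ch {
		got = append(got, m)
	}
	return got
}

func main() {
	ch := make(chan string) // unbuffered: each put waits for a matching take
	go produce(ch, 3)
	fmt.Println(consume(ch)) // [msg-1 msg-2 msg-3]
}
```

Neither side knows who is on the other end; both only hold a reference to the channel.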

    Free 24 hour Parking

    Most of the work that is done on channels usually happens inside a "Goroutine" or "go block", which "takes its body and examines it for any channel operations. It will turn the body into a state machine. Upon reaching any blocking operation, the state machine will be 'parked' and the actual thread of control will be released. This approach is similar to that used in C# async. When the blocking operation completes, the code will be resumed (on a thread-pool thread, or the sole thread in a JS VM)" (source).

    It is a lot easier to convey with a visual. Here is what a blocking IO execution looks like:

    [image: blocking IO]

    You can see that threads mostly spend time waiting for work. Here is the same work but done via "Goroutine"/"go block" approach:

    [image: core.async]

    Here 2 threads did all the work that 4 threads did in the blocking approach, while taking the same amount of time.

    The kicker in the above description is that "threads are parked" when they have no work: their state gets "offloaded" to a state machine, and the actual live JVM thread is free to do other work (source for a great visual).

    note: in core.async, channels can also be used outside of "go blocks", in which case they are backed by a JVM thread without parking ability: e.g. if it blocks, it blocks the real thread.
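The same parking idea can be observed in Go. The sketch below, with the made-up helper `parkAndSum`, starts many goroutines that all block on a channel receive while the runtime is limited to a couple of threads executing Go code. The numbers (2 threads, 1000 goroutines) are arbitrary assumptions.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parkAndSum starts n goroutines that each block on a receive, then releases
// them one by one and sums what they received. At most `threads` OS threads
// execute Go code at the same time while this runs.
func parkAndSum(threads, n int) int {
	prev := runtime.GOMAXPROCS(threads)
	defer runtime.GOMAXPROCS(prev)

	in := make(chan int)
	out := make(chan int, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			v := <-in // the goroutine parks here; no OS thread is held waiting
			out <- v
		}()
	}
	for i := 1; i <= n; i++ {
		in <- i // wakes up one parked goroutine
	}
	wg.Wait()
	close(out)

	sum := 0
	for v := range out {
		sum += v
	}
	return sum
}

func main() {
	fmt.Println(parkAndSum(2, 1000)) // 500500
}
```

A thousand blocked goroutines cost a thousand small stacks, not a thousand blocked threads, which is the point of the pictures above.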

    Power of a Go Channel

    Another huge thing in "Goroutines"/"go blocks" is the set of operations that can be performed on a channel. For example, a timeout channel can be created, which will close in X milliseconds. Or there is the select/alt! function which, when used in conjunction with many channels, works like an "are you ready" polling mechanism across different channels. Think about it as a socket selector in non-blocking IO. Here is an example of using a timeout channel and alt! together:

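    ;; searching, now, GET, winner, took and show-timeout are helpers defined elsewhere in wracer
    (defn race [q]
      (searching [:.yahoo :.google :.bing])
      (let [t (timeout timeout-ms)   ; channel that closes after timeout-ms
            start (now)]
        (go
          ;; alt! completes with whichever channel is ready first
          (alt!
            (GET (str "/yahoo?q=" q))  ([v] (winner :.yahoo v (took start)))
            (GET (str "/bing?q=" q))   ([v] (winner :.bing v (took start)))
            (GET (str "/google?q=" q)) ([v] (winner :.google v (took start)))
            t                          ([v] (show-timeout timeout-ms))))))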
    

    This code snippet is taken from wracer, which sends the same request to all three (Yahoo, Bing and Google) and returns the result from the fastest one, or times out (returns a timeout message) if none returned within a given time. Clojure may not be your first language, but you can't disagree on how sequential this implementation of concurrency looks and feels.

    You can also merge/fan-in/fan-out data from/to many channels, map/reduce/filter/... channel data and more. Channels are also first class citizens: you can pass a channel to a channel..
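A small Go sketch of two of those ideas, fan-in and a channel sent over a channel. `merge`, `emit` and `collect` are hypothetical helpers written for this example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// merge fans in: every value from every input channel is forwarded to one output.
func merge(inputs ...<-chan int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for _, in := range inputs {
		wg.Add(1)
		go func(in <-chan int) {
			defer wg.Done()
			for v := range in {
				out <- v
			}
		}(in)
	}
	go func() {
		wg.Wait()
		close(out) // close the output once all inputs are drained
	}()
	return out
}

// emit returns a channel that yields the given values and then closes.
func emit(vals ...int) <-chan int {
	ch := make(chan int)
	go func() {
		defer close(ch)
		for _, v := range vals {
			ch <- v
		}
	}()
	return ch
}

// collect drains a channel into a sorted slice, because arrival order
// across merged channels is not deterministic.
func collect(ch <-chan int) []int {
	var got []int
	for v := range ch {
		got = append(got, v)
	}
	sort.Ints(got)
	return got
}

func main() {
	fmt.Println(collect(merge(emit(1, 2), emit(3, 4)))) // [1 2 3 4]

	// Channels are values: a reply channel can itself be sent over a channel.
	requests := make(chan chan string)
	go func() {
		reply := <-requests
		reply <- "pong"
	}()
	reply := make(chan string)
	requests <- reply
	fmt.Println(<-reply) // pong
}
```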

    Go UI Go!

    Since core.async "go blocks" have this ability to "park" execution state, and have a very sequential "look and feel" when dealing with concurrency, how about JavaScript? There is no concurrency in JavaScript, since there is only one thread, right? And the way concurrency is mimicked is via 1024 callbacks.

    But it does not have to be this way. The above example from wracer is in fact written in ClojureScript that compiles down to JavaScript. Yes, it will work on the server with many threads and/or in a browser: the code can stay the same.

    Goroutines vs. core.async

    Again, a couple of implementation differences [there are more] to underline the fact that the theoretical concept is not exactly one to one in practice:

    • In Go, a channel is typed, in core.async it is not: e.g. in core.async you can put messages of any type on the same channel.
    • In Go, you can put mutable things on a channel. It is not recommended, but you can. In core.async, Clojure's own data structures are immutable by design, hence data inside channels feels a lot safer for its wellbeing.
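Both points on the Go side can be seen in a few lines. In this sketch (`Counter`, `bump` and `sendAndObserve` are invented names) the channel only accepts `*Counter` values, and because a pointer is what travels through it, the receiver's mutation is visible to the sender.

```go
package main

import "fmt"

// Counter is a hypothetical mutable value shared through a channel.
type Counter struct{ N int }

// bump receives a pointer from the channel, mutates what it points to,
// and then signals that it is done.
func bump(ch <-chan *Counter, done chan<- struct{}) {
	c := <-ch
	c.N++
	done <- struct{}{}
}

// sendAndObserve sends a pointer over a typed channel and reports what the
// sender sees after the receiver has touched it.
func sendAndObserve() int {
	ch := make(chan *Counter) // typed: only *Counter values may be sent
	done := make(chan struct{})
	c := &Counter{N: 1}
	go bump(ch, done)
	ch <- c
	<-done // wait for the receiver; this also orders its write before our read
	return c.N
}

func main() {
	fmt.Println(sendAndObserve()) // 2
}
```

Trying to send, say, a string on `ch` would be rejected by the compiler, while nothing stops the two sides from sharing and mutating the same `Counter`.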

    So what's the verdict?


    I hope the above shed some light on differences between the actor model and CSP.

    Not to cause a flame war, but to give you yet another perspective, that of, let's say, Rich Hickey:

    "I remain unenthusiastic about actors. They still couple the producer with the consumer. Yes, one can emulate or implement certain kinds of queues with actors (and, notably, people often do), but since any actor mechanism already incorporates a queue, it seems evident that queues are more primitive. It should be noted that Clojure's mechanisms for concurrent use of state remain viable, and channels are oriented towards the flow aspects of a system."(source)

    However, in practice, WhatsApp is based on Erlang OTP, and it seemed to sell pretty well.

    Another interesting quote is from Rob Pike:

    "Buffered sends are not confirmed to the sender and can take arbitrarily long. Buffered channels and goroutines are very close to the actor model.

    The real difference between the actor model and Go is that channels are first-class citizens. Also important: they are indirect, like file descriptors rather than file names, permitting styles of concurrency that are not as easily expressed in the actor model. There are also cases in which the reverse is true; I am not making a value judgement. In theory the models are equivalent."(source)
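The first sentence of that quote can be checked with a short Go sketch. `pending` and `handedOff` are illustrative helpers: a buffered send returns without anyone having received the value, while an unbuffered send cannot proceed until a receiver is there.

```go
package main

import "fmt"

// pending sends n values into a buffered channel that nobody is reading yet
// and returns how many are sitting in the buffer afterwards.
func pending(n int) int {
	ch := make(chan int, n)
	for i := 0; i < n; i++ {
		ch <- i // returns immediately while the buffer has room
	}
	return len(ch)
}

// handedOff reports whether a send on an unbuffered channel can complete
// when no receiver is waiting.
func handedOff() bool {
	ch := make(chan int)
	select {
	case ch <- 1:
		return true
	default:
		return false // no receiver is waiting, so the send cannot proceed
	}
}

func main() {
	fmt.Println(pending(3))  // 3
	fmt.Println(handedOff()) // false
}
```

A buffered channel therefore behaves much like an actor's mailbox: the sender moves on and never learns when, or whether, the message was picked up.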
