What are arrows, and how can I use them?

跟風遠走 提交于 2019-11-29 18:43:43

I don't know a tutorial, but I think it's easiest to understand arrows if you look at some concrete examples. The biggest problem I had learning how to use arrows was that none of the tutorials or examples actually show how to use arrows, just how to compose them. So, with that in mind, here's my mini-tutorial. I'll examine two different arrows: functions and a user-defined arrow type MyArr.

-- type representing a computation
data MyArr b c = MyArr (b -> (c,MyArr b c))

1) An Arrow is a calculation from input of a specified type to output of a specified type. The arrow typeclass takes three type arguments: the arrow type, the input type, and the output type. Looking at the instance head for arrow instances we find:

instance Arrow (->) b c where
instance Arrow MyArr b c where

The Arrow (either (->) or MyArr) is an abstraction of a computation.

For a function b -> c, b is the input and c is the output.
For a MyArr b c, b is the input and c is the output.

2) To actually run an arrow computation, you use a function specific to your arrow type. For functions you simply apply the function to an argument. For other arrows, there needs to be a separate function (just like runIdentity, runState, etc. for monads).

-- run a function arrow
runF :: (b -> c) -> b -> c
runF = id

-- run a MyArr arrow, discarding the remaining computation
runMyArr :: MyArr b c -> b -> c
runMyArr (MyArr step) = fst . step

3) Arrows are frequently used to process a list of inputs. For functions these can be done in parallel, but for some arrows output at any given step depends upon previous inputs (e.g. keeping a running total of inputs).

-- run a function arrow over multiple inputs
runFList :: (b -> c) -> [b] -> [c]
runFList f = map f

-- run a MyArr over multiple inputs.
-- Each step of the computation gives the next step to use
runMyArrList :: MyArr b c -> [b] -> [c]
runMyArrList _ [] = []
runMyArrList (MyArr step) (b:bs) = let (this, step') = step b
                                   in this : runMyArrList step' bs

This is one reason Arrows are useful. They provide a computation model that can implicitly make use of state without ever exposing that state to the programmer. The programmer can use arrowized computations and combine them to create sophisticated systems.

Here's a MyArr that keeps count of the number of inputs it has received:

-- count the number of inputs received:
count :: MyArr b Int
count = count' 0
  where
    count' n = MyArr (\_ -> (n+1, count' (n+1)))

Now the function runMyArrList count will take a list length n as input and return a list of Ints from 1 to n.

Note that we still haven't used any "arrow" functions, that is either Arrow class methods or functions written in terms of them.

4) Most of the code above is specific to each Arrow instance[1]. Everything in Control.Arrow (and Control.Category) is about composing arrows to make new arrows. If we pretend that Category is part of Arrow instead of a separate class:

-- combine two arrows in sequence
>>> :: Arrow a => a b c -> a c d -> a b d

-- the function arrow instance
-- >>> :: (b -> c) -> (c -> d) -> (b -> d)
-- this is just flip (.)

-- MyArr instance
-- >>> :: MyArr b c -> MyArr c d -> MyArr b d

The >>> function takes two arrows and uses the output of the first as input to the second.

Here's another operator, commonly called "fanout":

-- &&& applies two arrows to a single input in parallel
&&& :: Arrow a => a b c -> a b c' -> a b (c,c')

-- function instance type
-- &&& :: (b -> c) -> (b -> c') -> (b -> (c,c'))

-- MyArr instance type
-- &&& :: MyArr b c -> MyArr b c' -> MyArr b (c,c')

-- first and second omitted for brevity, see the accepted answer from KennyTM's link
-- for further details.

Since Control.Arrow provides a means to combine computations, here's one example:

-- function that, given an input n, returns "n+1" and "n*2"
calc1 :: Int -> (Int,Int)
calc1 = (+1) &&& (*2)

I've frequently found functions like calc1 useful in complicated folds, or functions that operate on pointers for example.

The Monad type class provides us with a means to combine monadic computations into a single new monadic computation using the >>= function. Similarly, the Arrow class provides us with means to combine arrowized computations into a single new arrowized computation using a few primitive functions (first, arr, and ***, with >>> and id from Control.Category). Also similar to Monads, the question of "What does an arrow do?" can't be generally answered. It depends on the arrow.

Unfortunately I don't know of many examples of arrow instances in the wild. Functions and FRP seem to be the most common applications. HXT is the only other significant usage that comes to mind.

[1] Except count. It's possible to write a count function that does the same thing for any instance of ArrowLoop.

C. A. McCann

From a glance at your history on Stack Overflow, I'm going to assume that you're comfortable with some of the other standard type classes, particularly Functor and Monoid, and start with a brief analogy from those.

The single operation on Functor is fmap, which serves as a generalized version of map on lists. This is pretty much the entire purpose of the type class; it defines "things that you can map over". So, in a sense Functor represents a generalization of that particular aspect of lists.

The operations for Monoid are generalized versions of the empty list and (++), and it defines "things that can be combined associatively, with a particular thing that's an identity value". Lists are pretty much the simplest thing that fits that description, and Monoid represents a generalization of that aspect of lists.

In the same way as the above two, the operations on the Category type class are generalized versions of id and (.), and it defines "things connecting two types in a particular direction, that can be connected head-to-tail". So, this represents a generalization of that aspect of functions. Notably not included in the generalization are currying or function application.

The Arrow type class builds off of Category, but the underlying concept is the same: Arrows are things that compose like functions and have an "identity arrow" defined for any type. The additional operations defined on the Arrow class itself just define a way to lift an arbitrary function to an Arrow and a way to combine two arrows "in parallel" as a single arrow between tuples.

So, the first thing to keep in mind here is that expressions building Arrows are, essentially, elaborate function composition. The combinators like (***) and (>>>) are for writing "pointfree" style, while proc notation gives a way to assign temporary names to inputs and outputs while wiring things up.

A useful thing to note here is that, even though Arrows are sometimes described as being "the next step" from Monads, there's really not a terribly meaningful relationship there. For any Monad you can work with Kleisli arrows, which are just functions with a type like a -> m b. The (<=<) operator in Control.Monad is arrow composition for these. On the other hand, Arrows don't get you a Monad unless you also include the ArrowApply class. So there's no direct connection as such.

The key difference here is that whereas Monads can be used to sequence computations and do things step-by-step, Arrows are in some sense "timeless" just like regular functions. They can include extra machinery and functionality that gets spliced in by (.), but it's more like building a pipeline, not accumulating actions.

The other related type classes add additional functionality to an arrow, such as being able to combine arrows with Either as well as (,).


My favorite example of an Arrow is stateful stream transducers, which look something like this:

data StreamTrans a b = StreamTrans (a -> (b, StreamTrans a b))

A StreamTrans arrow converts an input value to an output and an "updated" version of itself; consider the ways that this differs from a stateful Monad.

Writing instances for Arrow and its related type classes for the above type may be a good exercise for understanding how they work!

I also wrote a similar answer previously that you may find helpful.

I'd like to add that arrows in Haskell are far simpler than they might appear based on the literature. They are simply abstractions of functions.

To see how this is practically useful, consider that you have a bunch of functions you want to compose, where some of them are pure and some are monadic. For example, f :: a -> b, g :: b -> m1 c, and h :: c -> m2 d.

Knowing each of the types involved, I could build a composition by hand, but the output type of the composition would have to reflect the intermediate monad types (in the above case, m1 (m2 d)). What if I just wanted to treat the functions as if they were just a -> b, b -> c, and c -> d? That is, I want to abstract away the presence of monads and reason only about the underlying types. I can use arrows to do exactly this.

Here is an arrow which abstracts away the presence of IO for functions in the IO monad, such that I can compose them with pure functions without the composing code needing to know that IO is involved. We start by defining an IOArrow to wrap IO functions:

data IOArrow a b = IOArrow { runIOArrow :: a -> IO b }

instance Category IOArrow where
  id = IOArrow return
  IOArrow f . IOArrow g = IOArrow $ f <=< g

instance Arrow IOArrow where
  arr f = IOArrow $ return . f
  first (IOArrow f) = IOArrow $ \(a, c) -> do
    x <- f a
    return (x, c)

Then I make some simple functions that I want to compose:

foo :: Int -> String
foo = show

bar :: String -> IO Int
bar = return . read

And use them:

main :: IO ()
main = do
  let f = arr (++ "!") . arr foo . IOArrow bar . arr id
  result <- runIOArrow f "123"
  putStrLn result

Here I am calling IOArrow and runIOArrow, but if I were passing these arrows around in a library of polymorphic functions, they would only need to accept arguments of type "Arrow a => a b c". None of the library code would need to be made aware that a monad was involved. Only the creator and end user of the arrow needs to know.

Generalizing IOArrow to work for functions in any Monad is called the "Kleisli arrow", and there is already a builtin arrow for doing just that:

main :: IO ()
main = do
  let g = arr (++ "!") . arr foo . Kleisli bar . arr id
  result <- runKleisli g "123"
  putStrLn result

You could of course also use arrow composition operators, and proc syntax, to make it a little clearer that arrows are involved:

arrowUser :: Arrow a => a String String -> a String String
arrowUser f = proc x -> do
  y <- f -< x
  returnA -< y

main :: IO ()
main = do
  let h =     arr (++ "!")
          <<< arr foo
          <<< Kleisli bar
          <<< arr id
  result <- runKleisli (arrowUser h) "123"
  putStrLn result

Here it should be clear that although main knows the IO monad is involved, arrowUser does not. There would be no way of "hiding" IO from arrowUser without arrows -- not without resorting to unsafePerformIO to turn the intermediate monadic value back into a pure one (and thus losing that context forever). For example:

arrowUser' :: (String -> String) -> String -> String
arrowUser' f x = f x

main' :: IO ()
main' = do
  let h      = (++ "!") . foo . unsafePerformIO . bar . id
      result = arrowUser' h "123"
  putStrLn result

Try writing that without unsafePerformIO, and without arrowUser' having to deal with any Monad type arguments.

There are John Hughes's Lecture Notes from an AFP (Advanced Functional Programming) workshop. Note they were written before the Arrow classes were changed in the Base libraries:

http://www.cse.chalmers.se/~rjmh/afp-arrows.pdf

When I began exploring Arrow compositions (essentially Monads), my approach was to break out of the functional syntax and composition it is most commonly associated with and begin by understanding its principles using a more declarative approach. With this in mind, I find the following breakdown more intuitive:

function(x) {
  func1result = func1(x)
  if(func1result == null) {
    return null
  } else {
    func2result = func2(func1result)
    if(func2result == null) {
      return null
    } else {
      func3(func2result)
    } 

So, essentially, for some value x, call one function first that we assume may return null (func1), another that may retun null or be assigned to null interchangably, finally, a third function that may also return null. Now given the value x, pass x to func3, only then, if it does not return null, pass this value into func2, and only if this value is not null, pass this value into func1. It's more deterministic and the control flow allows you to construct more sophisticated exception handling.

Here we can utilise the arrow composition: (func3 <=< func2 <=< func1) x.

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!