Should do-notation be avoided in Haskell?

时光说笑 2020-11-29 01:32

Most Haskell tutorials teach the use of do-notation for IO.

I also started with the do-notation, but that makes my code look more like an imperative language more th

7 answers
  • 2020-11-29 01:54

    In my opinion <$> and <*> make the code more FP than IO.

    Haskell is not a purely functional language because that "looks better". Sometimes it does, often it doesn't. The reason for staying functional is not its syntax but its semantics. It equips us with referential transparency, which makes it far easier to prove invariants, allows very high-level optimisations, makes it easy to write general-purpose code, etc.

    None of this has much to do with syntax. Monadic computations are still purely functional – regardless of whether you write them with do notation or with <$>, <*> and >>=, so we get Haskell's benefits either way.

    However, notwithstanding the aforementioned FP-benefits, it is often more intuitive to think about algorithms from an imperative-like point of view – even if you're accustomed to how this is implemented through monads. In these cases, do notation gives you this quick insight of "order of computation", "origin of data", "point of modification", yet it's trivial to manually desugar it in your head to the >>= version, to grasp what's going on functionally.

    Applicative style is certainly great in many ways, however it is inherently point-free. That is often a good thing, but especially in more complex problems it can be very helpful to give names to "temporary" variables. When using only "FP" Haskell syntax, this requires either lambdas or explicitly named functions. Both have good use cases, but the former introduces quite a bit of noise right in the middle of your code and the latter rather disrupts the "flow" since it requires a where or let placed somewhere else from where you use it. do, on the other hand, allows you to introduce a named variable right where you need it, without introducing any noise at all.
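    To make the trade-off concrete, here is a minimal sketch of the same computation written with lambdas, with named helper functions, and with do. The task (parse two numbers, then divide) and the names safeDiv, viaLambda, viaNamed and viaDo are made up for illustration.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Lambdas: the names x and y sit in the middle of the operator chain.
viaLambda :: String -> String -> Maybe Int
viaLambda a b =
  readMaybe a >>= \x ->
  readMaybe b >>= \y ->
  safeDiv x y

-- Named functions: no lambdas, but the steps move into a where clause.
viaNamed :: String -> String -> Maybe Int
viaNamed a b = readMaybe a >>= withX
  where
    withX x   = readMaybe b >>= withY x
    withY x y = safeDiv x y

-- do notation: each name is introduced right where it is needed.
viaDo :: String -> String -> Maybe Int
viaDo a b = do
  x <- readMaybe a
  y <- readMaybe b
  safeDiv x y

main :: IO ()
main = mapM_ print
  [ viaLambda "12" "4", viaNamed "12" "4", viaDo "12" "4"
  , viaDo "12" "0", viaDo "oops" "4" ]
```

    All three definitions are the same pure function; only the place where the intermediate names live differs.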

  • 2020-11-29 01:56

    do notation in Haskell desugars in a pretty simple way.

    do
      x <- foo
      e1 
      e2
      ...
    

    turns into

    foo >>= \x ->
    do
      e1
      e2
      ...
    

    and

    do
      x
      e1
      e2
      ...
    

    into

    x >>
    do
      e1
      e2
      ...
    

    This means you can write any monadic computation with >>= and return. The only reason we don't is that the syntax is more painful. Monads are useful for imitating imperative code, and do notation makes the code look imperative as well.
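    As a runnable illustration of the two rewrite rules, here is a small sketch in the Either monad. The validator checkPositive and the names sugared and desugared are assumptions made for the example, not anything from the question.

```haskell
-- Hypothetical validator, used only to have something to sequence.
checkPositive :: Int -> Either String Int
checkPositive n
  | n > 0     = Right n
  | otherwise = Left ("not positive: " ++ show n)

-- Written with do notation: one bind and one plain statement.
sugared :: Int -> Int -> Either String Int
sugared a b = do
  x <- checkPositive a
  checkPositive b
  return (x * 2)

-- The same block after applying both rules by hand:
-- "x <- e" becomes "e >>= \x ->", a plain statement becomes "e >>".
desugared :: Int -> Int -> Either String Int
desugared a b =
  checkPositive a >>= \x ->
  checkPositive b >>
  return (x * 2)

main :: IO ()
main = do
  print (sugared 3 4, desugared 3 4)
  print (sugared 3 0, desugared 3 0)
```

    The lambda body extends as far to the right as possible, which is why the hand-written version needs no extra parentheses.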

    The C-ish syntax makes it far easier for beginners to understand it. You're right it doesn't look as functional, but requiring someone to grok monads properly before they can use IO is a pretty big deterrent.

    The reason to use >>= and return, on the other hand, is that they are much more compact for one- or two-liners. They do tend to get unreadable for anything bigger. So to answer your question directly: no, please don't avoid do notation where it is appropriate.

    Lastly, the two operators you saw, <$> and <*>, are not monadic: <$> is fmap (from Functor) and <*> is function application inside an Applicative. They can't express a lot of what do notation does, in particular any step whose action depends on the result of an earlier one. They're more compact, to be sure, but they don't let you easily name intermediate values. Personally, I use them about 80% of the time, mostly because I tend to write very small composable functions anyway, which applicatives are great for.
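    A small sketch of where the line falls, using Maybe and the Prelude's lookup. The table and the names both, chained and chainedDo are hypothetical.

```haskell
-- Hypothetical lookup table: each key points at the next key.
table :: [(String, String)]
table = [("a", "b"), ("b", "c")]

-- Independent effects: neither lookup needs the other's result,
-- so <$> and <*> are enough.
both :: Maybe (String, String)
both = (,) <$> lookup "a" table <*> lookup "b" table

-- Dependent effects: the second lookup uses the first result as its key.
-- This needs >>= (or do); <$> and <*> alone cannot feed k into the next action.
chained :: Maybe String
chained = lookup "a" table >>= \k -> lookup k table

-- The same dependent chain in do notation.
chainedDo :: Maybe String
chainedDo = do
  k <- lookup "a" table
  lookup k table

main :: IO ()
main = do
  print both
  print chained
  print chainedDo
```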

  • 2020-11-29 01:56

    Should we avoid do-notation in any case?

    I'd say definitely no. For me, the most important criterion in such cases is to make the code as readable and understandable as possible. The do-notation was introduced to make monadic code more understandable, and this is what matters. Sure, in many cases, using Applicative point-free notation is very nice. For example, instead of

    do
        f <- [(+1), (*7)]
        i <- [1..5]
        return $ f i
    

    we'd write just [(+1), (*7)] <*> [1..5].
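    To check that the two spellings really agree, here is a small sketch; the names viaDo and viaApplicative are made up for illustration.

```haskell
-- The do block from above, in the list monad.
viaDo :: [Int]
viaDo = do
  f <- [(+ 1), (* 7)]
  i <- [1 .. 5]
  return (f i)

-- The applicative one-liner: every function applied to every argument,
-- in the same order as the nested binds.
viaApplicative :: [Int]
viaApplicative = [(+ 1), (* 7)] <*> [1 .. 5]

main :: IO ()
main = do
  print viaDo
  print (viaDo == viaApplicative)
```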

    But there are many examples where not using the do-notation will make code very unreadable. Consider this example:

    nameDo :: IO ()
    nameDo = do putStr "What is your first name? "
                first <- getLine
                putStr "And your last name? "
                last <- getLine
                let full = first++" "++last
                putStrLn ("Pleased to meet you, "++full++"!")
    

    Here it's quite clear what's happening and how the IO actions are sequenced. A do-free version looks like

    name :: IO ()
    name = putStr "What is your first name? " >>
           getLine >>= f
           where
           f first = putStr "And your last name? " >>
                     getLine >>= g
                     where
                     g last = putStrLn ("Pleased to meet you, "++full++"!")
                              where
                              full = first++" "++last
    

    or like

    nameLambda :: IO ()
    nameLambda = putStr "What is your first name? " >>
                 getLine >>=
                 \first -> putStr "And your last name? " >>
                 getLine >>=
                 \last -> let full = first++" "++last
                              in  putStrLn ("Pleased to meet you, "++full++"!")
    

    which are both much less readable. Here the do-notation is clearly preferable.

    If you want to avoid using do, try structuring your code into many small functions. This is a good habit anyway, and you can reduce your do block to only 2-3 lines, which can then be replaced nicely by >>=, <$>, <*> etc. For example, the above could be rewritten as

    import Data.List (intersperse)

    name :: IO ()
    name = getName >>= welcome
      where
        ask :: String -> IO String
        ask s = putStr s >> getLine
    
        join :: [String] -> String
        join  = concat . intersperse " "
    
        getName :: IO String
        getName  = join <$> traverse ask ["What is your first name? ",
                                          "And your last name? "]
    
        welcome :: String -> IO ()
        welcome full = putStrLn ("Pleased to meet you, "++full++"!")
    

    It's a bit longer, and maybe a bit less understandable to Haskell beginners (due to intersperse, concat and traverse), but in many cases those new, small functions can be reused in other places of your code, which will make it more structured and composable.


    I'd say the situation is very similar to the question of whether to use point-free notation. In many cases (like in the top-most example [(+1), (*7)] <*> [1..5]) the point-free notation is great, but if you try to convert a complicated expression, you will get results like

    f = ((ite . (<= 1)) `flip` 1) <*>
         (((+) . (f . (subtract 1))) <*> (f . (subtract 2)))
      where
        ite e x y = if e then x else y
    

    It'd take me quite a long time to understand it without running the code. [Spoiler below:]

    f x = if (x <= 1) then 1 else f (x-1) + f (x-2)
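    For anyone who wants to verify the spoiler, here is a sketch that puts both definitions side by side. It relies on the Applicative instance for functions, where (h <*> k) x = h x (k x); the names fPointFree and fPointful are introduced here only so that both versions can live in one file.

```haskell
-- The point-free version from above, renamed.
-- For functions, (h <*> k) x = h x (k x), so this reads as
--   \x -> ite (x <= 1) 1 (fPointFree (x - 1) + fPointFree (x - 2))
fPointFree :: Integer -> Integer
fPointFree = ((ite . (<= 1)) `flip` 1) <*>
     (((+) . (fPointFree . subtract 1)) <*> (fPointFree . subtract 2))
  where
    ite e x y = if e then x else y

-- The readable version from the spoiler, renamed.
fPointful :: Integer -> Integer
fPointful x = if x <= 1 then 1 else fPointful (x - 1) + fPointful (x - 2)

main :: IO ()
main = do
  print (map fPointFree [0 .. 10])
  print (map fPointFree [0 .. 10] == map fPointful [0 .. 10])
```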


    Also, why do most tutorials teach IO with do?

    Because IO is exactly designed to mimic imperative computations with side-effects, and so sequencing them using do is very natural.

  • 2020-11-29 01:57

    I often find myself first writing a monadic action in do notation, then refactoring it down to a simple monadic (or functorial) expression. This happens mostly when the do block turns out to be shorter than I expected. Sometimes I refactor in the opposite direction; it depends on the code in question.

    My general rule is: if the do block is only a couple of lines long it's usually neater as a short expression. A long do-block is probably more readable as it is, unless you can find a way to break it up into smaller, more composable functions.


    As a worked example, here's how we might transform your verbose code snippet into your simple one.

    main = do
        strFile <- readFile "testFile.txt"
        let analysisResult = stringAnalyser strFile
        return analysisResult
    

    Firstly, notice that the last two lines have the form let x = y in return x. This can of course be transformed into simply return y.

    main = do
        strFile <- readFile "testFile.txt"
        return (stringAnalyser strFile)
    

    This is a very short do block: we bind readFile "testfile.txt" to a name, and then do something to that name in the very next line. Let's try 'de-sugaring' it like the compiler will:

    main = readFile "testFile.txt" >>= \strFile -> return (stringAnalyser strFile)
    

    Look at the lambda-form on the right hand side of >>=. It's begging to be rewritten in point-free style: \x -> f $ g x becomes \x -> (f . g) x which becomes f . g.

    main = readFile "testFile.txt" >>= (return . stringAnalyser)
    

    This is already a lot neater than the original do block, but we can go further.

    Here's the only step that requires a little thought (though once you're familiar with monads and functors it should be obvious). The above function is suggestive of one of the monad laws: (m >>= return) == m. The only difference is that the function on the right hand side of >>= isn't just return - we do something to the object inside the monad before wrapping it back up in a return. But the pattern of 'doing something to a wrapped value without affecting its wrapper' is exactly what Functor is for. All monads are functors, so we can refactor this so that we don't even need the Monad instance:

    main = fmap stringAnalyser (readFile "testFile.txt")
    

    Finally, note that <$> is just another way of writing fmap.

    main = stringAnalyser <$> readFile "testFile.txt"
    

    I think this version is a lot clearer than the original code. It can be read like a sentence: "main is stringAnalyser applied to the result of reading "testFile.txt"". The original version bogs you down in the procedural details of its operation.
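    None of the steps above is specific to IO, so they can be checked without touching the file system. The sketch below makes the source action a parameter and runs every stage in the Maybe monad. The definition of stringAnalyser (a word count) and the names step1 to step4 are assumptions for the example, since the real function comes from the question.

```haskell
-- Hypothetical stand-in for the question's analysis function.
stringAnalyser :: String -> Int
stringAnalyser = length . words

-- Each stage of the refactoring, with the source action as a parameter.
step1, step2, step3, step4 :: Monad m => m String -> m Int

-- The short do block.
step1 src = do
  strFile <- src
  return (stringAnalyser strFile)

-- Desugared by hand.
step2 src = src >>= \strFile -> return (stringAnalyser strFile)

-- Lambda rewritten in point-free style.
step3 src = src >>= (return . stringAnalyser)

-- Bind followed by return is just fmap.
step4 src = stringAnalyser <$> src

main :: IO ()
main = mapM_ (\step -> print (step (Just "do notation desugars simply")))
             [step1, step2, step3, step4]
```

    Passing readFile "testFile.txt" as the argument to step4 gives back the final one-liner.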


    Addendum: my comment that 'all monads are functors' can in fact be justified by the observation that m >>= (return . f) (aka the standard library's liftM) is the same as fmap f m. If you have an instance of Monad, you get an instance of Functor 'for free' - just define fmap = liftM! If someone's defined a Monad instance for their type but not instances for Functor and Applicative, I'd call that a bug. Clients expect to be able to use Functor methods on instances of Monad without too much hassle.
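    Since GHC 7.10, Functor and Applicative are superclasses of Monad, so the compiler now insists on those instances. Getting them "for free" looks roughly like the sketch below. The Counter type and tick are hypothetical; liftM and ap are the real functions from Control.Monad.

```haskell
import Control.Monad (ap, liftM)

-- Hypothetical example type: a value paired with a step count.
newtype Counter a = Counter (Int, a)
  deriving (Eq, Show)

instance Functor Counter where
  fmap = liftM            -- Functor obtained from the Monad instance

instance Applicative Counter where
  pure x = Counter (0, x)
  (<*>)  = ap             -- likewise defined in terms of >>=

instance Monad Counter where
  Counter (n, x) >>= f =
    let Counter (m, y) = f x
    in  Counter (n + m, y)   -- add up the step counts

-- One counted step that carries no interesting value.
tick :: Counter ()
tick = Counter (1, ())

main :: IO ()
main = do
  let m = tick >> tick >> pure (20 :: Int)
  print (fmap (+ 1) m)
  -- m >>= (return . f) agrees with fmap f m
  print ((m >>= (return . (+ 1))) == fmap (+ 1) m)
```

    Only pure and >>= are written by hand here; fmap and <*> are defined from them.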

  • 2020-11-29 01:57

    Applicative style should be encouraged because it composes (and it is prettier). Monadic style is necessary in certain cases. See https://stackoverflow.com/a/7042674/1019205 for an in-depth explanation.

  • 2020-11-29 02:03

    The do notation is expanded to an expression using the functions (>>=) and (>>), and the let expression. So it is not part of the core of the language.

    (>>=) and (>>) are used to combine actions sequentially and they are essential when the result of an action changes the structure of the following actions.

    In the example given in the question this is not apparent as there is only one IO action, therefore no sequencing is needed.

    Consider for example the expression

    do x <- getLine
       print (length x)
       y <- getLine
       return (x ++ y)
    

    which is translated to

    getLine >>= \x ->
    print (length x) >>
    getLine >>= \y ->
    return (x ++ y)
    

    In this example the do notation (or the (>>=) and (>>) functions) is needed for sequencing the IO actions.

    So sooner or later the programmer will need it.
