Having briefly looked at Haskell recently, what would be a brief, succinct, practical explanation as to what a monad essentially is?
I have found most expla
(See also the answers at What is a monad?)
A good motivation to Monads is sigfpe (Dan Piponi)'s You Could Have Invented Monads! (And Maybe You Already Have). There are a LOT of other monad tutorials, many of which misguidedly try to explain monads in "simple terms" using various analogies: this is the monad tutorial fallacy; avoid them.
As DR MacIver says in Tell us why your language sucks:
So, things I hate about Haskell:
Let’s start with the obvious. Monad tutorials. No, not monads. Specifically the tutorials. They’re endless, overblown and dear god are they tedious. Further, I’ve never seen any convincing evidence that they actually help. Read the class definition, write some code, get over the scary name.
You say you understand the Maybe monad? Good, you're on your way. Just start using other monads and sooner or later you'll understand what monads are in general.
[If you are mathematically oriented, you might want to ignore the dozens of tutorials and learn the definition, or follow lectures in category theory :) The main part of the definition is that a Monad M involves a "type constructor" that defines for each existing type "T" a new type "M T", and some ways for going back and forth between "regular" types and "M" types.]
Also, surprisingly enough, one of the best introductions to monads is actually one of the early academic papers introducing monads, Philip Wadler's Monads for functional programming. It actually has practical, non-trivial motivating examples, unlike many of the artificial tutorials out there.
You should first understand what a functor is. Before that, understand higher-order functions.
A higher-order function is simply a function that takes a function as an argument or returns one.
A functor is any type construction T for which there exists a higher-order function, call it map, that transforms a function of type a -> b (given any two types a and b) into a function T a -> T b. This map function must also obey the laws of identity and composition such that the following expressions return true for all p and q (Haskell notation):
map id = id
map (p . q) = map p . map q
For example, a type constructor called List is a functor if it comes equipped with a function of type (a -> b) -> List a -> List b which obeys the laws above. The only practical implementation is obvious. The resulting List a -> List b function iterates over the given list, calling the (a -> b) function for each element, and returns the list of the results.
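The two functor laws can be spot-checked for lists with the Prelude's map. This is only a sketch: lawIdentity and lawComposition are names made up for illustration, and checking sample inputs is evidence for the laws, not a proof.

```haskell
-- Functor laws spot-checked for lists, using the Prelude's map.
-- lawIdentity and lawComposition are hypothetical helper names.

-- Identity law: mapping id changes nothing.
lawIdentity :: [Int] -> Bool
lawIdentity xs = map id xs == id xs

-- Composition law: mapping a composition equals composing two maps.
lawComposition :: [Int] -> Bool
lawComposition xs = map (p . q) xs == (map p . map q) xs
  where
    p = (+ 1)
    q = (* 2)
```

Loading this in GHCi and evaluating lawComposition [1, 2, 3] is one way to try it.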
A monad is essentially just a functor T with two extra methods, join, of type T (T a) -> T a, and unit (sometimes called return, fork, or pure) of type a -> T a. For lists in Haskell:
join :: [[a]] -> [a]
pure :: a -> [a]
Why is that useful? Because you could, for example, map over a list with a function that returns a list. join takes the resulting list of lists and concatenates them. List is a monad because this is possible.
You can write a function that does map, then join. This function is called bind, or flatMap, or (>>=), or (=<<). This is normally how a monad instance is given in Haskell.
A monad has to satisfy certain laws, namely that join must be associative. This means that if you have a value x of type [[[a]]] then join (join x) should equal join (map join x). And pure must be an identity for join such that join (pure x) == x.
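To make "map, then join" concrete for lists, here is a minimal sketch using join from Control.Monad. The names bindList, expand, joinAssociative and joinIdentity are assumptions made for this example only.

```haskell
import Control.Monad (join)

-- bind written as "map, then join"; bindList is a hypothetical name.
bindList :: (a -> [b]) -> [a] -> [b]
bindList f = join . map f

-- A function that returns a list: each n becomes [n, n * 10].
expand :: Int -> [Int]
expand n = [n, n * 10]

-- Associativity of join, checked on one three-level nested list.
joinAssociative :: [[[Int]]] -> Bool
joinAssociative x = join (join x) == join (map join x)

-- pure as an identity for join.
joinIdentity :: [Int] -> Bool
joinIdentity x = join (pure x) == x
```

Here bindList expand [1, 2, 3] gives the same list as [1, 2, 3] >>= expand.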
This answer begins with a motivating example, works through the example, derives an example of a monad, and formally defines "monad".
Consider these three functions in pseudocode:
f(<x, messages>) := <x, messages "called f. ">
g(<x, messages>) := <x, messages "called g. ">
wrap(x) := <x, "">
f takes an ordered pair of the form <x, messages> and returns an ordered pair. It leaves the first item untouched and appends "called f. " to the second item. Same with g.
You can compose these functions and get your original value, along with a string that shows which order the functions were called in:
f(g(wrap(x)))
= f(g(<x, "">))
= f(<x, "called g. ">)
= <x, "called g. called f. ">
You dislike the fact that f and g are responsible for appending their own log messages to the previous logging information. (Just imagine for the sake of argument that instead of appending strings, f and g must perform complicated logic on the second item of the pair. It would be a pain to repeat that complicated logic in two -- or more -- different functions.)
You prefer to write simpler functions:
f(x) := <x, "called f. ">
g(x) := <x, "called g. ">
wrap(x) := <x, "">
But look at what happens when you compose them:
f(g(wrap(x)))
= f(g(<x, "">))
= f(<<x, "">, "called g. ">)
= <<<x, "">, "called g. ">, "called f. ">
The problem is that passing a pair into a function does not give you what you want. But what if you could feed a pair into a function:
feed(f, feed(g, wrap(x)))
= feed(f, feed(g, <x, "">))
= feed(f, <x, "called g. ">)
= <x, "called g. called f. ">
Read feed(f, m) as "feed m into f". To feed a pair <x, messages> into a function f is to pass x into f, get <y, message> out of f, and return <y, messages message>.
feed(f, <x, messages>) := let <y, message> = f(x)
in <y, messages message>
Notice what happens when you do three things with your functions:
First: if you wrap a value and then feed the resulting pair into a function:
feed(f, wrap(x))
= feed(f, <x, "">)
= let <y, message> = f(x)
in <y, "" message>
= let <y, message> = <x, "called f. ">
in <y, "" message>
= <x, "" "called f. ">
= <x, "called f. ">
= f(x)
That is the same as passing the value into the function.
Second: if you feed a pair into wrap:
feed(wrap, <x, messages>)
= let <y, message> = wrap(x)
in <y, messages message>
= let <y, message> = <x, "">
in <y, messages message>
= <x, messages "">
= <x, messages>
That does not change the pair.
Third: if you define a function that takes x and feeds g(x) into f:
h(x) := feed(f, g(x))
and feed a pair into it:
feed(h, <x, messages>)
= let <y, message> = h(x)
in <y, messages message>
= let <y, message> = feed(f, g(x))
in <y, messages message>
= let <y, message> = feed(f, <x, "called g. ">)
in <y, messages message>
= let <y, message> = let <z, msg> = f(x)
in <z, "called g. " msg>
in <y, messages message>
= let <y, message> = let <z, msg> = <x, "called f. ">
in <z, "called g. " msg>
in <y, messages message>
= let <y, message> = <x, "called g. " "called f. ">
in <y, messages message>
= <x, messages "called g. " "called f. ">
= feed(f, <x, messages "called g. ">)
= feed(f, feed(g, <x, messages>))
That is the same as feeding the pair into g and feeding the resulting pair into f.
You have most of a monad. Now you just need to know about the data types in your program.
What type of value is <x, "called f. ">? Well, that depends on what type of value x is. If x is of type t, then your pair is a value of type "pair of t and string". Call that type M t.
M is a type constructor: M alone does not refer to a type, but M _ refers to a type once you fill in the blank with a type. An M int is a pair of an int and a string. An M string is a pair of a string and a string. Etc.
Congratulations, you have created a monad!
Formally, your monad is the tuple <M, feed, wrap>.
A monad is a tuple <M, feed, wrap> where:
M is a type constructor.
feed takes a (function that takes a t and returns an M u) and an M t and returns an M u.
wrap takes a v and returns an M v.
t, u, and v are any three types that may or may not be the same. A monad satisfies the three properties you proved for your specific monad:
Feeding a wrapped t into a function is the same as passing the unwrapped t into the function.
Formally: feed(f, wrap(x)) = f(x)
Feeding an M t into wrap does nothing to the M t.
Formally: feed(wrap, m) = m
Feeding an M t (call it m) into a function that passes the t into g, gets an M u (call it n) from g, and feeds n into f is the same as feeding m into g, getting n from g, and feeding n into f.
Formally: feed(h, m) = feed(f, feed(g, m)) where h(x) := feed(f, g(x))
Typically, feed is called bind (AKA >>= in Haskell, which takes its arguments in the opposite order) and wrap is called return.
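The pseudocode above translates almost line for line into Haskell. The following is a minimal sketch, assuming the log is a plain String joined with ++; the names M, wrap, feed, f and g are taken from the pseudocode, and the construction is essentially a writer monad specialised to strings.

```haskell
-- The pair <x, messages> as a Haskell type synonym.
type M t = (t, String)

-- wrap pairs a value with an empty log.
wrap :: t -> M t
wrap x = (x, "")

-- feed passes x into k and appends k's message to the existing log.
feed :: (t -> M u) -> M t -> M u
feed k (x, messages) =
  let (y, message) = k x
  in (y, messages ++ message)

-- The "simpler functions": each returns only its own message.
f :: t -> M t
f x = (x, "called f. ")

g :: t -> M t
g x = (x, "called g. ")
```

With these definitions, feed f (feed g (wrap 'x')) evaluates to ('x', "called g. called f. ").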
{-# LANGUAGE InstanceSigs #-}
newtype Id t = Id t
instance Monad Id where
   return :: t -> Id t
   return = Id
   (=<<) :: (a -> Id b) -> Id a -> Id b
   f =<< (Id x) = f x
The application operator $ of functions
forall a b. a -> b
is canonically defined
($) :: (a -> b) -> a -> b
f $ x = f x
infixr 0 $
in terms of Haskell-primitive function application f x (infixl 10).
Composition . is defined in terms of $ as
(.) :: (b -> c) -> (a -> b) -> (a -> c)
f . g = \ x -> f $ g x
infixr 9 .
and satisfies the equivalences forall f g h.
f . id = f :: c -> d Right identity
id . g = g :: b -> c Left identity
(f . g) . h = f . (g . h) :: a -> d Associativity
Composition (.) is associative, and id is its right and left identity.
In programming, a monad is a functor type constructor with an instance of the monad type class. There are several equivalent variants of definition and implementation, each carrying slightly different intuitions about the monad abstraction.
A functor is a type constructor f of kind * -> * with an instance of the functor type class.
{-# LANGUAGE KindSignatures #-}
class Functor (f :: * -> *) where
   map :: (a -> b) -> (f a -> f b)
In addition to following statically enforced type protocol, instances of the functor type class must obey the algebraic functor laws forall f g.
map id = id :: f t -> f t Identity
map f . map g = map (f . g) :: f a -> f c Composition / short cut fusion
Functor computations have the type
forall f t. Functor f => f t
A computation c r consists in results r within context c.
Unary monadic functions or Kleisli arrows have the type
forall m a b. Functor m => a -> m b
Kleisli arrows are functions that take one argument a and return a monadic computation m b.
Monads are canonically defined in terms of the Kleisli triple forall m. Functor m =>
(m, return, (=<<))
implemented as the type class
class Functor m => Monad m where
   return :: t -> m t
   (=<<) :: (a -> m b) -> m a -> m b
infixr 1 =<<
The Kleisli identity return is a Kleisli arrow that promotes a value t into monadic context m. Extension or Kleisli application =<< applies a Kleisli arrow a -> m b to results of a computation m a.
Kleisli composition <=< is defined in terms of extension as
(<=<) :: Monad m => (b -> m c) -> (a -> m b) -> (a -> m c)
f <=< g = \ x -> f =<< g x
infixr 1 <=<
<=< composes two Kleisli arrows, applying the left arrow to results of the right arrow’s application.
Instances of the monad type class must obey the monad laws, most elegantly stated in terms of Kleisli composition: forall f g h.
f <=< return = f :: c -> m d Right identity
return <=< g = g :: b -> m c Left identity
(f <=< g) <=< h = f <=< (g <=< h) :: a -> m d Associativity
<=< is associative, and return is its right and left identity.
The identity type
type Id t = t
is the identity function on types
Id :: * -> *
Interpreted as a functor,
return :: t -> Id t
= id :: t -> t
(=<<) :: (a -> Id b) -> Id a -> Id b
= ($) :: (a -> b) -> a -> b
(<=<) :: (b -> Id c) -> (a -> Id b) -> (a -> Id c)
= (.) :: (b -> c) -> (a -> b) -> (a -> c)
In canonical Haskell, the identity monad is defined
newtype Id t = Id t
instance Functor Id where
   map :: (a -> b) -> Id a -> Id b
   map f (Id x) = Id (f x)
instance Monad Id where
   return :: t -> Id t
   return = Id
   (=<<) :: (a -> Id b) -> Id a -> Id b
   f =<< (Id x) = f x
An option type
data Maybe t = Nothing | Just t
encodes computation Maybe t that does not necessarily yield a result t, computation that may “fail”. The option monad is defined
instance Functor Maybe where
   map :: (a -> b) -> (Maybe a -> Maybe b)
   map f (Just x) = Just (f x)
   map _ Nothing = Nothing
instance Monad Maybe where
   return :: t -> Maybe t
   return = Just
   (=<<) :: (a -> Maybe b) -> Maybe a -> Maybe b
   f =<< (Just x) = f x
   _ =<< Nothing = Nothing
The Kleisli arrow a -> Maybe b is applied to a result only if Maybe a yields a result.
newtype Nat = Nat Int
The natural numbers can be encoded as those integers greater than or equal to zero.
toNat :: Int -> Maybe Nat
toNat i | i >= 0 = Just (Nat i)
        | otherwise = Nothing
The natural numbers are not closed under subtraction.
(-?) :: Nat -> Nat -> Maybe Nat
(Nat n) -? (Nat m) = toNat (n - m)
infixl 6 -?
The option monad covers a basic form of exception handling.
(-? Nat 20) <=< toNat :: Int -> Maybe Nat
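A version of this example that loads in current GHC, using the standard Maybe monad and (<=<) from Control.Monad. This is a sketch: the name minus20 and the deriving clause are additions made for illustration, and the subtrahend is written as Nat 20 because Nat has no Num instance.

```haskell
import Control.Monad ((<=<))

newtype Nat = Nat Int
  deriving (Eq, Show)

-- Only non-negative integers are natural numbers.
toNat :: Int -> Maybe Nat
toNat i
  | i >= 0    = Just (Nat i)
  | otherwise = Nothing

-- Subtraction that fails instead of leaving the naturals.
(-?) :: Nat -> Nat -> Maybe Nat
(Nat n) -? (Nat m) = toNat (n - m)
infixl 6 -?

-- Convert an Int to a Nat, then subtract 20; either step may fail.
-- minus20 is a hypothetical name for the composed Kleisli arrow.
minus20 :: Int -> Maybe Nat
minus20 = (-? Nat 20) <=< toNat
```

minus20 25 yields Just (Nat 5), while minus20 10 and minus20 (-3) both yield Nothing, each failing at a different step.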
The list monad, over the list type
data [] t = [] | t : [t]
infixr 5 :
and its additive monoid operation “append”
(++) :: [t] -> [t] -> [t]
(x : xs) ++ ys = x : xs ++ ys
[] ++ ys = ys
infixr 5 ++
encodes nonlinear computation [t] yielding a natural amount 0, 1, ... of results t.
instance Functor [] where
   map :: (a -> b) -> ([a] -> [b])
   map f (x : xs) = f x : map f xs
   map _ [] = []
instance Monad [] where
   return :: t -> [t]
   return = (: [])
   (=<<) :: (a -> [b]) -> [a] -> [b]
   f =<< (x : xs) = f x ++ (f =<< xs)
   _ =<< [] = []
Extension =<< concatenates (++) all lists [b] resulting from applications f x of a Kleisli arrow a -> [b] to elements of [a] into a single result list [b].
Let the proper divisors of a positive integer n (here excluding 1) be
divisors :: Integral t => t -> [t]
divisors n = filter (`divides` n) [2 .. n - 1]
divides :: Integral t => t -> t -> Bool
d `divides` n = n `rem` d == 0
then
forall n. let { f = f <=< divisors } in f n = []
In defining the monad type class, instead of extension =<<, the Haskell standard uses its flip, the bind operator >>=.
class Applicative m => Monad m where
   (>>=) :: forall a b. m a -> (a -> m b) -> m b
   (>>) :: forall a b. m a -> m b -> m b
   m >> k = m >>= \ _ -> k
   {-# INLINE (>>) #-}
   return :: a -> m a
   return = pure
For simplicity's sake, this explanation uses the type class hierarchy
class Functor f
class Functor m => Monad m
In Haskell, the current standard hierarchy is
class Functor f
class Functor p => Applicative p
class Applicative m => Monad m
because not only is every monad a functor, but every applicative is a functor and every monad is an applicative, too.
Using the list monad, the imperative pseudocode
for a in (1, ..., 10)
  for b in (1, ..., 10)
    p <- a * b
    if even(p)
      yield p
roughly translates to the do block,
do a <- [1 .. 10]
   b <- [1 .. 10]
   let p = a * b
   guard (even p)
   return p
the equivalent monad comprehension,
[ p | a <- [1 .. 10], b <- [1 .. 10], let p = a * b, even p ]
and the expression
[1 .. 10] >>= (\ a ->
   [1 .. 10] >>= (\ b ->
      let p = a * b in
         guard (even p) >> -- [ () | even p ] >>
            return p
      )
   )
Do notation and monad comprehensions are syntactic sugar for nested bind expressions. The bind operator is used for local name binding of monadic results.
let x = v in e = (\ x -> e) $ v = v & (\ x -> e)
do { r <- m; c } = (\ r -> c) =<< m = m >>= (\ r -> c)
where
(&) :: a -> (a -> b) -> b
(&) = flip ($)
infixl 0 &
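The do block, the comprehension and the nested bind expression for the even products can be checked against each other with the standard list monad and guard from Control.Monad. A small sketch; the three evenProducts names are made up for this example.

```haskell
import Control.Monad (guard)

-- The do-block form.
evenProductsDo :: [Int]
evenProductsDo = do
  a <- [1 .. 10]
  b <- [1 .. 10]
  let p = a * b
  guard (even p)
  return p

-- The comprehension form.
evenProductsComp :: [Int]
evenProductsComp = [p | a <- [1 .. 10], b <- [1 .. 10], let p = a * b, even p]

-- The explicit nested-bind form that the other two desugar to.
evenProductsBind :: [Int]
evenProductsBind =
  [1 .. 10] >>= \a ->
    [1 .. 10] >>= \b ->
      let p = a * b
      in guard (even p) >> return p
```

All three lists are equal and contain 75 elements, since 25 of the 100 products have two odd factors.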
The guard function is defined
guard :: Additive m => Bool -> m ()
guard True = return ()
guard False = fail
where the unit type or “empty tuple”
data () = ()
Additive monads that support choice and failure can be abstracted over using a type class
class Monad m => Additive m where
   fail :: m t
   (<|>) :: m t -> m t -> m t
infixl 3 <|>
instance Additive Maybe where
   fail = Nothing
   Nothing <|> m = m
   m <|> _ = m
instance Additive [] where
   fail = []
   (<|>) = (++)
where fail and <|> form a monoid forall k l m.
k <|> fail = k
fail <|> l = l
(k <|> l) <|> m = k <|> (l <|> m)
and fail is the absorbing/annihilating zero element of additive monads
_ =<< fail = fail
If in
guard (even p) >> return p
even p is true, then the guard produces [()], and, by the definition of >>, the local constant function
\ _ -> return p
is applied to the result (). If false, then the guard produces the list monad’s fail ([]), which yields no result for a Kleisli arrow to be applied (>>) to, so this p is skipped over.
Infamously, monads are used to encode stateful computation.
A state processor is a function
forall st t. st -> (t, st)
that transitions a state st and yields a result t. The state st can be anything. Nothing, flag, count, array, handle, machine, world.
The type of state processors is usually called
type State st t = st -> (t, st)
The state processor monad is the functor State st, of kind * -> *. Kleisli arrows of the state processor monad are functions
forall st a b. a -> (State st) b
In canonical Haskell, the lazy version of the state processor monad is defined
newtype State st t = State { stateProc :: st -> (t, st) }
instance Functor (State st) where
   map :: (a -> b) -> ((State st) a -> (State st) b)
   map f (State p) = State $ \ s0 -> let (x, s1) = p s0
                                     in (f x, s1)
instance Monad (State st) where
   return :: t -> (State st) t
   return x = State $ \ s -> (x, s)
   (=<<) :: (a -> (State st) b) -> (State st) a -> (State st) b
   f =<< (State p) = State $ \ s0 -> let (x, s1) = p s0
                                     in stateProc (f x) s1
A state processor is run by supplying an initial state:
run :: State st t -> st -> (t, st)
run = stateProc
eval :: State st t -> st -> t
eval = fst . run
exec :: State st t -> st -> st
exec = snd . run
State access is provided by primitives get and put, methods of abstraction over stateful monads:
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
class Monad m => Stateful m st | m -> st where
   get :: m st
   put :: st -> m ()
m -> st declares a functional dependency of the state type st on the monad m; a State t, for example, determines the state type to be t uniquely.
instance Stateful (State st) st where
   get :: State st st
   get = State $ \ s -> (s, s)
   put :: st -> State st ()
   put s = State $ \ _ -> ((), s)
with the unit type used analogously to void in C.
modify :: Stateful m st => (st -> st) -> m ()
modify f = do
   s <- get
   put (f s)
gets :: Stateful m st => (st -> t) -> m t
gets f = do
   s <- get
   return (f s)
gets is often used with record field accessors.
The state monad equivalent of the variable threading
let s0 = 34
    s1 = (+ 1) s0
    n = (* 12) s1
    s2 = (+ 7) s1
in (show n, s2)
where s0 :: Int, is the equally referentially transparent, but infinitely more elegant and practical
(flip run) 34
   (do
      modify (+ 1)
      n <- gets (* 12)
      modify (+ 7)
      return (show n)
   )
modify (+ 1) is a computation of type State Int (), except for its effect equivalent to return ().
(flip run) 34
   (modify (+ 1) >>
      gets (* 12) >>= (\ n ->
         modify (+ 7) >>
            return (show n)
      )
   )
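For anyone who wants to run the counter example, here is a self-contained sketch of the state processor monad written against GHC's actual Functor/Applicative/Monad hierarchy rather than the simplified one used in this explanation. The Applicative instance, the direct (class-free) definitions of modify and gets, and the name counter are additions made for this example.

```haskell
newtype State st t = State { stateProc :: st -> (t, st) }

instance Functor (State st) where
  fmap f (State p) = State $ \s0 ->
    let (x, s1) = p s0 in (f x, s1)

-- GHC requires an Applicative instance for every Monad.
instance Applicative (State st) where
  pure x = State $ \s -> (x, s)
  State pf <*> State px = State $ \s0 ->
    let (f, s1) = pf s0
        (x, s2) = px s1
    in (f x, s2)

instance Monad (State st) where
  State p >>= f = State $ \s0 ->
    let (x, s1) = p s0 in stateProc (f x) s1

-- Apply a function to the state, yielding no interesting result.
modify :: (st -> st) -> State st ()
modify f = State $ \s -> ((), f s)

-- Read a projection of the state without changing it.
gets :: (st -> t) -> State st t
gets f = State $ \s -> (f s, s)

-- The threaded computation from the text.
counter :: State Int String
counter = do
  modify (+ 1)
  n <- gets (* 12)
  modify (+ 7)
  return (show n)
```

Running stateProc counter 34 gives ("420", 42): the state goes 34, 35, 42, and n is read while the state is 35.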
The monad law of associativity can be written in terms of >>=
forall m f g.
(m >>= f) >>= g = m >>= (\ x -> f x >>= g)
or
do { r1 <- do { r0 <- m; f r0 }; g r1 }
= do { x <- m; do { r1 <- f x; g r1 } }
= do { r0 <- m; r1 <- f r0; g r1 }
Like in expression-oriented programming (e.g. Rust), the last statement of a block represents its yield. The bind operator is sometimes called a “programmable semicolon”.
Iteration control structure primitives from structured imperative programming are emulated monadically
for :: Monad m => (a -> m b) -> [a] -> m ()
for f = foldr ((>>) . f) (return ())
while :: Monad m => m Bool -> m t -> m ()
while c m = do
   b <- c
   if b then m >> while c m
        else return ()
forever :: Monad m => m t -> m u
forever m = m >> forever m
data World
The I/O world state processor monad is a reconciliation of pure Haskell and the real world, of functional denotative and imperative operational semantics. A close analogue of the actual strict implementation:
type IO t = World -> (t, World)
Interaction is facilitated by impure primitives
getChar :: IO Char
putChar :: Char -> IO ()
readFile :: FilePath -> IO String
writeFile :: FilePath -> String -> IO ()
hSetBuffering :: Handle -> BufferMode -> IO ()
hTell :: Handle -> IO Integer
. . . . . .
The impurity of code that uses IO primitives is permanently protocolized by the type system. Because purity is awesome, what happens in IO, stays in IO.
unsafePerformIO :: IO t -> t
Or, at least, should.
The type signature of a Haskell program
main :: IO ()
main = putStrLn "Hello, World!"
expands to
World -> ((), World)
A function that transforms a world.
The category whose objects are Haskell types and whose morphisms are functions between Haskell types is, “fast and loose”, the category Hask.
A functor T is a mapping from a category C to a category D; for each object in C an object in D
Tobj : Obj(C) -> Obj(D)
f :: * -> *
and for each morphism in C a morphism in D
Tmor : HomC(X, Y) -> HomD(Tobj(X), Tobj(Y))
map :: (a -> b) -> (f a -> f b)
where X, Y are objects in C. HomC(X, Y) is the hom-class of all morphisms X -> Y in C. The functor must preserve morphism identity and composition, the “structure” of C, in D.
Tmor Tobj
T(id) = id : T(X) -> T(X) Identity
T(f) . T(g) = T(f . g) : T(X) -> T(Z) Composition
The Kleisli category of a category C is given by a Kleisli triple
<T, eta, _*>
of an endofunctor
T : C -> C
(f), an identity morphism eta (return), and an extension operator * (=<<).
Each Kleisli morphism in Hask
f : X -> T(Y)
f :: a -> m b
by the extension operator
(_)* : Hom(X, T(Y)) -> Hom(T(X), T(Y))
(=<<) :: (a -> m b) -> (m a -> m b)
is given a morphism in Hask’s Kleisli category
f* : T(X) -> T(Y)
(f =<<) :: m a -> m b
Composition in the Kleisli category .T is given in terms of extension
f .T g = f* . g : X -> T(Z)
f <=< g = (f =<<) . g :: a -> m c
and satisfies the category axioms
eta .T g = g : Y -> T(Z) Left identity
return <=< g = g :: b -> m c
f .T eta = f : Z -> T(U) Right identity
f <=< return = f :: c -> m d
(f .T g) .T h = f .T (g .T h) : X -> T(U) Associativity
(f <=< g) <=< h = f <=< (g <=< h) :: a -> m d
which, applying the equivalence transformations
eta .T g = g
eta* . g = g By definition of .T
eta* . g = id . g forall f. id . f = f
eta* = id forall f g h. f . h = g . h ==> f = g
(f .T g) .T h = f .T (g .T h)
(f* . g)* . h = f* . (g* . h) By definition of .T
(f* . g)* . h = f* . g* . h . is associative
(f* . g)* = f* . g* forall f g h. f . h = g . h ==> f = g
in terms of extension are canonically given
eta* = id : T(X) -> T(X) Left identity
(return =<<) = id :: m t -> m t
f* . eta = f : Z -> T(U) Right identity
(f =<<) . return = f :: c -> m d
(f* . g)* = f* . g* : T(X) -> T(Z) Associativity
(((f =<<) . g) =<<) = (f =<<) . (g =<<) :: m a -> m c
Monads can also be defined in terms not of Kleislian extension, but a natural transformation mu, in programming called join. A monad is defined in terms of mu as a triple over a category C, of an endofunctor
T : C -> C
f :: * -> *
and two natural transformations
eta : Id -> T
return :: t -> f t
mu : T . T -> T
join :: f (f t) -> f t
satisfying the equivalences
mu . T(mu) = mu . mu : T . T . T -> T Associativity
join . map join = join . join :: f (f (f t)) -> f t
mu . T(eta) = mu . eta = id : T -> T Identity
join . map return = join . return = id :: f t -> f t
The monad type class is then defined
class Functor m => Monad m where
   return :: t -> m t
   join :: m (m t) -> m t
The canonical mu implementation of the option monad:
instance Monad Maybe where
   return = Just
   join (Just m) = m
   join Nothing = Nothing
The concat function
concat :: [[a]] -> [a]
concat (x : xs) = x ++ concat xs
concat [] = []
is the join of the list monad.
instance Monad [] where
   return :: t -> [t]
   return = (: [])
   (=<<) :: (a -> [b]) -> ([a] -> [b])
   (f =<<) = concat . map f
Implementations of join can be translated from extension form using the equivalence
mu = id* : T . T -> T
join = (id =<<) :: m (m t) -> m t
The reverse translation from mu to extension form is given by
f* = mu . T(f) : T(X) -> T(Y)
(f =<<) = join . map f :: m a -> m b
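Both translations can be written down generically against GHC's standard Monad class. A minimal sketch; joinViaBind and bindViaJoin are hypothetical names, and fmap stands in for the map used in this explanation.

```haskell
import Control.Monad (join)

-- join recovered from extension: join = (id =<<)
joinViaBind :: Monad m => m (m t) -> m t
joinViaBind = (id =<<)

-- Extension recovered from join: (f =<<) = join . map f
bindViaJoin :: Monad m => (a -> m b) -> m a -> m b
bindViaJoin f = join . fmap f
```

For example, joinViaBind [[1], [2, 3]] flattens to [1, 2, 3], and bindViaJoin (\x -> [x, x]) [1, 2] gives [1, 1, 2, 2], matching the library's join and (=<<).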
Philip Wadler: Monads for functional programming
Simon L Peyton Jones, Philip Wadler: Imperative functional programming
Jonathan M. D. Hill, Keith Clarke: An introduction to category theory, category theory monads, and their relationship to functional programming
Kleisli category
Eugenio Moggi: Notions of computation and monads
What a monad is not
But why should a theory so abstract be of any use for programming?
The answer is simple: as computer scientists, we value abstraction! When we design the interface to a software component, we want it to reveal as little as possible about the implementation. We want to be able to replace the implementation with many alternatives, many other ‘instances’ of the same ‘concept’. When we design a generic interface to many program libraries, it is even more important that the interface we choose have a variety of implementations. It is the generality of the monad concept which we value so highly, it is because category theory is so abstract that its concepts are so useful for programming.
It is hardly surprising, then, that the generalisation of monads that we present below also has a close connection to category theory. But we stress that our purpose is very practical: it is not to ‘implement category theory’, it is to find a more general way to structure combinator libraries. It is simply our good fortune that mathematicians have already done much of the work for us!
from Generalising Monads to Arrows by John Hughes
But, You could have invented Monads!
sigfpe says:
But all of these introduce monads as something esoteric in need of explanation. But what I want to argue is that they aren't esoteric at all. In fact, faced with various problems in functional programming you would have been led, inexorably, to certain solutions, all of which are examples of monads. In fact, I hope to get you to invent them now if you haven't already. It's then a small step to notice that all of these solutions are in fact the same solution in disguise. And after reading this, you might be in a better position to understand other documents on monads because you'll recognise everything you see as something you've already invented.
Many of the problems that monads try to solve are related to the issue of side effects. So we'll start with them. (Note that monads let you do more than handle side-effects, in particular many types of container object can be viewed as monads. Some of the introductions to monads find it hard to reconcile these two different uses of monads and concentrate on just one or the other.)
In an imperative programming language such as C++, functions behave nothing like the functions of mathematics. For example, suppose we have a C++ function that takes a single floating point argument and returns a floating point result. Superficially it might seem a little like a mathematical function mapping reals to reals, but a C++ function can do more than just return a number that depends on its arguments. It can read and write the values of global variables as well as writing output to the screen and receiving input from the user. In a pure functional language, however, a function can only read what is supplied to it in its arguments and the only way it can have an effect on the world is through the values it returns.
My favorite Monad tutorial:
http://www.haskell.org/haskellwiki/All_About_Monads
(out of 170,000 hits on a Google search for "monad tutorial"!)
@Stu: The point of monads is to allow you to add (usually) sequential semantics to otherwise pure code; you can even compose monads (using Monad Transformers) and get more interesting and complicated combined semantics, like parsing with error handling, shared state, and logging, for example. All of this is possible in pure code, monads just allow you to abstract it away and reuse it in modular libraries (always good in programming), as well as providing convenient syntax to make it look imperative.
Haskell already has operator overloading[1]: it uses type classes much the way one might use interfaces in Java or C# but Haskell just happens to also allow non-alphanumeric tokens like + && and > as infix identifiers. It's only operator overloading in your way of looking at it if you mean "overloading the semicolon" [2]. It sounds like black magic and asking for trouble to "overload the semicolon" (picture enterprising Perl hackers getting wind of this idea) but the point is that without monads there is no semicolon, since purely functional code does not require or allow explicit sequencing.
This all sounds much more complicated than it needs to. sigfpe's article is pretty cool but uses Haskell to explain it, which sort of fails to break the chicken and egg problem of understanding Haskell to grok Monads and understanding Monads to grok Haskell.
[1] This is a separate issue from monads but monads use Haskell's operator overloading feature.
[2] This is also an oversimplification since the operator for chaining monadic actions is >>= (pronounced "bind") but there is syntactic sugar ("do") that lets you use braces and semicolons and/or indentation and newlines.