Right now, I have some code that essentially works like this:
data Expression
= Literal Bool
| Variable String
| Not Expression
| Or Expressio
You could write a generic simplifier for all binary operations:
simplifyBinWith :: (Bool -> Bool -> Bool) -- the boolean operation
-> (Expression -> Expression -> Expression) -- the constructor
-> Expression -> Expression -- the two operands
-> Expression) -- the simplified result
simplifyBinWith op cons a b = case (simplify a, simplify b) of
(Literal x, Literal y) -> Literal (op x y)
(Literal x, b') -> tryAll (x `op`) b'
(a', Literal y) -> tryAll (`op` y) a'
(a', b') -> cons a' b'
where
tryAll f term = case (f True, f False) of -- what would f do if term was true of false
(True, True) -> Literal True
(True, False) -> term
(False, True) -> Not term
(False, False) -> Literal False
That way, your simplify
function would become
simplify :: Expression -> Expression
simplify (Not e) = case simplify e of
(Literal b) -> Literal (not b)
e' -> Not e'
simplify (And a b) = simplifyBinWith (&&) And a b
simplify (Or a b) = simplifyBinWith (||) Or a b
simplify t = t
and could be easily extended to more binary operations. It would also work well with the Binary Op Expression Expression
idea, you'd pass Op
instead of an Expression
constructor to simplifyBinWith
and the pattern in simplify
could be generalised:
simplify :: Expression -> Expression
simplify (Not e) = case simplify e of
(Literal b) -> Literal (not b)
e' -> Not e'
simplify (Binary op a b) = simplifyBinWith (case op of
And -> (&&)
Or -> (||)
Xor -> (/=)
Implies -> (<=)
Equals -> (==)
…
) op a b
simplify t = t
where
simplifyBinWith f op a b = case (simplify a, simplify b) of
(Literal x, Literal y) -> Literal (f x y)
…
(a', b') -> Binary op a' b'
Basically the problem is that you have to write out simplify
of the subexpressions in each clause, over and over again. It would be better to first get all the subexpressions done before even considering laws involving the top-level operator. One simple way is to add an auxiliary version of simplify
, that doesn't recurse down:
simplify :: Expression -> Expression
simplify (Literal b) = Literal b
simplify (Variable s) = Variable s
simplify (Not e) = simplify' . Not $ simplify e
simplify (And a b) = simplify' $ And (simplify a) (simplify b)
simplify (Or a b) = simplify' $ Or (simplify a) (simplify b)
simplify' :: Expression -> Expression
simplify' (Not (Literal b)) = Literal $ not b
simplify' (And (Literal False) _) = Literal False
...
With the only small amount of operations you have in booleans, this is probably the most sensible approach. However with more operations, the duplication in simplify
might still be worth to avoid. To that end, you can conflate the unary and binary operations to a common constructor:
data Expression
= Literal Bool
| Variable String
| BoolPrefix BoolPrefix Expression
| BoolInfix BoolInfix Expression Expression
deriving Eq
data BoolPrefix = Negation
data BoolInfix = AndOp | OrOp
and then you have just
simplify (Literal b) = Literal b
simplify (Variable s) = Variable s
simplify (BoolPrefix bpf e) = simplify' . BoolPrefix bpf $ simplify e
simplify (BoolInfix bifx a b) = simplify' $ BoolInfix bifx (simplify a) (simplify b)
Obviously this makes simplify'
more awkward though, so perhaps not such a good idea. You can however get around this syntactical overhead by defining specialised pattern synonyms:
{-# LANGUAGE PatternSynonyms #-}
pattern Not :: Expression -> Expression
pattern Not x = BoolPrefix Negation x
infixr 3 :∧
pattern (:∧) :: Expression -> Expression -> Expression
pattern a:∧b = BoolInfix AndOp a b
infixr 2 :∨
pattern (:∨) :: Expression -> Expression -> Expression
pattern a:∨b = BoolInfix OrOp a b
For that matter, perhaps also
pattern F, T :: Expression
pattern F = Literal False
pattern T = Literal True
With that, you can then write
simplify' :: Expression -> Expression
simplify' (Not (Literal b)) = Literal $ not b
simplify' (F :∧ _) = F
simplify' (_ :∧ F) = F
simplify' (T :∨ _) = T
simplify' (a :∧ Not b) | a==b = T
...
I should add a caveat though: when I tried something similar to those pattern synonyms, not for booleans but affine mappings, it made the compiler extremely slow. (Also, GHC-7.10 didn't yet support polymorphic pattern synonyms yet; this has changed quite a bit as of now.)
Note also that all this will not generally yield the simplest possible form –
for that, you'd need to find the fixed point of simplify
.
Carrying on with your Binary Op Expression Expression
idea, we could have the datatype:
data Expression
= Literal Bool
| Variable String
| Not Expression
| Binary Op Expression Expression
deriving Eq
data Op = Or | And deriving Eq
And an auxiliary function
{-# LANGUAGE ViewPatterns #-}
simplifyBinary :: Op -> Expression -> Expression -> Expression
simplifyBinary binop (simplify -> leftexp) (simplify -> rightexp) =
case oneway binop leftexp rightexp ++ oneway binop rightexp leftexp of
simplified : _ -> simplified
[] -> Binary binop leftexp rightexp
where
oneway :: Op -> Expression -> Expression -> [Expression]
oneway And (Literal False) _ = [Literal False]
oneway Or (Literal True) _ = [Literal True]
-- more cases here
oneway _ _ _ = []
The idea is that you would put the simplification cases in oneway
and then simplifyBinary
would take care of reversing the arguments, to avoid having to write the symmetric cases.
I think Einstein said, "Simplify as much as possible, but no more." You have yourself a complicated datatype, and a correspondingly complicated concept, so I assume any technique can only be so much cleaner for the problem at hand.
That said, the first option is to use instead a case structure.
simplify x = case x of
Literal _ -> x
Variable _ -> x
Not e -> simplifyNot $ simplify e
...
where
sharedFunc1 = ...
sharedFunc2 = ...
This has the added benefit of including shared functions which will be usable by all cases but not at the top level namespace. I also like how the cases are freed of their parenthesis. (Also note that in the first two cases i just return the original term, not creating a new one). I often use this sort of structure to just break out other simplify functions, as in the case of Not
.
This problem in particular may lend itself to basing Expression
on an underlying functor, so that you may fmap
a simplification of the subexpressions and then perform the specific combinations of the given case. It will look something like the following:
simplify :: Expression' -> Expression'
simplify = Exp . reduce . fmap simplify . unExp
The steps in this are unwrapping Expression'
into the underlying functor representation, mapping the simplification on the underlying term, and then reducing that simplification and wrapping back up into the new Expression'
{-# Language DeriveFunctor #-}
newtype Expression' = Exp { unExp :: ExpressionF Expression' }
data ExpressionF e
= Literal Bool
| Variable String
| Not e
| Or e e
| And e e
deriving (Eq,Functor)
Now, I have pushed the complexity off into the reduce
function, which is only a little less complex because it doesn't have to worry about first reducing the subterm. But it will now contain solely the business logic of combining one term with another.
This may or may not be a good technique for you, but it may make some enhancements easier. For instance, if it is possible to form invalid expressions in your language, you could simplify that with Maybe
valued failures.
simplifyMb :: Expression' -> Maybe Expression'
simplifyMb = fmap Exp . reduceMb <=< traverse simplifyMb . unExp
Here, traverse
will apply simplfyMb
to the subterms of the ExpressionF
, resulting in an expression of Maybe
subterms, ExpressionF (Maybe Expression')
, and then if any subterms are Nothing
, it will return Nothing
, if all are Just x
, it will return Just (e::ExpressionF Expression')
. Traverse isn't actually separated into distinct phases like that, but it's easier to explain as if it were. Also note, you will need language pragmas for DeriveTraversable and DeriveFoldable, as well as deriving statements on the ExpressionF
data type.
The downside? Well, for one the dirt of your code will then lie in a bunch of Exp
wrappers everywhere. Consider the application of simplfyMb
of the simple term below:
simplifyMb (Exp $ Not (Exp $ Literal True))
It's also a lot to get a head around, but if you understand traverse
and fmap
pattern above, you can reuse it in lots of places, so that's good. I also believe defining simplify in that way makes it more robust to whatever the specific ExpressionF
constructions may turn into. It doesn't mention them so the deep simplification will be unaffected by refactors. The reduce function on the other hand will be.
One thing you can do is simplify as you construct, rather than first constructing then repeatedly destructing. So:
module Simple (Expression, true, false, var, not, or, and) where
import Prelude hiding (not, or, and)
data Expression
= Literal Bool
| Variable String
| Not Expression
| Or Expression Expression
| And Expression Expression
deriving (Eq, Ord, Read, Show)
true = Literal True
false = Literal False
var = Variable
not (Literal True) = false
not (Literal False) = true
not x = Not x
or (Literal True) _ = true
or _ (Literal True) = true
or x y = Or x y
and (Literal False) _ = false
and _ (Literal False) = false
and x y = And x y
We can try it out in ghci:
> and (var "x") (and (var "y") false)
Literal False
Note that the constructors are not exported: this ensures that folks constructing one of these can't avoid the simplification process. Actually, this may be a drawback; occasionally it is nice to see the "full" form. A standard approach to dealing with this is to make the exported smart constructors part of a type-class; you can either use them to build a "full" form or a "simplified" way. To avoid having to define the type twice, we could either use a newtype or a phantom parameter; I'll elect for the latter here to reduce the noise in pattern-matching.
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
module Simple (Format(..), true, false, var, not, or, and) where
import Prelude hiding (not, or, and)
data Format = Explicit | Simplified
data Expression (a :: Format)
= Literal Bool
| Variable String
| Not (Expression a)
| Or (Expression a) (Expression a)
| And (Expression a) (Expression a)
deriving (Eq, Ord, Read, Show)
class Expr e where
true, false :: e
var :: String -> e
not :: e -> e
or, and :: e -> e -> e
instance Expr (Expression Explicit) where
true = Literal True
false = Literal False
var = Variable
not = Not
or = Or
and = And
instance Expr (Expression Simplified) where
true = Literal True
false = Literal False
var = Variable
not (Literal True) = false
not (Literal False) = true
not x = Not x
or (Literal True) _ = true
or _ (Literal True) = true
or x y = Or x y
and (Literal False) _ = false
and _ (Literal False) = false
and x y = And x y
Now in ghci we can "run" the same term in two different ways:
> :set -XDataKinds
> and (var "x") (and (var "y") false) :: Expression Explicit
And (Variable "x") (And (Variable "y") (Literal False))
> and (var "x") (and (var "y") false) :: Expression Simplified
Literal False
You might want to add more rules later; for example, maybe you want:
and (Variable x) (Not (Variable y)) | x == y = false
and (Not (Variable x)) (Variable y) | x == y = false
Having to repeat both "orders" of patterns is a bit annoying. We should abstract over that! The data declaration and classes will be the same, but we'll add the helper function eitherOrder
and use it in the definitions of and
and or
. Here's a more complete set of simplifications using this idea (and our final version of the module):
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE KindSignatures #-}
module Simple (Format(..), true, false, var, not, or, and) where
import Data.Maybe
import Data.Monoid
import Prelude hiding (not, or, and)
import Control.Applicative ((<|>))
data Format = Explicit | Simplified
data Expression (a :: Format)
= Literal Bool
| Variable String
| Not (Expression a)
| Or (Expression a) (Expression a)
| And (Expression a) (Expression a)
deriving (Eq, Ord, Read, Show)
class Expr e where
true, false :: e
var :: String -> e
not :: e -> e
or, and :: e -> e -> e
instance Expr (Expression Explicit) where
true = Literal True
false = Literal False
var = Variable
not = Not
or = Or
and = And
eitherOrder :: (e -> e -> e)
-> (e -> e -> Maybe e)
-> e -> e -> e
eitherOrder fExplicit fSimplified x y = fromMaybe
(fExplicit x y)
(fSimplified x y <|> fSimplified y x)
instance Expr (Expression Simplified) where
true = Literal True
false = Literal False
var = Variable
not (Literal True) = false
not (Literal False) = true
not (Not x) = x
not x = Not x
or = eitherOrder Or go where
go (Literal True) _ = Just true
go (Literal False) x = Just x
go (Variable x) (Variable y) | x == y = Just (var x)
go (Variable x) (Not (Variable y)) | x == y = Just true
go _ _ = Nothing
and = eitherOrder And go where
go (Literal True) x = Just x
go (Literal False) _ = Just false
go (Variable x) (Variable y) | x == y = Just (var x)
go (Variable x) (Not (Variable y)) | x == y = Just false
go _ _ = Nothing
Now in ghci we can do more complicated simplifications, like:
> and (not (not (var "x"))) (var "x") :: Expression Simplified
Variable "x"
And even though we only wrote one order of the rewrite rule, both orders work properly:
> and (not (var "x")) (var "x") :: Expression Simplified
Literal False
> and (var "x") (not (var "x")) :: Expression Simplified
Literal False