Right now, I have some code that essentially works like this:

data Expression
  = Literal Bool
  | Variable String
  | Not Expression
  | Or Expression Expression
I think Einstein said, "Simplify as much as possible, but no more." You have yourself a complicated datatype, and a correspondingly complicated concept, so I assume any technique can only be so much cleaner for the problem at hand.
That said, the first option is to use a case structure instead.
simplify x = case x of
  Literal _  -> x
  Variable _ -> x
  Not e      -> simplifyNot $ simplify e
  ...
  where
    sharedFunc1 = ...
    sharedFunc2 = ...
This has the added benefit of including shared functions that are usable by all cases but don't pollute the top-level namespace. I also like how the cases are freed of their parentheses. (Also note that in the first two cases I just return the original term rather than constructing a new one.) I often use this sort of structure just to break out other simplify functions, as in the case of Not.
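For instance, a simplifyNot helper (the name is hypothetical, taken from the case code above; the rewrite rules here are just illustrative) might assume its argument is already simplified and handle constant negation and double negation:

```haskell
data Expression
  = Literal Bool
  | Variable String
  | Not Expression
  | Or Expression Expression
  | And Expression Expression
  deriving (Eq, Show)

-- Hypothetical helper for the Not case: its argument is assumed
-- to be already simplified, so only one level needs inspecting.
simplifyNot :: Expression -> Expression
simplifyNot (Literal b) = Literal (not b)  -- fold constant negation
simplifyNot (Not e)     = e                -- eliminate double negation
simplifyNot e           = Not e            -- nothing to do
```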
This problem in particular may lend itself to basing Expression on an underlying functor, so that you can fmap a simplification of the subexpressions and then perform the specific combinations for the given case. It will look something like the following:
simplify :: Expression' -> Expression'
simplify = Exp . reduce . fmap simplify . unExp
The steps in this are unwrapping Expression' into the underlying functor representation, mapping the simplification over the underlying terms, and then reducing that simplification and wrapping back up into the new Expression'.
{-# LANGUAGE DeriveFunctor #-}

newtype Expression' = Exp { unExp :: ExpressionF Expression' }

data ExpressionF e
  = Literal Bool
  | Variable String
  | Not e
  | Or e e
  | And e e
  deriving (Eq, Functor)
Now, I have pushed the complexity off into the reduce function, which is only a little less complex because it doesn't have to worry about first reducing the subterms. But it will now contain solely the business logic of combining one term with another.
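To make this concrete, here is a sketch of what reduce might contain (the rewrite rules are illustrative, not exhaustive). Because fmap simplify has already simplified the subterms, reduce only ever inspects one level:

```haskell
{-# LANGUAGE DeriveFunctor #-}

newtype Expression' = Exp { unExp :: ExpressionF Expression' }
  deriving Eq

data ExpressionF e
  = Literal Bool
  | Variable String
  | Not e
  | Or e e
  | And e e
  deriving (Eq, Functor)

simplify :: Expression' -> Expression'
simplify = Exp . reduce . fmap simplify . unExp

-- Illustrative rules only: constant folding and double negation.
reduce :: ExpressionF Expression' -> ExpressionF Expression'
reduce (Not (Exp (Literal b)))       = Literal (not b)
reduce (Not (Exp (Not e)))           = unExp e        -- double negation
reduce (Or  (Exp (Literal True))  _) = Literal True   -- short-circuit
reduce (Or  (Exp (Literal False)) e) = unExp e
reduce (And (Exp (Literal False)) _) = Literal False
reduce (And (Exp (Literal True))  e) = unExp e
reduce e                             = e              -- nothing to do
```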
This may or may not be a good technique for you, but it may make some enhancements easier. For instance, if it is possible to form invalid expressions in your language, you could handle that with Maybe-valued failures.
simplifyMb :: Expression' -> Maybe Expression'
simplifyMb = fmap Exp . reduceMb <=< traverse simplifyMb . unExp
Here, traverse will apply simplifyMb to the subterms of the ExpressionF, resulting in an expression of Maybe subterms, ExpressionF (Maybe Expression'); then, if any subterm is Nothing, it returns Nothing, and if all are Just x, it returns Just (e :: ExpressionF Expression'). Traverse isn't actually separated into distinct phases like that, but it's easier to explain as if it were. Also note that you will need the DeriveTraversable and DeriveFoldable language pragmas, as well as the corresponding deriving clauses on the ExpressionF data type.
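Putting that together, a minimal runnable sketch might look like this (the reduceMb rules are illustrative; a real one would return Nothing for whatever counts as invalid in your language):

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

import Control.Monad ((<=<))

newtype Expression' = Exp { unExp :: ExpressionF Expression' }
  deriving Eq

data ExpressionF e
  = Literal Bool
  | Variable String
  | Not e
  | Or e e
  | And e e
  deriving (Eq, Functor, Foldable, Traversable)

-- traverse simplifies the subterms (failing fast on Nothing),
-- then reduceMb combines the already-simplified level.
simplifyMb :: Expression' -> Maybe Expression'
simplifyMb = fmap Exp . reduceMb <=< traverse simplifyMb . unExp

-- Illustrative reducer: one constant-folding rule; everything
-- else is accepted unchanged.
reduceMb :: ExpressionF Expression' -> Maybe (ExpressionF Expression')
reduceMb (Not (Exp (Literal b))) = Just (Literal (not b))
reduceMb e                       = Just e
```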
The downside? Well, for one, the dirt of your code will then lie in a bunch of Exp wrappers everywhere. Consider the application of simplifyMb to the simple term below:
simplifyMb (Exp $ Not (Exp $ Literal True))
It's also a lot to get your head around, but if you understand the traverse and fmap pattern above, you can reuse it in lots of places, so that's good. I also believe defining simplify in that way makes it more robust to whatever the specific ExpressionF constructors may turn into: it doesn't mention them, so the deep simplification will be unaffected by refactors. The reduce function, on the other hand, will be.