Why constraints on data are a bad thing?

前端 未结 3 1029
深忆病人
深忆病人 2021-02-18 18:20

I know this question has been asked and answered lots of times but I still don\'t really understand why putting constraints on a data type is a bad thing.

For example, l

相关标签:
3条回答
  • 2021-02-18 18:52

    Constraints

    The problem is that constraints are not a property of the data type, but of the algorithm/function that operates on them. Different functions might need different and unique constraints.

    A Box example

    As an example, let's assume we want to create a container called Box which contains only 2 values.

    data Box a = Box a a
    

    We want it to:

    • be showable
    • allow the sorting of the two elements via sort

    Does it make sense to apply the constraint of both Ord and Show on the data type? No, because the data type in itself could be only shown or only sorted and therefore the constraints are related to its use, not it's definition.

    instance (Show a) => Show (Box a) where
        show (Box a b) = concat ["'", show a, ", ", show b, "'"]
    
    instance (Ord a) => Ord (Box a) where
        compare (Box a b) (Box c d) =
            let ca = compare a c
                cb = compare b d
            in if ca /= EQ then ca else cb
    

    The Data.Map case

    Data.Map's Ord constraints on the type is really needed only when we have > 1 elements in the container. Otherwise the container is usable even without an Ord key. For example, this algorithm:

    transf :: Map NonOrd Int -> Map NonOrd Int
    transf x = 
        if Map.null x
            then Map.singleton NonOrdA 1
            else x
    

    Live demo

    works just fine without the Ord constraint and always produce a non empty map.

    0 讨论(0)
  • 2021-02-18 19:02

    Using DataTypeContexts reduces the number of programs you can write. If most of those illegal programs are nonsense, you might say it's worth the runtime cost associated with ghc passing in a type class dictionary that isn't used. For example, if we had

    data Ord k => MapDTC k a
    

    then @jefffrey's transf is rejected. But we should probably have transf _ = return (NonOrdA, 1) instead.

    In some sense the context is documentation that says "every Map must have ordered keys". If you look at all of the functions in Data.Map you'll get a similar conclusion "every useful Map has ordered keys". While you can create maps with unordered keys using

    mapKeysMonotonic :: (k1 -> k2) -> Map k1 a -> Map k2 a
    singleton :: k2 a -> Map k2 a
    

    But the moment you try to do anything useful with them, you'll wind up with No instance for Ord k2 somewhat later.

    0 讨论(0)
  • 2021-02-18 19:18

    TL;DR:
    Use GADTs to provide implicit data contexts.
    Don't use any kind of data constraint if you could do with Functor instances etc.
    Map's too old to change to a GADT anyway. Scroll to the bottom if you want to see the User implementation with GADTs


    Let's use a case study of a Bag where all we care about is how many times something is in it. (Like an unordered sequence. We nearly always need an Eq constraint to do anything useful with it.

    I'll use the inefficient list implementation so as not to muddy the waters over the Data.Map issue.

    GADTs - the solution to the data constraint "problem"

    The easy way to do what you're after is to use a GADT:

    Notice below how the Eq constraint not only forces you to use types with an Eq instance when making GADTBags, it provides that instance implicitly wherever the GADTBag constructor appears. That's why count doesn't need an Eq context, whereas countV2 does - it doesn't use the constructor:

    {-# LANGUAGE GADTs #-}
    
    data GADTBag a where
       GADTBag :: Eq a => [a] -> GADTBag a
    unGADTBag (GADTBag xs) = xs
    
    instance Show a => Show (GADTBag a) where
      showsPrec i (GADTBag xs) = showParen (i>9) (("GADTBag " ++ show xs) ++)
    
    count :: a -> GADTBag a -> Int -- no Eq here
    count a (GADTBag xs) = length.filter (==a) $ xs  -- but == here
    
    countV2 a = length.filter (==a).unGADTBag
    
    size :: GADTBag a -> Int
    size (GADTBag xs) = length xs
    
    ghci> count 'l' (GADTBag "Hello")
    2
    ghci> :t countV2
    countV2 :: Eq a => a -> GADTBag a -> Int
    

    Now we didn't need the Eq constraint when we found the total size of the bag, but it didn't clutter up our definition anyway. (We could have used size = length . unGADTBag just as well.)

    Now lets make a functor:

    instance Functor GADTBag where
      fmap f (GADTBag xs) = GADTBag (map f xs)
    

    oops!

    DataConstraints_so.lhs:49:30:
        Could not deduce (Eq b) arising from a use of `GADTBag'
        from the context (Eq a)
    

    That's unfixable (with the standard Functor class) because I can't restrict the type of fmap, but need to for the new list.

    Data Constraint version

    Can we do as you asked? Well, yes, except that you have to keep repeating the Eq constraint wherever you use the constructor:

    {-# LANGUAGE DatatypeContexts #-}
    
    data Eq a => EqBag a = EqBag {unEqBag :: [a]}
      deriving Show
    
    count' a (EqBag xs) = length.filter (==a) $ xs
    size' (EqBag xs) = length xs   -- Note: doesn't use (==) at all
    

    Let's go to ghci to find out some less pretty things:

    ghci> :so DataConstraints
    DataConstraints_so.lhs:1:19: Warning:
        -XDatatypeContexts is deprecated: It was widely considered a misfeature, 
        and has been removed from the Haskell language.
    [1 of 1] Compiling Main             ( DataConstraints_so.lhs, interpreted )
    Ok, modules loaded: Main.
    ghci> :t count
    count :: a -> GADTBag a -> Int
    ghci> :t count'
    count' :: Eq a => a -> EqBag a -> Int
    ghci> :t size
    size :: GADTBag a -> Int
    ghci> :t size'
    size' :: Eq a => EqBag a -> Int
    ghci> 
    

    So our EqBag count' function requires an Eq constraint, which I think is perfectly reasonable, but our size' function also requires one, which is less pretty. This is because the type of the EqBag constructor is EqBag :: Eq a => [a] -> EqBag a, and this constraint must be added every time.

    We can't make a functor here either:

    instance Functor EqBag where
       fmap f (EqBag xs) = EqBag (map f xs)
    

    for exactly the same reason as with the GADTBag

    Constraintless bags

    data ListBag a = ListBag {unListBag :: [a]}
      deriving Show
    count'' a = length . filter (==a) . unListBag
    size'' = length . unListBag
    
    instance Functor ListBag where
       fmap f (ListBag xs) = ListBag (map f xs)
    

    Now the types of count'' and show'' are exactly as we expect, and we can use standard constructor classes like Functor:

    ghci> :t count''
    count'' :: Eq a => a -> ListBag a -> Int
    ghci> :t size''
    size'' :: ListBag a -> Int
    ghci> fmap (Data.Char.ord) (ListBag "hello")
    ListBag {unListBag = [104,101,108,108,111]}
    ghci> 
    

    Comparison and conclusions

    The GADTs version automagically propogates the Eq constraint everywhere the constructor is used. The type checker can rely on there being an Eq instance, because you can't use the constructor for a non-Eq type.

    The DatatypeContexts version forces the programmer to manually propogate the Eq constraint, which is fine by me if you want it, but is deprecated because it doesn't give you anything more than the GADT one does and was seen by many as pointless and annoying.

    The unconstrained version is good because it doesn't prevent you from making Functor, Monad etc instances. The constraints are written exactly when they're needed, no more or less. Data.Map uses the unconstrained version partly because unconstrained is generally seen as most flexible, but also partly because it predates GADTs by some margin, and there needs to be a compelling reason to potentially break existing code.

    What about your excellent User example?

    I think that's a great example of a one-purpose data type that benefits from a constraint on the type, and I'd advise you to use a GADT to implement it.

    (That said, sometimes I have a one-purpose data type and end up making it unconstrainedly polymorphic just because I love to use Functor (and Applicative), and would rather use fmap than mapBag because I feel it's clearer.)

    {-# LANGUAGE GADTs #-}
    import Data.String
    
    data User s where 
       User :: (IsString s, Show s) => s -> User s
    
    name :: User s -> s
    name (User s) = s
    
    instance Show (User s) where  -- cool, no Show context
      showsPrec i (User s) = showParen (i>9) (("User " ++ show s) ++)
    
    instance (IsString s, Show s) => IsString (User s) where
      fromString = User . fromString
    

    Notice since fromString does construct a value of type User a, we need the context explicitly. After all, we composed with the constructor User :: (IsString s, Show s) => s -> User s. The User constructor removes the need for an explicit context when we pattern match (destruct), becuase it already enforced the constraint when we used it as a constructor.

    We didn't need the Show context in the Show instance because we used (User s) on the left hand side in a pattern match.

    0 讨论(0)
提交回复
热议问题