Type inference interferes with referential transparency

醉酒成梦 2021-02-12 18:07

What is the precise promise/guarantee the Haskell language provides with respect to referential transparency? At least the Haskell report does not mention this notion.

7 answers
  • 2021-02-12 18:08

    What is the precise promise/guarantee the Haskell language provides with respect to referential transparency? At least the Haskell report does not mention this notion.

    Haskell does not provide a precise promise or guarantee. Many functions, such as unsafePerformIO or traceShow, are not referentially transparent. The Safe Haskell extension, however, provides the following promise:

    Referential transparency — Functions in the safe language are deterministic, evaluating them will not cause any side effects. Functions in the IO monad are still allowed and behave as usual. Any pure function though, as according to its type, is guaranteed to indeed be pure. This property allows a user of the safe language to trust the types. This means, for example, that the unsafePerformIO :: IO a -> a function is disallowed in the safe language.

    Haskell provides an informal promise outside of this: the Prelude and base libraries tend to be free of side effects and Haskell programmers tend to label things with side effects as such.
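
    As a minimal sketch of what the Safe Haskell promise looks like in practice, assuming a GHC that supports the Safe pragma: the module below opts into the safe language, so an import of System.IO.Unsafe should be rejected at compile time. The name sortedDemo is hypothetical and used only for illustration.

```haskell
{-# LANGUAGE Safe #-}

import Data.List (sort)

-- Uncommenting the next import should make GHC reject this module,
-- because System.IO.Unsafe is not part of the safe language:
-- import System.IO.Unsafe (unsafePerformIO)

-- A pure value (hypothetical name). Under Safe Haskell its type can be
-- trusted: it cannot hide a side effect behind unsafePerformIO.
sortedDemo :: [Int]
sortedDemo = sort [3, 1, 2]
```

    Loading this in GHCi and evaluating sortedDemo should give the sorted list.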

    Evidently, this expression is now referentially opaque. How can I tell whether or not a program is subject to such behavior? I can inundate the program with :: all over but that does not make it very readable. Is there any other class of Haskell programs in between that I miss? That is between a fully annotated and an unannotated one?

    As others have said, the problem emerges from this behavior:

    Prelude> ( (7^7^7`mod`5`mod`2)==1, [False,True]!!(7^7^7`mod`5`mod`2) )
    (True,False)
    Prelude> 7^7^7`mod`5`mod`2 :: Integer
    1
    Prelude> 7^7^7`mod`5`mod`2 :: Int
    0
    

    This happens because 7^7^7 is a huge number (about 700,000 decimal digits) that overflows a 64-bit Int. The discrepancy is not reproducible on 32-bit systems, where the wrapped value 1602364023 happens to reduce to 1, the same as the Integer result:

    Prelude> :m + Data.Int
    Prelude Data.Int> 7^7^7 :: Int64
    -3568518334133427593
    Prelude Data.Int> 7^7^7 :: Int32
    1602364023
    Prelude Data.Int> 7^7^7 :: Int16
    8823
    

    With rem (7^7^7) 5, the remainder for Int64 is reported as -3, but since -3 is congruent to +2 modulo 5, mod reports +2.
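
    The overflow and the rem/mod difference can be put side by side in a small sketch. All names (exponent7, exact, wrapped, remAndMod, parities) are hypothetical, and Int64 is used instead of Int so the result does not depend on the platform.

```haskell
import Data.Int (Int64)

-- The exponent 7^7 = 823543 fits comfortably in an Int.
exponent7 :: Int
exponent7 = 7 ^ (7 :: Int)

-- Integer is arbitrary precision, so this is the exact value of 7^7^7.
exact :: Integer
exact = 7 ^ exponent7

-- Int64 arithmetic wraps around modulo 2^64, so this is a different number.
wrapped :: Int64
wrapped = 7 ^ exponent7

-- rem takes the sign of the dividend, mod takes the sign of the divisor.
remAndMod :: (Int64, Int64)
remAndMod = (wrapped `rem` 5, wrapped `mod` 5)

-- The final parity differs between the exact and the wrapped value.
parities :: (Integer, Int64)
parities = (exact `mod` 5 `mod` 2, wrapped `mod` 5 `mod` 2)
```

    Going by the session above, remAndMod should evaluate to (-3,2) and parities to (1,0).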

    The Integer answer is used on the left due to the defaulting rules for Integral classes; the platform-specific Int type is used on the right due to the type of (!!) :: [a] -> Int -> a. If you use the appropriate indexing operator for Integral a you instead get something consistent:

    Prelude> :m + Data.List
    Prelude Data.List> ((7^7^7`mod`5`mod`2) == 1, genericIndex [False,True] (7^7^7`mod`5`mod`2))
    (True,True)
    

    The problem here is not referential transparency because the functions that we're calling ^ are actually two different functions (as they have different types). What has tripped you up is typeclasses, which are an implementation of constrained ambiguity in Haskell; you have discovered that this ambiguity (unlike unconstrained ambiguity -- i.e. parametric types) can deliver counterintuitive results. This shouldn't be too surprising but it's definitely a little strange at times.

  • 2021-02-12 18:10

    The problem is overloading, which does indeed sort of violate referential transparency. You have no idea what something like (+) does in Haskell; it depends on the type.

    When a numeric type is unconstrained in a Haskell program, the compiler uses type defaulting to pick a suitable type. This is for convenience and usually doesn't lead to any surprises, but in this case it did. In GHC you can use -fwarn-type-defaults (spelled -Wtype-defaults in newer versions) to see when the compiler has used defaulting to pick a type for you. You can also add the line default () to your module to stop all defaulting.
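
    A minimal sketch of a module with defaulting switched off, assuming GHC: every numeric literal whose type cannot be inferred then needs an annotation, including the exponents of ^. The name lhsCheck is hypothetical.

```haskell
{-# OPTIONS_GHC -Wtype-defaults #-}

-- Turn off defaulting for this module: an ambiguous numeric type is now
-- a compile-time error instead of silently becoming Integer.
default ()

-- The base is pinned to Integer and both exponents to Int. Without these
-- annotations the expression would not compile under `default ()`.
-- The literals 5, 2 and 1 are inferred as Integer from the base.
lhsCheck :: Bool
lhsCheck = ((7 :: Integer) ^ ((7 :: Int) ^ (7 :: Int))) `mod` 5 `mod` 2 == 1
```

    With the types written out like this, lhsCheck should be True, and it is visible in the source that the comparison happens at Integer.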

  • 2021-02-12 18:16

    I do not think there's any guarantee that evaluating a polymorphically typed expression such as 5 at different types will produce "compatible" results, for any reasonable definition of "compatible".

    GHCi session:

    > class C a where num :: a
    > instance C Int    where num = 0
    > instance C Double where num = 1
    > num + length []  -- length returns an Int
    0
    > num + 0          -- GHCi's extended defaulting picks Double (Integer has no C instance)
    1.0
    

    This looks as if it's breaking referential transparency, since length [] and 0 should be equal, but under the hood it's num that's being used at different types.

    Also,

    > "" == []
    True
    > [] == [1]
    False
    > "" == [1]
    *** Type error
    

    where one could have expected False in the last line.

    So, I think referential transparency only holds when the exact types are specified to resolve polymorphism. An explicit type parameter application à la System F would make it possible to always substitute a variable with its definition without altering the semantics: as far as I understand, GHC internally does exactly this during optimization to ensure that semantics is unaffected. Indeed, GHC Core has explicit type arguments which are passed around.
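
    A rough sketch of what explicit type arguments look like at the source level, assuming GHC's TypeApplications extension and reusing the class C and the method num from the session above. The names atInt and atDouble are hypothetical.

```haskell
{-# LANGUAGE TypeApplications #-}

class C a where
  num :: a

instance C Int where
  num = 0

instance C Double where
  num = 1

-- The type argument is written out, so nothing is left to inference or
-- defaulting: `num @Int` always denotes the Int instance.
atInt :: Int
atInt = num @Int + 0

-- Likewise, `num @Double` always denotes the Double instance.
atDouble :: Double
atDouble = num @Double + 0
```

    Once the type argument is part of the expression, num @Int can be substituted anywhere without changing its meaning; here atInt should be 0 and atDouble should be 1.0.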

  • 2021-02-12 18:21

    Another type has been chosen because !! requires an Int. The full computation now uses Int instead of Integer.

    λ> ( (7^7^7`mod`5`mod`2 :: Int)==1, [False,True]!!(7^7^7`mod`5`mod`2) )
    (False,False)
    
  • 2021-02-12 18:23

    What do you think this has to do with referential transparency? Your uses of 7, ^, mod, 5, 2, and == are applications of those variables to dictionaries, yes, but I don't see why you think that fact makes Haskell referentially opaque. Often applying the same function to different arguments produces different results, after all!

    Referential transparency has to do with this expression:

    let x :: Int = 7^7^7`mod`5`mod`2 in (x == 1, [False, True] !! x)
    

    x is here a single value, and should always have that same single value.

    By contrast, if you say:

    let x :: forall a. Integral a => a; x = 7^7^7`mod`5`mod`2 in (x == 1, [False, True] !! x)
    

    (or use the expression inline, which is equivalent), x is now a function, and can return different values depending on the Integral dictionary you supply to it. You might as well complain that let f = (+1) in map f [1, 2, 3] is [2, 3, 4], but let f = (+3) in map f [1, 2, 3] is [4, 5, 6] and then say "Haskell gives different values for map f [1, 2, 3] depending on the context so it's referentially opaque"!
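
    The contrast between the two bindings can be sketched as follows. The names xPoly, xMono, polyPair and monoPair are hypothetical, and Int64 with genericIndex stands in for Int with !! so that the outcome does not depend on the platform word size.

```haskell
import Data.Int (Int64)
import Data.List (genericIndex)

-- Polymorphic binding: in effect a function of the Integral instance,
-- so each use site may pick a different type.
xPoly :: Integral a => a
xPoly = 7 ^ (7 ^ 7 :: Int) `mod` 5 `mod` 2

-- Monomorphic binding: a single value, the same at every use site.
xMono :: Int64
xMono = xPoly

-- The two components use xPoly at different types (Integer and Int64).
polyPair :: (Bool, Bool)
polyPair = ((xPoly :: Integer) == 1, genericIndex [False, True] (xPoly :: Int64))

-- Both components use the same Int64 value.
monoPair :: (Bool, Bool)
monoPair = (xMono == 1, genericIndex [False, True] xMono)
```

    Consistent with the results reported elsewhere in this thread, polyPair should be (True,False) and monoPair should be (False,False).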

  • 2021-02-12 18:32

    Another issue related to type inference and referential transparency is probably the "dreaded" monomorphism restriction (or its absence, to be exact). A direct quote:

    An example, from "A History of Haskell":
    Consider the genericLength function, from Data.List

    genericLength :: Num a => [b] -> a

    And consider the function:

    f xs = (len, len) where len = genericLength xs

    len has type Num a => a and, without the monomorphism restriction, it could be computed twice.

    Notice that in this case the types of both expressions are the same, and so are the results, but the substitution isn't always possible.
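
    To make the quoted example concrete, here is a sketch assuming GHC with the restriction switched off. The names pairOfLengths and demo are hypothetical. Because len is generalized, the two components of the pair may be used at different numeric types, which also means the length may be computed once per use.

```haskell
{-# LANGUAGE NoMonomorphismRestriction #-}

import Data.List (genericLength)

-- This signature is only accepted because len is generalized to
-- `Num n => n`. With the monomorphism restriction on, len would get a
-- single type and both components would have to share it.
pairOfLengths :: (Num a, Num b) => [c] -> (a, b)
pairOfLengths xs = (len, len)
  where
    len = genericLength xs

-- The same len is used at Int and at Double.
demo :: (Int, Double)
demo = pairOfLengths "abc"
```

    Evaluating demo in GHCi should give (3,3.0).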
