Convert String to Int checking for overflow

后端 未结 2 1394
执念已碎
执念已碎 2021-01-20 03:04

When I tried to convert a very long integer to Int, I was surprised that no error was thrown:

Prelude> read \"123456789012345678901234567890\         


        
相关标签:
2条回答
  • 2021-01-20 03:45

    Background: why unchecked overflow is actually wonderful

    Haskell 98 leaves overflow behavior explicitly unspecified, which is good for implementers and bad for everyone else. Haskell 2010 discusses it in two sections—in the section inherited from Haskell 98, it's left explicitly unspecified, whereas in the sections on Data.Int and Data.Word, it is specified. This inconsistency will hopefully be resolved eventually.

    GHC is kind enough to specify it explicitly:

    All arithmetic is performed modulo 2^n, where n is the number of bits in the type.

    This is an extremely useful specification. In particular, it guarantees that Int, Word, Int64, Word32, etc., form rings, and even principal ideal rings, under addition and multiplication. This means that arithmetic will always work right—you can transform expressions and equations in lots of different ways without breaking things. Throwing exceptions on overflow would break all these properties, making it much more difficult to write and reason about programs. The only times you really need to be careful are when you use comparison operators like < and compare—fixed width integers do not form ordered groups, so these operators are a bit touchy.

    Why it makes sense not to check reads

    Reading an integer involves many multiplications and additions. It also needs to be fast. Checking to make sure the read is "valid" is not so easy to do quickly. In particular, while it's easy to find out whether an addition has overflowed, it is not easy to find out whether a multiplication has. The only sensible ways I can think of to perform a checked read for Int are

    1. Read as an Integer, check, then convert. Integer arithmetic is significantly more expensive than Int arithmetic. For smaller things, like Int16, the read can be done with Int, checking for Int16 overflow, then narrowed. This is cheaper, but still not free.

    2. Compare the number in decimal to maxBound (or, for a negative number, minBound) while reading. This seems more likely to be reasonably efficient, but there will still be some cost. As the first section of this answer explains, there is nothing inherently wrong with overflow, so it's not clear that throwing an error is actually better than giving an answer modulo 2^n.

    0 讨论(0)
  • 2021-01-20 03:45

    If isn't "unsafe", in that the behaviour of the problem isn't undefined. (It's perfectly defined, just probably not what you wanted.) For example, unsafeWriteAray is unsafe, in that if you make a mistake with it, it writes data into arbitrary memory locations, either causing your application to segfault, or merely making corrupt its own memory, causing it behave in arbitrary undefined ways.

    So much for splitting hairs. If you want to deal with such huge numbers, Integer is the only way to go. But you probably knew that already.

    As for why there's no overflow check... Sometimes you actually want a number to overflow. (E.g., you might convert to Word8 without explicitly ANDing out the bottom 8 bits.) At any rate, every possible arithmetic operation can potentially overflow (e.g., maxBound + 1 = minBound, and that's just normal addition.) Do you really want every single arithmetic operation to have an overflow check, slowing your program down at every step?

    You get the exact same behaviour in C, C++ or C#. I guess the difference is, in C# we have the checked keyword, which allows you to automatically check for overflow. Maybe somebody has a Haskell package for doing checked arithmetic… For now, it's probably simpler to just implement this one check yourself.

    0 讨论(0)
提交回复
热议问题