Are the bit patterns of NaNs really hardware-dependent?

臣服心动 2020-12-01 08:52

I was reading about floating-point NaN values in the Java Language Specification (I'm boring). A 32-bit float has this bit format:

seee eeee em         


        
4 Answers
  • 2020-12-01 09:18

    This is what §2.3.2 of the JVM 7 spec has to say about it:

    The elements of the double value set are exactly the values that can be represented using the double floating-point format defined in the IEEE 754 standard, except that there is only one NaN value (IEEE 754 specifies 2^53-2 distinct NaN values).

    and §2.8.1:

    The Java Virtual Machine has no signaling NaN value.

    So technically there is only one NaN. But §4.2.3 of the JLS also says (right after your quote):

    For the most part, the Java SE platform treats NaN values of a given type as though collapsed into a single canonical value, and hence this specification normally refers to an arbitrary NaN as though to a canonical value.

    However, version 1.3 of the Java SE platform introduced methods enabling the programmer to distinguish between NaN values: the Float.floatToRawIntBits and Double.doubleToRawLongBits methods. The interested reader is referred to the specifications for the Float and Double classes for more information.

    Which I take to mean exactly what you and CandiedOrange propose: It is dependent on the underlying processor, but Java treats them all the same.
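    The "collapsed into a single canonical value" behaviour can be seen by comparing the raw and non-raw conversion methods. This is a minimal sketch: the class name, the helper names, and the bit pattern 0x7fc12345 are arbitrary choices for illustration. Only the result of Float.floatToIntBits is guaranteed by its documentation; what floatToRawIntBits returns depends on the platform.

```java
class CanonicalNaNDemo {
    // Float.floatToIntBits is documented to return this value for every NaN.
    static final int CANONICAL_FLOAT_NAN = 0x7fc00000;

    // Collapses any NaN pattern to the canonical one.
    static int canonicalBits(int pattern) {
        return Float.floatToIntBits(Float.intBitsToFloat(pattern));
    }

    // Reports whatever bits the float actually carries on this platform.
    static int rawBits(int pattern) {
        return Float.floatToRawIntBits(Float.intBitsToFloat(pattern));
    }

    public static void main(String[] args) {
        int pattern = 0x7fc12345; // arbitrary NaN pattern with a nonzero payload
        System.out.println(Integer.toHexString(canonicalBits(pattern)));
        System.out.println(Integer.toHexString(rawBits(pattern))); // platform dependent
    }
}
```

    The first line printed is always 7fc00000. The second usually echoes the input pattern, but the specification does not promise that.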

    But it gets better: Apparently, it is entirely possible that your NaN values are silently converted to different NaNs, as described in Double.longBitsToDouble():

    Note that this method may not be able to return a double NaN with exactly same bit pattern as the long argument. IEEE 754 distinguishes between two kinds of NaNs, quiet NaNs and signaling NaNs. The differences between the two kinds of NaN are generally not visible in Java. Arithmetic operations on signaling NaNs turn them into quiet NaNs with a different, but often similar, bit pattern. However, on some processors merely copying a signaling NaN also performs that conversion. In particular, copying a signaling NaN to return it to the calling method may perform this conversion. So longBitsToDouble may not be able to return a double with a signaling NaN bit pattern. Consequently, for some long values, doubleToRawLongBits(longBitsToDouble(start)) may not equal start. Moreover, which particular bit patterns represent signaling NaNs is platform dependent; although all NaN bit patterns, quiet or signaling, must be in the NaN range identified above.
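    The round trip that the Javadoc warns about can be checked on whatever machine you have at hand. The sketch below is only a probe, not a definitive test: the class and method names are made up, and treating the top fraction bit as the "quiet" flag is an assumption that matches the x86 row of the table further down but not PA-RISC.

```java
class SignalingRoundTrip {
    // Top fraction bit; assumed to mean "quiet" (true for x86-style encodings).
    static final long ASSUMED_QUIET_BIT = 0x0008000000000000L;

    // True if the bit pattern comes back unchanged after a trip through double.
    static boolean survivesRoundTrip(long start) {
        return Double.doubleToRawLongBits(Double.longBitsToDouble(start)) == start;
    }

    public static void main(String[] args) {
        long start = 0x7ff0000000000001L; // in the NaN range, assumed quiet bit clear
        long back = Double.doubleToRawLongBits(Double.longBitsToDouble(start));
        System.out.println("start      = " + Long.toHexString(start));
        System.out.println("back       = " + Long.toHexString(back));
        System.out.println("unchanged  = " + survivesRoundTrip(start));
        System.out.println("quiet bit  = " + ((back & ASSUMED_QUIET_BIT) != 0));
    }
}
```

    If "unchanged" prints false and the quiet bit is set, the platform quieted the NaN somewhere along the way, which is exactly the case the Javadoc describes.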

    For reference, there is a table of the hardware-dependent NaNs here. In summary:

    - x86:     
       quiet:      Sign=0  Exp=0x7ff  Frac=0x80000
       signalling: Sign=0  Exp=0x7ff  Frac=0x40000
    - PA-RISC:               
       quiet:      Sign=0  Exp=0x7ff  Frac=0x40000
       signalling: Sign=0  Exp=0x7ff  Frac=0x80000
    - Power:
       quiet:      Sign=0  Exp=0x7ff  Frac=0x80000
       signalling: Sign=0  Exp=0x7ff  Frac=0x5555555500055555
    - Alpha:
       quiet:      Sign=0  Exp=0      Frac=0xfff8000000000000
       signalling: Sign=1  Exp=0x2aa  Frac=0x7ff5555500055555
    

    So to verify this, you would need access to one of these processors and try it there. Any insight into how to interpret the longer values listed for the Power and Alpha architectures is also welcome.
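    To compare a bit pattern against the table, it helps to split it into the same three fields. This is a sketch with made-up class and method names; the highWordFraction helper encodes my guess that the table's short Frac values (such as 0x80000) are the 20 fraction bits that sit in the upper 32-bit word.

```java
class DoubleFields {
    static long sign(long bits)     { return bits >>> 63; }
    static long exponent(long bits) { return (bits >>> 52) & 0x7ffL; }
    static long fraction(long bits) { return bits & 0x000fffffffffffffL; }

    // Fraction bits contained in the upper 32-bit word (assumed table convention).
    static long highWordFraction(long bits) { return (bits >>> 32) & 0xfffffL; }

    static String describe(long bits) {
        return String.format("Sign=%d  Exp=0x%x  Frac=0x%x",
                sign(bits), exponent(bits), fraction(bits));
    }

    public static void main(String[] args) {
        System.out.println(describe(0x7ff8000000000000L));
        System.out.println(describe(0xfff8000000000000L));
    }
}
```

    Read this way, the Alpha "quiet" entry 0xfff8000000000000 looks like a complete 64-bit word rather than a fraction field: it decodes to Sign=1, Exp=0x7ff, Frac=0x8000000000000. That is only a guess about how the table was laid out.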

  • 2020-12-01 09:35

    Here is a program demonstrating different NaN bit patterns:

    public class Test {
      public static void main(String[] arg) {
        double myNaN = Double.longBitsToDouble(0x7ff1234512345678L);
        System.out.println(Double.isNaN(myNaN));
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(myNaN)));
        final double zeroDivNaN = 0.0 / 0.0;
        System.out.println(Double.isNaN(zeroDivNaN));
        System.out.println(Long.toHexString(Double.doubleToRawLongBits(zeroDivNaN)));
      }
    }
    

    output:

    true
    7ff1234512345678
    true
    7ff8000000000000
    

    Regardless of what the hardware does, the program can itself create NaNs that may not be the same as e.g. 0.0/0.0, and may have some meaning in the program.
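    As an illustration of a NaN that carries "some meaning in the program", here is a sketch that stores a small integer in the low fraction bits of a quiet NaN. The names NaNPayload, encode and decode are hypothetical, and the approach relies on the platform leaving quiet NaN bits alone when the value is copied, which the Java specification does not guarantee.

```java
class NaNPayload {
    // Exponent all ones plus the top fraction bit: the usual quiet NaN prefix.
    static final long QUIET_NAN_BASE = 0x7ff8000000000000L;
    // The remaining 51 fraction bits are free for a payload.
    static final long PAYLOAD_MASK = 0x0007ffffffffffffL;

    static double encode(long payload) {
        if ((payload & ~PAYLOAD_MASK) != 0) {
            throw new IllegalArgumentException("payload must fit in 51 bits");
        }
        return Double.longBitsToDouble(QUIET_NAN_BASE | payload);
    }

    static long decode(double value) {
        if (!Double.isNaN(value)) {
            throw new IllegalArgumentException("not a NaN");
        }
        // Must use the raw variant; doubleToLongBits would erase the payload.
        return Double.doubleToRawLongBits(value) & PAYLOAD_MASK;
    }

    public static void main(String[] args) {
        double tagged = encode(42);
        System.out.println(Double.isNaN(tagged));
        System.out.println(decode(tagged)); // payload recovery is platform dependent
    }
}
```

    Keeping the quiet bit set is deliberate: according to the longBitsToDouble documentation, signaling patterns are the ones most likely to be rewritten on the way through.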

  • 2020-12-01 09:37

    The only other NaN value that I could generate with normal arithmetic operations so far is the same but with the sign changed:

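    public class NaNSignDemo {
        public static void main(String[] args) {
            // Boxed on purpose: the last two lines compare Double objects.
            Double tentative1 = 0d / 0d;
            Double tentative2 = Math.sqrt(-1d);

            System.out.println(tentative1);
            System.out.println(tentative2);

            System.out.println(Long.toHexString(Double.doubleToRawLongBits(tentative1)));
            System.out.println(Long.toHexString(Double.doubleToRawLongBits(tentative2)));

            // == on boxed Doubles compares references, so this is false here
            // regardless of NaN semantics.
            System.out.println(tentative1 == tentative2);
            // Double.equals compares doubleToLongBits, which collapses all NaNs.
            System.out.println(tentative1.equals(tentative2));
        }
    }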
    

    Output:

    NaN
    NaN
    7ff8000000000000
    fff8000000000000
    false
    true

  • 2020-12-01 09:42

    The way I read the JLS here the exact bit value of a NaN is dependent on who/what made it and since the JVM didn't make it, don't ask them. You might as well ask them what an "Error code 4" string means.

    The hardware produces different bit patterns meant to represent different kinds of NaN's. Unfortunately, different kinds of hardware produce different bit patterns for the same kinds of NaN's. Fortunately, there is a standard pattern that Java can use to at least tell that it is some kind of NaN.

    It's like Java looked at the "Error code 4" string and said, "We don't know what 'code 4' means on your hardware, but there was the word 'error' in that string, so we think it's an error."

    The JLS tries to give you a chance to figure it out on your own though:

    "However, version 1.3 of the Java SE platform introduced methods enabling the programmer to distinguish between NaN values: the Float.floatToRawIntBits and Double.doubleToRawLongBits methods. The interested reader is referred to the specifications for the Float and Double classes for more information."

    Which looks to me like a C++ reinterpret_cast. It's Java giving you a chance to analyze the NaN yourself in case you happen to know how its signal was encoded. If you want to track down the hardware specs so you can predict what different events should produce which NaN bit patterns you are free to do so but you are outside the uniformity the JVM was meant to give us. So expect it might change from hardware to hardware.

    When testing if a number is NaN we check if it's equal to itself since it's the only number that isn't. This isn't to say that the bits are different. Before comparing the bits the JVM tests for the many bit patterns that say it's a NaN. If it's any of those patterns then it reports that it's not equal, even if the bits of the two operands really are the same (and even if they're different).
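    The difference between comparing values and comparing bits can be sketched as follows. The class and helper names are made up for illustration; the behaviour of !=, Double.isNaN, Double.equals and Double.compare is as documented for the standard library.

```java
class NaNEquality {
    // The classic self-comparison test: only NaN is unequal to itself.
    static boolean isNaNBySelfCompare(double x) {
        return x != x;
    }

    public static void main(String[] args) {
        double a = Double.longBitsToDouble(0x7ff8000000000000L);
        double b = a; // same bits, same variable contents

        System.out.println(a == b);                 // false: NaN never equals anything
        System.out.println(isNaNBySelfCompare(a));  // true
        System.out.println(Double.isNaN(a));        // true
        // The object-level methods treat NaN as equal to itself.
        System.out.println(Double.valueOf(a).equals(Double.valueOf(b))); // true
        System.out.println(Double.compare(a, b));   // 0
    }
}
```

    So the primitive comparison ignores the bits entirely once either operand is a NaN, while Double.equals and Double.compare go the other way and treat every NaN as the same value.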

    Back in 1964, when pressed to give an exact definition for pornography, U.S. Supreme Court Justice Stewart famously said, “I know it when I see it”. I think of Java as doing the same thing with NaN's. Java can't tell you anything that a "signaling" NaN might be signaling because it doesn't know how that signal was encoded. But it can look at the bits and tell it's a NaN of some kind, since that pattern follows one standard.

    If you happen to be on hardware that encodes all NaN's with uniform bits you'll never prove that Java is doing anything to make NaN's have uniform bits. Again, the way I read the JLS, they are outright saying you are on your own here.

    I can see why this feels flaky. It is flaky. It's just not Java's fault. I'll lay odds that somewhere out there some enterprising hardware manufacturers came up with wonderfully expressive signaling NaN bit patterns, but they failed to get them adopted as a standard widely enough that Java can count on it. That's what's flaky. We have all these bits reserved for signaling what kind of NaN we have and can't use them because we don't agree on what they mean. Having Java beat NaN's into a uniform value after the hardware makes them would destroy that information and harm performance, and the only payoff would be to not seem flaky. Given the situation, I'm glad they realized they could cheat their way out of the problem and define NaN as not being equal to anything.
