Loss of precision - int -> float or double

小蘑菇

I have an exam question I am revising for and the question is for 4 marks.

"In Java we can assign an int to a double or a float". Will this ever lose

9 answers

滥情空心

2020-11-27 15:49
Your intuition is correct: you MAY lose precision when converting int to float. However, it is not as simple as presented in most other answers.

In Java a FLOAT uses a 23 bit mantissa, so integers greater than 2^23 will have their least significant bits truncated. (from a post on this page)

Not true.
Example: here is an integer that is greater than 2^23 that converts to a float with no loss:
```
int i = 33_554_430 * 64; // greater than 2^23 (and also greater than 2^24); i = 2_147_483_520
float f = i;
System.out.println("result: " + (i - (int) f)); // Prints: result: 0
// Prints: with i:2147483520,  f:2.14748352E9 (println never emits underscores; the digits shown for f may differ between JDK versions)
System.out.println("with i:" + i + ",  f:" + f);
```
Therefore, it is not true that integers greater than 2^23 will have their least significant bits truncated.

The best explanation I found is here:
A float in Java is 32-bit and is represented by:
sign * mantissa * 2^exponent
sign * (0 to 33_554_431) * 2^(-125 to +127)
Source: http://www.ibm.com/developerworks/java/library/j-math2/index.html

Why is this an issue?
It leaves the impression that you can determine whether there is a loss of precision from int to float just by looking at how large the int is.
I have especially seen Java exam questions where one is asked whether a large int would convert to a float with no loss.

Also, people sometimes assume that precision is always lost from int to float:
- when an int is larger than 1_234_567_890: not true (see the counter-example above)
- when an int is larger than 2^23 (8_388_608): not true
- when an int is larger than 2^24 (16_777_216): not true

Conclusion
Conversions from sufficiently large ints to floats MAY lose precision.
It is not possible to determine whether there will be loss just by looking at how large the int is (i.e. without trying to go deeper into the actual float representation).
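As a rough illustration of that conclusion (a sketch that is not part of the original answer; the class and method names are made up here), the check below round-trips an int through float. It compares as long rather than casting back to int, on the assumption that we also want to catch values near Integer.MAX_VALUE, where an (int) cast saturates and can hide the loss.

```java
class SignificantBitsDemo {
    // Hypothetical helper: true if int -> float -> back yields the same value.
    // The comparison is done in long so that a float of 2^31 is not
    // clamped back to Integer.MAX_VALUE by an (int) cast.
    static boolean survivesFloat(int value) {
        return (long) (float) value == value;
    }

    public static void main(String[] args) {
        int small = 16_777_217;    // 2^24 + 1: needs 25 significant bits
        int large = 2_147_483_520; // (2^24 - 1) * 2^7: needs only 24 significant bits

        System.out.println(small + " exact as float? " + survivesFloat(small)); // false
        System.out.println(large + " exact as float? " + survivesFloat(large)); // true
    }
}
```

The much larger value survives while the smaller one does not, which is why magnitude alone is not a reliable test.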

攒了一身酷

2020-11-27 15:51
Here's what JLS has to say about the matter (in a non-technical discussion).

JLS 5.1.2 Widening primitive conversion
The following 19 specific conversions on primitive types are called the widening primitive conversions:
- int to long, float, or double
- (rest omitted)
Conversion of an int or a long value to float, or of a long value to double, may result in loss of precision -- that is, the result may lose some of the least significant bits of the value. In this case, the resulting floating-point value will be a correctly rounded version of the integer value, using IEEE 754 round-to-nearest mode.

Despite the fact that loss of precision may occur, widening conversions among primitive types never result in a run-time exception.

Here is an example of a widening conversion that loses precision:
```
class Test {
    public static void main(String[] args) {
        int big = 1234567890;
        float approx = big;
        System.out.println(big - (int) approx);
    }
}
```
which prints:
```
-46
```
thus indicating that information was lost during the conversion from type int to type float because values of type float are not precise to nine significant digits.
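To see where the -46 comes from, here is a small sketch (my own addition, not from the JLS; the class and helper names are hypothetical) that prints the float actually stored and the gap between neighbouring floats at that magnitude, using the standard Math.ulp and Math.nextDown methods.

```java
class NearestFloatDemo {
    // Hypothetical helper: the exact integer value of the float nearest to v.
    static long storedValue(int v) {
        return (long) (float) v;
    }

    // Hypothetical helper: distance between adjacent floats around v.
    static float spacingAt(int v) {
        return Math.ulp((float) v);
    }

    public static void main(String[] args) {
        int big = 1234567890;
        float approx = big;

        System.out.println("stored value:    " + storedValue(big));               // 1234567936
        System.out.println("float below it:  " + (long) Math.nextDown(approx));   // 1234567808
        System.out.println("spacing (ulp):   " + spacingAt(big));                 // 128.0
        System.out.println("big - stored:    " + (big - storedValue(big)));       // -46
    }
}
```

The original value sits between two representable floats that are 128 apart, and round-to-nearest picks the upper one, 46 above it.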

长发绾君心

2020-11-27 15:52
For these examples, I'm using Java.

Use a function like this to check for loss of precision when casting from int to float
```
static boolean checkPrecisionLossToFloat(int val)
{
  if(val < 0)
  {
    val = -val;
  }
  // 32 int bits - 24 significand bits = 8 (which also happens to be the exponent width for single-precision)
  return Integer.numberOfLeadingZeros(val) + Integer.numberOfTrailingZeros(val) < 8;
}
```
Use a function like this to check for loss of precision when casting from long to double
```
static boolean checkPrecisionLossToDouble(long val)
{
  if(val < 0)
  {
    val = -val;
  }
  // 64 long bits - 53 significand bits = 11 (which also happens to be the exponent width for double-precision)
  return Long.numberOfLeadingZeros(val) + Long.numberOfTrailingZeros(val) < 11;
}
```
Use a function like this to check for loss of precision when casting from long to float
```
static boolean checkPrecisionLossToFloat(long val)
{
  if(val < 0)
  {
    val = -val;
  }
  // 64 long bits - 24 significand bits = 40 (i.e. 8 + 32)
  return Long.numberOfLeadingZeros(val) + Long.numberOfTrailingZeros(val) < 40;
}
```
For each of these functions, returning true means that casting that integral value to the floating point value will result in a loss of precision.

Casting to float will lose precision if the integral value has more than 24 significant bits.

Casting to double will lose precision if the integral value has more than 53 significant bits.
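As a sanity check, the sketch below (an addition for illustration; the class name, method names, and sample values are my own) compares the bit-counting rule for int to float against an actual round trip for a handful of values, including the edge cases 0, Integer.MAX_VALUE and Integer.MIN_VALUE.

```java
class BitCountCheckDemo {
    // Same rule as the int-to-float function above: more than 24 significant bits means loss.
    static boolean bitCountSaysLoss(int val) {
        if (val < 0) {
            val = -val;
        }
        return Integer.numberOfLeadingZeros(val) + Integer.numberOfTrailingZeros(val) < 8;
    }

    // Direct test: convert to float and back, comparing in long to avoid int saturation.
    static boolean roundTripSaysLoss(int val) {
        return (long) (float) val != val;
    }

    public static void main(String[] args) {
        int[] samples = {
            0, 1, -1, 16_777_216, 16_777_217, -16_777_217,
            1_234_567_890, 2_147_483_520, Integer.MAX_VALUE, Integer.MIN_VALUE
        };
        for (int v : samples) {
            System.out.println(v + ": bitCount=" + bitCountSaysLoss(v)
                    + ", roundTrip=" + roundTripSaysLoss(v));
        }
    }
}
```

For these samples the two checks agree on every line.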

囚心锁ツ

2020-11-27 16:02

You can assign an int to a double without losing precision.


梦如初夏

2020-11-27 16:04

No, float and double are fixed-length too - they just use their bits differently. Read more about how exactly they work in the Floating-Point Guide.

Basically, you cannot lose precision when assigning an int to a double, because double has 53 bits of precision (52 stored plus one implicit leading bit), which is enough to hold all int values. But float only has 24 bits of precision (23 stored plus one implicit), so it cannot exactly represent every int value whose magnitude is larger than 2^24.
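One way to convince yourself is to sample the int range and count round-trip failures. This is only a sketch: the class name, method names, and the stride of 1_021 are arbitrary choices of mine, and a strided scan is a spot check rather than a proof.

```java
class WideningScanDemo {
    // Counts sampled ints that do not survive int -> double -> back.
    static long doubleMismatches(int stride) {
        long count = 0;
        for (long v = Integer.MIN_VALUE; v <= Integer.MAX_VALUE; v += stride) {
            int i = (int) v; // v is always within int range inside the loop
            if ((long) (double) i != i) {
                count++;
            }
        }
        return count;
    }

    // Counts sampled ints that do not survive int -> float -> back.
    static long floatMismatches(int stride) {
        long count = 0;
        for (long v = Integer.MIN_VALUE; v <= Integer.MAX_VALUE; v += stride) {
            int i = (int) v;
            if ((long) (float) i != i) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int stride = 1_021; // arbitrary step, roughly 4.2 million samples
        System.out.println("double mismatches: " + doubleMismatches(stride));
        System.out.println("float mismatches:  " + floatMismatches(stride));
    }
}
```

The double count should come out as zero, while the float count is well above zero.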


花落未央

2020-11-27 16:07
It's not necessary to know the internal layout of floating-point numbers. All you need is the pigeonhole principle and the knowledge that int and float are the same size.
- int is a 32-bit type, for which every bit pattern represents a distinct integer, so there are 2^32 int values.
- float is a 32-bit type, so it has at most 2^32 distinct values.
- Some floats represent non-integers, so there are fewer than 2^32 float values that represent integers.
- Therefore, different int values will be converted to the same float (=loss of precision).
Similar reasoning can be used with long and double.
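The argument above only says that collisions must exist. A concrete pair is easy to exhibit, as in this minimal sketch (class and method names are hypothetical, and the values are simply ones I picked):

```java
class PigeonholeDemo {
    // True if two different ints convert to the very same float value.
    static boolean collide(int a, int b) {
        return a != b && (float) a == (float) b;
    }

    public static void main(String[] args) {
        // 2^24 and 2^24 + 1 land on the same float
        System.out.println(collide(16_777_216, 16_777_217));                  // true
        // so do the two largest ints
        System.out.println(collide(Integer.MAX_VALUE, Integer.MAX_VALUE - 1)); // true
        // small values stay distinct
        System.out.println(collide(1, 2));                                     // false
    }
}
```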
