ieee-754

Find min/max of a float/double that has the same internal representation

岁酱吖の submitted on 2019-12-03 19:34:36
Question: Refreshing on floating point (also PDF) and IEEE-754, and taking part in this discussion on floating-point rounding when converting to strings, brought me to tinker: how can I get the maximum and minimum values for a given floating-point number whose binary representations are equal? Disclaimer: for this discussion, I like to stick to 32-bit and 64-bit floating point as described by IEEE-754. I'm not interested in extended floating point (80 bits) or quads (128-bit IEEE-754-2008) or any other
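A minimal sketch of one way to probe this in C, assuming the question is about the range of real values that map to one binary representation: the neighbours of a float are given by nextafterf(), and the real numbers that round to f lie roughly between the midpoints to those neighbours (computed here in double purely for illustration, not as a statement about decimal string lengths).

#include <math.h>
#include <stdio.h>

int main(void) {
    float f = 0.1f;
    /* Neighbouring representable floats */
    float below = nextafterf(f, -INFINITY);
    float above = nextafterf(f, +INFINITY);
    /* Real values between these midpoints round (ties aside) to the same bits as f */
    double lo = ((double)below + (double)f) / 2.0;
    double hi = ((double)f + (double)above) / 2.0;
    printf("f       = %.20g\n", (double)f);
    printf("range  ~ [%.20g, %.20g]\n", lo, hi);
    return 0;
}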

Convert float to bigint (aka portable way to get binary exponent & mantissa)

耗尽温柔 submitted on 2019-12-03 16:20:38
In C++, I have a bigint class that can hold an integer of arbitrary size. I'd like to convert large float or double numbers to bigint. I have a working method, but it's a bit of a hack. I used the IEEE 754 number specification to get the binary sign, mantissa and exponent of the input number. Here is the code (the sign is ignored here, that's not important): float input = 77e12; bigint result; // extract sign, exponent and mantissa, // according to IEEE 754 single precision number format unsigned int *raw = reinterpret_cast<unsigned int *>(&input); unsigned int sign = *raw >> 31; unsigned int exponent
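For reference, a self-contained sketch of the bit-extraction step the question describes, restricted to single precision; memcpy replaces the reinterpret_cast to stay within defined behaviour, and the variable names are illustrative rather than taken from the original post.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    float input = 77e12f;
    uint32_t raw;
    memcpy(&raw, &input, sizeof raw);          /* safe alternative to pointer casting */

    unsigned sign     = raw >> 31;             /* 1 bit */
    unsigned exponent = (raw >> 23) & 0xFF;    /* 8 bits, biased by 127 */
    unsigned mantissa = raw & 0x7FFFFF;        /* 23 bits, implicit leading 1 for normals */

    /* value = (-1)^sign * 1.mantissa * 2^(exponent - 127) for normal numbers */
    printf("sign=%u exponent=%u (unbiased %d) mantissa=0x%06X\n",
           sign, exponent, (int)exponent - 127, mantissa);
    return 0;
}

A portable alternative that avoids assuming the bit layout at all is frexpf(), which returns the significand and binary exponent directly.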

Layman's explanation for why JavaScript has weird floating math – IEEE 754 standard [duplicate]

强颜欢笑 submitted on 2019-12-03 16:18:35
This question already has an answer here: Is floating point math broken? (31 answers) I never understand exactly what's going on with JavaScript when I do mathematical operations on floating-point numbers. I've been downright fearful of using decimals, to the point where I just avoid them when at all possible. However, if I knew what was going on behind the scenes when it comes to the IEEE 754 standard, then I would be able to predict what would happen; with predictability, I'll be more confident and less fearful. Could someone give me a simple explanation (as simple as explaining binary
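A concrete illustration of the underlying behaviour, shown in C only because an IEEE-754 double behaves the same way in any language: 0.1 and 0.2 have no exact binary representation, so the stored values and their sum differ slightly from the decimals typed in the source.

#include <stdio.h>

int main(void) {
    double a = 0.1, b = 0.2;
    /* Print enough digits to reveal the stored binary values */
    printf("0.1       = %.17g\n", a);              /* 0.10000000000000001 */
    printf("0.2       = %.17g\n", b);              /* 0.20000000000000001 */
    printf("0.1 + 0.2 = %.17g\n", a + b);          /* 0.30000000000000004 */
    printf("equal to 0.3? %d\n", (a + b) == 0.3);  /* 0 (false) */
    return 0;
}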

Floating point arithmetic and reproducibility

对着背影说爱祢 submitted on 2019-12-03 12:43:02
Question: Is IEEE-754 arithmetic reproducible on different platforms? I was testing some code written in R that uses random numbers. I thought that setting the seed of the random number generator on all tested platforms would make the tests reproducible, but this does not seem to be true for rexp(), which generates exponentially distributed random numbers. This is what I get on 32-bit Linux: options(digits=22) ; set.seed(9) ; rexp(1, 5) # [1] 0.2806184054728815824298 sessionInfo() # R version 3.0.2
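One practical way to check whether two platforms really produced bit-identical doubles is to print the exact binary value rather than a rounded decimal; a small sketch using the C99 %a hexadecimal float format (the constant below merely reparses the decimal printed by R above, so it is only the nearest double to that string, not necessarily R's internal value).

#include <stdio.h>

int main(void) {
    double x = 0.2806184054728815824298;  /* nearest double to the decimal shown above */
    /* %a prints the exact bits, so any cross-platform difference becomes visible */
    printf("decimal: %.22g\n", x);
    printf("exact:   %a\n", x);
    return 0;
}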

what languages expose IEEE 754 traps to the developer?

为君一笑 submitted on 2019-12-03 12:41:19
I'd like to play with those traps for educational purposes. A common problem with the default behavior in numerical calculus is that we "miss" the NaN (or ±inf) that appeared in a wrong operation. The default behavior is propagation through the computation, but some operations (like comparisons) break the chain and lose the NaN, and the rest of the treatment continues without acknowledging the singularity in previous steps of the algorithm. Sometimes we have ways to react to this kind of event: extending a function ("0/0 = 12 in my case"), or in time-domain simulation throwing the step away and
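For reference, portable C99 does not expose traps directly, but <fenv.h> does let you test the sticky exception flags after the fact; a minimal sketch of detecting that an invalid operation occurred somewhere in a computation (actual hardware traps that raise SIGFPE need a platform extension such as glibc's feenableexcept()).

#include <fenv.h>
#include <stdio.h>

#pragma STDC FENV_ACCESS ON

int main(void) {
    feclearexcept(FE_ALL_EXCEPT);

    volatile double zero = 0.0;
    volatile double r = zero / zero;   /* raises FE_INVALID, produces NaN */
    (void)r;

    if (fetestexcept(FE_INVALID))
        printf("an invalid operation happened somewhere above\n");
    if (fetestexcept(FE_DIVBYZERO))
        printf("a division by zero happened somewhere above\n");
    return 0;
}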

Is there any IEEE 754 standard implementations for Java floating point primitives?

核能气质少年 submitted on 2019-12-03 12:08:41
I'm interested in whether Java is using the IEEE 754 standard for implementing its floating-point arithmetic. Here I saw this kind of thing in the documentation: "operation defined in IEEE 754-2008". As I understand it, the positive side of IEEE 754 is to increase the precision of floating-point arithmetic, so if I use double or float in Java, would the precision of computations be the same as with BigDecimal? And if not, then what's the point of using the IEEE 754 standard in the Math class? I'm interested in whether Java is using the IEEE 754 standard for implementing its floating-point arithmetic. IEEE-754 defines standards for multiple floating
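To put rough numbers on that precision question (illustrated in C, since Java's float and double use the same IEEE-754 binary32 and binary64 formats): a double always carries a fixed 53-bit significand, about 15 decimal digits, no matter what you compute, whereas BigDecimal-style arithmetic carries as many decimal digits as requested. The format limits are visible in <float.h>:

#include <float.h>
#include <stdio.h>

int main(void) {
    printf("float : %d mantissa bits, %d guaranteed decimal digits\n",
           FLT_MANT_DIG, FLT_DIG);    /* typically 24 and 6 */
    printf("double: %d mantissa bits, %d guaranteed decimal digits\n",
           DBL_MANT_DIG, DBL_DIG);    /* typically 53 and 15 */
    return 0;
}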

How many unique values are there between 0 and 1 of a standard float?

耗尽温柔 submitted on 2019-12-03 11:34:59
Question: I guess another way of phrasing this question is: what decimal place can you go to using a float that will only be between 0 and 1? I've tried to work it out by looking at MSDN, which says the precision is 7 digits. I thought that meant it could only track changes of 0.0000001. However, if I do: float test = 0.00000000000000000000000000000000000000000001f; Console.WriteLine(test); it writes out 9.949219E-44. If I add any more zeroes, it will output 0. I'm pretty sure I'm missing something
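A quick way to count them, assuming IEEE-754 single precision: non-negative floats are ordered the same way as their bit patterns, so every pattern from 0x00000000 (+0.0f) up to the pattern of 1.0f is a distinct representable value in [0, 1]. A sketch in C:

#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    float one = 1.0f;
    uint32_t bits;
    memcpy(&bits, &one, sizeof bits);   /* 0x3F800000 for IEEE-754 single precision */

    /* Patterns 0 .. bits are exactly the floats in [0.0f, 1.0f], inclusive */
    printf("floats in [0, 1]: %u\n", (unsigned)(bits + 1));   /* 1065353217 */
    return 0;
}

That count includes the subnormals, which is why the "7 significant digits" rule of thumb says nothing about how close to zero a float can get.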

Do-s and Don't-s for floating point arithmetic?

风格不统一 submitted on 2019-12-03 10:59:19
What are some good dos and don'ts for floating-point arithmetic (IEEE 754, in case there's confusion) to ensure good numerical stability and high accuracy in your results? I know a few, like don't subtract quantities of similar magnitude, but I'm curious what other good rules are out there. First, enter with the notion that floating-point numbers do NOT necessarily follow the same rules as real numbers... once you have accepted this, you will understand most of the pitfalls. Here are some rules/tips that I've always followed: NEVER compare a floating point number to zero or anything else for that
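On that last point, the usual alternative to exact equality is a tolerance test; a minimal sketch in C (the tolerance values are illustrative and have to be chosen per problem, they are not universal constants).

#include <math.h>
#include <stdbool.h>
#include <stdio.h>

/* Approximate equality: absolute tolerance near zero, relative tolerance elsewhere */
static bool nearly_equal(double a, double b, double rel_tol, double abs_tol) {
    double diff = fabs(a - b);
    if (diff <= abs_tol)
        return true;
    return diff <= rel_tol * fmax(fabs(a), fabs(b));
}

int main(void) {
    double x = 0.1 + 0.2;
    printf("x == 0.3           : %d\n", x == 0.3);                           /* 0 */
    printf("nearly_equal(x,0.3): %d\n", nearly_equal(x, 0.3, 1e-12, 1e-15)); /* 1 */
    return 0;
}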

Why is the IEEE-754 exponent bias used in this C code 126.94269504 instead of 127?

只谈情不闲聊 submitted on 2019-12-03 10:41:09
The following C function is from the fastapprox project.

static inline float fasterlog2 (float x) {
  union { float f; uint32_t i; } vx = { x };
  float y = vx.i;
  y *= 1.1920928955078125e-7f;
  return y - 126.94269504f;
}

Could some experts here explain why the exponent bias used in the above code is 126.94269504 instead of 127? Is it a more accurate bias value? In the project you linked, they included a Mathematica notebook with an explanation of their algorithms, which includes the "mysterious" -126.94269 value. If you need a viewer, you can get one from the Mathematica website for free. Edit: Since I'm
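A sketch of the idea, paraphrased rather than taken from the project's notebook: reinterpreting the bits of a positive float as an integer and multiplying by 2^-23 gives (exponent + 127) + m, where m in [0, 1) is the fractional mantissa, so subtracting 127 would approximate log2(x) with a one-sided error of m - log2(1 + m); subtracting the slightly smaller 126.94269504 shifts that error band so it straddles zero. A quick check against log2f:

#include <math.h>
#include <stdint.h>
#include <stdio.h>

/* Copied from the excerpt above */
static inline float fasterlog2 (float x) {
  union { float f; uint32_t i; } vx = { x };
  float y = vx.i;
  y *= 1.1920928955078125e-7f;   /* 2^-23 */
  return y - 126.94269504f;
}

int main(void) {
    /* Compare against the library log2f over a few sample points */
    const float samples[] = { 0.75f, 1.0f, 1.5f, 2.0f, 10.0f, 77e12f };
    for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; ++k) {
        float x = samples[k];
        printf("x=%-12g fast=%-12g exact=%-12g err=% g\n",
               (double)x, (double)fasterlog2(x), (double)log2f(x),
               (double)(fasterlog2(x) - log2f(x)));
    }
    return 0;
}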

Is there an open-source c/c++ implementation of IEEE-754 operations? [closed]

China☆狼群 submitted on 2019-12-03 10:37:06
I am looking for a reference implementation of IEEE-754 operations. Is there such a thing? I believe the C libraries SoftFloat and fdlibm are suitable for what you are looking for. Others include Linux (GNU libc, glibc) or *BSD libc's math functions. Finally, CRlibm should also be of interest to you. Ulrich Drepper has an interesting look at different math libraries that might also be worth reading through. I must disappoint you: there is practically none. While technically there are IEEE-754 compliant systems because they do not implement non-required features described in the standard, a