ieee-754

Converting double to hexadecimal - Code review

烈酒焚心 submitted on 2019-12-24 01:02:40
问题 Question: I have the following code that takes a double value and converts it to a hexadecimal representation, and vice versa. I would like to know whether there are any potential problems with it, and whether I have overlooked something.

double hex_to_double2(string &hexString) {
    unsigned char byte_string[sizeof(double)];
    int number;
    int j = 0;
    for (int i = 0; i < hexString.size(); i += 2) {
        sscanf(&hexString[i], "%02x", &number);
        byte_string[j] = (unsigned char)number;
        ++j;
    }
    double p = (double&)byte

Is it 52 or 53 bits of floating point precision?

南笙酒味 submitted on 2019-12-23 15:35:26
问题 Question: I keep seeing this nonsense about 53 bits of precision in the 64-bit IEEE floating-point representation. Would someone please explain to me how in the world a bit that is stuck at 1 contributes ANYTHING to the numeric precision? If you had a floating-point unit with bit 0 stuck at 1, you would of course know that it produces one less bit of precision than normal. Where are those sensibilities here? Further, just the exponent, the scaling factor without the mantissa, completely

What does standard say about cmath functions like std::pow, std::log etc?

大憨熊 submitted on 2019-12-23 09:46:46
问题 Question: Does the standard guarantee that these functions return the exact same result across all implementations? Take, for example, pow(float, float) for 32-bit IEEE floats. Is the result identical across all implementations if the same two floats are passed in? Or is there some flexibility the standard allows with regard to tiny differences, depending on the algorithm used to implement pow?
回答1: Answer 1: No, the C++ standard doesn't require the results of cmath functions to be the same across all implementations

How does the C == operator decide whether or not two floating point values are equal?

喜夏-厌秋 submitted on 2019-12-23 08:26:32
问题 Question: Today I was tracking down why my program was getting some unexpected checksum-mismatch errors in code I wrote that serializes and deserializes IEEE-754 floating-point values, in a format that includes a 32-bit checksum (computed by running a CRC-type algorithm over the bytes of the floating-point array). After a bit of head-scratching, I realized the problem was that 0.0f and -0.0f have different bit patterns (0x00000000 vs 0x00000080 (little-endian), respectively),

Is a float guaranteed to be preserved when transported through a double in C/C++?

会有一股神秘感。 submitted on 2019-12-23 06:44:52
问题 Question: Assuming IEEE-754 conformance, is a float guaranteed to be preserved when transported through a double? In other words, will the following assert always be satisfied?

int main() {
    float f = some_random_float();
    assert(f == (float)(double)f);
}

Assume that f could take any of the special values defined by IEEE, such as NaN and Infinity. According to IEEE, is there a case where the assert will be satisfied, but the exact bit-level representation is not preserved after the transportation

C++ convert floating point number to string

家住魔仙堡 submitted on 2019-12-23 04:24:46
问题 Question: I am trying to convert a floating-point number to a string. I know you can do it using ostringstream, sprintf, etc., but in the project I am working on I am trying to do it using my own functions only (I am creating my own string class without using any outside functions). I don't need a perfect representation; e.g., I don't mind if this happens with large or small numbers: 1.0420753e+4, like it does with the standard stringstream. I know how floating-point numbers work (e.g. sign, exponent,

Error using Newton-Raphson Iteration Method for Floating Point Division

天涯浪子 submitted on 2019-12-23 01:52:19
问题 Question: I am using the Newton-Raphson algorithm to divide IEEE-754 single-precision floating-point values using single-precision hardware. I am using the method described at these two links: Wikipedia Newton-Raphson Division, Newton-Raphson Method I'm Using. However, despite computing X_i up to X_3 (i.e. using 3 iterations), my answers are still off a bit. I'm wondering why this is so. I'm comparing my results using MATLAB. Here is the output showing an example of an incorrect result ===== DIVISION RESULT

Converting from floating-point to decimal with floating-point computations

不羁岁月 submitted on 2019-12-22 12:07:28
问题 Question: I am trying to convert a floating-point double-precision value x to decimal with 12 (correctly rounded) significant digits. I am assuming that x is between 10^110 and 10^111, so that its decimal representation will be of the form x.xxxxxxxxxxxE110. And, just for fun, I am trying to use floating-point arithmetic only. I arrived at the pseudo-code below, where all operations are double-precision operations. The notation 1e98 is for the double nearest to the mathematical 10^98, and 1e98_2 is

Does a floating-point reciprocal always round-trip?

五迷三道 submitted on 2019-12-22 09:23:35
问题 Question: For IEEE-754 arithmetic, is there a guarantee of 0 or 1 units in the last place of accuracy for reciprocals? From that, is there a guaranteed error bound on the reciprocal of a reciprocal?
回答1: Answer 1: [Everything below assumes a fixed IEEE 754 binary format, with some form of round-to-nearest as the rounding mode.] Since the reciprocal (computed as 1/x) is a basic arithmetic operation, 1 is exactly representable, and the arithmetic operations are guaranteed correctly rounded by the standard, the