IEEE 754 Bit manipulation Rounding Error

Question

Without using casts or library functions, I must convert an integer to a float using bit manipulation. Below is the code I am currently working on. It is based on code that I found in "Cast Integer to Float using Bit Manipulation breaks on some integers in C". The problem I have run into involves the rounding rules in IEEE 754. More specifically, my code rounds toward 0, but it should round to nearest, with ties going to even. What changes do I need to make?

unsigned inttofloat(int x) {
    int bias = 127;
    int man;
    int exp = bias + 31; //8-bit exp
    int count = 0;
    int tmin = 1 << 31;
    int manpattern = 0x7FFFFF;

    int sign = 0;

    if (x == 0){
        return 0;
    }
    else if (x == tmin){
        return 0xcf << 24;
    }

    if (x < 0) {
        sign = tmin;
        x = ~x + 1; // negates x so that it holds the positive magnitude, which we normalize below.
    }

    while((x & tmin) == 0){
        exp--;
        x <<= 1;
        count++;
    }

    exp <<= 23;
    man = (x >> 8) & manpattern;

    return (sign | exp | man);
}

Answer 1:

To round to nearest, ties to even, replace (x >> 8) with:

unsigned u = x;  // avoid any potential signed shifting issues
unsigned least_significant_bit = (u >> 8) & 1;
unsigned round_bit = (u >> 7) & 1; // Most significant bit shifted out
unsigned sticky_bit_flag = !!(u & 0x7F);  // Set if any of the other shifted-out bits is 1

// OP's shifted answer.
u = (u >> 8);

// round away from zero if more than half-way, or
//  if exactly half-way and the kept value is odd
u += (round_bit & sticky_bit_flag) | (round_bit & least_significant_bit);

Simplifying this expression is left to the OP.

Note that when the rounding step adds 1 to u, the carry may propagate all the way through the mantissa and require an exponent increase.
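To make the carry case concrete, here is a minimal sketch of how the whole conversion might look with the rounding step folded in. It is not the OP's final code: the name inttofloat_rne is made up for illustration, and the sketch assumes that int and unsigned are both 32 bits wide. It does the shifting in unsigned, uses the simplified increment round_bit & (sticky | lsb), and adds the mantissa to the shifted exponent instead of OR-ing it, so a mantissa that rounds up to 0x800000 bumps the exponent by one.

```c
/* Sketch only: int -> IEEE 754 single-precision bit pattern,
   rounding to nearest with ties to even.
   Assumes 32-bit int and 32-bit unsigned; inttofloat_rne is a hypothetical name. */
unsigned inttofloat_rne(int x) {
    unsigned sign = 0;
    unsigned u = x;             /* implicit conversion, no cast; wraps modulo 2^32 */
    unsigned exp = 127 + 31;    /* biased exponent when the leading 1 sits at bit 31 */
    unsigned man, lsb, round_bit, sticky;

    if (x == 0) {
        return 0;
    }
    if (x < 0) {
        sign = 0x80000000u;
        u = ~u + 1;             /* magnitude; the most negative int becomes 0x80000000 */
    }

    /* Normalize: move the leading 1 up to bit 31, adjusting the exponent. */
    while ((u & 0x80000000u) == 0) {
        u <<= 1;
        exp--;
    }

    lsb = (u >> 8) & 1;         /* lowest bit that is kept */
    round_bit = (u >> 7) & 1;   /* highest bit that is shifted out */
    sticky = (u & 0x7F) != 0;   /* 1 if any lower shifted-out bit is set */

    man = (u >> 8) & 0x7FFFFF;  /* drop the implicit leading 1 */
    man += round_bit & (sticky | lsb);

    /* If man rounded up to 0x800000, the addition carries into the
       exponent field and leaves the mantissa field at zero. */
    return sign | ((exp << 23) + man);
}
```

With this sketch, 16777217 (2^24 + 1) rounds down to 0x4B800000, 16777219 (2^24 + 3) is an exact tie and rounds up to the even mantissa 0x4B800002, and 0x7FFFFFFF carries into the exponent to give 0x4F000000. Because the most negative int negates to 0x80000000 in unsigned arithmetic, it needs no special case here.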

Source: https://stackoverflow.com/questions/42031305/ieee-754-bit-manipulation-rounding-error

Tags

bit-manipulation

ieee-754