Question
I have learnt how to convert numbers to floating point (on top of binary, octal and hexadecimal).
However, while looking through a worksheet I have been given, I have encountered the following question:
Using 32-bit IEEE 754 single precision floating point show the representation of -12.13 in Hexadecimal.
I have tried looking at the resources I have and still can't figure out how to answer the above. The answer given is 0xc142147b.
Edit: Sorry for not clarifying but I wanted to know how to get this done by hand instead of coding it.
Answer 1:
-12.13 must be converted to binary and then to hex. Let's do that more or less the way the glibc library does it, using just pen and paper and the Windows calculator.
Remove the sign, but remember we had one: 12.13
Significand (or mantissa)
The integer part, 12, is easy: C (hex).
The fractional part, 0.13, is a little trickier. 0.13 is 13/100. I use the Windows calculator (Programmer mode, hex) and shift 13 (hex D) by 32(*) bits to the left: D00000000. Divide that by 100 (hex 64) to get: 2147AE14 hex.
Since we need a value below 1, we shift right by 32 bits again, and get: 0.2147AE14
Now add the integer part on the left: C.2147AE14
We only need 24 bits for the mantissa: the 4 bits of C plus 20 fractional bits, i.e. five hex digits after the point. The dropped digits, E14, are more than half a unit (800), so 2147A rounds up to 2147B: C.2147B --> C2147B
Now this must be normalized, so the binary point is moved 3 bits to the left (but the bits remain the same, of course). The exponent (originally 0) is raised accordingly, by 3, so now it is 3.
The hidden bit can now be removed: 42147B (now the 23 low bits)
This can be turned into a 32 bit value for now: 0x0042147B
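The calculator steps above can be mirrored in integer arithmetic. The following is only a sketch: the helper names fraction_bits32 and significand24 are made up for illustration, it only covers integer parts from 8 to 15 (4 bits wide, like 12), and it rounds half up from a truncated quotient rather than implementing full round-to-nearest-even. A carry out of the top bit (a value just below 16) is not handled either.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the first 32 fractional bits of num/den, i.e.
   floor(num * 2^32 / den). Requires num < den so the value is below 1. */
uint32_t fraction_bits32(uint32_t num, uint32_t den)
{
    assert(den != 0 && num < den);
    return (uint32_t)(((uint64_t)num << 32) / den);
}

/* Hypothetical helper: 24-bit significand of int_part + num/den.
   Only covers 8 <= int_part <= 15, where the integer part is exactly
   4 bits wide, as it is for 12. */
uint32_t significand24(uint32_t int_part, uint32_t num, uint32_t den)
{
    assert(int_part >= 8 && int_part <= 15);
    /* 4 integer bits followed by 32 fractional bits: C.2147AE14 for 12.13 */
    uint64_t fixed = ((uint64_t)int_part << 32) | fraction_bits32(num, den);
    /* keep the top 24 of the 36 bits: add half a unit, then drop 12 bits */
    return (uint32_t)((fixed + 0x800u) >> 12);
}
```

For 12.13, fraction_bits32(13, 100) gives 0x2147AE14 and significand24(12, 13, 100) gives 0xC2147B; masking that with 0x7FFFFF removes the hidden bit and leaves 0x42147B.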
Exponent and sign
Now let's take on the exponent: 3 + bias of hex 7F = hex 82, or 1000 0010 binary.
Add the sign bit on the left: 1 1000 0010. Regrouped: 1100 0001 0, or C10 when padded with zeros to whole hex digits.
Of course these are the top bits, so we turn that into 0xC1000000 for the full 32 bits.
"Bitwise-Or" both parts
0xC1000000 | 0x0042147B = 0xC142147B
And that is the value you want.
(*) I use 32 bits so that I have more than enough bits to round properly later on.
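The final assembly (drop the hidden bit, add the bias, put the sign on top, OR everything together) can be written down as a small function. This is a sketch for normal numbers only, and pack_binary32 is a hypothetical name, not something from glibc.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the 32-bit pattern from the sign (0 or 1),
   the unbiased exponent and the 24-bit significand (hidden bit included).
   Normal numbers only. */
uint32_t pack_binary32(uint32_t sign, int exponent, uint32_t significand)
{
    assert(sign <= 1);
    assert(exponent >= -126 && exponent <= 127);
    assert((significand >> 23) == 1);              /* hidden bit set, nothing above it */
    uint32_t fraction = significand & 0x7FFFFFu;   /* drop the hidden bit */
    uint32_t biased = (uint32_t)(exponent + 127);  /* add the bias, hex 7F */
    return (sign << 31) | (biased << 23) | fraction;
}
```

With the values worked out by hand, pack_binary32(1, 3, 0xC2147B) returns 0xC142147B.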
Answer 2:
To encode a floating-point number, we must rewrite it as (-1)^s × 2^e × 1.m and encode the different parts in 32 bits as follows
(from https://en.wikipedia.org/wiki/Single-precision_floating-point_format)
First bit is the sign s: 0 for + and 1 for -
8 following bits are the shifted exponent e+127
23 last bits are the fractional part of the mantissa (m)
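To make the layout concrete, here is a small sketch that splits a 32-bit pattern into these three fields and rebuilds the value from (-1)^s × 2^e × 1.m. The function names are made up for this example, and decode_binary32 only handles normal numbers (biased exponent 1 to 254).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: pull the three fields out of a 32-bit pattern. */
uint32_t sign_field(uint32_t bits)     { return bits >> 31; }
uint32_t exponent_field(uint32_t bits) { return (bits >> 23) & 0xFFu; }
uint32_t fraction_field(uint32_t bits) { return bits & 0x7FFFFFu; }

/* Hypothetical helper: value of a normal number, (-1)^s * 2^e * 1.m */
double decode_binary32(uint32_t bits)
{
    uint32_t biased = exponent_field(bits);
    assert(biased >= 1 && biased <= 254);          /* normal numbers only */
    /* 1.m: the hidden 1 plus the fraction scaled by 2^23 = 8388608 */
    double value = 1.0 + fraction_field(bits) / 8388608.0;
    int e = (int)biased - 127;                     /* remove the bias */
    for (; e > 0; e--) value *= 2.0;
    for (; e < 0; e++) value /= 2.0;
    return sign_field(bits) ? -value : value;
}
```

Applied to the worksheet answer 0xc142147b, the fields come out as sign 1, biased exponent 130 and fraction 0x42147b, and the decoded value is within about 1.2e-7 of -12.13 (12.13 itself is not exactly representable).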
The hard part is to convert the mantissa to binary. For some numbers, it is easy. For instance, 5.75 = 4 + 1 + 1/2 + 1/4 = 2^2 + 2^0 + 2^-1 + 2^-2 = 101.11 (binary) = 1.0111 × 2^2
For other numbers (such as yours), it is harder. The solution is to keep multiplying the number by two until the result is an integer or its integer part fills all the bits of the significand (23 stored bits + 1 hidden bit).
We can do that for your number:
12.13 = 12.13 2^-0
= 24.26 2^-1
= 48.52 2^-2
= 97.04 2^-3
= 194.08 2^-4
= 388.16 2^-5
= 776.32 2^-6
= 1552.64 2^-7
= 3105.28 2^-8
= 6210.56 2^-9
= 12421.12 2^-10
= 24842.24 2^-11
= 49684.48 2^-12
= 99368.96 2^-13
= 198737.92 2^-14
= 397475.84 2^-15
= 794951.69 2^-16
= 1589903.38 2^-17
= 3179806.75 2^-18
= 6359613.50 2^-19
= 12719227.00 2^-20
The next iteration would give a number larger than 2^24 (about 16.8 million), so we can stop.
This integer, 12719227, is easy (but a bit long) to convert by hand to binary using the usual methods; in hex it is 0xc2147b. If we take the leading 1 bit, in position 2^23, and put it to the left of the "dot", we have mantissa = 1.42147b × 2^23 (where 42147b is the 23-bit fractional part written in hex). As we had to multiply the initial number by 2^20 to get this value, we finally have
mant = 1.42147b × 2^3
So exponent is 3 and its code is 3+127=130
exp=130d=0x82
and as number is negative
sign=1
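We just have to drop the integer part of the mantissa (the hidden bit) and concatenate these fields: 1 | 1000 0010 | 100 0010 0001 0100 0111 1011. Regrouped into hex digits this is 1100 0001 0100 0010 0001 0100 0111 1011, which gives the final value of 0xc142147b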
(Of course, I used a program to generate these numbers. If interested, here is the C code)
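#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Works for normal values with |f| < 2^24 only; zero, larger values,
// infinities and NaN are not handled.
int main(void) {
    float f = -12.13f;
    uint32_t sign = (f < 0.0f);               // 1 if negative, 0 otherwise
    float fmantissa = (f < 0.0f ? -f : f);    // absolute value of f
    int e = 0;                                // number of doublings so far
    printf("%2.2f = %11.2f 2^-%d\n", f, fmantissa, e);
    // double until the leading bit reaches position 2^23
    // (strictly less than: an exact power of two must stop at 2^23)
    while (fmantissa < (1 << 23)) {
        e++;
        fmantissa *= 2.0f;
        printf(" = %11.2f 2^-%d\n", fmantissa, e);
    }
    // convert to an integer (exact, since the value is below 2^24)
    uint32_t mantissa = (uint32_t)fmantissa;
    // and drop the hidden bit
    mantissa &= ~(UINT32_C(1) << 23);
    // biased exponent: the leading bit sits at 2^23 after e doublings
    uint32_t exp = (uint32_t)(127 - e + 23);
    printf("sign: %" PRIu32 " exponent: %" PRIu32 " mantissa: 1.%" PRIx32 "\n", sign, exp, mantissa);
    // final code; unsigned arithmetic avoids shifting a 1 into the sign bit of an int
    uint32_t fltcode = (sign << 31) | (exp << 23) | mantissa;
    printf("0x%" PRIx32 "\n", fltcode);
    return 0;
}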
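As a cross-check on a result worked out by hand (not a replacement for doing it by hand), the bit pattern the compiler actually stores can be read back with memcpy. This sketch assumes that float on the platform is the IEEE 754 32-bit single-precision format, which is common but not guaranteed by the C standard; float_bits is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the raw 32 bits of a float.
   Assumes float is IEEE 754 single precision. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof bits == sizeof f);   /* float must be 32 bits wide */
    memcpy(&bits, &f, sizeof bits);    /* well-defined way to reinterpret the bytes */
    return bits;
}
```

On such a platform, float_bits(-12.13f) should match the 0xc142147b obtained above, and float_bits(5.75f) should be 0x40b80000.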
Source: https://stackoverflow.com/questions/54947861/32-bit-ieee-754-single-precision-floating-point-to-hexadecimal