ieee-754

how to convert floating-point number to IEEE 754 using assembly

随声附和 submitted on 2019-12-02 13:10:47
Can you please help me convert a floating-point number to IEEE 754 using assembly? I have the number -1.75, and I know it equals -1.11000000000000000000000 E+0 in IEEE 754, but I don't know how to do the conversion in assembly.

Did you mean something like this:

    ; Conversion of an ASCII-coded decimal rational number (DecStr)
    ; to an ASCII-coded decimal binary number (BinStr) as 32-bit single (IEEE 754)

    include \masm32\include\masm32rt.inc ; MASM32 headers, mainly for printf

    .data
    DecStr db "-1.75",0
    BinStr db 40 dup (0)
    result REAL4 ?
    number dd 0
    frac dd 0
    divisor dd 1

    .code
    main PROC
    mov edi,
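
When checking a routine like this, it helps to know the bit pattern it should produce. The following is a minimal JavaScript sketch, not the MASM approach from the answer; `float32Fields` is a hypothetical helper name. It splits -1.75 into its sign, biased exponent, and fraction fields:

```javascript
// Hypothetical helper: split a number, rounded to single precision,
// into its three IEEE 754 binary32 fields.
function float32Fields(value) {
  const view = new DataView(new ArrayBuffer(4));
  view.setFloat32(0, value);          // DataView is big-endian by default
  const bits = view.getUint32(0);     // the same 4 bytes read back as an integer
  return {
    hex: bits.toString(16).padStart(8, '0'),
    sign: bits >>> 31,                // 1 bit
    exponent: (bits >>> 23) & 0xff,   // 8 bits, biased by 127
    fraction: bits & 0x7fffff,        // 23 bits, the implicit leading 1 is not stored
  };
}

const fields = float32Fields(-1.75);
console.log(fields.hex);                                      // bfe00000
console.log(fields.sign, fields.exponent - 127);              // 1 0
console.log(fields.fraction.toString(2).padStart(23, '0'));   // 11000000000000000000000
```

The fraction printed on the last line matches the digits after the binary point in -1.11000000000000000000000 E+0.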

How to apply bitwise operations to the actual IEEE 754 representation of JS Numbers?

主宰稳场 submitted on 2019-12-02 12:16:30
Question: In JavaScript, whenever you perform a bitwise operation such as x << 2, the 64-bit float representation gets converted to a 32-bit signed integer before the shifting actually occurs. I am interested in applying the shift to the actual, unaltered IEEE 754 bitwise representation. How is that possible? Answer 1: You might try converting the JS Number to bytes/integers first and shifting the result yourself. Using the TypedArray features available in recent versions of major browsers: var f = new Float64Array

Convert ieee754 to decimal in node

∥☆過路亽.° submitted on 2019-12-02 11:19:21
Question: I have a buffer in node, <Buffer 42 d9 00 00>, that is supposed to represent the decimal 108.5. I am using this module to try to decode the buffer: https://github.com/feross/ieee754. ieee754.read = function (buffer, offset, isLE, mLen, nBytes) The arguments mean the following: buffer = the buffer; offset = offset into the buffer; value = value to set (only for write); isLE = is little endian?; mLen = mantissa length; nBytes = number of bytes. I try to read the value: ieee754.read(buffer, 0, false,

Why is Infinity × 0 = NaN?

故事扮演 submitted on 2019-12-02 09:36:13
Question: IEEE 754 specifies the result of 1 / 0 as ∞ (Infinity). However, IEEE 754 then specifies the result of 0 × ∞ as NaN. This feels counter-intuitive: why is 0 × ∞ not 0?
1. We can think of 1 / 0 = ∞ as the limit of 1 / z as z tends to zero.
2. We can think of 0 × ∞ = 0 as the limit of 0 × z as z tends to ∞.
Why does the IEEE standard follow intuition 1 but not 2?
Answer 1: It is easier to understand the behavior of IEEE 754 floating point zeros and infinities if you do not think of them as being
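
One way to see why 0 × ∞ has no single sensible value: products in which one factor tends to 0 and the other tends to ∞ can approach different limits. A small JavaScript sketch follows; the sample sequences are illustrative choices, not taken from the answer.

```javascript
// IEEE 754 results, as exposed by JavaScript Numbers (binary64).
console.log(1 / 0);          // Infinity
console.log(0 * Infinity);   // NaN

// Three products of the form (something -> 0) * (something -> Infinity).
// Each heads to a different limit as n grows, so no single value fits "0 * Infinity".
const n = 2 ** 40;                         // a power of two keeps every step exact
const towardZero = (1 / n ** 2) * n;       // equals 2^-40, shrinking as n grows
const towardOne = (1 / n) * n;             // equals 1 for every n
const towardInfinity = (1 / n) * n ** 2;   // equals 2^40, growing as n grows
console.log(towardZero, towardOne, towardInfinity);
```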

Convert MBF Single and Double to IEEE

痴心易碎 submitted on 2019-12-02 08:10:52
Question: Follow-up available: there is a follow-up with further details, see Convert MBF to IEEE. I've got some legacy data which is still in use. Reading the binary files is not the problem; the number format is. All floating-point numbers are saved in MBF format (Single and Double). I've found a topic about that on the MSDN boards, but it only deals with Single values. I'd also like to stay away from API calls as far as I can. Does anyone have a solution for Doubles? Edit: Just in case

IEEE-754: cardinality of the set of rational numbers

痞子三分冷 submitted on 2019-12-02 08:08:14
Question: What is the cardinality of the set of rational numbers that have an exact representation in a floating-point format compatible with single-precision IEEE 754? Answer 1: There are 2139095039 finite positive floats. There are as many finite negative floats. Do you want to include +0.0 and -0.0 as two items or as one? Depending on the answer, the total is 2 * 2139095039 + 2 or 2 * 2139095039 + 1, that is, respectively, 4278190080 or 4278190079. Source for the 2139095039 number: #include <float.h>
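
The count can be cross-checked without C. Positive floats map to consecutive 32-bit patterns, so the pattern of the largest finite float is also the number of finite positive floats. A JavaScript sketch of that check (this is not the C source the answer refers to, and the variable names are illustrative):

```javascript
const view = new DataView(new ArrayBuffer(4));

// 0x7f800000 is +Infinity; the pattern just below it is the largest finite float.
view.setUint32(0, 0x7f800000);
console.log(view.getFloat32(0));                    // Infinity

const largestFinitePattern = 0x7f800000 - 1;        // 0x7f7fffff
view.setUint32(0, largestFinitePattern);
console.log(Number.isFinite(view.getFloat32(0)));   // true

// Patterns 0x00000001 .. 0x7f7fffff are the finite positive floats
// (subnormals included); 0x00000000 is +0.0.
const finitePositive = largestFinitePattern;
console.log(finitePositive);           // 2139095039
console.log(2 * finitePositive + 2);   // 4278190080 (+0.0 and -0.0 counted separately)
console.log(2 * finitePositive + 1);   // 4278190079 (the two zeros counted once)
```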

Convert ieee754 to decimal in node

删除回忆录丶 submitted on 2019-12-02 07:49:31
I have a buffer in node, <Buffer 42 d9 00 00>, that is supposed to represent the decimal 108.5. I am using this module to try to decode the buffer: https://github.com/feross/ieee754. ieee754.read = function (buffer, offset, isLE, mLen, nBytes) The arguments mean the following: buffer = the buffer; offset = offset into the buffer; value = value to set (only for write); isLE = is little endian?; mLen = mantissa length; nBytes = number of bytes. I try to read the value: ieee754.read(buffer, 0, false, 5832704, 4) but am not getting the expected result. I think I am calling the function correctly,
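
In the `ieee754` module, `mLen` is the width of the mantissa in bits, so a 4-byte single-precision value would be read with `mLen = 23` rather than 5832704 (a double would use `mLen = 52` and `nBytes = 8`). The sketch below confirms the expected value with the built-in `DataView` instead of the module, so it runs without installing anything. Treat it as a cross-check, not as the module's own code.

```javascript
// The four bytes from the question, in the order they appear in the buffer.
const bytes = Uint8Array.of(0x42, 0xd9, 0x00, 0x00);
const view = new DataView(bytes.buffer);

// Second argument false = big-endian, matching isLE = false in the question.
const value = view.getFloat32(0, false);
console.log(value);   // 108.5

// With the ieee754 module, the equivalent call would presumably be:
//   ieee754.read(buffer, 0, false, 23, 4)
```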

Easiest way to convert a decimal float to bit representation manually based on IEEE 754, without using any library

ⅰ亾dé卋堺 submitted on 2019-12-02 05:54:12
Question: I know there are a number of ways to read every bit of an IEEE 754 float using existing libraries. I don't want that; I want to be able to manually convert a decimal float to its binary representation based on IEEE 754. I understand how IEEE 754 works and I am just trying to apply it. I am asking here to see whether my approach is normal or stupid, and I am also wondering how a PC does it quickly. If I am given a decimal float in a string, I need to figure out what the E is and what the M
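
A manual conversion can be sketched with plain arithmetic: normalize the magnitude into [1, 2) to find the exponent, then peel off fraction bits by repeated doubling. The JavaScript below is one possible approach under stated assumptions, not necessarily the asker's method: the string has already been parsed to a number, the value is in the normal single-precision range, and extra fraction bits are truncated rather than rounded to nearest. `toFloat32BitString` is a hypothetical name.

```javascript
// Sketch: build the "sign exponent mantissa" bit string of a binary32 value
// by hand. Assumes a finite value in the normal range; truncates instead of
// rounding, so it matches hardware only when the value fits in 23 fraction bits.
function toFloat32BitString(value) {
  const sign = (value < 0 || Object.is(value, -0)) ? 1 : 0;
  let magnitude = Math.abs(value);
  if (magnitude === 0) {
    return `${sign} ${'0'.repeat(8)} ${'0'.repeat(23)}`;
  }

  // Normalize to 1.xxxx * 2^e (scaling by 2 is exact in binary floating point).
  let e = 0;
  while (magnitude >= 2) { magnitude /= 2; e += 1; }
  while (magnitude < 1) { magnitude *= 2; e -= 1; }

  // Extract 23 fraction bits by repeated doubling.
  let fraction = magnitude - 1;
  let mantissa = '';
  for (let i = 0; i < 23; i++) {
    fraction *= 2;
    if (fraction >= 1) { mantissa += '1'; fraction -= 1; } else { mantissa += '0'; }
  }

  const exponent = (e + 127).toString(2).padStart(8, '0');   // bias of 127
  return `${sign} ${exponent} ${mantissa}`;
}

console.log(toFloat32BitString(-1.75));   // 1 01111111 11000000000000000000000
console.log(toFloat32BitString(108.5));   // 0 10000101 10110010000000000000000
```

Hardware and standard libraries do the same job with round-to-nearest and special handling for subnormals, infinities, and NaN, which this sketch leaves out.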

Convert MBF Double to IEEE

拈花ヽ惹草 submitted on 2019-12-02 04:57:28
I found the topic below on converting MBF to IEEE: Convert MBF Single and Double to IEEE. Can anyone explain what the marked lines of code below do?

    Dim sign As Byte = mbf(6) And ToByte(&H80) 'What is the reason for AND (&H80)?
    Dim exp As Int16 = mbf(7) - 128S - 1S + 1023S 'Why is 1152 (128+1+1023)?
    ieee(7) = ieee(7) Or sign 'Why not just save sign to ieee(7)?
    ieee(7) = ieee(7) Or ToByte(exp >> 4 And &HFF) 'What is the reason to shift 4?

    Public Shared Function MTID(ByVal src() As Byte, ByVal startIndex As Integer) As Double
        Dim mbf(7) As Byte
        Dim ieee(7) As Byte
        Array.Copy(src, startIndex,
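
The marked lines are easier to follow next to a whole conversion. The JavaScript sketch below assumes the MBF double layout implied by the snippet: exponent in byte 7, sign in the top bit of byte 6, and 55 mantissa bits below it. It is an illustration, not a port of the `MTID` function, and `mbfDoubleToNumber` is a hypothetical name. `And &H80` isolates the sign bit. The expression `- 128 - 1 + 1023` rebiases the exponent, which works out to adding 894. The shift by 4 is needed because the 11-bit IEEE exponent spans the low 7 bits of the top byte and the high 4 bits of the next one.

```javascript
// Sketch: convert 8 MBF double bytes (as stored, lowest mantissa byte first)
// to a JavaScript Number. The 3 lowest mantissa bits are truncated, not rounded.
function mbfDoubleToNumber(mbf) {
  const mbfExponent = mbf[7];
  if (mbfExponent === 0) return 0;         // MBF encodes zero as exponent 0

  const sign = BigInt(mbf[6] >> 7);        // same bit as `mbf(6) And &H80`
  let mantissa = BigInt(mbf[6] & 0x7f);    // top 7 of the 55 mantissa bits
  for (let i = 5; i >= 0; i--) {
    mantissa = (mantissa << 8n) | BigInt(mbf[i]);
  }

  // MBF: 0.1m * 2^(e - 128) = 1.m * 2^(e - 129); the IEEE double bias is 1023.
  const ieeeExponent = BigInt(mbfExponent - 128 - 1 + 1023);

  // IEEE double: 1 sign bit, 11 exponent bits, 52 mantissa bits (55 - 3).
  const bits = (sign << 63n) | (ieeeExponent << 52n) | (mantissa >> 3n);
  const view = new DataView(new ArrayBuffer(8));
  view.setBigUint64(0, bits);              // big-endian write ...
  return view.getFloat64(0);               // ... and matching big-endian read
}

console.log(mbfDoubleToNumber(Uint8Array.of(0, 0, 0, 0, 0, 0, 0x00, 0x81)));   // 1
console.log(mbfDoubleToNumber(Uint8Array.of(0, 0, 0, 0, 0, 0, 0xe0, 0x81)));   // -1.75
```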

How to apply bitwise operations to the actual IEEE 754 representation of JS Numbers?

对着背影说爱祢 submitted on 2019-12-02 04:29:37
In JavaScript, whenever you perform a bitwise operation such as x << 2, the 64-bit float representation gets converted to a 32-bit signed integer before the shifting actually occurs. I am interested in applying the shift to the actual, unaltered IEEE 754 bitwise representation. How is that possible?

You might try converting the JS Number to bytes/integers first and shifting the result yourself. Using the TypedArray features available in recent versions of major browsers:

    var f = new Float64Array( 1 ); // creating typed array to contain single 64-bit IEEE754
    f.set( [ 1.0 ], 0 ); // transferring
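
In engines that support BigInt, the same idea can be sketched more directly by viewing the 8 bytes as a single 64-bit integer. This is an alternative to the byte-wise approach in the answer, not a completion of its truncated code, and the variable names are illustrative:

```javascript
const floats = new Float64Array(1);
const raw = new BigUint64Array(floats.buffer);   // the same 8 bytes, viewed as one integer

floats[0] = 1.0;
console.log(raw[0].toString(16));   // 3ff0000000000000

// Shift the raw pattern left by 2; assignment keeps only the low 64 bits.
raw[0] = raw[0] << 2n;
console.log(raw[0].toString(16));   // ffc0000000000000

// The shifted pattern has sign 1, exponent field 0x7fc, fraction 0,
// which decodes to -(2 ** 1021) rather than 1.0 * 4.
console.log(floats[0] === -(2 ** 1021));   // true
```

Shifting the raw pattern moves bits across the sign, exponent, and fraction fields, so the result is generally unrelated to multiplying the number by a power of two.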