I know that x87 has higher internal precision, which is probably the biggest difference that people see between it and SSE operations. But I have to wonder, is there any other
Conversion between float and double is faster with x87 (usually free) than with SSE. With x87, you can load and store a float, double or long double to or from the register stack, and it is converted to or from extended precision without extra cost. With SSE, additional instructions are required to do the type conversion if types are mixed, because the registers contain float or double values. These conversion instructions are fairly fast but do take extra time.
Of course, the real fix is to avoid mixing float and double excessively, rather than to switch to x87.
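As a rough illustration of what "mixing" means here, consider the sketch below. The function names sum_mixed and sum_uniform are made up for this example, and the exact instructions emitted depend on the compiler and flags. The first loop widens every float element to double before adding, which with SSE2 code generation typically costs a conversion instruction (such as cvtss2sd) per element; the second keeps the data and the accumulator in the same type, so no conversion is needed inside the loop.

```c
#include <stddef.h>

/* Mixed types: each float element must be widened to double before the add.
   With SSE2 code generation a compiler would typically emit a conversion
   per element; with x87 the widening happens as part of the load. */
double sum_mixed(const float *values, size_t count)
{
    double total = 0.0;
    for (size_t i = 0; i < count; ++i)
        total += values[i];   /* implicit float -> double conversion */
    return total;
}

/* Uniform types: accumulator and data are both float,
   so the loop body needs no type conversion. */
float sum_uniform(const float *values, size_t count)
{
    float total = 0.0f;
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}
```

Comparing the generated assembly for both functions (for example with -S, once with SSE2 and once with x87 code generation) should show where the extra conversion appears. Note that the two functions are not numerically equivalent in general, since the double accumulator carries more precision.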