I have a decent understanding of how floating point works, but I want to know how the specific exponent and mantissa sizes were decided upon. Are they optimal in some way?
For 32-bit IEEE floats, the reasoning is that the precision should be at least as good as 24-bit fixed point.
Why exactly 24 bits, I don't know, but it seems like a reasonable tradeoff.
I suppose having a nice "round" split like that (23-bit mantissa + 1 sign bit = 3 bytes, 8-bit exponent = 1 byte) can also make implementations more efficient.
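If you want to see that 1 + 8 + 23 split for yourself, here is a rough Python sketch. The helper names `float32_fields` and `round_to_float32` are just made up for illustration; it only relies on the standard `struct` module to round a Python float to IEEE 754 single precision. The second part shows where the 24 bits of precision (23 stored bits plus the implicit leading 1) run out.

```python
import struct


def float32_fields(x):
    """Round x to single precision and return (sign, exponent, fraction).

    Hypothetical helper for illustration: the fields are the raw stored
    bits, so the exponent is still biased by 127 and the fraction omits
    the implicit leading 1.
    """
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits
    fraction = bits & 0x7FFFFF       # 23 bits
    return sign, exponent, fraction


def round_to_float32(x):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


print(float32_fields(1.0))        # (0, 127, 0)
print(float32_fields(-0.15625))   # (1, 124, 2097152)

# Every integer up to 2**24 is exact, but 2**24 + 1 needs a 25th bit.
print(round_to_float32(2.0**24))      # 16777216.0
print(round_to_float32(2.0**24 + 1))  # 16777216.0
```

So integers stay exact up to 2^24, which is the "as good as 24 bits" part, and beyond that the spacing between representable values grows.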
According to this interview with Will Kahan, they were based on the VAX F and G formats of the era.
Of course that doesn't answer the question of how those formats were chosen...