问题
Consider the following example. There is an image where user can select rectangular area (part of it). The image is displayed with some scale. Then we change the scale and we need to recalculate the new coordinates of selection. Let's take width,
newSelectionWidth = round(oldSelectionWidth / oldScale * newScale)
where oldScale = oldDisplayImageWidth / realImageWidth
, newScale = newDisplayImageWidth / realImageWidth
, all the values except for scales are integers.
The question is how to prove that newSelectionWidth = newDisplayImageWidth
given oldSelectionWidth = oldDisplayImageWidth
for any value of oldDisplayImageWidth
, newDisplayImageWidth
, realImageWidth
? Or under what conditions this doesn't hold?
I was thinking about the answer too and this is what I've come up with, may be inaccurate and/or incomplete.
All numbers in JavaScript are double-precision numbers. Generally, this gives us maximum error of about 10-16 (machine epsilon). This means in order to have error of 0.5 or more, (1) we need to perform 0.5 / 10-16 = 5·1015 operations. The other source of error is calculation with too big (|value| > 1.7976931348623157·10308) or too low numbers (|value| < 2.2250738585072014·10-308) (link). This means (2) if somewhere in the course of calculation we get too big or too low number, e.g. because oldDisplayImageWidth / realImageWidth > 1.7976931348623157·10308 or the like, then the error might exceed 0.5. Granted we're talking about displaying images on today's monitors, all these conditions are extremely unlikely.
回答1:
If newDisplayWidth is less than 1125899906842624 and the other integers are positive and do not exceed 53 bits, then newSelectionWidth
equals newDisplayWidth
. A proof follows.
Notation:
- I will use the term
double
to name the floating-point type being used, IEEE-754 64-bit binary. - Text in
code
style represents computed values, while plain text represents mathematical values. Thus 1/3 is exactly one-third, while1./3.
is the result of dividing 1 by 3 in floating-point arithmetic.
I assume:
- The widths are positive integers not wider than the
double
significand (53 bits). - The divisions
oldDisplayImageWidth / realImageWidth
andnewDisplayImageWidth / realImageWidth
are performed indouble
arithmetic with the operands converted todouble
.
The limits on the integers assures that conversion to double
is exact and that overflow and underflow are not encountered during the operations used in this problem.
Consider oldScale
, which is a double
set to oldDisplayImageWidth / realImageWidth
. The maximum error in a single floating-point operation in round-to-nearest mode is half an ULP (because every mathematical number is no farther than half an ULP from a representable number). Thus, oldScale
equals oldDisplayImageWidth / realImageWidth • (1+e0), where e0 represents the relative error and is at most half a double
epsilon. (The double
epsilon is 2-52, so |e0| ≤ 2-53.)
Similarly, newScale
is newDisplayImageWidth / realImageWidth • (1+e1), where e1 is some error that is at most 2-53.
Then oldSelectionWidth / oldScale
is oldSelectionWidth / oldScale
• (1+e2), again for some e2 ≤ 2-53, and oldSelectionWidth / oldScale * newScale
is oldSelectionWidth / oldScale
• (1+e2) • newScale
• (1+oldSelectionWidth / oldScale
• (1+e3). Note that this is the argument passed to round
.
Now substitute the expressions we have for oldScale
and newScale
. This yields oldSelectionWidth / (oldDisplayImageWidth / realImageWidth • (1+e0)) • (1+e2) • (newDisplayImageWidth / realImageWidth • (1+e1)) • (1+e3). The realImageWidth terms cancel, and we can rearrange the others to produce oldSelectionWidth • newDisplayImageWidth / oldDisplayImageWidth • (1+e1) • (1+e2) • (1+e3) / (1+e0).
We are given that oldSelectionWidth equals oldDisplayImageWidth, so those cancel, and the argument to round
is exactly: newDisplayImageWidth • (1+e1) • (1+e2) • (1+e3) / (1+e0).
Consider the combined error terms minus one (this is the relative error in the final value): (1+e1) • (1+e2) • (1+e3) / (1+e0) – 1. This expression has greatest magnitude when e0 is –2-53 and the others are +2-53. Then it is slightly greater than 2 ULP (at most 324518553658426753804753784799233 / 730750818665451377972204001751459814038961127424). If newDisplayImageWidth is less than 1125899906842624, then newDisplayImageWidth times this relative error is less than ½. Therefore, newDisplayImageWidth • (1+e1) • (1+e2) • (1+e3) / (1+e0) would be within ½ of newDisplayImageWidth.
Since newDisplayImageWidth is an integer, if the argument to round
is within ½ of newDisplayWidth, then the result is newDisplayWidth.
Therefore, if newDisplayWidth is less than 1125899906842624, then newSelectionWidth
equals newDisplayWidth
.
(The above proves that 1125899906842624 is a sufficient limit, but it may not be necessary. A more involved analysis may be able to prove that certain combinations of errors are impossible, so the maximum combined error is less than used above. This would relax the limit, allowing larger values of newDisplayWidth.)
回答2:
You're mixing up absolute and relative error. Assuming a relative error of 10^-16 you end up with a maximum relative error of 4 * 10^-16 after the 4 operations in your example. You want an absolute error < 0.5, so you're fine, as long as newSelectionWidth * 4 * 10^-16 < 0.5
.
来源:https://stackoverflow.com/questions/16862538/how-to-weigh-up-calculation-error