Why and how does Python truncate numerical data?

Question

I am dealing with two variables here, and I am confused because their values seem to change (they lose precision) when I want to send them as URL parameters exactly as they are.

Here is the scenario, reproduced in the Python interpreter:

>>> lat = 0.33245794180134
>>> long = 32.57355093956
>>> lat
0.33245794180133997
>>> long
32.57355093956
>>> nl = str(lat)
>>> nl
'0.332457941801'
>>> nlo = str(long)
>>> nlo
'32.5735509396'

So what is happening? And how can I ensure that when I serialize lat and long to strings and send them as part of a URL's query string, I don't lose any of their precision?

To clarify the situation:

The data originally comes to my module as floats (in a collection) from another module that creates them from calculations.
Precision is a sensitive issue here because this data is being used to do tracking and monitoring of sorts and wrong values might cause false-positives or unnecessary alarms.
There is no way to send the data over to the target engine (which listens via a RESTful API) without serializing the data to strings (so I can place them in a query string as parameters).

So what I needed was the best way to transform floats into strings with minimal loss of precision / information.

Answer 1:

In general if you use '%.14f' % lat, you are LOSING PRECISION.

To get full precision from a float, use repr().

Example:

>>> lat = 1/3.
>>> lat
0.3333333333333333
>>> str(lat).count('3')
12
>>> ('%.14f' % lat).count('3')
14
>>> repr(lat).count('3')
16
>>>
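
A round-trip check makes the same point more directly than counting digits. This is only a sketch, assuming Python 3, where str() and repr() of a float agree; the '%.12g' format is used here as a stand-in for the 12 significant digits that str() produced on old Pythons.

```python
lat = 1 / 3.0

short = '%.12g' % lat   # 12 significant digits, like str() on old Pythons
fixed = '%.14f' % lat   # 14 digits after the decimal point
full = repr(lat)        # enough digits to identify this float uniquely

# Only the repr() string converts back to exactly the same float.
print(float(short) == lat)  # False
print(float(fixed) == lat)  # False
print(float(full) == lat)   # True
```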

By the way, you are using an old Python.

>>> 0.33245794180134 == 0.33245794180133997
True
>>>

Pythons before 2.7 produce repr(a_float) by using 17 significant decimal digits because that will guarantee that float(repr(a_float)) == a_float. The new method (Python 2.7 and 3.1 onwards) is to use the smallest number of digits that will provide the same guarantee. The change is described in the "What's New in Python 2.7" notes; search there for repr().
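
To see that the two representations name the same float, one can compare them side by side. A minimal sketch, assuming a Python with the newer repr() (2.7 or 3.1+); '%.17g' is used to imitate the 17-digit output of the older repr().

```python
x = 0.33245794180134

seventeen = '%.17g' % x   # 17 significant digits, as the old repr() printed
shortest = repr(x)        # shortest string that still round-trips

print(seventeen)          # 0.33245794180133997
print(shortest)           # 0.33245794180134

# Both strings parse back to the very same float.
print(float(seventeen) == float(shortest) == x)  # True
```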

If you are getting those numbers from an external source, then you could be losing precision by converting them to floats and then serialising them with 14 decimal digits of precision.

If you are getting those numbers by calculation, then you could be losing precision by serialising them with 14 decimal digits of precision.

Summary: in general, if you use '%.14f' % lat, YOU are losing precision. Not Python, not floating-point arithmetic: it's you.
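
Applied to the question's use case, the advice might look like the sketch below. It assumes Python 3 (urllib.parse); the parameter names 'lat' and 'long' are hypothetical, and parse_qs() merely simulates whatever the receiving API does with the query string.

```python
from urllib.parse import urlencode, parse_qs

lat = 0.33245794180134
lng = 32.57355093956

# repr() keeps every digit needed to rebuild exactly the same float.
query = urlencode({'lat': repr(lat), 'long': repr(lng)})
print(query)  # lat=0.33245794180134&long=32.57355093956

# Simulate the receiving side turning the strings back into floats.
received = parse_qs(query)
print(float(received['lat'][0]) == lat)    # True
print(float(received['long'][0]) == lng)   # True
```

If the receiver parses the values with a correctly rounded string-to-float conversion, as Python's float() does, it ends up with bit-identical doubles.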

Answer 2:

You can try using string formatting to get the desired precision.

>>> lat = 0.33245794180134
>>> lat
0.33245794180134
>>> "%.14f" % lat
'0.33245794180134'
>>>

Edit to incorporate comments:

>>> '{0:.14f}'.format(lat)
'0.33245794180134'
>>>
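
One caveat with a fixed format such as '.14f': it counts digits after the decimal point, not significant digits, so small values lose most of their information. A short illustration with a made-up value (the variable name tiny is hypothetical):

```python
tiny = 1.2345678901234567e-10

fixed = '{0:.14f}'.format(tiny)
print(fixed)                       # 0.00000000012346

# Only five significant digits survived, so the round trip fails.
print(float(fixed) == tiny)        # False
print(float(repr(tiny)) == tiny)   # True
```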

Answer 3:

str is for human-readable representations. It rarely produces something that's equivalent or similar to an expression that produces the value fed to it. repr, on the other hand, is explicitly for that. In fact, it's what the REPL uses to give feedback about the results of expressions.

Note though that floats are still of finite precision and can't represent certain numbers exactly, regardless of how you serialize them to strings.
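
The finite precision can be made visible by converting a float to Decimal, which shows the exact binary value that is stored. A small sketch, assuming Python 2.7 / 3.2 or later, where Decimal accepts a float directly:

```python
from decimal import Decimal

x = 0.1

print(repr(x))      # 0.1
print(Decimal(x))   # 0.1000000000000000055511151231257827021181583404541015625

# The stored value is close to, but not exactly, one tenth.
print(Decimal(x) == Decimal('0.1'))  # False
```

repr() still round-trips here: '0.1' is simply the shortest string that maps back to this particular float.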

Answer 4:

The Decimal type from the Python standard library's decimal module is definitely what you want. It keeps up to 28 significant digits of precision by default (this is configurable) and doesn't force numbers into a binary floating point representation. The Decimal type also allows arithmetic with integers without explicit conversion; mixing a Decimal with a float in arithmetic raises TypeError, so floats have to be converted explicitly first.

Your example converted to Decimal:

>>> import decimal
>>> lat = decimal.Decimal(repr(0.33245794180134))
>>> long = decimal.Decimal(repr(32.57355093956))
>>> lat
Decimal('0.33245794180134')
>>> long
Decimal('32.57355093956')
>>> str(lat)
'0.33245794180134'
>>> str(long)
'32.57355093956'

Adding a number to a Decimal:

>>> lat + 2
Decimal('2.33245794180134')

Avoiding the imprecision of binary floating point representations of numbers like 1.1:

>>> onepointone = decimal.Decimal(repr(1.1))
>>> onepointone
Decimal('1.1')
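
Two properties of Decimal are worth checking before relying on it: its precision is also finite (28 significant digits in the default context), and it refuses arithmetic with floats. A minimal sketch, assuming the default decimal context has not been modified:

```python
import decimal
from decimal import Decimal

print(decimal.getcontext().prec)       # 28

third = Decimal(1) / Decimal(3)
print(len(third.as_tuple().digits))    # 28
print(third * 3 == Decimal(1))         # False: rounding still happens

# Mixing Decimal and float in arithmetic is rejected outright.
try:
    Decimal('1.1') + 1.1
    mixed_ok = True
except TypeError:
    mixed_ok = False
print(mixed_ok)                        # False
```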

The decimal module in the Python standard library does its arithmetic in base 10, rather than the binary approximation you get with traditional floating point representations and floating point processors. I wish it were the default; in my opinion, the approximate floating point math we get by default in most languages should be the first example under the dictionary definition of useless.

Source: https://stackoverflow.com/questions/5728978/why-and-how-does-python-truncate-numerical-data

Tags

python

serialization

floating-accuracy