I have a list of floats generated from a data structure which is a list of dictionaries - i.e. I've iterated over the whole list and selected for certain values in the given di
If I understand correctly, you have generated a list of floats, each one from one of the dicts in the original list. Instead of generating a list of floats, why not generate a list of 2-tuples, each holding the float and its corresponding dictionary-list index?
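A minimal sketch of that idea follows. The records, the "value" key, and the filter condition are hypothetical stand-ins, since the question does not show the real data:

```python
# Hypothetical data: the "value" key and the > 1.0 filter are assumptions.
records = [
    {"name": "a", "value": 0.5},
    {"name": "b", "value": 2.25},
    {"name": "c", "value": 1.75},
]

# Keep each selected float together with the index of the dict it came from.
pairs = [(rec["value"], i) for i, rec in enumerate(records) if rec["value"] > 1.0]

# Tuples compare element by element, so max() picks the largest float
# and carries its index along.
largest_value, largest_index = max(pairs)
source_record = records[largest_index]
```

With the index stored next to the float, there is no need to search for the float by value afterwards, which sidesteps float comparison entirely.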
Firstly, let's address the problems posed by using floating point.
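The claim that "floats aren't precisely represented due to the way computers work" is not quite right.
Floating point numbers are represented precisely in computers: every float stores one exact binary value. There are, however, some limitations. Most decimal fractions, such as 0.1, have no finite binary expansion, so the stored value is only the nearest representable one: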
"{0:.20f}".format(0.1)
in Python returns '0.10000000000000000555'.
Now, depending on the source of your numbers, and the kind of computations you want to perform, there are different possible solutions for indexing them.
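To make the indexing problem concrete before looking at the solutions, here is a small sketch of how a computed float can miss a dictionary key that looks identical on paper. The `lookup` dict is purely illustrative:

```python
import math

total = 0.1 + 0.2                # stored as 0.30000000000000004
print(total == 0.3)              # False

lookup = {0.3: "found"}          # illustrative dict keyed by a float
print(total in lookup)           # False: the key is a different float

# Comparing with a tolerance works, but it cannot be used for hashing.
print(math.isclose(total, 0.3))  # True
```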
For numbers that can be described precisely in base 10, you can use a Decimal. This represents numbers in base 10 exactly:
>>> from decimal import Decimal
>>> "{0:.20f}".format(Decimal('0.1'))
'0.10000000000000000000'
If you're dealing exclusively with rational numbers (even those without an exact decimal representation), you can use the fractions module (fractions.Fraction).
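As a rough illustration, Fraction keeps values such as 1/3 exact and normalizes equal values to the same key. The `lookup` dict is a hypothetical example, not something from the question:

```python
from fractions import Fraction

third = Fraction(1, 3)
total = third + third + third
print(total == 1)                                  # True: no rounding error

# Building from a string avoids going through a binary float first.
print(Fraction("0.1") * 3 == Fraction("0.3"))      # True

# Equal fractions are normalized and hash alike, so they work as dict keys.
lookup = {Fraction(1, 3): "one third"}             # hypothetical index
print(lookup[Fraction(2, 6)])                      # one third
```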
Note that if you use decimals or fractions, you'll need to use them as soon as possible in your processing. Converting from a float to a decimal/fraction in the late stages defeats their purpose - you can't get data that isn't there:
>>> "{0:.20f}".format(Decimal('0.1'))
'0.10000000000000000000'
>>> "{0:.20f}".format(Decimal(0.1))
'0.10000000000000000555'
Also, using decimals or fractions comes at a significant performance penalty. For serious number crunching you'll want to use float throughout, or even integers in its place.
Finally, if your numbers are irrational, or if you're getting indexing mishaps even while using decimals or fractions, your best choice is probably to index rounded versions of the numbers. Use buckets if necessary. collections.defaultdict may be useful for this.
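One way the bucket idea could look, as a sketch only: the `PRECISION` constant, the `bucket_key` helper, and the sample values are assumptions chosen for illustration, and the right number of digits depends on your data.

```python
from collections import defaultdict

PRECISION = 6  # assumed number of decimal places; tune for your data


def bucket_key(x, ndigits=PRECISION):
    """Hypothetical helper: map a float to its rounded bucket key."""
    return round(x, ndigits)


values = [0.1 + 0.2, 0.3, 1.0 / 3.0]

# Each bucket holds the positions of all values that round to the same key.
index = defaultdict(list)
for position, value in enumerate(values):
    index[bucket_key(value)].append(position)

# .get() avoids creating an empty bucket when the key is missing.
matches = index.get(bucket_key(0.3), [])
print(matches)  # [0, 1]
```

Two values that straddle a rounding boundary can still land in neighbouring buckets, so a careful lookup may also need to check the adjacent keys.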
You could also keep a tree, or use binary search over a sorted list with a custom comparison function, but you won't have O(1) lookup.
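A sketch of the binary-search variant using the standard bisect module. Instead of a custom comparison function it uses an absolute tolerance, which is a substitution made here for simplicity; the `find_close` name and the `tol` default are assumptions:

```python
import bisect


def find_close(sorted_values, target, tol=1e-9):
    """Return the index of a value within tol of target, or None.

    Hypothetical helper; sorted_values must already be sorted ascending.
    """
    # First position whose value is >= target - tol.
    i = bisect.bisect_left(sorted_values, target - tol)
    if i < len(sorted_values) and abs(sorted_values[i] - target) <= tol:
        return i
    return None


values = sorted([1.0 / 3.0, 0.1 + 0.2, 2.5])
print(find_close(values, 0.3))  # 0
print(find_close(values, 0.4))  # None
```

Each lookup costs O(log n) rather than O(1), and the list must stay sorted as values are added.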