Get key by value in dictionary

北恋 2020-11-21 06:28

I made a function which will look up ages in a Dictionary and show the matching name:

dictionary = {'george': 16, 'amber': 19}
search_age          


        
30 answers
  •  暗喜 (original poster)
    2020-11-21 07:02

    This has already been answered, but since several people mentioned reversing the dictionary, here's how to do it in one line (assuming a 1:1 mapping), along with some performance data:

    Python 2.6:

    reversedict = dict([(value, key) for key, value in mydict.iteritems()])
    

    Python 2.7+ (on Python 3, use mydict.items() instead of iteritems()):

    reversedict = {value:key for key, value in mydict.iteritems()}
    

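    If the mapping might not be 1:1, you can still build a reasonable reverse mapping in a few lines:

    from collections import defaultdict

    # Collect every key that shares a value into a list under that value.
    reversedict = defaultdict(list)
    for key, value in mydict.iteritems():
        reversedict[value].append(key)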
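
For readers on Python 3, the two reversals above might look like the sketch below. It assumes `items()` in place of `iteritems()`, and the extra `carol` entry is a hypothetical addition to the question's sample data, included only to show what happens when two keys share a value.

```python
from collections import defaultdict

# Sample data from the question; 'carol' is a hypothetical extra entry
# added here to create a duplicate value.
ages = {"george": 16, "amber": 19, "carol": 19}

# 1:1 reversal: if two keys share a value, the later key overwrites the earlier one.
reverse_one = {value: key for key, value in ages.items()}

# One-to-many reversal: every key sharing a value is kept, in insertion order.
reverse_many = defaultdict(list)
for key, value in ages.items():
    reverse_many[value].append(key)

print(reverse_one[16])    # george
print(reverse_many[19])   # ['amber', 'carol']
```

Note that the one-line version silently drops `amber` here, which is why the list-based variant is the safer choice whenever values can repeat.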
    

    How slow is this? Slower than a simple search, but not nearly as slow as you'd think. On a 'straight' 100000-entry dictionary, a 'fast' search (i.e. looking for a value that should be early in the keys) was about 10x faster than reversing the entire dictionary, and a 'slow' search (towards the end) was about 4-5x faster. So after at most about 10 lookups, the reversal has paid for itself.

    The second version (with lists per item) takes about 2.5x as long as the simple version.
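
The break-even reasoning can be checked with a little arithmetic. The sketch below only divides the timings quoted in this answer's %timeit output; the variable names are made up for illustration, and it ignores the (very small) cost of each lookup in the reversed dictionary.

```python
# Timings in milliseconds, copied from the %timeit output in this answer.
reverse_build_ms = 22.9    # one-line 1:1 reversal
reverse_lists_ms = 53.6    # defaultdict(list) reversal
slow_search_ms = 4.81      # linear search for a value near the end
fast_search_ms = 2.94      # linear search for a value near the start

# Number of linear searches whose total cost equals one reversal.
break_even_slow = reverse_build_ms / slow_search_ms
break_even_fast = reverse_build_ms / fast_search_ms

# How much longer the list-per-value reversal takes than the simple one.
lists_overhead = reverse_lists_ms / reverse_build_ms

print(round(break_even_slow, 1), round(break_even_fast, 1), round(lists_overhead, 1))
# 4.8 7.8 2.3
```

The computed ratios come out slightly below the rounded figures in the prose, but they support the same conclusion: the reversal pays for itself within roughly ten lookups.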

    largedict = dict((x,x) for x in range(100000))
    
    # Should be slow, has to search 90000 entries before it finds it
    In [26]: %timeit largedict.keys()[largedict.values().index(90000)]
    100 loops, best of 3: 4.81 ms per loop
    
    # Should be fast, only has to search the first 10 entries to find it.
    In [27]: %timeit largedict.keys()[largedict.values().index(9)]
    100 loops, best of 3: 2.94 ms per loop
    
    # How about using iterkeys() instead of keys()?
    # These are faster, because you don't have to create the entire keys array.
    # You DO have to create the entire values array - more on that later.
    
    In [31]: %timeit islice(largedict.iterkeys(), largedict.values().index(90000))
    100 loops, best of 3: 3.38 ms per loop
    
    In [32]: %timeit islice(largedict.iterkeys(), largedict.values().index(9))
    1000 loops, best of 3: 1.48 ms per loop
    
    In [24]: %timeit reversedict = dict([(value, key) for key, value in largedict.iteritems()])
    10 loops, best of 3: 22.9 ms per loop
    
    In [23]: %%timeit
    ....: reversedict = defaultdict(list)
    ....: [reversedict[value].append(key) for key, value in largedict.iteritems()]
    ....:
    10 loops, best of 3: 53.6 ms per loop
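
The session above is Python 2, where keys() and values() return lists. A rough Python 3 equivalent of the index-based lookups is sketched below; the helper names `key_by_index` and `key_by_islice` are hypothetical. As written in the session, the `islice(...)` calls appear to build the iterator without consuming it, so this sketch adds a `next()` to actually retrieve the key.

```python
from itertools import islice

largedict = {x: x for x in range(100000)}

def key_by_index(d, target):
    """Find the position of target among the values, then take the key at that position."""
    position = list(d.values()).index(target)   # raises ValueError if target is absent
    return list(d.keys())[position]

def key_by_islice(d, target):
    """Same idea, but step through the keys lazily instead of copying them into a list."""
    position = list(d.values()).index(target)
    # islice only builds an iterator; next() is what actually retrieves the key.
    return next(islice(d.keys(), position, None))

print(key_by_index(largedict, 90000))   # 90000
print(key_by_islice(largedict, 9))      # 9
```

Both helpers still copy every value into a list before searching, so neither avoids the full pass over the values.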
    

    I also had some interesting results with ifilter. In theory, ifilter should be faster, since it works on a lazy iterator (iteritems() here) and may not have to create or walk the entire values list. In practice, the results were... odd:

    In [72]: %%timeit
    ....: myf = ifilter(lambda x: x[1] == 90000, largedict.iteritems())
    ....: myf.next()[0]
    ....:
    100 loops, best of 3: 15.1 ms per loop
    
    In [73]: %%timeit
    ....: myf = ifilter(lambda x: x[1] == 9, largedict.iteritems())
    ....: myf.next()[0]
    ....:
    100000 loops, best of 3: 2.36 us per loop
    

    So, for small offsets, it was dramatically faster than any previous version (2.36 µs vs. a minimum of 1.48 ms for the previous cases). However, for large offsets near the end of the list, it was dramatically slower (15.1 ms vs. the same 1.48 ms). The small savings at the low end are not worth the cost at the high end, imho.
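
On Python 3, where ifilter no longer exists and iterators have no .next() method, the same early-exit search might be written with a generator expression, as in the sketch below. The function name `first_key_for` and its `default` parameter are hypothetical, and no timing claim is made for it here.

```python
def first_key_for(d, target, default=None):
    """Return the first key whose value equals target, or default if none matches."""
    # The generator stops at the first match, so no intermediate list is built.
    return next((key for key, value in d.items() if value == target), default)

ages = {"george": 16, "amber": 19}
print(first_key_for(ages, 19))   # amber
print(first_key_for(ages, 42))   # None
```

Passing a default to `next()` avoids a StopIteration when the value is missing, which the `.next()[0]` form in the session would raise.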
