Python list of dictionaries search

-上瘾入骨i 2020-11-22 09:41

Assume I have this:

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

and by searchin

21 Answers
  • 2020-11-22 10:05

    You can try this:

    # lst: list of dictionaries
    lst = [{"name": "Tom", "age": 10}, {"name": "Mark", "age": 5}, {"name": "Pam", "age": 7}]
    
    search = input("What name: ")  # name to search for (say 'Pam')
    
    # Collect every matching dict; print the first one, or None if nothing matched
    matches = [d for d in lst if d["name"] == search]
    print(matches[0] if matches else None)
    # Output for 'Pam': {'name': 'Pam', 'age': 7}
    
  • 2020-11-22 10:06

    I tested various methods to go through a list of dictionaries and return the dictionaries where key x has a certain value.

    Results:

    • Speed (fastest first): list comprehension > generator expression >> normal list iteration >>> filter.
    • All methods scale linearly with the number of dicts in the list (10x list size -> 10x time).
    • The number of keys per dictionary does not affect speed significantly, even for large numbers (thousands) of keys. See this graph I plotted: https://imgur.com/a/quQzv (method names are given below).

    All tests were run with Python 3.6.4 on Windows 7 x64.

    from random import randint
    from timeit import timeit
    
    
    list_dicts = []
    for _ in range(1000):     # number of dicts in the list
        dict_tmp = {}
        for i in range(10):   # number of keys for each dict
            dict_tmp[f"key{i}"] = randint(0,50)
        list_dicts.append( dict_tmp )
    
    
    
    def a():
        # normal iteration over all elements
        for dict_ in list_dicts:
            if dict_["key3"] == 20:
                pass
    
    def b():
        # use 'generator'
        for dict_ in (x for x in list_dicts if x["key3"] == 20):
            pass
    
    def c():
        # use 'list'
        for dict_ in [x for x in list_dicts if x["key3"] == 20]:
            pass
    
    def d():
        # use 'filter'
        for dict_ in filter(lambda x: x['key3'] == 20, list_dicts):
            pass
    

    Results:

    1.7303 # normal list iteration 
    1.3849 # generator expression 
    1.3158 # list comprehension 
    7.7848 # filter
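
    The timing call itself is not shown above. A minimal sketch of how such numbers could be collected with timeit follows. The function names loop_all and loop_list_comprehension and the repetition count of 1000 are assumptions for illustration, not the settings behind the results above, so the absolute numbers will differ.

```python
from random import randint
from timeit import timeit

# Rebuild the same kind of test data: 1000 dicts with 10 keys each.
list_dicts = [{f"key{i}": randint(0, 50) for i in range(10)} for _ in range(1000)]


def loop_all():
    # normal iteration over all elements
    for dict_ in list_dicts:
        if dict_["key3"] == 20:
            pass


def loop_list_comprehension():
    # build the filtered list first, then iterate over it
    for dict_ in [x for x in list_dicts if x["key3"] == 20]:
        pass


# ASSUMPTION: 1000 repetitions; the original post does not state the count.
REPEATS = 1000

# timeit accepts a callable directly and returns the total time in seconds.
timings = {
    func.__name__: timeit(func, number=REPEATS)
    for func in (loop_all, loop_list_comprehension)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

    Running each variant against the same list_dicts keeps the comparison fair, since the random data is generated only once.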
    
  • 2020-11-22 10:08
    from collections import defaultdict
    
    dicts = [
        {"name": "Tom", "age": 10},
        {"name": "Mark", "age": 5},
        {"name": "Pam", "age": 7},
    ]
    
    # Group the dicts by name; each name maps to a list of matching dicts
    dicts_by_name = defaultdict(list)
    for d in dicts:
        dicts_by_name[d['name']].append(d)
    
    print(dicts_by_name['Tom'])
    
    # output
    # [{'name': 'Tom', 'age': 10}]
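
    If every name occurs only once, a plain dict built with a dict comprehension may be enough as an index. This is a sketch under that assumption; the name by_name is made up for illustration.

```python
dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# Build the index once; if a name occurred twice, the last entry would win.
by_name = {d["name"]: d for d in dicts}

# Each lookup is a single dict access instead of a scan over the list.
print(by_name.get("Tom"))
# .get returns None for a missing name instead of raising KeyError.
print(by_name.get("Anna"))
```

    Building the index costs one pass over the list, so it pays off mainly when the same list is searched many times.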
    
  • 2020-11-22 10:09

    You can use a list comprehension:

    def search(name, people):
        return [element for element in people if element['name'] == name]
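
    If only the first match is needed, a generator expression passed to next() stops at the first hit rather than scanning the whole list. A minimal sketch; the function name search_first and its default parameter are assumptions, not part of the answer above.

```python
def search_first(name, people, default=None):
    """Return the first dict whose 'name' equals name, or default if none matches."""
    return next((person for person in people if person["name"] == name), default)


people = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

print(search_first("Pam", people))
print(search_first("Anna", people))
```

    Passing a default as the second argument to next() avoids the StopIteration that would otherwise be raised when nothing matches.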
    
  • 2020-11-22 10:09

    Have you ever tried out the pandas package? It's perfect for this kind of search task and optimized too.

    import pandas as pd
    
    listOfDicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7}
    ]
    
    # Create a data frame, keys are used as column headers.
    # Dict items with the same key are entered into the same respective column.
    df = pd.DataFrame(listOfDicts)
    
    # The pandas dataframe allows you to pick out specific values like so:
    
    df2 = df[ (df['name'] == 'Pam') & (df['age'] == 7) ]
    
    # Alternate syntax, same thing
    
    df2 = df[ (df.name == 'Pam') & (df.age == 7) ]
    

    I've added a small benchmark below to illustrate pandas' faster runtimes at larger scale (100k+ entries). On a very small list, the plain list comprehension is faster:

    setup_large = 'dicts = [];\
    [dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
    { "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 })) for _ in range(25000)];\
    from operator import itemgetter;import pandas as pd;\
    df = pd.DataFrame(dicts);'
    
    setup_small = 'dicts = [];\
    dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
    { "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 }));\
    from operator import itemgetter;import pandas as pd;\
    df = pd.DataFrame(dicts);'
    
    method1 = '[item for item in dicts if item["name"] == "Pam"]'
    method2 = 'df[df["name"] == "Pam"]'
    
    import timeit
    t = timeit.Timer(method1, setup_small)
    print('Small Method LC: ' + str(t.timeit(100)))
    t = timeit.Timer(method2, setup_small)
    print('Small Method Pandas: ' + str(t.timeit(100)))
    
    t = timeit.Timer(method1, setup_large)
    print('Large Method LC: ' + str(t.timeit(100)))
    t = timeit.Timer(method2, setup_large)
    print('Large Method Pandas: ' + str(t.timeit(100)))
    
    #Small Method LC: 0.000191926956177
    #Small Method Pandas: 0.044392824173
    #Large Method LC: 1.98827004433
    #Large Method Pandas: 0.324505090714
    
  • 2020-11-22 10:09
    def dsearch(lod, **kw):
        # Keep only the dicts in lod whose values equal every given key=value pair
        return filter(lambda i: all((i[k] == v for (k, v) in kw.items())), lod)
    
    lod=[{'a':33, 'b':'test2', 'c':'a.ing333'},
         {'a':22, 'b':'ihaha', 'c':'fbgval'},
         {'a':33, 'b':'TEst1', 'c':'s.ing123'},
         {'a':22, 'b':'ihaha', 'c':'dfdvbfjkv'}]
    
    
    
    list(dsearch(lod, a=22))
    
    [{'a': 22, 'b': 'ihaha', 'c': 'fbgval'},
     {'a': 22, 'b': 'ihaha', 'c': 'dfdvbfjkv'}]
    
    
    
    list(dsearch(lod, a=22, b='ihaha'))
    
    [{'a': 22, 'b': 'ihaha', 'c': 'fbgval'},
     {'a': 22, 'b': 'ihaha', 'c': 'dfdvbfjkv'}]
    
    
    list(dsearch(lod, a=22, c='fbgval'))
    
    [{'a': 22, 'b': 'ihaha', 'c': 'fbgval'}]
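
    Note that indexing with i[k] raises KeyError when a dict lacks one of the searched keys. A possible variant that simply skips such dicts is sketched below; dsearch_safe, _MISSING and records are hypothetical names introduced for this example.

```python
# Sentinel object, so that a stored None is not mistaken for "key absent".
_MISSING = object()


def dsearch_safe(lod, **kw):
    """Yield every dict in lod that contains all the given key/value pairs."""
    return (
        d for d in lod
        if all(d.get(k, _MISSING) == v for k, v in kw.items())
    )


records = [
    {"a": 33, "b": "test2"},
    {"a": 22},                # no 'b' key at all
    {"a": 22, "b": "ihaha"},
]

print(list(dsearch_safe(records, a=22)))
print(list(dsearch_safe(records, a=22, b="ihaha")))
```

    Like the filter version, this returns a lazy iterator, so wrap it in list() when all results are needed at once.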
    