Getting a list of specific index items from a list of dictionaries in python (list comprehension)

北海茫月 2021-01-24 08:31

I have a list of dictionaries like so:

listDict = [{'id':1,'other':2},{'id':3,'other':4},{'id':5,'other':6}]

I want a list of all t

5 Answers
  •  清歌不尽
    2021-01-24 08:43

    A more conceptually pleasing, and potentially faster, method, depending on how big your data is.

    Use the pandas package to treat the keys as column headers, which groups all values that share a key into one column:

    import pandas as pd
    listDict = [{'id':1,'other':2},{'id':3,'other':4},{'id':5,'other':6}]
    df = pd.DataFrame(listDict)
    
    # Then just reference the 'id' column to get a pandas Series of its values
    df['id']
    
    # or just get a list
    df['id'].tolist()
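
For comparison, the same list can be built without pandas. The following is a minimal sketch using only the standard library; the variable names `ids`, `ids_via_itemgetter` and `ids_skipping_missing`, and the extra `{'other': 7}` entry, are illustrative assumptions and not part of the question.

```python
from operator import itemgetter

listDict = [{'id': 1, 'other': 2}, {'id': 3, 'other': 4}, {'id': 5, 'other': 6}]

# List comprehension: look up the 'id' key in each dictionary.
ids = [item['id'] for item in listDict]

# Equivalent with operator.itemgetter, which builds a callable that looks up 'id'.
ids_via_itemgetter = list(map(itemgetter('id'), listDict))

# Hypothetical variant: one extra dictionary has no 'id' key, so it is skipped
# instead of raising KeyError.
with_gap = listDict + [{'other': 7}]
ids_skipping_missing = [item['id'] for item in with_gap if 'id' in item]

print(ids)  # [1, 3, 5]
```

All three variables end up holding the same list. The plain list comprehension is usually the simplest choice when the data is already a list of dictionaries and no DataFrame is needed for anything else.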
    

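    Some benchmarking below. For the extraction step alone, pandas outperforms the list comprehension on large data but is slower on small data. The DataFrame is built in the setup string, so its construction cost is not part of the timings. The small case uses the given 3 entries; the large case has 150k entries: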

    setup_large = "listDict = [];\
    [listDict.extend(({'id':1,'other':2},{'id':3,'other':4},\
    {'id':5,'other':6})) for _ in range(50000)];\
    from operator import itemgetter;import pandas as pd;\
    df = pd.DataFrame(listDict);"
    
    setup_small = "listDict = [];\
    listDict.extend(({'id':1,'other':2},{'id':3,'other':4},{'id':5,'other':6}));\
    from operator import itemgetter;import pandas as pd;\
    df = pd.DataFrame(listDict);"
    
    method1 = '[item["id"] for item in listDict]'
    method2 = "df['id'].tolist()"
    
    import timeit
    t = timeit.Timer(method1, setup_small)
    print('Small Method LC: ' + str(t.timeit(100)))
    t = timeit.Timer(method2, setup_small)
    print('Small Method Pandas: ' + str(t.timeit(100)))
    
    t = timeit.Timer(method1, setup_large)
    print('Large Method LC: ' + str(t.timeit(100)))
    t = timeit.Timer(method2, setup_large)
    print('Large Method Pandas: ' + str(t.timeit(100)))
    
    # Results (total seconds for 100 runs each):
    # Small Method LC: 8.79764556885e-05
    # Small Method Pandas: 0.00153517723083
    # Large Method LC: 2.34853601456
    # Large Method Pandas: 0.605192184448
    
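
The setup strings above create `df` before the timer starts, so the pandas figures cover only `df['id'].tolist()`. If the data starts out as a list of dictionaries, it may be worth timing the conversion as well. The sketch below is one way to do that, not a definitive benchmark: the function names and the repeat count of 5 are assumptions, and pandas is treated as optional so the list-comprehension timing still runs without it.

```python
import timeit

try:
    import pandas as pd  # optional: only needed for the pandas timing
except ImportError:
    pd = None

# Same shape as the large case above: 150k dictionaries.
listDict = []
for _ in range(50000):
    listDict.extend(({'id': 1, 'other': 2}, {'id': 3, 'other': 4}, {'id': 5, 'other': 6}))


def ids_list_comprehension():
    return [item['id'] for item in listDict]


def ids_pandas_end_to_end():
    # Builds the DataFrame inside the timed call, unlike the setup strings above.
    return pd.DataFrame(listDict)['id'].tolist()


runs = 5  # assumed repeat count, kept small because DataFrame construction is slow
timings = {'list comprehension': timeit.timeit(ids_list_comprehension, number=runs)}
if pd is not None:
    timings['pandas incl. DataFrame construction'] = timeit.timeit(
        ids_pandas_end_to_end, number=runs)

for label, seconds in timings.items():
    print(f'{label}: {seconds:.4f} s for {runs} runs')
```

No expected timings are given here because they depend on the machine and on the Python and pandas versions in use.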
