I have a list of dictionaries like so:
listDict = [{'id':1,'other':2},{'id':3,'other':4},{'id':5,'other':6}]
I want a list of all t
A conceptually cleaner and, depending on how big your data is, potentially faster approach is to use pandas.
A DataFrame built from the list treats the dictionary keys as column headers and groups all values that share a key into one column:
import pandas as pd
listDict = [{'id':1,'other':2},{'id':3,'other':4},{'id':5,'other':6}]
df = pd.DataFrame(listDict)
# Then just reference the 'id' column to get it as a pandas Series
df['id']
# or just get a list
df['id'].tolist()
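To make the difference between the return types concrete, here is a minimal sketch using the same three entries. It assumes a pandas version that provides `Series.to_numpy()` (0.24 or later); the variable names `id_series`, `id_array` and `id_list` are only illustrative.

```python
import pandas as pd

listDict = [{'id': 1, 'other': 2}, {'id': 3, 'other': 4}, {'id': 5, 'other': 6}]
df = pd.DataFrame(listDict)

id_series = df['id']            # pandas Series (values plus the row index)
id_array = df['id'].to_numpy()  # NumPy array holding only the values
id_list = df['id'].tolist()     # plain Python list

print(type(id_series).__name__)  # Series
print(type(id_array).__name__)   # ndarray
print(id_list)                   # [1, 3, 5]
```

If all you need downstream is a plain list, `tolist()` is the one to use; the Series or array forms are mainly useful when you keep working in pandas or NumPy.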
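Some benchmarking below. Once the DataFrame already exists, pandas is roughly 4x faster on the large data, while the list comprehension is roughly 17x faster on the small case. Note that building the DataFrame happens in the setup string, so its cost is not included in the timings. The small case uses the given 3 entries, the large case has 150k entries: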
setup_large = "listDict = [];\
[listDict.extend(({'id':1,'other':2},{'id':3,'other':4},\
{'id':5,'other':6})) for _ in range(50000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listDict);"
setup_small = "listDict = [];\
listDict.extend(({'id':1,'other':2},{'id':3,'other':4},{'id':5,'other':6}));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(listDict);"
method1 = '[item["id"] for item in listDict]'
method2 = "df['id'].tolist()"
import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))
t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method Pandas: ' + str(t.timeit(100)))
# Output (total seconds for 100 runs of each statement):
#Small Method LC: 8.79764556885e-05
#Small Method Pandas: 0.00153517723083
#Large Method LC: 2.34853601456
#Large Method Pandas: 0.605192184448
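The benchmark above creates the DataFrame in the setup string, so the conversion from a list of dictionaries is never timed, and `itemgetter` is imported but not used. The sketch below is one way to check both points on your own machine: it times the list comprehension, an `itemgetter` variant, and a pandas variant that builds the DataFrame inside the timed call. The function names are made up for this example, and no timings are quoted because they depend on hardware and library versions.

```python
import timeit
from operator import itemgetter

import pandas as pd

# 150k entries, as in the large case above.
# (Multiplying the list repeats references to the same three dicts,
# which is fine here because they are only read.)
listDict = [{'id': 1, 'other': 2}, {'id': 3, 'other': 4}, {'id': 5, 'other': 6}] * 50000


def with_list_comprehension():
    return [item['id'] for item in listDict]


def with_itemgetter():
    return list(map(itemgetter('id'), listDict))


def with_pandas_from_scratch():
    # Unlike the benchmark above, building the DataFrame is part of the timed work.
    return pd.DataFrame(listDict)['id'].tolist()


for func in (with_list_comprehension, with_itemgetter, with_pandas_from_scratch):
    seconds = timeit.timeit(func, number=5)
    print(f'{func.__name__}: {seconds:.4f} s for 5 runs')
```

If the data starts out as a list of dictionaries and you only need one column once, the variant that includes construction is the fair comparison; the pre-built DataFrame numbers apply when the data already lives in pandas.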