Consider a list of tuples lst
lst = [('a', 10), ('b', 20)]
question
What is the quickest way to
There are two possible downsides to @Divakar's np.asarray(lst). First, it converts everything to strings, so Pandas has to convert the values back. Second, speed: making arrays is relatively expensive.
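The string conversion is easy to check directly. This is a minimal sketch using the two-tuple list from the question; the variable name arr is chosen here for illustration:

```python
import numpy as np

lst = [('a', 10), ('b', 20)]

# A NumPy array has a single dtype, so mixing str and int in each tuple
# makes NumPy fall back to a unicode string dtype for the whole array.
arr = np.asarray(lst)

print(arr.dtype.kind)  # 'U' means unicode strings
print(arr[:, 1])       # the numbers are now the strings '10' and '20'
```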
An alternative is to use the zip(*) idiom to 'transpose' the list:
In [65]: lst = [('a', 10), ('b', 20), ('j',1000)]
In [66]: zlst = list(zip(*lst))
In [67]: zlst
Out[67]: [('a', 'b', 'j'), (10, 20, 1000)]
In [68]: out = pd.Series(zlst[1], index = zlst[0])
In [69]: out
Out[69]:
a 10
b 20
j 1000
dtype: int32
Note that my dtype is int, not object. When out is built from the array instead, the values look like this:
In [79]: out.values
Out[79]: array(['10', '20', '1000'], dtype=object)
So in the array case, Pandas doesn't convert the values back to integer; it leaves them as strings.
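The two routes can be compared side by side in one script. This is a sketch only: the names s_zip, s_arr and s_fixed are made up for illustration, and astype(int) is just one possible way to get integers back after the array route.

```python
import numpy as np
import pandas as pd

lst = [('a', 10), ('b', 20), ('j', 1000)]

# zip(*) route: the values stay Python ints, so Pandas infers an integer dtype
labels, values = zip(*lst)
s_zip = pd.Series(values, index=labels)

# array route: NumPy has already turned the values into strings
arr = np.asarray(lst)
s_arr = pd.Series(arr[:, 1], index=arr[:, 0])

# One way to restore integers after the array route (an extra pass over the data)
s_fixed = s_arr.astype(int)

print(s_zip.dtype)          # an integer dtype (int32 or int64, platform dependent)
print(type(s_arr.iloc[0]))  # the elements are still str
print(s_fixed.dtype)        # an integer dtype again
```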
==============
My guess about timings is off - I don't have any feel for pandas Series creation times. Also the sample is too small to do meaningful timings:
In [71]: %%timeit
...: out=pd.Series(dict(lst))
1000 loops, best of 3: 305 µs per loop
In [72]: %%timeit
...: arr=np.array(lst)
...: out = pd.Series(arr[:,1], index=arr[:,0])
10000 loops, best of 3: 198 µs per loop
In [73]: %%timeit
...: zlst = list(zip(*lst))
...: out = pd.Series(zlst[1], index=zlst[0])
...:
1000 loops, best of 3: 275 µs per loop
Or, forcing the integer interpretation:
In [85]: %%timeit
...: arr=np.array(lst)
...: out = pd.Series(arr[:,1], index=arr[:,0], dtype=int)
...:
...:
1000 loops, best of 3: 253 µs per loop
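Since a three-item list is too small for meaningful timings, the same comparison can be rerun on a larger list. The sketch below uses the standard timeit module rather than the %%timeit magic. The list big, its size of 10,000, and the function names are assumptions made for illustration. The array variant converts the string column with NumPy's astype(int) instead of passing dtype=int to pd.Series, which is a small substitution. No timings are quoted here because they depend on the machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical larger sample: 10,000 (label, value) tuples with unique labels
big = [(f"k{i}", i) for i in range(10_000)]


def via_dict():
    # dict keys become the index, dict values become the data
    return pd.Series(dict(big))


def via_array():
    # the array is all strings, so the value column is cast back to int
    arr = np.array(big)
    return pd.Series(arr[:, 1].astype(int), index=arr[:, 0])


def via_zip():
    # transpose the list of tuples into a labels tuple and a values tuple
    labels, values = zip(*big)
    return pd.Series(values, index=labels)


for fn in (via_dict, via_array, via_zip):
    # best of 3 repeats, 10 calls per repeat, reported per call
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```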