I have the following DataFrame. I am wondering whether it is possible to break the data
column into multiple columns. E.g., from this:
ID Date
Here is a function that converts the string to a dictionary, summing the values for each key. After the conversion, it is easy to expand the result into columns with
pd.Series:
import re
from collections import defaultdict

def str_to_dict(str1):
    # Sum the numbers that follow each uppercase letter key.
    d = defaultdict(int)
    for k, v in zip(re.findall(r'[A-Z]', str1), re.findall(r'\d+', str1)):
        d[k] += int(v)
    return d
pd.concat([df, df['dictionary'].apply(str_to_dict).apply(pd.Series).fillna(0).astype(int)], axis=1)
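To see the whole thing run end to end, here is a minimal sketch. The sample frame is made up: the original table is not shown in full, so I am assuming the column is named `dictionary` (as in the line above) and holds strings of an uppercase letter followed by digits, such as `'A1B2A3'`. It also pairs each letter with its digits in a single pattern, a small variation on the two `findall` calls, and builds the new columns from a list of dicts instead of `.apply(pd.Series)`.

```python
import re
from collections import defaultdict

import pandas as pd


def str_to_dict(s):
    # Sum the numbers attached to each uppercase letter key.
    totals = defaultdict(int)
    for key, value in re.findall(r'([A-Z])(\d+)', s):
        totals[key] += int(value)
    return dict(totals)


# Hypothetical sample data; the real column contents are an assumption.
df = pd.DataFrame({'ID': [1, 2], 'dictionary': ['A1B2A3', 'B4C5']})

# One dict per row -> one column per key; keys missing from a row become 0.
expanded = (
    pd.DataFrame(df['dictionary'].map(str_to_dict).tolist(), index=df.index)
    .fillna(0)
    .astype(int)
)
result = pd.concat([df, expanded], axis=1)
print(result)
```

Building the frame from a list of dicts tends to be faster than `.apply(pd.Series)` on larger frames, but either should give the same columns for data shaped like this.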