Pandas, DataFrame: Splitting one column into multiple columns

花落未央 2021-02-13 14:27

I have the following DataFrame. I am wondering whether it is possible to break the data column into multiple columns. E.g., from this:

ID       Date             


        
2 answers
  • 2021-02-13 14:51
    import pandas as pd

    # Sample data: each row stores "key: value" pairs in a single string column
    df = pd.DataFrame([
            [6, "a: 1, b: 2"],
            [6, "a: 1, b: 2"],
            [6, "a: 1, b: 2"],
            [6, "a: 1, b: 2"],
        ], columns=['ID', 'dictionary'])
    
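    def str2dict(s):
        """Parse a string such as "a: 1, b: 2" into {'a': '1', 'b': '2'}."""
        d = {}
        # Each comma-separated chunk is one "key: value" pair
        for pair in s.strip().split(','):
            # Split on the first colon only, so a value may itself contain a colon
            k, v = [part.strip() for part in pair.split(':', 1)]
            d[k] = v  # values stay strings; convert them afterwards if needed
        return d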
    
    df.dictionary.apply(str2dict).apply(pd.Series)
    

    Or, to keep the original columns alongside the new ones:

    pd.concat([df, df.dictionary.apply(str2dict).apply(pd.Series)], axis=1)
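As a rough sketch of a variation on the answer above: the parsed dicts can also be handed to the `pd.DataFrame` constructor in one step instead of calling `.apply(pd.Series)` row by row. The sample rows, the compact rewrite of `str2dict`, and the names `expanded` and `result` are made up for illustration; the numeric conversion assumes every value is a number.

```python
import pandas as pd

# Hypothetical sample data in the same "key: value, key: value" shape
df = pd.DataFrame(
    [[6, "a: 1, b: 2"], [7, "a: 3, b: 4"]],
    columns=["ID", "dictionary"],
)

def str2dict(s):
    # Same parsing idea as the answer: "a: 1, b: 2" -> {'a': '1', 'b': '2'}
    pairs = (pair.split(":", 1) for pair in s.split(","))
    return {k.strip(): v.strip() for k, v in pairs}

# Build all new columns at once from a list of dicts, keeping df's index
expanded = pd.DataFrame(df["dictionary"].map(str2dict).tolist(), index=df.index)

# Parsed values are strings; convert each column to a numeric dtype
expanded = expanded.apply(pd.to_numeric)

# Replace the string column with the expanded columns
result = df.drop(columns="dictionary").join(expanded)
```

Here `result` has the columns `ID`, `a`, and `b`. Whether to drop the original string column is a design choice; the answer's `pd.concat` version keeps it.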
    

  • 2021-02-13 14:54

    Here is a function that converts the string to a dictionary and sums the values for each key. After the conversion, it is easy to expand the result into columns with pd.Series:

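    import re
    from collections import defaultdict

    def str_to_dict(str1):
        # Missing keys start at 0, so repeated keys are summed
        d = defaultdict(int)
        # Pair the i-th uppercase letter with the i-th run of digits in the string
        for k, v in zip(re.findall(r'[A-Z]', str1), re.findall(r'\d+', str1)):
            d[k] += int(v)
        return d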
    
    pd.concat([df, df['dictionary'].apply(str_to_dict).apply(pd.Series).fillna(0).astype(int)], axis=1)
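A small sketch of how the aggregation above behaves, on made-up data where a key repeats within one string. The sample rows and the names `parsed`, `expanded`, and `result` are assumptions for illustration, and the columns are built with the `pd.DataFrame` constructor rather than `.apply(pd.Series)`. Note that the `[A-Z]` pattern only matches single uppercase letters, so lowercase keys such as `a` and `b` would yield an empty dict.

```python
import re
from collections import defaultdict

import pandas as pd

def str_to_dict(str1):
    # Same function as in the answer: sum the numbers found for each key
    d = defaultdict(int)
    for k, v in zip(re.findall(r"[A-Z]", str1), re.findall(r"\d+", str1)):
        d[k] += int(v)
    return d

# Hypothetical sample data: key A appears twice in the first row
df = pd.DataFrame(
    [[1, "A: 1, B: 2, A: 3"], [2, "B: 5"]],
    columns=["ID", "dictionary"],
)

parsed = df["dictionary"].apply(str_to_dict)

# Keys missing from a row become NaN, which fillna(0) turns into 0
expanded = pd.DataFrame([dict(d) for d in parsed], index=df.index)
expanded = expanded.fillna(0).astype(int)

result = pd.concat([df, expanded], axis=1)
```

In the first row the two `A` values are summed to 4, and the second row gets 0 for `A` because that key is absent.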
    
