Split (explode) pandas dataframe string entry to separate rows

一向 2020-11-21 05:03

I have a pandas dataframe in which one column of text strings contains comma-separated values. I want to split each CSV field and create a new row per entry (as

22 answers
  •  梦谈多话
    2020-11-21 05:29

    Just used jiln's excellent answer from above, but needed to expand to split multiple columns. Thought I would share.

    import pandas as pd

    def splitDataFrameList(df, target_columns, separator):
        ''' df = dataframe to split,
        target_columns = list of the columns containing the values to split
        separator = the symbol used to perform the split

        returns: a dataframe with each entry for the target columns separated, with each element moved into a new row.
        The values in the other columns are duplicated across the newly divided rows.
        Assumes every target column in a row splits into the same number of pieces
        (the first target column sets the number of new rows).
        '''
        def splitListToRows(row, row_accumulator, target_columns, separator):
            # Split each target column of this row into a list of pieces
            split_rows = []
            for target_column in target_columns:
                split_rows.append(row[target_column].split(separator))
            # Separate for multiple columns: the i-th pieces of every column form one new row
            for i in range(len(split_rows[0])):
                new_row = row.to_dict()
                for j in range(len(split_rows)):
                    new_row[target_columns[j]] = split_rows[j][i]
                row_accumulator.append(new_row)
        new_rows = []
        # apply is used only for its side effect of filling new_rows
        df.apply(splitListToRows, axis=1, args=(new_rows, target_columns, separator))
        new_df = pd.DataFrame(new_rows)
        return new_df
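
For comparison, newer pandas versions can do the same multi-column split with built-in methods. The following is a minimal sketch, not part of the answer above, assuming pandas 1.3 or later (where `DataFrame.explode` accepts a list of columns). The sample data and the helper name `split_columns_to_rows` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: two columns whose comma-separated pieces line up
df = pd.DataFrame({
    "id": [1, 2],
    "names": ["a,b", "c"],
    "codes": ["x,y", "z"],
})

def split_columns_to_rows(df, target_columns, separator):
    """Sketch: split each target column into lists, then explode them together."""
    out = df.copy()  # leave the caller's dataframe untouched
    for col in target_columns:
        out[col] = out[col].str.split(separator)
    # Exploding several columns at once assumes pandas >= 1.3
    return out.explode(target_columns, ignore_index=True)

result = split_columns_to_rows(df, ["names", "codes"], ",")
print(result)
```

Here too, every target column in a row needs the same number of pieces; `explode` raises a `ValueError` when the counts differ.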
    
