I have a CSV file that contains a number of columns. Using pandas, I read this CSV file into a dataframe that has a datetime index and five or six other columns.
One of these columns, TimeStamps, holds a whole list of timestamps in each row, and I want to split those lists out so that every timestamp gets its own row, with the other column values (such as CreateDate) repeated.
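For example (made-up values, with the datetime index omitted for brevity), the frame looks roughly like this:

import pandas as pd

# Made-up illustration of the shape: a CreateDate column plus a TimeStamps
# column whose cells each hold an entire list of timestamps.
df = pd.DataFrame({
    'CreateDate': pd.to_datetime(['2013-01-01', '2013-01-02']),
    'TimeStamps': [
        [pd.Timestamp('2013-01-01 10:00'), pd.Timestamp('2013-01-01 11:00')],
        [pd.Timestamp('2013-01-02 09:30')],
    ],
})

The goal is one row per timestamp, with CreateDate repeated for each one.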
This doesn't feel very pythonic, but it works (provided your CreateDate is unique!).
apply will only return more rows than it was given when it is used through a groupby, so we use groupby artificially (i.e. group on a column of unique values, so that each group is exactly one row).
import pandas as pd

def splitRows(x):
    # Extract the actual list of timestamps (each group is a single row).
    theList = x.TimeStamps.iloc[0]
    # Each new row will be a dictionary in this list.
    listOfNewRows = list()
    # Iterate over the timestamps, putting each one in its own dictionary
    # together with the group's CreateDate, then collect the dictionaries.
    for i in theList:
        newRow = dict()
        newRow['CreateDate'] = x.CreateDate.iloc[0]
        newRow['TimeStamps'] = i
        listOfNewRows.append(newRow)
    # Convert the dictionaries into rows of a new dataframe and return it.
    return pd.DataFrame(listOfNewRows)

df.groupby('CreateDate', as_index=False, group_keys=False).apply(splitRows)
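Note that apply returns a new dataframe rather than modifying df in place, so in practice you would assign the result, e.g. (reusing the made-up frame from the question):

# splitRows and df as defined above; the result has one row per timestamp,
# with the CreateDate value repeated for each one.
expanded = df.groupby('CreateDate', as_index=False, group_keys=False).apply(splitRows)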
Followup: If CreateDate is NOT unique, you can just reset the index to a new column and groupby that.
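A minimal sketch of that followup, assuming you don't need to keep the original index around (splitRows only copies CreateDate and TimeStamps, so the throwaway 'index' column never appears in the result):

# Give the frame a fresh unique 0..n-1 index, move it into an 'index'
# column, and group on that instead of CreateDate.
expanded = (
    df.reset_index(drop=True)
      .reset_index()
      .groupby('index', as_index=False, group_keys=False)
      .apply(splitRows)
)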