Splitting a List inside a Pandas DataFrame

南方客 2021-02-02 01:09

I have a CSV file that contains a number of columns. Using pandas, I read this CSV file into a DataFrame, which gives me a datetime index and five or six other columns.

One of

4 Answers
  •  予麋鹿 (original poster)
    2021-02-02 02:09

    This doesn't feel very Pythonic, but it works (provided your CreateDate is unique!)

    apply can only return more rows than it receives when it is used with a groupby, so we're going to use groupby artificially (i.e. group by a column of unique values, so that each group is a single row).

    import pandas as pd

    def splitRows(x):

        # x is a one-row group. Extract the actual list of time-stamps.
        theList = x.TimeStamps.iloc[0]

        # Each new row will be a dictionary in this list.
        listOfNewRows = list()

        # Iterate over the items in the list of time-stamps,
        # putting each one in a dictionary to later convert to a row,
        # then adding the dictionary to the list.
        for i in theList:
            newRow = dict()
            newRow['CreateDate'] = x.CreateDate.iloc[0]
            newRow['TimeStamps'] = i
            listOfNewRows.append(newRow)

        # Now convert these dictionaries into rows of a new dataframe and return it.
        return pd.DataFrame(listOfNewRows)


    # Group by the unique CreateDate column so that each group is one row,
    # then expand every group into one row per time-stamp.
    df.groupby('CreateDate', as_index = False, group_keys = False).apply(splitRows)
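As a side note, and not the method used in this answer: pandas 0.25 and later ship DataFrame.explode, which performs the same one-row-per-list-element expansion without the groupby workaround. The sketch below assumes a column of Python lists named TimeStamps next to a CreateDate column, as in the code above; the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names follow the answer above.
df = pd.DataFrame({
    "CreateDate": pd.to_datetime(["2014-01-01", "2014-01-02"]),
    "TimeStamps": [["09:00", "09:05"], ["10:00"]],
})

# explode() emits one row per list element and repeats the other columns.
# The original index labels are repeated too, so reset them afterwards.
exploded = df.explode("TimeStamps").reset_index(drop=True)

print(exploded)
```

Because explode does not rely on a grouping key, it should not matter here whether CreateDate is unique.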
    

    Follow-up: if CreateDate is NOT unique, you can reset the index into a new column and group by that column instead.
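The follow-up can be sketched roughly as below. This is a minimal illustration under assumptions, not the answerer's exact code: the sample data is invented, split_rows is a hypothetical snake_case variant of splitRows, and the key column is called "index" only because that is the name reset_index gives the default index.

```python
import pandas as pd


def split_rows(group):
    # Each group holds exactly one original row because the key is unique.
    create_date = group["CreateDate"].iloc[0]
    stamps = group["TimeStamps"].iloc[0]
    # Build one output row per time-stamp, repeating the CreateDate.
    return pd.DataFrame({
        "CreateDate": [create_date] * len(stamps),
        "TimeStamps": list(stamps),
    })


# Hypothetical sample data with a repeated CreateDate.
df = pd.DataFrame({
    "CreateDate": pd.to_datetime(["2014-01-01", "2014-01-01", "2014-01-02"]),
    "TimeStamps": [["09:00", "09:05"], ["10:00"], ["11:00", "11:30"]],
})

# reset_index() copies the unique default index into a column named "index",
# which then serves as the artificial one-row-per-group key.
keyed = df.reset_index()

result = (
    keyed.groupby("index", group_keys=False)
         .apply(split_rows)
         .reset_index(drop=True)
)

print(result)
```

The helper never reads the key column itself, so it only needs the grouping to isolate one row at a time. Some pandas versions may emit a deprecation warning about grouping columns in apply; the sketch does not depend on that behaviour.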
