Convert pandas df with data in a “list column” into a time series in long format. Use three columns: [list of data] + [timestamp] + [duration]

情歌与酒 2021-01-29 01:24

The aim is to convert a dataframe with a list column as the data column (and thus with just one timestamp and duration per row) into a time series in long format with a datetime

1 answer
  •  后悔当初
    2021-01-29 02:04

    Use DataFrame.explode first, then build a per-row counter with GroupBy.cumcount, convert it with to_timedelta, and add it to df_test.index:

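    # One row per list element; each element keeps its original row's timestamp
    df_test = df_test.explode('nestedList')
    # Shift the n-th element of each original row by n seconds
    df_test.index += pd.to_timedelta(df_test.groupby(level=0).cumcount(), unit='s')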
    
    print (df_test)
                        nestedList  duration_sec
    2016-05-04 08:53:20          1           3.0
    2016-05-04 08:53:21          2           3.0
    2016-05-04 08:53:22          1           3.0
    2016-05-04 08:53:23          9           3.0
    2016-05-04 08:55:00          2           3.0
    2016-05-04 08:55:01          2           3.0
    2016-05-04 08:55:02          3           3.0
    2016-05-04 08:55:03          0           3.0
    2016-05-04 08:56:40          1           3.0
    2016-05-04 08:56:41          3           3.0
    2016-05-04 08:56:42          3           3.0
    2016-05-04 08:56:43          0           3.0
    2016-05-04 08:58:20          1           3.0
    2016-05-04 08:58:21          1           3.0
    2016-05-04 08:58:22          3           3.0
    2016-05-04 08:58:23          9           3.0
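
The two steps above can be tried end to end on a small frame. This is a minimal sketch: the input df_test is an assumption, reconstructed from the printed output (the asker's original frame is not shown here), and long_df and offsets are illustrative names.

```python
import pandas as pd

# Assumed input, reconstructed from the printed output: one list per timestamp.
df_test = pd.DataFrame(
    {
        "nestedList": [[1, 2, 1, 9], [2, 2, 3, 0], [1, 3, 3, 0], [1, 1, 3, 9]],
        "duration_sec": [3.0, 3.0, 3.0, 3.0],
    },
    index=pd.to_datetime(
        [
            "2016-05-04 08:53:20",
            "2016-05-04 08:55:00",
            "2016-05-04 08:56:40",
            "2016-05-04 08:58:20",
        ]
    ),
)

# One row per list element; the timestamp index is repeated within each group.
long_df = df_test.explode("nestedList")

# Position of each element inside its original row: 0, 1, 2, ...
offsets = long_df.groupby(level=0).cumcount()

# Add that position, in seconds, to the repeated timestamp.
long_df.index = long_df.index + pd.to_timedelta(offsets.to_numpy(), unit="s")

print(long_df)
```

Grouping with level=0 relies on the original timestamps being unique; rows that share a timestamp would be counted as one group.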
    

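    EDIT: To spread each row's values evenly over duration_sec instead of one second apart, divide the duration by the number of gaps between values: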

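    df_test = df_test.explode('nestedList')
    # Number of gaps between values in each original row (n values -> n - 1 gaps)
    sizes = df_test.groupby(level=0)['nestedList'].transform('size').sub(1)
    # Time step between consecutive values, in seconds
    duration = df_test['duration_sec'].div(sizes)
    # Offset of the n-th value is n * step
    df_test.index += pd.to_timedelta(df_test.groupby(level=0).cumcount() * duration, unit='s')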
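
To see what the duration-based variant does, here is a small sketch with rows of different lengths and durations. The input values and the names long_df, gaps, step and offsets are assumptions made up for illustration; only the technique follows the snippet above.

```python
import pandas as pd

# Hypothetical input: 4 values over 6 seconds, then 3 values over 3 seconds.
df_test = pd.DataFrame(
    {
        "nestedList": [[1, 2, 1, 9], [2, 2, 3]],
        "duration_sec": [6.0, 3.0],
    },
    index=pd.to_datetime(["2016-05-04 08:53:20", "2016-05-04 08:55:00"]),
)

long_df = df_test.explode("nestedList")
grouped = long_df.groupby(level=0)

# n values in a row leave n - 1 gaps between the first and the last value.
gaps = grouped["nestedList"].transform("size").sub(1)

# Step between consecutive values: 6 / 3 = 2.0 s and 3 / 2 = 1.5 s.
step = long_df["duration_sec"].div(gaps)

# Offset of the k-th value is k * step; plain arrays avoid index alignment.
offsets = grouped.cumcount().to_numpy() * step.to_numpy()

long_df.index = long_df.index + pd.to_timedelta(offsets, unit="s")

print(long_df)
```

With this spacing the last value of each row lands exactly duration_sec after the row's timestamp. A row whose list holds a single value has zero gaps, so the division does not give a usable step; such rows would need separate handling.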
    

    EDIT2 by asker:

    With the resulting df, this simple application of seasonal_decompose() is now possible, which was the final aim:

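    import matplotlib.pyplot as plt
    from statsmodels.tsa.seasonal import seasonal_decompose

    # period is given explicitly: here, half the number of observations
    result_add = seasonal_decompose(x=df_test['nestedList'], model='additive', extrapolate_trend='freq', period=len(df_test) // 2)
    plt.rcParams.update({'figure.figsize': (5, 5)})
    result_add.plot().suptitle('Additive Decompose', fontsize=22)
    plt.show()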
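
As far as I know, seasonal_decompose only needs no period argument when the pandas index carries a frequency, which is likely why period is passed explicitly above. The sketch below uses pandas only and checks two properties of an exploded frame: the value column has object dtype, and no frequency can be inferred from the index. The frame and the names long_df, values and inferred are hypothetical stand-ins.

```python
import pandas as pd

# Hypothetical stand-in for the exploded frame: object-dtype values on an
# index with 1-second steps inside a row and a larger jump between rows.
index = pd.to_datetime(
    [
        "2016-05-04 08:53:20", "2016-05-04 08:53:21",
        "2016-05-04 08:53:22", "2016-05-04 08:53:23",
        "2016-05-04 08:55:00", "2016-05-04 08:55:01",
        "2016-05-04 08:55:02", "2016-05-04 08:55:03",
    ]
)
long_df = pd.DataFrame(
    {"nestedList": pd.Series([1, 2, 1, 9, 2, 2, 3, 0], dtype=object, index=index)}
)

# explode() leaves an object-dtype column; convert it to a numeric dtype.
values = pd.to_numeric(long_df["nestedList"])

# The spacing is irregular, so pandas cannot infer a single frequency.
inferred = pd.infer_freq(long_df.index)

print(values.dtype, inferred)
```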
    
