“ValueError: cannot reindex from a duplicate axis”

北海茫月 2021-01-02 16:43

I have the following df:

Timestamp                            A      B      C     ...     
2014-11-09 00:00:00                     NaN     1      NaN   NaN           


        
1 answer
  • 2021-01-02 17:16

    Assuming Timestamp is already your index, you need to resample first, then call reset_index before doing the groupby. Here is a working sample:

    import pandas as pd
    
    df
                           A   B   C  ...
    Timestamp                            
    2014-11-09 00:00:00  NaN   1 NaN  NaN
    2014-11-09 00:00:00    2 NaN NaN  NaN
    2014-11-09 00:00:00  NaN NaN   3  NaN
    2014-11-09 08:24:00  NaN NaN   1  NaN
    2014-11-09 08:24:00  105 NaN NaN  NaN
    2014-11-09 09:19:00  NaN NaN  23  NaN
    
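    # resample(..., how='max') was removed in later pandas versions; call .max() on the resampler instead.
    # min_count=1 keeps all-NaN minutes as NaN in the sum, matching the output below.
    df.resample('1Min').max().reset_index().groupby('Timestamp').sum(min_count=1)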
    
                          A   B   C  ...
    Timestamp                           
    2014-11-09 00:00:00   2   1   3  NaN
    2014-11-09 00:01:00 NaN NaN NaN  NaN
    2014-11-09 00:02:00 NaN NaN NaN  NaN
    2014-11-09 00:03:00 NaN NaN NaN  NaN
    2014-11-09 00:04:00 NaN NaN NaN  NaN
    ...
    2014-11-09 09:17:00 NaN NaN NaN  NaN
    2014-11-09 09:18:00 NaN NaN NaN  NaN
    2014-11-09 09:19:00 NaN NaN  23  NaN
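
The resample step above can be tried on a small frame. This is only a minimal sketch: the frame is rebuilt by hand from the rows shown in the answer, the elided "..." column is left out, and the names `has_duplicates` and `per_minute` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the sample frame: note the repeated timestamps.
idx = pd.DatetimeIndex(
    [
        "2014-11-09 00:00:00",
        "2014-11-09 00:00:00",
        "2014-11-09 00:00:00",
        "2014-11-09 08:24:00",
        "2014-11-09 08:24:00",
        "2014-11-09 09:19:00",
    ],
    name="Timestamp",
)
df = pd.DataFrame(
    {
        "A": [np.nan, 2, np.nan, np.nan, 105, np.nan],
        "B": [1, np.nan, np.nan, np.nan, np.nan, np.nan],
        "C": [np.nan, np.nan, 3, 1, np.nan, 23],
    },
    index=idx,
)

# The index carries duplicate labels.
has_duplicates = df.index.has_duplicates

# resample() collapses rows that fall into the same one-minute bin,
# so the result has exactly one row per minute; empty bins become NaN.
per_minute = df.resample("1min").max()
```

After this, `per_minute.index.is_unique` is true, which is why the later `reset_index()` and `groupby` have a unique axis to work with.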
    

    Hope this helps.

    Updated:

    As noted in the comments, your 'Timestamp' is not a datetime and is probably stored as strings, so you cannot resample on it as a DatetimeIndex. Just call reset_index and convert it, something like this:

    df = df.reset_index()
    df['ts'] = pd.to_datetime(df['Timestamp'])
    # 'ts' now holds 'Timestamp' parsed as datetimes; you just need to set it as the index
    df = df.set_index('ts')
    ...
    

    Now just run the previous code again, replacing 'Timestamp' with 'ts', and you should be OK.
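
Put together, the conversion and the resample might look like the sketch below. The two-row frame is invented for illustration, and dropping the leftover string 'Timestamp' column is an extra step assumed here so that only numeric columns are aggregated; it is not part of the original answer.

```python
import pandas as pd

# Hypothetical frame whose 'Timestamp' index holds plain strings, not datetimes.
df = pd.DataFrame(
    {"A": [2.0, 105.0], "C": [3.0, 1.0]},
    index=pd.Index(["2014-11-09 00:00:00", "2014-11-09 00:02:00"], name="Timestamp"),
)

# Move the string index into a column and parse it as datetimes.
df = df.reset_index()
df["ts"] = pd.to_datetime(df["Timestamp"])

# Use the parsed column as the index; the original string column is dropped
# (an assumption of this sketch) so the aggregation only sees numeric data.
df = df.set_index("ts").drop(columns="Timestamp")

# With a DatetimeIndex in place, resampling works as in the first example.
result = df.resample("1min").max()
```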
