Calculating time series days-in-a-row that fit certain criteria

Submitted by 五迷三道 on 2021-01-29 10:42:59

Question


I have a spatial and temporal df:

'date'        'spatial_pixel'   'column_A'   ...
 ----             -----          ---          
 2012-04-01   |   1000     |      5
 2012-04-01   |   1001     |      1
 ...              ...            ...

I want a column (grouped by 'spatial_pixel' and 'date') that counts the days in a row on which a boolean condition is met. Say 'column_A' < 2:

'date'        'spatial_pixel'   'column_A'   'days-in-a-row'   ...
 ----             -----          ---           ----
 2012-03-30   |   1001     |      5    |         0
 2012-04-01   |   1001     |      1    |         1
 2012-04-02   |   1001     |      1    |         2
 2012-04-03   |   1001     |      3    |         0
 ...              ...            ...            ...

My Attempts:

First, I made a new dataframe in which, whenever the boolean is True ('column_A' < 2), the day of the month (e.g. 1, 2, 3, ..., 28, 29, 30) is written. (However, I need it to range from 1 to 365, so that the end of one month and the beginning of the next are easily identified as consecutive.)

'date'        'spatial_pixel'   'column_A'   'day'   ...
 ----             -----          ---           ----
 2012-03-30   |   1001     |      5    |         NaN
 2012-04-01   |   1001     |      1    |         1
 2012-04-02   |   1001     |      1    |         2
 2012-04-03   |   1001     |      3    |         NaN
 2012-04-30   |   1001     |      1    |         30
 2012-04-31   |   1001     |      1    |         31     
 ...              ...            ...            ...
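For the 1 to 365 numbering mentioned above, pandas exposes the day of the year through the `.dt.dayofyear` accessor. The sketch below is only an illustration and assumes the 'date' column has a datetime dtype; the sample dates are made up. It also shows why differencing the dates may be a safer way to detect consecutive days, since the day of the year still wraps around at New Year.

```python
import pandas as pd

# Hypothetical sample dates, chosen to straddle a month end and a year end.
dates = pd.Series(pd.to_datetime(
    ['2012-03-31', '2012-04-01', '2012-12-31', '2013-01-01']))

# Day of year runs 1-365 (1-366 in leap years), so month boundaries
# become consecutive numbers.
day_of_year = dates.dt.dayofyear

# The gap between successive dates, in days. Unlike day_of_year, this
# stays 1 across the year boundary as well.
gap_in_days = dates.diff().dt.days

print(pd.DataFrame({'date': dates,
                    'day_of_year': day_of_year,
                    'gap_in_days': gap_in_days}))
```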

Second, I have unsuccessfully tried to create a new column that counts the consecutive month days, using code modified from @ZJS's answer to "Pandas: conditional rolling count":

def rolling_count(val):
    if val == rolling_count.previous + 1 :
        rolling_count.count +=1
    else:
        rolling_count.previous = val
        rolling_count.count = 1
    return rolling_count.count
rolling_count.count = 0 #static variable
rolling_count.previous = None #static variable

df['count'] == df.groupby(['spatial_pixel','date'])['day'].apply(rolling_count)                             


KeyError: 'count'

Any help would be greatly appreciated!
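A note on the KeyError itself: the failing line uses `==` (comparison) rather than `=` (assignment), so pandas tries to read a 'count' column that does not exist yet. The snippet below is a minimal sketch on a made-up two-row frame, not the asker's data. Separately, grouping by both 'spatial_pixel' and 'date' would leave one row per group if each pixel has one row per date, so the sketch groups by 'spatial_pixel' only.

```python
import pandas as pd

# Hypothetical miniature frame, for illustration only.
df = pd.DataFrame({'spatial_pixel': [1001, 1001], 'day': [1, 2]})

# `==` compares, so pandas first looks up df['count'], which is missing.
try:
    df['count'] == df['day']
    raised = False
except KeyError:
    raised = True

# `=` assigns, which creates the column. cumcount() numbers the rows
# within each pixel starting at 0, hence the + 1.
df['count'] = df.groupby('spatial_pixel')['day'].cumcount() + 1

print(raised)
print(df)
```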


Answer 1:


If I understand you correctly, here is my approach to this problem:

import pandas as pd
from datetime import datetime

df = pd.DataFrame(
    [
     [datetime(2016, 1, 1), 1000, 5], 
     [datetime(2016, 1, 1), 1001, 1], 
     [datetime(2016, 1, 2), 1000, 1], 
     [datetime(2016, 1, 2), 1001, 1], 
     [datetime(2016, 1, 3), 1000, 1], 
     [datetime(2016, 1, 3), 1001, 5], 
     [datetime(2016, 1, 4), 1000, 1], 
     [datetime(2016, 1, 4), 1001, 1],
    ], 
    columns=['date', 'spatial_pixel', 'column_A']
)

df
#         date  spatial_pixel  column_A
# 0 2016-01-01           1000         5
# 1 2016-01-01           1001         1
# 2 2016-01-02           1000         1
# 3 2016-01-02           1001         1
# 4 2016-01-03           1000         1
# 5 2016-01-03           1001         5
# 6 2016-01-04           1000         1
# 7 2016-01-04           1001         1

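def sum_days_in_row_with_condition(g):
    sorted_g = g.sort_values(by='date', ascending=True)
    condition = sorted_g['column_A'] < 2
    # Running total of rows that meet the condition.
    running = condition.cumsum()
    # Value of that total at the most recent failing row; fillna(0) covers
    # groups whose first rows already meet the condition (nothing to ffill).
    last_reset = running.where(~condition).ffill().fillna(0).astype(int)
    sorted_g['days-in-a-row'] = running - last_reset
    return sorted_g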

(df.groupby('spatial_pixel')
   .apply(sum_days_in_row_with_condition)
   .reset_index(drop=True))
#         date  spatial_pixel  column_A  days-in-a-row
# 0 2016-01-01           1000         5              0
# 1 2016-01-02           1000         1              1
# 2 2016-01-03           1000         1              2
# 3 2016-01-04           1000         1              3
# 4 2016-01-01           1001         1              1
# 5 2016-01-02           1001         1              2
# 6 2016-01-03           1001         5              0
# 7 2016-01-04           1001         1              1
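One caveat: the function above counts consecutive rows, so a missing date inside a pixel's series would not reset the streak. If that matters, the following sketch may help. It is an alternative under stated assumptions, not part of the original answer: it assumes 'date' has a datetime dtype and at most one row per pixel per day, and the function name `days_in_a_row` and its `threshold` parameter are hypothetical. It avoids `groupby.apply` and only continues a streak when the previous row of the same pixel met the condition exactly one day earlier.

```python
import pandas as pd

def days_in_a_row(df, threshold=2):
    """Count consecutive calendar days per pixel with column_A < threshold."""
    out = df.sort_values(['spatial_pixel', 'date']).reset_index(drop=True)
    meets = out['column_A'] < threshold

    # Days since the previous row of the same pixel (NaN on a pixel's first row).
    gap = out.groupby('spatial_pixel')['date'].diff().dt.days

    # Whether the previous row of the same pixel met the condition.
    prev_meets = (meets.astype(int)
                       .groupby(out['spatial_pixel'])
                       .shift(fill_value=0)
                       .eq(1))

    # A streak continues only if both rows meet the condition and are
    # exactly one day apart; every other row starts a new block.
    continues = meets & prev_meets & (gap == 1)
    block = (~continues).cumsum()

    # Within a block, count the qualifying rows seen so far.
    out['days-in-a-row'] = meets.astype(int).groupby(block).cumsum()
    return out

# The four rows from the question (note the gap between 03-30 and 04-01).
sample = pd.DataFrame({
    'date': pd.to_datetime(['2012-03-30', '2012-04-01',
                            '2012-04-02', '2012-04-03']),
    'spatial_pixel': [1001, 1001, 1001, 1001],
    'column_A': [5, 1, 1, 3],
})
print(days_in_a_row(sample))
```

On the question's four sample rows this should give 0, 1, 2, 0, matching the desired 'days-in-a-row' column.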


Source: https://stackoverflow.com/questions/50687623/calculating-time-series-days-in-a-row-that-fit-certain-criteria
