How to find the start time and end time of an event in Python?

隐瞒了意图╮ 2021-01-27 11:22

I have a data frame with two columns: column 1 is Event and column 2 is Time (a datetime):

Sample data

 Event   Time
    0   2020-02-12 11:00:00
    0   2020-02-12 11         


        
3 Answers
  • 2021-01-27 11:54

    Here is a method that can get the results without a for loop. I assume that the input data is read into a dataframe called df:

    import pandas as pd

    # Initialize the output df
    dfout = pd.DataFrame()
    dfout['Event'] = df['Event']
    dfout['EventStartTime'] = df['Time']
    

    Now, I create a column called 'change' that tells you whether the event changed: it holds the difference from the previous row, so any non-zero value marks a change.

    dfout['change'] = df['Event'].diff()
    

    This is how dfout looks now:

       Event       EventStartTime  change
    0      0  2020-02-12 11:00:00     NaN
    1      0  2020-02-12 11:30:00     0.0
    2      2  2020-02-12 12:00:00     2.0
    3      1  2020-02-12 12:30:00    -1.0
    4      0  2020-02-12 13:00:00    -1.0
    5      0  2020-02-12 13:30:00     0.0
    6      0  2020-02-12 14:00:00     0.0
    7      1  2020-02-12 14:30:00     1.0
    8      0  2020-02-12 15:00:00    -1.0
    9      0  2020-02-12 15:30:00     0.0
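
The table above can be reproduced with a short sketch. The sample values are copied from the table; the `changed` column is not part of the answer and is only an illustrative assumption. It flags a change by comparing each value with the previous one, which may be useful when event labels are strings and `diff()` cannot subtract them.

```python
import pandas as pd

# Sample data copied from the table above (half-hourly timestamps).
df = pd.DataFrame({
    "Event": [0, 0, 2, 1, 0, 0, 0, 1, 0, 0],
    "Time": pd.date_range("2020-02-12 11:00:00", periods=10, freq="30min"),
})

# Numeric difference between consecutive events; NaN for the first row.
change = df["Event"].diff()

# Hypothetical alternative: True wherever the event differs from the previous row.
# The first row is compared against NaN, so it is flagged as a change as well.
changed = df["Event"].ne(df["Event"].shift())

print(pd.DataFrame({"Event": df["Event"], "change": change, "changed": changed}))
```

Both columns mark the same rows: every row where `change` is non-zero (or NaN) has `changed` set to True.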
    

    Now, I go on to remove the rows where the event did not change:

    # .copy() avoids a SettingWithCopyWarning when a column is added later
    dfout = dfout.loc[dfout['change'] != 0, :].copy()
    

    This leaves only the rows where the event has changed, that is, the first row of each run of identical events.

    Next, take the end time of each event to be the start time of the next event.

    dfout['EventEndTime'] = dfout['EventStartTime'].shift(-1)
    

    The dataframe looks like this:

       Event       EventStartTime  change         EventEndTime
    0      0  2020-02-12 11:00:00     NaN  2020-02-12 12:00:00
    2      2  2020-02-12 12:00:00     2.0  2020-02-12 12:30:00
    3      1  2020-02-12 12:30:00    -1.0  2020-02-12 13:00:00
    4      0  2020-02-12 13:00:00    -1.0  2020-02-12 14:30:00
    7      1  2020-02-12 14:30:00     1.0  2020-02-12 15:00:00
    8      0  2020-02-12 15:00:00    -1.0                  NaN
    

    You may choose to remove the 'change' column and also the last row if not needed.
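
The steps of this answer can be put together as one function. This is a minimal sketch, assuming numeric event codes and the sample data from the tables above; the function name `event_intervals` is made up for illustration and is not from the answer.

```python
import pandas as pd

def event_intervals(df):
    """Return one row per run of identical consecutive events."""
    # A non-zero difference (or NaN on the first row) marks the start of a run.
    is_start = df["Event"].diff().ne(0)
    out = (
        df.loc[is_start, ["Event", "Time"]]
          .rename(columns={"Time": "EventStartTime"})
          .reset_index(drop=True)
    )
    # The end of each run is taken to be the start of the next run.
    out["EventEndTime"] = out["EventStartTime"].shift(-1)
    return out

# Sample data copied from the tables above.
df = pd.DataFrame({
    "Event": [0, 0, 2, 1, 0, 0, 0, 1, 0, 0],
    "Time": pd.date_range("2020-02-12 11:00:00", periods=10, freq="30min"),
})

result = event_intervals(df)
print(result)
```

The last run has no following event, so its end time stays missing; `result.iloc[:-1]` drops that row if it is not needed.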

  • 2021-01-27 12:16

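    Use groupby and agg to get the first and last time for each event value. Note that this groups all rows with the same event value together, so separate runs of the same event are merged into one row.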

    import pandas as pd

    df = pd.DataFrame([['0', 11], ['1', 12], ['1', 13], ['0', 15], ['1', 16], ['3', 11]], columns=['Event', 'Time'])
    df.groupby(['Event']).agg(['first', 'last']).rename(columns={'first': 'start-event', 'last': 'end-event'})
    

    Output:

    Event start-event   end-event   
    0      11           15
    1      12           16
    3      11           11
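
If each run of consecutive identical events should get its own row, the same groupby/agg idea can be applied to a run counter instead of the event value. This is only a sketch of one possible adaptation, not part of the answer: the `run_id` column and the output column names are assumptions made for illustration. Here the end of a run is its last timestamp, not the start of the next run.

```python
import pandas as pd

# Same toy data as in the answer above.
df = pd.DataFrame(
    [["0", 11], ["1", 12], ["1", 13], ["0", 15], ["1", 16], ["3", 11]],
    columns=["Event", "Time"],
)

# Hypothetical helper: a counter that increases whenever the event changes,
# so each run of identical consecutive events gets its own id.
run_id = df["Event"].ne(df["Event"].shift()).cumsum().rename("run_id")

runs = (
    df.groupby(run_id)
      .agg(Event=("Event", "first"),
           start_event=("Time", "first"),
           end_event=("Time", "last"))
      .reset_index(drop=True)
)
print(runs)
```

With this grouping, the two separate runs of event '0' and of event '1' stay on separate rows instead of being merged.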
    
  • 2021-01-27 12:18

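    Assuming the dataframe is called data:

    import pandas

    current_event = None
    result = []
    # Walk through the rows in order and close a run whenever the event changes.
    for event, time in zip(data['Event'], data['Time']):
        if event != current_event:
            if current_event is not None:
                # The previous event ends where the new one starts.
                result.append([current_event, start_time, time])
            current_event, start_time = event, time
    # Note: the last run is never closed, so it does not appear in the result.
    data = pandas.DataFrame(result, columns=['Event','EventStartTime','EventEndTime'])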
    

    The trick is to save your event number; if the next event number is not the same as the saved one, the saved one has to be ended and a new one started.
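
The same idea can be sketched as a plain function that needs no pandas. The name `event_runs` and the choice to emit the final, still-open run with an end of `None` are assumptions added here, not part of the answer.

```python
def event_runs(events, times):
    """Collapse consecutive identical events into [event, start, end] rows."""
    result = []
    current_event, start_time = None, None
    for event, time in zip(events, times):
        if event != current_event:
            if current_event is not None:
                # The previous run ends when a different event begins.
                result.append([current_event, start_time, time])
            current_event, start_time = event, time
    if current_event is not None:
        # Assumption: the final run is still open, so its end is left as None.
        result.append([current_event, start_time, None])
    return result

rows = event_runs([0, 0, 2, 1, 0], ["11:00", "11:30", "12:00", "12:30", "13:00"])
print(rows)
```

The returned list can be passed to `pandas.DataFrame(rows, columns=['Event','EventStartTime','EventEndTime'])` as in the answer. Because `None` is used as the "no event yet" marker, this sketch assumes that `None` is never a real event value.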
