Extracting just Month and Year separately from Pandas Datetime column

抹茶落季 2020-11-22 09:09

I have a Dataframe, df, with the following column:

df['ArrivalDate'] =
...
936   2012-12-31
938   2012-12-29
965   2012-12-31
966   2012-12-31
967   2012-1         


        
11 Answers
  • 2020-11-22 09:26

    To extract the year from a column of dates such as ['2018-03-04']:

    df['Year'] = pd.DatetimeIndex(df['date']).year  
    

    Assigning to df['Year'] creates a new column. To extract the month instead, use .month.
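
A minimal sketch of the same idea using the `.dt` accessor, which gives the year and month as two separate integer columns. The sample values and the column names `Year` and `Month` are illustrative assumptions, not taken from the question.

```python
import pandas as pd

# Illustrative data modelled on the question's ArrivalDate column (values are made up)
df = pd.DataFrame({"ArrivalDate": ["2012-12-31", "2012-12-29", "2012-11-30"]})

# Parse the strings into datetime64 values first; .dt only works on datetime-like columns
df["ArrivalDate"] = pd.to_datetime(df["ArrivalDate"])

# Integer year and month as two separate columns
df["Year"] = df["ArrivalDate"].dt.year
df["Month"] = df["ArrivalDate"].dt.month

print(df)
```

Both new columns hold integers, so they can be used directly for filtering or grouping.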

  • 2020-11-22 09:32
    df['year_month']=df.datetime_column.apply(lambda x: str(x)[:7])
    

    This worked fine for me. I didn't expect pandas to interpret the resulting string as a date, but when I plotted it, the year_month strings were ordered properly (strings in YYYY-MM form sort chronologically, because their alphabetical order matches date order)... gotta love pandas!
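
A small sketch of why the ordering works out, assuming the column holds datetime values: slicing the string form to seven characters leaves `YYYY-MM`, and sorting those strings gives the same order as sorting the dates. The sample dates are invented for illustration.

```python
import pandas as pd

# Illustrative dates, deliberately out of order
dates = pd.Series(pd.to_datetime(["2013-01-15", "2012-12-31", "2012-02-01"]))

# Same approach as the answer above: keep the first 7 characters, 'YYYY-MM'
year_month = dates.apply(lambda x: str(x)[:7])

# Sorting the strings yields chronological order, because the most
# significant part (the year) comes first and the month is zero-padded
print(year_month.sort_values().tolist())
```

The result is still a column of plain strings, so datetime-specific operations are not available on it.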

  • 2020-11-22 09:34

    Thanks to jaknap32, I wanted to aggregate the results according to Year and Month, so this worked:

    df_join['YearMonth'] = df_join['timestamp'].apply(lambda x:x.strftime('%Y%m'))
    

    Output was neat:

    0    201108
    1    201108
    2    201108
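
The aggregation mentioned above could look roughly like this. The `timestamp` column name comes from the answer; the `value` column and all the data are hypothetical, added only so there is something to sum.

```python
import pandas as pd

# Hypothetical frame: 'value' and the rows are invented for illustration
df_join = pd.DataFrame({
    "timestamp": pd.to_datetime(["2011-08-01", "2011-08-15", "2011-09-03"]),
    "value": [10, 20, 5],
})

# Vectorised equivalent of the apply/strftime call in the answer
df_join["YearMonth"] = df_join["timestamp"].dt.strftime("%Y%m")

# Aggregate per year-month key
monthly = df_join.groupby("YearMonth")["value"].sum()
print(monthly)
```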
    
  • 2020-11-22 09:35

    Single line: add a column of 'year-month' pairs (pd.to_datetime first converts the column to datetime dtype before the operation):

    df['yyyy-mm'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%Y-%m')

    

    Accordingly for an extra 'year' or 'month' column:

    df['yyyy'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%Y')

    df['mm'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%m')

    
  • 2020-11-22 09:36

    If you want the month and year combined into a single value, using apply is pretty sleek.

    df['mnth_yr'] = df['date_column'].apply(lambda x: x.strftime('%B-%Y')) 
    

    Outputs month-year in one column.

    Don't forget to convert the column to datetime first; I generally forget.

    df['date_column'] = pd.to_datetime(df['date_column'])
    
  • 2020-11-22 09:41

    Best way I've found!

    The df['date_column'] column has to be in datetime format.

    df['month_year'] = df['date_column'].dt.to_period('M')
    

    You could also use D for day, 2M for two months, and so on for different sampling intervals. For time series data with timestamps, finer intervals are possible too, such as 45Min for 45 minutes or 15Min for 15 minutes.
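
A short sketch of the period approach, assuming `date_column` is already datetime. A Period column keeps the year and month together in one value but still exposes each part through `.dt`, which covers the "separately" part of the question. The sample dates and the `year` and `month` column names are illustrative.

```python
import pandas as pd

# Illustrative data; the column name follows the answer above
df = pd.DataFrame(
    {"date_column": pd.to_datetime(["2012-12-31", "2012-12-29", "2013-01-02"])}
)

# Monthly period: each value represents a whole calendar month
df["month_year"] = df["date_column"].dt.to_period("M")

# The parts remain accessible as integers
df["year"] = df["month_year"].dt.year
df["month"] = df["month_year"].dt.month

print(df)
```

Unlike the string-based variants, the period column sorts and groups by calendar month without any further parsing.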
