Extracting just Month and Year separately from Pandas Datetime column

后端未结

关注

 11  1617

I have a Dataframe, df, with the following column:

df[\'ArrivalDate\'] =
...
936   2012-12-31
938   2012-12-29
965   2012-12-31
966   2012-12-31
967   2012-1


                      
              相关标签:


      
      
        
          11条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  野的像风        
                
              
                            
                2020-11-22 09:26
              
            
            
                                                                       
Extracting the Year say from ['2018-03-04']  

df['Year'] = pd.DatetimeIndex(df['date']).year  


The df['Year'] creates a new column. While if you want to extract the month just use .month  
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  别那么骄傲        
                
              
                            
                2020-11-22 09:32
              
            
            
                                                                       
df['year_month']=df.datetime_column.apply(lambda x: str(x)[:7])


This worked fine for me, didn't think pandas would interpret the resultant string date as date, but when i did the plot, it knew very well my agenda and the string year_month where ordered properly... gotta love pandas!
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  盖世英雄少女心        
                
              
                            
                2020-11-22 09:34
              
            
            
                                                                       
Thanks to jaknap32, I wanted to aggregate the results according to Year and Month, so this worked:

df_join['YearMonth'] = df_join['timestamp'].apply(lambda x:x.strftime('%Y%m'))


Output was neat:

0    201108
1    201108
2    201108

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  猫巷女王i        
                
              
                            
                2020-11-22 09:35
              
            
            
                                                                       
SINGLE LINE: Adding a column with 'year-month'-paires:
('pd.to_datetime' first changes the column dtype to date-time before the operation)
df['yyyy-mm'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%Y-%m') 

Accordingly for an extra 'year' or 'month' column:
df['yyyy'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%Y') 
df['mm'] = pd.to_datetime(df['ArrivalDate']).dt.strftime('%m') 

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  萌比男神i        
                
              
                            
                2020-11-22 09:36
              
            
            
                                                                       
If you want the month year unique pair, using apply is pretty sleek.

df['mnth_yr'] = df['date_column'].apply(lambda x: x.strftime('%B-%Y')) 


Outputs month-year in one column.

Don't forget to first change the format to date-time before, I generally forget.

df['date_column'] = pd.to_datetime(df['date_column'])

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  挽巷        
                
              
                            
                2020-11-22 09:41
              
            
            
                                                                       
Best way found!!

the df['date_column'] has to be in date time format.

df['month_year'] = df['date_column'].dt.to_period('M')


You could also use D for Day, 2M for 2 Months etc. for different sampling intervals, and in case one has time series data with time stamp, we can go for granular sampling intervals such as 45Min for 45 min, 15Min for 15 min sampling etc.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
   
          
     1
2
下一页
           
           
        
                                  
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复