converting daily stock data to weekly-based via pandas in Python

后端未结

关注

 4  845

I\'ve got a DataFrame storing daily-based data which is as below:

Date              Open        High         Low       Close   Volume
2010-01-04


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  -上瘾入骨i        
                
              
                            
                2020-12-23 11:16
              
            
            
                                                                       
In general, assuming that you have the dataframe in the form you specified, you need to do the following steps:


put Date in the index
resample the index. 


What you have is a case of applying different functions to different columns. See.

You can resample in various ways. for e.g. you can take the mean of the values or count or so on. check pandas resample. 

You can also apply custom aggregators (check the same link).
With that in mind, the code snippet for your case can be given as:

f['Date'] = pd.to_datetime(f['Date'])
f.set_index('Date', inplace=True)
f.sort_index(inplace=True)

def take_first(array_like):
    return array_like[0]

def take_last(array_like):
    return array_like[-1]

output = f.resample('W',                                 # Weekly resample
                    how={'Open': take_first, 
                         'High': 'max',
                         'Low': 'min',
                         'Close': take_last,
                         'Volume': 'sum'}, 
                    loffset=pd.offsets.timedelta(days=-6))  # to put the labels to Monday

output = output[['Open', 'High', 'Low', 'Close', 'Volume']]


Here, W signifies a weekly resampling which by default spans from Monday to Sunday. To keep the labels as Monday, loffset is used.
There are several predefined day specifiers. Take a look at pandas offsets. You can even define custom offsets (see).  

Coming back to the resampling method. Here for Open and Close you can specify custom methods to take the first value or so on and pass the function handle to the how argument.

This answer is based on the assumption that the data seems to be daily, i.e. for each day you have only 1 entry. Also, no data is present for the non-business days. i.e. Sat and Sun. So taking the last data point for the week as the one for Friday is ok. If you so want you can use business week instead of 'W'. Also, for more complex data you may want to use groupby to group the weekly data and then work on the time indices within them.

btw a gist for the solution can be found at: 
https://gist.github.com/prithwi/339f87bf9c3c37bb3188
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  星月不相逢        
                
              
                            
                2020-12-23 11:16
              
            
            
                                                                       
I had the exact same question and found a great solution here. 

https://www.techtrekking.com/how-to-convert-daily-time-series-data-into-weekly-and-monthly-using-pandas-and-python/

The weekly code is posted below.

import pandas as pd
import numpy as np

print('*** Program Started ***')

df = pd.read_csv('15-06-2016-TO-14-06-2018HDFCBANKALLN.csv')

# ensuring only equity series is considered
df = df.loc[df['Series'] == 'EQ']

# Converting date to pandas datetime format
df['Date'] = pd.to_datetime(df['Date'])
# Getting week number
df['Week_Number'] = df['Date'].dt.week
# Getting year. Weeknum is common across years to we need to create unique index by using year and weeknum
df['Year'] = df['Date'].dt.year

# Grouping based on required values
df2 = df.groupby(['Year','Week_Number']).agg({'Open Price':'first', 'High Price':'max', 'Low Price':'min', 'Close Price':'last','Total Traded Quantity':'sum'})
# df3 = df.groupby(['Year','Week_Number']).agg({'Open Price':'first', 'High Price':'max', 'Low Price':'min', 'Close Price':'last','Total Traded Quantity':'sum','Average Price':'avg'})
df2.to_csv('Weekly_OHLC.csv')
print('*** Program ended ***')

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  南旧        
                
              
                            
                2020-12-23 11:17
              
            
            
                                                                       
You can resample (to weekly), offset (shift), and apply aggregation rules as follows:

logic = {'Open'  : 'first',
         'High'  : 'max',
         'Low'   : 'min',
         'Close' : 'last',
         'Volume': 'sum'}

offset = pd.offsets.timedelta(days=-6)

f = pd.read_clipboard(parse_dates=['Date'], index_col=['Date'])
f.resample('W', loffset=offset).apply(logic)


to get:

                 Open       High        Low      Close   Volume
Date                                                           
2010-01-04  38.660000  40.700001  38.509998  40.290001  5925600
2010-01-11  40.209999  40.970001  39.279999  40.450001  6234600

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  渐次进展        
                
              
                            
                2020-12-23 11:35
              
            
            
                                                                       
Not a direct answer, but suppose the columns are the dates (transpose of your table), without missing dates.

'''sum up daily results in df to weekly results in wdf'''
wdf = pd.DataFrame(index = df.index)
for i in range(len(df.columns)):
    if (i!=0) & (i%7==0):
        wdf['week'+str(i//7)]= df[df.columns[i-7:i]].sum(axis = 1)

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复