Log returns of multiple securities for multiple time periods in R

北城余情 submitted on 2019-12-08 11:12:26

Question


I have a dataset containing daily closing prices of 5413 companies from 2000 to 2014. I want to calculate daily log returns for each stock by date, as log(Price today/Price yesterday). The dataset looks like this:

Date        A G L   ABA     ABB   ABBEY
2000-1-3    NA      NA      NA    NA
2000-1-4    79.5    325     NA    961
2000-1-5    79.5    322.5   NA    945
2000-1-6    79.5    327.5   NA    952
2000-1-7    NA      327.5   NA    941
2000-1-10   79.5    327.5   NA    946
2000-1-11   79.5    327.5   NA    888

How can I calculate the daily log returns and also deal with the NAs? My sample period runs from 2000 to 2014, so some companies were only listed in 2001 and have NA for the whole of 2000. How should this be handled? Your help is highly appreciated.


Answer 1:


Here is my suggested approach to this problem:

As I understand it, you have a data frame with a date column and thousands of company columns. Here is an example data frame called prices:

> prices
    newdates nsp1  nsp2 nsp3 nsp4
1 2000-01-03   NA    NA   NA   NA
2 2000-01-04 79.5 325.0   NA  961
3 2000-01-05 79.5 322.5   NA  945
4 2000-01-06 79.5 327.5   NA  952
5 2000-01-07   NA 327.5   NA  941
6 2000-01-10 79.5 327.5   NA  946
7 2000-01-11 79.5 327.5   NA  888

To create a new data frame of log returns I used the code below:

> logs = data.frame(
+   cbind.data.frame(
+     newdates = prices$newdates[-1],
+     diff(as.matrix(log(prices[,-1])))
+   )
+ )
> logs
    newdates nsp1         nsp2 nsp3         nsp4
1 2000-01-04   NA           NA   NA           NA
2 2000-01-05    0 -0.007722046   NA -0.016789481
3 2000-01-06    0  0.015384919   NA  0.007380107
4 2000-01-07   NA  0.000000000   NA -0.011621895
5 2000-01-10   NA  0.000000000   NA  0.005299429
6 2000-01-11    0  0.000000000   NA -0.063270826

To clarify what this code does, let's go through it from the inside out:

Step 1: Calculating log-returns

  • Since log(a/b) = log(a)-log(b), log returns are simply differences of log prices. The function diff(x,lag=1) computes differences at the given lag; lag=1 (the default) gives first differences.
  • Our x is the set of prices in the data frame. To select every column except the first (which holds the dates) we use prices[,-1].
  • We need logarithms, so log(prices[,-1])
  • diff() works on a vector or a matrix, so we convert the log prices to a matrix: as.matrix(log(prices[,-1]))
  • Now we can apply diff() with lag=1: diff(as.matrix(log(prices[,-1])))
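
As a quick check of the identity behind Step 1, the sketch below compares the ratio form from the question with the diff-of-logs shortcut on a single price vector. The vector p is just an illustrative variable holding the nsp4 prices from the example above.

```r
# Closing prices of one stock (the nsp4 values from the example, NA row dropped)
p <- c(961, 945, 952, 941, 946, 888)

# Definition from the question: log(price today / price yesterday)
by_ratio <- log(p[-1] / p[-length(p)])

# Shortcut used in the answer: first differences of the log prices
by_diff <- diff(log(p), lag = 1)

# The two agree up to floating-point error, and the result is one element shorter
print(all.equal(by_ratio, by_diff))
print(length(by_diff))
print(by_diff)
```

Both forms give the same five returns for six prices, which is why one date has to be dropped in Step 2.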

Step 2: Creating dataframe of log-returns and dates

  • We can't just use cbind(). First, the lengths differ (the returns have one row fewer than the prices), so we drop the first date with newdates[-1]

  • Second, cbind() would build a numeric matrix, so the dates would be converted to plain numbers (a Date, for example, becomes its count of days since 1970-01-01).
    Instead we use cbind.data.frame(x,y), as seen above.

  • Now the data is ready: we wrap it in data.frame() and assign the result to logs, i.e. logs=data.frame(...) as above.
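
The difference between the two ways of binding can be seen on a tiny example. This is only a sketch: dates and rets are made-up illustrative objects, and the return values in rets are arbitrary.

```r
# Three dates and a one-column matrix of made-up returns
dates <- as.Date(c("2000-01-04", "2000-01-05", "2000-01-06"))
rets  <- matrix(c(0, 0.01, -0.02), ncol = 1, dimnames = list(NULL, "nsp1"))

# cbind() returns a numeric matrix, so the dates lose their Date class
m <- cbind(dates, rets)
print(m[, "dates"])        # day counts since 1970-01-01

# A data frame keeps a separate class for each column
df <- data.frame(newdates = dates, rets)
print(class(df$newdates))
print(df)
```

Naming the date column explicitly (newdates = ...) also avoids an auto-generated column name in the result.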

If your dataset looks like the prices data frame, this should run as is. The key point is to use diff(log(x)) to compute log returns easily.
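
On the NA part of the question: with diff(log(x)), a company that is not yet listed simply gets NA returns until its first two consecutive prices, so those leading NAs can usually be left alone and skipped later (for example with na.rm = TRUE). A gap in the middle of a series, however, produces two NA returns. One possible treatment, sketched below, is to carry the last observed price forward before taking differences. This is an assumption about what is appropriate for your data, not the only option, and fill_forward is a hypothetical helper written for this illustration.

```r
# Example prices as shown above
prices <- data.frame(
  newdates = as.Date(c("2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06",
                       "2000-01-07", "2000-01-10", "2000-01-11")),
  nsp1 = c(NA, 79.5, 79.5, 79.5, NA, 79.5, 79.5),
  nsp2 = c(NA, 325, 322.5, 327.5, 327.5, 327.5, 327.5),
  nsp3 = NA_real_,
  nsp4 = c(NA, 961, 945, 952, 941, 946, 888)
)

# Hypothetical helper: carry the last observed price forward.
# Leading NAs (dates before the first quote) stay NA.
fill_forward <- function(x) {
  seen <- cumsum(!is.na(x))        # how many real prices have been seen so far
  c(NA, x[!is.na(x)])[seen + 1]    # element 1 is NA, used while seen == 0
}

# sapply() returns a numeric matrix, so diff() can be applied directly
filled <- sapply(prices[, -1], fill_forward)

logs_filled <- data.frame(
  newdates = prices$newdates[-1],
  diff(log(filled))
)
print(logs_filled)
```

With this choice the missing day gets a return of 0 and the whole price change lands on the next quoted day; note that it also extends the last price past the end of a series, which may not be what you want for delisted companies.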

If you have any questions or problems, just ask.



Source: https://stackoverflow.com/questions/34514240/log-returns-of-multiple-securities-for-multiple-time-period-in-r
