Question
I am trying to calculate the percentage change in price for quarterly data of companies identified by a gvkey (1001, 1384, etc.) and its corresponding quarterly stock price, PRCCQ.
gvkey PRCCQ
1 1004 23.750
2 1004 13.875
3 1004 11.250
4 1004 10.375
5 1004 13.600
6 1004 14.000
7 1004 17.060
8 1004 8.150
9 1004 7.400
10 1004 11.440
11 1004 6.200
12 1004 5.500
13 1004 4.450
14 1004 4.500
15 1004 8.010
What I am trying to do is add 8 columns showing the 1-quarter return, 2-quarter return, and so on, all the way to 8 quarters. I have been able to calculate the 1-quarter return for each PRCCQ by using the Delt function of quantmod and ddply of plyr, and I was also able to get the 2-quarter return using the same code by altering k.
ddply(data, "gvkey", transform, DeltaCol = Delt(PRCCQ,k=2))
However, this call will NOT let me go higher than k=2 without giving an error about differing numbers of rows (2, 3). I've tried many alternative methods, but none of them worked. Is there a function I can plug into my ddply code to replace Delt, or perhaps a completely different line of code, to display all 8 quarters of return in individual columns?
Answer 1:
You can declare your data as ts() and use cbind() and diff():
data <- read.table(header=T,text='gvkey PRCCQ
1004 23.750
1004 13.875
1004 11.250
1004 10.375
1004 13.600
1004 14.000
1004 17.060
1005 8.150
1005 7.400
1005 11.440
1005 6.200
1005 5.500
1005 4.450
1005 4.500
1005 8.010')
# Split by company, then build the lagged differences within each group
data <- split(data, list(data$gvkey))
(newdata <- do.call(rbind, lapply(data, function(data) {
  data <- ts(data)
  cbind(data, Quarter = diff(data[, 2]), Two.Quarter = diff(data[, 2], 2))
})))
data.gvkey data.PRCCQ Quarter Two.Quarter
[1,] 1004 23.750 NA NA
[2,] 1004 13.875 -9.875 NA
[3,] 1004 11.250 -2.625 -12.500
[4,] 1004 10.375 -0.875 -3.500
[5,] 1004 13.600 3.225 2.350
[6,] 1004 14.000 0.400 3.625
[7,] 1004 17.060 3.060 3.460
[8,] 1005 8.150 NA NA
[9,] 1005 7.400 -0.750 NA
[10,] 1005 11.440 4.040 3.290
[11,] 1005 6.200 -5.240 -1.200
[12,] 1005 5.500 -0.700 -5.940
[13,] 1005 4.450 -1.050 -1.750
[14,] 1005 4.500 0.050 -1.000
[15,] 1005 8.010 3.510 3.560
EDIT: Another way, without split() and lapply() (probably faster):
data <- read.table(header=T,text='gvkey PRCCQ
1004 23.750
1004 13.875
1004 11.250
1004 10.375
1004 13.600
1004 14.000
1004 17.060
1005 8.150
1005 7.400
1005 11.440
1005 6.200
1005 5.500
1005 4.450
1005 4.500
1005 8.010')
# by() groups the rows by gvkey and applies the function to each group
newdata <- do.call(rbind, by(data, data$gvkey, function(data) {
  data <- ts(data)
  cbind(data, Quarter = diff(data[, 2]), Two.Quarter = diff(data[, 2], 2))
}))
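Note that diff() returns absolute price differences, whereas the question asks for percentage changes. As a minimal sketch in base R (the vector name prices and the choice k = 2 are illustrative assumptions, not part of the answer above), a difference at lag k can be turned into a k-period return by dividing it by the price k periods earlier:

```r
# First four PRCCQ values for gvkey 1004, taken from the data above
prices <- c(23.750, 13.875, 11.250, 10.375)
k <- 2  # illustrative lag

# diff(x, lag = k) gives x[t] - x[t - k]; its length is length(x) - k
abs_change <- diff(prices, lag = k)

# head(x, -k) drops the last k elements, leaving the prices k periods earlier
pct_change <- abs_change / head(prices, -k)

# Pad with k leading NAs so the result lines up with the original rows
pct_change_padded <- c(rep(NA_real_, k), pct_change)
print(pct_change_padded)
```

The padded vector has the same length as prices, so it can be bound to the data as an extra column.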
Answer 2:
df <- read.table(text="gvkey PRCCQ
1 1004 5.500
2 1004 4.450
3 1004 4.500
4 1004 8.010
5 1005 4.450
6 1005 4.500",header=TRUE)
library(plyr)
library(quantmod)
ddply(df, "gvkey", transform, DeltaCol = Delt(PRCCQ,k=3))
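# error: gvkey 1005 has only 2 rows, fewer than k = 3
# Wrapper that returns NAs when a group is too short for the requested lag
Delt2 <- function(x, k) {
  if (length(x) > k) as.vector(Delt(x1 = x, k = k)) else rep(NA, length(x))
}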
ddply(df, "gvkey", transform, DeltaCol = Delt2(PRCCQ,k=3))
# gvkey PRCCQ DeltaCol
#1 1004 5.50 NA
#2 1004 4.45 NA
#3 1004 4.50 NA
#4 1004 8.01 0.4563636
#5 1005 4.45 NA
#6 1005 4.50 NA
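To get all eight return columns the question asks for, the same guard can be applied in a loop over k. The following is a rough sketch in base R only, so it does not depend on quantmod or plyr; the helper pct_change and the column names Ret1Q to Ret8Q are hypothetical names chosen for illustration:

```r
# Hypothetical helper: k-period percentage change, NA-padded to the input
# length. Groups with no more than k rows yield all NAs instead of an error.
pct_change <- function(x, k = 1) {
  n <- length(x)
  if (n <= k) return(rep(NA_real_, n))
  c(rep(NA_real_, k), x[(k + 1):n] / x[1:(n - k)] - 1)
}

df <- read.table(header = TRUE, text = "gvkey PRCCQ
1004 5.500
1004 4.450
1004 4.500
1004 8.010
1005 4.450
1005 4.500")

# One column per horizon; ave() applies the helper within each gvkey
# and returns the values in the original row order.
for (k in 1:8) {
  df[[paste0("Ret", k, "Q")]] <- ave(df$PRCCQ, df$gvkey,
                                     FUN = function(x) pct_change(x, k))
}
print(df)
```

With this small sample only the shortest horizons have non-missing values; on the full quarterly data each column fills in from row k + 1 of every company onward.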
Source: https://stackoverflow.com/questions/14900885/how-to-calculate-percentage-change-from-different-rows-over-different-spans