plyr | 易学教程

How to calculate percentage change from different rows over different spans

Question: I am trying to calculate the percentage change in price for quarterly data of companies identified by a gvkey (1001, 1384, etc.) and the corresponding quarterly stock price, PRCCQ.

       gvkey  PRCCQ
    1   1004 23.750
    2   1004 13.875
    3   1004 11.250
    4   1004 10.375
    5   1004 13.600
    6   1004 14.000
    7   1004 17.060
    8   1004  8.150
    9   1004  7.400
    10  1004 11.440
    11  1004  6.200
    12  1004  5.500
    13  1004  4.450
    14  1004  4.500
    15  1004  8.010

What I am trying to do is add 8 columns showing the 1-quarter return, the 2-quarter return, etc.
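A minimal base R sketch of one way to build such columns, using the rows quoted in the question. The helper `pct_change` and the column names `ret_1q` to `ret_8q` are hypothetical, and the sketch assumes that a k-quarter return means the percentage change relative to the price k rows earlier within the same gvkey.

```r
# Hypothetical helper: percentage change of x relative to the value k rows earlier.
# The first k entries have no earlier value to compare against, so they are NA.
pct_change <- function(x, k) {
  n <- length(x)
  if (k >= n) return(rep(NA_real_, n))
  earlier <- x[1:(n - k)]
  later <- x[(k + 1):n]
  c(rep(NA_real_, k), (later - earlier) / earlier * 100)
}

# The rows shown in the question.
df <- data.frame(
  gvkey = rep(1004, 15),
  PRCCQ = c(23.750, 13.875, 11.250, 10.375, 13.600, 14.000, 17.060, 8.150,
            7.400, 11.440, 6.200, 5.500, 4.450, 4.500, 8.010)
)

# ave() applies the helper separately within each gvkey, so a return never
# spans two different companies.
for (k in 1:8) {
  df[[paste0("ret_", k, "q")]] <- ave(df$PRCCQ, df$gvkey,
                                      FUN = function(x) pct_change(x, k))
}

head(df[, c("gvkey", "PRCCQ", "ret_1q", "ret_2q")])
```

Grouping with `ave()` keeps the rows in their original order, which matters here because the calculation depends on the quarters already being sorted within each company.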

How to expand a large dataframe in R

Question: I have a data frame:

    df <- data.frame(
      id = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4),
      date = c("1985-06-19", "1985-06-19", "1985-06-19", "1985-08-01", "1985-08-01",
               "1990-06-19", "1990-06-19", "1990-06-19", "1990-06-19", "2000-05-12"),
      spp = c("a", "b", "c", "c", "d", "b", "c", "d", "a", "b"),
      y = rpois(10, 5))

       id       date spp y
    1   1 1985-06-19   a 6
    2   1 1985-06-19   b 3
    3   1 1985-06-19   c 7
    4   2 1985-08-01   c 7
    5   2 1985-08-01   d 6
    6   3 1990-06-19   b 5
    7   3 1990-06-19   c 4
    8   3 1990-06-19   d 4
    9   3 1990-06-19   a 6
    10  4 2000

How can I calculate an inner product with an arbitrary number of columns using ddply?

Question: I want to compute the inner product of the first D columns of each row in a data frame with a given array, W. I am trying the following:

    W = c(1, 2, 3)
    ddply(df, .(id), transform, inner_product = c(col1, col2, col3) %*% W)

This works, but I may have an arbitrary number of columns. Can I generalize the above expression to handle that case?

Update: here is an updated example, as asked for in the comments:

    library(kernlab)
    data(spam)
    W = array()
    W[1:3] = seq(1, 3)
    spamdf = head(spam)
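One possible generalization, sketched in base R rather than with ddply: treat the first D columns as a matrix and multiply it by W in a single step. The data frame below is hypothetical example data, and the sketch assumes that the first `length(W)` columns are numeric.

```r
# Hypothetical example data: three numeric columns first, then an id column.
df <- data.frame(
  col1 = c(1, 0, 2, 1),
  col2 = c(0, 1, 2, 1),
  col3 = c(1, 1, 0, 3),
  id   = 1:4
)

W <- c(1, 2, 3)
D <- length(W)  # number of leading columns to use

# drop = FALSE keeps a data frame even when D is 1, so as.matrix() always
# returns a matrix with one row per observation.
X <- as.matrix(df[, seq_len(D), drop = FALSE])

# Matrix-vector product: one inner product per row.
df$inner_product <- drop(X %*% W)

df$inner_product  # 4 5 6 12
```

Because the product is computed for all rows at once, no grouping by id is needed, and changing the length of W changes the number of columns used.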

selecting specific rows etc. using ddply

Question: I have a three-part question based on a data frame of goals scored by soccer players in a season (df holds the example rows):

    Player            Season  Goals
    Teddy Sheringham  1992/3     22
    Les Ferdinand     1992/3     20
    Dean Holdsworth   1992/3     19
    Andy Cole         1993/4     34
    Alan Shearer      1993/4     31
    Chris Sutton      1993/4     25

If I want to obtain the top scorer each year, I can use

    ddply(df, "Season", summarise, maxGoals = max(Goals), Player = Player[which.max(Goals)])

Questions: 1) It does not apply in this case, but does this suffice if there

How do I sub sample data by group using ddply?

Question: I've got a data frame with far too many rows to be able to do a spatial correlogram. Instead, I want to grab 40 rows for each species and run my correlogram on that subset. I wrote a function to subset a data frame as follows:

    samp <- function(dataf) {
      dataf[sample(1:dim(dataf)[1], size=40, replace=FALSE),]
    }

Now I want to apply this function to each species in a larger data frame. When I try something like

    culled_data = ddply(larger_data, .(species), subset, samp)

I get this error: Error in
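A minimal sketch of per-group sampling in base R, using hypothetical example data. The `size` argument and the `min()` guard are additions that are not in the question; they keep the function from failing on species with fewer than 40 rows. With plyr loaded, the analogous call would pass the function directly, as in `ddply(larger_data, .(species), samp)`, rather than going through `subset`.

```r
# Sample up to `size` rows from one data frame, without replacement.
samp <- function(dataf, size = 40) {
  n <- nrow(dataf)
  dataf[sample(seq_len(n), size = min(size, n), replace = FALSE), , drop = FALSE]
}

# Hypothetical example data: three species with 100, 60 and 25 rows.
set.seed(1)
larger_data <- data.frame(
  species = rep(c("a", "b", "c"), times = c(100, 60, 25)),
  x = runif(185),
  y = runif(185)
)

# Split by species, sample each piece, then bind the pieces back together.
pieces <- lapply(split(larger_data, larger_data$species), samp)
culled_data <- do.call(rbind, pieces)

table(culled_data$species)
```

Species "c" has only 25 rows, so all of them are kept, while the other two species are cut down to 40 rows each.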

ddply summarise proportional count

Question: I am having some trouble using the ddply function from the plyr package. I am trying to summarise the following data with counts and proportions within each group. Here's my data:

    structure(list(X5employf = structure(c(1L, 3L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 3L, 3L, 2L, 1L, 1L, 3L, 1L, 3L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L
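A rough base R sketch of counts and within-group proportions. The data quoted in the question is cut off, so everything below is hypothetical apart from the column name X5employf: the `group` column, the factor levels, and the values are assumptions made for illustration only.

```r
# Hypothetical example data; only the column name X5employf comes from the question.
df <- data.frame(
  group     = c("x", "x", "x", "y", "y"),
  X5employf = factor(c("a", "b", "a", "a", "a"))
)

# Count every combination of group and X5employf (combinations that never
# occur get a count of zero).
counts <- as.data.frame(
  table(group = df$group, X5employf = df$X5employf),
  responseName = "n"
)

# Divide each count by the total for its group to get within-group proportions.
counts$prop <- counts$n / ave(counts$n, counts$group, FUN = sum)

counts
```

The proportions sum to 1 within each value of `group`, which is a quick check that the denominator is the group total and not the overall row count.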

Reshape package masking preventing melt from naming columns

Question: I have a script which requires both the reshape and reshape2 libraries. I know this is poor practice, but I think plyr, or another library I am using (Vennerable), is loading reshape, and I have personally used reshape2 in a lot of places. The problem is that the masking of reshape2 by reshape is causing problems for the melt function:

    # Example data frame
    df <- data.frame(id=c(1:5), a=c(rnorm(5)), b=c(rnorm(5)))
    # With just reshape2, variable and value columns are labelled correctly
    library(reshape2

Conditional NA filling by group

Question: Edit: The question was originally asked for data.table. A solution with any package would be interesting. I am a little stuck with a particular variation of a more general problem. I have panel data that I am using with data.table, and I would like to fill in some missing values using the group-by functionality of data.table. Unfortunately they are not numeric, so I can't simply interpolate, and they should only be filled in based on a condition. Is it possible to perform a kind of conditional