plyr | 易学教程

R growth rate calculation week over week on daily timeseries data

Question: I'm trying to calculate week-over-week (w/w) growth rates entirely in R. I could use Excel, or preprocess with Ruby, but that's not the point.

Example data.frame:

        date   gpv        type
1 2013-04-01 12900 back office
2 2013-04-02 16232 back office
3 2013-04-03 10035 back office

I want to do this factored by 'type', and I need to roll the Date column up into weeks and then calculate the week-over-week growth. I think I need to use ddply to group by week, with a custom function that determines if a date is in a
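One possible way to approach this, sketched in base R rather than with ddply so that it runs without plyr installed. The data values below are made up for illustration (only the column names date, gpv and type come from the question), and growth is assumed to mean (this week - last week) / last week on weekly gpv totals.

```r
# Hypothetical daily data in the shape shown in the question (values are made up):
# three full weeks starting on Monday 2013-04-01.
df <- data.frame(
  date = seq(as.Date("2013-04-01"), by = "day", length.out = 21),
  gpv  = rep(c(100, 200, 400), each = 7),
  type = "back office"
)

# Label each date with the first day of its week (cut() starts weeks on Monday by default).
df$week <- as.Date(as.character(cut(df$date, breaks = "week")))

# Total gpv per type and week, ordered so consecutive rows are consecutive weeks.
weekly <- aggregate(gpv ~ type + week, data = df, FUN = sum)
weekly <- weekly[order(weekly$type, weekly$week), ]

# Week-over-week growth within each type; the first week of a type has no predecessor.
weekly$growth <- ave(weekly$gpv, weekly$type,
                     FUN = function(x) c(NA, diff(x) / head(x, -1)))

print(weekly)
```

With these made-up values the weekly totals double each week, so growth is NA for the first week and 1 (that is, 100%) afterwards. Weeks with missing days are summed as-is, which may or may not be what is wanted for partial weeks.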

How to sum a variable by group

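Suppose I have two columns of data. The first contains categories such as "First", "Second", "Third", and so on. The second contains numbers representing how many times I saw the corresponding category. For example:

Category Frequency
First    10
First    15
First    5
Second   2
Third    14
Third    20
Second   3

I want to sort the data by category and sum the frequencies:

Category Frequency
First    30
Second   5
Third    34

How would I do this in R?

Answer #1: If x is the data frame containing the data, the following will do what you want:

  require(reshape)
  recast(x, Category ~ ., fun.aggregate=sum)

Answer #2:

  library(plyr)
  ddply(tbl, .(Category), summarise, sum = sum(Frequency))

Answer #3: Just to add a third option:

  require(doBy)
  summaryBy(Frequency~Category, data=yourdataframe, FUN=sum)

Edit: this is a very old answer. I would now recommend using group_by and summarise from dplyr, as shown in @docendo's answer.

Answer #4: Using aggregate:

  aggregate(x$Frequency, by=list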
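For reference, a self-contained base R sketch on the question's own data. It uses the formula interface of aggregate, which is not necessarily the call that the truncated answer above goes on to give; the names x and totals are chosen here for illustration.

```r
# The example data from the question.
x <- data.frame(
  Category  = c("First", "First", "First", "Second", "Third", "Third", "Second"),
  Frequency = c(10, 15, 5, 2, 14, 20, 3)
)

# Sum Frequency within each Category; the result is sorted by Category.
totals <- aggregate(Frequency ~ Category, data = x, FUN = sum)

print(totals)
```

This should reproduce the totals the question asks for (30, 5 and 34).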

Aggregate rows by shared values in a variable

Question: I have a somewhat dumb R question. If I have a matrix (or data frame, whichever is easier to work with) like:

Year Match
2008 1808
2008 137088
2008 1
2008 56846
2007 2704
2007 169876
2007 75750
2006 2639
2006 193990
2006 2

and I wanted to sum the match counts for each year (so, e.g., the 2008 row would be 2008 195743), how would I go about doing this? I've got a few solutions in my head, but they are all needlessly complicated, and R tends to have some much easier solution tucked away
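Since the question mentions a matrix, here is a minimal base R sketch that works directly on matrix columns with tapply. The object names m and totals are illustrative; the values are the ones listed in the question.

```r
# The question's data as a numeric matrix with named columns.
m <- cbind(
  Year  = c(2008, 2008, 2008, 2008, 2007, 2007, 2007, 2006, 2006, 2006),
  Match = c(1808, 137088, 1, 56846, 2704, 169876, 75750, 2639, 193990, 2)
)

# Sum Match within each Year; the result is named by year, in ascending order.
totals <- tapply(m[, "Match"], m[, "Year"], sum)

print(totals)
```

The 2008 entry comes out as 195743, matching the figure given in the question.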

Replacing column values with maximum by group

Question: Say I want to locate the maximum values in one column based on the value of another (i.e. max by group). I found a number of helpful threads on how to do this (ex1 ex2). For example, using the plyr package, ddply(data, .(x), summarise, max.score=max(y)) returns the maximum value of y for each x. However, what if I then wanted to replace all elements in x < max(y) with max(y) itself? (The specific application would be to recode all dates in a particular set with that set's end date.
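A minimal sketch of one reading of this, assuming the goal is to overwrite every y with the maximum y of its x group. It uses base R's ave; the data values are made up, and only the column names x and y follow the question.

```r
# Hypothetical data: x is the grouping column, y the value column.
data <- data.frame(
  x = c("a", "a", "b", "b", "b"),
  y = c(1, 5, 2, 9, 4)
)

# ave() returns a vector as long as y, holding each row's group maximum,
# so assigning it back replaces every y with the maximum of its group.
data$y <- ave(data$y, data$x, FUN = max)

print(data)
```

With plyr, ddply(data, .(x), transform, y = max(y)) should give the same values, though the rows come back grouped by x rather than in their original order.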

loop_apply.o: file not recognized: File format not recognized

Question: I am trying to install R's plyr package. Here is the error message:

* installing *source* package ‘plyr’ ...
** package ‘plyr’ successfully unpacked and MD5 sums checked
** libs
clang++ -I/opt/R-3.4.1/include -DNDEBUG -I"/home/isomorphismes/R/i686-pc-linux-gnu-library/3.4/Rcpp/include" -I/usr/local/include -fpic -I/opt/boost_1_61_0/boost -c RcppExports.cpp -o RcppExports.o
clang -I/opt/R-3.4.1/include -DNDEBUG -I"/home/cd/R/i686-pc-linux-gnu-library/3.4/Rcpp/include" -I/usr/local/include

Return value based on finding closest value between other two columns in df

Question: My question is almost identical to this one, except that instead of finding the closest value between a column value and a fixed number, e.g. "2", I want to find the closest value to the value in another column. Here's an example of data:

df <- data.frame(site_no=c("01010500", "01010500", "01010500","02010500", "02010500", "02010500", "03010500", "03010500", "03010500"),
                 OBS=c(423.9969, 423.9969, 423.9969, 123, 123, 123, 150,150,150),
                 MOD=c(380,400,360,150,155,135,170,180,140),
                 HT=c(14,12,15,3,8
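Because the excerpt is cut off, the following is only a sketch under an assumed reading: for each site_no, pick the row whose MOD is closest to OBS and return that row's HT. The data frame below is a made-up stand-in that reuses the question's column names, not the question's actual values.

```r
# Hypothetical stand-in data (not the question's values).
df <- data.frame(
  site_no = c("A", "A", "A", "B", "B", "B"),
  OBS     = c(100, 100, 100, 50, 50, 50),
  MOD     = c(80, 95, 120, 70, 45, 52),
  HT      = c(1, 2, 3, 4, 5, 6)
)

# Split by site, keep the row with the smallest |MOD - OBS|, and bind the rows back.
# which.min() returns the first minimum, so ties go to the earliest row.
closest <- do.call(rbind, lapply(split(df, df$site_no), function(g) {
  g[which.min(abs(g$MOD - g$OBS)), ]
}))

print(closest)
```

For site A the closest MOD is 95 (HT 2), and for site B it is 52 (HT 6).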

R plyr, data.table, apply certain columns of data.frame

Question: I am looking for ways to speed up my code. I am looking into the apply / ply methods as well as data.table. Unfortunately, I am running into problems. Here is a small sample of data:

ids1 <- c(1, 1, 1, 1, 2, 2, 2, 2)
ids2 <- c(1, 2, 3, 4, 1, 2, 3, 4)
chars1 <- c("aa", " bb ", "__cc__", "dd ", "__ee", NA,NA, "n/a")
chars2 <- c("vv", "_ ww_", " xx ", "yy__", " zz", NA, "n/a", "n/a")
data <- data.frame(col1 = ids1, col2 = ids2, col3 = chars1, col4 = chars2, stringsAsFactors = FALSE)

Here is a