Question
I have a large data frame in which I am multiplying two columns together to get another column. At first I was running a for-loop, like so:
for (i in 1:nrow(df)) {
  df$new_column[i] <- df$column1[i] * df$column2[i]
}
but this takes about 9 days.
Another alternative was plyr, though I may be using the variables incorrectly:
new_df <- ddply(df, .(column1,column2), transform, new_column = column1 * column2)
but this is also taking forever.
Answer 1:
As Blue Magister said in comments,
df$new_column <- df$column1 * df$column2
should work just fine. Of course we can never know for sure if we don't have an example of the data.
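To see that the vectorized form gives the same result as the loop from the question, here is a minimal sketch. The data is made up (random numbers, 10,000 rows) and only the column names are taken from the question, so treat it as an illustration rather than a benchmark of the real data.

```r
# Hypothetical example data; column names follow the question.
set.seed(1)
n <- 10000
df <- data.frame(column1 = runif(n), column2 = runif(n))

# Vectorized: a single call multiplies the two columns element-wise.
vectorized <- df$column1 * df$column2

# Element-by-element loop, writing into a preallocated plain vector.
looped <- numeric(n)
for (i in seq_len(n)) {
  looped[i] <- df$column1[i] * df$column2[i]
}

# Both approaches give the same values.
stopifnot(isTRUE(all.equal(vectorized, looped)))

# Store the result as a new column.
df$new_column <- vectorized
```

Arithmetic operators in R work on whole vectors, so the loop adds nothing except per-element overhead.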
Answer 2:
A data.table solution will avoid a lot of internal copying and has the advantage of not spattering the code with $.
library(data.table)
DT <- data.table(df)
DT[ , new := column1 * column2]
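Note that data.table(df) itself copies the data frame once. If that copy matters for a very large data frame, setDT() may be worth a look, since it converts the object in place. A minimal sketch, assuming the data.table package is installed and using a small made-up data frame with the column names from the question:

```r
library(data.table)

# Hypothetical example data; column names follow the question.
df <- data.frame(column1 = c(1, 2, 3), column2 = c(4, 5, 6))

# setDT() converts df to a data.table in place, without copying it.
setDT(df)

# := adds the column by reference, so no reassignment is needed.
df[, new := column1 * column2]
```

After this, df is a data.table (which is still a data.frame), and the new column is available as df$new.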
Answer 3:
A minor, somewhat less efficient variation on Sacha's answer is to use transform() or within():
df <- transform(df, new = column1 * column2)
or
df <- within(df, new <- column1 * column2)
(I hate spattering my user code with $.)
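As a quick check that the $ form, transform() and within() agree, here is a small sketch on made-up data (three rows, with the column names from the question). It only compares results and says nothing about speed.

```r
# Hypothetical example data; column names follow the question.
df <- data.frame(column1 = c(1, 2, 3), column2 = c(4, 5, 6))

# Direct assignment with $.
by_dollar <- df
by_dollar$new <- by_dollar$column1 * by_dollar$column2

# transform() evaluates the expression with the columns in scope.
by_transform <- transform(df, new = column1 * column2)

# within() does the same, using assignment syntax inside the expression.
by_within <- within(df, new <- column1 * column2)

# All three produce the same data frame.
stopifnot(isTRUE(all.equal(by_dollar, by_transform)))
stopifnot(isTRUE(all.equal(by_dollar, by_within)))
```

Both transform() and within() return a modified copy, which is why the result has to be assigned back to df in the lines above.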
Source: https://stackoverflow.com/questions/12357592/efficient-multiplication-of-columns-in-a-data-frame