Efficient multiplication of columns in a data frame

可紊 submitted on 2020-01-09 10:07:48

Question


I have a large data frame in which I am multiplying two columns together to get another column. At first I was running a for-loop, like so:

for(i in 1:nrow(df)){
    df$new_column[i] <- df$column1[i] * df$column2[i]
}

but this takes like 9 days.

Another alternative was plyr, and I actually might be using the variables incorrectly:

new_df <- ddply(df, .(column1,column2), transform, new_column = column1 * column2)

but this is taking forever as well.


Answer 1:


As Blue Magister said in comments,

df$new_column <- df$column1 * df$column2

should work just fine. Of course we can never know for sure if we don't have an example of the data.
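To illustrate why the one-liner is enough, here is a minimal sketch on a toy data frame (the values are made up, since the original data was never posted). It runs the question's loop and the vectorized form side by side. The loop is slow largely because each `df$new_column[i] <-` assignment goes through data frame replacement, which can copy data on every iteration, whereas `*` on two columns works on the whole vectors in a single call.

```r
# Toy data standing in for the asker's large data frame (made-up values).
df <- data.frame(column1 = c(1.5, 2, 3), column2 = c(2, 4, 6))

# Element-by-element loop, as in the question, writing into a plain vector.
loop_result <- numeric(nrow(df))
for (i in seq_len(nrow(df))) {
  loop_result[i] <- df$column1[i] * df$column2[i]
}

# Vectorized form: `*` multiplies the two columns element-wise in one call.
df$new_column <- df$column1 * df$column2

# Both approaches should agree.
same <- isTRUE(all.equal(df$new_column, loop_result))
```

On a real data set the two results should be identical; only the running time differs.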




Answer 2:


A data.table solution avoids a lot of internal copying and has the added advantage of not spattering the code with $.

 library(data.table)
 DT <- data.table(df)
 DT[ , new := column1 * column2]
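As a variation on the snippet above, the following sketch uses `setDT()`, which converts an existing data frame to a data.table by reference rather than building a copy the way `data.table(df)` does. The data values are made up for illustration, and the data.table package is assumed to be installed.

```r
library(data.table)

# Toy data (made-up values) in place of the asker's data frame.
df <- data.frame(column1 = c(1, 2, 3), column2 = c(4, 5, 6))

# setDT() turns df into a data.table in place, without copying it.
setDT(df)

# := adds the new column by reference; df does not need to be reassigned.
df[, new := column1 * column2]
```

Because `:=` modifies the table in place, the result is available in `df` immediately afterwards.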



Answer 3:


A minor, somewhat less efficient variation on Sacha's answer is to use transform() or within():

df <- transform(df, new = column1 * column2)

or

df <- within(df, new <- column1 * column2)

(I hate spattering my user code with $.)
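For a quick check that both forms give the same column as the direct `$` assignment, here is a small sketch on made-up data (the data frame and its values are assumptions for illustration only):

```r
# Toy data (made-up values).
df <- data.frame(column1 = c(1, 2, 3), column2 = c(4, 5, 6))

# transform() takes name = expression pairs evaluated inside df.
df_t <- transform(df, new = column1 * column2)

# within() evaluates a block of assignments inside df and keeps the results.
df_w <- within(df, new <- column1 * column2)

# Direct vectorized multiplication for comparison.
direct <- df$column1 * df$column2
```

Both functions return a modified copy, so the result has to be assigned back, as in the two one-liners above.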



Source: https://stackoverflow.com/questions/12357592/efficient-multiplication-of-columns-in-a-data-frame
