Question
Within the data.table package in R, is there a way to pass a character vector to the by argument of a calculation?
Here is an example of the desired output, using mtcars:
library(data.table)
mtcars <- data.table(mtcars)
ColSelect <- 'cyl' # One column option
mtcars[, .(AveMpg = mean(mpg)), by = .(ColSelect)] # Doesn't work
# Desired Output
cyl AveMpg
1: 6 19.74286
2: 4 26.66364
3: 8 15.10000
I know this is possible when assigning column names in j, by wrapping the vector in parentheses:
ColSelect <- 'AveMpg' # Column to be assigned for average mpg value
mtcars[,(ColSelect):= mean(mpg), by = .(cyl)]
head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb AveMpg
1: 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 19.74286
2: 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 19.74286
3: 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 26.66364
4: 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 19.74286
5: 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 15.10000
6: 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1 19.74286
What should I put in the by argument to achieve this?
Answer 1:
Try passing the variable to by directly, without wrapping it in .():
mtcars <- data.table(mtcars)
ColSelect <- 'cyl' # One Column Option
mtcars[, AveMpg := mean(mpg), by = ColSelect] # Should work
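Note that the line above uses :=, which adds AveMpg as a column to every row instead of returning one row per group. As a minimal sketch of the summary form the question asked for, the same by = ColSelect can be combined with .() in j. The names dt and res are only illustrative; dt is a copy so the built-in mtcars is left untouched.

```r
library(data.table)

# Work on a copy so the built-in mtcars data frame is not modified.
dt <- as.data.table(mtcars)
ColSelect <- "cyl"  # name of the grouping column, held as a string

# Summary form: returns one row per group, as in the desired output.
res <- dt[, .(AveMpg = mean(mpg)), by = ColSelect]
print(res)

# Update form (as in the answer): adds AveMpg to dt by reference.
dt[, AveMpg := mean(mpg), by = ColSelect]
print(head(dt))
```

Which form to use depends on whether you want a per-group summary table or the group mean attached to each original row.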
Answer 2:
From ?data.table, the by section says that by accepts:
- a single character string containing comma separated column names (where spaces are significant since column names may contain spaces even at the start or end): e.g., DT[, sum(a), by="x,y,z"]
- a character vector of column names: e.g., DT[, sum(a), by=c("x", "y")]
So yes, you can use the answer in @cccmir's response. You can also use c() as @akrun mentioned, but that seems slightly extraneous unless you want multiple columns.
The reason you cannot use the .() syntax is that in data.table, .() is an alias for list(). According to the same help for by, the list() syntax requires an expression of column names, not a character string.
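To make the difference concrete, here is a small sketch comparing the three spellings. It assumes a fresh copy of mtcars called dt (an illustrative name), and it wraps the failing call in tryCatch so the script keeps running; the exact error text is whatever your data.table version reports.

```r
library(data.table)

dt <- as.data.table(mtcars)
ColSelect <- "cyl"

# .() is list(): it takes unquoted column expressions.
by_expr <- dt[, .(AveMpg = mean(mpg)), by = .(cyl)]

# A character vector of column names is passed without the .() wrapper.
by_char <- dt[, .(AveMpg = mean(mpg)), by = ColSelect]

# Wrapping the variable in .() yields list("cyl"): a length-1 value,
# not a column, so this call is expected to fail.
by_wrapped <- tryCatch(
  dt[, .(AveMpg = mean(mpg)), by = .(ColSelect)],
  error = function(e) conditionMessage(e)
)

print(all.equal(by_expr, by_char))
print(by_wrapped)
```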
Going off the examples in the by help, if you wanted to group by multiple variables and pass their names as characters, you could do:
mtcars[,.( AveMpg = mean(mpg)), by = "cyl,am"]
mtcars[,.( AveMpg = mean(mpg)), by = c("cyl","am")]
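If the column names live in a variable, as in the question, the same idea extends to several columns. This is just a sketch: GroupCols and dt are hypothetical names introduced here, not part of the original question.

```r
library(data.table)

dt <- as.data.table(mtcars)

# Hypothetical variable holding the grouping column names.
GroupCols <- c("cyl", "am")

# Character vector stored in a variable.
res_vec <- dt[, .(AveMpg = mean(mpg)), by = GroupCols]

# Equivalent single comma-separated string (no spaces around the comma).
res_str <- dt[, .(AveMpg = mean(mpg)), by = "cyl,am"]

print(res_vec)
print(all.equal(res_vec, res_str))
```

The vector form is usually easier to build programmatically, since the names can come from another function or from user input.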
Source: https://stackoverflow.com/questions/48442422/use-a-character-vector-in-the-by-argument