Question
A completely basic question - and forgive me if it is a duplicate.
set.seed(1)
df <-
data.frame(id=c('a', 'a', 'b', 'b', 'a'),
a=sample(1:10, size=5, replace=T),
b=sample(1:10, size=5, replace=T),
c=sample(1:10, size=5, replace=T))
Then,
> df
id a b c
1 a 3 9 3
2 a 4 10 2
3 b 6 7 7
4 b 10 7 4
5 a 3 1 8
To return the name of the column (a, b or c) with the largest value, or the name of the second-largest column if the top name equals the value in the id variable, I use the function below.
FUN <- function(r) {
top <- names(r[,c('a', 'b', 'c')])[order(r[,c('a', 'b', 'c')], decreasing=T)]
ifelse(top[1] == r[['id']], top[2], top[1])
}
I can do:
FUN(df[1,]) #[1] "b"
and for all rows:
res <- NULL
for(i in 1:nrow(df)) {
res <- c(res, FUN(df[i,]))
}
And get
> res
[1] "b" "b" "c" "a" "c"
But how can I apply this row-wise? E.g. this is not working:
apply(df, 1, FUN)
I suspect the trouble is that FUN assumes a 1-row data frame, and not a named character vector like this (first row):
id a b c
"a" "3" "9" "3"
From ?apply:
If X is not an array but an object of a class with a non-null dim value (such as a data frame), apply attempts to coerce it to an array via as.matrix if it is two-dimensional (e.g., a data frame) or via as.array.
Answer 1:
If you must use your function, you can do,
sapply(split(df, 1:nrow(df)), f1)
# 1 2 3 4 5
#"b" "b" "c" "a" "c"
NOTE: I renamed your FUN to f1, since FUN is already the name of the function argument in many R functions (apply and sapply, for example).
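For completeness, here is a minimal sketch of why the split approach works. It assumes f1 is simply the question's FUN under a new name (the answer does not show its body), and it rebuilds df from the values printed in the question rather than from set.seed, since sample can return different numbers across R versions. The unlist call is my own addition so that order receives a plain numeric vector.

```r
# Data copied from the printed df in the question (avoids RNG differences).
df <- data.frame(id = c("a", "a", "b", "b", "a"),
                 a = c(3L, 4L, 6L, 10L, 3L),
                 b = c(9L, 10L, 7L, 7L, 1L),
                 c = c(3L, 2L, 7L, 4L, 8L),
                 stringsAsFactors = FALSE)

# f1: assumed to be the question's FUN renamed; expects a one-row data frame.
f1 <- function(r) {
  cols <- c("a", "b", "c")
  vals <- unlist(r[1, cols])                 # named numeric vector a, b, c
  top <- cols[order(vals, decreasing = TRUE)]
  if (top[1] == r[["id"]]) top[2] else top[1]
}

# split() returns a list of one-row data frames, so column types are kept.
pieces <- split(df, seq_len(nrow(df)))
class(pieces[[1]])                           # "data.frame"

res <- sapply(pieces, f1)
res
```

Each element handed to f1 is still a data frame, so the id column stays a string and the other columns stay numeric, which is exactly what the function expects.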
Answer 2:
Another option is to make some minor modifications to your FUN. I think the issue you were running into was that apply will treat each row as a vector. Since your id column is a character, this means that your a/b/c columns will also be coerced to character. Realizing this, we can modify FUN slightly to convert the values back to numeric for ordering:
FUN <- function(r) {
top <- c('a', 'b', 'c')[order(as.numeric(r[c('a', 'b', 'c')]), decreasing=T)]
ifelse(top[1] == as.character(r['id']), top[2], top[1])
}
apply(df, 1, FUN)
#[1] "b" "b" "c" "a" "c"
To see how this works in a little more detail, you can run the code below and see that apply is iterating over named character vectors.
apply(df, 1, function(x) {print(x); print(class(x)); return(NULL)})
# id a b c
# "a" " 3" " 9" "3"
#[1] "character"
# id a b c
# "a" " 4" "10" "2"
#[1] "character"
# id a b c
# "b" " 6" " 7" "7"
#[1] "character"
# id a b c
# "b" "10" " 7" "4"
#[1] "character"
# id a b c
# "a" " 3" " 1" "8"
#[1] "character"
#NULL
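The coercion can also be checked directly with as.matrix, which is what apply calls on a data frame. The sketch below is only an illustration, not part of the original answer: it rebuilds df from the values printed in the question, and the names m_all, m_num, cols and top2 are my own. It also shows one possible variant that passes only the numeric columns to apply, so no as.numeric conversion is needed, and then compares against id afterwards.

```r
# Data copied from the printed df in the question (avoids RNG differences).
df <- data.frame(id = c("a", "a", "b", "b", "a"),
                 a = c(3L, 4L, 6L, 10L, 3L),
                 b = c(9L, 10L, 7L, 7L, 1L),
                 c = c(3L, 2L, 7L, 4L, 8L),
                 stringsAsFactors = FALSE)

# Whole data frame: the character id column forces a character matrix.
m_all <- as.matrix(df)
is.character(m_all)   # TRUE

# Numeric columns only: the matrix stays numeric.
cols <- c("a", "b", "c")
m_num <- as.matrix(df[cols])
is.numeric(m_num)     # TRUE

# Names of the two largest columns per row (a 2 x nrow(df) character matrix).
top2 <- apply(m_num, 1, function(v) cols[order(v, decreasing = TRUE)][1:2])

# Use the runner-up wherever the top column name equals that row's id.
res <- ifelse(top2[1, ] == df$id, top2[2, ], top2[1, ])
res
#[1] "b" "b" "c" "a" "c"
```

Whether this reads better than converting inside FUN is a matter of taste; the point is only that the character coercion comes from including id in what apply sees.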
Source: https://stackoverflow.com/questions/44591238/apply-fun-row-wise-on-data-frame-with-integer-and-character-variables