Question
This is a follow-up to an earlier question of mine, which was marked as a duplicate of another question, but the solution suggested there does not work.
I have the following data.frame:
set.seed(1)
mydf <- data.frame(A=paste(sample(LETTERS, 4), sample(1:20, 20), sep=""),
B=paste(sample(1:20, 20), sample(LETTERS, 4), sep=""),
C=sample(LETTERS, 20), D=sample(1:100, 20), value=rnorm(20))
> mydf
A B C D value
1 G5 6N T 9 -0.68875569
2 J18 8T R 87 -0.70749516
3 N19 1A L 34 0.36458196
4 U12 7K Z 82 0.76853292
5 G11 14N J 98 -0.11234621
6 J1 20T F 32 0.88110773
7 N3 17A B 45 0.39810588
8 U14 19K W 83 -0.61202639
9 G9 15N U 80 0.34111969
10 J20 3T I 36 -1.12936310
11 N8 9A K 70 1.43302370
12 U16 16K G 86 1.98039990
13 G6 10N M 39 -0.36722148
14 J7 18T D 62 -1.04413463
15 N13 5A Y 35 0.56971963
16 U4 11K N 28 -0.13505460
17 G17 4N O 64 2.40161776
18 J15 2T C 17 -0.03924000
19 N2 12A P 59 0.68973936
20 U10 13K X 10 0.02800216
I want to order it according to columns A to D, but A and B hold mixed alphanumeric values, so natural order is required.
I know I can apply regular ordering, like:
mydf2 <- mydf[do.call(order, c(mydf[1:4], list(decreasing = FALSE))),]
> mydf2
A B C D value
5 G11 14N J 98 -0.11234621
17 G17 4N O 64 2.40161776
1 G5 6N T 9 -0.68875569
13 G6 10N M 39 -0.36722148
9 G9 15N U 80 0.34111969
6 J1 20T F 32 0.88110773
18 J15 2T C 17 -0.03924000
2 J18 8T R 87 -0.70749516
10 J20 3T I 36 -1.12936310
14 J7 18T D 62 -1.04413463
15 N13 5A Y 35 0.56971963
3 N19 1A L 34 0.36458196
19 N2 12A P 59 0.68973936
7 N3 17A B 45 0.39810588
11 N8 9A K 70 1.43302370
20 U10 13K X 10 0.02800216
4 U12 7K Z 82 0.76853292
8 U14 19K W 83 -0.61202639
12 U16 16K G 86 1.98039990
16 U4 11K N 28 -0.13505460
But this is not the result I need. I need 10 after 9, not after 1 (check column A to see that it is not in the order I need).
In the comments of my original question, it was suggested to use the multi.mixedorder function. However, as you can see below, the result is identical to the one using just order, which is still not what I want.
library(gtools)  # provides mixedsort()

multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE){
  do.call(order, c(
    lapply(list(...), function(l){
      if(is.character(l)){
        factor(l, levels=mixedsort(unique(l)))
      } else {
        l
      }
    }),
    list(na.last = na.last, decreasing = decreasing)
  ))
}
mydf3 <- mydf[do.call(multi.mixedorder, c(mydf[1:4], list(decreasing = FALSE))),]
> mydf3
A B C D value
5 G11 14N J 98 -0.11234621
17 G17 4N O 64 2.40161776
1 G5 6N T 9 -0.68875569
13 G6 10N M 39 -0.36722148
9 G9 15N U 80 0.34111969
6 J1 20T F 32 0.88110773
18 J15 2T C 17 -0.03924000
2 J18 8T R 87 -0.70749516
10 J20 3T I 36 -1.12936310
14 J7 18T D 62 -1.04413463
15 N13 5A Y 35 0.56971963
3 N19 1A L 34 0.36458196
19 N2 12A P 59 0.68973936
7 N3 17A B 45 0.39810588
11 N8 9A K 70 1.43302370
20 U10 13K X 10 0.02800216
4 U12 7K Z 82 0.76853292
8 U14 19K W 83 -0.61202639
12 U16 16K G 86 1.98039990
16 U4 11K N 28 -0.13505460
Answer 1:
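OK, solved it: the multi.mixedorder function needs a fix to be able to deal with factors. The columns of mydf are factors, not character vectors, so the is.character() branch is never taken: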
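library(gtools)  # provides mixedsort()

multi.mixedorder <- function(..., na.last = TRUE, decreasing = FALSE){
  do.call(order, c(
    lapply(list(...), function(l){
      if(is.character(l)){
        # character column: rank values by their natural (mixed) order
        factor(l, levels=mixedsort(unique(l)))
      } else if(is.factor(l)){
        # factor column: re-level by the natural order of its labels
        factor(as.character(l), levels=mixedsort(levels(l)))
      } else {
        # numeric and other columns already sort correctly as they are
        l
      }
    }),
    list(na.last = na.last, decreasing = decreasing)
  ))
}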
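Alternatively, convert the factor columns in mydf to character (leaving the numeric columns alone, so they still sort numerically) with:
is_fac <- vapply(mydf, is.factor, logical(1))
mydf[is_fac] <- lapply(mydf[is_fac], as.character)
With the fix above, though, this should not be needed.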
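To see why the original function behaved exactly like plain order, here is a minimal base-R sketch on made-up toy data. It assumes the columns are factors (the data.frame default before R 4.0, forced here with stringsAsFactors = TRUE). Note that natural_order() is a hypothetical helper written only for this illustration, not part of gtools, and it only handles labels shaped like column A (letters followed by digits); for the general case mixedsort is the tool to use.

```r
# Hypothetical toy data; stringsAsFactors = TRUE mimics the pre-R-4.0 default
toy <- data.frame(A = c("G11", "G5", "G9", "J1"),
                  D = c(98L, 9L, 80L, 32L),
                  stringsAsFactors = TRUE)

# The original function only special-cased character vectors,
# so a factor column fell through to plain order() untouched
is.character(toy$A)  # FALSE
is.factor(toy$A)     # TRUE

# Plain order() sorts a factor by its levels, which are alphabetical here
plain <- as.character(toy$A[order(toy$A)])

# Hypothetical helper: split each label into its letter prefix and its
# numeric suffix, then order by prefix first and by number second
natural_order <- function(x) {
  x <- as.character(x)
  prefix <- sub("[0-9]+$", "", x)               # "G11" -> "G"
  number <- as.integer(sub("^[^0-9]*", "", x))  # "G11" -> 11
  order(prefix, number)
}
natural <- as.character(toy$A[natural_order(toy$A)])

print(plain)    # alphabetical: G11 comes before G5
print(natural)  # natural: G5, G9, then G11
```

The same idea carries over to the fixed multi.mixedorder: once each factor is re-levelled in natural order, order() compares the integer codes of those levels and the rows come out as intended.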
Source: https://stackoverflow.com/questions/54089471/r-mixedsort-on-multiple-vectors-columns