How to order my data frame lexicographically

北海茫月 2020-12-04 01:36

I have the following data frame:

a = data.frame(a=c(1,2,3,4,5,6,7),b=c(1,2,3,10,12,21,4),c=c(1,2,10,11,"X","Y",3))
> a
  a  b  c
1 1  1  1
2 2  2  2
3 3         


        
4 answers
  • 2020-12-04 02:04

    Unfortunately, mixedsort does not (yet) support sorting on multiple columns, so you need to implement it yourself, for example like this:

    a[order(sub("[0-9]+", "", a$c),
            as.numeric(sub("[[:alpha:]]*([[:digit:]]*)", '\\1', a$c)),
            as.numeric(a$b),
            as.numeric(a$a)), ]
    

    This first sorts the data frame alphanumerically by a$c; ties (which do not actually occur in your data frame 'a') are then broken by a$b and a$a.

    Output is:

      a  b  c
    1 1  1  1
    2 2  2  2
    7 7  4  3
    3 3  3 10
    4 4 10 11
    5 5 12  X
    6 6 21  Y
    

    PS: This was written by David Winsemius in this post as a reply to a similar question.
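
    To see why the call above works, it can help to look at the two keys derived from a$c separately. The following is only a sketch: it rebuilds the data frame from the question with stringsAsFactors = FALSE (an assumption, so that c is a character column), and alpha_key, num_key and sorted are names introduced here for illustration.

```r
# Data frame from the question; stringsAsFactors = FALSE is an assumption
a <- data.frame(a = c(1, 2, 3, 4, 5, 6, 7),
                b = c(1, 2, 3, 10, 12, 21, 4),
                c = c(1, 2, 10, 11, "X", "Y", 3),
                stringsAsFactors = FALSE)

# Key 1: strip the digits. Purely numeric entries become "" and therefore
# sort ahead of "X" and "Y".
alpha_key <- sub("[0-9]+", "", a$c)

# Key 2: keep only the digits. Entries without digits become "" and then NA,
# which order() places last within their group.
num_key <- as.numeric(sub("[[:alpha:]]*([[:digit:]]*)", "\\1", a$c))

# Order by both keys, then by the remaining columns as tie-breakers
sorted <- a[order(alpha_key, num_key, a$b, a$a), ]
print(sorted)
```

    Keeping the keys in named variables makes it easier to check each one before passing them to order().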

  • 2020-12-04 02:08

    Assuming these are human chromosome names (chr1...chr22, chrX, chrY), we can convert them to numbers and then use order:

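    # convert to numeric (as.character() guards against c being a factor,
    # in which case ifelse() would return the integer level codes)
    a$chromN <- as.integer(ifelse(a$c == "X", "23", ifelse(a$c == "Y", "24", as.character(a$c))))
    
    # now sort as usual:
    a[ order(a$chromN), ]
    
    #   a  b  c chromN
    # 1 1  1  1      1
    # 2 2  2  2      2
    # 7 7  4  3      3
    # 3 3  3 10     10
    # 4 4 10 11     11
    # 5 5 12  X     23
    # 6 6 21  Y     24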
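
    An alternative that avoids the nested ifelse() is to look each value up in an explicit chromosome order with match(). This is a minimal sketch under the same chromosome assumption; chrom_levels, chrom_rank and sorted are hypothetical names, and the data frame is rebuilt with stringsAsFactors = FALSE for illustration.

```r
# Data frame from the question; stringsAsFactors = FALSE is an assumption
a <- data.frame(a = c(1, 2, 3, 4, 5, 6, 7),
                b = c(1, 2, 3, 10, 12, 21, 4),
                c = c(1, 2, 10, 11, "X", "Y", 3),
                stringsAsFactors = FALSE)

# The desired order, spelled out once: "1" ... "22", then "X", then "Y"
chrom_levels <- c(as.character(1:22), "X", "Y")

# Position of each value in that order; as.character() also covers a factor column
chrom_rank <- match(as.character(a$c), chrom_levels)

# Reorder the rows by that position
sorted <- a[order(chrom_rank), ]
print(sorted)
```

    Values that are missing from chrom_levels get NA from match(), and order() puts them at the end by default.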
    
  • 2020-12-04 02:27

    One option is to use mixedorder() from the gtools package.

    library(gtools)
    a[mixedorder(a$c),]
    #   a  b  c
    # 1 1  1  1
    # 2 2  2  2
    # 7 7  4  3
    # 3 3  3 10
    # 4 4 10 11
    # 5 5 12  X
    # 6 6 21  Y
    
  • 2020-12-04 02:28

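    Sticking to base R, you could write a function yourself. Note that this sorts every column independently (numbers first, then letters), so the values of an original row do not stay together: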

    a = data.frame(a=c(1,2,3,4,5,6,7),b=c(1,2,3,10,12,21,4),c=c(1,2,10,11,"X","Y",3))
    
    SORTER_DEVICE <- function(x) {
        c(sort(as.numeric(na.omit(gsub("[a-zA-Z]", NA, x)))),
            sort(na.omit(gsub("[0-9]", NA, x))))
    }
    data.frame(apply(a, 2, SORTER_DEVICE))
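
    If whole rows should be reordered instead, a base R variant can return an ordering permutation rather than sorted values. The following is only a sketch: natural_order is a hypothetical helper name, it treats an entry as numeric only when it consists entirely of digits, and any extra arguments are used as tie-breakers.

```r
# Hypothetical helper: returns a permutation that puts all-digit entries first
# (in numeric order), then everything else (in alphabetical order).
natural_order <- function(x, ...) {
  x <- as.character(x)                    # also handles factor columns
  is_num <- grepl("^[0-9]+$", x)          # TRUE for entries made only of digits
  num <- rep(NA_real_, length(x))
  num[is_num] <- as.numeric(x[is_num])    # numeric value where available
  # FALSE sorts before TRUE, so numeric entries come first; the arguments
  # passed through ... break any remaining ties.
  order(!is_num, num, x, ...)
}

a <- data.frame(a = c(1, 2, 3, 4, 5, 6, 7),
                b = c(1, 2, 3, 10, 12, 21, 4),
                c = c(1, 2, 10, 11, "X", "Y", 3),
                stringsAsFactors = FALSE)

# Sort rows by c, using b and a as tie-breakers
sorted <- a[natural_order(a$c, a$b, a$a), ]
print(sorted)
```

    Because the helper only produces row indices, each row keeps its original a, b and c values.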
    