Removing the unordered pairs repeated twice in a file in R

南方客 2021-01-27 16:14

I have a file like this in R.

**0 1** 
0   2
**0 3**
0   4
0   5
0   6
0   7
0   8
0   9
0   10
**1 0**
1   11
1   12
1   13  
1   14
1   15
1   16
1   17
1   18         


        
2 Answers
  • 2021-01-27 16:39

    Here's one approach:

    First, sort the values within each row and paste them together, giving one order-independent key per row.

    x <- apply(mydf, 1, function(x) paste(sort(x), collapse = " "))
    

    Then, use ave to create the counts you are looking for.

    # seq_along(x) keeps the counts integer; ave(x, x, ...) would return them as character
    mydf$count <- ave(seq_along(x), x, FUN = length)
    

    Finally, you can use the "x" vector again, this time to detect and remove duplicated values.

    mydf[!duplicated(x), ]
    #    V1 V2 count
    # 1   0  1     2
    # 2   0  2     1
    # 3   0  3     2
    # 4   0  4     1
    # 5   0  5     1
    # 6   0  6     1
    # 7   0  7     1
    # 8   0  8     1
    # 9   0  9     1
    # 10  0 10     1
    # 12  1 11     1
    # 13  1 12     1
    # 14  1 13     1
    # 15  1 14     1
    # 16  1 15     1
    # 17  1 16     1
    # 18  1 17     1
    # 19  1 18     1
    # 20  1 19     1
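
The three steps above can be put together as one self-contained sketch. The five-row `mydf` below is a made-up stand-in for the question's file, and the names `key` and `result` are illustrative only. The key is built before the `count` column is added, so that `apply` only sees the two pair columns.

```r
# Hypothetical sample data: (1, 0) repeats (0, 1) in reverse order.
mydf <- data.frame(V1 = c(0, 0, 0, 1, 1), V2 = c(1, 2, 3, 0, 11))

# Order-independent key: sort each row, then paste the values together.
key <- apply(mydf, 1, function(r) paste(sort(r), collapse = " "))

# Count rows per key. Passing seq_along(key) rather than key itself keeps
# the counts integer, because ave() returns a vector of the same type as
# its first argument.
mydf$count <- ave(seq_along(key), key, FUN = length)

# Keep only the first occurrence of each unordered pair.
result <- mydf[!duplicated(key), ]
print(result)
```

Because `duplicated` marks every occurrence after the first, the row that survives is always the one that appears earliest in the file.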
    
  • 2021-01-27 16:50

    Here is a way using transform, pmin and pmax to put the smaller value of each row first, and then aggregate to count each pair:

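    # data: rows (1, 0) and (3, 0) repeat (0, 1) and (0, 3) in reverse order
    x <- data.frame(a = c(rep(0, 10), rep(1, 10), 3), b = c(1:10, 0, 11:19, 0))
    
    # logic: smaller value goes to a, larger to b, then sum a count of 1 per row
    aggregate(count ~ a + b, transform(x, a = pmin(a, b), b = pmax(a, b), count = 1), sum)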
    #    a  b count
    # 1  0  1     2
    # 2  0  2     1
    # 3  0  3     2
    # 4  0  4     1
    # 5  0  5     1
    # 6  0  6     1
    # 7  0  7     1
    # 8  0  8     1
    # 9  0  9     1
    # 10 0 10     1
    # 11 1 11     1
    # 12 1 12     1
    # 13 1 13     1
    # 14 1 14     1
    # 15 1 15     1
    # 16 1 16     1
    # 17 1 17     1
    # 18 1 18     1
    # 19 1 19     1
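
One detail of the call above is easy to miss: `transform` evaluates all of its arguments against the original columns, so `pmax(a, b)` still sees the original `a` rather than the freshly computed minimum. The sketch below illustrates that on a made-up five-row data frame. As a variation on this answer (not part of it), it attaches the count with `ave` and drops repeats with `duplicated`, which keeps the original row order instead of the sorted order that `aggregate` returns. The names `norm` and `dedup` are illustrative only.

```r
# Hypothetical sample data: (1, 0) and (3, 0) are reversed copies of
# (0, 1) and (0, 3).
x <- data.frame(a = c(0, 0, 0, 1, 3), b = c(1, 2, 3, 0, 0))

# Both expressions are evaluated on the original a and b, so each row
# ends up as (smaller, larger).
norm <- transform(x, a = pmin(a, b), b = pmax(a, b))

# Per-pair count, computed on the normalized columns.
norm$count <- ave(seq_len(nrow(norm)), norm$a, norm$b, FUN = length)

# Drop repeated pairs, keeping the first occurrence in file order.
dedup <- norm[!duplicated(norm[c("a", "b")]), ]
print(dedup)
```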
    