Merge (opposite of split) pair of rows in R

半腔热情 posted on 2021-02-17 20:34:05

Question


My data frame has columns like the following. The columns come in pairs that share a name and differ only in the suffix "a" or "b", for example col1a, col1b, colNa, colNb, and so on to the end of the file (more than 50000).

mydataf <- data.frame(Ind = 1:5,
                      col1a = sample(1:3, 5, replace = TRUE),
                      col1b = sample(1:3, 5, replace = TRUE),
                      colNa = sample(1:3, 5, replace = TRUE),
                      colNb = sample(1:3, 5, replace = TRUE),
                      K_a = sample(c("A", "B"), 5, replace = TRUE),
                      K_b = sample(c("A", "B"), 5, replace = TRUE))

mydataf 
   Ind col1a col1b colNa colNb K_a K_b
1   1     1     1     2     3   B   A
2   2     1     3     2     2   B   B
3   3     2     1     1     1   B   B
4   4     3     1     1     3   A   B
5   5     1     1     3     2   B   A

Except for the first column (Ind), I want to collapse each pair of columns so that the data frame looks like the following, with the suffixes "a" and "b" removed from the names. The merged values should also be sorted within each pair: 1 before 2, A before B.

   Ind col1   colN  K_
    1   11     23   AB   
    2   13     22   BB
    3   12     11   BB
    4   13     13   AB
    5   11     23   AB   

Edit: The grep call in the answer (probably) gives wrong results when column names are similar, for example when one name prefix is contained in another:

mydataf <- data.frame(col_1_a = sample(1:3, 5, replace = TRUE),
                      col_1_b = sample(1:3, 5, replace = TRUE),
                      col_1_Na = sample(1:3, 5, replace = TRUE),
                      col_1_Nb = sample(1:3, 5, replace = TRUE),
                      K_a = sample(c("A", "B"), 5, replace = TRUE),
                      K_b = sample(c("A", "B"), 5, replace = TRUE))
n <- names(mydataf)
nm <- c(unique(substr(n, 1, nchar(n)-1)))
df <- data.frame(sapply(nm, function(x){
                             idx <- grep(x, n)
                             cols <- mydataf[idx]
                             x <- apply(cols, 1,
                                       function(z) paste(sort(z), collapse = ""))
                             return(x)
                            }))
names(df) <- nm
df

 col_1_ col_1_N K_
1   2233      23 BB
2   2233      22 BB
3   1123      13 AB
4   1223      12 AB
5   2333      33 AB
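
The four-character values in the col_1_ column come from grep treating the prefix as an unanchored regular expression: "col_1_" is also a substring of col_1_Na and col_1_Nb, so four columns are selected instead of two. A minimal sketch of the difference, using the column names above; the anchored pattern is one possible fix and not necessarily the only one:

```r
n <- c("col_1_a", "col_1_b", "col_1_Na", "col_1_Nb", "K_a", "K_b")

# Unanchored: the prefix may match anywhere in the name,
# so all four col_1_* columns are selected.
loose <- grep("col_1_", n, value = TRUE)

# Anchored at both ends: the prefix must start the name and be
# followed by exactly one "a" or "b" suffix.
strict <- grep("^col_1_[ab]$", n, value = TRUE)

print(loose)
print(strict)
```

Note that this still treats the prefix as a regular expression, so prefixes containing characters such as "." would need escaping.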

Answer 1:


mydataf
  Ind col1a col1b colNa colNb K_a K_b
1   1     2     1     1     1   A   A
2   2     1     2     1     3   B   A
3   3     1     2     3     2   A   A
4   4     1     2     3     1   A   B
5   5     1     2     2     1   A   A
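n <- names(mydataf)
# Strip the last character (the "a"/"b" suffix) from every name except "Ind".
nm <- c("Ind", unique(substr(n, 1, nchar(n)-1)[-1]))
df <- data.frame(sapply(nm, function(x){
                             # Match the prefix at the start of the name, followed by
                             # an optional "a"/"b" suffix and nothing else, so that
                             # similar names (e.g. col_1_ and col_1_N) stay separate.
                             idx <- grep(paste0("^", x, "[ab]?$"), n)
                             cols <- mydataf[idx]
                             # Sort the values within each row, then paste them together.
                             x <- apply(cols, 1, 
                                       function(z) paste(sort(z), collapse = ""))
                             return(x)
                            }))
names(df) <- nm
df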
  Ind col1 colN K_
1   1   12   11 AA
2   2   12   13 AB
3   3   12   23 AA
4   4   12   13 AB
5   5   12   12 AA
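
Since the question mentions more than 50000 columns, a per-row apply over every pair may be slow. The following is a sketch of a vectorized alternative, not part of the original answer: the helper name collapse_pairs and its id argument are hypothetical, and it assumes every column other than id has a partner with the same prefix and the other suffix. Columns are looked up by exact name, so similar prefixes are not confused, and pmin/pmax order each pair element-wise.

```r
# Hypothetical helper (not from the original answer): collapse every
# "<prefix>a" / "<prefix>b" column pair into one sorted, pasted column.
collapse_pairs <- function(df, id = "Ind") {
  n <- setdiff(names(df), id)
  prefixes <- unique(sub("[ab]$", "", n))  # drop the trailing suffix
  out <- df[id]
  for (p in prefixes) {
    a <- df[[paste0(p, "a")]]
    b <- df[[paste0(p, "b")]]
    # Factors cannot be compared with pmin/pmax, so use their labels.
    if (is.factor(a)) a <- as.character(a)
    if (is.factor(b)) b <- as.character(b)
    # Smaller value first, larger value second, for all rows at once.
    out[[p]] <- paste0(pmin(a, b), pmax(a, b))
  }
  out
}

# The data printed in the answer above, entered by hand.
mydataf <- data.frame(
  Ind   = 1:5,
  col1a = c(2, 1, 1, 1, 1), col1b = c(1, 2, 2, 2, 2),
  colNa = c(1, 1, 3, 3, 2), colNb = c(1, 3, 2, 1, 1),
  K_a   = c("A", "B", "A", "A", "A"),
  K_b   = c("A", "A", "A", "B", "A")
)

result <- collapse_pairs(mydataf)
print(result)
```

On this data the sketch should give the same col1, colN and K_ values as the output above. One difference in design: Ind keeps its original type instead of being converted to character.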


Source: https://stackoverflow.com/questions/11695519/merge-opposite-of-split-pair-of-rows-in-r
