Merging rows with shared information

故里飘歌 2021-01-07 04:18

I have a data.frame in which several rows come from a merge but were not completely combined:

b <- read.table(text = "
      ID   Age    Steatosis               


        
4 Answers
  •  一生所求
    2021-01-07 04:26

    Here is a base R method that should work, for a version of the data that you provided:

    aggregate(b[-grep("^(ID|Age)$", names(b))], b[c("ID", "Age")], 
              FUN=function(x) if(all(is.na(x))) NA else x[!is.na(x)][1])
    
             ID Age Steatosis       Mallory Lille_dico Lille_3 Bili.AHHS2cat
        1 HA-09  16      <33% no/occasional         NA       5             1
    

    It uses aggregate together with an if/else check: if every value in a group is missing it returns NA, otherwise it returns the first non-missing element. Taking the first element is safe because, in that branch, at least one non-missing observation exists. To select the last non-missing element instead, replace x[!is.na(x)][1] with tail(x[!is.na(x)], 1).
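
    As a rough sketch of the "last element" variant, the helper below keeps the last non-missing value per group. The name last_non_na and the toy data frame are made up for illustration and are not part of the original question or answer.

    ```r
    # Hypothetical helper: last non-missing value of x, or NA if all are missing.
    last_non_na <- function(x) {
      keep <- x[!is.na(x)]
      if (length(keep) == 0) NA else keep[length(keep)]
    }

    # Toy data (an assumption, not the asker's data): partial values per ID.
    toy <- data.frame(
      ID    = c("A", "A", "B", "B"),
      score = c(5, 7, NA, NA),
      grade = c(NA, "low", "high", NA),
      stringsAsFactors = FALSE
    )

    # Aggregate every non-key column by ID, keeping the last non-missing value.
    res_last <- aggregate(toy[-grep("^ID$", names(toy))], toy["ID"],
                          FUN = last_non_na)
    print(res_last)
    ```

    For group A the score column holds two non-missing values, so this variant keeps 7 where the first-element version would keep 5.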

    As suggested by @jdobres in a comment to another answer, it would be possible to use paste with the collapse argument to combine multiple non-missing elements. This would, of course, convert the vector to character, which may not be desirable if the variable is numeric.
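
    A minimal sketch of that paste-based idea follows. The helper name collapse_non_na, the "; " separator, and the toy data are assumptions chosen for illustration only.

    ```r
    # Hypothetical helper: join all non-missing values into one string.
    collapse_non_na <- function(x) {
      keep <- x[!is.na(x)]
      if (length(keep) == 0) NA_character_ else paste(keep, collapse = "; ")
    }

    # Toy data (an assumption): group A has two non-missing scores, group B none.
    toy <- data.frame(
      ID    = c("A", "A", "B"),
      score = c(5, 7, NA),
      stringsAsFactors = FALSE
    )

    res_all <- aggregate(toy["score"], toy["ID"], FUN = collapse_non_na)
    print(res_all)

    # The numeric column has become character, as the answer warns.
    print(class(res_all$score))
    ```

    This keeps every value instead of silently dropping one, at the cost of the numeric type.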

    Note: I edited my original answer to include "Age" in the key, thanks to @sebastian-c for pointing this out.


    If "Age" is not part of the key, then

    aggregate(b[-grep("^(ID)$", names(b))], b["ID"], 
              FUN=function(x) if(all(is.na(x))) NA else x[!is.na(x)][1])
    

    will work.

    data

    b <- read.table(text = "
          ID   Age    Steatosis       Mallory Lille_dico Lille_3 Bili.AHHS2cat
    68 HA-09   16   NA          NA       NA       5             NA
    69 HA-09   16   <33% no/occasional     NA      NA             1")
    
