I have a data.frame in which several rows that come from a merge were not completely merged:
b <- read.table(text = "
ID Age Steatosis
Here is a base R method that should work, for a version of the data that you provided:
aggregate(b[-grep("^(ID|Age)$", names(b))], b[c("ID", "Age")],
FUN=function(x) if(all(is.na(x))) NA else x[!is.na(x)][1])
     ID Age Steatosis       Mallory Lille_dico Lille_3 Bili.AHHS2cat
1 HA-09  16      <33% no/occasional         NA       5             1
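It uses aggregate together with an if/else check. Within each group, the function returns the first non-missing element of a column, or NA if every value is missing. Taking the first element is safe because that branch only runs when at least one non-missing value exists. To select the last non-missing element instead, replace x[!is.na(x)][1] with tail(x[!is.na(x)], 1). Indexing with length(x) would not work reliably, because length(x) also counts the missing values and can point past the end of the filtered vector.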
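Selecting the last non-missing value per group can be sketched as follows. This is a minimal illustration, not part of the original answer: the helper name last_non_na is hypothetical, and stringsAsFactors = FALSE is an assumption added so that the text columns stay character on any R version. Note that the index is the length of the filtered vector, not of x.

```r
# Same sample data as in the data section of the answer.
b <- read.table(text = "
ID Age Steatosis Mallory Lille_dico Lille_3 Bili.AHHS2cat
68 HA-09 16 NA NA NA 5 NA
69 HA-09 16 <33% no/occasional NA NA 1", stringsAsFactors = FALSE)

# Hypothetical helper: last non-missing value, or NA if all are missing.
last_non_na <- function(x) {
  keep <- x[!is.na(x)]                  # drop the missing values first
  if (length(keep) == 0) NA else keep[length(keep)]
}

# Collapse the rows of each ID/Age group into a single row.
res_last <- aggregate(b[-grep("^(ID|Age)$", names(b))], b[c("ID", "Age")],
                      FUN = last_non_na)
```

For this sample the result matches the first-element version, because each column has at most one non-missing value per group. The two only differ when a group holds several non-missing values in the same column.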
As suggested by @jdobres in a comment to another answer, it would be possible to use paste with the collapse argument to combine multiple non-missing elements. This, of course, would convert every aggregated column to character, which may not be desirable if the variable is numeric.
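A rough sketch of the paste variant, under a few assumptions that are not in the original answer: the helper name collapse_non_na is hypothetical, the "; " separator is an arbitrary choice, and unique() is added so that identical values are not repeated.

```r
# Same sample data as in the data section of the answer.
b <- read.table(text = "
ID Age Steatosis Mallory Lille_dico Lille_3 Bili.AHHS2cat
68 HA-09 16 NA NA NA 5 NA
69 HA-09 16 <33% no/occasional NA NA 1", stringsAsFactors = FALSE)

# Hypothetical helper: join all distinct non-missing values into one string.
collapse_non_na <- function(x) {
  keep <- unique(x[!is.na(x)])
  if (length(keep) == 0) NA_character_ else paste(keep, collapse = "; ")
}

# Every aggregated column becomes character, including the numeric ones.
res_paste <- aggregate(b[-grep("^(ID|Age)$", names(b))], b[c("ID", "Age")],
                       FUN = collapse_non_na)
```

Here Lille_3 comes back as the string "5" rather than the number 5, which illustrates the type conversion mentioned above. Wrapping the affected columns in as.numeric afterwards is one way to recover numbers when each group has a single value.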
Note: I edited my original answer to include "Age" in the key, thanks to @sebastian-c for pointing this out.
If "Age" is not part of the key, then
aggregate(b[-grep("^(ID)$", names(b))], b["ID"],
FUN=function(x) if(all(is.na(x))) NA else x[!is.na(x)][1])
will work.
data
b <- read.table(text = "
ID Age Steatosis Mallory Lille_dico Lille_3 Bili.AHHS2cat
68 HA-09 16 NA NA NA 5 NA
69 HA-09 16 <33% no/occasional NA NA 1")