Question
How can I address a partial match in a data frame? Let's say this is my data frame df:
V1 V2 V3 V4
1 ABC 1.2 4.3 A
2 CFS 2.3 1.7 A
3 dgf 1.3 4.4 A
I want to add a column V5 containing the number 111 only if the value in V1 contains an "f", and the number 222 only if the value in V1 contains "gf". Will I run into problems because several values contain an "f", or will the order in which I enter the commands take care of it?
I tried something like:
df$V5<- ifelse(df$V1 = c("*f","*gf"),c=(111,222) )
but it does not work.
The main problem is: how can I tell R to look for a "partial match"?
Thanks a million for your help!
Answer 1:
Besides the solution of setting the values in sequence for "f", "gf", ..., it is worth looking at the regular-expression support for zero-width lookahead / lookbehind.
If you want to grep all elements which contain "f" but not "gf" (more precisely, an "f" that is not directly preceded by "g"), you can do:
v1 <- c("abc", "f", "gf" )
grep( "(?<![g])f" , v1, perl= TRUE )
[1] 2
and if you want to grep only those which contain "f" but not "fg" (an "f" that is not directly followed by "g"):
v2 <- c("abc", "f", "fg")
grep( "f(?![g])" , v2, perl= TRUE )
[1] 2
And of course you can mix that:
v3 <- c("abc", "f", "fg", "gf")
grep( "(?<![g])f(?![g])" , v3, perl= TRUE )
[1] 2
So for your case you can do
df[ grep( "(?<![g])f" , df$V1, perl= TRUE ), "V5" ] <- 111
df[ grep( "gf" , df$V1, perl= TRUE ), "V5" ] <- 222
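Since the question started from ifelse(), here is a minimal sketch of the same idea using grepl(), which returns one TRUE/FALSE per row instead of the matching indices. The data frame is rebuilt from the question; the helper names has_gf and has_f_only are made up for illustration, and stringsAsFactors = FALSE is an assumption so that V1 is a character column.

```r
# Rebuild the data frame from the question (V1 kept as character, by assumption)
df <- data.frame(
  V1 = c("ABC", "CFS", "dgf"),
  V2 = c(1.2, 2.3, 1.3),
  V3 = c(4.3, 1.7, 4.4),
  V4 = c("A", "A", "A"),
  stringsAsFactors = FALSE
)

# grepl() gives a logical vector with one entry per row, which is what ifelse() expects
has_gf     <- grepl("gf", df$V1, fixed = TRUE)        # literal substring "gf"
has_f_only <- grepl("(?<!g)f", df$V1, perl = TRUE)    # an "f" not directly preceded by "g"

# Test "gf" first, then the plain "f", otherwise leave the value missing
df$V5 <- ifelse(has_gf, 222, ifelse(has_f_only, 111, NA_real_))

print(df)
# V5 is NA, NA, 222 for the three rows
```

Note that matching is case-sensitive by default, so "CFS" is not treated as containing "f" here; passing ignore.case = TRUE to grepl() would change that if it is what you want.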
Answer 2:
df$V5 <- NA
df$V5[grep("f", df$V1)] <- 111
df$V5[grep("gf", df$V1)] <- 222 # obviously some of the "f" values could be overwritten.
There is a switch function, which I am too dense to understand, that always seemed to me like it should behave like the Pascal case statement. I could do it with some weird Boolean-to-numeric indexing maneuvers, but that is not likely to be helpful.
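For the curious, one possible reading of such a Boolean-to-numeric indexing maneuver is sketched below. It is only an illustration, not necessarily what the answerer had in mind: the vector v1 reuses the V1 values from the question plus a hypothetical fourth value "fab" so that the 111 case shows up, and the names score and v5 are invented here.

```r
# V1 values from the question, plus a made-up "fab" to exercise the "f only" case
v1 <- c("ABC", "CFS", "dgf", "fab")

# TRUE/FALSE are coerced to 1/0 when added:
# 0 = no "f", 1 = "f" but no "gf", 2 = "gf" (which necessarily also contains "f")
score <- grepl("f", v1, fixed = TRUE) + grepl("gf", v1, fixed = TRUE)

# Shift by 1 so the score can be used as an index into a lookup vector
v5 <- c(NA, 111, 222)[score + 1]

print(v5)
# NA NA 222 111
```

This relies on every string containing "gf" also containing "f", so it does not generalize to unrelated patterns without adjusting the lookup.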
Source: https://stackoverflow.com/questions/16364205/r-partial-match-in-data-frame