R partial match in data frame

淺唱寂寞╮ submitted on 2020-01-04 09:04:41

Question


How can I address a partial match in a data frame? Let's say this is my data frame df:

   V1  V2  V3 V4
1 ABC 1.2 4.3  A
2 CFS 2.3 1.7  A
3 dgf 1.3 4.4  A

I want to add a column V5 containing the number 111 only if the value in V1 contains an "f", and the number 222 only if the value in V1 contains "gf". Will I run into problems because several values contain an "f", or will the order in which I enter the commands take care of it?

I tried something like:

df$V5<- ifelse(df$V1 = c("*f","*gf"),c=(111,222) )

but it does not work.

The main problem is: how can I tell R to look for a partial match?

Thanks a million for your help!


Answer 1:


Besides setting the values in sequence for "f", "gf", and so on, it is worth looking at the zero-width lookahead and lookbehind assertions that regular expressions provide.

If you want to grep all elements that contain an "f" not preceded by "g", you can use a negative lookbehind:

v1 <- c("abc", "f", "gf" )
grep( "(?<![g])f" , v1, perl= TRUE )
[1] 2

And if you want only those that contain an "f" not followed by "g", use a negative lookahead:

v2 <- c("abc", "f", "fg")
grep( "f(?![g])" , v2, perl= TRUE )
[1] 2

And of course you can combine the two:

v3 <- c("abc", "f", "fg", "gf")
grep( "(?<![g])f(?![g])" , v3, perl= TRUE )
[1] 2

So for your case you can do

df[ grep( "(?<![g])f" , df$V1, perl= TRUE ), "V5" ] <- 111
df[ grep( "gf" , df$V1, perl= TRUE ), "V5" ] <- 222
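
To see how these two assignments behave on the data frame from the question, here is a minimal sketch. The way `df` is rebuilt and the extra fourth row ("fab") are assumptions added for illustration, not part of the original data. It uses `grepl`, which returns a logical vector, so the assignments are safe even when a pattern matches no row.

```r
# Rebuild the example data frame from the question. The fourth row ("fab")
# is a made-up addition so that the "f not preceded by g" case has a match.
df <- data.frame(
  V1 = c("ABC", "CFS", "dgf", "fab"),
  V2 = c(1.2, 2.3, 1.3, 0.5),
  V3 = c(4.3, 1.7, 4.4, 2.0),
  V4 = "A",
  stringsAsFactors = FALSE
)

df$V5 <- NA_real_                                     # default: no match
has_f_not_gf <- grepl("(?<!g)f", df$V1, perl = TRUE)  # an "f" not preceded by "g"
has_gf       <- grepl("gf", df$V1, fixed = TRUE)      # the literal substring "gf"

df$V5[has_f_not_gf] <- 111
df$V5[has_gf]       <- 222

print(df)
```

Matching is case-sensitive here, so "CFS" is not treated as containing an "f"; only "dgf" and the added "fab" receive a value.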



Answer 2:


 df$V5 <- NA
 df$V5[grep("f", df$V1)] <- 111
 df$V5[grep("gf", df$V1)] <- 222  # obviously some of the "f" values could be overwritten.
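
The same idea can probably be written as one expression, closer to the `ifelse` attempt in the question. This is only a sketch: the sample vector `V1` is hypothetical, and `grepl` (the logical counterpart of `grep`) does the partial matching, with the more specific pattern tested first.

```r
# Hypothetical sample values; only "dgf" and "fab" contain a lowercase "f".
V1 <- c("ABC", "CFS", "dgf", "fab")

# Test the more specific pattern ("gf") first, then fall back to "f".
# fixed = TRUE treats the pattern as a literal substring, not a regex.
V5 <- ifelse(grepl("gf", V1, fixed = TRUE), 222,
      ifelse(grepl("f",  V1, fixed = TRUE), 111, NA))

print(V5)
```

Because the "gf" test comes first, the order problem from the question does not arise: a value containing "gf" never falls through to the plain "f" branch.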

R has a switch function, which I have never fully understood; it always seemed to me that it should behave like the Pascal case statement. I could also do this with some awkward Boolean-to-numeric indexing maneuvers, but that is not likely to be helpful.
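
For the curious, one possible reading of the Boolean-to-numeric indexing idea is sketched below. It is an assumption about what was meant, not the answerer's own code, and the sample vector `V1` and the lookup vector `codes` are hypothetical. It relies on the fact that every value containing "gf" also contains "f", so adding the two logical tests gives 0, 1, or 2.

```r
# Hypothetical sample values.
V1 <- c("ABC", "CFS", "dgf", "fab")

# Lookup table: position 1 = no match, 2 = "f" only, 3 = "gf".
codes <- c(NA, 111, 222)

# Logical values are coerced to 0/1 when added, so idx is 1, 2, or 3.
idx <- 1 + grepl("f", V1, fixed = TRUE) + grepl("gf", V1, fixed = TRUE)

V5 <- codes[idx]
print(V5)
```

This only works because the second pattern contains the first; for unrelated patterns the sum would no longer map cleanly onto the lookup table.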



Source: https://stackoverflow.com/questions/16364205/r-partial-match-in-data-frame