I have a data frame. One of the columns has values like:
WIND
WINDS
HIGH WIND
etc
among the other values. Now
You can subset the original column for these values using grepl and replace them:
df$col1[grepl("WIND", df$col1)] <- "WIND"
UPDATE: a bit of a brainfart. agrep doesn't actually add anything over grep here, so you can just replace agrep with grep. It does help if you have words whose roots vary slightly but that you still want to match.
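To illustrate where the two differ, here is a minimal sketch. The events vector and its misspelled entries are made up for the example and are not from the question. With the default max.distance of 0.1 and a four-character pattern, agrepl should allow a single edit, so it picks up near-misses that an exact match skips, but it can also pick up unrelated words.

```r
# Hypothetical values, invented for illustration only
events <- c("WIND", "HIGH WND", "WIMD", "RAIN", "WINTER STORM")

# Exact substring match: only values that literally contain "WIND"
exact <- grepl("WIND", events, fixed = TRUE)

# Approximate match: values containing something within one edit of "WIND"
approx <- agrepl("WIND", events)

# Side-by-side comparison of the two matchers
data.frame(events, exact, approx)
```

Note that "WINTER STORM" is flagged by the approximate match as well, because "WIN" is one deletion away from "WIND". It is worth inspecting what agrep matched before overwriting anything.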
Here is an approach using agrep:
> wind.vec
[1] "WINDS" "HIGH WIND" "WINDY" "VERY WINDY"
> wind.vec[agrep("WIND", wind.vec)] <- "WIND"
> wind.vec
[1] "WIND" "WIND" "WIND" "WIND"
The nice thing about agrep is that it matches approximately, so "WINDY" is replaced. Note that I'm doing this with a vector, but you can easily extend it to a data frame by replacing wind.vec with my.data.frame$my.wind.col.
agrep returns the indices that match approximately, which then allows me to use the [<- replacement operator to replace the approximately matching values with "WIND".
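As a sketch of the data frame version, assuming a character column (the extra "RAIN" row is added here only to show a value that is left alone):

```r
# Example data frame; column name taken from the text above, values illustrative
my.data.frame <- data.frame(
  my.wind.col = c("WINDS", "HIGH WIND", "WINDY", "VERY WINDY", "RAIN"),
  stringsAsFactors = FALSE
)

# agrep returns the integer positions of the approximately matching elements
idx <- agrep("WIND", my.data.frame$my.wind.col)

# Use those positions with [<- to overwrite only the matching rows
my.data.frame$my.wind.col[idx] <- "WIND"

my.data.frame$my.wind.col
```

If the column is a factor rather than character, converting it with as.character first avoids problems when assigning a value that is not an existing level.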