I would like to fill in a new column with one of two values if a pattern is matched.
Here is my data frame:
df <- structure(list(loc_01 = c("ap
To check whether a string contains a certain substring, you can't use ==
because it performs exact matching (i.e. it returns TRUE only if the string is exactly "non").
You could use, for example, grepl
(a function from the grep family), which performs pattern matching:
df$loc01 <- ifelse(grepl("non",df$loc_01),'outside','inside')
Result:
> df
loc_01 loc01_land loc01
1 apis 165730500 inside
2 indu 62101800 inside
3 isro 540687600 inside
4 miss 161140500 inside
5 non_apis 1694590200 outside
6 non_indu 1459707300 outside
7 non_isro 1025051400 outside
8 non_miss 1419866100 outside
9 non_piro 2037064500 outside
10 non_sacn 2204629200 outside
11 non_slbe 1918840500 outside
12 non_voya 886299300 outside
13 piro 264726000 inside
14 sacn 321003900 inside
15 slbe 241292700 inside
16 voya 530532000 inside
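To make the difference between exact comparison and pattern matching concrete, here is a minimal sketch in base R. The vector `x` is made-up sample data (not the full data frame from the question), and "canon" is included only to show that an unanchored "non" pattern also matches in the middle of a string; anchoring with `^non_` or using `startsWith` may be safer if only the prefix should count.

```r
# Hypothetical sample values, for illustration only
x <- c("apis", "non_apis", "canon")

# Exact comparison: TRUE only when the whole string equals "non"
exact <- x == "non"

# Pattern match anywhere in the string: also hits "canon"
partial <- grepl("non", x)

# Pattern anchored to the start of the string
prefix <- grepl("^non_", x)

# Base R alternative without regular expressions
fixed_prefix <- startsWith(x, "non_")

# Label each value based on the anchored match
label <- ifelse(prefix, "outside", "inside")
print(data.frame(x, exact, partial, prefix, fixed_prefix, label))
```

With the data in the question, all three matching variants give the same labels, since "non" only ever appears as a prefix there.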
You only need one line of code:
library(dplyr)
library(stringr)
df %>%
  mutate(loc01 = if_else(str_starts(loc_01, "non_"), "outside", "inside"))
For more complex regex patterns, you can use str_detect
instead of str_starts:
df %>%
  mutate(loc01 = if_else(str_detect(loc_01, "^(non_)"), "outside", "inside"))
Output:
loc_01 loc01_land loc01
<chr> <dbl> <chr>
1 apis 165730500 inside
2 indu 62101800 inside
3 isro 540687600 inside
4 miss 161140500 inside
5 non_apis 1694590200 outside
6 non_indu 1459707300 outside
7 non_isro 1025051400 outside
8 non_miss 1419866100 outside
9 non_piro 2037064500 outside