I am trying to use grep
to test whether a vector of strings are present in an another vector or not, and to output the values that are present (the matching pat
Have you tried the match()
or charmatch()
functions?
Example use:
match(c("A1", "A9", "A6"), myfile$Letter)
In addition to @Marek's comment about not including fixed==TRUE
, you also need to not have the spaces in your regular expression. It should be "A1|A9|A6"
.
You also mention that there are lots of patterns. Assuming that they are in a vector
toMatch <- c("A1", "A9", "A6")
Then you can create your regular expression directly using paste
and collapse = "|"
.
matches <- unique (grep(paste(toMatch,collapse="|"),
myfile$Letter, value=TRUE))
I suggest writing a little script and doing multiple searches with Grep. I've never found a way to search for multiple patterns, and believe me, I've looked!
Like so, your shell file, with an embedded string:
#!/bin/bash
grep *A6* "Alex A1 Alex A6 Alex A7 Bob A1 Chris A9 Chris A6";
grep *A7* "Alex A1 Alex A6 Alex A7 Bob A1 Chris A9 Chris A6";
grep *A8* "Alex A1 Alex A6 Alex A7 Bob A1 Chris A9 Chris A6";
Then run by typing myshell.sh.
If you want to be able to pass in the string on the command line, do it like this, with a shell argument--this is bash notation btw:
#!/bin/bash
$stingtomatch = "${1}";
grep *A6* "${stingtomatch}";
grep *A7* "${stingtomatch}";
grep *A8* "${stingtomatch}";
And so forth.
If there are a lot of patterns to match, you can put it in a for loop.
Based on Brian Digg's post, here are two helpful functions for filtering lists:
#Returns all items in a list that are not contained in toMatch
#toMatch can be a single item or a list of items
exclude <- function (theList, toMatch){
return(setdiff(theList,include(theList,toMatch)))
}
#Returns all items in a list that ARE contained in toMatch
#toMatch can be a single item or a list of items
include <- function (theList, toMatch){
matches <- unique (grep(paste(toMatch,collapse="|"),
theList, value=TRUE))
return(matches)
}