So I basically have an array of words and phrases. Some of them contain curses. I want to create a method that automatically scans each of the units in the array for curses. If
You state that you are:
"appalled that I have not been able to find a method of NSString that will search for a bunch of words at the same time"
This seems a strange reaction, since programming is about building solutions after all. Here is a solution which searches for all the words at the same time using a single method, though one belonging to NSRegularExpression rather than NSString.
Our sample data:
NSArray *sampleLines = @[@"Hey how are you",
                         @"What is going on?",
                         @"What’s up dude?",
                         @"Do you want to get chipotle?",
                         @"They are the youth"
                        ];
NSArray *stopWords = @[@"you", @"hey"];
The last sample line is there to check that we don't match partial words. The capitalisation is there to test case-insensitive matching.
We construct a regular expression (RE) to match any of the stop words:
\b - word boundary; the options are set to use Unicode word boundaries in this example
(?: ... ) - a non-capturing group, used because it is slightly faster than a capturing one and would be the same as the whole match anyway
| - or
Pattern for the example stop words: \b(?:you|hey)\b
// don't forget to use \\ in a string literal to insert a backslash into the pattern
NSString *pattern = [NSString stringWithFormat:@"\\b(?:%@)\\b", [stopWords componentsJoinedByString:@"|"]];
NSError *error = nil;
NSRegularExpression *stopRE = [NSRegularExpression regularExpressionWithPattern:pattern
    options:(NSRegularExpressionCaseInsensitive | NSRegularExpressionUseUnicodeWordBoundaries)
    error:&error];
// always check for failure: test the returned object, then consult the error
if (stopRE == nil)
{
    NSLog(@"RE construction failed: %@", error);
    return;
}
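The pattern above inserts the stop words verbatim, which is fine for plain words such as "you" and "hey". If a stop word could contain regular expression metacharacters (".", "+", "(" and so on), it is probably safer to escape each word first. A minimal sketch, assuming a hypothetical helper named StopWordPattern (not part of the code above) and using NSRegularExpression's escapedPatternForString: class method:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, introduced here only for illustration.
// Builds \b(?:word1|word2|...)\b with every stop word escaped, so that
// regex metacharacters inside a word are matched literally.
static NSString *StopWordPattern(NSArray<NSString *> *stopWords)
{
    NSMutableArray<NSString *> *escaped = [NSMutableArray arrayWithCapacity:stopWords.count];
    for (NSString *word in stopWords)
    {
        // escapedPatternForString: adds backslashes where needed
        [escaped addObject:[NSRegularExpression escapedPatternForString:word]];
    }
    return [NSString stringWithFormat:@"\\b(?:%@)\\b",
            [escaped componentsJoinedByString:@"|"]];
}
```

The result can be passed to regularExpressionWithPattern:options:error: in place of the pattern string built earlier. Note that \b only behaves as expected when a stop word starts and ends with a word character, and that an empty stop word list would produce a pattern matching at any word boundary, so it is likely worth guarding against that case.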
Iterate through the sample lines, check whether each one contains a stop word, and display the result on the console:
for (NSString *aLine in sampleLines)
{
    // check for all words anywhere in line in one go
    NSRange match = [stopRE rangeOfFirstMatchInString:aLine
                                              options:0
                                                range:NSMakeRange(0, aLine.length)];
    BOOL containsStopWord = match.location != NSNotFound;
    NSLog(@"%@: %@", aLine, containsStopWord ? @"Bad" : @"OK");
}
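Since the question asks for a method that scans the whole array, the loop above could be wrapped up along the following lines. This is only a sketch: the function name LinesContainingStopWords is hypothetical, and it assumes the caller has already built the regular expression as shown earlier.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper, introduced here only for illustration.
// Returns the lines in which stopRE finds at least one match,
// preserving their original order.
static NSArray<NSString *> *LinesContainingStopWords(NSArray<NSString *> *lines,
                                                     NSRegularExpression *stopRE)
{
    NSIndexSet *hits = [lines indexesOfObjectsPassingTest:
                        ^BOOL(NSString *line, NSUInteger idx, BOOL *stop) {
        // same single-call check as in the loop above
        NSRange match = [stopRE rangeOfFirstMatchInString:line
                                                  options:0
                                                    range:NSMakeRange(0, line.length)];
        return match.location != NSNotFound;
    }];
    return [lines objectsAtIndexes:hits];
}
```

With the sample data and stop words used in this answer, the call would be LinesContainingStopWords(sampleLines, stopRE), and the lines it returns are the ones the loop reports as "Bad".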
Regular expression matching should be efficient. Because the example never copies individual words or matches into NSString objects, it should not create many temporary objects, unlike methods which enumerate the individual words.
HTH