How would you scan an array of strings for a set of substrings in Objective-C?

情歌与酒 2021-01-27 03:21

So I basically have an array of words and phrases. Some of them contain curses. I want to create a method that automatically scans each of the units in the array for curses. If

5 Answers
  •  暖寄归人
    2021-01-27 03:27

    As you state you are:

    appalled that I have not been able to find a method of NSString that will search for a bunch of words at the same time

    That seems a strange reaction, since programming is about building solutions after all. Still, here is a solution that searches for all the words at the same time using a single method, though one belonging to NSRegularExpression rather than NSString.

    Our sample data:

    NSArray *sampleLines = @[@"Hey how are you",
                             @"What is going on?",
                             @"What’s up dude?",
                             @"Do you want to get chipotle?",
                             @"They are the youth"
                             ];
    NSArray *stopWords = @[@"you", @"hey"];
    

    The last sample line is there to check that we don't match partial words. The capitalisation is there to test case-insensitive matching.

    We construct a RE to match any of the stop words:

    • \b - word boundary, options set to use Unicode word boundaries in this example
    • (?: ... ) - a non-capturing group, just used as it is slightly faster than a capturing one and it will be the same as the whole match anyway
    • | - or

    Pattern for example stop words: \b(?:you|hey)\b

    // don't forget to use \\ in a string literal to insert a backslash into the pattern
    NSString *pattern = [NSString stringWithFormat:@"\\b(?:%@)\\b", [stopWords componentsJoinedByString:@"|"]];
    NSError *error = nil;
    NSRegularExpression *stopRE = [NSRegularExpression regularExpressionWithPattern:pattern
                                                                            options:(NSRegularExpressionCaseInsensitive | NSRegularExpressionUseUnicodeWordBoundaries)
                                                                              error:&error];
    // check the return value to detect failure; error is only meaningful when the result is nil
    if (stopRE == nil)
    {
        NSLog(@"RE construction failed: %@", error);
        return;
    }
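
One caveat with building the pattern by plain joining: a stop word that contains a regex metacharacter (for example a dot) would be interpreted as pattern syntax. A minimal sketch of one way to guard against that, assuming a hypothetical helper named StopWordPattern (not part of the original answer), is to escape each word with NSRegularExpression's escapedPatternForString: before joining:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption): builds \b(?:w1|w2|...)\b
// with every word escaped so it is matched literally.
static NSString *StopWordPattern(NSArray<NSString *> *words)
{
    NSMutableArray<NSString *> *escaped = [NSMutableArray arrayWithCapacity:words.count];
    for (NSString *word in words)
    {
        // escapedPatternForString: backslash-escapes any pattern metacharacters
        [escaped addObject:[NSRegularExpression escapedPatternForString:word]];
    }
    return [NSString stringWithFormat:@"\\b(?:%@)\\b",
            [escaped componentsJoinedByString:@"|"]];
}
```

The result can be passed to regularExpressionWithPattern:options:error: in place of the hand-built pattern. An empty word list is probably best rejected by the caller, since the resulting pattern would match at any word boundary.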
    

    Iterate through the sample lines, check whether each one contains a stop word, and display the result on the console:

    for (NSString *aLine in sampleLines)
    {
        // check for all words anywhere in line in one go
        NSRange match = [stopRE rangeOfFirstMatchInString:aLine
                                                  options:0
                                                    range:NSMakeRange(0, aLine.length)];
        BOOL containsStopWord = match.location != NSNotFound;
        NSLog(@"%@: %@", aLine, containsStopWord ? @"Bad" : @"OK");
    }
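
Since the question asks for a method, the loop above could be wrapped up roughly as follows. This is only a sketch: the function name LinesContainingStopWords is an assumption, and it returns the offending lines rather than logging them.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name is an assumption): returns the lines in which
// stopRE finds at least one match, in their original order.
static NSArray<NSString *> *LinesContainingStopWords(NSArray<NSString *> *lines,
                                                     NSRegularExpression *stopRE)
{
    NSMutableArray<NSString *> *flagged = [NSMutableArray array];
    for (NSString *line in lines)
    {
        // one search per line covers every stop word in the pattern
        NSRange match = [stopRE rangeOfFirstMatchInString:line
                                                  options:0
                                                    range:NSMakeRange(0, line.length)];
        if (match.location != NSNotFound)
        {
            [flagged addObject:line];
        }
    }
    return flagged;
}
```

Called with the sample lines and the stopRE built earlier, this should flag the first and fourth lines and leave "They are the youth" alone, because the word boundaries rule out partial-word matches.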
    

    Regular expression matching should be efficient. Because the example never copies individual words or matches into NSString objects, it should not create as many temporary objects as methods that enumerate the individual words.

    HTH
