How would you scan an array of strings for a set of substrings in Objective-C?

情歌与酒 2021-01-27 03:21

So I basically have an array of words and phrases. Some of them contain curses. I want to create a method that automatically scans each of the units in the array for curses. If

5 Answers
  •  终归单人心
    2021-01-27 03:41

    Honestly, I think your problem is that you assume that because parts of the problem can be glossed over in casual speech, it must be an easy problem. Breaking a sentence into words is hard. Examples:

    Words often contain other complete words within them. For example "they" contains "hey". You can't just search for substrings.

    American typographical conventions dictate that you don't put spaces around an em dash. So the correctly written sentence is "hey—how are you?". You can't just split on whitespace, and you can't just remove punctuation either.

    Diacritics are often optional. Even in American English, a minority of publishers (most notably The New Yorker) use a diaeresis; it looks like an umlaut but marks the second of two adjacent vowels as pronounced separately, as in coöperate. In some languages, however, diacritics change the word: in German the umlaut marks a different pronunciation and, for example, distinguishes Apfel (singular) from Äpfel (plural).

    So what exactly would you have Apple add as a simple API-level approach? What should everyone who picked a different option do? It's much smarter to just give you the tools to compose the approach that best suits you.
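
    To make two of the pitfalls above concrete, here is a minimal sketch. The helper names ContainsSubstring and StripDiacritics are made up for illustration; the Foundation calls they wrap (rangeOfString: and stringByFoldingWithOptions:locale:) are standard NSString methods.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if `word` occurs anywhere in `text`,
// even in the middle of a longer word.
static BOOL ContainsSubstring(NSString *text, NSString *word) {
    return [text rangeOfString:word].location != NSNotFound;
}

// Hypothetical helper: removes diacritics from `text`.
// Harmless for "coöperate", but it also turns the plural "Äpfel"
// into the singular "Apfel".
static NSString *StripDiacritics(NSString *text) {
    return [text stringByFoldingWithOptions:NSDiacriticInsensitiveSearch
                                     locale:nil];
}
```

    With these, ContainsSubstring(@"they", @"hey") returns YES, which is the false positive described above, and StripDiacritics cannot tell the English case from the German one.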

    That all being said, I think the neatest and most compact form of what I think you're describing is:

        NSArray *inputSentences =
            @[
                @"hey how are you",
                @"what is going on?",
                @"whats up dude?",
                @"do you want to get chipotle?"
            ];
        NSArray *forbiddenWords =
            @[@"you", @"hey"];
    
        NSSet *forbiddenWordsSet = [NSSet setWithArray:forbiddenWords];
        // Everything that is not a letter counts as a word separator.
        NSCharacterSet *nonLetterSet =
            [[NSCharacterSet letterCharacterSet] invertedSet];

        // Keep a sentence only if none of its words is in the forbidden set.
        NSPredicate *predicate =
            [NSPredicate predicateWithBlock:
                ^BOOL(NSString *evaluatedObject, NSDictionary *bindings)
                {
                    NSArray *words =
                        [evaluatedObject componentsSeparatedByCharactersInSet:nonLetterSet];
                    return ![forbiddenWordsSet intersectsSet:[NSSet setWithArray:words]];
                }];
    
        NSLog(@"%@", [inputSentences filteredArrayUsingPredicate:predicate]);
    

    Though you might want nonLetterSet to be whitespaceCharacterSet instead. Judge for yourself.
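
    To help judge between the two separator sets, here is a small sketch that takes the separator set as a parameter. WordSet and ContainsForbidden are hypothetical names, not part of Foundation.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `sentence` on `separators` and returns the
// pieces as a set. Adjacent separators produce empty strings, which do no
// harm here as long as no forbidden word is empty.
static NSSet<NSString *> *WordSet(NSString *sentence,
                                  NSCharacterSet *separators) {
    NSArray<NSString *> *pieces =
        [sentence componentsSeparatedByCharactersInSet:separators];
    return [NSSet setWithArray:pieces];
}

// Hypothetical helper: YES if any piece of `sentence` is in `forbidden`.
static BOOL ContainsForbidden(NSString *sentence,
                              NSSet<NSString *> *forbidden,
                              NSCharacterSet *separators) {
    return [forbidden intersectsSet:WordSet(sentence, separators)];
}
```

    Splitting "how are you?" on whitespace leaves the piece "you?", which does not equal "you", so the whitespace variant lets that sentence through. Splitting on non-letters catches it, and also separates "hey—how" at the em dash.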

    A predicate is used to filter the collection without an explicit loop and manual accumulation. Set intersections are used to avoid a manual inner loop. The only slightly untidy bit is having to use a block predicate, because each string has to be split into words before it can be tested.

    On the plus side, most of the code is setup. You can create the predicate once, store it somewhere, then apply it to any array or set of strings anywhere in your code with a single one-line call.
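
    One possible way to "store it somewhere" is a function that builds the predicate once and hands back the same instance on every call. This is only a sketch: CleanSentencePredicate is a made-up name, the forbidden words are the sample ones from above, and it assumes ARC and a platform where dispatch_once is available.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: builds the block predicate on first use and
// returns the same instance afterwards.
static NSPredicate *CleanSentencePredicate(void) {
    static NSPredicate *predicate;
    static dispatch_once_t onceToken;
    dispatch_once(&onceToken, ^{
        // Sample word list; a real list would come from elsewhere.
        NSSet *forbidden = [NSSet setWithArray:@[@"you", @"hey"]];
        NSCharacterSet *nonLetters =
            [[NSCharacterSet letterCharacterSet] invertedSet];
        predicate = [NSPredicate predicateWithBlock:
            ^BOOL(NSString *sentence, NSDictionary *bindings) {
                NSArray *words =
                    [sentence componentsSeparatedByCharactersInSet:nonLetters];
                return ![forbidden intersectsSet:[NSSet setWithArray:words]];
            }];
    });
    return predicate;
}
```

    Any call site can then write [someArray filteredArrayUsingPredicate:CleanSentencePredicate()] or [someSet filteredSetUsingPredicate:CleanSentencePredicate()].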

    As noted by other commenters, this will produce a lot of temporary objects.
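
    If the temporaries matter, one alternative (a different technique from the set intersection above, sketched under the assumption that Foundation's own word segmentation is acceptable) is to walk the words in place and stop at the first hit. ContainsForbiddenWord is a hypothetical name.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: YES if any word of `sentence` is in `forbidden`.
// No intermediate array or set is built, and enumeration stops at the
// first match. Each visited word is still a temporary substring.
static BOOL ContainsForbiddenWord(NSString *sentence,
                                  NSSet<NSString *> *forbidden) {
    __block BOOL found = NO;
    [sentence enumerateSubstringsInRange:NSMakeRange(0, sentence.length)
                                 options:NSStringEnumerationByWords
                              usingBlock:^(NSString *word,
                                           NSRange wordRange,
                                           NSRange enclosingRange,
                                           BOOL *stop) {
        if ([forbidden containsObject:word]) {
            found = YES;
            *stop = YES;  // no need to look at the remaining words
        }
    }];
    return found;
}
```

    Word boundaries here are whatever NSStringEnumerationByWords decides, which may differ from splitting on non-letter characters, so the two approaches are not guaranteed to agree on every input.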
