Fastest way to get array of NSRange objects for all uppercase letters in an NSString?

臣服心动 2021-02-10 09:06

I need NSRange objects for the position of each uppercase letter in a given NSString for input into a method for a custom attributed string class. 

There are of course q

4 Answers
  •  灰色年华
    2021-02-10 09:32

    The simplest way is probably to use -rangeOfCharacterFromSet:options:range: with [NSCharacterSet uppercaseLetterCharacterSet]. By restarting the search just past each match, you can find all of the uppercase letters pretty easily. Something like the following will give you an NSArray of all ranges (wrapped in NSValues):

    - (NSArray *)rangesOfUppercaseLettersInString:(NSString *)str {
        NSCharacterSet *cs = [NSCharacterSet uppercaseLetterCharacterSet];
        NSMutableArray *results = [NSMutableArray array];
        NSRange searchRange = NSMakeRange(0, [str length]);
        NSRange range;
        while ((range = [str rangeOfCharacterFromSet:cs options:0 range:searchRange]).location != NSNotFound) {
            [results addObject:[NSValue valueWithRange:range]];
            searchRange = NSMakeRange(NSMaxRange(range), [str length] - NSMaxRange(range));
        }
        return results;
    }
    

    Note, this will not coalesce adjacent ranges into a single range, but that's easy enough to add.
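
    A minimal sketch of that coalescing step, assuming the input array is sorted by location (as the method above produces it). CoalesceAdjacentRanges is a hypothetical helper name, not part of Foundation:

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical helper: merges ranges that touch end-to-start.
    // Assumes `ranges` holds NSValue-wrapped NSRanges sorted by location.
    static NSArray *CoalesceAdjacentRanges(NSArray *ranges) {
        NSMutableArray *merged = [NSMutableArray array];
        for (NSValue *value in ranges) {
            NSRange current = [value rangeValue];
            NSValue *lastValue = [merged lastObject];
            if (lastValue != nil && NSMaxRange([lastValue rangeValue]) == current.location) {
                // The previous range ends where this one starts: extend it.
                NSRange previous = [lastValue rangeValue];
                previous.length += current.length;
                [merged replaceObjectAtIndex:[merged count] - 1
                                  withObject:[NSValue valueWithRange:previous]];
            } else {
                [merged addObject:value];
            }
        }
        return merged;
    }
    ```

    Passing the result of the method above through this helper yields one range per run of capitals, e.g. a single {0, 2} for "ABc" rather than two one-character ranges.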

    Here's an alternative solution based on NSScanner:

    - (NSArray *)rangesOfUppercaseLettersInString:(NSString *)str {
        NSCharacterSet *cs = [NSCharacterSet uppercaseLetterCharacterSet];
        NSMutableArray *results = [NSMutableArray array];
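        NSScanner *scanner = [NSScanner scannerWithString:str];
        [scanner setCharactersToBeSkipped:nil]; // the default skips whitespace, which could make the recorded location lag behind the match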
        while (![scanner isAtEnd]) {
            [scanner scanUpToCharactersFromSet:cs intoString:NULL]; // skip non-uppercase characters
            NSString *temp;
            NSUInteger location = [scanner scanLocation];
            if ([scanner scanCharactersFromSet:cs intoString:&temp]) {
                // found one (or more) uppercase characters
                NSRange range = NSMakeRange(location, [temp length]);
                [results addObject:[NSValue valueWithRange:range]];
            }
        }
        return results;
    }
    

    Unlike the first version, this one coalesces adjacent uppercase characters into a single range.

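    Edit: If you're looking for absolute speed, this one will likely be the fastest of the three presented here, while still handling Unicode correctly for characters in the Basic Multilingual Plane. Note that -characterIsMember: tests one UTF-16 code unit at a time, so uppercase letters stored as surrogate pairs are not detected (also note, I have not tried compiling this):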

    // Returns a pointer to a C array of NSRanges and stores the number of ranges in *count.
    // The array is owned by an autoreleased NSMutableData (manual reference counting assumed);
    // copy the ranges if they must outlive the current autorelease pool.
    - (NSRange *)rangesOfUppercaseLettersInString:(NSString *)string count:(NSUInteger *)count {
        NSMutableData *data = [NSMutableData data];
        NSUInteger numRanges = 0;
        NSUInteger length = [string length];
        // Copy the UTF-16 code units out once so the loop avoids per-character message sends to the string.
        unichar *buffer = malloc(sizeof(unichar) * length);
        [string getCharacters:buffer range:NSMakeRange(0, length)];
        NSCharacterSet *cs = [NSCharacterSet uppercaseLetterCharacterSet];
        NSRange range = {NSNotFound, 0};
        for (NSUInteger i = 0; i < length; i++) {
            if ([cs characterIsMember:buffer[i]]) {
                if (range.location == NSNotFound) {
                    range = (NSRange){i, 0}; // start a new run of uppercase letters
                }
                range.length++;
            } else if (range.location != NSNotFound) {
                // the run just ended; record it
                [data appendBytes:&range length:sizeof(range)];
                numRanges++;
                range = (NSRange){NSNotFound, 0};
            }
        }
        if (range.location != NSNotFound) {
            // the string ended in the middle of a run
            [data appendBytes:&range length:sizeof(range)];
            numRanges++;
        }
        free(buffer); // release the temporary UTF-16 buffer
        if (count) *count = numRanges;
        return [data mutableBytes];
    }
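
    Because -characterIsMember: sees only one UTF-16 code unit at a time, uppercase letters outside the Basic Multilingual Plane would be missed by the loop above. The following is a rough sketch of one way to handle them, not a tuned implementation: it pairs up surrogates and tests the combined code point with -longCharacterIsMember:. UppercaseRangesIncludingSurrogates is a hypothetical name, and the sketch uses -characterAtIndex: for brevity, which gives up some of the speed of the buffer-based version:

    ```objc
    #import <Foundation/Foundation.h>

    // Hypothetical variant: returns NSValue-wrapped ranges of uppercase runs,
    // treating a valid surrogate pair as a single character of width 2.
    static NSArray *UppercaseRangesIncludingSurrogates(NSString *string) {
        NSCharacterSet *cs = [NSCharacterSet uppercaseLetterCharacterSet];
        NSMutableArray *results = [NSMutableArray array];
        NSUInteger length = [string length];
        NSRange run = {NSNotFound, 0};
        NSUInteger i = 0;
        while (i < length) {
            unichar c = [string characterAtIndex:i];
            NSUInteger width = 1;
            UTF32Char scalar = c;
            if (CFStringIsSurrogateHighCharacter(c) && i + 1 < length) {
                unichar low = [string characterAtIndex:i + 1];
                if (CFStringIsSurrogateLowCharacter(low)) {
                    // Combine the pair into one code point.
                    scalar = CFStringGetLongCharacterForSurrogatePair(c, low);
                    width = 2;
                }
            }
            if ([cs longCharacterIsMember:scalar]) {
                if (run.location == NSNotFound) run = NSMakeRange(i, 0);
                run.length += width;
            } else if (run.location != NSNotFound) {
                [results addObject:[NSValue valueWithRange:run]];
                run = NSMakeRange(NSNotFound, 0);
            }
            i += width;
        }
        if (run.location != NSNotFound) {
            [results addObject:[NSValue valueWithRange:run]];
        }
        return results;
    }
    ```

    Ranges are still expressed in UTF-16 units, so they can be passed to NSString and attributed-string APIs unchanged.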
    
