Read the UTF-8 character at a specified position in an NSString

隐瞒了意图╮ 2021-02-03 13:12

    NSString* str = @"1二3四5";
    NSLog(@"%c",[str characterAtIndex:0]);
    NSLog(@"%c",[str characterAtIndex:1]);

NSString - characterAtIndex works we

3 answers
  • 2021-02-03 13:32

    Unfortunately Dave's answer doesn't actually do what you want. The index supplied to rangeOfComposedCharacterSequenceAtIndex is the index of a UTF-16 code unit, and one or two code units make up a code point. So index 1 is not the second code point if the first code point in the string requires 2 code units. (rangeOfComposedCharacterSequenceAtIndex returns the range of the character that includes the code unit at the given index, so if your first character requires 2 code units, passing an index of 0 or 1 returns the same range.)
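
    To make the code unit versus code point distinction concrete, here is a minimal sketch in plain C (which compiles unchanged inside an Objective-C file). The names utf16_range and code_point_range_at are hypothetical helpers, not Foundation API, and the sketch only handles surrogate pairs; the real rangeOfComposedCharacterSequenceAtIndex also groups combining marks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a range measured in UTF-16 code units. */
typedef struct {
    size_t location;
    size_t length;
} utf16_range;

static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Returns the range of the code point that contains code unit `index`.
   The caller must guarantee index < count. */
static utf16_range code_point_range_at(const uint16_t *units, size_t count,
                                       size_t index)
{
    utf16_range r = { index, 1 };
    if (is_high_surrogate(units[index]) && index + 1 < count &&
        is_low_surrogate(units[index + 1])) {
        /* index points at the first half of a surrogate pair */
        r.length = 2;
    } else if (is_low_surrogate(units[index]) && index > 0 &&
               is_high_surrogate(units[index - 1])) {
        /* index points at the second half: back up to the first half */
        r.location = index - 1;
        r.length = 2;
    }
    return r;
}
```

    For the code units { 0xD83D, 0xDE00, 0x0032 } (an emoji followed by "2"), indexes 0 and 1 both map to the range {0, 2}, and the second code point only starts at index 2.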

    If you want to find the UTF-8 sequence for a character you can use UTF8String and then parse the resultant bytes to find the byte sequence for the nth character. Or you can likewise use rangeOfComposedCharacterSequenceAtIndex starting at index 0 and iterate till you get to the nth character, then convert the 1 or 2 UTF-16 code units to UTF-8 code units.
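
    The last step mentioned above, turning 1 or 2 UTF-16 code units into UTF-8 code units, might look roughly like this in plain C. The function name utf16_to_utf8 is a hypothetical helper rather than anything Foundation provides, and it assumes the input is exactly one code point; a composed sequence longer than 2 code units (for example a base letter plus combining marks) would need to be split into code points first.

```c
#include <stddef.h>
#include <stdint.h>

/* Encodes the single code point held in `units` (1 code unit, or a
   surrogate pair of 2) as UTF-8 into `out`.
   Returns the number of bytes written (1 to 4), or 0 on invalid input. */
static size_t utf16_to_utf8(const uint16_t *units, size_t count,
                            unsigned char out[4])
{
    uint32_t cp;

    if (count == 1) {
        if (units[0] >= 0xD800 && units[0] <= 0xDFFF)
            return 0;                       /* lone surrogate */
        cp = units[0];
    } else if (count == 2) {
        if (units[0] < 0xD800 || units[0] > 0xDBFF)
            return 0;                       /* not a high surrogate */
        if (units[1] < 0xDC00 || units[1] > 0xDFFF)
            return 0;                       /* not a low surrogate */
        cp = 0x10000 + ((((uint32_t)units[0] - 0xD800) << 10) |
                        ((uint32_t)units[1] - 0xDC00));
    } else {
        return 0;
    }

    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (unsigned char)(0xF0 | (cp >> 18));
    out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (unsigned char)(0x80 | (cp & 0x3F));
    return 4;
}
```

    With this sketch, the single code unit 0x4E8C (二 from the question) becomes the three bytes E4 BA 8C, and the surrogate pair D83D DE00 becomes the four bytes F0 9F 98 80.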

    I hope we're all missing something and this is built-in...

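    A start of a category which might help (it returns a location of NSNotFound when number is out of range):

    #import <Foundation/Foundation.h>
    
    @interface NSString (UTF)
    
    // Range, in UTF-16 code units, of the number-th (zero-based) composed character sequence.
    - (NSRange) rangeOfUTFCodePoint:(NSUInteger)number;
    
    @end
    
    @implementation NSString (UTF)
    
    - (NSRange) rangeOfUTFCodePoint:(NSUInteger)number
    {
        NSUInteger codeUnit = 0; // index of the next unvisited code unit
        NSRange result = NSMakeRange(NSNotFound, 0);
        for(NSUInteger ix = 0; ix <= number; ix++)
        {
            // Bounds check: the string holds fewer than number + 1 characters
            if(codeUnit >= self.length)
                return NSMakeRange(NSNotFound, 0);
            result = [self rangeOfComposedCharacterSequenceAtIndex:codeUnit];
            codeUnit += result.length;
        }
        return result;
    }
    
    @end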
    

    That said, this sort of work is more efficient on a char * buffer than on an NSString.
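
    As a rough illustration of the char * route, the sketch below walks a NUL-terminated UTF-8 buffer (such as the one returned by UTF8String) and locates the byte sequence of the nth code point. The names utf8_sequence_length and utf8_nth_code_point are hypothetical, and the validation is deliberately minimal: it checks lead and continuation bytes but does not reject overlong encodings.

```c
#include <stddef.h>

/* Number of bytes in the UTF-8 sequence that starts with `lead`,
   or 0 if `lead` cannot start a sequence. */
static size_t utf8_sequence_length(unsigned char lead)
{
    if (lead < 0x80)           return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 0;
}

/* Finds the n-th (zero-based) code point in the NUL-terminated UTF-8
   string `s`. On success returns a pointer to its first byte and stores
   its byte count in *out_len. Returns NULL if the string has fewer than
   n + 1 code points or contains a malformed sequence. */
static const char *utf8_nth_code_point(const char *s, size_t n,
                                       size_t *out_len)
{
    const unsigned char *p = (const unsigned char *)s;

    for (;;) {
        if (*p == '\0')
            return NULL;                    /* ran off the end */

        size_t len = utf8_sequence_length(*p);
        if (len == 0)
            return NULL;                    /* invalid lead byte */

        /* Every following byte must be a continuation byte (10xxxxxx).
           This also stops at a premature NUL terminator. */
        for (size_t i = 1; i < len; i++) {
            if ((p[i] & 0xC0) != 0x80)
                return NULL;
        }

        if (n == 0) {
            *out_len = len;
            return (const char *)p;
        }
        n--;
        p += len;
    }
}
```

    For the question's string "1二3四5", asking for code point 1 yields a pointer one byte into the buffer with a length of 3, which is the UTF-8 sequence for 二.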

  • 2021-02-03 13:37

    You'd use the more verbose methods:

    NSRange rangeOfSecondCharacter = [str rangeOfComposedCharacterSequenceAtIndex:1];
    NSString *secondCharacter = [str substringWithRange:rangeOfSecondCharacter];
    

    ...with proper bounds and range checking, of course. Note that this gives you an NSString, an object, not a unichar or some other primitive data type.

  • 2021-02-03 13:51

    Why not try something like this:

    const char *yourWantedCharacter = [[yourSourceString substringWithRange:yourRange] UTF8String];
    

    where yourSourceString is your NSString object and yourRange is an NSRange whose location is the index of the needed character and whose length is 1 (a length of 0 would give an empty substring). Note that a length of 1 only covers characters that fit in a single UTF-16 code unit.
