Remove HTML Tags from an NSString on the iPhone

心在旅途 2020-11-22 10:02

There are a couple of different ways to remove HTML tags from an NSString in Cocoa.

One way is to render the string into an

22 Answers
  •  情深已故
    2020-11-22 10:37

    Here's a more efficient solution than the accepted answer:

    - (NSString*)hp_stringByRemovingTags
    {
        static NSRegularExpression *regex = nil;
        static dispatch_once_t onceToken;
        dispatch_once(&onceToken, ^{
            regex = [NSRegularExpression regularExpressionWithPattern:@"<[^>]+>" options:kNilOptions error:nil];
        });
    
        // Use reverse enumerator to delete characters without affecting indexes
        NSArray *matches = [regex matchesInString:self options:kNilOptions range:NSMakeRange(0, self.length)];
        NSEnumerator *enumerator = matches.reverseObjectEnumerator;
    
        NSTextCheckingResult *match = nil;
        NSMutableString *modifiedString = self.mutableCopy;
        while ((match = [enumerator nextObject]))
        {
            [modifiedString deleteCharactersInRange:match.range];
        }
        return modifiedString;
    }
    

    The NSString category method above uses a regular expression to find all matching tags, makes a single mutable copy of the original string, and then deletes the tags from that copy in place. It iterates over the matches in reverse order so that deleting one range does not shift the ranges still to be processed. It's more efficient because:

    • The regular expression is initialised only once.
    • A single copy of the original string is used.

    This performed well enough for me, but a solution using NSScanner might be more efficient.

    Like the accepted answer, this solution doesn't address all the edge cases requested by @lfalin. Those would require much more expensive parsing, which the average use case most likely doesn't need.
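For comparison, here is a rough sketch of what the NSScanner variant mentioned above could look like. It is not from the original answer and has not been benchmarked, so whether it is actually faster is an open question. The function name `HPStripTagsWithScanner` is hypothetical, and it is written as a plain function rather than a category method only to keep the example self-contained. It aims to drop the same spans as the `<[^>]+>` pattern: a `<` followed by at least one character and a closing `>`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): strips anything
// that the pattern <[^>]+> would match, using NSScanner instead of a regex.
static NSString *HPStripTagsWithScanner(NSString *html)
{
    NSScanner *scanner = [NSScanner scannerWithString:html];
    // By default NSScanner skips whitespace and newlines before each scan;
    // turn that off so spacing in the text is preserved.
    scanner.charactersToBeSkipped = nil;

    NSMutableString *result = [NSMutableString stringWithCapacity:html.length];
    while (!scanner.isAtEnd) {
        // Copy the plain text up to the next '<' (or to the end of the string).
        NSString *text = nil;
        if ([scanner scanUpToString:@"<" intoString:&text]) {
            [result appendString:text];
        }
        // Consume the '<'. If there is none, the end of the string was reached.
        if (![scanner scanString:@"<" intoString:NULL]) {
            break;
        }

        // Read the candidate tag body, then look for the closing '>'.
        NSString *body = nil;
        BOOL hasBody = [scanner scanUpToString:@">" intoString:&body];
        if ([scanner scanString:@">" intoString:NULL]) {
            // "<>" has an empty body, so the regex would not match it: keep it.
            if (!hasBody) {
                [result appendString:@"<>"];
            }
            // Otherwise this is a complete tag and is dropped.
        } else {
            // No closing '>' before the end of the string: keep the text as is.
            [result appendString:@"<"];
            if (hasBody) {
                [result appendString:body];
            }
        }
    }
    return [result copy];
}
```

Calling `HPStripTagsWithScanner(@"<p>Hello <b>world</b></p>")` yields `Hello world`, while a string with an unterminated `<`, such as `a < b`, comes back unchanged. The scanner makes a single forward pass and builds the result by appending, so it avoids both the array of match objects and the repeated in-place deletions. It shares the regex version's limitations regarding edge cases.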
