Remove HTML Tags from an NSString on the iPhone

心在旅途 2020-11-22 10:02

There are a couple of different ways to remove HTML tags from an NSString in Cocoa.

One way is to render the string into an

22 answers
  • 2020-11-22 10:15
    #import "RegexKitLite.h"
    
    NSString *text = [html stringByReplacingOccurrencesOfRegex:@"<[^>]+>" withString:@""];
    
  • 2020-11-22 10:15

    I've extended the answer by m.kocikowski and tried to make it a bit more efficient by using an NSMutableString. I've also structured it for use in a static Utils class (I know a Category is probably the best design though), and removed the autorelease so it compiles in an ARC project.

    Included here in case anybody finds it useful.

    .h

    + (NSString *)stringByStrippingHTML:(NSString *)inputString;
    

    .m

    + (NSString *)stringByStrippingHTML:(NSString *)inputString 
    {
      NSMutableString *outString;
    
      if (inputString)
      {
        outString = [[NSMutableString alloc] initWithString:inputString];
    
        if ([inputString length] > 0)
        {
          NSRange r;
    
          while ((r = [outString rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
          {
            [outString deleteCharactersInRange:r];
          }      
        }
      }
    
      return outString; 
    }
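
    The loop above rescans the string from the beginning after every deletion. As a rough sketch of an alternative, Foundation's own regex-aware replace can remove every match in one call, with no RegexKitLite dependency. The function name StripTagsInOnePass is hypothetical and only for illustration; the pattern is the same one used above, so the same limitations apply (for example, a literal ">" inside an attribute value will confuse it).

    ```objectivec
    #import <Foundation/Foundation.h>

    // Hypothetical helper (not from the original answer): removes everything that
    // looks like <...> using NSString's built-in regular-expression search option.
    static NSString *StripTagsInOnePass(NSString *input)
    {
        // nil and empty strings are returned unchanged.
        if (input.length == 0) {
            return input;
        }

        // NSRegularExpressionSearch makes the search string a regex pattern;
        // the range covers the whole string, so all matches are replaced at once.
        return [input stringByReplacingOccurrencesOfString:@"<[^>]+>"
                                                withString:@""
                                                   options:NSRegularExpressionSearch
                                                     range:NSMakeRange(0, input.length)];
    }
    ```

    Called as StripTagsInOnePass(@"<p>Hello <b>world</b></p>"), this should yield the text with the three tags removed. Text such as "1 < 2" is left alone because the pattern needs a closing ">".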
    
  • 2020-11-22 10:18

    Another way:

    Interface:

    -(NSString *) stringByStrippingHTML:(NSString*)inputString;

    Implementation:

    - (NSString *)stringByStrippingHTML:(NSString *)inputString
    {
        // Let NSAttributedString's HTML importer parse the markup (call this on the main thread).
        NSAttributedString *attrString = [[NSAttributedString alloc] initWithData:[inputString dataUsingEncoding:NSUTF8StringEncoding] options:@{NSDocumentTypeDocumentAttribute: NSHTMLTextDocumentType, NSCharacterEncodingDocumentAttribute: @(NSUTF8StringEncoding)} documentAttributes:nil error:nil];
        NSString *str = [attrString string];
    
        // Add further replacements as needed. Each call returns a new string, so reassign the result.
        str = [str stringByReplacingOccurrencesOfString:@"[" withString:@""];
        str = [str stringByReplacingOccurrencesOfString:@"]" withString:@""];
        str = [str stringByReplacingOccurrencesOfString:@"\n" withString:@""];
    
        return str;
    }
    

    Usage:

    cell.exampleClass.text = [self stringByStrippingHTML:[exampleJSONParsingArray valueForKey: @"key"]];

    or simply:

    NSString *myClearStr = [self stringByStrippingHTML:rudeStr];

  • 2020-11-22 10:19

    Use this:

    NSString *myregex = @"<[^>]*>"; //regex to remove any html tag
    
    NSString *htmlString = @"<html>bla bla</html>";
    NSString *stringWithoutHTML = [htmlString stringByReplacingOccurrencesOfRegex:myregex withString:@""];
    

    Don't forget to add #import "RegexKitLite.h" to your code. RegexKitLite can be downloaded from http://regexkit.sourceforge.net/#Downloads

  • 2020-11-22 10:19

    This extends m.kocikowski's and Dan J's answers with more explanation for newcomers.

    1# First, create an Objective-C category on NSString so the code is usable from any class.

    .h

    @interface NSString (NAME_OF_CATEGORY)
    
    - (NSString *)stringByStrippingHTML;
    
    @end
    

    .m

    @implementation NSString (NAME_OF_CATEGORY)
    
    - (NSString *)stringByStrippingHTML
    {
        NSMutableString *outString;
        NSString *inputString = self;
    
        if (inputString)
        {
            outString = [[NSMutableString alloc] initWithString:inputString];
    
            if ([inputString length] > 0)
            {
                NSRange r;
    
                while ((r = [outString rangeOfString:@"<[^>]+>" options:NSRegularExpressionSearch]).location != NSNotFound)
                {
                    [outString deleteCharactersInRange:r];
                }
            }
        }
    
        return outString;
    }
    
    @end
    

    2# Then import the .h file of the category you've just created, e.g.:

    #import "NSString+NAME_OF_CATEGORY.h"
    

    3# Call the method:

    NSString* sub = [result stringByStrippingHTML];
    NSLog(@"%@", sub);
    

    Here, result is the NSString I want to strip the tags from.

  • 2020-11-22 10:20

    Take a look at NSXMLParser. It's a SAX-style parser. You should be able to use it to detect tags or other unwanted elements in the XML document and ignore them, capturing only pure text.
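
    A minimal sketch of that idea, assuming the input is a well-formed XML/XHTML fragment with no XML declaration or DOCTYPE: a delegate accumulates only the character data and ignores the element callbacks. The names TextCollector and PlainTextFromXMLString are hypothetical, introduced here for illustration. Real-world HTML is often not well-formed XML (unclosed tags, entities such as &nbsp;), and in those cases the parse is likely to fail, so this returns nil rather than partial text.

    ```objectivec
    #import <Foundation/Foundation.h>

    // Hypothetical delegate (not from the original answer): keeps the text
    // reported between tags and ignores the tags themselves.
    @interface TextCollector : NSObject <NSXMLParserDelegate>
    @property (nonatomic, strong) NSMutableString *text;
    @end

    @implementation TextCollector

    - (instancetype)init
    {
        if ((self = [super init])) {
            _text = [NSMutableString string];
        }
        return self;
    }

    // NSXMLParser calls this for runs of character data; it may be called
    // several times for one text node, so the pieces are appended.
    - (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
    {
        [self.text appendString:string];
    }

    @end

    // Returns the collected text, or nil if the markup is not well-formed XML.
    static NSString *PlainTextFromXMLString(NSString *markup)
    {
        // Assumption: wrap the fragment in a root element so that input with
        // several top-level nodes still forms a single XML document.
        NSString *wrapped = [NSString stringWithFormat:@"<root>%@</root>", markup];
        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:[wrapped dataUsingEncoding:NSUTF8StringEncoding]];

        TextCollector *collector = [[TextCollector alloc] init];
        parser.delegate = collector; // the parser does not retain its delegate

        return [parser parse] ? [collector.text copy] : nil;
    }
    ```

    Compared with the regex approaches, this decodes the standard XML entities (for example &amp;amp; becomes &amp;) as part of parsing, at the cost of rejecting markup that is not strictly well-formed.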
