Wondering if there is an easy way to do a simple HTML escape/unescape in Objective C. What I want is something like this psuedo code:
NSString *string = @\"
Here's a solution that neutralizes all characters (by making them all HTML encoded entities for their unicode value)... Used this for my need (making sure a string that came from the user but was placed inside of a webview couldn't have any XSS attacks):
Interface:
@interface NSString (escape)
- (NSString*)stringByEncodingHTMLEntities;
@end
Implementation:
@implementation NSString (escape)
- (NSString*)stringByEncodingHTMLEntities {
// Rather then mapping each individual entity and checking if it needs to be replaced, we simply replace every character with the hex entity
NSMutableString *resultString = [NSMutableString string];
for(int pos = 0; pos<[self length]; pos++)
[resultString appendFormat:@"&#x%x;",[self characterAtIndex:pos]];
return [NSString stringWithString:resultString];
}
@end
Usage Example:
UIWebView *webView = [[UIWebView alloc] init];
NSString *userInput = @"<script>alert('This is an XSS ATTACK!');</script>";
NSString *safeInput = [userInput stringByEncodingHTMLEntities];
[webView loadHTMLString:safeInput baseURL:nil];
Your mileage will vary.
The MREntitiesConverter above is an HTML stripper, not encoder.
If you need an encoder, go here: Encode NSString for XML/HTML
This is an incredibly hacked together solution I did, but if you want to simply escape a string without worrying about parsing, do this:
-(NSString *)htmlEntityDecode:(NSString *)string
{
string = [string stringByReplacingOccurrencesOfString:@""" withString:@"\""];
string = [string stringByReplacingOccurrencesOfString:@"'" withString:@"'"];
string = [string stringByReplacingOccurrencesOfString:@"<" withString:@"<"];
string = [string stringByReplacingOccurrencesOfString:@">" withString:@">"];
string = [string stringByReplacingOccurrencesOfString:@"&" withString:@"&"]; // Do this last so that, e.g. @"&lt;" goes to @"<" not @"<"
return string;
}
I know it's by no means elegant, but it gets the job done. You can then decode an element by calling:
string = [self htmlEntityDecode:string];
Like I said, it's hacky but it works. IF you want to encode a string, just reverse the stringByReplacingOccurencesOfString parameters.
This link contains the solution below. Cocoa CF has the CFXMLCreateStringByUnescapingEntities function but that's not available on the iPhone.
@interface MREntitiesConverter : NSObject <NSXMLParserDelegate>{
NSMutableString* resultString;
}
@property (nonatomic, retain) NSMutableString* resultString;
- (NSString*)convertEntitiesInString:(NSString*)s;
@end
@implementation MREntitiesConverter
@synthesize resultString;
- (id)init
{
if([super init]) {
resultString = [[NSMutableString alloc] init];
}
return self;
}
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)s {
[self.resultString appendString:s];
}
- (NSString*)convertEntitiesInString:(NSString*)s {
if (!s) {
NSLog(@"ERROR : Parameter string is nil");
}
NSString* xmlStr = [NSString stringWithFormat:@"<d>%@</d>", s];
NSData *data = [xmlStr dataUsingEncoding:NSUTF8StringEncoding allowLossyConversion:YES];
NSXMLParser* xmlParse = [[[NSXMLParser alloc] initWithData:data] autorelease];
[xmlParse setDelegate:self];
[xmlParse parse];
return [NSString stringWithFormat:@"%@",resultString];
}
- (void)dealloc {
[resultString release];
[super dealloc];
}
@end
Another HTML NSString category from Google Toolbox for Mac
Despite the name, this works on iOS too.
http://google-toolbox-for-mac.googlecode.com/svn/trunk/Foundation/GTMNSString+HTML.h
/// Get a string where internal characters that are escaped for HTML are unescaped
//
/// For example, '&' becomes '&'
/// Handles   and 2 cases as well
///
// Returns:
// Autoreleased NSString
//
- (NSString *)gtm_stringByUnescapingFromHTML;
And I had to include only three files in the project: header, implementation and GTMDefines.h
.
If you need to generate a literal you might consider using a tool like this:
http://www.freeformatter.com/java-dotnet-escape.html#ad-output
to accomplish the work for you.
See also this answer.