I am pulling a JSON file from a site and one of the strings received is:
The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi
I was looking for a pure Swift 3.0 utility to escape to/unescape from HTML character references (i.e. for server-side Swift apps on both macOS and Linux) but didn't find any comprehensive solutions, so I wrote my own implementation: https://github.com/IBM-Swift/swift-html-entities
The package, HTMLEntities, works with HTML4 named character references as well as hex/dec numeric character references, and it recognizes special numeric character references per the W3C HTML5 spec. For example, &#x80; should be unescaped as the Euro sign (Unicode U+20AC) and NOT as the Unicode character U+0080, and certain ranges of numeric character references should be replaced with the replacement character U+FFFD when unescaping.
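To make the special-case handling above concrete, here is a minimal sketch of how a numeric value could be resolved to a scalar. It is not the package's actual code: `resolveNumericReference` is a hypothetical helper, and the override table is only a partial excerpt of the 0x80...0x9F remappings in the HTML5 spec.

```swift
// Hypothetical helper, NOT part of HTMLEntities: maps the numeric value of a
// character reference to the scalar an HTML5-style parser would emit.
func resolveNumericReference(_ value: UInt32) -> UnicodeScalar {
    // Partial excerpt of the HTML5 override table (Windows-1252 remappings).
    let overrides: [UInt32: UInt32] = [
        0x80: 0x20AC, // EURO SIGN
        0x85: 0x2026, // HORIZONTAL ELLIPSIS
        0x91: 0x2018, // LEFT SINGLE QUOTATION MARK
        0x92: 0x2019, // RIGHT SINGLE QUOTATION MARK
        0x93: 0x201C, // LEFT DOUBLE QUOTATION MARK
        0x94: 0x201D, // RIGHT DOUBLE QUOTATION MARK
        0x99: 0x2122, // TRADE MARK SIGN
    ]
    let replacement: UnicodeScalar = "\u{FFFD}"

    // Remapped values win over their literal code points.
    if let mapped = overrides[value], let scalar = UnicodeScalar(mapped) {
        return scalar
    }
    // NULL is replaced rather than emitted.
    if value == 0 {
        return replacement
    }
    // UnicodeScalar(_: UInt32) is failable: it returns nil for surrogates
    // (0xD800...0xDFFF) and for values above 0x10FFFF.
    return UnicodeScalar(value) ?? replacement
}

print(resolveNumericReference(0x80))   // €
print(resolveNumericReference(0xD800)) // U+FFFD replacement character
```

The failable `UnicodeScalar` initializer already rejects surrogates and out-of-range values, so only the override table and the NULL case need explicit handling in this sketch.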
Usage example:
import HTMLEntities
// encode example
let html = "<script>alert(\"abc\")</script>"
print(html.htmlEscape())
// Prints "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"
// decode example
let htmlencoded = "&lt;script&gt;alert(&quot;abc&quot;)&lt;/script&gt;"
print(htmlencoded.htmlUnescape())
// Prints "<script>alert("abc")</script>"
And for OP's example:
print("The Weeknd &#8216;King Of The Fall&#8217; [Video Premiere] | @TheWeeknd | #SoPhi ".htmlUnescape())
// prints "The Weeknd ‘King Of The Fall’ [Video Premiere] | @TheWeeknd | #SoPhi "
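If you only need numeric references like the ones in OP's string and would rather not add a dependency, a rough sketch along these lines may be enough. It is an illustration, not how HTMLEntities is implemented: `decodeNumericReferences` and `digitValue` are hypothetical names, named references such as &amp; are left untouched, and the spec's special remappings are not applied.

```swift
// Hypothetical helper: value of a digit scalar in the given radix (10 or 16),
// or nil if the scalar is not a valid digit for that radix.
func digitValue(_ scalar: UnicodeScalar, radix: UInt32) -> UInt32? {
    let v: UInt32
    switch scalar.value {
    case 48...57:  v = scalar.value - 48       // "0"..."9"
    case 65...70:  v = scalar.value - 65 + 10  // "A"..."F"
    case 97...102: v = scalar.value - 97 + 10  // "a"..."f"
    default: return nil
    }
    return v < radix ? v : nil
}

// Hypothetical sketch: decodes only &#NNNN; and &#xHHHH; references.
func decodeNumericReferences(_ input: String) -> String {
    let scalars = Array(input.unicodeScalars)
    var output = ""
    var i = 0
    while i < scalars.count {
        // A numeric reference starts with "&#".
        if scalars[i] == "&", i + 1 < scalars.count, scalars[i + 1] == "#" {
            var j = i + 2
            var radix: UInt32 = 10
            if j < scalars.count, scalars[j] == "x" || scalars[j] == "X" {
                radix = 16
                j += 1
            }
            // Accumulate at most 7 digits so the value cannot overflow UInt32.
            var value: UInt32 = 0
            var digitCount = 0
            while j < scalars.count, digitCount < 7,
                  let digit = digitValue(scalars[j], radix: radix) {
                value = value * radix + digit
                digitCount += 1
                j += 1
            }
            // Accept only "digits followed by ;" that form a valid scalar.
            if digitCount > 0, j < scalars.count, scalars[j] == ";",
               let scalar = UnicodeScalar(value) {
                output.unicodeScalars.append(scalar)
                i = j + 1
                continue
            }
        }
        // Anything else is copied through unchanged.
        output.unicodeScalars.append(scalars[i])
        i += 1
    }
    return output
}

print(decodeNumericReferences("The Weeknd &#8216;King Of The Fall&#8217;"))
// prints "The Weeknd ‘King Of The Fall’"
```

Malformed input such as `&#;` or a reference with no terminating semicolon is passed through as-is here, which is stricter than what a spec-compliant parser does.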
Edit: HTMLEntities now supports HTML5 named character references as of version 2.0.0. Spec-compliant parsing is also implemented.