How to convert surrogate pair to Unicode scalar in Swift

Submitted by  ̄綄美尐妖づ on 2019-12-18 16:39:08

Question


The following example is taken from the Strings and Characters documentation:

The values 55357 (U+D83D in hex) and 56374 (U+DC36 in hex) form the surrogate pair that encodes the Unicode scalar U+1F436, which is the DOG FACE character. Is there any way to go in the other direction? That is, can I convert a surrogate pair into a scalar?

I tried

let myChar: Character = "\u{D83D}\u{DC36}"

but I got an "Invalid Unicode scalar" error.

This Objective-C answer and this project seem to be custom solutions, but is there anything built into Swift (especially Swift 2.0+) that does this?


Answer 1:


There are formulas to calculate the original code point based on a surrogate pair and vice versa. From https://mathiasbynens.be/notes/javascript-encoding#surrogate-formulae:

Section 3.7 of The Unicode Standard 3.0 defines the algorithms for converting to and from surrogate pairs.

A code point C greater than 0xFFFF corresponds to a surrogate pair <H, L> as per the following formula:

H = Math.floor((C - 0x10000) / 0x400) + 0xD800
L = (C - 0x10000) % 0x400 + 0xDC00

The reverse mapping, i.e. from a surrogate pair <H, L> to a Unicode code point C, is given by:

C = (H - 0xD800) * 0x400 + L - 0xDC00 + 0x10000
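The formulas above are written in JavaScript-style notation. As a minimal sketch, they might be translated to Swift roughly as follows. The function names `scalar(fromHigh:low:)` and `surrogatePair(for:)` are hypothetical helpers made up for this illustration, not part of the standard library, and the code assumes Swift 4 or later (where the `Unicode.Scalar` spelling is available).

```swift
/// Hypothetical helper: combines a UTF-16 surrogate pair <H, L> into a scalar,
/// using C = (H - 0xD800) * 0x400 + L - 0xDC00 + 0x10000.
/// Returns nil if the inputs are not a high surrogate followed by a low surrogate.
func scalar(fromHigh high: UInt16, low: UInt16) -> Unicode.Scalar? {
    guard (0xD800...0xDBFF).contains(high), (0xDC00...0xDFFF).contains(low) else {
        return nil
    }
    // Widen to UInt32 first so the arithmetic cannot overflow 16 bits.
    let codePoint = (UInt32(high) - 0xD800) * 0x400 + (UInt32(low) - 0xDC00) + 0x10000
    return Unicode.Scalar(codePoint)
}

/// Hypothetical helper: splits a code point above 0xFFFF into its surrogate pair.
/// Returns nil for scalars that fit in a single UTF-16 code unit.
func surrogatePair(for scalar: Unicode.Scalar) -> (high: UInt16, low: UInt16)? {
    let c = scalar.value
    guard c > 0xFFFF else { return nil }
    let offset = c - 0x10000
    // Integer division plays the role of Math.floor in the original formula.
    return (UInt16(offset / 0x400 + 0xD800), UInt16(offset % 0x400 + 0xDC00))
}
```

For the DOG FACE example from the question, `scalar(fromHigh: 0xD83D, low: 0xDC36)` yields U+1F436, and `surrogatePair(for: "\u{1F436}")` gives back 0xD83D and 0xDC36.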



Answer 2:


Given a sequence of UTF-16 code units (i.e. 16-bit numbers, such as you get from String.utf16 or just an array of numbers), you can use the UTF16 type and its decode method to turn it into UnicodeScalars, which you can then convert into a String.

It’s a bit of a grungy API: it takes a generator (because it does stateful processing) and returns an enum that indicates either a result (with the scalar as its associated value), an error, or completion. Swift 2.0 pattern matching makes it a lot easier to use:

let u16data: [UInt16] = [0xD83D,0xDC36]
//or let u16data = "Hello, 🌍".utf16

var g = u16data.generate()
var s: String = ""
var utf16 = UTF16()
while case let .Result(scalar) = utf16.decode(&g) {
    print(scalar, &s)
}
print(s) // prints 🐶
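The snippet above uses Swift 2-era names. As a sketch of how the same idea might look in more recent Swift (assuming Swift 4 or later), `generate()` becomes `makeIterator()`, the `.Result` case becomes `.scalarValue`, and the scalar can be appended through the string's `unicodeScalars` view. The second half shows `String(decoding:as:)`, which does the whole conversion in one call but, unlike the loop, repairs invalid input with U+FFFD instead of stopping.

```swift
let u16data: [UInt16] = [0xD83D, 0xDC36]

// Decode code unit by code unit, as in the original answer.
var iterator = u16data.makeIterator()
var decoder = UTF16()
var s = ""
// The loop ends on .emptyInput (input exhausted) or .error (malformed input).
while case .scalarValue(let scalar) = decoder.decode(&iterator) {
    s.unicodeScalars.append(scalar)
}
print(s) // prints 🐶

// One-call alternative; ill-formed sequences are replaced with U+FFFD.
let direct = String(decoding: u16data, as: UTF16.self)
print(direct) // prints 🐶
```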


Source: https://stackoverflow.com/questions/31282675/how-to-convert-surrogate-pair-to-unicode-scalar-in-swift
