How can I use NSRegularExpression on Swift strings with variable-width Unicode characters?

Submitted by 梦想的初衷 on 2019-12-21 09:14:53

Question


I'm having trouble getting NSRegularExpression to match patterns on strings that contain Unicode characters needing more than one UTF-16 code unit, such as emoji. It looks like the problem is the range parameter -- Swift counts individual Unicode characters, while NSRegularExpression (like NSString) measures ranges in UTF-16 code units, so the two lengths disagree as soon as such a character appears.
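
The mismatch can be seen by counting the same string both ways. This is a minimal sketch in current Swift syntax, where `count` and `utf16.count` stand in for the older `countElements` and `utf16Count` used in this question; the name `sample` is only illustrative.

```swift
import Foundation

let sample = "dog🐶🐮cow"

// Character view: one element per user-perceived character.
let characterCount = sample.count

// UTF-16 view: each emoji here needs two code units (a surrogate pair).
let utf16Count = sample.utf16.count

// NSString.length also reports UTF-16 code units, which is what NSRange indexes.
let nsLength = (sample as NSString).length

print(characterCount, utf16Count, nsLength)  // prints "8 10 10"
```

The two emoji account for the difference of two between the counts.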

Here is my test string and two regular expressions:

let str = "dog🐶🐮cow"
let dogRegex = NSRegularExpression(pattern: "d.g", options: nil, error: nil)!
let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!

I can match the first regex with no problems:

let dogMatch = dogRegex.firstMatchInString(str, options: nil, 
                   range: NSRange(location: 0, length: countElements(str)))
println(dogMatch?.range)  // (0, 3)

But the second fails with the same parameters, because the range I send it (0...7) isn't long enough to cover the whole string as far as NSRegularExpression is concerned:

let cowMatch = cowRegex.firstMatchInString(str, options: nil, 
                   range: NSRange(location: 0, length: countElements(str)))
println(cowMatch?.range)  // nil
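
To see why this search comes back empty, it may help to look at what a range of length 8 actually covers when it is read as UTF-16 code units. A small sketch in current Swift syntax, with the illustrative names `sample`, `shortRange`, and `covered`:

```swift
import Foundation

let sample = "dog🐶🐮cow" as NSString

// A length of 8 (the Character count) is read by NSString as 8 UTF-16 code units.
let shortRange = NSRange(location: 0, length: 8)

// The range ends right after the "c", so "ow" is never searched.
let covered = sample.substring(with: shortRange)
print(covered)  // prints "dog🐶🐮c"
```

Because the searched portion ends after the "c", the pattern "c.w" has nothing left to match.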

If I use a different range I can make the match succeed:

let cowMatch2 = cowRegex.firstMatchInString(str, options: nil, 
                    range: NSRange(location: 0, length: str.utf16Count))
println(cowMatch2?.range)  // (7, 3)

but then I don't know how to extract the matched text out of the string, since that range falls outside the range of the Swift string.


Answer 1:


Turns out you can fight fire with fire. Using the Swift-native string's utf16Count property and the substringWithRange: method of NSString -- not String -- gets the right result. Here's the full working code:

let str = "dog🐶🐮cow"
let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!

if let cowMatch = cowRegex.firstMatchInString(str, options: nil,
                      range: NSRange(location: 0, length: str.utf16Count)) {
    println((str as NSString).substringWithRange(cowMatch.range))
    // prints "cow"
}

(I figured this out in the process of writing the question; score one for rubber duck debugging.)
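
For readers on newer Swift versions, the same idea can probably be written without casting to NSString. The sketch below assumes Swift 4 or later, where `NSRange(_:in:)` and `Range(_:in:)` convert between String indices and UTF-16 based ranges. The helper `firstMatch(of:in:)` is a hypothetical name introduced here, not part of Foundation or of the original answer.

```swift
import Foundation

// Hypothetical helper: returns the first substring matching `pattern`, or nil.
func firstMatch(of pattern: String, in text: String) -> String? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return nil }

    // NSRange(_:in:) measures the whole string in UTF-16 code units.
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)

    guard let match = regex.firstMatch(in: text, options: [], range: fullRange),
          // Range(_:in:) maps the UTF-16 based NSRange back to String.Index bounds.
          let swiftRange = Range(match.range, in: text) else { return nil }

    return String(text[swiftRange])
}

let str = "dog🐶🐮cow"
print(firstMatch(of: "c.w", in: str) ?? "no match")  // prints "cow"
```

Converting in both directions keeps the UTF-16 arithmetic inside Foundation, so the calling code never has to count code units by hand.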



Source: https://stackoverflow.com/questions/25882503/how-can-i-use-nsregularexpression-on-swift-strings-with-variable-width-unicode-c
