Question
I'm having trouble getting NSRegularExpression to match patterns on strings with wider (?) Unicode characters in them. It looks like the problem is the range parameter -- Swift counts individual Unicode characters, while Objective-C treats strings as if they're made up of UTF-16 code units.
Here is my test string and two regular expressions:
let str = "dog🐶🐮cow"
let dogRegex = NSRegularExpression(pattern: "d.g", options: nil, error: nil)!
let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!
I can match the first regex with no problems:
let dogMatch = dogRegex.firstMatchInString(str, options: nil,
    range: NSRange(location: 0, length: countElements(str)))
println(dogMatch?.range) // (0, 3)
But the second fails with the same parameters, because the range I send it (0...7) isn't long enough to cover the whole string as far as NSRegularExpression is concerned:
let cowMatch = cowRegex.firstMatchInString(str, options: nil,
    range: NSRange(location: 0, length: countElements(str)))
println(cowMatch?.range) // nil
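The mismatch between the two counts can be checked directly. This is a minimal sketch, assuming Swift 4 or later, where countElements(str) and str.utf16Count have become str.count and str.utf16.count; the variable names are illustrative only.

```swift
import Foundation

let str = "dog🐶🐮cow"

// Swift's String counts user-visible characters (grapheme clusters).
let characterCount = str.count

// NSString, and therefore NSRegularExpression, measures length in
// UTF-16 code units; each emoji here is a surrogate pair and counts twice.
let utf16Count = str.utf16.count

print(characterCount, utf16Count) // prints "8 10"

// The NSString view of the same text reports the UTF-16 length.
print(NSString(string: str).length == utf16Count) // prints "true"
```

A range of length 8 therefore stops after the "c" of "cow" in UTF-16 terms, which is why the second pattern cannot match.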
If I use a different range I can make the match succeed:
let cowMatch2 = cowRegex.firstMatchInString(str, options: nil,
    range: NSRange(location: 0, length: str.utf16Count))
println(cowMatch2?.range) // (7, 3)
but then I don't know how to extract the matched text out of the string, since that range falls outside the range of the Swift string.
Answer 1:
Turns out you can fight fire with fire. Using the Swift-native string's utf16Count property and the substringWithRange: method of NSString -- not String -- gets the right result. Here's the full working code:
let str = "dog🐶🐮cow"
let cowRegex = NSRegularExpression(pattern: "c.w", options: nil, error: nil)!
if let cowMatch = cowRegex.firstMatchInString(str, options: nil,
    range: NSRange(location: 0, length: str.utf16Count)) {
    println((str as NSString).substringWithRange(cowMatch.range))
    // prints "cow"
}
(I figured this out in the process of writing the question; score one for rubber duck debugging.)
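The code above is written for the Swift version current at the time. For later Swift versions (4 and up), the same idea might look roughly like the sketch below. It relies on Foundation's NSRange(_:in:) and Range(_:in:) initializers to convert between String indices and UTF-16 offsets; firstRegexMatch is a hypothetical helper name introduced here for illustration, not part of the original answer.

```swift
import Foundation

let str = "dog🐶🐮cow"

// Hypothetical helper: returns the text of the first match, or nil
// if the pattern is invalid or nothing matches.
func firstRegexMatch(of pattern: String, in text: String) -> Substring? {
    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return nil
    }
    // Build the search range in UTF-16 units from the string's own indices,
    // so it always covers the whole string.
    let searchRange = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: searchRange),
          // Convert the UTF-16 based NSRange back to Range<String.Index>.
          let range = Range(match.range, in: text) else {
        return nil
    }
    return text[range]
}

if let cow = firstRegexMatch(of: "c.w", in: str) {
    print(cow) // prints "cow"
}
```

Converting the range back with Range(_:in:) keeps the result a native Swift substring, so no cast to NSString is needed to extract the matched text.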
Source: https://stackoverflow.com/questions/25882503/how-can-i-use-nsregularexpression-on-swift-strings-with-variable-width-unicode-c