How can I check if a string contains Chinese in Swift?

流过昼夜 submitted on 2019-12-03 03:46:28
Martin R

This answer to "How to determine if a character is a Chinese character" can also easily be translated from Ruby to Swift (now updated for Swift 3):

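import Foundation // needed for range(of:options:)

extension String {
    var containsChineseCharacters: Bool {
        // A non-nil range means at least one Han character was found.
        return self.range(of: "\\p{Han}", options: .regularExpression) != nil
    }
}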

if myString.containsChineseCharacters {
    print("Contains Chinese")
}

In a regular expression, "\p{Han}" matches all characters with the "Han" Unicode script property, which, as I understand it, are the ideographs shared by the CJK languages. It therefore also matches Japanese kanji and Korean hanja, not only Chinese text.
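To illustrate what the Han property does and does not cover, here is a small sketch. The helper name containsHan is made up for this example and is not part of the answer above.

```swift
import Foundation

// Hypothetical helper for illustration only.
func containsHan(_ text: String) -> Bool {
    return text.range(of: "\\p{Han}", options: .regularExpression) != nil
}

print(containsHan("大家好"))    // true: Chinese ideographs
print(containsHan("日本語"))    // true: Japanese kanji are Han characters too
print(containsHan("ひらがな"))  // false: hiragana is a different script
print(containsHan("한국어"))    // false: Hangul is a different script
print(containsHan("Hello"))    // false: Latin only
```

So the check answers "does the string contain Han ideographs", which is usually, but not always, the same as "does it contain Chinese".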

Airspeed Velocity

Looking at questions on how to do this in other languages (such as this accepted answer for Ruby), it looks like the common technique is to determine whether each character in the string falls in a CJK range. The Ruby answer could be adapted to Swift strings as an extension with the following code:

extension String {
    var containsChineseCharacters: Bool {
        return self.unicodeScalars.contains { scalar in
            let cjkRanges: [ClosedInterval<UInt32>] = [
                0x4E00...0x9FFF,   // main block
                0x3400...0x4DBF,   // extended block A
                0x20000...0x2A6DF, // extended block B
                0x2A700...0x2B73F, // extended block C
            ]
            return cjkRanges.contains { $0.contains(scalar.value) }
        }
    }
}

// true:
"Hi! 大家好!It's contains Chinese!".containsChineseCharacters
// false:
"Hello, world!".containsChineseCharacters

These ranges may already be defined somewhere in Foundation, which would avoid hardcoding them manually.

The above is for Swift 2.0. For earlier versions, you will have to use the free contains function rather than the protocol extension method (in both places):

extension String {
    var containsChineseCharacters: Bool {
        return contains(self.unicodeScalars) {
          // older version of compiler seems to need extra help with type inference 
          (scalar: UnicodeScalar)->Bool in
            let cjkRanges: [ClosedInterval<UInt32>] = [
                0x4E00...0x9FFF,   // main block
                0x3400...0x4DBF,   // extended block A
                0x20000...0x2A6DF, // extended block B
                0x2A700...0x2B73F, // extended block C
            ]
            return contains(cjkRanges) { $0.contains(scalar.value) }
        }
    }
}
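In later Swift versions ClosedInterval was replaced by ClosedRange, so neither listing above compiles as written. A minimal sketch of the same range check, assuming Swift 4 or later, might look like this. The property name containsCJKIdeographs is chosen here only to avoid clashing with the one above.

```swift
extension String {
    /// Hypothetical name; same technique as above, updated for ClosedRange.
    var containsCJKIdeographs: Bool {
        // Same four blocks as in the listings above.
        let cjkRanges: [ClosedRange<UInt32>] = [
            0x4E00...0x9FFF,   // main block
            0x3400...0x4DBF,   // extended block A
            0x20000...0x2A6DF, // extended block B
            0x2A700...0x2B73F, // extended block C
        ]
        // True as soon as any scalar falls inside any of the ranges.
        return unicodeScalars.contains { scalar in
            cjkRanges.contains { $0.contains(scalar.value) }
        }
    }
}

print("Hi! 大家好!".containsCJKIdeographs)    // true
print("Hello, world!".containsCJKIdeographs) // false
```

The range list is built once per call rather than once per scalar, and no Foundation import is needed.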

Try this in Swift 2:

var myString = "Hi! 大家好!It's contains Chinese!"

var a = false

for c in myString.characters {
    let cs = String(c)
    a = a || (cs != cs.stringByApplyingTransform(NSStringTransformMandarinToLatin, reverse: false))
}
print("\(myString) contains Chinese characters = \(a)")
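The snippet above uses Swift 2 names. As a rough sketch of the same idea in later Swift (assuming Swift 4 or later with Foundation on an Apple platform), the transform is exposed as applyingTransform(_:reverse:) with StringTransform.mandarinToLatin. The function name containsMandarin is made up for this example.

```swift
import Foundation

// Hypothetical helper: a character counts as Chinese if the
// Mandarin-to-Latin transform changes it.
func containsMandarin(_ text: String) -> Bool {
    for character in text {
        let original = String(character)
        if let latin = original.applyingTransform(.mandarinToLatin, reverse: false),
           latin != original {
            return true // stop at the first character that was transliterated
        }
    }
    return false
}

let sample = "Hi! 大家好!It's contains Chinese!"
print("\(sample) contains Chinese characters = \(containsMandarin(sample))")
```

Unlike the loop above, this returns as soon as one match is found instead of scanning the whole string.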

The accepted answer only checks whether a string contains any Chinese characters. I wrote a variant that suits my own case, distinguishing between no Chinese, some Chinese, and all Chinese:

import Foundation // needed for range(of:options:)

enum ChineseRange {
    case notFound, contain, all
}

extension String {
    var findChineseCharacters: ChineseRange {
        // Find the first run of consecutive Han characters, if any.
        guard let match = self.range(of: "\\p{Han}+", options: .regularExpression) else {
            return .notFound
        }
        // The string is all Chinese only if that run spans the whole string.
        return match == self.startIndex..<self.endIndex ? .all : .contain
    }
}

if "你好".findChineseCharacters == .all {
    print("All Chinese")
}

if "Chinese".findChineseCharacters == .notFound {
    print("Not found Chinese")
}

if "Chinese你好".findChineseCharacters == .contain {
    print("Contains Chinese")
}

Gist here: https://gist.github.com/williamhqs/6899691b5a26272550578601bee17f1a

I have created a Swift 3 String extension for checking how many Chinese characters a String contains. It is similar to the code by Airspeed Velocity but more comprehensive: it checks various Unicode ranges to see whether a character is Chinese. See the Chinese character ranges listed in the tables under section 18.1 of the Unicode Standard: http://www.unicode.org/versions/Unicode9.0.0/ch18.pdf

The String extension can be found on GitHub: https://github.com/niklasberglund/String-chinese.swift

Usage example:

let myString = "Hi! 大家好!It contains Chinese!"
let chinesePercentage = myString.chinesePercentage()
let chineseCharacterCount = myString.chineseCharactersCount()
print("String contains \(chinesePercentage) percent Chinese. That's \(chineseCharacterCount) characters.")
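
The linked extension's source is not reproduced here. As a rough illustration of the idea only, a count and a percentage could be computed as sketched below, assuming Swift 4 or later. This sketch uses the "\p{Han}" regular expression instead of the Unicode range tables, and the names hanCharacterCount and hanPercentage are made up for this example; they are not the library's API.

```swift
import Foundation

extension String {
    /// Hypothetical helper: number of characters that contain a Han scalar.
    var hanCharacterCount: Int {
        return self.filter { character in
            String(character).range(of: "\\p{Han}", options: .regularExpression) != nil
        }.count
    }

    /// Hypothetical helper: share of Han characters, from 0 to 100.
    var hanPercentage: Double {
        guard !isEmpty else { return 0 } // avoid dividing by zero
        return Double(hanCharacterCount) / Double(count) * 100
    }
}

print("你好ab".hanCharacterCount) // 2
print("你好ab".hanPercentage)     // 50.0
```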