Detect Language of NSString

鱼传尺愫 2020-11-27 11:39

Somebody told me about a class for language recognition in Cocoa. Does anybody know which one it is?

This is not working:

NSSpellCh         


        
6 Answers
  • 2020-11-27 12:11

    Cocoa has an API for checking the language of a string, and it is generally best to use Foundation over Core Foundation whenever possible.

    NSArray *tagschemes = [NSArray arrayWithObjects:NSLinguisticTagSchemeLanguage, nil];
    NSLinguisticTagger *tagger = [[NSLinguisticTagger alloc] initWithTagSchemes:tagschemes options:0];
    [tagger setString:@"Das ist ein bisschen deutscher Text. Bitte löschen Sie diesen nicht."];
    NSString *language = [tagger tagAtIndex:0 scheme:NSLinguisticTagSchemeLanguage tokenRange:NULL sentenceRange:NULL];
    

    Alternatively, if you have mixed-language text, you can use NSString's enumerateLinguisticTagsInRange:scheme:options:orthography:usingBlock: method to get the language of each word in the text.
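
    As a rough illustration of the per-word approach, here is a minimal Swift sketch that uses the tagger's own enumerateTags(in:scheme:options:using:) method. The helper name wordLanguages(in:) and the sample text are made up for this example, and the tag reported for each word depends on the system's language models, so treat the printed values as indicative only.

    ```swift
    import Foundation

    // Hypothetical helper (not part of Foundation): returns each word in the
    // text together with the language tag the tagger reports for it.
    func wordLanguages(in text: String) -> [(word: String, language: String?)] {
        let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
        tagger.string = text

        // NSLinguisticTagger works with UTF-16 offsets, hence utf16.count.
        let fullRange = NSRange(location: 0, length: text.utf16.count)
        var results: [(word: String, language: String?)] = []

        tagger.enumerateTags(in: fullRange, scheme: .language,
                             options: [.omitWhitespace, .omitPunctuation]) { tag, tokenRange, _, _ in
            let word = (text as NSString).substring(with: tokenRange)
            results.append((word: word, language: tag?.rawValue))
        }
        return results
    }

    for entry in wordLanguages(in: "Good morning. Guten Morgen.") {
        print(entry.word, entry.language ?? "unknown")
    }
    ```

    Whitespace and punctuation are skipped so that only words are reported.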

  • 2020-11-27 12:12

    With Swift 5, you can choose one of the following approaches in order to detect the language of a given string.


    #1. Using NSLinguisticTagger's dominantLanguage property

    Since iOS 11, NSLinguisticTagger has a property called dominantLanguage. dominantLanguage has the following declaration:

    var dominantLanguage: String? { get }
    

    Returns the dominant language of the string set for the linguistic tagger.

    The Playground sample code below shows how to use dominantLanguage to get the dominant language of a string:

    import Foundation
    
    let text = "あなたはそれを行うべきではありません。"
    let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
    tagger.string = text
    let language = tagger.dominantLanguage
    print(language) // Optional("ja")
    

    #2. Using NSLinguisticTagger's dominantLanguage(for:) method

    As an alternative, NSLinguisticTagger has a convenience method called dominantLanguage(for:) for creating a new linguistic tagger, setting its string property and getting the dominantLanguage property. dominantLanguage(for:) has the following declaration:

    class func dominantLanguage(for string: String) -> String?
    

    Returns the dominant language for the specified string.

    Usage:

    import Foundation
    
    let text = "Die Kleinen haben friedlich zusammen gespielt."
    let language = NSLinguisticTagger.dominantLanguage(for: text)
    print(language) // Optional("de")
    

    #3. Using NLLanguageRecognizer's dominantLanguage property

    Since iOS 12, NLLanguageRecognizer has a property called dominantLanguage. dominantLanguage has the following declaration:

    var dominantLanguage: NLLanguage? { get }
    

    The most likely language for the processed text.

    Here’s how to use dominantLanguage to guess the dominant language of natural language text:

    import NaturalLanguage
    
    let string = "J'ai deux amours. Mon pays et Paris."
    let recognizer = NLLanguageRecognizer()
    recognizer.processString(string)
    let language = recognizer.dominantLanguage
    print(language?.rawValue) // Optional("fr")
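
    If a single best guess is not enough, NLLanguageRecognizer can also report several candidate languages with estimated probabilities through languageHypotheses(withMaximum:). The sketch below reuses the sample string from above; the limit of three candidates is an arbitrary choice for illustration, and the probabilities will vary with the OS version.

    ```swift
    import NaturalLanguage

    let sample = "J'ai deux amours. Mon pays et Paris."
    let hypothesisRecognizer = NLLanguageRecognizer()
    hypothesisRecognizer.processString(sample)

    // Ask for at most three candidate languages with their estimated probabilities.
    let hypotheses = hypothesisRecognizer.languageHypotheses(withMaximum: 3)

    // Print the candidates from most to least likely.
    for (language, probability) in hypotheses.sorted(by: { $0.value > $1.value }) {
        print(language.rawValue, probability)
    }
    ```

    This can be useful for short or ambiguous strings, where the top candidate may win by only a small margin.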
    
  • 2020-11-27 12:31

    You can use -requestCheckingOfString:… instead. NSTextCheckingTypeOrthography attempts to identify the language used in the string, and the completion handler receives an NSOrthography parameter that can be used to get information about the orthography in the string, including its dominant language.

    The following example outputs dominant language = de:

    NSSpellChecker *spellChecker = [NSSpellChecker sharedSpellChecker];
    [spellChecker setAutomaticallyIdentifiesLanguages:YES];
    NSString *spellCheckText = @"Guten Herr Mustermann. Dies ist ein deutscher Text. Bitte löschen Sie diesen nicht.";
    
    [spellChecker requestCheckingOfString:spellCheckText
        range:(NSRange){0, [spellCheckText length]}
        types:NSTextCheckingTypeOrthography
        options:nil
        inSpellDocumentWithTag:0
        completionHandler:^(NSInteger sequenceNumber, NSArray *results, NSOrthography *orthography, NSInteger wordCount) {
            NSLog(@"dominant language = %@", orthography.dominantLanguage);
    }];
    
  • 2020-11-27 12:32

    A Swift String extension for Jennifer's answer:

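    import Foundation

    extension String {
        func language() -> String? {
            // Asking for a tag at index 0 of an empty string is out of range.
            guard !isEmpty else { return nil }
            let tagger = NSLinguisticTagger(tagSchemes: [.language], options: 0)
            tagger.string = self
            // Swift 4 and later spelling of tagAtIndex:scheme:tokenRange:sentenceRange:
            return tagger.tag(at: 0, scheme: .language, tokenRange: nil, sentenceRange: nil)?.rawValue
        }
    }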
    

    Usage:

    let language = "What language is this?".language()
    
  • 2020-11-27 12:34

    As of iOS 11 you can use the dominantLanguage(for:)/dominantLanguageForString: class method of NSLinguisticTagger.

    Swift:

    extension String {
        var language: String? {
            return NSLinguisticTagger.dominantLanguage(for: self)
        }
    }
    
    // language is optional, so unwrap it to print the bare code
    print("Good morning".language ?? "unknown")
    print("Buenos días".language ?? "unknown")
    

    Objective-C:

    @interface NSString (Tagger)
    
    @property (nonatomic, readonly, nullable) NSString *language;
    @end
    
    @implementation NSString (Tagger)
    
    - (NSString *)language {
        return [NSLinguisticTagger dominantLanguageForString:self];
    }
    
    @end
    
    NSLog(@"%@", @"Good morning".language);
    NSLog(@"%@", @"Buenos días".language);
    

    Output (for both):

    en
    es

  • 2020-11-27 12:36

    That's the result:

    - (NSString *)languageForString:(NSString *)text {
        // Only the first 100 characters (or fewer, for short strings) are examined.
        CFIndex length = (CFIndex)MIN(text.length, (NSUInteger)100);
        // The "Copy" function returns an owned reference; CFBridgingRelease transfers it so it is not leaked.
        return CFBridgingRelease(CFStringTokenizerCopyBestStringLanguage((__bridge CFStringRef)text, CFRangeMake(0, length)));
    }
    