I have a large block of text. For example:
I want to split a paragraph into sentences. But, there is a problem. My paragraph includes dates like Jan.13, 20
NSLinguisticTagger is deprecated. Use NLTagger instead (iOS 12.0+, macOS 10.14+).
import NaturalLanguage

let str = "I want to split a paragraph into sentences. But, there is a problem. My paragraph includes dates like Jan.13, 2014 , words like U.A.E and numbers like 2.2. How do i split this."

// Splits text into sentences using NLTagger's sentence unit.
func splitSentenceFrom(text: String) -> [String] {
    var result: [String] = []
    let tagger = NLTagger(tagSchemes: [.lexicalClass])
    tagger.string = text
    // Visit each sentence-level token; returning true continues the enumeration.
    tagger.enumerateTags(in: text.startIndex..<text.endIndex, unit: .sentence, scheme: .lexicalClass) { (_, tokenRange) -> Bool in
        result.append(String(text[tokenRange]))
        return true
    }
    return result
}

let sentences = splitSentenceFrom(text: str)
sentences.forEach {
    print($0)
}
Output:
I want to split a paragraph into sentences.
But, there is a problem.
My paragraph includes dates like Jan.13, 2014 , words like U.A.E and numbers like 2.2.
How do i split this.
Want to exclude empty sentences and trim whitespace? Replace the result.append line inside the closure with this:
let sentence = String(text[tokenRange]).trimmingCharacters(in: .whitespacesAndNewlines)
if !sentence.isEmpty {
    result.append(sentence)
}
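If the tags themselves are not needed, the same framework's NLTokenizer can do the splitting without a tag scheme. The following is a minimal sketch rather than part of the original answer: the function name splitSentences(in:) and the sample string are made up for illustration, and it folds the trimming and empty-sentence filtering into one pass.

```swift
import Foundation
import NaturalLanguage

// Hypothetical helper (not from the original answer): splits text into
// trimmed, non-empty sentences using NLTokenizer instead of NLTagger.
func splitSentences(in text: String) -> [String] {
    let tokenizer = NLTokenizer(unit: .sentence)
    tokenizer.string = text

    // tokens(for:) returns the range of each sentence found in the given range.
    return tokenizer.tokens(for: text.startIndex..<text.endIndex)
        .map { text[$0].trimmingCharacters(in: .whitespacesAndNewlines) }
        .filter { !$0.isEmpty }
}

// Illustrative input only.
let sample = "Hello world. How are you?"
splitSentences(in: sample).forEach { print($0) }
```

Like NLTagger, this requires iOS 12.0+ or macOS 10.14+. It should find the same sentence boundaries as the tagger-based version, though that is worth checking against your own text.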