I have a string composed of words, some of which contain punctuation, which I would like to remove, but I have been unable to figure out how to do this.
For example if I
An alternative way to filter out a set of characters and obtain an array of words is to use the array's filter and reduce methods. It's not as compact as the other answers, but it shows how the same result can be obtained in a different way.
First define the set of characters to remove:
let charactersToRemove = Set(Array(".:?,"))
Next, convert the input string into an array of characters:
let arrayOfChars = Array(words)
Now we can use reduce to build a string by appending the elements of arrayOfChars, skipping those contained in charactersToRemove:
let filteredString = arrayOfChars.reduce("") {
    let str = String($1)
    return $0 + (charactersToRemove.contains($1) ? "" : str)
}
This produces a string without the punctuation characters (as defined in charactersToRemove).
The last two steps:
Split the string into an array of words, using the space character as the separator:
let arrayOfWords = filteredString.componentsSeparatedByString(" ")
Finally, remove all empty elements:
let finalArrayOfWords = arrayOfWords.filter { $0.isEmpty == false }
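The steps above use older API names such as componentsSeparatedByString. As a rough sketch of the same filter-then-split idea in current Swift (assuming Swift 5; the sample sentence is only illustrative, and `words` is the name the answer assumes for the input string), something like this should work:

```swift
// Sample input; `words` is the name the answer assumes for the source string.
let words = "Hello, this : is .. a string?"

// The characters to drop, as in the answer above.
let charactersToRemove: Set<Character> = [".", ":", "?", ","]

// Build a new string, skipping every character found in the set.
let filteredString = words.reduce(into: "") { result, character in
    if !charactersToRemove.contains(character) {
        result.append(character)
    }
}

// Split on spaces; split(separator:) omits empty pieces by default,
// so no separate "remove empty elements" step is needed.
let finalArrayOfWords = filteredString.split(separator: " ").map(String.init)

print(finalArrayOfWords) // ["Hello", "this", "is", "a", "string"]
```

Using reduce(into:) appends to a single string in place instead of creating a new string on every step.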
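// Split on punctuation, then join the pieces back together.
// (Inverting the set here would keep only the punctuation instead of removing it.)
let charactersToRemove = NSCharacterSet.punctuationCharacterSet()
let aWord = "".join(words.componentsSeparatedByCharactersInSet(charactersToRemove))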
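For readers on current Swift, where the NS-prefixed names and "".join are no longer available, a minimal sketch of the split-and-join approach for stripping punctuation might look like this (assuming Swift 5 with Foundation; the sample sentence and the name `withoutPunctuation` are illustrative):

```swift
import Foundation

let words = "Hello, this : is .. a string?"

// Split wherever a punctuation character occurs, then glue the pieces back together.
let withoutPunctuation = words
    .components(separatedBy: .punctuationCharacters)
    .joined()

print(withoutPunctuation) // Hello this  is  a string
```

Note that the spaces around the removed punctuation remain, so the result still has to be split on whitespace to get individual words.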
String has an enumerateSubstringsInRange() method. With the .ByWords option, it detects word boundaries and punctuation automatically:
Swift 3/4:
let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstrings(in: string.startIndex..<string.endIndex,
options: .byWords) {
(substring, _, _, _) -> () in
words.append(substring!)
}
print(words) // ["Hello", "this", "is", "a", "string"]
Swift 2:
let string = "Hello, this : is .. a \"string\"!"
var words : [String] = []
string.enumerateSubstringsInRange(string.characters.indices,
options: .ByWords) {
(substring, _, _, _) -> () in
words.append(substring!)
}
print(words) // ["Hello", "this", "is", "a", "string"]
This works with Xcode 8.1, Swift 3:
First define a general-purpose extension for filtering by CharacterSet:
extension String {
    func removingCharacters(inCharacterSet forbiddenCharacters: CharacterSet) -> String {
        var filteredString = self
        // Keep removing the first forbidden character until none are left.
        while let forbiddenCharRange = filteredString.rangeOfCharacter(from: forbiddenCharacters) {
            filteredString.removeSubrange(forbiddenCharRange)
        }
        return filteredString
    }
}
Then filter using punctuation:
let s:String = "Hello, world!"
s.removingCharacters(inCharacterSet: CharacterSet.punctuationCharacters) // => "Hello world"
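To get from the filtered string to the array of words the question asks for, the extension can be combined with a split on spaces. A small sketch, assuming Swift 5 (the extension is repeated so the snippet is self-contained; `sentence` and `wordList` are illustrative names):

```swift
import Foundation

extension String {
    // Same idea as the extension above: strip every character in the given set.
    func removingCharacters(inCharacterSet forbiddenCharacters: CharacterSet) -> String {
        var filteredString = self
        while let range = filteredString.rangeOfCharacter(from: forbiddenCharacters) {
            filteredString.removeSubrange(range)
        }
        return filteredString
    }
}

let sentence = "Hello, this : is .. a string?"

// Remove punctuation first, then split on spaces (empty pieces are dropped by default).
let wordList = sentence
    .removingCharacters(inCharacterSet: .punctuationCharacters)
    .split(separator: " ")
    .map(String.init)

print(wordList) // ["Hello", "this", "is", "a", "string"]
```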
NSScanner way:
let words = "Hello, this : is .. a string?"
let scanner = NSScanner(string: words)
var wordArray:[String] = []
var word:NSString? = ""
// Only these characters count as part of a word.
let letters = NSCharacterSet(charactersInString: "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
while(!scanner.atEnd) {
    let sr = scanner.scanCharactersFromSet(letters, intoString: &word)
    if !sr {
        // Not at a letter: step over one character and try again.
        scanner.scanLocation++
        continue
    }
    wordArray.append(String(word!))
}
println(wordArray)
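The snippet above is written against the early Swift API (NSScanner, println, scanLocation++). A hedged sketch of the same scanning loop with the newer Scanner methods follows; it assumes Swift 5 and a Foundation version that provides scanCharacters(from:) and scanUpToCharacters(from:) (macOS 10.15 / iOS 13 or later). It also swaps the hand-written alphabet for CharacterSet.letters, which matches non-ASCII letters as well, so it is not an exact equivalent.

```swift
import Foundation

let text = "Hello, this : is .. a string?"
let scanner = Scanner(string: text)
// Do not skip whitespace implicitly; the loop below handles all non-letters itself.
scanner.charactersToBeSkipped = nil

var wordArray: [String] = []
while !scanner.isAtEnd {
    if let word = scanner.scanCharacters(from: .letters) {
        // A run of letters is one word.
        wordArray.append(word)
    } else {
        // Not at a letter: skip ahead to the next letter (or to the end).
        _ = scanner.scanUpToCharacters(from: .letters)
    }
}

print(wordArray) // ["Hello", "this", "is", "a", "string"]
```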
Xcode 11.4 • Swift 5.2 or later
extension StringProtocol {
var words: [SubSequence] {
split(whereSeparator: \.isLetter.negation)
}
}
extension Bool {
var negation: Bool { !self }
}
let sentence = "Hello, this : is .. a string?"
let words = sentence.words // ["Hello", "this", "is", "a", "string"]