Count the number of lines in a Swift String

Posted by 喜夏-厌秋 on 2019-12-19 10:32:08

Question


After reading a medium-sized file (about 500 kB) from a web service, I have a regular Swift String (lines) that was originally encoded as .isoLatin1. Before actually splitting it, I would like to count the number of lines quickly so that I can initialise a progress bar.

What is the best Swift idiom to achieve this?

I came up with the following:

let linesCount = lines.reduce(into: 0) { (count, letter) in
   if letter == "\r\n" {
      count += 1
   }
}

This does not look too bad, but I wonder whether there is a shorter or faster way to do it. The characters property (and, since Swift 4, the String itself) provides access to a sequence of Unicode graphemes, which treat \r\n as a single entity. Checking each of them against CharacterSet.newlines does not work, because CharacterSet is not a set of Character but a set of Unicode.Scalar (a little counter-intuitive in my book), that is, a set of code points (where \r\n counts as two code points), not graphemes. Trying

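import Foundation

let lines = "Hello, playground\r\nhere too\r\nGalahad\r\n"
// Counts every Unicode scalar that is a member of CharacterSet.newlines.
let scalarCount = lines.unicodeScalars.reduce(into: 0) { (cnt, letter) in
    if CharacterSet.newlines.contains(letter) {
        cnt += 1
    }
}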

will count to 6 instead of 3. So this is more general than the above method, but it will not work correctly for CRLF line endings.

Is there a way to allow for more line ending conventions (as in CharacterSet.newlines) that still achieves the correct result for CRLF? Can the number of lines be computed with less code (while still remaining readable)?
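
One possible approach, offered as a sketch rather than a definitive solution: assuming Swift 5 or later, Character has an isNewline property that is true for "\n", "\r", "\r\n" and the Unicode line and paragraph separators. Because a String iterates by Character, a CRLF pair arrives as one element. The helper name countNewlines is hypothetical.

```swift
// Sketch: count line terminators via Character.isNewline (assumes Swift 5 or later).
// `countNewlines` is a hypothetical helper name, not part of the standard library.
func countNewlines(in text: String) -> Int {
    // Iterating a String yields Characters (grapheme clusters),
    // so "\r\n" is seen as a single element.
    return text.reduce(into: 0) { count, character in
        if character.isNewline {
            count += 1
        }
    }
}

let sample = "Hello, playground\r\nhere too\r\nGalahad\r\n"
print(countNewlines(in: sample))            // 3
print(countNewlines(in: "one\ntwo\rthree")) // 2
```

Note that this counts line terminators, so a final line without a trailing newline is not included in the result.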


Answer 1:


If you are fine with using a Foundation method on an NSString, I suggest using

enumerateLines(_ block: @escaping (String, UnsafeMutablePointer<ObjCBool>) -> Void)

Here's an example:

import Foundation

let base = "Hello, playground\r\nhere too\r\nGalahad\r\n"
let ns = base as NSString

ns.enumerateLines { (str, _) in
    print(str)
}

It separates the lines properly, taking into account all line-ending types, such as "\r\n" and "\n":

Hello, playground
here too
Galahad

In my example I print the lines, but it is trivial to count them instead, as you need to. Printing is just for demonstration.
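
The counting variant could look roughly like this. It is a minimal sketch: the helper name countLines is hypothetical, and it calls enumerateLines directly on the String (Foundation provides that as well) instead of bridging to NSString first.

```swift
import Foundation

// Hypothetical helper: counts lines by letting Foundation find the line breaks.
func countLines(in text: String) -> Int {
    var count = 0
    // The closure runs once per line; the line content and the stop flag are ignored.
    text.enumerateLines { _, _ in
        count += 1
    }
    return count
}

let base = "Hello, playground\r\nhere too\r\nGalahad\r\n"
print(countLines(in: base)) // 3
```

Unlike counting newline characters, this also counts a last line that has no trailing line ending.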




Answer 2:


As I did not find a generic way to count newlines, I ended up solving my problem by iterating through all the characters using

let linesCount = text.reduce(into: 0) { (count, letter) in
     if letter == "\r\n" {      // This treats CRLF as one "letter", unlike unicodeScalars
        count += 1
     }
}

I was sure this would be a lot faster than enumerating lines for just counting, but I resolved to eventually do the measurement. Today I finally got to it and found ... that I could not have been more wrong.

Counting the lines of a 10,000-line string as above took about 1.0 seconds, but counting through enumeration using

var enumCount = 0
text.enumerateLines { (str, _) in
    enumCount += 1
}

only took around 0.8 seconds and was consistently faster by a little more than 20%. I do not know what tricks the Swift engineers have up their sleeves, but they sure manage to make enumerateLines very quick. This is just for the record.
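
A comparison like this could be reproduced with a small harness along the following lines. This is only a sketch under stated assumptions: the timed helper and the synthetic 10,000-line input are made up for illustration, and it is not the setup used for the numbers above. Timings depend heavily on the machine and on debug versus optimised builds, so no figures are claimed here.

```swift
import Foundation

// Hypothetical helper: runs `body` once and returns its result
// together with the elapsed wall-clock time in seconds.
func timed<T>(_ body: () -> T) -> (result: T, seconds: TimeInterval) {
    let start = Date()
    let result = body()
    return (result, Date().timeIntervalSince(start))
}

// Synthetic input: 10,000 CRLF-terminated lines (content invented for this sketch).
let benchmarkText = String(repeating: "some line of text\r\n", count: 10_000)

// Variant 1: compare every Character with the CRLF grapheme.
let byReduce = timed { () -> Int in
    benchmarkText.reduce(into: 0) { count, character in
        if character == "\r\n" { count += 1 }
    }
}

// Variant 2: let Foundation enumerate the lines.
let byEnumeration = timed { () -> Int in
    var count = 0
    benchmarkText.enumerateLines { _, _ in count += 1 }
    return count
}

print("reduce: \(byReduce.result) lines in \(byReduce.seconds) s")
print("enumerateLines: \(byEnumeration.result) lines in \(byEnumeration.seconds) s")
```

A single run is noisy, so repeating each measurement several times and comparing the averages would give a fairer picture.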



Source: https://stackoverflow.com/questions/46490920/count-the-number-of-lines-in-a-swift-string
