What is a fast way to convert a string of two characters to an array of booleans?

我寻月下人不归 2021-02-04 10:24

I have a long string (sometimes over 1000 characters) that I want to convert to an array of Boolean values. The conversion needs to run many times, very quickly.

let         


        
8 Answers
  • 2021-02-04 10:51

    This is faster:

    // Algorithm 'A'
    let input = "0101010110010101010"
    var output = Array<Bool>(count: input.characters.count, repeatedValue: false)
    for (index, char) in input.characters.enumerate() where char == "1" {
        output[index] = true
    }
    

    Update: with input = "010101011010101001000100000011010101010101010101"

    0.0741 / 0.0087, so this approach is 8.46 times faster than the author's. With larger inputs the advantage grows.

    Using nulTerminatedUTF8 is slightly faster still, though not always faster than algorithm A:

    // Algorithm 'B'
    // Note: nulTerminatedUTF8 includes the trailing NUL byte,
    // so output ends with one extra (false) element.
    let input = "10101010101011111110101000010100101001010101"
    var output = Array<Bool>(count: input.nulTerminatedUTF8.count, repeatedValue: false)
    for (index, code) in input.nulTerminatedUTF8.enumerate() where code == 49 { // 49 is ASCII "1"
        output[index] = true
    }
    

    For an input of length 2196, the timings were A: 0.311 sec, B: 0.304 sec. (In the original result graph, the first and last points span 0..1, A is the second point, and B is the third.)

  • 2021-02-04 10:58

    One more step should speed that up even more. Using reserveCapacity will size the array once before the loop starts instead of resizing it repeatedly as the loop runs.

    var output = [Bool]()
    output.reserveCapacity(input.characters.count)
    for char in input.characters {
        output.append(char == "1")
    }
    
  • 2021-02-04 11:01

    I need to do some testing to be sure, but I think one issue with many of the approaches given, including the original map, is that they iterate over the string once to count the characters and then a second time to actually process them.

    Have you tried:

    let output = [Bool](input.characters.lazy.map { $0 == "1" })
    

    This might only do a single iteration.

    The other thing that could speed things up is avoiding strings altogether and instead using arrays of code units in an appropriate encoding, particularly one with fixed-size units (e.g. UTF-16 or ASCII). Then the length lookup will be O(1) rather than O(n), and the iteration may be faster too.
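
    A minimal sketch of that idea, written in current Swift syntax (the snippets in this thread use Swift 2 API names) and assuming the data can be kept as ASCII bytes in a [UInt8] from the start. The function name boolsFromASCII is hypothetical.

```swift
// Hypothetical helper: assumes the input is already stored as ASCII bytes,
// so no String traversal is needed and `count` is O(1).
func boolsFromASCII(_ bytes: [UInt8]) -> [Bool] {
    let one = UInt8(ascii: "1")   // 49
    // A single pass over the bytes; each element becomes true only for "1".
    return bytes.map { $0 == one }
}

// Usage: convert once at the boundary, then keep working with bytes.
let sampleBytes = Array("0110".utf8)
let sampleFlags = boolsFromASCII(sampleBytes)
print(sampleFlags)   // [false, true, true, false]
```

    Whether this beats the String-based versions depends on how the data arrives; if it starts life as a String, the conversion to bytes is itself a pass over the input.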

    BTW always test performance with the optimiser enabled and never in the Playground because the performance characteristics are completely different, sometimes by a factor of 100.
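
    One way to follow this advice is a small command-line timing harness compiled with optimisation on (for example with swiftc -O). The sketch below uses current Swift syntax; measure, convert and runBenchmark are hypothetical names, and the input size and iteration count are arbitrary.

```swift
import Foundation

// Hypothetical conversion under test; swap in whichever variant is being compared.
func convert(_ input: String) -> [Bool] {
    return input.utf8.map { $0 == UInt8(ascii: "1") }
}

// Runs `body` `iterations` times and returns the elapsed wall-clock seconds.
func measure(iterations: Int, _ body: () -> Void) -> TimeInterval {
    let start = Date()
    for _ in 0..<iterations {
        body()
    }
    return Date().timeIntervalSince(start)
}

func runBenchmark() -> (seconds: TimeInterval, checksum: Int) {
    let input = String(repeating: "01", count: 1_000)   // 2000 characters
    var checksum = 0   // consume each result so the optimiser cannot drop the work
    let seconds = measure(iterations: 1_000) {
        checksum += convert(input).filter { $0 }.count
    }
    return (seconds, checksum)
}

let result = runBenchmark()
print("elapsed: \(result.seconds) s, checksum: \(result.checksum)")
```

    Accumulating a checksum is a design choice to keep the measured work observable; without some use of the result, an optimised build may remove part of the loop.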

  • 2021-02-04 11:08

    Use withCString(_:) to retrieve a raw UnsafePointer<Int8>. Iterate over that and compare each byte to 49 (the ASCII value of "1").
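
    A minimal sketch of this suggestion in current Swift syntax, where the value behind the pointer is read with .pointee. The function name boolsViaCString is hypothetical, and the loop assumes the string contains no embedded NUL characters.

```swift
// Hypothetical helper: walks the NUL-terminated UTF-8 buffer that
// withCString exposes and compares each byte to 49 (ASCII "1").
func boolsViaCString(_ input: String) -> [Bool] {
    var output = [Bool]()
    output.reserveCapacity(input.utf8.count)
    input.withCString { start in
        var cursor = start
        // Stop at the terminating NUL so it is not included in the result.
        while cursor.pointee != 0 {
            output.append(cursor.pointee == 49)
            cursor += 1
        }
    }
    return output
}

print(boolsViaCString("0101"))   // [false, true, false, true]
```

    The pointer is only valid inside the closure, so the result is built there and returned afterwards rather than letting the pointer escape.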

  • 2021-02-04 11:09

    I would guess that this is as fast as possible:

    let targ = Character("1")
    let input: String = "001" // your real string goes here
    let inputchars = Array(input.characters)
    var output:[Bool] = Array.init(count: inputchars.count, repeatedValue: false)
    inputchars.withUnsafeBufferPointer {
        inputbuf in
        output.withUnsafeMutableBufferPointer {
            outputbuf in
            var ptr1 = inputbuf.baseAddress
            var ptr2 = outputbuf.baseAddress
            for _ in 0..<inputbuf.count {
                ptr2.memory = ptr1.memory == targ
                ptr1 = ptr1.successor()
                ptr2 = ptr2.successor()
            }
        }
    }
    // output now contains the result
    

    The reason is that, thanks to the use of buffer pointers, we are simply cycling through contiguous memory, just like the way you cycle through a C array by incrementing its pointer. Thus, once we get past the initial setup, this should be as fast as it would be in C.

    EDIT In an actual test, the time difference between the OP's original method and this one is the difference between

    13.3660290241241
    

    and

    0.219357967376709
    

    which is a pretty dramatic speed-up. I hasten to add, however, that I have excluded the initial set-up from the timing test. This line:

    let inputchars = Array(input.characters)
    

    ...is particularly expensive.
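
    Since the set-up cost comes from building an array of Character values, one variation is to run the same kind of buffer-pointer loop over UTF-8 bytes instead. This is a sketch in current Swift syntax, not the code that was timed above; boolsViaBuffers is a hypothetical name, it assumes the input is plain ASCII "0"/"1" text, and whether the set-up is actually cheaper is worth measuring.

```swift
// Hypothetical variation: same contiguous-memory loop, but over UTF-8 bytes
// rather than an array of Character values.
func boolsViaBuffers(_ input: String) -> [Bool] {
    let bytes = Array(input.utf8)   // contiguous [UInt8] copy of the string
    var output = [Bool](repeating: false, count: bytes.count)
    bytes.withUnsafeBufferPointer { src in
        output.withUnsafeMutableBufferPointer { dst in
            // Index both buffers directly; 49 is the ASCII code for "1".
            for i in 0..<src.count {
                dst[i] = src[i] == 49
            }
        }
    }
    return output
}

print(boolsViaBuffers("001"))   // [false, false, true]
```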

  • 2021-02-04 11:11
    import Foundation
    
    let input:String = "010101011001010101001010101100101010100101010110010101010101011001010101001010101100101010100101010101011001010101001010101100101010100101010"
    var start  = clock()
    var output = Array<Bool>(count: input.nulTerminatedUTF8.count, repeatedValue: false)
    var index = 0
    // 49 is the ASCII code for "1"; mark those positions true.
    for val in input.nulTerminatedUTF8 {
        if val == 49 {
            output[index] = true
        }
        index+=1
    }
    var diff = clock() - start;
    var msec = diff * 1000 / UInt(CLOCKS_PER_SEC);
    print("Time taken \(Double(msec)/1000.0) seconds \(msec%1000) milliseconds");
    

    This should be really fast. Try it out. For 010101011010101001000100000011010101010101010101 it takes 0.039 secs.
