How to efficiently write large files to disk on background thread (Swift)

后端未结

关注

 3  1442

Update

I have resolved and removed the distracting error. Please read the entire post and feel free to leave comments if any questions remain.

Latest Solution (2018)

Another useful possibility might include the use of a closure whenever the buffer is filled (or if you've used a timed length of recording) to append the data and also to announce the end of the stream of data. In combination with some of the Photo APIs this could lead to good outcomes. So some declarative code like below could be fired during processing:

var dataSpoolingFinished: ((URL?, Error?) -> Void)?
var dataSpooling: ((Data?, Error?) -> Void)?

Handling these closures in your management object may allow you to succinctly handle data of any size while keeping the memory under control.

Couple that idea with the use of a recursive method that aggregates pieces of work into a single dispatch_group and there could be some exciting possibilities.

Apple docs state:

DispatchGroup allows for aggregate synchronization of work. You can use them to submit multiple different work items and track when they all complete, even though they might run on different queues. This behavior can be helpful when progress can’t be made until all of the specified tasks are complete.

File System Programming Guide

Apple's Processing an Entire File Linearly Using Streams article in the FSPG also provided the notion that NSInputStream and NSOutputStream should be inherently thread safe.

Further Refinements

This object doesn't use stream delegation methods. Plenty of room for other refinements as well but this is the basic approach I will take. The main focus on the iPhone is enabling the large file management while constraining the memory via a buffer (TBD - Leverage the outputStream in-memory buffer). To be clear, Apple does mention that their convenience functions that writeToURL are only for smaller file sizes (but makes me wonder why they don't take care of the larger files - These are not edge cases, note - will file question as a bug).

Conclusion

I will have to test further for integrating on a background thread as I don't want to interfere with any NSStream internal queuing. I have some other objects that use similar ideas to manage extremely large data files over the wire. The best method is to keep file sizes as small as possible in iOS to conserve memory and prevent app crashes. The APIs are built with these constraints in mind (which is why attempting unlimited video is not a good idea), so I will have to adapt expectations overall.

(Gist Source, Check gist for latest changes)

import Foundation
import Darwin.Mach.mach_time

class MNGStreamReaderWriter:NSObject {

    var copyOutput:NSOutputStream?
    var fileInput:NSInputStream?
    var outputStream:NSOutputStream? = NSOutputStream(toMemory: ())
    var urlInput:NSURL?

    convenience init(srcURL:NSURL, targetURL:NSURL) {
        self.init()
        self.fileInput  = NSInputStream(URL: srcURL)
        self.copyOutput = NSOutputStream(URL: targetURL, append: false)
        self.urlInput   = srcURL

    }

    func copyFileURLToURL(destURL:NSURL, withProgressBlock block: (fileSize:Double,percent:Double,estimatedTimeRemaining:Double) -> ()){

        guard let copyOutput = self.copyOutput, let fileInput = self.fileInput, let urlInput = self.urlInput else { return }

        let fileSize            = sizeOfInputFile(urlInput)
        let bufferSize          = 4096
        let buffer              = UnsafeMutablePointer<UInt8>.alloc(bufferSize)
        var bytesToWrite        = 0
        var bytesWritten        = 0
        var counter             = 0
        var copySize            = 0

        fileInput.open()
        copyOutput.open()

        //start time
        let time0 = mach_absolute_time()

        while fileInput.hasBytesAvailable {

            repeat {

                bytesToWrite    = fileInput.read(buffer, maxLength: bufferSize)
                bytesWritten    = copyOutput.write(buffer, maxLength: bufferSize)

                //check for errors
                if bytesToWrite < 0 {
                    print(fileInput.streamStatus.rawValue)
                }
                if bytesWritten == -1 {
                    print(copyOutput.streamStatus.rawValue)
                }
                //move read pointer to next section
                bytesToWrite -= bytesWritten
                copySize += bytesWritten

            if bytesToWrite > 0 {
                //move block of memory
                memmove(buffer, buffer + bytesWritten, bytesToWrite)
                }

            } while bytesToWrite > 0

            if fileSize != nil && (++counter % 10 == 0) {
                //passback a progress tuple
                let percent     = Double(copySize/fileSize!)
                let time1       = mach_absolute_time()
                let elapsed     = Double (time1 - time0)/Double(NSEC_PER_SEC)
                let estTimeLeft = ((1 - percent) / percent) * elapsed

                block(fileSize: Double(copySize), percent: percent, estimatedTimeRemaining: estTimeLeft)
            }
        }

        //send final progress tuple
        block(fileSize: Double(copySize), percent: 1, estimatedTimeRemaining: 0)


        //close streams
        if fileInput.streamStatus == .AtEnd {
            fileInput.close()

        }
        if copyOutput.streamStatus != .Writing && copyOutput.streamStatus != .Error {
            copyOutput.close()
        }



    }

    func sizeOfInputFile(src:NSURL) -> Int? {

        do {
            let fileSize = try NSFileManager.defaultManager().attributesOfItemAtPath(src.path!)
            return fileSize["fileSize"]  as? Int

        } catch let inputFileError as NSError {
            print(inputFileError.localizedDescription,inputFileError.localizedRecoverySuggestion)
        }

        return nil
    }


}

Delegation

Here's a similar object that I rewrote from an article on Advanced File I/O in the background, Eidhof,C., ObjC.io). With just a few tweaks this could be made to emulate the behavior above. Simply redirect the data to an NSOutputStream in the processDataChunk method.

(Gist Source - Check gist for latest changes)

import Foundation

class MNGStreamReader: NSObject, NSStreamDelegate {

    var callback: ((lineNumber: UInt , stringValue: String) -> ())?
    var completion: ((Int) -> Void)?
    var fileURL:NSURL?
    var inputData:NSData?
    var inputStream: NSInputStream?
    var lineNumber:UInt = 0
    var queue:NSOperationQueue?
    var remainder:NSMutableData?
    var delimiter:NSData?
    //var reader:NSInputStreamReader?

    func enumerateLinesWithBlock(block: (UInt, String)->() , completionHandler completion:(numberOfLines:Int) -> Void ) {

        if self.queue == nil {
            self.queue = NSOperationQueue()
            self.queue!.maxConcurrentOperationCount = 1
        }

        assert(self.queue!.maxConcurrentOperationCount == 1, "Queue can't be concurrent.")
        assert(self.inputStream == nil, "Cannot process multiple input streams in parallel")

        self.callback = block
        self.completion = completion

        if self.fileURL != nil {
            self.inputStream = NSInputStream(URL: self.fileURL!)
        } else if self.inputData != nil {
            self.inputStream = NSInputStream(data: self.inputData!)
        }

        self.inputStream!.delegate = self
        self.inputStream!.scheduleInRunLoop(NSRunLoop.currentRunLoop(), forMode: NSDefaultRunLoopMode)
        self.inputStream!.open()
    }

    convenience init? (withData inbound:NSData) {
        self.init()
        self.inputData = inbound
        self.delimiter = "\n".dataUsingEncoding(NSUTF8StringEncoding)

    }

    convenience init? (withFileAtURL fileURL: NSURL) {
        guard !fileURL.fileURL else { return nil }

        self.init()
        self.fileURL = fileURL
        self.delimiter = "\n".dataUsingEncoding(NSUTF8StringEncoding)
    }

    @objc func stream(aStream: NSStream, handleEvent eventCode: NSStreamEvent){

        switch eventCode {
        case NSStreamEvent.OpenCompleted:
            fallthrough
        case NSStreamEvent.EndEncountered:
            self.emitLineWithData(self.remainder!)
            self.remainder = nil
            self.inputStream!.close()
            self.inputStream = nil

            self.queue!.addOperationWithBlock({ () -> Void in
                self.completion!(Int(self.lineNumber) + 1)
            })

            break
        case NSStreamEvent.ErrorOccurred:
            NSLog("error")
            break
        case NSStreamEvent.HasSpaceAvailable:
            NSLog("HasSpaceAvailable")
            break
        case NSStreamEvent.HasBytesAvailable:
            NSLog("HasBytesAvaible")

            if let buffer = NSMutableData(capacity: 4096) {
                let length = self.inputStream!.read(UnsafeMutablePointer<UInt8>(buffer.mutableBytes), maxLength: buffer.length)
                if 0 < length {
                    buffer.length = length
                    self.queue!.addOperationWithBlock({ [weak self]  () -> Void in
                        self!.processDataChunk(buffer)
                        })
                }
            }
            break
        default:
            break
        }
    }

    func processDataChunk(buffer: NSMutableData) {
        if self.remainder != nil {

            self.remainder!.appendData(buffer)

        } else {

            self.remainder = buffer
        }

        self.remainder!.mng_enumerateComponentsSeparatedBy(self.delimiter!, block: {( component: NSData, last: Bool) in

            if !last {
                self.emitLineWithData(component)
            }
            else {
                if 0 < component.length {
                    self.remainder = (component.mutableCopy() as! NSMutableData)
                }
                else {
                    self.remainder = nil
                }
            }
        })
    }

    func emitLineWithData(data: NSData) {
        let lineNumber = self.lineNumber
        self.lineNumber = lineNumber + 1
        if 0 < data.length {
            if let line = NSString(data: data, encoding: NSUTF8StringEncoding) {
                callback!(lineNumber: lineNumber, stringValue: line as String)
            }
        }
    }
}

0 讨论(0)