How to efficiently write large files to disk on a background thread (Swift)

情话喂你 2020-12-22 18:56

Update

I have resolved and removed the distracting error. Please read the entire post and feel free to leave comments if any questions remain.

3 Answers
  • 2020-12-22 19:22

    You should consider using NSStream (NSOutputStream/NSInputStream). If you choose this approach, keep in mind that the background thread's run loop will need to be started (run) explicitly.

    NSOutputStream has a method called outputStreamToFileAtPath:append: which is what you might be looking for.
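
    A minimal sketch of that approach in current Swift, where NSOutputStream is spelled OutputStream and outputStreamToFileAtPath:append: becomes OutputStream(toFileAtPath:append:). It polls the stream synchronously, so it does not depend on a run loop. The function name writeInChunks and the 64 KB chunk size are assumptions made for illustration, not part of Foundation.

    ```swift
    import Foundation

    /// Writes `data` to `path` in fixed-size chunks through an OutputStream.
    /// Returns the number of bytes written, or nil if the stream reports an error.
    /// (`writeInChunks` and `chunkSize` are illustrative names, not Foundation API.)
    func writeInChunks(_ data: Data, toPath path: String, chunkSize: Int = 64 * 1024) -> Int? {
        guard let stream = OutputStream(toFileAtPath: path, append: false) else { return nil }
        stream.open()
        defer { stream.close() }

        var offset = data.startIndex
        while offset < data.endIndex {
            let end = min(offset + chunkSize, data.endIndex)
            let chunk = data.subdata(in: offset..<end)
            let written = chunk.withUnsafeBytes { (raw: UnsafeRawBufferPointer) -> Int in
                guard let base = raw.bindMemory(to: UInt8.self).baseAddress else { return 0 }
                return stream.write(base, maxLength: raw.count)
            }
            // write() may accept fewer bytes than offered; a non-positive result means failure.
            guard written > 0 else { return nil }
            offset += written
        }
        return offset - data.startIndex
    }
    ```

    Calling it from DispatchQueue.global().async keeps the write off the main thread.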

    Similar question:

    Writing a String to an NSOutputStream in Swift

  • 2020-12-22 19:39

    Performance depends on whether or not the data fits in RAM. If it does, then you should use NSData writeToURL with the atomically option turned on, which is what you're doing.

    Apple's notes about this being dangerous when "writing to a public directory" are completely irrelevant on iOS because there are no public directories. That section only applies to OS X. And frankly it's not really important there either.

    So, the code you've written is as efficient as possible as long as the video fits in RAM (about 100MB would be a safe limit).
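
    As a rough sketch of the in-memory case in current Swift, where NSData writeToURL with atomically is spelled Data.write(to:options:) with .atomic: the helper name writeInBackground and the choice of a utility-priority global queue are assumptions for illustration.

    ```swift
    import Foundation

    /// Writes `data` to `url` atomically on a background queue and reports the outcome.
    /// (`writeInBackground` is an illustrative helper, not a Foundation API.)
    func writeInBackground(_ data: Data, to url: URL, completion: @escaping (Error?) -> Void) {
        DispatchQueue.global(qos: .utility).async {
            do {
                // .atomic writes to a temporary file first, then renames it into place.
                try data.write(to: url, options: .atomic)
                completion(nil)
            } catch {
                completion(error)
            }
        }
    }
    ```

    The completion closure runs on the background queue, so hop back to the main queue before touching UI.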

    For files that don't fit in RAM, you need to use a stream or your app will crash while holding the video in memory. To download a large video from a server and write it to disk, you should use NSURLSessionDownloadTask.

    In general, streaming (including NSURLSessionDownloadTask) will be orders of magnitude slower than NSData.writeToURL(). So don't use a stream unless you need to. All operations on NSData are extremely fast; it is perfectly capable of dealing with files that are multiple terabytes in size with excellent performance on OS X (iOS obviously can't have files that large, but it's the same class with the same performance).


    There are a few issues in your code.

    This is wrong:

    let filePath = NSTemporaryDirectory() + named
    

    Instead always do:

    let filePath = NSTemporaryDirectory().stringByAppendingPathComponent(named)
    

    But that's not ideal either; you should avoid using paths (they are buggy and slow). Instead use a URL like this:

    let tmpDir = NSURL(fileURLWithPath: NSTemporaryDirectory())!
    let fileURL = tmpDir.URLByAppendingPathComponent(named)
    

    Also, you're using a path to check if the file exists... don't do this:

    if NSFileManager.defaultManager().fileExistsAtPath( filePath ) {
    

    Instead use NSURL to check if it exists:

    if fileURL.checkResourceIsReachableAndReturnError(nil) {
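
    The snippets above use early Swift spellings. A hedged sketch of the same URL-based steps in current Swift follows; the value assigned to named is a placeholder for the file name used in the question.

    ```swift
    import Foundation

    // `named` stands in for the file name from the question (placeholder value).
    let named = "example-video.mov"

    // Build the destination as a URL instead of concatenating path strings.
    let fileURL = FileManager.default.temporaryDirectory.appendingPathComponent(named)

    // checkResourceIsReachable() throws when the file is missing, so try? maps that case to nil.
    let exists = (try? fileURL.checkResourceIsReachable()) ?? false
    print(exists ? "file exists" : "no file at \(fileURL.lastPathComponent)")
    ```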
    
  • 2020-12-22 19:41

    Latest Solution (2018)

    Another useful option is to call a closure whenever the buffer fills (or, for a timed recording, whenever the interval elapses) to append the data, and a second closure to announce the end of the data stream. Combined with some of the Photo APIs, this could lead to good outcomes. Declarations like the ones below could be fired during processing:

    var dataSpoolingFinished: ((URL?, Error?) -> Void)?
    var dataSpooling: ((Data?, Error?) -> Void)?
    

    Handling these closures in your management object may allow you to succinctly handle data of any size while keeping the memory under control.
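
    One way such a management object might look, as a sketch only: DataSpooler, append(_:), and finish() are hypothetical names, and the serial queue label is arbitrary. It uses FileHandle.write(_:), which raises an exception rather than throwing, so the Error? parameters stay nil here; newer OS versions offer a throwing write(contentsOf:) that could feed them.

    ```swift
    import Foundation

    /// Appends chunks to a file as they arrive and reports through the two closures.
    /// (`DataSpooler` and its method names are hypothetical, for illustration only.)
    final class DataSpooler {
        var dataSpoolingFinished: ((URL?, Error?) -> Void)?
        var dataSpooling: ((Data?, Error?) -> Void)?

        private let url: URL
        private let handle: FileHandle
        // A serial queue keeps chunks in arrival order and keeps disk I/O off the caller's thread.
        private let queue = DispatchQueue(label: "spooler.write")

        init(url: URL) throws {
            // FileHandle(forWritingTo:) requires the file to exist already.
            guard FileManager.default.createFile(atPath: url.path, contents: nil) else {
                throw CocoaError(.fileWriteUnknown)
            }
            self.url = url
            self.handle = try FileHandle(forWritingTo: url)
        }

        /// Call whenever the capture buffer fills.
        func append(_ chunk: Data) {
            queue.async {
                self.handle.write(chunk)
                self.dataSpooling?(chunk, nil)
            }
        }

        /// Call once after the last chunk; runs after every queued write.
        func finish() {
            queue.async {
                self.handle.closeFile()
                self.dataSpoolingFinished?(self.url, nil)
            }
        }
    }
    ```

    Only one chunk is held per pending write, so memory use tracks the buffer size rather than the total recording length.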

    Couple that idea with a recursive method that aggregates pieces of work into a single DispatchGroup (dispatch_group in the older C API) and there could be some exciting possibilities.

    Apple docs state:

    DispatchGroup allows for aggregate synchronization of work. You can use them to submit multiple different work items and track when they all complete, even though they might run on different queues. This behavior can be helpful when progress can’t be made until all of the specified tasks are complete.
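
    A small sketch of the aggregate synchronization described in that quote. The segment names and file names are made up for illustration; each block stands in for an independent piece of work such as writing one part of a recording.

    ```swift
    import Foundation

    // Hypothetical stand-ins for independent pieces of work, e.g. writing separate segments.
    let segments = ["intro", "body", "outro"]
    let directory = FileManager.default.temporaryDirectory
    let group = DispatchGroup()

    for name in segments {
        // Each block is tracked by the group even though the global queue may run them in parallel.
        DispatchQueue.global(qos: .utility).async(group: group) {
            let url = directory.appendingPathComponent("segment-\(name).txt")
            try? Data(name.utf8).write(to: url, options: .atomic)
        }
    }

    // wait() blocks the calling thread until every block has finished;
    // use group.notify(queue:) instead when the caller must not block (e.g. the main thread).
    group.wait()
    print("all \(segments.count) segments written")
    ```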

    Other Noteworthy Solutions (~2016)

    I have no doubt that I will refine this some more but the topic is complex enough to warrant a separate self-answer. I decided to take some advice from the other answers and leverage the NSStream subclasses. This solution is based on an Obj-C sample (NSInputStream inputStreamWithURL example ios, 2013, May 12) posted over on the SampleCodeBank blog.

    Apple documentation notes that with an NSStream subclass you do NOT have to load all data into memory at once. That is the key to being able to manage multimedia files of any size (not exceeding available disk or RAM space).

    NSStream is an abstract class for objects representing streams. Its interface is common to all Cocoa stream classes, including its concrete subclasses NSInputStream and NSOutputStream.

    NSStream objects provide an easy way to read and write data to and from a variety of media in a device-independent way. You can create stream objects for data located in memory, in a file, or on a network (using sockets), and you can use stream objects without loading all of the data into memory at once.

    File System Programming Guide

    Apple's Processing an Entire File Linearly Using Streams article in the FSPG also provided the notion that NSInputStream and NSOutputStream should be inherently thread safe.

    Further Refinements

    This object doesn't use the stream delegate methods. There is plenty of room for other refinements, but this is the basic approach I will take. The main focus on the iPhone is managing large files while constraining memory use via a buffer (TBD: leverage the outputStream in-memory buffer). To be clear, Apple does mention that convenience methods such as writeToURL are only intended for smaller files. That makes me wonder why they don't handle larger files as well, since these are not edge cases; I will file the question as a bug.

    Conclusion

    I will have to test further for integrating on a background thread as I don't want to interfere with any NSStream internal queuing. I have some other objects that use similar ideas to manage extremely large data files over the wire. The best method is to keep file sizes as small as possible in iOS to conserve memory and prevent app crashes. The APIs are built with these constraints in mind (which is why attempting unlimited video is not a good idea), so I will have to adapt expectations overall.

    (Gist Source, Check gist for latest changes)

    import Foundation
    import Darwin.Mach.mach_time
    
    class MNGStreamReaderWriter:NSObject {
    
        var copyOutput:NSOutputStream?
        var fileInput:NSInputStream?
        var outputStream:NSOutputStream? = NSOutputStream(toMemory: ())
        var urlInput:NSURL?
    
        convenience init(srcURL:NSURL, targetURL:NSURL) {
            self.init()
            self.fileInput  = NSInputStream(URL: srcURL)
            self.copyOutput = NSOutputStream(URL: targetURL, append: false)
            self.urlInput   = srcURL
    
        }
    
        func copyFileURLToURL(destURL:NSURL, withProgressBlock block: (fileSize:Double,percent:Double,estimatedTimeRemaining:Double) -> ()){
    
            guard let copyOutput = self.copyOutput, let fileInput = self.fileInput, let urlInput = self.urlInput else { return }
    
            let fileSize            = sizeOfInputFile(urlInput)
            let bufferSize          = 4096
            let buffer              = UnsafeMutablePointer<UInt8>.alloc(bufferSize)
            //release the buffer on every exit path
            defer { buffer.dealloc(bufferSize) }
            var bytesToWrite        = 0
            var bytesWritten        = 0
            var counter             = 0
            var copySize            = 0
    
            fileInput.open()
            copyOutput.open()
    
            //start time
            let time0 = mach_absolute_time()
    
            while fileInput.hasBytesAvailable {

                //read one chunk; 0 means end of file, a negative value means a read error
                bytesToWrite = fileInput.read(buffer, maxLength: bufferSize)
                if bytesToWrite < 0 {
                    print(fileInput.streamStatus.rawValue)
                }
                if bytesToWrite <= 0 { break }

                //write only the bytes that were read; write() may accept fewer than requested,
                //so keep going from the current offset until the chunk is flushed
                var offset = 0
                while offset < bytesToWrite {
                    bytesWritten = copyOutput.write(buffer + offset, maxLength: bytesToWrite - offset)
                    if bytesWritten <= 0 {
                        print(copyOutput.streamStatus.rawValue)
                        break
                    }
                    offset += bytesWritten
                    copySize += bytesWritten
                }
                //stop copying after a write error
                if offset < bytesToWrite { break }
    
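                counter += 1
                if fileSize != nil && fileSize! > 0 && copySize > 0 && (counter % 10 == 0) {
                    //passback a progress tuple (convert before dividing to avoid integer division)
                    let percent     = Double(copySize) / Double(fileSize!)
                    let time1       = mach_absolute_time()
                    //mach ticks are not nanoseconds on every device; convert via the timebase
                    var timebase    = mach_timebase_info_data_t()
                    mach_timebase_info(&timebase)
                    let elapsed     = Double(time1 - time0) * Double(timebase.numer) / Double(timebase.denom) / Double(NSEC_PER_SEC)
                    let estTimeLeft = ((1 - percent) / percent) * elapsed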
    
                    block(fileSize: Double(copySize), percent: percent, estimatedTimeRemaining: estTimeLeft)
                }
            }
    
            //send final progress tuple
            block(fileSize: Double(copySize), percent: 1, estimatedTimeRemaining: 0)
    
    
            //close both streams unconditionally so neither is left open after an error
            fileInput.close()
            copyOutput.close()
    
    
    
        }
    
        func sizeOfInputFile(src:NSURL) -> Int? {
    
            do {
                let attributes = try NSFileManager.defaultManager().attributesOfItemAtPath(src.path!)
                //the attributes dictionary is keyed by NSFileSize, not "fileSize"
                return (attributes[NSFileSize] as? NSNumber)?.integerValue
    
            } catch let inputFileError as NSError {
                print(inputFileError.localizedDescription,inputFileError.localizedRecoverySuggestion)
            }
    
            return nil
        }
    
    
    }
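
    The class above uses Swift 2-era names. As a condensed sketch of the same buffered copy in current Swift (InputStream/OutputStream), without the progress reporting: copyFile is an illustrative helper name, not part of the gist or of Foundation.

    ```swift
    import Foundation

    /// Copies `source` to `destination` through a fixed-size buffer, so memory use stays
    /// near `bufferSize` regardless of file size. Returns the byte count, or nil on a stream error.
    /// (`copyFile` is an illustrative helper, not a Foundation API.)
    func copyFile(from source: URL, to destination: URL, bufferSize: Int = 4096) -> Int? {
        guard let input = InputStream(url: source),
              let output = OutputStream(url: destination, append: false) else { return nil }
        input.open()
        output.open()
        defer { input.close(); output.close() }

        var buffer = [UInt8](repeating: 0, count: bufferSize)
        var total = 0
        while true {
            let read = input.read(&buffer, maxLength: bufferSize)
            if read == 0 { return total }      // end of file
            if read < 0 { return nil }         // read error
            var offset = 0
            while offset < read {
                // write() may take fewer bytes than offered, so advance by what it accepted.
                let written = buffer[offset..<read].withUnsafeBufferPointer {
                    output.write($0.baseAddress!, maxLength: $0.count)
                }
                if written <= 0 { return nil } // write error
                offset += written
            }
            total += read
        }
    }
    ```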
    

    Delegation

    Here's a similar object that I rewrote from an article (Advanced File I/O in the background, Eidhof, C., ObjC.io). With just a few tweaks this could be made to emulate the behavior above: simply redirect the data to an NSOutputStream in the processDataChunk method.

    (Gist Source - Check gist for latest changes)

    import Foundation
    
    class MNGStreamReader: NSObject, NSStreamDelegate {
    
        var callback: ((lineNumber: UInt , stringValue: String) -> ())?
        var completion: ((Int) -> Void)?
        var fileURL:NSURL?
        var inputData:NSData?
        var inputStream: NSInputStream?
        var lineNumber:UInt = 0
        var queue:NSOperationQueue?
        var remainder:NSMutableData?
        var delimiter:NSData?
        //var reader:NSInputStreamReader?
    
        func enumerateLinesWithBlock(block: (UInt, String)->() , completionHandler completion:(numberOfLines:Int) -> Void ) {
    
            if self.queue == nil {
                self.queue = NSOperationQueue()
                self.queue!.maxConcurrentOperationCount = 1
            }
    
            assert(self.queue!.maxConcurrentOperationCount == 1, "Queue can't be concurrent.")
            assert(self.inputStream == nil, "Cannot process multiple input streams in parallel")
    
            self.callback = block
            self.completion = completion
    
            if self.fileURL != nil {
                self.inputStream = NSInputStream(URL: self.fileURL!)
            } else if self.inputData != nil {
                self.inputStream = NSInputStream(data: self.inputData!)
            }
    
            self.inputStream!.delegate = self
            self.inputStream!.scheduleInRunLoop(NSRunLoop.currentRunLoop(), forMode: NSDefaultRunLoopMode)
            self.inputStream!.open()
        }
    
        convenience init? (withData inbound:NSData) {
            self.init()
            self.inputData = inbound
            self.delimiter = "\n".dataUsingEncoding(NSUTF8StringEncoding)
    
        }
    
        convenience init? (withFileAtURL fileURL: NSURL) {
            //only file URLs can back a file input stream
            guard fileURL.fileURL else { return nil }
    
            self.init()
            self.fileURL = fileURL
            self.delimiter = "\n".dataUsingEncoding(NSUTF8StringEncoding)
        }
    
        @objc func stream(aStream: NSStream, handleEvent eventCode: NSStreamEvent){
    
            switch eventCode {
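            case NSStreamEvent.OpenCompleted:
                //nothing to do until bytes arrive; falling through here would close the stream immediately
                break
            case NSStreamEvent.EndEncountered:
                //flush the trailing line, if any, before tearing the stream down
                if let remainder = self.remainder {
                    self.emitLineWithData(remainder)
                }
                self.remainder = nil
                self.inputStream!.close()
                self.inputStream = nil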
    
                self.queue!.addOperationWithBlock({ () -> Void in
                    self.completion!(Int(self.lineNumber) + 1)
                })
    
                break
            case NSStreamEvent.ErrorOccurred:
                NSLog("error")
                break
            case NSStreamEvent.HasSpaceAvailable:
                NSLog("HasSpaceAvailable")
                break
            case NSStreamEvent.HasBytesAvailable:
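                NSLog("HasBytesAvailable")

                //NSMutableData(capacity:) starts with length 0, so read() would be asked for 0 bytes; allocate by length instead
                if let buffer = NSMutableData(length: 4096) {
                    let length = self.inputStream!.read(UnsafeMutablePointer<UInt8>(buffer.mutableBytes), maxLength: buffer.length)
                    if 0 < length {
                        buffer.length = length
                        self.queue!.addOperationWithBlock({ [weak self]  () -> Void in
                            //self is weak here, so skip the chunk if the reader has gone away
                            self?.processDataChunk(buffer)
                            })
                    }
                }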
                break
            default:
                break
            }
        }
    
        func processDataChunk(buffer: NSMutableData) {
            if self.remainder != nil {
    
                self.remainder!.appendData(buffer)
    
            } else {
    
                self.remainder = buffer
            }
    
            self.remainder!.mng_enumerateComponentsSeparatedBy(self.delimiter!, block: {( component: NSData, last: Bool) in
    
                if !last {
                    self.emitLineWithData(component)
                }
                else {
                    if 0 < component.length {
                        self.remainder = (component.mutableCopy() as! NSMutableData)
                    }
                    else {
                        self.remainder = nil
                    }
                }
            })
        }
    
        func emitLineWithData(data: NSData) {
            let lineNumber = self.lineNumber
            self.lineNumber = lineNumber + 1
            if 0 < data.length {
                if let line = NSString(data: data, encoding: NSUTF8StringEncoding) {
                    callback!(lineNumber: lineNumber, stringValue: line as String)
                }
            }
        }
    }
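
    The class above calls mng_enumerateComponentsSeparatedBy, which is not shown in this post. As a guess at what such a helper does, here is a sketch in current Swift on Data; the method name enumerateComponents(separatedBy:block:) and its exact semantics are assumptions inferred from how processDataChunk uses the callback (component, last).

    ```swift
    import Foundation

    extension Data {
        /// Calls `block` once per component between occurrences of `delimiter`.
        /// `last` is true only for the trailing component after the final delimiter,
        /// which may be empty when the data ends with the delimiter.
        /// (Hypothetical helper; not the gist's actual implementation.)
        func enumerateComponents(separatedBy delimiter: Data, block: (Data, Bool) -> Void) {
            var start = startIndex
            while let found = range(of: delimiter, in: start..<endIndex) {
                block(subdata(in: start..<found.lowerBound), false)
                start = found.upperBound
            }
            block(subdata(in: start..<endIndex), true)
        }
    }
    ```

    The trailing component is what the reader keeps as its remainder until the next chunk arrives.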
    