Copy a file in a sane, safe and efficient way

前端 未结 7 1385
小蘑菇
小蘑菇 2020-11-22 07:11

I search for a good way to copy a file (binary or text). I\'ve written several samples, everyone works. But I want hear the opinion of seasoned programmers.

I missin

7条回答
  •  栀梦
    栀梦 (楼主)
    2020-11-22 07:19

    I'm not quite sure what a "good way" of copying a file is, but assuming "good" means "fast", I could broaden the subject a little.

    Current operating systems have long been optimized to deal with run of the mill file copy. No clever bit of code will beat that. It is possible that some variant of your copy techniques will prove faster in some test scenario, but they most likely would fare worse in other cases.

    Typically, the sendfile function probably returns before the write has been committed, thus giving the impression of being faster than the rest. I haven't read the code, but it is most certainly because it allocates its own dedicated buffer, trading memory for time. And the reason why it won't work for files bigger than 2Gb.

    As long as you're dealing with a small number of files, everything occurs inside various buffers (the C++ runtime's first if you use iostream, the OS internal ones, apparently a file-sized extra buffer in the case of sendfile). Actual storage media is only accessed once enough data has been moved around to be worth the trouble of spinning a hard disk.

    I suppose you could slightly improve performances in specific cases. Off the top of my head:

    • If you're copying a huge file on the same disk, using a buffer bigger than the OS's might improve things a bit (but we're probably talking about gigabytes here).
    • If you want to copy the same file on two different physical destinations you will probably be faster opening the three files at once than calling two copy_file sequentially (though you'll hardly notice the difference as long as the file fits in the OS cache)
    • If you're dealing with lots of tiny files on an HDD you might want to read them in batches to minimize seeking time (though the OS already caches directory entries to avoid seeking like crazy and tiny files will likely reduce disk bandwidth dramatically anyway).

    But all that is outside the scope of a general purpose file copy function.

    So in my arguably seasoned programmer's opinion, a C++ file copy should just use the C++17 file_copy dedicated function, unless more is known about the context where the file copy occurs and some clever strategies can be devised to outsmart the OS.

提交回复
热议问题