large-file-support

Python class to merge sorted files, how can this be improved?

随声附和 submitted on 2020-01-10 14:34:00
Question: Background: I'm cleaning large (cannot be held in memory) tab-delimited files. As I clean the input file, I build up a list in memory; when it gets to 1,000,000 entries (about 1GB in memory) I sort it (using the default key below) and write the list to a file. This class is for putting the sorted files back together. It works on the files I have encountered thus far. My largest case, so far, is merging 66 sorted files. Questions: Are there holes in my logic (where is it fragile)? Have I implemented the merge-sort algorithm correctly? Are there any obvious improvements that could be made?
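
The poster's class is not reproduced here, but the underlying technique is a heap-based k-way merge. Purely as an illustration (in C++ rather than Python, with invented file names and a plain line-as-key ordering), a minimal sketch of that merge might look like this:

    // Minimal sketch: heap-based k-way merge of already-sorted text files.
    // File names and the merge_sorted_files helper are illustrative.
    #include <fstream>
    #include <queue>
    #include <string>
    #include <vector>

    struct Entry {
        std::string line;   // current line from one input file
        size_t file_index;  // which file it came from
    };

    // Order the heap so the smallest line is on top (std::priority_queue is a max-heap).
    struct Greater {
        bool operator()(const Entry& a, const Entry& b) const { return a.line > b.line; }
    };

    void merge_sorted_files(const std::vector<std::string>& inputs, const std::string& output) {
        std::vector<std::ifstream> files;
        for (const auto& name : inputs) files.emplace_back(name);

        std::priority_queue<Entry, std::vector<Entry>, Greater> heap;
        for (size_t i = 0; i < files.size(); ++i) {
            std::string line;
            if (std::getline(files[i], line)) heap.push({line, i});   // seed one line per file
        }

        std::ofstream out(output);
        while (!heap.empty()) {
            Entry top = heap.top();
            heap.pop();
            out << top.line << '\n';
            std::string next;
            if (std::getline(files[top.file_index], next))            // refill from the same file
                heap.push({next, top.file_index});
        }
    }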

iostream and large file support

倖福魔咒の submitted on 2019-12-30 04:38:06
Question: I'm trying to find a definitive answer and can't, so I'm hoping someone might know. I'm developing a C++ app using GCC 4.x on Linux (32-bit OS). This app needs to be able to read files > 2GB in size. I would really like to use iostream stuff vs. FILE pointers, but I can't find out whether the large-file #defines (_LARGEFILE_SOURCE, _LARGEFILE64_SOURCE, _FILE_OFFSET_BITS=64) have any effect on the iostream headers. I'm compiling on a 32-bit system. Any pointers would be helpful. Answer 1: This has already been decided for you when libstdc++ was compiled…
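
The excerpt cuts off before the details, so the following is only an illustrative check, not code from the answer: it probes how wide the iostream offset type actually is on a given toolchain, since on a 32-bit system that is decided by how libstdc++ was built. The file name and offset are invented.

    // Minimal check: how wide is the iostream offset type on this toolchain?
    // On a 32-bit system the answer depends on how libstdc++ itself was built,
    // not only on the macros the application defines before its own includes.
    #include <fstream>
    #include <iostream>

    int main() {
        std::cout << "sizeof(std::streamoff) = " << sizeof(std::streamoff) << " bytes\n";

        std::ifstream in("huge.dat", std::ios::binary);   // illustrative file name
        std::streamoff off = 3LL << 30;                   // 3 GiB; will not fit if streamoff is 32-bit
        in.seekg(off);
        std::cout << "position after seek: "
                  << static_cast<long long>(in.tellg()) << '\n';
        return 0;
    }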

Is O_LARGEFILE needed just to write a large file?

♀尐吖头ヾ submitted on 2019-12-17 16:29:03
Question: Is the O_LARGEFILE flag needed if all that I want to do is write a large file (O_WRONLY) or append to a large file (O_APPEND | O_WRONLY)? From a thread that I read titled "Cannot write >2gb index file" on the CLucene-dev mailing list, it appears that O_LARGEFILE might be needed to write large files, but participants in that discussion are using O_RDWR, not O_WRONLY, so I am not sure. Answer 1: O_LARGEFILE should never be used directly by applications. It's to be used internally by the 64-bit-offset-compatible version of open in libc when it makes the syscall to the kernel…
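
A minimal sketch of the approach the answer points toward, assuming the program is built with -D_FILE_OFFSET_BITS=64 so that plain O_WRONLY / O_APPEND suffice; the file name, message, and build command are illustrative, not taken from the thread.

    // Sketch: rely on a 64-bit off_t instead of passing O_LARGEFILE yourself.
    // Illustrative build command: g++ -D_FILE_OFFSET_BITS=64 writer.cpp
    // With that macro, glibc maps open() to its 64-bit variant, so plain
    // O_WRONLY / O_APPEND are enough even for files beyond 2 GiB.
    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int fd = open("big.log", O_WRONLY | O_APPEND | O_CREAT, 0644);  // illustrative file name
        if (fd == -1) { std::perror("open"); return 1; }

        const char msg[] = "appended without an explicit O_LARGEFILE\n";
        if (write(fd, msg, sizeof msg - 1) == -1) std::perror("write");

        close(fd);
        return 0;
    }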

How can I portably turn on large file support?

落花浮王杯 submitted on 2019-12-01 07:50:03
Question: I am currently writing a C program that reads and writes files that might be over 2 GiB in size. On Linux, feature_test_macros(7) specifies: _LARGEFILE64_SOURCE  Expose definitions for the alternative API specified by the LFS (Large File Summit) as a "transitional extension" to the Single UNIX Specification. (See ⟨http://opengroup.org/platform/lfs.html⟩) The alternative API consists of a set of new objects (i.e., functions and types) whose names are suffixed with "64" (e.g., off64_t versus off_t)…
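
The excerpt stops before any answer, so the fragment below is only a sketch of the non-transitional route: define _FILE_OFFSET_BITS=64 before any system header (or compile with whatever `getconf LFS_CFLAGS` prints) and use the plain off_t / fseeko / ftello names. The file name and offset are illustrative.

    // Sketch: request the 64-bit off_t API by defining the macro before any
    // system header (equivalently, build with the flags from `getconf LFS_CFLAGS`).
    // With it in effect, off_t, fseeko and ftello handle offsets past 2 GiB
    // even in a 32-bit build.
    #define _FILE_OFFSET_BITS 64

    #include <stdio.h>
    #include <sys/types.h>

    int main() {
        printf("sizeof(off_t) = %zu bytes\n", sizeof(off_t));

        FILE *fp = fopen("large.bin", "rb");        // illustrative file name
        if (fp) {
            fseeko(fp, (off_t)4 << 30, SEEK_SET);   // seek to the 4 GiB mark
            printf("now at offset %lld\n", (long long)ftello(fp));
            fclose(fp);
        }
        return 0;
    }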

iostream and large file support

狂风中的少年 submitted on 2019-11-30 14:04:34
I'm trying to find a definitive answer and can't, so I'm hoping someone might know. I'm developing a C++ app using GCC 4.x on Linux (32-bit OS). This app needs to be able to read files > 2GB in size. I would really like to use iostream stuff vs. FILE pointers, but I can't find out whether the large-file #defines (_LARGEFILE_SOURCE, _LARGEFILE64_SOURCE, _FILE_OFFSET_BITS=64) have any effect on the iostream headers. I'm compiling on a 32-bit system. Any pointers would be helpful. vladr: This has already been decided for you when libstdc++ was compiled, and normally depends on whether or not _GLIBCXX_USE_LFS was defined at that point…
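
If the installed libstdc++ turns out not to have been built with large-file support, one GNU-specific workaround sometimes suggested is to open the file through the 64-bit-capable POSIX call and wrap the descriptor in __gnu_cxx::stdio_filebuf so the rest of the code keeps its istream interface. The sketch below assumes exactly that situation; the file name is invented.

    // Sketch (GNU-specific): keep the iostream interface while letting the POSIX
    // layer, built with -D_FILE_OFFSET_BITS=64, handle the large-file details.
    #include <ext/stdio_filebuf.h>   // GNU extension shipped with libstdc++
    #include <fcntl.h>
    #include <unistd.h>
    #include <istream>
    #include <string>

    int main() {
        int fd = open("huge.tsv", O_RDONLY);              // illustrative file name
        if (fd == -1) return 1;

        __gnu_cxx::stdio_filebuf<char> buf(fd, std::ios::in);
        std::istream in(&buf);                            // ordinary istream on top of the fd

        std::string line;
        long long count = 0;
        while (std::getline(in, line)) ++count;           // sequential read of the large file

        // Note: whether the filebuf or the caller is responsible for closing the
        // descriptor is a libstdc++ detail worth checking before relying on this.
        return 0;
    }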

Python class to merge sorted files, how can this be improved?

元气小坏坏 submitted on 2019-11-30 09:23:55
Background: I'm cleaning large (cannot be held in memory) tab-delimited files. As I clean the input file, I build up a list in memory; when it gets to 1,000,000 entries (about 1GB in memory) I sort it (using the default key below) and write the list to a file. This class is for putting the sorted files back together. It works on the files I have encountered thus far. My largest case, so far, is merging 66 sorted files. Questions: Are there holes in my logic (where is it fragile)? Have I implemented the merge-sort algorithm correctly? Are there any obvious improvements that could be made?
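
For context, the spill phase the poster describes (accumulate a chunk, sort it with the default key, write a numbered run file, repeat) can be sketched roughly as follows; this is an illustration, not the poster's code, and the chunk size and file names are invented. The resulting run files are what the merge class puts back together.

    // Sketch of the spill phase: read lines, sort each full chunk, write it out
    // as a numbered run file, and return the run file names for the merge step.
    #include <algorithm>
    #include <cstddef>
    #include <fstream>
    #include <string>
    #include <vector>

    std::vector<std::string> write_sorted_runs(const std::string& input, std::size_t max_lines) {
        std::ifstream in(input);
        std::vector<std::string> chunk, run_names;
        std::string line;

        auto flush = [&]() {
            if (chunk.empty()) return;
            std::sort(chunk.begin(), chunk.end());                  // default lexicographic key
            std::string name = "run_" + std::to_string(run_names.size()) + ".txt";
            std::ofstream out(name);
            for (const auto& l : chunk) out << l << '\n';
            run_names.push_back(name);
            chunk.clear();
        };

        while (std::getline(in, line)) {
            chunk.push_back(line);
            if (chunk.size() >= max_lines) flush();                  // spill once the chunk is full
        }
        flush();                                                     // last, partially filled chunk
        return run_names;
    }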

Is O_LARGEFILE needed just to write a large file?

霸气de小男生 submitted on 2019-11-27 23:02:05
Is the O_LARGEFILE flag needed if all that I want to do is write a large file (O_WRONLY) or append to a large file (O_APPEND | O_WRONLY)? From a thread that I read titled "Cannot write >2gb index file" on the CLucene-dev mailing list, it appears that O_LARGEFILE might be needed to write large files, but participants in that discussion are using O_RDWR, not O_WRONLY, so I am not sure. O_LARGEFILE should never be used directly by applications. It's to be used internally by the 64-bit-offset-compatible version of open in libc when it makes the syscall to the kernel (Linux, or possibly…
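
As a sketch of the transitional interface this answer alludes to (not code from the thread; file name and offset are invented), glibc's explicitly suffixed calls handle O_LARGEFILE internally, although defining _FILE_OFFSET_BITS=64 and keeping the plain names is usually the simpler route.

    // Sketch: the explicitly 64-bit ("transitional") glibc interface; O_LARGEFILE
    // is handled inside open64, never passed by the application itself.
    #define _LARGEFILE64_SOURCE

    #include <fcntl.h>
    #include <unistd.h>
    #include <stdio.h>

    int main() {
        int fd = open64("big.idx", O_WRONLY | O_CREAT, 0644);       // illustrative file name
        if (fd == -1) { perror("open64"); return 1; }

        // Position the file pointer past the 2 GiB boundary before writing.
        if (lseek64(fd, (off64_t)3 << 30, SEEK_SET) == (off64_t)-1)
            perror("lseek64");

        const char marker[] = "written beyond 2 GiB\n";
        if (write(fd, marker, sizeof marker - 1) == -1) perror("write");

        close(fd);
        return 0;
    }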