Sorting gigantic binary files with C#

别那么骄傲 2021-02-09 14:39

I have a large file, roughly 400 GB in size, that is generated daily by an external closed system. It is a binary file with the following format:

byte[8]byte[4]byte[n         


        
4 answers
  •  被撕碎了的回忆
    2021-02-09 15:23

    I would do this in several passes. On the first pass, I would build a list of the ticks and use it to choose boundaries that divide them evenly into many (hundreds?) of buckets. If you know ahead of time that the ticks are evenly distributed, you can skip this initial pass. On the second pass, I would split the records into those few hundred separate files of about the same size, each holding one contiguous range of ticks, in the order that you want. Then I would sort each file separately in memory and concatenate the sorted files.

    It is somewhat similar to a hash sort (I think); in effect it is a bucket sort on the tick value.
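
The passes described in this answer could be sketched roughly as follows. This is only a minimal illustration, not the answerer's code. The format line in the question is truncated, so the record layout used here (an 8-byte little-endian tick, a 4-byte payload length, then the payload bytes) is an assumption, and `BucketFileSort`, `ChooseBoundaries`, `Sort`, and the bucket file names are hypothetical names made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class BucketFileSort
{
    // ASSUMED layout: Int64 tick, Int32 payload length, payload bytes (little-endian).
    private static (long Tick, byte[] Payload) ReadRecord(BinaryReader reader)
    {
        long tick = reader.ReadInt64();
        int length = reader.ReadInt32();
        return (tick, reader.ReadBytes(length));
    }

    private static void WriteRecord(BinaryWriter writer, long tick, byte[] payload)
    {
        writer.Write(tick);
        writer.Write(payload.Length);
        writer.Write(payload);
    }

    // Pass 1: read every tick (seeking past payloads) and pick split points
    // so that each bucket receives roughly the same number of records.
    public static long[] ChooseBoundaries(string inputPath, int bucketCount)
    {
        var ticks = new List<long>();
        using (var reader = new BinaryReader(File.OpenRead(inputPath)))
        {
            Stream stream = reader.BaseStream;
            while (stream.Position < stream.Length)
            {
                ticks.Add(reader.ReadInt64());
                int length = reader.ReadInt32();
                stream.Seek(length, SeekOrigin.Current);
            }
        }
        if (ticks.Count == 0 || bucketCount < 2) return Array.Empty<long>();

        ticks.Sort();
        var boundaries = new long[bucketCount - 1];
        for (int i = 1; i < bucketCount; i++)
            boundaries[i - 1] = ticks[(int)((long)i * ticks.Count / bucketCount)];
        return boundaries;
    }

    public static void Sort(string inputPath, string outputPath, string tempDir, int bucketCount)
    {
        long[] boundaries = ChooseBoundaries(inputPath, bucketCount);
        Directory.CreateDirectory(tempDir);
        string[] paths = Enumerable.Range(0, boundaries.Length + 1)
            .Select(i => Path.Combine(tempDir, $"bucket_{i:D4}.bin"))
            .ToArray();

        // Pass 2: route each record to the bucket file covering its tick range.
        BinaryWriter[] writers = paths.Select(p => new BinaryWriter(File.Create(p))).ToArray();
        try
        {
            using (var reader = new BinaryReader(File.OpenRead(inputPath)))
            {
                while (reader.BaseStream.Position < reader.BaseStream.Length)
                {
                    var record = ReadRecord(reader);
                    int index = Array.BinarySearch(boundaries, record.Tick);
                    // A tick equal to a boundary goes to the bucket that starts there.
                    int bucket = index >= 0 ? index + 1 : ~index;
                    WriteRecord(writers[bucket], record.Tick, record.Payload);
                }
            }
        }
        finally
        {
            foreach (BinaryWriter writer in writers) writer.Dispose();
        }

        // Pass 3: sort each bucket in memory and append it to the output in bucket order.
        using (var output = new BinaryWriter(File.Create(outputPath)))
        {
            foreach (string path in paths)
            {
                var records = new List<(long Tick, byte[] Payload)>();
                using (var reader = new BinaryReader(File.OpenRead(path)))
                {
                    while (reader.BaseStream.Position < reader.BaseStream.Length)
                        records.Add(ReadRecord(reader));
                }
                // OrderBy is a stable sort, so records with equal ticks keep their input order.
                foreach (var sorted in records.OrderBy(x => x.Tick))
                    WriteRecord(output, sorted.Tick, sorted.Payload);
                File.Delete(path);
            }
        }
    }
}
```

A call might look like `BucketFileSort.Sort("input.bin", "sorted.bin", "buckets", 400);` (the paths and bucket count are placeholders). With 400 GB of input and 400 buckets, each bucket would be about 1 GB, which should fit in memory on most machines. The tick list from pass 1 can itself grow large when records are small, so sampling every k-th tick instead of collecting all of them may be a reasonable variation.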
