I have a large file, roughly 400 GB in size, generated daily by an external closed system. It is a binary file with the following format:
byte[8]byte[4]byte[n
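Use merge sort, in its external form: sort chunks that fit in memory, write each one out as a sorted run, then merge the runs in a single streaming pass. It never needs the whole file in memory, and it parallelizes well.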
http://en.wikipedia.org/wiki/Merge_sort
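As a rough illustration, here is a minimal sketch of an external merge sort in Python. The record layout in the question is cut off, so this assumes a hypothetical fixed-size record (`RECORD_SIZE` bytes, sorted by its first `KEY_SIZE` bytes); those constants and all function names are made up for the example, and the reader would need adapting to the real format.

```python
import heapq
import os
import tempfile

RECORD_SIZE = 16  # hypothetical fixed record size, NOT the question's layout
KEY_SIZE = 8      # hypothetical: sort by the first 8 bytes of each record


def read_records(path):
    """Yield fixed-size records from a binary file, one at a time."""
    with open(path, "rb") as f:
        while True:
            rec = f.read(RECORD_SIZE)
            if len(rec) < RECORD_SIZE:
                break  # end of file (a trailing partial record is ignored)
            yield rec


def record_key(rec):
    """Sort key: the raw leading bytes, compared lexicographically."""
    return rec[:KEY_SIZE]


def write_run(records, tmp_dir):
    """Sort one in-memory chunk and write it out as a sorted run file."""
    records.sort(key=record_key)
    fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".run")
    with os.fdopen(fd, "wb") as f:
        f.writelines(records)
    return path


def external_merge_sort(src, dst, chunk_records, tmp_dir):
    # Phase 1: split the input into sorted runs that each fit in memory.
    runs, chunk = [], []
    for rec in read_records(src):
        chunk.append(rec)
        if len(chunk) >= chunk_records:
            runs.append(write_run(chunk, tmp_dir))
            chunk = []
    if chunk:
        runs.append(write_run(chunk, tmp_dir))

    # Phase 2: k-way merge. heapq.merge pulls lazily from each run,
    # so only one record per run is held in memory at a time.
    with open(dst, "wb") as out:
        streams = (read_records(p) for p in runs)
        for rec in heapq.merge(*streams, key=record_key):
            out.write(rec)

    for p in runs:
        os.remove(p)
```

Calling `external_merge_sort("in.bin", "out.bin", chunk_records=1_000_000, tmp_dir="/scratch")` (paths and chunk size are placeholders) would sort the file using about one chunk of memory. The chunks in phase 1 are independent, so they could be sorted by separate worker processes. With a very large number of runs, the merge may need to be done in several passes to stay under the open-file limit.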