I have written a converter that takes OpenStreetMap XML files and converts them to a binary runtime rendering format that is typically about 10% of the original size. Input file
I suspect your memory issues come from keeping the BSP tree in memory. So keep the BSP on disk and hold only some chunks in memory. This should be fairly easy with a BSP, since the structure lends itself to this better than some other tree structures, and the logic should be simple. To be both efficient and memory friendly you could use a cache with a dirty flag, with the cache size set to the available memory less a bit of breathing room.
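A rough sketch of what I mean, assuming fixed-size nodes stored at offset id * sizeof(Node) in a scratch file (all names and the node layout here are made up):

```cpp
#include <cstdio>
#include <unordered_map>

// Hypothetical fixed-size BSP node; the real payload depends on your format.
struct Node {
    int left, right;
    double split;
};

class NodeCache {
public:
    NodeCache(const char* path, std::size_t maxNodes)
        : file_(std::fopen(path, "r+b")), maxNodes_(maxNodes) {}

    // Fetch a node, reading it from disk if it isn't cached yet.
    Node& get(long id) {
        auto it = cache_.find(id);
        if (it == cache_.end()) {
            if (cache_.size() >= maxNodes_) evictOne();
            Entry e;
            std::fseek(file_, id * (long)sizeof(Node), SEEK_SET);
            std::fread(&e.node, sizeof(Node), 1, file_);
            it = cache_.emplace(id, e).first;
        }
        return it->second.node;
    }

    // Call after modifying a node so eviction knows to write it back.
    void markDirty(long id) { cache_[id].dirty = true; }

private:
    struct Entry { Node node{}; bool dirty = false; };

    void evictOne() {
        auto victim = cache_.begin();   // a real cache would pick the least recently used entry
        if (victim->second.dirty) {
            std::fseek(file_, victim->first * (long)sizeof(Node), SEEK_SET);
            std::fwrite(&victim->second.node, sizeof(Node), 1, file_);
        }
        cache_.erase(victim);
    }

    std::FILE* file_;
    std::size_t maxNodes_;
    std::unordered_map<long, Entry> cache_;
};
```

Dirty nodes only get written back when they are evicted, so clean chunks cost nothing on the way out.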
It sounds like you're already doing a SAX-based approach to the XML processing (loading the XML as you go instead of all at once).
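With a stream parser such as Expat, that boils down to something like the skeleton below; only the current read buffer and the element being handled are ever in memory (the file name and the handler bodies are placeholders):

```cpp
#include <expat.h>
#include <cstdio>

// Called once per opening tag; pull out what you need and discard it immediately.
static void XMLCALL onStart(void* userData, const XML_Char* name, const XML_Char** atts) {
    // e.g. for a <node> element, read the lat/lon attributes and convert right here
}
static void XMLCALL onEnd(void* userData, const XML_Char* name) {}

int main() {
    XML_Parser parser = XML_ParserCreate(nullptr);
    XML_SetElementHandler(parser, onStart, onEnd);

    std::FILE* in = std::fopen("map.osm", "rb");   // hypothetical input file
    char buf[64 * 1024];
    std::size_t len;
    while ((len = std::fread(buf, 1, sizeof(buf), in)) > 0) {
        if (XML_Parse(parser, buf, (int)len, 0) == XML_STATUS_ERROR)
            break;   // report XML_GetErrorCode(parser) in real code
    }
    XML_Parse(parser, buf, 0, 1);   // signal end of input
    XML_ParserFree(parser);
    std::fclose(in);
    return 0;
}
```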
The solution is almost always to change the algorithm so that it cuts the problem into smaller parts: don't allocate as much memory at one time; read in only what you need, process it, then write it out.
You can also sometimes extend memory by using the hard drive when your algorithm needs more than fits in RAM.
If you can't split up your algorithm, you probably want something like memory-mapped files.
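A rough sketch of the memory-mapped-file approach on a POSIX system (file name and sizes are made up; error checks omitted):

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

struct Point { double lat, lon; };      // hypothetical element type

int main() {
    const size_t count = 50000000;      // ~800 MB of points, more than you'd want on the heap
    const size_t bytes = count * sizeof(Point);

    int fd = open("points.tmp", O_RDWR | O_CREAT, 0644);   // scratch file as backing store
    ftruncate(fd, bytes);                                   // size the backing file
    void* mem = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    Point* points = static_cast<Point*>(mem);

    points[42].lat = 52.5;              // looks like an ordinary array; the OS pages it in and out
    points[42].lon = 13.4;

    munmap(mem, bytes);
    close(fd);
    return 0;
}
```

On a 32-bit system you would map smaller windows of the file at a time rather than the whole thing, since the address space itself is the limit.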
In the worst case you can try to use something like VirtualAlloc if you are on a Windows system. If you are on a 32-bit system you can look at Physical Address Extension (PAE), which lets the OS use more than 4 GB of physical memory, though each individual process still sees only a 32-bit address space.
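A minimal sketch of the VirtualAlloc route, reserving address space up front and committing pages only as they are actually needed (the sizes here are arbitrary):

```cpp
#include <windows.h>

int main() {
    const SIZE_T reserveSize = 512ull * 1024 * 1024;   // reserve 512 MiB of address space
    const SIZE_T chunkSize   = 64 * 1024;              // commit it 64 KiB at a time

    // Reserve only: no physical memory or pagefile space is used yet.
    char* base = (char*)VirtualAlloc(nullptr, reserveSize, MEM_RESERVE, PAGE_NOACCESS);

    // Commit one chunk when you actually need to write to it.
    SIZE_T offset = 0;
    VirtualAlloc(base + offset, chunkSize, MEM_COMMIT, PAGE_READWRITE);
    base[offset] = 1;

    // Give the physical pages back when the chunk is done, keeping the reservation.
    VirtualFree(base + offset, chunkSize, MEM_DECOMMIT);

    // Release the whole reservation at the end.
    VirtualFree(base, 0, MEM_RELEASE);
    return 0;
}
```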
You could also consider putting input-size limitations on your program, with different limits for 32-bit and 64-bit systems.
How are you allocating memory for the points? Are you allocating them one at a time (e.g. pt = new Point)? Then, depending on the size of Point, some memory may be wasted. For example, on Windows memory is allocated in multiples of 16 bytes, so even if you try to allocate 1 byte, the OS will actually allocate 16 bytes.
If this is the case, using a custom memory allocator may help. You can do a quick check using the STL allocator: overload operator new for the Point class and use the STL allocator to obtain memory rather than malloc or the default operator new.
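Here is a rough sketch of the idea with a simple pool behind a class-specific operator new (the Point layout and block size are invented; std::allocator or a library pool could be dropped in instead):

```cpp
#include <cstddef>
#include <vector>

struct Point {
    double lat, lon;

    // Class-specific operator new: carve Points out of large blocks instead of
    // asking the general-purpose heap for a tiny allocation every time.
    static void* operator new(std::size_t size) { return pool().allocate(size); }
    static void operator delete(void*, std::size_t) {
        // Simplest policy: never free individual Points; the pool frees everything at once.
    }

private:
    struct Pool {
        static const std::size_t kBlockSize = 1 << 20;   // 1 MiB per block
        std::vector<char*> blocks;
        std::size_t used = kBlockSize;                    // forces a block on first use

        void* allocate(std::size_t n) {
            if (used + n > kBlockSize) {
                blocks.push_back(new char[kBlockSize]);
                used = 0;
            }
            void* p = blocks.back() + used;
            used += n;
            return p;
        }
        ~Pool() { for (char* b : blocks) delete[] b; }
    };
    static Pool& pool() { static Pool p; return p; }
};
```

Each Point then costs roughly sizeof(Point) bytes plus a small amortized overhead, instead of a full heap block with per-allocation bookkeeping.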
If you want to be memory-size independent, you need a size-independent algorithm. No matter how much RAM you have, if you don't have memory usage under control, you're going to hit the limit eventually.
Look at the smallest chunk of input you can possibly use to produce a bit of output. Then think of a way to divide the input into chunks of this size.
Now that sounds easy, doesn't it? (Glad I don't have to do it :) )
You have to understand that virtual memory is different from "RAM" in that the amount of virtual memory you're using is the total address space you've reserved, while real memory (on Windows it's called the working set) is memory you've actually modified or locked.
As someone else pointed out, on 32-bit Windows platforms the per-process limit on virtual memory is 2 gigabytes, unless you set the special flag asking for 3 gigabytes (linking the executable as large-address-aware on a system booted with the /3GB option) and can ensure that your code and any libraries you use never treat pointers as signed values.
So my advice would be either to require a 64-bit system, or to monitor your virtual memory usage and cap your maximum block size at something that comfortably fits inside the limits imposed by 32-bit operating systems.
I've slammed into the 32-bit wall on Windows, but have no experience working around these limitations on Linux, so I've only talked about the Windows side of things.
It sounds like you are doing a text-to-binary conversion, so why do you need to have the entire data set in memory?
Can't you just read a primitive from the text (XML), then write it straight to the binary stream?
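For example, each node could be written out as a small fixed-width record the moment it is parsed, so only one primitive is ever in memory at a time (the record layout below is just an illustration):

```cpp
#include <cstdint>
#include <fstream>

#pragma pack(push, 1)
struct NodeRecord {           // hypothetical 16-byte record per OSM node
    int64_t id;
    int32_t lat_e7;           // latitude  * 1e7 as fixed-point
    int32_t lon_e7;           // longitude * 1e7 as fixed-point
};
#pragma pack(pop)

// Called from the XML handler as soon as a <node> element has been read.
void writeNode(std::ofstream& out, int64_t id, double lat, double lon) {
    NodeRecord r{ id,
                  static_cast<int32_t>(lat * 1e7),
                  static_cast<int32_t>(lon * 1e7) };
    out.write(reinterpret_cast<const char*>(&r), sizeof(r));
    // nothing needs to be kept in memory after this point
}

// std::ofstream out("map.bin", std::ios::binary);
// writeNode(out, 123456789, 52.5200, 13.4050);
```

The usual catch is that ways reference node ids, so you may still need some index from id to position, which is where the on-disk or cached structures suggested above come in.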