We\'ve occasionally been getting problems whereby our long-running server processes (running on Windows Server 2003) have thrown an exception due to a memory allocation fail
if you talking about Win32 - you can try to squeeze something by using LARGEADDRESSAWARE. You'll have ~1Gb extra defragmented memory so your application will fragment it longer.
As usual, you can usually waste memory to gain some speed.
This technique isn't useful for a general purpose allocator, but it does have it's place.
Basically, the idea is to write an allocator that returns memory from a pool where all the allocations are the same size. This pool can never become fragmented because any block is as good as another. You can reduce memory wastage by creating multiple pools with different size chunks and pick the smallest chunk size pool that's still greater than the requested amount. I've used this idea to create allocators that run in O(1).
First, I agree with the other posters who suggested a resource leak. You really want to rule that out first.
Hopefully, the heap manager you are currently using has a way to dump out the actual total free space available in the heap (across all free blocks) and also the total number of blocks that it is divided over. If the average free block size is relatively small compared to the total free space in the heap, then you do have a fragmentation problem. Alternatively, if you can dump the size of the largest free block and compare that to the total free space, that will accomplish the same thing. The largest free block would be small relative to the total free space available across all blocks if you are running into fragmentation.
To be very clear about the above, in all cases we are talking about free blocks in the heap, not the allocated blocks in the heap. In any case, if the above conditions are not met, then you do have a leak situation of some sort.
So, once you have ruled out a leak, you could consider using a better allocator. Doug Lea's malloc suggested in the question is a very good allocator for general use applications and very robust most of the time. Put another way, it has been time tested to work very well for most any application. However, no algorithm is ideal for all applications and any management algorithm approach can be broken by the right pathelogical conditions against it's design.
Why are you having a fragmentation problem? - Sources of fragmentation problems are caused by the behavior of an application and have to do with greatly different allocation lifetimes in the same memory arena. That is, some objects are allocated and freed regularly while other types of objects persist for extended periods of time all in the same heap.....think of the longer lifetime ones as poking holes into larger areas of the arena and thereby preventing the coalesce of adjacent blocks that have been freed.
To address this type of problem, the best thing you can do is logically divide the heap into sub arenas where the lifetimes are more similar. In effect, you want a transient heap and a persistent heap or heaps that group things of similar lifetimes.
Some others have suggested another approach to solve the problem which is to attempt to make the allocation sizes more similar or identical, but this is less ideal because it creates a different type of fragmentation called internal fragmentation - which is in effect the wasted space you have by allocating more memory in the block than you need.
Additionally, with a good heap allocator, like Doug Lea's, making the block sizes more similar is unnecessary because the allocator will already be doing a power of two size bucketing scheme that will make it completely unnecessary to artificially adjust the allocation sizes passed to malloc() - in effect, his heap manager does that for you automatically much more robustly than the application will be able to make adjustments.
As you suggest, Doug Lea's malloc might work well. It's cross platform and it has been used in shipping code. At the very least, it should be easy to integrate into your code for testing.
Having worked in fixed memory environments for a number of years, this situation is certainly a problem, even in non-fixed environments. We have found that the CRT allocators tend to stink pretty bad in terms of performance (speed, efficiency of wasted space, etc). I firmly believe that if you have extensive need of a good memory allocator over a long period of time, you should write your own (or see if something like dlmalloc will work). The trick is getting something written that works with your allocation patterns, and that has more to do with memory management efficiency as almost anything else.
Give dlmalloc a try. I definitely give it a thumbs up. It's fairly tunable as well, so you might be able to get more efficiency by changing some of the compile time options.
Honestly, you shouldn't depend on things "going away" with new OS implementations. A service pack, patch, or another new OS N years later might make the problem worse. Again, for applications that demand a robust memory manager, don't use the stock versions that are available with your compiler. Find one that works for your situation. Start with dlmalloc and tune it to see if you can get the behavior that works best for your situation.
The simple, quick and dirty, solution is to split the application into several process, you should get fresh HEAP each time you create the process.
Your memory and speed might suffer a bit (swapping) but fast hardware and big RAM should be able to help.
This was old UNIX trick with daemons, when threads did not existed yet.
@nsaners - I'm pretty sure the problem is down to memory fragmentation. We've analyzed minidumps that point to a problem when a large (5-10mb) chunk of memory is being allocated. We've also monitored the process (on-site and in development) to check for memory leaks - none were detected (the memory footprint is generally quite low).