Is “Out Of Memory” A Recoverable Error?

闹比i 2020-11-30 21:44

I've been programming a long time, and the programs I see, when they run out of memory, attempt to clean up and exit, i.e. fail gracefully. I can't remember the last time

24 answers
  • 2020-11-30 22:18

    I'm working on a system that allocates memory for an IO cache to increase performance. On detecting OOM, it takes some of that memory back so that the business logic can proceed, even if that means less IO cache and slightly lower write performance.

    I also worked with an embedded Java application that attempted to manage OOM by forcing garbage collection and optionally releasing some non-critical objects, such as pre-fetched or cached data.

    The main problems with OOM handling are:

    1) Being able to retry at the point where the failure happened, or to roll back and retry from a higher level. Most contemporary programs rely too much on the language to throw and don't really manage where they end up or how to retry the operation. The context of the operation is usually lost unless it was designed to be preserved.

    2) Being able to actually release some memory. This requires some kind of resource manager that knows which objects are critical and which are not, and the system must be able to re-request the released objects if and when they later become critical.
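
    A minimal sketch of the kind of resource manager point 2 describes, written in Python for illustration. The class name `ResourceManager`, its methods, and the loader callbacks are all hypothetical; a real system would also have to track object sizes and decide what to drop first.

```python
from typing import Callable, Dict, Set


class ResourceManager:
    """Tracks which objects are critical and how to rebuild the others."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], object]] = {}
        self._critical: Dict[str, bool] = {}
        self._objects: Dict[str, object] = {}

    def register(self, name: str, loader: Callable[[], object], critical: bool) -> None:
        # The loader is what makes a released object re-requestable later.
        self._loaders[name] = loader
        self._critical[name] = critical

    def get(self, name: str) -> object:
        # Re-create a released object on demand by calling its loader again.
        if name not in self._objects:
            self._objects[name] = self._loaders[name]()
        return self._objects[name]

    def release_non_critical(self) -> int:
        """Drop every loaded non-critical object; return how many were dropped."""
        victims = [n for n in self._objects if not self._critical[n]]
        for n in victims:
            del self._objects[n]
        return len(victims)

    def loaded(self) -> Set[str]:
        return set(self._objects)
```

    An OOM handler would call `release_non_critical()` and then retry; anything dropped comes back through `get()` the next time it is needed.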

    Another important issue is to be able to roll back without triggering yet another OOM situation. This is something that is hard to control in higher level languages.

    Also, the underlying OS must behave predictably with regard to OOM. Linux, for example, will not if memory overcommit is enabled. Many swap-enabled systems will die before they ever report the OOM to the offending application.
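
    As an illustration, a program can check the Linux overcommit setting before relying on allocation failures being reported. This sketch reads `/proc/sys/vm/overcommit_memory` (0 = heuristic, 1 = always overcommit, 2 = strict accounting); the function names are made up for the example, and on other systems it simply returns `None`.

```python
from pathlib import Path
from typing import Optional

OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

MODES = {
    0: "heuristic overcommit",
    1: "always overcommit",
    2: "strict accounting, no overcommit beyond the configured limit",
}


def read_overcommit_mode(path: Path = OVERCOMMIT_PATH) -> Optional[int]:
    """Return the Linux overcommit mode, or None when it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe(mode: Optional[int]) -> str:
    if mode is None:
        return "unknown (not Linux, or /proc is unavailable)"
    return MODES.get(mode, f"unrecognized mode {mode}")


if __name__ == "__main__":
    print(describe(read_overcommit_mode()))
```

    In modes 0 and 1 an allocation can appear to succeed and the process can still be killed later when the pages are touched, which is why in-process OOM handling is unreliable there.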

    And, there's the case when it is not your process that created the situation, so releasing memory does not help if the offending process continues to leak.

    Because of all this, it's often the big and embedded systems that employ these techniques, because they have the control over the OS and memory needed to enable them, and the discipline and motivation to implement them.

  • 2020-11-30 22:18

    There are already many good answers here. But I'd like to contribute with another perspective.

    Depletion of just about any reusable resource should be recoverable in general. The reasoning is that each and every part of a program is basically a subprogram. Just because one subprogram cannot run to its end at this very point in time does not mean that the entire state of the program is garbage. Just because the parking lot is full of cars does not mean that you trash your car. Either you wait a while for a space to be free, or you drive to a store further away to buy your cookies.

    In most cases there is an alternative way. Making an out-of-memory error unrecoverable effectively removes a lot of options, and none of us likes having someone else decide what we can and cannot do.

    The same applies to disk space. It's really the same reasoning. And contrary to your insinuation that a stack overflow is unrecoverable, I would say that this is an arbitrary limitation. There is no good reason why you should not be able to throw an exception (popping a lot of frames) and then use another, less efficient approach to get the job done.
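
    Some runtimes already allow this. As a small illustration in Python (not the language the question was about), exhausting the interpreter's recursion limit raises `RecursionError`, the frames are unwound, and the caller can fall back to a different strategy. The linked-list helpers below are hypothetical.

```python
def build(n):
    """Build a linked list of n nodes as nested (value, next) tuples."""
    node = None
    for value in range(n):
        node = (value, node)
    return node


def length_recursive(node) -> int:
    # Straightforward, but needs one stack frame per element.
    return 0 if node is None else 1 + length_recursive(node[1])


def length_iterative(node) -> int:
    # Fallback that uses constant stack space.
    count = 0
    while node is not None:
        count += 1
        node = node[1]
    return count


def length(node) -> int:
    try:
        return length_recursive(node)
    except RecursionError:
        # The deep frames have been popped, so the stack is usable again.
        return length_iterative(node)
```

    The caller of `length()` never sees the overflow; it only gets the answer by a different route.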

    My two cents :-)

  • 2020-11-30 22:20

    No. An out-of-memory error from the GC should not generally be recoverable inside the current thread. (Recoverable thread creation and termination, user or kernel, should be supported though.)

    Regarding the counterexamples: I'm currently working on a D programming language project which uses NVIDIA's CUDA platform for GPU computing. Instead of manually managing GPU memory, I've created proxy objects to leverage D's GC. So when the GPU returns an out-of-memory error, I run a full collect and only raise an exception if it fails a second time. But this isn't really an example of out-of-memory recovery; it's more one of GC integration. The other examples of recovery (caches, free lists, stacks/hashes without auto-shrinking, etc.) are all structures that have their own methods of collecting or compacting memory, which are separate from the GC and tend not to be local to the allocating function. So people might implement something like the following:

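    // OutOfMemoryError is the class in current druntime; older runtimes
    // called it OutOfMemoryException.
    import core.exception : OutOfMemoryError;

    // Compaction callbacks registered by caches, free lists, and similar structures.
    void delegate()[] Global_List_Of_Delegates_From_Compatible_Objects;

    T new2(T)(lazy T old_new) {
        T obj;
        try {
            obj = old_new;  // first attempt; evaluates the lazy expression
        } catch (OutOfMemoryError oome) {
            // Ask every registered object to release memory, then retry once.
            foreach (compact; Global_List_Of_Delegates_From_Compatible_Objects)
                compact();
            obj = old_new;  // a second failure propagates to the caller
        }
        return obj;
    }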
    

    This is a decent argument for adding support for registering and unregistering self-collecting/compacting objects to garbage collectors in general.
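
    To make the register/unregister idea concrete, here is a rough Python sketch. The function names are invented for illustration and are not part of any garbage collector's API; `Budget` is a simulated heap that raises `MemoryError`, standing in for a real allocator.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

# Hypothetical global registry of compaction callbacks.
_compactors: List[Callable[[], None]] = []


def register_compactor(fn: Callable[[], None]) -> None:
    _compactors.append(fn)


def unregister_compactor(fn: Callable[[], None]) -> None:
    _compactors.remove(fn)


def new_with_retry(allocate: Callable[[], T]) -> T:
    """Run `allocate`; on MemoryError, compact everything and retry once."""
    try:
        return allocate()
    except MemoryError:
        for compact in list(_compactors):
            compact()
        return allocate()  # a second MemoryError propagates to the caller


class Budget:
    """Simulated heap with a fixed number of free units."""

    def __init__(self, free: int) -> None:
        self.free = free

    def take(self, n: int) -> bytearray:
        if n > self.free:
            raise MemoryError(f"need {n} units, only {self.free} free")
        self.free -= n
        return bytearray(n)
```

    A cache would register a callback that hands its units back to the budget, so `new_with_retry(lambda: budget.take(6))` succeeds on the second attempt even though the first one failed.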

  • 2020-11-30 22:20

    In the general case, it's not recoverable.

    However, if your system includes some form of dynamic caching, an out-of-memory handler can often dump the oldest elements in the cache (or even the whole cache).

    Of course, you have to make sure that the "dumping" process requires no new memory allocations :) Also, it can be tricky to recover the specific allocation that failed, unless you're able to plug your cache dumping code directly at the allocator level, so that the failure isn't propagated up to the caller.
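
    A toy Python sketch of plugging cache eviction in at the allocator level, so the caller only sees a failure once the cache is empty. `CachingAllocator` and its unit-based budget are assumptions made up for the example, and Python itself gives no guarantee that the eviction path is allocation-free; only the control flow is the point here.

```python
from collections import OrderedDict


class CachingAllocator:
    """Fixed-budget allocator that evicts its oldest cache entries on failure."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used = 0
        self.cache = OrderedDict()  # key -> size in units, oldest entry first

    def _raw_alloc(self, size: int) -> None:
        if self.used + size > self.capacity:
            raise MemoryError(f"need {size}, only {self.capacity - self.used} free")
        self.used += size

    def cache_put(self, key: str, size: int) -> None:
        self._raw_alloc(size)
        self.cache[key] = size

    def alloc(self, size: int) -> int:
        """Allocate `size` units, evicting the oldest cache entries as needed."""
        while True:
            try:
                self._raw_alloc(size)
                return size
            except MemoryError:
                if not self.cache:
                    raise  # nothing left to dump; now the caller has to cope
                _, freed = self.cache.popitem(last=False)  # drop the oldest entry
                self.used -= freed
```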

  • 2020-11-30 22:21

    If you are really out of memory you are doomed, since you cannot free anything anymore.

    If you are out of memory, but something like a garbage collector can kick in and free up some memory, you are not dead yet.

    The other problem is fragmentation. Even though you might not be out of memory in total, the free memory may be fragmented, so you might still not be able to allocate the huge contiguous chunk you want.
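
    A tiny Python illustration of the fragmentation case, using a made-up free list of `(offset, length)` holes in a 100-unit heap and a first-fit search:

```python
from typing import List, Optional, Tuple


def first_fit(free_blocks: List[Tuple[int, int]], size: int) -> Optional[int]:
    """Return the offset of the first hole that can hold `size`, else None."""
    for offset, length in free_blocks:
        if length >= size:
            return offset
    return None


# Three separate 20-unit holes: plenty of free memory, none of it contiguous.
free_blocks = [(0, 20), (40, 20), (80, 20)]
total_free = sum(length for _, length in free_blocks)
largest_hole = max(length for _, length in free_blocks)
```

    Here `total_free` is 60 units, yet a 40-unit request fails because `largest_hole` is only 20.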

  • 2020-11-30 22:21

    I'm working on SpiderMonkey, the JavaScript VM used in Firefox (and GNOME and a few others). When you're out of memory, you may want to do any of the following things:

    1. Run the garbage collector. We don't run the garbage collector all the time, as it would kill performance and battery, so by the time you hit an out-of-memory error, some garbage may have accumulated.
    2. Free memory. For instance, get rid of some of the in-memory cache.
    3. Kill or postpone non-essential tasks. For instance, unload some tabs that haven't been used in a long time from memory.
    4. Log things to help the developer troubleshoot the out-of-memory error.
    5. Display a semi-nice error message to let the user know what's going on.
    6. ...

    So yes, there are many reasons to handle out-of-memory errors manually!
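
    The list above amounts to an escalation ladder. A rough Python sketch of that shape follows; this is not SpiderMonkey code, and `run_with_oom_recovery` and the step callbacks are hypothetical names.

```python
import gc
import logging
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")
log = logging.getLogger("oom")


def run_with_oom_recovery(task: Callable[[], T],
                          steps: Sequence[Callable[[], object]]) -> T:
    """Try `task`; after each MemoryError run the next recovery step and retry."""
    remaining = list(steps)
    while True:
        try:
            return task()
        except MemoryError:
            if not remaining:
                # Last resort: record the failure and let the caller report it.
                log.error("out of memory and no recovery steps left")
                raise
            step = remaining.pop(0)
            log.warning("out of memory, running %s",
                        getattr(step, "__name__", repr(step)))
            step()
```

    A caller might pass `[gc.collect, drop_cache, unload_idle_tabs]`, ordered from cheapest to most disruptive, and show the user an error message only if the final `MemoryError` escapes.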
