Is “Out Of Memory” A Recoverable Error?

闹比i 2020-11-30 21:44

I've been programming a long time, and the programs I see, when they run out of memory, attempt to clean up and exit, i.e. fail gracefully. I can't remember the last time

24 answers
  • 2020-11-30 22:22

    What is the compelling argument for making it a recoverable error?

    In Java, a compelling argument for not making it a recoverable error is that Java allows OOM to be signalled at any time, including at times when the result could be your program entering an inconsistent state. Reliable recovery from an OOM is therefore impossible; if you catch the OOM exception, you cannot rely on any of your program state. See No-throw VirtualMachineError guarantees.

  • 2020-11-30 22:23

    The question is tagged "language-agnostic", but it's difficult to answer without considering the language and/or the underlying system. (I see several other hands

    If memory allocation is implicit, with no mechanism to detect whether a given allocation succeeded or not, then recovering from an out-of-memory condition may be difficult or impossible.

    For example, if you call a function that attempts to allocate a huge array, most languages just don't define the behavior if the array can't be allocated. (In Ada this raises a Storage_Error exception, at least in principle, and it should be possible to handle that.)

    On the other hand, if you have a mechanism that attempts to allocate memory and is able to report a failure to do so (like C's malloc() or C++'s new), then yes, it's certainly possible to recover from that failure. In at least the cases of malloc() and new, a failed allocation doesn't do anything other than report failure (it doesn't corrupt any internal data structures, for example).
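
    As a minimal sketch of the point above, assuming plain C and malloc(): because a failed call only returns NULL and leaves the heap usable, the caller can simply retry with a smaller request. The helper name alloc_with_fallback and its halving policy are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: try to allocate `preferred` bytes, halving the
 * request after each failure until it would drop below `minimum`.
 * On success, stores the size actually obtained in *got.
 * Returns NULL (and sets *got to 0) if no attempt succeeded. */
void *alloc_with_fallback(size_t preferred, size_t minimum, size_t *got)
{
    size_t size = preferred;

    while (size > 0 && size >= minimum) {
        void *p = malloc(size);
        if (p != NULL) {
            *got = size;
            return p;
        }
        /* malloc() only reported failure; the heap is still usable,
         * so it is safe to try again with a smaller request. */
        size /= 2;
    }

    *got = 0;
    return NULL;
}
```

    A caller would check for NULL and, instead of terminating, report that the operation cannot be performed. Note that on systems that allocate memory lazily, malloc() may succeed even when the memory cannot later be backed, so this kind of check is not a complete guarantee.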

    Whether it makes sense to try to recover depends on the application. If the application just can't succeed after an allocation failure, then it should do whatever cleanup it can and terminate. But if the allocation failure merely means that one particular task cannot be performed, or if the task can still be performed more slowly with less memory, then it makes sense to continue operating.

    A concrete example: Suppose I'm using a text editor. If I try to perform some operation within the editor that requires a lot of memory, and that operation can't be performed, I want the editor to tell me it can't do what I asked and let me keep editing. Terminating without saving my work would be an unacceptable response. Saving my work and terminating would be better, but is still unnecessarily user-hostile.

  • 2020-11-30 22:24

    uClibc has an internal static buffer of 8 bytes or so for file I/O when there is no more memory to be allocated dynamically.

  • 2020-11-30 22:25

    OOM should be recoverable because shutdown isn't the only strategy for recovering from OOM.

    There is actually a pretty standard solution to the OOM problem at the application level. As part of your application design, determine a safe minimum amount of memory required to recover from an out-of-memory condition (e.g. the memory required to auto-save documents, bring up warning dialogs, and log shutdown data).

    At the start of your application or at the start of a critical block, pre-allocate that amount of memory. If you detect an out-of-memory condition, release your guard memory and perform recovery. The strategy can still fail, but on the whole it gives great bang for the buck.
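
    The guard-memory idea could be sketched in C roughly as follows. This is an illustration under assumptions, not a definitive implementation: the names reserve_init and reserve_release and the 64 KiB size are hypothetical, and a real application would size the reserve from its own recovery path.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed reserve size; a real application would measure what its
 * recovery path (auto-save, dialogs, logging) actually needs. */
#define RESERVE_BYTES ((size_t)64 * 1024)

static void *reserve = NULL;

/* Pre-allocate the guard memory, e.g. at startup.
 * Returns 0 on success, -1 if the reserve could not be allocated. */
int reserve_init(void)
{
    reserve = malloc(RESERVE_BYTES);
    if (reserve == NULL) {
        return -1;
    }
    /* Touch the pages so they are really backed on systems that
     * allocate memory lazily. */
    memset(reserve, 0, RESERVE_BYTES);
    return 0;
}

/* Give the guard memory back to the allocator so the recovery code
 * has room to work. Returns 1 if memory was released, 0 if the
 * reserve had already been spent (or was never set up). */
int reserve_release(void)
{
    if (reserve == NULL) {
        return 0;
    }
    free(reserve);
    reserve = NULL;
    return 1;
}
```

    A caller that sees an allocation fail would call reserve_release() and then run its recovery path; if that call returns 0, the reserve is already gone and a more drastic fallback is needed.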

    Note that the application need not shut down. It can display a modal dialog until the OOM condition has been resolved.

    I'm not 100% certain but I'm pretty sure 'Code Complete' (required reading for any respectable software engineer) covers this.

    P.S. You can extend your application framework to help with this strategy, but please don't implement such a policy in a library (good libraries do not make global decisions without an application's consent).

  • 2020-11-30 22:25

    It depends on what you mean by running out of memory.

    When malloc() fails on most systems, it's because you've run out of address-space.

    If most of that memory is taken by caching, or by mmap'd regions, you might be able to reclaim some of it by freeing your cache or unmapping. However, this really requires that you know what you're using that memory for, and as you've noticed, either most programs don't, or it doesn't make a difference.

    If you used setrlimit() on yourself (to protect against unforeseen attacks, perhaps, or maybe root did it to you), you can relax the limit in your error handler. I do this very frequently, after prompting the user if possible and logging the event.
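
    Relaxing a limit from an error handler might look like the sketch below. It assumes a POSIX system and that only the soft limit was lowered; an unprivileged process can raise its soft limit up to the hard limit but not beyond it. The function name relax_soft_limit is hypothetical.

```c
#include <sys/resource.h>

/* Raise the soft limit of `resource` (for example RLIMIT_AS) to the
 * current hard limit.
 * Returns 1 if the soft limit was raised, 0 if it was already at the
 * hard limit (nothing left to relax), and -1 on error. */
int relax_soft_limit(int resource)
{
    struct rlimit lim;

    if (getrlimit(resource, &lim) != 0) {
        return -1;
    }
    if (lim.rlim_cur == lim.rlim_max) {
        return 0;
    }

    lim.rlim_cur = lim.rlim_max;
    if (setrlimit(resource, &lim) != 0) {
        return -1;
    }
    return 1;
}
```

    An error handler could call relax_soft_limit(RLIMIT_AS) after a failed allocation and retry only when the result is 1; a result of 0 means the limit was not the obstacle or cannot be relaxed any further.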

    On the other hand, catching stack overflow is a bit more difficult, and isn't portable. I wrote a posixish solution for ECL, and described a Windows implementation, if you're going this route. It was checked into ECL a few months ago, but I can dig up the original patches if you're interested.

  • 2020-11-30 22:26

    Out of memory can be caused either by free memory depletion or by trying to allocate an unreasonably big block (like one gigabyte). In "depletion" cases, the memory shortage is global to the system and usually affects other applications and system services; the whole system might become unstable, so it's wise to forget it and reboot. In "unreasonably big block" cases, no shortage actually occurs and it's safe to continue. The problem is that you can't automatically detect which case you're in. So it's safer to make the error non-recoverable and find a workaround for each case where you encounter it: make your program use less memory, or in some cases just fix bugs in the code that invokes memory allocation.
