Handling stack overflows in embedded systems

终归单人心 2021-02-04 15:35

In embedded software, how do you handle a stack overflow in a generic way? I have come across some processors that protect against it in hardware, such as recent AMD processors. There are

4 answers
  •  -上瘾入骨i
    2021-02-04 16:18

    Ideally you write your code with static stack usage (no recursive calls). Then you can evaluate maximum stack usage by:

    1. static analysis (using tools)
    2. measurement of stack usage while running your code with complete code coverage, or with coverage as high as you can reach, until you are reasonably confident you have established the extent of stack usage. This assumes your rarely-run code doesn't use significantly more stack than the normal execution paths.
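One common way to do the measurement in option 2 is "stack painting": fill the stack region with a known pattern before it is used, run the system, then count how much of the pattern has been overwritten. The following is a minimal sketch, assuming a descending stack and 32-bit words. The names `stack_paint`, `stack_peak_usage_words` and `STACK_FILL_PATTERN` are hypothetical, and on a real target the region bounds would come from the linker script rather than from a plain array.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xA5A5A5A5u  /* hypothetical fill value */

/* Paint the whole stack region with the fill pattern. On a real target
 * this would run in startup code, before the stack is in use. */
void stack_paint(uint32_t *stack_base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        stack_base[i] = STACK_FILL_PATTERN;
    }
}

/* Return the peak number of words used so far. Assumes a descending
 * stack: usage starts at the highest address and grows toward
 * stack_base[0], so untouched words are counted from the low end. */
size_t stack_peak_usage_words(const uint32_t *stack_base, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && stack_base[untouched] == STACK_FILL_PATTERN) {
        untouched++;
    }
    return words - untouched;
}
```

The result is only a lower bound on the worst case: it reflects the paths actually exercised, and a stacked value that happens to equal the pattern at the deepest point would make usage look slightly smaller than it was.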

    But even with that, you still want to have a means of detecting and then handling stack overflow if it occurs, if at all possible, for more robustness. This can be especially helpful during the project's development phase. Some methods to detect overflow:

    1. If the processor supports a memory read/write interrupt (i.e. a memory access breakpoint interrupt), configure it to trigger on the furthest extent of the stack area.
    2. In the memory map configuration, set up a small (or large) block of RAM that is a "stack guard" area. Fill it with known values. In the embedded software, regularly (as often as reasonably possible) check the contents of this area. If they ever change, assume a stack overflow.
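The stack guard approach in method 2 could look roughly like the sketch below. It is only an illustration: `stack_guard`, `GUARD_WORDS`, `GUARD_PATTERN` and both function names are hypothetical, and here the guard is an ordinary array so the code can run anywhere. On a real target the guard block would be placed by the linker script immediately beyond the stack's limit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GUARD_WORDS   8u            /* hypothetical guard size */
#define GUARD_PATTERN 0xDEADBEEFu   /* hypothetical known value */

/* Assumption: on target, this block sits just past the end of the stack.
 * volatile, because an overflow modifies it behind the compiler's back. */
static volatile uint32_t stack_guard[GUARD_WORDS];

/* Fill the guard area with the known value; call once at startup. */
void stack_guard_init(void)
{
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        stack_guard[i] = GUARD_PATTERN;
    }
}

/* Return true if every guard word still holds the known value.
 * Call this as often as reasonably possible, e.g. from a timer tick. */
bool stack_guard_intact(void)
{
    for (size_t i = 0; i < GUARD_WORDS; i++) {
        if (stack_guard[i] != GUARD_PATTERN) {
            return false;
        }
    }
    return true;
}
```

A larger guard makes it less likely that a big stack frame jumps over the area without writing to it, at the cost of RAM. Detection is also delayed until the next check, so the overflow may already have done damage by the time it is noticed.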

    Once you've detected it, then you need to handle it. I don't know of many ways that code can gracefully recover from a stack overflow, because once it's happened, your program logic is almost certainly invalidated. So all you can do is

    1. log the error
      1. Logging the error is very useful, because otherwise the symptoms (unexpected reboots) can be very hard to diagnose.
      2. Caveat: The logging routine must be able to run reliably even in a corrupted-stack scenario, so it should be simple. For example, with a corrupted stack, you probably can't write to EEPROM using your fancy EEPROM-writing background task. Maybe just log the error into a struct that is reserved for this purpose, in non-init RAM, which can then be checked after reboot.
    2. Reboot (or perhaps shut down, especially if the error recurs repeatedly)
      1. Possible alternative: restart just the particular task, if you're using an RTOS, and your system is designed so the stack corruption is isolated, and all the other tasks are able to handle that task restarting. This would take some serious design consideration.
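The "log into a reserved struct in non-init RAM" idea might be sketched as follows. Everything named here (`fault_record_t`, `fault_record_store`, `fault_record_take`, `FAULT_MAGIC`, the fault code) is hypothetical. How the object is kept out of zero-initialization is toolchain-specific and not shown: typically it is placed in a dedicated linker section that the startup code leaves alone. The magic value and inverted copy let the boot code tell a genuine record from the random contents of RAM after a cold power-up.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAULT_MAGIC          0x5AF0C0DEu  /* hypothetical validity marker */
#define FAULT_STACK_OVERFLOW 1u           /* hypothetical fault code */

typedef struct {
    uint32_t magic;          /* FAULT_MAGIC when the record is valid */
    uint32_t code;           /* what went wrong */
    uint32_t code_inverted;  /* ~code, a cheap integrity check */
} fault_record_t;

/* Assumption: on target, this object is placed in a RAM section that the
 * startup code does not clear, so it survives a warm reset. */
static volatile fault_record_t fault_record;

/* Record the fault. Kept deliberately tiny: no calls and no buffers,
 * so it needs almost no stack. Reboot right after calling it. */
void fault_record_store(uint32_t code)
{
    fault_record.code = code;
    fault_record.code_inverted = ~code;
    fault_record.magic = FAULT_MAGIC;  /* written last: marks it complete */
}

/* Call early after reset. Returns true and writes *code_out if a valid
 * record was left by the previous run; always clears the record. */
bool fault_record_take(uint32_t *code_out)
{
    uint32_t magic = fault_record.magic;
    uint32_t code = fault_record.code;
    uint32_t inverted = fault_record.code_inverted;

    fault_record.magic = 0u;
    if (magic != FAULT_MAGIC || code != (uint32_t)~inverted) {
        return false;
    }
    *code_out = code;
    return true;
}
```

After reboot, with a healthy stack again, the startup code can call `fault_record_take` and only then hand the code to the slower persistent logging path, such as the EEPROM task.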
