How do I debug a difficult-to-reproduce crash with no useful call stack?

一向 2020-12-28 18:23

I am encountering an odd crash in our software and I'm having a lot of trouble debugging it, so I am seeking SO's advice on how to tackle it.

4 Answers
  • 2020-12-28 18:57

    That is the reason I made the Process Stack Viewer :-) http://code.google.com/p/asmprofiler/wiki/ProcessStackViewer

    It can show the stack with raw stack tracing, so it will show the complete stack when normal stack tracing is not possible.
    But beware: raw stack tracing will show "false positives"! Any address on the stack for which a function name can be found will be listed.
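
    To illustrate why raw tracing produces false positives, here is a minimal Python sketch of the idea, not the tool's actual implementation. The symbol table, the function names in it and the `CODE_END` bound are all hypothetical values made up for the example.

```python
from bisect import bisect_right

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [
    (0x00401000, "TMainForm.FormCreate"),
    (0x00401200, "TWorker.Execute"),
    (0x00401500, "TList.Get"),
]
CODE_START = SYMBOLS[0][0]
CODE_END = 0x00402000  # hypothetical end of the code section


def raw_stack_trace(stack_words):
    """List every stack word that falls inside a known function.

    No frame chain is followed, so a stale value left over from an
    earlier call is reported exactly like a live return address.
    """
    starts = [start for start, _ in SYMBOLS]
    hits = []
    for word in stack_words:
        if CODE_START <= word < CODE_END:
            # Find the last symbol whose start address is <= word.
            start, name = SYMBOLS[bisect_right(starts, word) - 1]
            hits.append(f"{name}+0x{word - start:X}")
    return hits


# Two of the four words resolve to a function; the scan alone cannot
# tell which of them, if any, is a real return address.
stack = [0x0012FF40, 0x00401234, 5, 0x00401510]
print(raw_stack_trace(stack))
```

    Every word that happens to resolve to a symbol is reported, so the output still has to be checked by hand against what the code could actually have called.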

    It helped me a number of times when I ran into the same problem as yours (no normal stack walking by Delphi possible due to an invalid stack state).

    Edit: new version uploaded; the website had an old version (I use the new version a lot myself): http://asmprofiler.googlecode.com/files/AsmProfiler_Sampling%20v1.0.7.13.zip

  • 2020-12-28 19:08

    Threading may be the reason here. The usual suspects are threads that use OVERLAPPED structures on the stack and threads that send pointers to objects that are on the stack to other threads.

    It may be possible to recover partial stack information if you use the Debugging Tools for Windows and use the "dps" command.

  • 2020-12-28 19:10

    Even when the IDE-provided stack trace isn't very complete, that doesn't mean there isn't still useful information on the stack. Open up the CPU view and check out the stack pane; for every CALL opcode, a return address is pushed on the stack. Since the stack grows downwards, you'll find these return addresses above the current stack location, i.e. by scrolling upwards in the stack pane.

    The stack for the main thread will be somewhere around $00120000 or $00180000 (address space randomization in Vista and upwards has made it more random). Code for the main executable will be somewhere around $00400000. You can speculatively investigate elements on the stack that don't look like integer data (low values) or stack addresses ($00120000+ range) by right-clicking on the stack entry and selecting Follow -> Near Code, which will cause the disassembly window to jump to that code address. If it looks like invalid code, it's probably not a valid entry in the stack trace. If it's valid code, it may be OS code (frequently around $77000000 and above) in which case you won't have meaningful symbols, but every so often you'll hit on an actual proper stack entry.
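
    The triage described above can be sketched as a small Python classifier. This is only an illustration under stated assumptions: the rough locations come from the paragraph above, but the exact cut-off values (`SMALL_INT_LIMIT`, the upper ends of `STACK_RANGE` and `EXE_RANGE`) are guesses made up for the example and will differ per process.

```python
# All bounds are assumptions for illustration; real layouts vary,
# especially with address space randomization.
SMALL_INT_LIMIT = 0x00010000            # below this: probably integer data
STACK_RANGE = (0x00120000, 0x00400000)  # main-thread stack region (approx.)
EXE_RANGE = (0x00400000, 0x00800000)    # main executable code (approx.)
OS_CODE_START = 0x77000000              # OS DLLs frequently load up here


def classify(word):
    """Guess what a 32-bit stack entry is, based only on its value."""
    if word < SMALL_INT_LIMIT:
        return "integer data"
    if STACK_RANGE[0] <= word < STACK_RANGE[1]:
        return "stack address"
    if EXE_RANGE[0] <= word < EXE_RANGE[1]:
        return "executable code?"
    if word >= OS_CODE_START:
        return "OS code?"
    return "unknown"


def worth_following(stack_words):
    """Keep only entries worth a Follow -> Near Code check."""
    return [w for w in stack_words if classify(w) == "executable code?"]


for entry in (0x0000002A, 0x0012FF88, 0x00451A3C, 0x7C90E514):
    print(f"${entry:08X}  {classify(entry)}")
```

    The question marks in the labels are deliberate: a value in the code range is only a candidate, and the disassembly at that address still has to look like valid code just after a CALL.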

    This technique, though somewhat laborious, can get you meaningful stack trace info when the debugger isn't able to trace things through. It doesn't help you if ESP (the stack pointer) has been screwed with, though. Fortunately, that's pretty rare.

  • 2020-12-28 19:13

    I'm not 100% sure, but from the image you provided I believe that somewhere during execution you're trying to access an object in a TList that is nil, i.e.:

    AList[Index].SomeProperty/SomeMethod/etc. <-- error if AList[Index] = nil
    

    Debugging and finding the actual place where the exception is raised is never an easy task, especially when there's not much information or the problem is hard to reproduce. In this case I usually:

    • go step by step from the main form's execution (if there is no exception up to that point)

    • while going step by step, if I find any unsafe code, I wrap it in try...except and add checks on indexes (if I have arrays, lists, expected values to be passed, etc.)

    • if the above fails to find the issue, check if some libraries are failing

    • use EurekaLog; it sometimes fails as well (very rarely), but it usually points you in the right direction

    I have had numerous issues similar to yours, and I can tell you that the issue was almost always extremely easy to fix; however, when the error popped up, I did not get pointed anywhere near the error.
