What to make of an “impossible” stack trace after a crash?

坚强是说给别人听的谎言 提交于 2019-12-23 12:22:27

问题


My program appears to be suffering from a very hard-to-reproduce bug: Once in a blue moon, when a user puts his Mac to sleep and later on wakes it back up again, one of my program's child processes will crash immediately after the Mac wakes up.

When this happens Apple's crash-reporter mechanism reliably reports a stack trace like this one:

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib          0x967f9a6a __pthread_kill + 10
1   libsystem_c.dylib               0x9003dacf pthread_kill + 101
2   libsystem_c.dylib               0x900744f8 abort + 168
3   com.meyersound.VirtualD-Mitri   0x0014438e muscle::CrashSignalHandler(int) + 190
4   libsystem_c.dylib               0x9002886b _sigtramp + 43
5   ???                             0xffffffff 0 + 4294967295
6   com.meyersound.VirtualD-Mitri   0x001442d0 muscle::ParsePortArg(muscle::Message const&, muscle::String const&, unsigned short&, unsigned long) + 80
7   com.meyersound.VirtualD-Mitri   0x005b3393 qnet::RepDBPeer::Pulse(muscle::PulseNode::PulseArgs const&) + 1187
8   com.meyersound.VirtualD-Mitri   0x0015717b muscle::PulseNode::PulseAux(unsigned long long) + 203
9   com.meyersound.VirtualD-Mitri   0x000cfb90 muscle::ReflectServer::ServerProcessLoop() + 3232
10  com.meyersound.VirtualD-Mitri   0x00607c7e dcasldmain(int, char**) + 2222
11  com.meyersound.VirtualD-Mitri   0x0072c14d dmitridmain(int, char**) + 4749
12  com.meyersound.VirtualD-Mitri   0x0000bc3a main + 4938
13  com.meyersound.VirtualD-Mitri   0x000061ab _start + 209
14  com.meyersound.VirtualD-Mitri   0x000060d9 start + 41

This is all well and good, except (cue eerie music) -- it's logically impossible. In particular, not only does my RepDBPeer::Pulse() method never call ParsePortArg, the crashing process never calls ParsePortArg anywhere! (I grepped all of my source code twice to make sure)

So my question is, just what is this stack trace trying to tell me? Is this most likely a case of Thread 0's stack getting corrupted badly enough that the stack-trace mechanism has gone off the rails and fingered an innocent bystander as the culprit? Or is it possible that Apple's wakeup mechanism somehow "jumped" the program counter into ParsePortArg() (whereupon the resulting confusion caused the crash)? Or is there some other deeper magic going on here that I can't even imagine?

The crashing process in question is a vanilla non-GUI background process that is a child process spawned by a Qt GUI process, for what that's worth.


回答1:


I'm going to assume you have some amount of optimization turned on. There's no magic to stack traces. They become increasingly more fuzzy (read as "less accurate") once code is inlined or omitted, which is precisely what a C++ optimizer does. In the case of ParsePortArg, there is a +80 at the end of that line, meaning 80 bytes ahead of the entry point of that function in the code segment. This indicates the true address of the instruction pointer at 0x001442d0, and ParsePortArg is the nearest symbol that the stack dump guessed at. You were right to assume it was a red herring.

Barring any other data, I would make the very conservative guess that your program is expecting some pointer to remain valid that is not valid on wakeup from sleep. Look at the disassembly for the instruction at that address. I bet a memory address is trying to be read.



来源:https://stackoverflow.com/questions/16227845/what-to-make-of-an-impossible-stack-trace-after-a-crash

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!