Solving random crashes


I am getting random crashes in my C++ application. It may not crash for a month and then crash 10 times in an hour; sometimes it may crash on launch, while sometimes it may


17 answers

遇见更好的自我

2021-01-31 09:17
It sounds like your program is suffering from memory corruption. As already said, your best option on Linux is probably valgrind. But here are two other options:
- First of all, use a debug malloc. Nearly all C libraries offer a debug malloc implementation that initializes memory (a normal malloc leaves the "old" contents in memory), checks the boundaries of each allocated block for corruption, and so on. If that's not enough, there is a wide choice of third-party implementations.
- You might want to have a look at VMware Workstation. I have not set it up that way myself, but according to their marketing materials it supports a rather interesting way of debugging: run the debuggee in a "recording" virtual machine. When memory corruption occurs, set a memory breakpoint at the corrupted address and then turn back time in the VM to exactly the moment when that piece of memory was overwritten. See this PDF on how to set up replay debugging with Linux/gdb. I believe there is a 15- or 30-day demo of Workstation 7, which might be enough to shake those bugs out of your code.
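
To make the debug-malloc idea a bit more concrete, here is a minimal sketch of the two checks mentioned above: filling fresh memory with a recognizable pattern and putting guard bytes around each block. The names `guarded_alloc`, `guarded_intact` and `guarded_free` are made up for this example and are not any real library's API; an actual debug allocator does a lot more than this.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical illustration of what a debug malloc does; not a real library API.
namespace dbg {

constexpr std::size_t kHeaderSize = 16;     // holds the requested size (assumes sizeof(size_t) <= 16)
constexpr std::size_t kGuardSize = 16;      // guard bytes on each side of the user block
constexpr unsigned char kGuardByte = 0xFD;  // pattern written into the guards
constexpr unsigned char kFillByte = 0xCD;   // pattern written into fresh user memory

// Layout of one allocation: [header][front guard][user bytes][back guard]
inline void* guarded_alloc(std::size_t size) {
    const std::size_t total = kHeaderSize + 2 * kGuardSize + size;
    auto* raw = static_cast<unsigned char*>(std::malloc(total));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &size, sizeof(size));             // remember the requested size
    unsigned char* front = raw + kHeaderSize;
    unsigned char* user = front + kGuardSize;
    std::memset(front, kGuardByte, kGuardSize);        // guard before the block
    std::memset(user, kFillByte, size);                // no stale "old" contents
    std::memset(user + size, kGuardByte, kGuardSize);  // guard after the block
    return user;
}

// Returns true if neither guard has been overwritten.
inline bool guarded_intact(const void* p) {
    const auto* user = static_cast<const unsigned char*>(p);
    const unsigned char* front = user - kGuardSize;
    std::size_t size = 0;
    std::memcpy(&size, front - kHeaderSize, sizeof(size));
    for (std::size_t i = 0; i < kGuardSize; ++i) {
        if (front[i] != kGuardByte) return false;        // underrun detected
        if (user[size + i] != kGuardByte) return false;  // overrun detected
    }
    return true;
}

inline void guarded_free(void* p) {
    if (p == nullptr) return;
    auto* user = static_cast<unsigned char*>(p);
    std::free(user - kGuardSize - kHeaderSize);
}

}  // namespace dbg
```

Calling `dbg::guarded_intact(p)` before `dbg::guarded_free(p)` catches a buffer that was overrun by even a single byte, which is the kind of damage that otherwise shows up much later as a "random" crash somewhere else.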
悲&欢浪女

2021-01-31 09:18
First, you are lucky that your process crashes multiple times in a short time-period. That should make it easy to proceed.

This is how you proceed.
- Get a crash dump
- Isolate a set of potential suspicious functions
- Tighten up state checking
- Repeat
Get a crash dump

First, you really need to get a crash dump.

If you don't get crash dumps when it crashes, start with writing a test that produces reliable crash dumps.

Re-compile the binary with debug symbols or make sure that you can analyze the crash dump with debug symbols.
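
If the reason you get no dumps is that core files are disabled for the process, something like the following may help. This is only a sketch and assumes a POSIX system such as Linux (it does not apply to a native Windows build); `enable_core_dumps` is a made-up helper name. Whether and where a core file is actually written also depends on system settings outside the program.

```cpp
#include <sys/resource.h>

// Hypothetical helper: raise the soft core-file size limit to the hard limit
// for the current process. Assumes a POSIX system. Returns true on success.
inline bool enable_core_dumps() {
    struct rlimit lim {};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;  // could not read the current limits
    }
    lim.rlim_cur = lim.rlim_max;  // soft limit may be raised up to the hard limit
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Call it once early in `main`, and build with debug symbols (for example `-g` with GCC) so the resulting dump is readable in the debugger.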

Find suspicious functions

Given that you have a crash dump, look at it in gdb or your favorite debugger and remember to show all threads! It might not be the thread you see in gdb that is buggy.

Looking at where gdb says your binary crashed, isolate some set of functions you think might cause the problem.

Looking at multiple crashes and isolating code sections that are commonly active in all of the crashes is a real time-saver.

Tighten up state checking

A crash usually happens because of some inconsistent state. The best way to proceed is often to tighten the state requirements. You do this in the following way.

For each function you think might cause the problem, document what legal state the input or the object must have on entry to the function. (Do the same for what legal state it must have on exit from the function, but that's not too important).

If the function contains a loop, document the legal state it needs to have at the beginning of each loop iteration.

Add asserts for all such expressions of legal state.
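
As a rough illustration of the three kinds of checks just described (state on entry, state at the start of each loop iteration, state on exit), here is a small sketch. The function `lower_bound_index` is a made-up example and has nothing to do with the asker's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: index of the first element >= key in a sorted vector.
inline std::size_t lower_bound_index(const std::vector<int>& v, int key) {
    // Legal state on entry: the input must be sorted.
    assert(std::is_sorted(v.begin(), v.end()));

    std::size_t lo = 0;
    std::size_t hi = v.size();
    while (lo < hi) {
        // Legal state at the beginning of each iteration:
        // the window stays inside the vector, ...
        assert(hi <= v.size());
        // ... everything before lo is smaller than key, ...
        assert(lo == 0 || v[lo - 1] < key);
        // ... and everything from hi onwards is not smaller than key.
        assert(hi == v.size() || v[hi] >= key);

        const std::size_t mid = lo + (hi - lo) / 2;
        if (v[mid] < key) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }

    // Legal state on exit: the window has collapsed to a valid position.
    assert(lo == hi);
    assert(lo <= v.size());
    return lo;
}
```

If some other part of the program hands this function an unsorted vector, the very first assert fires at the point where the bad state enters, instead of the program misbehaving later in unrelated code.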

Repeat

Then repeat the process. If it still crashes outside of your asserts, tighten the asserts further. At some point the process will fail on an assert instead of crashing at some random place. At this point you can concentrate on trying to figure out what made your program go from a legal state on entry to the function to an illegal state at the point where the assert fired.

If you pair the asserts with verbose logging it should be easier to follow what the program does.
情书的邮戳

2021-01-31 09:18

There are a lot of good answers here, but no one has yet touched on the Lua angle.

Lua is generally pretty well behaved, but it is still possible for it to cause memory corruption or crashing if e.g. the Lua stack overflows or underflows, or bad bytecode is executed.

One easy thing you can do that will detect many such errors is to define the lua_assert macro in luaconf.h. Defining this (to e.g. standard C's assert) will enable a variety of sanity checks inside the Lua core.
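
A minimal sketch of that definition, assuming your Lua version's core picks up a `lua_assert` that is already defined when its internal headers are processed. Where exactly in luaconf.h you put it is up to you; the two lines are plain preprocessor directives, so they are valid in both C and C++.

```cpp
/* Added to luaconf.h: route Lua's internal lua_assert to the standard assert. */
#include <assert.h>
#define lua_assert(c) assert(c)
```

Keep in mind that the standard assert is itself compiled out when NDEBUG is defined, so these checks only fire in builds where NDEBUG is not set, and Lua has to be rebuilt for the change to take effect.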

长发绾君心

2021-01-31 09:20

Start the program under a debugger (I'm sure there is a debugger that comes with GCC and MinGW) and wait until it crashes there. At the point of the crash you will be able to see which specific action is failing and look into the assembly code, registers, and memory state. This will often help you find the cause of the problem.

傲寒

2021-01-31 09:21
1. Start logging. Put logging statements in places where you think the code is flaky. Focus on testing the code, and repeat until you narrow down the problem to a module or a function.
2. Put asserts everywhere!
3. While you are at it, only put one expression in each assert.
4. Write a unit test for the code you think is failing. That way you can exercise the code in isolation from the rest of your runtime environment.
5. Write more automated tests that exercise the problematic code.
6. Do not add more code on top of the bad code that is failing. That's just a dumb idea.
7. Learn how to write out mini-dumps and do post-mortem debugging. It looks like others here have explained that quite well.
8. Exercise the bad code in as many different ways as you can to make sure you can isolate the bug.
9. Use a debug build. Run the debug build under the debugger if possible.
10. Trim down your application by removing binaries, modules, etc., if possible, so that you have an easier time reproducing the bug.
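
To illustrate points 2 and 3, here is a minimal sketch. The `Buffer` struct and the `check_buffer` function are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical type used only for this illustration.
struct Buffer {
    const char* data;
    std::size_t size;
    std::size_t capacity;
};

inline void check_buffer(const Buffer& b) {
    // Harder to diagnose, because a failure does not say which half was false:
    //   assert(b.data != nullptr && b.size <= b.capacity);

    // One expression per assert: the failing line names the broken condition.
    assert(b.data != nullptr);
    assert(b.size <= b.capacity);
}
```

When a combined assert fires, the message shows the whole expression and you still have to work out which part failed. With separate asserts, the reported line tells you directly.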