How do you reproduce bugs that occur sporadically?

离开以前 2021-01-30 19:43

We have a bug in our application that does not occur every time, so we don't know its "logic". Today I could not reproduce it even once in 100 attempts.

Disclaime

28 answers
  • 2021-01-30 20:34

    This varies (as you say), but some things that can help are:

    • Break into the debugger the moment the problem occurs and dump all the threads (or the equivalent, such as dumping core immediately).
    • Run with logging turned on but otherwise entirely in release/production mode. (This is possible in some environments, such as C and Rails, but not in many others.)
    • Make the edge conditions on the machine worse: force low memory, high load, more threads, or more requests being served.
    • Make sure you are really listening to what the users who hit the problem are saying, and that they are explaining the relevant details. This seems to be the one that trips people up in the field a lot. Trying to reproduce the wrong problem is boring.
    • Get used to reading assembly produced by optimizing compilers. This seems to stop people sometimes, and it isn't applicable to all languages/platforms, but it can help.
    • Be prepared to accept that it is your (the developer's) fault. Don't fall into the trap of insisting the code is perfect.
    • Sometimes you need to track the problem down on the very machine it is happening on.
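
    One way to act on the "more threads / more requests" advice is a small stress harness. The sketch below is only an illustration, written in C# because that is the language used elsewhere in this thread; StressHarness and its parameters are made-up names, and operation stands in for whatever code path you suspect.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of any framework.
public static class StressHarness
{
    // Runs `operation` from `workers` threads at once, `iterations` times each,
    // and records every failure together with the context needed to investigate it.
    public static ConcurrentQueue<string> Run(Action operation, int workers, int iterations)
    {
        var failures = new ConcurrentQueue<string>();
        using (var startGate = new ManualResetEventSlim(false))
        {
            var tasks = new Task[workers];
            for (int w = 0; w < workers; w++)
            {
                int workerId = w; // copy so each closure sees its own value
                tasks[w] = Task.Factory.StartNew(() =>
                {
                    startGate.Wait(); // release all workers together to maximize contention
                    for (int i = 0; i < iterations; i++)
                    {
                        try
                        {
                            operation();
                        }
                        catch (Exception ex)
                        {
                            failures.Enqueue(
                                $"worker={workerId} iteration={i} " +
                                $"thread={Environment.CurrentManagedThreadId} " +
                                $"{ex.GetType().Name}: {ex.Message}");
                        }
                    }
                }, TaskCreationOptions.LongRunning);
            }

            startGate.Set();
            Task.WaitAll(tasks);
        }
        return failures;
    }
}
```

    Call it with something like StressHarness.Run(() => SuspectCall(), 32, 10000), where SuspectCall is your own entry point, and log whatever comes back. It only catches failures that surface as exceptions; silent corruption still needs checks inside the code itself.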
  • 2021-01-30 20:36

    There's a good chance your application is MTWIDNTBMT (Multi Threaded When It Doesn't Need To Be Multi Threaded), or maybe just multi-threaded (to be polite). A good way to reproduce sporadic errors in multi-threaded applications is to sprinkle code like this around (C#):

    // Pause this thread for a random 0-1999 ms to perturb thread timing.
    Random rnd = new Random();
    System.Threading.Thread.Sleep(rnd.Next(2000));
    

    and/or this:

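    // Busy-wait to tie up a core. The counter must be long: 4000000000 exceeds
    // int.MaxValue, so an int counter would overflow before reaching the bound.
    for (long i = 0; i < 4000000000; i++)
    {
        // tight loop
    }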
    

    to simulate threads completing their tasks at different times than usual or tying up the processor for long stretches.

    I've inherited many buggy, multi-threaded apps over the years, and code like the above examples usually makes the sporadic errors occur much more frequently.
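
    A possible refinement, assuming you want a failing run to be easier to replay: draw the delays from a seeded generator and log the seed. Jitter below is a hypothetical helper, not a framework class. Note that the same seed only reproduces the same sequence of delays; the OS scheduler still decides the actual interleaving, so this narrows the search rather than guaranteeing a repeat.

```csharp
using System;
using System.Threading;

// Hypothetical helper: random pauses drawn from one seeded generator.
public sealed class Jitter
{
    private readonly Random _random;
    private readonly object _lock = new object(); // System.Random is not thread-safe

    // Log this value at startup so a failing run can be replayed with the same delays.
    public int Seed { get; }

    public Jitter(int seed)
    {
        Seed = seed;
        _random = new Random(seed);
    }

    // Returns the next delay in the range [0, maxMilliseconds).
    public int NextDelay(int maxMilliseconds)
    {
        lock (_lock)
        {
            return _random.Next(maxMilliseconds);
        }
    }

    // Sleeps the calling thread for the next delay.
    public void Pause(int maxMilliseconds)
    {
        Thread.Sleep(NextDelay(maxMilliseconds));
    }
}
```

    Create one instance at startup, for example new Jitter(Environment.TickCount), write its Seed to the log, and call Pause(2000) at the suspicious spots in place of the bare Thread.Sleep.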

  • 2021-01-30 20:36

    All of the above, plus throw a brute-force, semi-random software robot at it, and scatter plenty of assert/verify checks (C/C++; other languages probably have equivalents) through the code.
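
    A minimal sketch of both ideas together, in C# to match the snippet earlier in this thread. Everything here is invented for illustration: Inventory is a toy stand-in for your real component, Check.That is a hand-rolled verify, and RandomDriver is the "robot". The built-in System.Diagnostics.Debug.Assert would also work, but it is compiled out unless DEBUG is defined, so the sketch uses a check that fires in any build.

```csharp
using System;

public static class Check
{
    // Always-on verify: throws instead of silently continuing with bad state.
    public static void That(bool condition, string message)
    {
        if (!condition) throw new InvalidOperationException("Check failed: " + message);
    }
}

// Toy component under test: a stock count that must never go negative.
public sealed class Inventory
{
    public int Count { get; private set; }

    public void Add(int amount)
    {
        Check.That(amount >= 0, "Add called with a negative amount");
        Count += amount;
        Check.That(Count >= 0, "count went negative after Add");
    }

    public bool TryRemove(int amount)
    {
        Check.That(amount >= 0, "TryRemove called with a negative amount");
        if (amount > Count) return false;
        Count -= amount;
        Check.That(Count >= 0, "count went negative after TryRemove");
        return true;
    }
}

public static class RandomDriver
{
    // Applies `steps` random operations. Returns true if every check held;
    // otherwise `failure` carries the seed and step needed to replay the run.
    public static bool TryRun(int seed, int steps, out string failure)
    {
        var random = new Random(seed);
        var inventory = new Inventory();
        int expected = 0; // trivial reference model to compare against

        for (int step = 0; step < steps; step++)
        {
            try
            {
                int amount = random.Next(10);
                if (random.Next(2) == 0)
                {
                    inventory.Add(amount);
                    expected += amount;
                }
                else if (inventory.TryRemove(amount))
                {
                    expected -= amount;
                }
                Check.That(inventory.Count == expected, "count diverged from the model");
            }
            catch (InvalidOperationException ex)
            {
                failure = $"seed={seed} step={step}: {ex.Message}";
                return false;
            }
        }

        failure = "";
        return true;
    }
}
```

    Run it in a loop over many seeds and keep the first failure message; because the driver is single-threaded, rerunning with the reported seed replays exactly the same sequence of operations.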

  • 2021-01-30 20:38

    Assuming you're on Windows, and your "bug" is a crash or some sort of corruption in unmanaged code (C/C++), take a look at Application Verifier from Microsoft. The tool has a number of stops that can be enabled to verify things at run time. If you have an idea of the scenario in which your bug occurs, try running through that scenario (or a stress version of it) with AppVerifier running. Make sure to either turn on pageheap in AppVerifier, or consider compiling your code with the /RTCcsu switch (see http://msdn.microsoft.com/en-us/library/8wtf2dfz.aspx for more information).
