I'm primarily a C++ coder, and thus far, have managed without really writing tests for all of my code. I've decided this is a Bad Idea(tm), after adding new features that subt…
What sort of tests would you write to constantly check those sites live?
UnitTests target small sections of code you write. UnitTests do not confirm that things are ok in the world. You should instead define application behavior for those imperfect scenarios. Then you can UnitTest your application in those imperfect scenarios.
"for instance a crawler"
A crawler is a large body of code you might write. It has several different parts: one part might fetch a webpage, another might analyze the HTML. Even these parts may be too large to write a unit test against.
How do you test that an application will do its job in the face of a degraded network stack? The developer tests it personally when he first writes it, by purposely setting up a gateway that drops packets to simulate a bad network.
If a test uses the network, it's not a UnitTest.
A UnitTest (which must target your code) cannot call the network. You didn't write the network. The UnitTest should involve a mock network with simulated (but consistent each time) packet loss.
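A minimal sketch of that idea in C++ (INetwork, FlakyNetwork and fetchWithRetry are hypothetical names for illustration): the code under test talks to an interface rather than real sockets, and the mock fails deterministically, so the test behaves the same on every run.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical interface the crawler depends on instead of the real socket layer.
struct INetwork {
    virtual ~INetwork() = default;
    virtual std::optional<std::string> fetch(const std::string& url) = 0;
};

// Mock that fails deterministically on the first N calls, then succeeds.
class FlakyNetwork : public INetwork {
public:
    explicit FlakyNetwork(int failures) : failures_left_(failures) {}
    std::optional<std::string> fetch(const std::string&) override {
        if (failures_left_-- > 0) return std::nullopt;   // simulated packet loss / timeout
        return std::string("<html>ok</html>");
    }
private:
    int failures_left_;
};

// Code under test: retries up to maxAttempts before giving up.
std::optional<std::string> fetchWithRetry(INetwork& net, const std::string& url, int maxAttempts) {
    for (int i = 0; i < maxAttempts; ++i) {
        if (auto page = net.fetch(url)) return page;
    }
    return std::nullopt;
}

int main() {
    FlakyNetwork flaky(2);                       // first two fetches fail
    assert(fetchWithRetry(flaky, "http://example.com", 3).has_value());   // survives the losses

    FlakyNetwork dead(100);                      // network never recovers
    assert(!fetchWithRetry(dead, "http://example.com", 3).has_value());   // gives up cleanly
    return 0;
}
```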
"Unit tests seem to test the code only in perfect conditions"
UnitTests test your code in defined conditions. If you're only capable of defining perfect conditions, your statement is true. If you're capable of defining imperfect conditions, your statement is false.
Work through any decent book on unit testing and you'll find that it's normal practice to write tests that cover edge cases where the input is not ideal or is plain wrong.
The most common approach in languages with exception handling is a "should throw" specification, where a certain test is expected to cause a specific exception type to be thrown. If it doesn't throw an exception, the test fails.
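For example, a framework-free sketch in C++ (parsePort and the throws() helper are made up for illustration); frameworks such as Google Test express the same idea with EXPECT_THROW:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Code under test: parse a port number, rejecting garbage input.
int parsePort(const std::string& s) {
    std::size_t pos = 0;
    int value = std::stoi(s, &pos);              // throws std::invalid_argument on non-numeric input
    if (pos != s.size() || value < 1 || value > 65535)
        throw std::out_of_range("port out of range: " + s);
    return value;
}

// A "should throw" check: succeeds only if the expected exception type appears.
template <typename Ex, typename Fn>
bool throws(Fn&& fn) {
    try { fn(); } catch (const Ex&) { return true; } catch (...) { return false; }
    return false;
}

int main() {
    assert(parsePort("8080") == 8080);                                   // happy path
    assert(throws<std::invalid_argument>([] { parsePort("banana"); }));  // plain wrong input
    assert(throws<std::out_of_range>([] { parsePort("99999"); }));       // out-of-range input
    return 0;
}
```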
Update
In your update you describe complex timing-sensitive interactions. Unit testing simply doesn't help at all there. No need to introduce networking: just think of trying to write a simple thread-safe queue class, perhaps on a platform with some new concurrency primitives. Test it on an 8-core system... does it work? You simply can't know that for sure by testing it. There are just too many different ways that the timing can cause operations to overlap between the cores. Depending on luck, it could take weeks of continuous execution before some really unlikely coincidence occurs. The only way to get such things right is through careful analysis (static checking tools can help). It's likely that most concurrent software has some rarely occurring bugs in it, including all operating systems.
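To illustrate (LockedQueue is a hypothetical class, not anyone's production code): every method below takes a lock and the single-threaded unit test passes on every run, yet the usual check-then-pop pattern still races once two consumer threads use it.

```cpp
#include <cassert>
#include <mutex>
#include <queue>

// A queue that looks thread safe: every method takes the lock.
// The bug is in how callers combine the methods, not inside any one method.
template <typename T>
class LockedQueue {
public:
    void push(T v) { std::lock_guard<std::mutex> g(m_); q_.push(std::move(v)); }
    bool empty() const { std::lock_guard<std::mutex> g(m_); return q_.empty(); }
    T pop() {                                    // precondition: !empty()
        std::lock_guard<std::mutex> g(m_);
        T v = q_.front();
        q_.pop();
        return v;
    }
private:
    mutable std::mutex m_;
    std::queue<T> q_;
};

int main() {
    LockedQueue<int> q;
    q.push(42);
    // This single-threaded unit test passes every run...
    if (!q.empty()) assert(q.pop() == 42);
    // ...yet the same check-then-pop pattern races when two consumer threads
    // run it concurrently: both can see empty() == false, then one of them
    // pops from an already-empty queue. No amount of re-running the test
    // guarantees you hit that interleaving; careful analysis (or an atomic
    // try_pop()) is what catches it.
    return 0;
}
```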
Returning to the cases that can actually be tested, I've found integration tests to often be just as useful as unit tests. These can be as elaborate as automating the installation of your product, adding configurations to it (such as your users might create) and then "poking" it from the outside, e.g. by automating your UI. This finds a whole other class of issues, separate from unit testing.
You are talking about library or application testing, which is not the same as unit testing. You can use unit testing libraries such as CppUnit/NUnit/JUnit for library and regression testing purposes, but as others have said, unit testing is about testing your lowest level functions, which are supposed to be very well defined and easily separated from the rest of the code. Sure, you could pass all low-level unit tests, and still have a network failure in the full system.
Library testing can be very difficult, because sometimes only a human can evaluate the output for correctness. Consider a vector graphics or font rendering library; there's no single perfect output, and you may get a completely different result based on the video card in your machine.
Testing a PDF parser or a C++ compiler is dauntingly difficult too, due to the enormous number of possible inputs. This is where owning 10 years of customer samples and defect history is far more valuable than the source code itself: almost anyone can sit down and code the parser, but initially you won't have a way of validating your program for correctness.
What you are talking about is making applications more robust. That is, you want them to handle failures gracefully. However, testing every possible real-world failure scenario would be difficult if not impossible. The key to making applications robust is to assume that failure is normal and should be expected at some point in the future. How an application handles failure really depends on the situation, and there are a number of different ways to detect and handle it (maybe a good question to ask the group). Relying on unit testing alone will only get you part of the way; anticipating failure (even in simple operations) will get you much closer to a robust application. Amazon built their entire system to anticipate all types of failures (hardware, software, memory and file corruption). Take a look at their Dynamo for an example of real-world error handling.
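As a small sketch of that mindset (saveReport and the file paths are made up for illustration), even a "simple" save operation can anticipate failure at every step and fall back instead of crashing:

```cpp
#include <cstdio>
#include <string>

// Hypothetical example of assuming failure is normal: check every step of a
// "simple" save and degrade to a fallback location instead of giving up.
bool saveReport(const std::string& data, const std::string& primaryPath,
                const std::string& fallbackPath) {
    for (const std::string& path : {primaryPath, fallbackPath}) {
        std::FILE* f = std::fopen(path.c_str(), "w");
        if (!f) continue;                                        // disk missing, no permission, ...
        const bool fullyWritten =
            std::fwrite(data.data(), 1, data.size(), f) == data.size();
        const bool closed = (std::fclose(f) == 0);               // flush errors surface here too
        if (fullyWritten && closed) return true;                 // success on this path
        std::remove(path.c_str());                               // don't leave a truncated file behind
    }
    return false;  // caller decides: queue for later, warn the user, ...
}

int main() {
    if (!saveReport("hello\n", "/nonexistent/dir/report.txt", "report.txt"))
        std::fprintf(stderr, "could not save report anywhere\n");
    return 0;
}
```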
Although not a complete answer to the massive dilemma you face, you can reduce the number of tests you need by using a technique called Equivalence Partitioning.
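A brief sketch of what that looks like (classifyAge is a made-up function for illustration): instead of testing every possible input, you test one representative per partition plus the boundaries between partitions.

```cpp
#include <cassert>
#include <string>

// Code under test: classify an age value (hypothetical example).
std::string classifyAge(int age) {
    if (age < 0)   return "invalid";
    if (age < 18)  return "minor";
    if (age < 65)  return "adult";
    return "senior";
}

int main() {
    // Equivalence partitioning: one representative per partition, plus boundaries.
    assert(classifyAge(-5)  == "invalid");   // partition: negative input
    assert(classifyAge(0)   == "minor");     // boundary: lowest valid value
    assert(classifyAge(17)  == "minor");     // boundary: top of "minor"
    assert(classifyAge(18)  == "adult");     // boundary: bottom of "adult"
    assert(classifyAge(64)  == "adult");     // boundary: top of "adult"
    assert(classifyAge(65)  == "senior");    // boundary: bottom of "senior"
    assert(classifyAge(130) == "senior");    // partition: large value
    return 0;
}
```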
In my organization, we perform many levels of testing: coverage, regression, positive, negative, scenario-based, and UI tests, both automated and manual, all starting from a 'clean environment', and even that isn't perfect.
As for one of the cases you mention, where a programmer comes in and changes some sensitive detection code and no one notices: we would have kept a snapshot of 'behaviourally dodgy' data that consistently fails a specific test of the detection routine, and we would run all of the tests regularly (not just at the last minute).
I think you can't, and shouldn't, write a unit test for every possible error you might face (what if a meteorite hits the db server?). You should make an effort to test errors with reasonable probability, and/or rely on other services. For example, if your application requires the correct arrival of network packets, you should use the TCP transport layer: it guarantees the correctness of the received packets transparently, so you only have to concentrate on, e.g., what happens if the network connection is dropped. Checksums are designed to detect or correct a reasonable amount of errors: if you expect 10 errors per file, you would use a different checksum than if you expect 100 errors. If the chosen checksum indicates that the file is correct, then you have no reason to think it's broken (the probability that it is broken is negligible). Because you don't have infinite resources (e.g. time), you have to make compromises when you write your tests, and choosing those compromises is a tough question.
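As a rough sketch of that trade-off (the toy checksum below is for illustration only; in practice you would pick CRC32, SHA-256 or similar to match the error rate you expect), verification is just "recompute and compare", and whatever slips past it is the compromise you accepted:

```cpp
#include <cstdint>
#include <string>

// Toy rolling checksum, chosen only to keep the example short; a real system
// would use a checksum whose strength matches the expected error rate.
std::uint32_t checksum(const std::string& bytes) {
    std::uint32_t sum = 0;
    for (unsigned char c : bytes) sum = sum * 31 + c;
    return sum;
}

// Verify before use: if the stored checksum matches, treat the data as intact;
// the residual chance of an undetected error is the compromise you accept.
bool looksIntact(const std::string& fileContents, std::uint32_t storedChecksum) {
    return checksum(fileContents) == storedChecksum;
}

int main() {
    std::string original = "payload";
    std::uint32_t stored = checksum(original);

    std::string corrupted = original;
    corrupted[0] ^= 0x01;                        // single bit flip in transit

    return looksIntact(original, stored) && !looksIntact(corrupted, stored) ? 0 : 1;
}
```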