Recently I had an interesting discussion with a colleague about unit tests. We discussed the point at which maintaining unit tests becomes less productive, for instance when your contracts change.
Someone asked the same question in the Google Group for the book "Growing Object-Oriented Software, Guided by Tests", in the thread "Unit-test mock/stub assumptions rots".
Here are two of the answers. The first is from Robert C. Martin (Uncle Bob):
The first issue you raise is the so-called "fragile test" problem. You make a change to your application, and hundreds of tests break because of that change. When this happens, you have a design problem. Your tests have been designed to be fragile: they have not been sufficiently decoupled from the production code. The solution is (as it is in all software problems like this) to find an abstraction that decouples the tests from the production code in such a way that the volatility of the production code is hidden from the tests.
A few simple design mistakes commonly cause this kind of fragility. Test design is an important issue that is often neglected by TDD beginners. This neglect results in fragile tests, which then leads novices to reject TDD as "unproductive".
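As an illustration of such a decoupling abstraction, here is a minimal sketch of a test-data builder; all names (Invoice, InvoiceBuilder, anInvoice) are hypothetical and not from the original thread. Tests construct objects through the builder instead of calling the constructor directly, so a constructor change touches one place rather than hundreds of tests.

```java
// Hypothetical domain class; in a real codebase this is production code.
class Invoice {
    Invoice(String customer, double taxRate) { /* ... */ }
}

// Fragile style: hundreds of tests call the constructor directly,
// so adding a constructor parameter breaks them all:
//   Invoice invoice = new Invoice("ACME", 0.19);

// Decoupled style: tests depend on a builder with sensible defaults;
// when the constructor changes, only the builder changes.
class InvoiceBuilder {
    private String customer = "any customer";
    private double taxRate = 0.19;

    static InvoiceBuilder anInvoice() { return new InvoiceBuilder(); }

    InvoiceBuilder withTaxRate(double taxRate) {
        this.taxRate = taxRate;
        return this;
    }

    Invoice build() { return new Invoice(customer, taxRate); }
}

// In a test: Invoice invoice = InvoiceBuilder.anInvoice().withTaxRate(0.07).build();
```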
The second issue you raise is false positives. You have used so many mocks that none of your tests actually tests the integrated system. While testing independent units is a good thing, it is also important to test partial and whole integrations of the system. TDD is not just about unit tests.
Tests should therefore be arranged accordingly: isolated unit tests plus tests of partial and whole integrations of the system (see the sketch below).
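A minimal illustration of this arrangement, using hypothetical names (PriceCalculator, TaxPolicy) and a hand-rolled stub instead of a mocking library: the first test isolates the unit, the second wires in the real collaborator so the integration is exercised too and cannot hide behind the stub's assumptions.

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class PriceCalculatorTest {

    interface TaxPolicy { double rateFor(String category); }

    static class PriceCalculator {
        private final TaxPolicy taxPolicy;
        PriceCalculator(TaxPolicy taxPolicy) { this.taxPolicy = taxPolicy; }
        double gross(double net, String category) {
            return net * (1 + taxPolicy.rateFor(category));
        }
    }

    /** Real collaborator, used by the integration-style test below. */
    static class FlatTaxPolicy implements TaxPolicy {
        public double rateFor(String category) { return 0.19; }
    }

    @Test
    public void unitTestWithStubbedCollaborator() {
        // The stub fixes the collaborator's answer, isolating the unit.
        PriceCalculator calculator = new PriceCalculator(category -> 0.10);
        assertEquals(110.0, calculator.gross(100.0, "books"), 1e-9);
    }

    @Test
    public void integrationTestWithRealCollaborator() {
        // The real collaborator verifies the units actually work together.
        PriceCalculator calculator = new PriceCalculator(new FlatTaxPolicy());
        assertEquals(119.0, calculator.gross(100.0, "books"), 1e-9);
    }
}
```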
Here is J.B. Rainsberger's answer (he is the author of Manning's "JUnit Recipes"):
I second Uncle Bob's opinion that the problem is in the design. I would additionally go back one step and check the design of your contracts.
Instead of saying "return -1 for x==0" or "throw CannotCalculateException for x==y", underspecify niftyCalculatorThingy(x,y) with the precondition x != y && x != 0 in appropriate situations (see below). Your stubs may then behave arbitrarily for these cases, your unit tests must reflect that, and you gain maximal modularity: the liberty to arbitrarily change the behavior of your system under test for all underspecified cases, without the need to change contracts or tests.
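A minimal sketch of such an underspecified contract, assuming a concrete (hypothetical) arithmetic behind niftyCalculatorThingy:

```java
public class NiftyCalculator {

    /**
     * PRE:  x != y && x != 0
     * POST: returns x / (x - y)
     *
     * Behavior for arguments that violate PRE is deliberately left
     * unspecified: callers (and tests) must not rely on any particular
     * return value or exception in those cases.
     */
    public int niftyCalculatorThingy(int x, int y) {
        // The current implementation happens to throw ArithmeticException
        // for x == y, but that is an implementation detail, not contract.
        return x / (x - y);
    }
}

// A unit test only exercises inputs that satisfy PRE:
//   assertEquals(2, new NiftyCalculator().niftyCalculatorThingy(4, 2));
```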
You can differentiate your statement "-1 when it fails for some reason" according to the following criteria. Is the scenario 1) something the implementation can guarantee, 2) part of the method's responsibility, and 3) something the caller can react to?
If and only if 1) to 3) hold, specify the scenario in the contract (e.g. that EmptyStackException is thrown when calling pop() on an empty stack).
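The stack example meets all three criteria, so the exceptional behavior belongs in the contract. A sketch, modeled on java.util.Stack (the IntStack interface itself is hypothetical):

```java
import java.util.EmptyStackException;

/** A sketch of a stack contract, modeled on java.util.Stack. */
interface IntStack {

    /**
     * Removes and returns the element at the top of this stack.
     *
     * @throws EmptyStackException if this stack is empty; specified in the
     *         contract because the implementation can guarantee it (1), it
     *         belongs to the stack abstraction (2), and callers can react
     *         to it, e.g. by checking isEmpty() first (3)
     */
    int pop();

    boolean isEmpty();
}
```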
Without 1), the implementation cannot guarantee a specific behavior in the exceptional case. For instance, Object.equals() does not specify any behavior when reflexivity, symmetry, transitivity, and consistency are not met.
Without 2), the Single Responsibility Principle is not met, modularity is broken, and users/readers of the code get confused. For instance, Graph transform(Graph original) should not specify that MissingResourceException might be thrown just because, deep down, some cloning via serialization is done.
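A common way to satisfy 2) is to translate the low-level exception into one that lives at the abstraction level of the method. A sketch, with hypothetical Graph and TransformationException types:

```java
import java.util.MissingResourceException;

interface Graph { /* hypothetical domain type */ }

class GraphTransformer {

    /** Hypothetical exception at the abstraction level of the contract. */
    static class TransformationException extends RuntimeException {
        TransformationException(Throwable cause) { super(cause); }
    }

    Graph transform(Graph original) {
        try {
            return deepCopyAndRewrite(original);
        } catch (MissingResourceException e) {
            // Do not leak the serialization detail into the contract:
            // translate it to an exception the caller can understand.
            throw new TransformationException(e);
        }
    }

    private Graph deepCopyAndRewrite(Graph g) {
        // imagine cloning via serialization here
        return g;
    }
}
```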
Without 3), the caller cannot make use of the specified behavior (a certain return value or exception). For instance, if the JVM throws an UnknownError, there is nothing the caller can sensibly do about it.
If you do specify cases where 1), 2), or 3) does not hold, you run into difficulties.
The downside of underspecification is that robustness, i.e. the implementation's ability to react appropriately to abnormal conditions, becomes harder to test.
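One way to still probe robustness under an underspecified contract is to assert only what the contract does promise; a sketch, reusing the hypothetical calculator from above:

```java
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class RobustnessTest {

    // Minimal stand-in for the hypothetical calculator sketched above.
    static int niftyCalculatorThingy(int x, int y) {
        return x / (x - y);
    }

    @Test
    public void violatedPreconditionMustNotBreakValidCalls() {
        try {
            niftyCalculatorThingy(2, 2); // outside the contract: any outcome is allowed
        } catch (RuntimeException anyRuntimeException) {
            // tolerated: the contract leaves this case unspecified
        }
        // Inputs that satisfy PRE must still meet the POST-condition.
        assertEquals(2, niftyCalculatorThingy(4, 2));
    }
}
```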
As a compromise, I like to use the following contract schema where possible:
<(Semi-)formal PRE- and POST-conditions, including exceptional behavior where 1) to 3) hold>
If PRE is not met, the current implementation throws one of the runtime exceptions A, B, or C.
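Applied to the calculator example, the schema could look like this; the concrete exception named in the last sentence documents the current implementation only, not the contract:

```java
interface NiftyCalculatorContract {

    /**
     * PRE:  x != y && x != 0
     * POST: returns x / (x - y)
     * Exceptional behavior where 1) to 3) hold: none.
     *
     * If PRE is not met, the current implementation throws
     * ArithmeticException (for x == y). This last sentence documents
     * the implementation, not the contract, and may change at any time.
     */
    int niftyCalculatorThingy(int x, int y);
}
```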