The first step towards testing everything is to decide you are going to
-- Kent Beck
Question: How do you know your code works?
Goal: your code should not have bugs.
How: unit testing and test driven development.
This document discusses unit testing and test driven development in general terms. Each language and environment will have its own document on how to do unit testing in that environment.
A Thought Experiment
- What if you didn't add a line of code unless it satisfied a test?
- What if every line of code was tested?
- What if all tests were run after adding every line of code?
- How many bugs do you think you would have?
- More than now?
- Fewer than now?
- No difference?
- Would you feel more comfortable making changes in your code?
- Would you feel more comfortable with others making changes in your code?
- Can it be done?
- Is it worth the effort?
- Do you think the tests would help others understand the code?
This approach is the opposite of Code and Fix, which is often used as the default development methodology.
The Big Picture
There are roughly three types of test coverage:
- statement coverage
- branch coverage
- path coverage.
Unit tests help with statement and branch coverage. Path coverage is not handled by unit tests, which is why comprehensive system-level tests are also necessary.
Eyes on the Prize
Unit testing isn't enough. Quality comes from cooperative layering.
- People Who Care
- Comprehensive Requirements
- Good Design and Reviews
- Static Type Checking
- Design by Contract
- Code Reviews and Training
- Unit Tests
- System Tests
What is a Unit?
The smallest unit is a class. Beyond the class, a unit can be any combination of classes up to the system level.
What is a Test?
A procedure for critical evaluation; a means of determining the presence, quality, or truth of something. A test must give a pass or fail answer. It's not enough to look at debug output or trace something in the debugger or just run the program. A suite of unit tests should give the total number of tests, the number that passed, and the number that failed. Code must be structured to enable this style of testing.
Tests are implemented using your test framework.
What level of testing is performed on units?
- All public and protected methods.
- Start with one method at a time.
- Test what clients will do.
- Test in tiny iterations.
Try looking at a unit-test as verifying that an individual method works correctly. Unit-testing becomes simpler when you are making sure that a method works correctly. It is easier to think about testing methods in classes, rather than testing classes. If your unit-tests prove that every method in a class works, then it is not a far leap to assume that the class works properly. This approach seems to make unit-testing much easier for most developers.
What is tested?
Test everything that can break.
Complete verification is obviously impossible. To reduce the state space:
- Eighty percent of the errors are found in 20 percent of a project's routines. Fifty percent of the errors are found in 10 percent of a project's routines.
- Many people say not to test obvious code like accessors.
- Others say don't test code that is indirectly tested by other tests.
- The tests themselves aren't tested.
- It's often prudent to trace logic paths in the debugger when first making a test.
Common Test Cases
In black-box testing, one can identify various cases to test. These include:
- expected cases (for an is_prime routine: is 10 prime? is 11 prime?)
- unexpected cases (how about -1? -11?)
- corner cases (0)
- special cases (1,2)
- extreme cases (INT_MAX; INT_MIN; largest prime representable)
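As a minimal sketch, the cases above could be checked like this. The is_prime implementation (plain trial division) and the decision to treat every value below 2 as not prime are assumptions made for illustration, and the INT_MAX check only runs where int is 32 bits wide.

```cpp
#include <cassert>
#include <climits>

// Trial-division primality check (an illustrative implementation).
// Values below 2, including all negative numbers, are treated as not prime.
bool is_prime(int n) {
    if (n < 2) return false;
    if (n % 2 == 0) return n == 2;
    // Compare i with n / i instead of i * i with n to avoid int overflow.
    for (int i = 3; i <= n / i; i += 2) {
        if (n % i == 0) return false;
    }
    return true;
}

// Returns the number of failed checks, covering each kind of case once.
int test_is_prime() {
    int failures = 0;
    auto expect = [&failures](bool condition) { if (!condition) ++failures; };
    expect(!is_prime(10));        // expected case
    expect(is_prime(11));         // expected case
    expect(!is_prime(-1));        // unexpected case
    expect(!is_prime(-11));       // unexpected case
    expect(!is_prime(0));         // corner case
    expect(!is_prime(1));         // special case
    expect(is_prime(2));          // special case
    expect(!is_prime(INT_MIN));   // extreme case
    if (INT_MAX == 2147483647) {  // 32-bit int: INT_MAX is the prime 2^31 - 1
        expect(is_prime(INT_MAX));
    }
    return failures;
}
```

A return value of zero from test_is_prime means every case passed.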
Who develops tests?
Anyone who writes code. This includes initial development and especially enhancements. If the code changes, the unit tests must change too.
Who runs tests?
- Developers during the development process.
- The build system after a submit to verify the code doesn't break anything or as part of the smoke test.
I outline a bunch of tests before I start as items in a list. Then I implement the first one, make it work, refactor, etc. I used to implement a bunch at a time, but then I programmed a little with Ward and realized that all those tests implemented against a speculative interface made changing the interface harder. The reduced certainty of only having the tests outlined instead of implemented is more than made up for by the additional speed of evolving the interface. -- Kent Beck
The key concept is working in extremely small increments and adding just one thing at a time.
How are tests implemented?
Tests are implemented using your test framework. There are several choices for each language and environment.
Where do tests come from?
Test driven development is not an excuse to stop thinking and start hacking until it works. Exactly the contrary. TDD is about having quality at every step of development.
Code is still written to satisfy requirements. The code expression of the requirements is verified through tests. Requirements can come from many sources, including the development process itself. No requirements document is going to specify what classes or methods to create. Requirements are fractal in that sense. The code itself is also an important source of requirements.
Test Driven Class Design
It's not obvious at first but thinking about tests while developing positively impacts class design because making classes testable requires a certain cleanliness of class design.
- Simpler classes are easier to test. The requirement to test means that you must make your classes more cohesive. A class should have as few responsibilities as possible. Making classes simpler reduces the number of tests.
- If a class is hard to test then refactor. Reduce the coupling between classes by extracting functionality into separate classes. This can also help with resource leaks because the resource acquisition and release are in the class.
- Don't be afraid of creating more classes. If classes are well named and cohesive they act as a form of documentation because your intent should be more obvious.
- A good design approach is to use abstract base classes to connect components. Test classes can be derived from the abstract class and used normally in the system.
But My Code System is too Complicated/Hard/Something
Not true. Anything can be tested if you think about it and are willing to do the work. There are a lot of techniques in this document about how to test. If your code is too complex then ask why. Find what makes it hard to test and figure out ways around it. Is your code more complex than it needs to be? See Test Driven Class Design. Set up fake objects for hardware or other subsystems that can't easily be tested. There's always a way.
Cost of Test Driven Development
Unit tests do take a lot of effort. No doubt. And it may feel like wasted effort. But the benefits generally outweigh the cost. How else do you know your code works? The test-driven process may feel odd at first, but it allows development to proceed on a basis of success. Tests improve designs. And developers generally like it once they start.
Should Test Code Have Access To Internals?
The short answer is yes. A class should not know it's being tested as that changes the class's behaviour. Verifying test results often requires detailed knowledge of internal state. Yet, internal details should not, if possible, be made public just for testing as that would give access to any client.
- C++'s friend feature is very useful here because a test can be made a friend of the classes it is testing.
- Some methods useful for unit testing can work in the public interface as well. Adding get accessors and test methods is often helpful.
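A minimal sketch of the friend approach follows. Connection and ConnectionTest are hypothetical names; the point is only that the test reads private state without that state becoming public to ordinary clients.

```cpp
#include <cassert>

// Hypothetical class under test; its state is private.
class Connection {
public:
    void Start() { state_ = State::Started; }

private:
    enum class State { Idle, Started };
    State state_ = State::Idle;

    friend class ConnectionTest;  // only the test gets access to internals
};

// Hypothetical test class, named in the friend declaration above.
class ConnectionTest {
public:
    // Returns true when the private state changes as expected.
    static bool StartMovesToStarted() {
        Connection c;
        if (c.state_ != Connection::State::Idle) return false;
        c.Start();
        return c.state_ == Connection::State::Started;
    }
};
```

The cost is one friend declaration in the production class, which does not change its behaviour.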
Shorter Tests are Better
Most tests can hopefully be written like:
TEST_FAIL_IF(object.GetState() != Class::StartedState());
If a test requires a lot of complicated logic it's a clue that maybe the class being tested should be refactored or that the class should have some accessors added.
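The text shows TEST_FAIL_IF in use but not how it is defined, so the definition below is an assumption: a sketch of one way such a macro might count and report failures. Motor, g_failures and TestMotorStart are hypothetical names standing in for the class, the failure counter and the test.

```cpp
#include <cassert>
#include <iostream>

// Assumed definition: counts a failure and prints where it happened.
static int g_failures = 0;

#define TEST_FAIL_IF(condition)                              \
    do {                                                     \
        if (condition) {                                     \
            ++g_failures;                                    \
            std::cout << __FILE__ << ":" << __LINE__         \
                      << ": failed: " #condition "\n";       \
        }                                                    \
    } while (0)

// Hypothetical class with an accessor that keeps the test to one line.
class Motor {
public:
    enum State { Stopped, Started };
    static State StartedState() { return Started; }
    void Start() { state_ = Started; }
    State GetState() const { return state_; }

private:
    State state_ = Stopped;
};

// Returns the number of failed checks in this test.
int TestMotorStart() {
    g_failures = 0;
    Motor object;
    object.Start();
    TEST_FAIL_IF(object.GetState() != Motor::StartedState());
    return g_failures;
}
```

Because GetState exists, the whole check fits in a single line with no extra logic in the test.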
Place Unit Tests in Test Directory
Tests should be created in the appropriate directory under sw/test. Tests should not be integrated with the classes under test. Using a separate directory allows developers not interested in the tests to not have to sync and compile the tests.
Testing a Container
Usually containers have a counter for the number of entries.
- Make sure the counter is zero.
- Put a few objects in.
- Make sure the counter increased by the right amount.
- Get items from the container and make sure each is the same object that was put in. You can do this by saving the object or the address of the object put in the container.
- If an add should fail when the item already exists in the container, try this case.
- If an add should replace the item when it already exists in the container, try this case.
- Delete items from the container.
- Make sure the counter decreased by the right amount.
- Getting the deleted items should now fail.
- Delete the container with items in the container and make sure the contained objects have been deleted.
If deleting the object from the container should delete the object then make sure the object is deleted. This is often hard to do.
A container is an example of where adding an IsExistByKey and IsExistByObject method makes testing simpler.
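The checklist above might look like this for a small owning container. Table, Item and the live-object counter are all hypothetical, and the sketch assumes an add with a duplicate key should fail rather than replace.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical contained object that counts how many instances are alive.
struct Item {
    static int live;
    std::string name;
    explicit Item(std::string n) : name(std::move(n)) { ++live; }
    Item(const Item&) = delete;
    ~Item() { --live; }
};
int Item::live = 0;

// Hypothetical container under test: owns its items, rejects duplicate keys.
class Table {
public:
    bool Add(int key, std::unique_ptr<Item> item) {
        return items_.emplace(key, std::move(item)).second;
    }
    Item* Get(int key) const {
        auto it = items_.find(key);
        return it == items_.end() ? nullptr : it->second.get();
    }
    bool Remove(int key) { return items_.erase(key) > 0; }
    bool IsExistByKey(int key) const { return items_.count(key) > 0; }
    std::size_t Count() const { return items_.size(); }

private:
    std::map<int, std::unique_ptr<Item>> items_;
};

// Returns the number of failed checks.
int TestTable() {
    int failures = 0;
    auto check = [&failures](bool ok) { if (!ok) ++failures; };
    {
        Table table;
        check(table.Count() == 0);                 // counter starts at zero
        auto a = std::make_unique<Item>("a");
        Item* a_address = a.get();                 // remember what went in
        check(table.Add(1, std::move(a)));
        check(table.Add(2, std::make_unique<Item>("b")));
        check(table.Count() == 2);                 // counter went up by two
        check(table.Get(1) == a_address);          // same object comes back
        check(!table.Add(1, std::make_unique<Item>("dup")));  // duplicate add fails
        check(table.Count() == 2);
        check(table.Remove(1));
        check(table.Count() == 1);                 // counter went down by one
        check(table.Get(1) == nullptr);            // get of a removed item fails
        check(!table.IsExistByKey(1));
        check(Item::live == 1);                    // the removed item was deleted
    }
    check(Item::live == 0);  // destroying the table deleted the remaining item
    return failures;
}
```

The static live counter is what makes the "was the object really deleted" checks possible without a memory tool.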
Testing for Memory Leaks
Testing for memory leaks is difficult.
- Use a tool like Purify.
- It's possible to add counters to track creations and deletions.
- Tracing code paths in the debugger can help verify on creation if memory is freed, but it doesn't help for later modifications.
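The counter idea can be sketched with a small base class template that counts live instances of each class deriving from it. InstanceCounted and Packet are hypothetical names, and this only catches leaks of counted classes; it is not a substitute for a tool.

```cpp
#include <cassert>

// Hypothetical mixin: counts live instances of each class that derives from it.
template <typename T>
class InstanceCounted {
public:
    static int LiveCount() { return live_; }

protected:
    InstanceCounted() { ++live_; }
    InstanceCounted(const InstanceCounted&) { ++live_; }
    ~InstanceCounted() { --live_; }

private:
    static int live_;
};

template <typename T>
int InstanceCounted<T>::live_ = 0;

// Hypothetical class whose creations and deletions we want to track.
class Packet : public InstanceCounted<Packet> {};

// Returns true if balanced code leaves the count unchanged and an
// undeleted object shows up as exactly one extra live instance.
bool TestPacketCounts() {
    const int before = Packet::LiveCount();
    {
        Packet original;
        Packet copy(original);  // copies are counted too
    }
    const bool balanced = Packet::LiveCount() == before;

    Packet* pending = new Packet;  // not yet deleted: looks like a leak
    const bool detected = Packet::LiveCount() == before + 1;
    delete pending;

    return balanced && detected && Packet::LiveCount() == before;
}
```

A test can record LiveCount in its setup and compare it in its teardown to flag objects that were never deleted.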
Clean Destruction Required
Each test is run in isolation. This means that the environment is set up before the test and torn down after the test. This requires objects to destruct cleanly so they can be constructed and destructed as many times as needed.
Often programmers ignore the destruction step because it doesn't happen often in a live system.
- Namespaces must be deregistered.
- Memory must be freed.
- Actors must destruct when told to exit.
- Actors must be told to exit.
- Singletons require the ability to be reset.
- The database must be reset between tests.
Using Counts and Derived Classes
Counts are often a good proxy to know that something happened correctly. For example, if you are expecting 20 messages to be sent and received then have a way to count the messages so you can make a test.
A good design approach is to use abstract base classes to connect components. You can then derive a test class that includes a counting feature. Then you can easily create a test to verify the counts.
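A minimal sketch of a counting test class, assuming a hypothetical MessageSink interface and a hypothetical Heartbeat component that sends through it:

```cpp
#include <cassert>
#include <string>

// Hypothetical abstract interface connecting a sender to its transport.
class MessageSink {
public:
    virtual ~MessageSink() = default;
    virtual void Send(const std::string& message) = 0;
};

// Code under test: depends only on the abstract interface.
class Heartbeat {
public:
    explicit Heartbeat(MessageSink& sink) : sink_(sink) {}
    void Run(int beats) {
        for (int i = 0; i < beats; ++i) sink_.Send("ping");
    }

private:
    MessageSink& sink_;
};

// Test class derived from the abstract base; it does nothing but count.
class CountingSink : public MessageSink {
public:
    void Send(const std::string&) override { ++count_; }
    int count() const { return count_; }

private:
    int count_ = 0;
};

// Passes when exactly 20 messages reach the sink.
bool TestHeartbeatSendsTwentyMessages() {
    CountingSink sink;
    Heartbeat heartbeat(sink);
    heartbeat.Run(20);
    return sink.count() == 20;
}
```

Heartbeat cannot tell the counting class from a real transport, so the production code needs no test hooks.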
Saving and Comparing State and Derived Classes
Create a list of the expected state. Then save the state of what happens. Then compare the two states to see if the test passes. This is a somewhat complicated approach but it is very powerful and flexible.
For example, a test may require a certain ordered stream of messages and each message must have a particular format and content. Creating a list of the expected messages makes testing much simpler. The list is the order in which the messages are expected. If the messages can be compared easily, say using Properties, then it's trivial to verify the correctness of a message. New messages can easily be added as well.
The use of abstract base classes allows the messages to be transparently saved and compared.
Singletons are troublesome in unit tests because they are set once and are supposed to last for the lifetime of the system. Each unit test operates in an isolated environment. Setting a singleton in one unit test shouldn't impact a later test.
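One way to sketch the save-and-compare approach: a recording class derived from a hypothetical Channel interface stores every message, and the test compares the recording with an expected list. Message, Channel, Login and the message contents are all invented for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical message type with an equality operator for easy comparison.
struct Message {
    std::string type;
    std::string body;
    bool operator==(const Message& other) const {
        return type == other.type && body == other.body;
    }
};

// Hypothetical abstract interface the code under test sends through.
class Channel {
public:
    virtual ~Channel() = default;
    virtual void Send(const Message& message) = 0;
};

// Test class: saves every message in the order it was sent.
class RecordingChannel : public Channel {
public:
    void Send(const Message& message) override { sent_.push_back(message); }
    const std::vector<Message>& sent() const { return sent_; }

private:
    std::vector<Message> sent_;
};

// Hypothetical code under test: emits an ordered stream of messages.
void Login(Channel& channel, const std::string& user) {
    channel.Send(Message{"HELLO", user});
    channel.Send(Message{"AUTH", user});
    channel.Send(Message{"READY", ""});
}

// Passes when the saved stream equals the expected list, order included.
bool TestLoginSequence() {
    RecordingChannel channel;
    Login(channel, "alice");
    const std::vector<Message> expected = {
        Message{"HELLO", "alice"},
        Message{"AUTH", "alice"},
        Message{"READY", ""},
    };
    return channel.sent() == expected;
}
```

Adding a new expected message means adding one line to the expected list.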
The way to make this work is to:
- Add a Reset to each singleton. Reset will destroy the singleton and reset it to null.
- Call Reset for all singletons in the setup step for each test.
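The two steps above can be sketched as follows. Registry is a hypothetical singleton, and SetUp stands in for whatever setup hook your test framework provides.

```cpp
#include <cassert>

// Hypothetical singleton with a Reset hook for tests.
class Registry {
public:
    static Registry& Instance() {
        if (instance_ == nullptr) instance_ = new Registry;
        return *instance_;
    }
    // Destroys the singleton and sets the pointer back to null.
    static void Reset() {
        delete instance_;
        instance_ = nullptr;
    }
    void Register() { ++entries_; }
    int entries() const { return entries_; }

private:
    Registry() = default;
    ~Registry() = default;
    static Registry* instance_;
    int entries_ = 0;
};
Registry* Registry::instance_ = nullptr;

// Stand-in for the framework's per-test setup step.
void SetUp() { Registry::Reset(); }

// Changes the singleton's state.
bool TestRegisterTwice() {
    SetUp();
    Registry::Instance().Register();
    Registry::Instance().Register();
    return Registry::Instance().entries() == 2;
}

// Must not see the state left behind by the previous test.
bool TestStartsEmpty() {
    SetUp();
    return Registry::Instance().entries() == 0;
}
```

Without the Reset call in SetUp, TestStartsEmpty would pass or fail depending on which test ran before it.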
Creating Fake Objects
Some things are hard to test. Hardware, for example. Also certain scenarios are hard to test because they require odd interactions or very large data sets or for any number of other reasons.
A solution is to use fake objects that provide the required test behaviour. Again, your design can make this approach hard or easy. By using installable objects based on abstract classes you can create almost any scenario you require.
An application, for example, may receive data from hardware. The hardware can be faked. This allows testing of the application separate from the hardware. Complex scenarios that might be difficult to generate in the hardware can be generated by the fake object.
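As an illustration, here is a sketch in which a hypothetical Thermostat reads from an abstract Sensor. The fake replays scripted readings and then simulates a fault, a scenario that could be awkward to provoke on real hardware. All names and the fail-safe behaviour are assumptions.

```cpp
#include <cassert>
#include <deque>
#include <stdexcept>

// Hypothetical abstract interface to a temperature sensor.
class Sensor {
public:
    virtual ~Sensor() = default;
    virtual double ReadCelsius() = 0;  // may throw on a hardware fault
};

// Application code under test.
class Thermostat {
public:
    explicit Thermostat(Sensor& sensor) : sensor_(sensor) {}
    // True if heating should run; fails safe (off) when the sensor faults.
    bool HeaterShouldRun(double target) {
        try {
            return sensor_.ReadCelsius() < target;
        } catch (const std::runtime_error&) {
            return false;
        }
    }

private:
    Sensor& sensor_;
};

// Fake: plays back queued readings, then throws to simulate a fault.
class FakeSensor : public Sensor {
public:
    void Queue(double reading) { readings_.push_back(reading); }
    double ReadCelsius() override {
        if (readings_.empty()) throw std::runtime_error("sensor fault");
        const double value = readings_.front();
        readings_.pop_front();
        return value;
    }

private:
    std::deque<double> readings_;
};

bool TestThermostatWithFakeSensor() {
    FakeSensor sensor;
    sensor.Queue(15.0);
    sensor.Queue(25.0);
    Thermostat thermostat(sensor);
    const bool cold = thermostat.HeaterShouldRun(20.0);   // 15 < 20: heater on
    const bool warm = thermostat.HeaterShouldRun(20.0);   // 25 >= 20: heater off
    const bool fault = thermostat.HeaterShouldRun(20.0);  // no data: fault, off
    return cold && !warm && !fault;
}
```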
Creating Fake Actors and State Machines
This is similar to the fake object approach but at a larger scale. You can fake the behaviour of other actors and state machines to test your code separate from the other subsystems. A messaging interface makes this approach easier because messages can be easily faked.
Structure of Unit Tests
This section discusses how unit tests are structured in general. There may be differences in different implementations.
- TestCase - a class to manage the execution of a single test. Each test case has a setUp method to set up the environment for the test and a tearDown method to tear down the environment after a test.
- TestSuite - a list of TestCases.
- TestResult - a class which collects failed test cases and other statistics.
- TestRunner - interface for running tests and viewing the results.
All tests are usually run every time. Some frameworks allow a specific test to be run.
All tests are pass/fail. The result is a count of which tests passed and which failed.
A class derived from TestCase can contain any number of methods each of which is a separate test. Multiple tests are in the same TestCase because they need to share the same setUp and tearDown methods. The TestSuite is told about the TestCase and each test method so it knows to run all tests in a TestCase.
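To make the four roles concrete, here is a deliberately tiny sketch. It is not the API of any real framework: the class names follow the list above, but every signature, the RunAndReport function playing the TestRunner role, and the StackTest example are assumptions.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <string>
#include <vector>

// TestResult: collects failed test names and a run count.
struct TestResult {
    int run = 0;
    std::vector<std::string> failures;
    int passed() const { return run - static_cast<int>(failures.size()); }
};

// TestCase: base class providing setUp and tearDown for its test methods.
class TestCase {
public:
    virtual ~TestCase() = default;
    virtual void setUp() {}
    virtual void tearDown() {}
};

// TestSuite: a list of test methods. Each one runs on a fresh TestCase
// object so that tests stay isolated from each other.
class TestSuite {
public:
    template <typename Case>
    void Add(const std::string& name, bool (Case::*method)()) {
        tests_.push_back(Entry{name, [method] {
            Case fixture;
            fixture.setUp();
            const bool ok = (fixture.*method)();
            fixture.tearDown();
            return ok;
        }});
    }
    void Run(TestResult& result) const {
        for (const auto& test : tests_) {
            ++result.run;
            if (!test.body()) result.failures.push_back(test.name);
        }
    }

private:
    struct Entry {
        std::string name;
        std::function<bool()> body;
    };
    std::vector<Entry> tests_;
};

// TestRunner role: runs a suite and prints the counts.
TestResult RunAndReport(const TestSuite& suite) {
    TestResult result;
    suite.Run(result);
    for (const auto& name : result.failures) {
        std::cout << "FAILED: " << name << '\n';
    }
    std::cout << result.run << " run, " << result.passed() << " passed, "
              << result.failures.size() << " failed\n";
    return result;
}

// Example TestCase: two test methods sharing one setUp.
class StackTest : public TestCase {
public:
    void setUp() override { stack_ = {1, 2, 3}; }
    bool testSize() { return stack_.size() == 3; }
    bool testPop() {
        stack_.pop_back();
        return stack_.back() == 2;
    }

private:
    std::vector<int> stack_;
};
```

A suite is built by registering each method, for example suite.Add("testSize", &StackTest::testSize), and then passing the suite to RunAndReport.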
Create Smaller TestCases
It's tempting to put a lot of tests in the same TestCase. What you end up with is a huge file with a lot of tests. This makes it hard for multiple developers to add and change tests. Instead make more TestCases so they are well focussed. Developers can add new TestCases without creating integration problems.
If common setUp and tearDown behaviour is required then create a separate class that all the other tests can use.
Tips on Decoupling
Finding the Sweet Spot
Every developer will have to find their sweet spot as to what is the correct process for them. There is a range of possible styles.
Some developers like driving development in a test-first process. It's odd at first, but it fits many people well.
Other developers may find a test-driven process more comfortable where testing is part of the overall design process that includes much thinking, class creation and coding, and documentation. All the steps are sort of simultaneous and inform each other.
Different problem spaces may be more comfortable in different styles. Longer iterations may be possible in familiar code or problems.
Different languages and development environments may make some approaches easier or more difficult. In C++ it's easier to write complete signatures at once rather than in smaller chunks. A refactoring browser makes many things easier to do in smaller chunks, as do interpreted environments.
Execution environment definitely influences development style. If it takes 5 minutes to boot a system to run a new test then small increments won't work. In this case it's better to do development in another environment and then run all the tests in the more difficult target environment.
Prototyping vs Specification
Test driven development can be considered a kind of structured prototyping. There's an interesting paper talking about Prototyping vs Specification.
In this experiment, seven software teams developed versions of the same small-size (2000-4000 source instruction) application software product. Four teams used the Specifying approach. Three teams used the Prototyping approach. The main results of the experiment were: Prototyping yielded products with roughly equivalent performance, but with about 40% less code and 45% less effort. The prototyped products rated somewhat lower on functionality and robustness, but higher on ease of use and ease of learning. Specifying produced more coherent designs and software that was easier to integrate. The paper presents the experimental data supporting these and a number of additional conclusions.
The thought is that with test driven development the robustness of the application would increase while keeping some of the other good characteristics.
Another approach is specification using formal proofs. The SPARK Ada products and methodology have been used with very good success on even large safety-critical projects.
Typically more informal methods of specification are used that combine requirements, use cases, and high level designs.
The Eternal Battle
If you are using a formal proof based methodology then your code is truly a product of design and specification.
If you are not using a formal methodology then there is a murky middle ground between code arising from specification and code arising from prototyping/evolution.
In one camp code is the physical manifestation of abstract ideas. Code is the lowest crudest level of design expression. Requirements change and cause the design to change which causes the code to change. Incorrect designs can be corrected to more closely satisfy the requirements. Designs don't degrade or become brittle, they are found inadequate because they no longer satisfy requirements.
In another camp The Source Code is the Design. The best way to represent design is the source code. Design changes are expressed in the code and not separate documentation. The system evolves through changes in the code. To understand the system you must read the code. By writing really good code (refactored, well named, well tested, etc) the design is made obvious. If the design is not obvious from the code it is the code's fault and the code should be improved.
Would it be cheating to say you need both approaches? A large system must have a lot of architectural decisions made to meet high level requirements. How can you do this without specification and analysis?
On the other hand, what requirement or specification could possibly tell you which classes, methods, and attributes to write? At the lowest level the source code is the design and TDD is an excellent method of development.
The problem is the system usually evolves into an incoherent Big Ball of Mud. Hack after hack is applied to satisfy situational issues and there's no time to "fix" things later.
The answer is.....? Sorry, don't have any answers. Yet, the application of principles like TDD, once and only once, etc. can help beat back the forces of chaos.