Is duplicated code more tolerable in unit tests?

unit-testing dry code-duplication

24,387

Solution 1

Duplicated code is a smell in unit test code just as much as in other code. If you have duplicated code in tests, it makes it harder to refactor the implementation code because you have a disproportionate number of tests to update. Tests should help you refactor with confidence, rather than be a large burden that impedes your work on the code being tested.

If the duplication is in fixture set up, consider making more use of the setUp method or providing more (or more flexible) Creation Methods.

If the duplication is in the code manipulating the SUT, then ask yourself why multiple so-called “unit” tests are exercising the exact same functionality.

If the duplication is in the assertions, then perhaps you need some Custom Assertions. For example, if multiple tests have a string of assertions like:

assertEqual('Joe', person.getFirstName())
assertEqual('Bloggs', person.getLastName())
assertEqual(23, person.getAge())

Then perhaps you need a single assertPersonEqual method, so that you can write assertPersonEqual(Person('Joe', 'Bloggs', 23), person). (Or perhaps you simply need to overload the equality operator on Person.)

As you mention, it is important for test code to be readable. In particular, it is important that the intent of a test is clear. I find that if many tests look mostly the same, (e.g. three-quarters of the lines the same or virtually the same) it is hard to spot and recognise the significant differences without carefully reading and comparing them. So I find that refactoring to remove duplication helps readability, because every line of every test method is directly relevant to the purpose of the test. That's much more helpful for the reader than a random combination of lines that are directly relevant, and lines that are just boilerplate.

That said, sometimes tests are exercising complex situations that are similiar but still significantly different, and it is hard to find a good way to reduce the duplication. Use common sense: if you feel the tests are readable and make their intent clear, and you're comfortable with perhaps needing to update more than a theoretically minimal number of tests when refactoring the code invoked by the tests, then accept the imperfection and move on to something more productive. You can always come back and refactor the tests later, when inspiration strikes!

Solution 2

Readability is more important for tests. If a test fails, you want the problem to be obvious. The developer shouldn't have to wade through a lot of heavily factored test code to determine exactly what failed. You don't want your test code to become so complex that you need to write unit-test-tests.

However, eliminating duplication is usually a good thing, as long as it doesn't obscure anything, and eliminating the duplication in your tests may lead to a better API. Just make sure you don't go past the point of diminishing returns.

Solution 3

Implementation code and tests are different animals and factoring rules apply differently to them.

Duplicated code or structure is always a smell in implementation code. When you start having boilerplate in implementation, you need to revise your abstractions.

On the other hand, testing code must maintain a level of duplication. Duplication in test code achieves two goals:

Keeping tests decoupled. Excessive test coupling can make it hard to change a single failing test that needs updating because the contract has changed.
Keeping the tests meaningful in isolation. When a single test is failing, it must be reasonably straightforward to find out exactly what it is testing.

I tend to ignore trivial duplication in test code as long as each test method stays shorter than about 20 lines. I like when the setup-run-verify rhythm is apparent in test methods.

When duplication creeps up in the "verify" part of tests, it is often beneficial to define custom assertion methods. Of course, those methods must still test a clearly identified relation that can be made apparent in the method name: assertPegFitsInHole -> good, assertPegIsGood -> bad.

When test methods grow long and repetitive I sometimes find it useful to define fill-in-the-blanks test templates that take a few parameters. Then the actual test methods are reduced to a call to the template method with the appropriate parameters.

As for a lot of things in programming and testing, there is no clear-cut answer. You need to develop a taste, and the best way to do so is to make mistakes.

Solution 4

You can reduce repetition using several different flavours of test utility methods.

I'm more tolerant of repetition in test code than in production code, but I have been frustrated by it sometimes. When you change a class's design and you have to go back and tweak 10 different test methods that all do the same setup steps, it's frustrating.

Solution 5

I agree. The trade off exists but is different in different places.

I'm more likely to refactor duplicated code for setting up state. But less likely to refactor the part of the test that actually exercises the code. That said, if exercising the code always takes several lines of code then I might think that is a smell and refactor the actual code under test. And that will improve readability and maintainability of both the code and the tests.

View more solutions

24,387

Author by

Daryl Spitzer

Father of three, husband, computer programmer (Pythonista), skeptic, atheist, podcast listener, baseball fan, Canadian (in the United States).

Updated on December 04, 2020

Comments

Daryl Spitzer over 3 years

I ruined several unit tests some time ago when I went through and refactored them to make them more DRY--the intent of each test was no longer clear. It seems there is a trade-off between tests' readability and maintainability. If I leave duplicated code in unit tests, they're more readable, but then if I change the SUT, I'll have to track down and change each copy of the duplicated code.

Do you agree that this trade-off exists? If so, do you prefer your tests to be readable, or maintainable?
Tim Frey over 15 years

I think this is a good idea. If you have a lot of duplication, see if you can refactor to create a common "test fixture" under which many tests can run. This will eliminate duplicate setup/teardown code.
seand about 12 years

xUnit and others contain a 'message' argument in assert calls. Good idea to put meaningful phrases to allow devs to quickly find failed test results.
Tony over 11 years

@seand You can try to explain what your assert is checking, but when it's failing and contains somewhat obscured code then developer will need to go and unwind it anyway. IMO It's more important to have code to be self-describing there.
Admin over 11 years

"Duplicated code is a smell in unit test code just as much as in other code." No. "If you have duplicated code in tests, it makes it harder to refactor the implementation code because you have a disproportionate number of tests to update." This happens because you are testing the private API instead of the public API.
Petr Peller over 8 years

But to prevent duplicated code in unit tests you usually need to introduce new logic. I don't think unit tests should contain logic because then you would need unit tests of unit tests.
Anirudh almost 6 years

For readability of reports is more important than tests, specially when doing integration or end to end test, the scenarios could be complex enough to avoid navigating a tiny bit, it is okay to find the failure but again for me the failure in the reports should explain the problem good enough.
KolA about 5 years

@user11617 please define "private API" and "public Api". In my understanding, public Api is the Api that visible to external world / 3rd party consumers and explicitly versioned via SemVer or similar, anything else is private. With this definition almost all Unit tests are testing "private API" and hence more sensitive to code duplication, which I think is true.
Nathan almost 5 years

@KolA "Public" doesn't mean 3rd party consumers - this isn't a web API. The public API of a class refers to the methods that are meant to be used by client code (which typically doesn't/shouldn't be changing that much) - generally the "public" methods. The private API refers to the logic and methods that are used internally. These shouldn't be accessed from outside the class. This is one reason why it is important to correctly encapsulate logic in a class using access modifiers or the convention in the language being used.
KolA almost 5 years

@Nathan any library/dll/nuget package has 3rd party consumers, it doesn't have to be a web api. What I referred to is that it's very common to declare public classes and members that are not supposed to be used directly by library consumers (or at best make them internal and annotate assembly with InternalsVisibleToAttribute) just to allow unit tests to reach them directly. It leads to heaps of tests coupled witht implementation and makes them more of a burden than advantage
Logan Pickup over 4 years

Readability is equally important in tests and non-test code. Refactoring should not hurt readability - it should enhance readability.
Błażej Michalik over 3 years

A good rationale for what's being proposed here comes from fundamental assumptions behind DRY and unit testing: repeating yourself is bad, because it multiplies the work required to change behavior. On the other hand, if you have to change unit tests often, there's a good chance your unit tests are too strict - they are testing how instead of what. An ideal unit test doesn't change after it has been laid out. It means that applying DRY to unit tests does not come with as much profit as when applied to application code. DRY is a response to change. Unit tests shouldn't.
Derek Greer about 3 years

Oddly, I agree with both spiv's and Kristopher Johnson's answers. I would explain it this way: You always want to avoid duplicating implementation details in tests, but never at the expense of hiding the context. Poor choices for removing duplication are inheritance chains and composition of types which hide context. Good choices for removing duplication are Factories, Builders, and Object Mothers.
spikemanuk about 3 years

+1 for "I like when the setup-run-verify rhythm is apparent in test methods." just what I was going to say. So I would think twice before extracting a common method which performs two or more of the setup-run-verify phases
Christian H over 2 years

applause: You don't want your test code to become so complex that you need to write unit-test-tests.