Can unit testing be successfully added into an existing production project? If so, how and is it worth it?

Solution 1

I've introduced unit tests to code bases that did not have them previously. On the last big project where I did this, the product was already in production with zero unit tests when I joined the team. When I left - 2 years later - we had 4,500 or so tests yielding about 33% code coverage in a code base with 230,000+ production LOC (a real-time financial WinForms application). That may sound low, but the result was a significant improvement in code quality and defect rate - plus improved morale and profitability.

It can be done when you have both an accurate understanding and commitment from the parties involved.

First of all, it is important to understand that unit testing is a skill in itself. You can be a very productive programmer by "conventional" standards and still struggle to write unit tests in a way that scales in a larger project.

Also, and specifically for your situation, adding unit tests to an existing code base that has no tests is a specialized skill in itself. Unless you or somebody on your team has successful experience with introducing unit tests to an existing code base, I would say reading Feathers' book is a requirement (not optional or strongly recommended).

Making the transition to unit testing your code is an investment in people and skills just as much as in the quality of the code base. Understanding this is very important in terms of mindset and managing expectations.

Now, for your comments and questions:

However, I'm concerned that I'll end up missing the big picture and leaving out fundamental tests that would have been included if I'd used TDD from the get-go.

Short answer: Yes, you will miss tests, and yes, they might not initially look like what they would have in a greenfield situation.

The deeper answer is this: it does not matter. You start with no tests. Start adding tests, and refactor as you go. As skill levels improve, start raising the bar for all newly written code added to your project. Keep improving, and so on.

Now, reading between the lines here, I get the impression that this is coming from a mindset of "perfection as an excuse for not taking action". A better mindset is to focus on self-trust: you may not know how to do it yet, but you will figure it out as you go and fill in the blanks. Therefore, there is no reason to worry.

Again, it's a skill. You cannot go from zero tests to TDD perfection in one linear, "step-by-step" cookbook approach. It will be a process. Your expectation should be to make gradual, incremental progress and improvement. There is no magic pill.

The good news is that as the months (and even years) pass, your code will gradually start to become "proper", well-factored and well-tested code.

As a side note, you will find that the primary obstacle to introducing unit tests into an old code base is a lack of cohesion and excessive dependencies. You will therefore probably find that the most important skill becomes breaking existing dependencies and decoupling code, rather than writing the actual unit tests themselves.
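
To make that concrete, here is a minimal C# sketch of the most common dependency-breaking move: extracting an interface from a hard-wired collaborator and injecting it through the constructor. All of the names (Order, IOrderGateway, OrderProcessor, SqlOrderGateway) are invented for illustration, not taken from any particular code base:

    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical domain type used by the example.
    public class Order
    {
        public decimal Amount { get; set; }
    }

    // Before the change, OrderProcessor created a concrete SqlOrderGateway
    // internally, so it could not be exercised without a live database:
    //
    //     private readonly SqlOrderGateway _gateway = new SqlOrderGateway();
    //
    // Extracting an interface and injecting it through the constructor breaks
    // that dependency without changing the class's observable behaviour.
    public interface IOrderGateway
    {
        IEnumerable<Order> LoadOrders(int customerId);
    }

    public class OrderProcessor
    {
        private readonly IOrderGateway _gateway;

        public OrderProcessor(IOrderGateway gateway)
        {
            _gateway = gateway;
        }

        public decimal TotalFor(int customerId)
        {
            // Pure computation over whatever the gateway returns, so it can
            // be tested with a fake gateway and no database.
            return _gateway.LoadOrders(customerId).Sum(o => o.Amount);
        }
    }

Production code passes in the real gateway; a unit test passes in a fake. Feathers' book catalogues many more such moves for code that is harder to untangle than this.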

Are there any processes/steps that should be adhered to in order to ensure that an existing solution is properly unit tested and not just bodged in?

Unless you already have one, set up a build server with a continuous integration build that runs on every check-in, including all unit tests and code coverage.

Train your people.

Start somewhere and start adding tests while you make progress from the customer's perspective (see below).

Use code coverage as a guiding reference of how much of your production code base is under test.

Build time should always be FAST. If your build time is slow, your unit testing skills are lagging. Find the slow tests and improve them (decouple production code and test in isolation). With well-written tests, you should easily be able to have several thousand unit tests and still complete a build in under 10 minutes (roughly 1 to a few milliseconds per test is a good but very rough guideline; a few exceptions may apply, such as code using reflection).
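
As a rough illustration of what "test in isolation" means for speed, here is a hedged NUnit sketch that builds on the hypothetical OrderProcessor example above: the fake gateway keeps everything in memory, so the test completes in a millisecond or two, and thousands of tests like it fit comfortably inside the kind of build time described here.

    using System.Collections.Generic;
    using NUnit.Framework;

    // Hand-rolled fake for the IOrderGateway sketched earlier: no database,
    // no network and no file I/O, which is what keeps the test fast.
    class FakeOrderGateway : IOrderGateway
    {
        public IEnumerable<Order> LoadOrders(int customerId)
        {
            return new[]
            {
                new Order { Amount = 10.0m },
                new Order { Amount = 2.5m }
            };
        }
    }

    [TestFixture]
    public class OrderProcessorTests
    {
        [Test]
        public void TotalFor_SumsAllOrderAmounts()
        {
            var processor = new OrderProcessor(new FakeOrderGateway());

            Assert.AreEqual(12.5m, processor.TotalFor(1));
        }
    }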

Inspect and adapt.

How can I ensure that the tests are of good quality and aren't just a case of "any test is better than no tests"?

Your own judgement must be your primary source of reality. There is no metric that can replace skill.

If you don't have that experience or judgement, consider contracting someone who does.

Two rough secondary indicators are total code coverage and build speed.

Is it worth the effort for an existing solution that's in production?

Yes. The vast majority of the money spent on a custom-built system or solution is spent after it is put into production. And investing in quality, people and skills should never be out of style.

Would it be better to ignore the testing for this project and add it in a possible future re-write?

You would have to take into consideration not only the investment in people and skills, but most importantly the total cost of ownership and the expected lifetime of the system.

My personal answer would be "yes, of course" in the majority of cases, because I know it's just so much better, but I recognize that there might be exceptions.

What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?

Neither. Your approach should be to add tests to your code base WHILE you are making progress in terms of functionality.

Again, it is an investment in people, skills AND the quality of the code base, and as such it will require time. Team members need to learn how to break dependencies, write unit tests, form new habits, improve discipline and quality awareness, design software better, and so on. It is important to understand that when you start adding tests, your team members likely don't yet have these skills at the level they need to be for that approach to succeed, so stopping progress to spend all your time adding a lot of tests simply won't work.

Also, adding unit tests to an existing code base of any sizeable project is a LARGE undertaking which requires commitment and persistence. You can't change something fundamental, expect a lot of learning along the way, and at the same time ask your sponsor to accept a halt in the flow of business value with no ROI. That won't fly, and frankly it shouldn't.

Thirdly, you want to instill sound, business-focused values in your team. Quality never comes at the expense of the customer, and you can't go fast without quality. Also, the customer lives in a changing world, and your job is to make it easier for them to adapt. Customer alignment requires both quality and a steady flow of business value.

What you are doing is paying off technical debt, and you are doing so while still serving your customer's ever-changing needs. Gradually, as debt is paid off, the situation improves and it becomes easier to serve the customer better and deliver more value. This positive momentum is what you should aim for, because it underlines the principle of sustainable pace and will maintain and improve morale - for your development team, your customer and your stakeholders alike.

Hope that helps

Solution 2

  • Is it worth the effort for an existing solution that's in production?

Yes!

  • Would it be better to ignore the testing for this project and add it in a possible future re-write?

No!

  • What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?

Adding testing (especially automated testing) makes it much easier to keep the project working in the future, and it makes it significantly less likely that you'll ship stupid problems to the user.

The tests to put in up front are ones that check that the public interface to your code (and to each module in it) works the way you believe it does. If you can, also try to induce each isolated failure mode that your code modules should have (note that this can be non-trivial, and be careful not to check too closely how things fail; for example, you don't really want to count the number of log messages produced on failure, since verifying that the failure is logged at all is enough).
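
As a hedged sketch of both kinds of test, here is what that might look like with NUnit. TradeFileParser and Trade are invented stand-ins for one of your own public entry points (a minimal implementation is included only so the example compiles); the shape of the tests is the point.

    using System;
    using NUnit.Framework;

    // Minimal stand-in implementation; in practice this would be existing
    // production code whose public interface you are pinning down.
    public class Trade
    {
        public string Symbol { get; set; }
        public int Quantity { get; set; }
    }

    public class TradeFileParser
    {
        public Trade Parse(string line)
        {
            var parts = line.Split(';');
            int quantity;
            if (parts.Length < 3 || !int.TryParse(parts[1], out quantity))
                throw new FormatException("Not a valid trade line: " + line);
            return new Trade { Symbol = parts[0], Quantity = quantity };
        }
    }

    [TestFixture]
    public class TradeFileParserTests
    {
        [Test]
        public void Parse_WellFormedLine_ReturnsTrade()
        {
            var parser = new TradeFileParser();

            Trade trade = parser.Parse("ACME;100;25.40");

            Assert.AreEqual("ACME", trade.Symbol);
            Assert.AreEqual(100, trade.Quantity);
        }

        [Test]
        public void Parse_MalformedLine_ThrowsFormatException()
        {
            var parser = new TradeFileParser();

            // Pin down only the observable contract (the exception type); do
            // not assert on incidental detail such as how many log lines were
            // written on the way to the failure.
            Assert.Throws<FormatException>(() => parser.Parse("not a trade"));
        }
    }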

Then put in a test for each current bug in your bug database that reproduces exactly that bug and will pass when the bug is fixed. Then fix those bugs! :-)
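
One common convention, sketched below with invented names and a made-up ticket number, is to name the test after the bug-tracker entry so the link back to the defect stays obvious; the test should fail against the buggy code and pass once the fix is in.

    using NUnit.Framework;

    // Minimal stand-in for existing production code; the class and the ticket
    // number are invented for the example.
    public class InterestCalculator
    {
        private readonly decimal _annualRatePercent;

        public InterestCalculator(decimal annualRatePercent)
        {
            _annualRatePercent = annualRatePercent;
        }

        public decimal MonthlyInterest(decimal principal, decimal repaid)
        {
            // The fix for "bug #1234": interest accrues on the outstanding
            // balance, not on the original principal.
            decimal outstanding = principal - repaid;
            return outstanding * (_annualRatePercent / 100m) / 12m;
        }
    }

    [TestFixture]
    public class InterestCalculatorRegressionTests
    {
        // Bug #1234 (hypothetical): interest was calculated on the original
        // principal even after a partial repayment.
        [Test]
        public void Bug1234_InterestAccruesOnOutstandingBalance()
        {
            var calculator = new InterestCalculator(5m);          // 5% annual rate

            decimal interest = calculator.MonthlyInterest(1000m, 400m);  // 400 already repaid

            // 5% per year on the remaining 600, spread over 12 months.
            Assert.AreEqual(2.5m, interest);
        }
    }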

It does cost time up front to add tests, but you get paid back many times over at the back end as your code ends up being of much higher quality. That matters enormously when you're trying to ship a new version or carry out maintenance.

Solution 3

The problem with retrofitting unit tests is you'll realise you didn't think of injecting a dependency here or using an interface there, and before long you'll be rewriting the entire component. If you have the time to do this, you'll build yourself a nice safety net, but you could have introduced subtle bugs along the way.

I've been involved with many projects which really needed unit tests from day one, and there is no easy way to get them in there short of a complete rewrite, which cannot usually be justified when the code is working and already making money. Recently, I have resorted to writing PowerShell scripts that exercise the code in a way that reproduces a defect as soon as it is raised, and then keeping these scripts as a suite of regression tests for further changes down the line. That way you can at least start to build up some tests for the application without changing it too much; however, these are more like end-to-end regression tests than proper unit tests.

Solution 4

I do agree with what most everyone else has said. Adding tests to existing code is valuable. I will never disagree with that point, but I would like to add one caveat.

Although adding tests to existing code is valuable, it does come at a cost. It comes at the cost of not building out new features. How these two things balance out depends entirely on the project, and there are a number of variables.

  • How long will it take you to put all that code under test? Days? Weeks? Months? Years?
  • Who are you writing this code for? Paying customers? A professor? An open source project?
  • What is your schedule like? Do you have hard deadlines you must meet? Do you have any deadlines at all?

Again, let me stress, tests are valuable and you should work to put your old code under test. This is really more a matter of how you approach it. If you can afford to drop everything and put all your old code under test, do it. If that's not realistic, here's what you should do at the very least:

  • Any new code you write should be completely under unit test
  • Any old code you happen to touch (bug fix, extension, etc.) should be put under unit test

Also, this is not an all or nothing proposition. If you have a team of, say, four people, and you can meet your deadlines by putting one or two people on legacy testing duty, by all means do that.

Edit:

I'm aiming to write this question up later with pros and cons to try and show management that it's worth spending the man hours on moving the future development of the product to TDD.

This is like asking "What are the pros and cons to using source control?" or "What are the pros and cons to interviewing people before hiring them?" or "What are the pros and cons to breathing?"

Sometimes there is only one side to the argument. You need to have automated tests of some form for any project of any complexity. No, tests don't write themselves, and, yes, it will take a little extra time to get things out the door. But in the long run it will take more time and cost more money to fix bugs after the fact than to write tests up front. Period. That's all there is to it.

Solution 5

When we started adding tests, it was to a ten-year-old, approximately million-line codebase, with far too much logic in the UI and in the reporting code.

One of the first things we did (after setting up a continuous build server) was to add regression tests. These were end-to-end tests.

  • Each test suite starts by initializing the database to a known state. We actually have dozens of regression datasets that we keep in Subversion (in a separate repository from our code, because of the sheer size). Each test's FixtureSetUp copies one of these regression datasets into a temp database, and then runs from there.
  • The test fixture setup then runs some process whose results we're interested in. (This step is optional -- some regression tests exist only to test the reports.)
  • Then each test runs a report, outputs the report to a .csv file, and compares the contents of that .csv to a saved snapshot. These snapshot .csvs are stored in Subversion next to each regression dataset. If the report output doesn't match the saved snapshot, the test fails.
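
The snapshot comparison at the heart of such a test can be as simple as the following hedged sketch. ReportRunner and the file paths are invented for the example, and the regression database is assumed to have been restored by the fixture setup described above.

    using System.IO;
    using NUnit.Framework;

    [TestFixture]
    public class AgedDebtorsReportRegressionTest
    {
        [Test]
        public void Report_MatchesSavedSnapshot()
        {
            // Run the report against the regression database restored in the
            // fixture setup, writing its output to a temporary .csv file.
            // ReportRunner is a hypothetical wrapper around the reporting code.
            string actualPath = Path.Combine(Path.GetTempPath(), "AgedDebtors.csv");
            ReportRunner.Run("AgedDebtors", actualPath);

            // Compare against the snapshot kept under version control next to
            // the regression dataset. A mismatch means either a real regression
            // or a deliberate change, in which case the snapshot is updated.
            string expected = File.ReadAllText(@"RegressionData\Snapshots\AgedDebtors.csv");
            string actual = File.ReadAllText(actualPath);

            Assert.AreEqual(expected, actual);
        }
    }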

The purpose of regression tests is to tell you if something changes. That means they fail if you broke something, but they also fail if you changed something on purpose (in which case the fix is to update the snapshot file). You don't know that the snapshot files are even correct -- there might be bugs in the system (and then when you fix those bugs, the regression tests will fail).

Nevertheless, regression tests were a huge win for us. Just about everything in our system has a report, so by spending a few weeks getting a test harness around the reports, we were able to get some level of coverage over a huge part of our code base. Writing the equivalent unit tests would have taken months or years. (Unit tests would have given us far better coverage, and would have been far less fragile; but I'd rather have something now, rather than waiting years for perfection.)

Then we went back and started adding unit tests when we fixed bugs, or added enhancements, or needed to understand some code. Regression tests in no way remove the need for unit tests; they're just a first-level safety net, so that you get some level of test coverage quickly. Then you can start refactoring to break dependencies, so you can add unit tests; and the regression tests give you a level of confidence that your refactoring isn't breaking anything.

Regression tests have problems: they're slow, and there are too many reasons why they can break. But at least for us, they were so worth it. They've caught countless bugs over the last five years, and they catch them within a few hours, rather than waiting for a QA cycle. We still have those original regression tests, spread over seven different continuous-build machines (separate from the one that runs the fast unit tests), and we even add to them from time to time, because we still have so much code that our 6,000+ unit tests don't cover.

Comments

  • djdd87
    djdd87 almost 4 years

    I'm strongly considering adding unit testing to an existing project that is in production. It was started 18 months ago, before I could really see any benefit of TDD (face palm), so now it's a rather large solution with a number of projects and I haven't the foggiest idea where to start in adding unit tests. What's making me consider this is that occasionally an old bug seems to resurface, or a bug is checked in as fixed without really being fixed. Unit testing would reduce or prevent these issues from occurring.

    By reading similar questions on SO, I've seen recommendations such as starting at the bug tracker and writing a test case for each bug to prevent regression. However, I'm concerned that I'll end up missing the big picture and leaving out fundamental tests that would have been included if I'd used TDD from the get-go.

    Are there any processes/steps that should be adhered to in order to ensure that an existing solution is properly unit tested and not just bodged in? How can I ensure that the tests are of good quality and aren't just a case of "any test is better than no tests"?

    So I guess what I'm also asking is:

    • Is it worth the effort for an existing solution that's in production?
    • Would it be better to ignore the testing for this project and add it in a possible future re-write?
    • What will be more beneficial: spending a few weeks adding tests or a few weeks adding functionality?

    (Obviously the answer to the third point is entirely dependent on whether you're speaking to management or a developer)


    Reason for Bounty

    Adding a bounty to try and attract a broader range of answers that not only confirm my existing suspicion that it is a good thing to do, but also give some good reasons against.

    I'm aiming to write this question up later with pros and cons to try and show management that it's worth spending the man hours on moving the future development of the product to TDD. I want to approach this challenge and develop my reasoning without my own biased point of view.

    • Ralph Sinsuat
      Ralph Sinsuat almost 14 years
      Obligatory link to Michael Feathers' book on the topic: amazon.com/Working-Effectively-Legacy-Michael-Feathers/dp/…
    • djdd87
      djdd87 almost 14 years
      @Mark - Thanks, it's in the link I provided. I was hoping to get a decent answer without buying another book... although you can never have too many books (unless you want to get work done that is).
    • manuel aldana
      manuel aldana almost 14 years
      You really need to read this book :) It is one of my favorites and really helps to understand the tension between refactoring and automated-testing.
  • Roopesh Shenoy
    Roopesh Shenoy almost 14 years
    And I urge you to read this - joelonsoftware.com/items/2009/01/31.html
  • Ihor Kaharlichenko
    Ihor Kaharlichenko almost 14 years
    Would you prefer to 'fix pending bugs' or prevent them from arising? Proper unit testing does save time by minimizing the amount of time spent on bug-fixing.
  • Roopesh Shenoy
    Roopesh Shenoy almost 14 years
    That's a myth. If you are telling me that automated unit tests are a replacement for manual testing, then you are seriously, seriously mistaken. And what do manual testers log, if not bugs?
  • Roopesh Shenoy
    Roopesh Shenoy almost 14 years
      And yeah, don't get me wrong - I'm not saying unit tests are an absolute waste - the point is, considering the time it takes to write them and the reasons for which they might have to change when you change your product, do they really pay back? For me, I have tried both sides and the answer has been no, they don't pay back fast enough.
  • Donal Fellows
    Donal Fellows almost 13 years
    If the code is split up reasonably well from the beginning, it's fairly easy to test it. The problem comes when you've got code that is messily all stitched together and where the only test points exposed are for full integration tests. (I have code like that where testability is nearly zero because of the amount that it relies on other components that aren't easy to mock, and where deployment requires a server reboot. Testable… but hard to do well.)