Technical debt and legacy code

Every job I’ve ever had has involved dealing with some amount of technical debt. The amount has varied, but it’s always there, even in modern codebases.

So what is technical debt?

There are loads of definitions – I think of it as code that is difficult or dangerous to change or extend. Technical debt is often associated with legacy code, which Michael Feathers defines as code without automated tests, though I’ve seen legacy code that was straightforward to change, and code with automated tests that wasn’t.

There’s a surprising amount of code with tests that’s difficult to change or extend. Often dependencies have not been isolated well, so the tests need extensive scaffolding. Another reason is an overly complex or poor design – if I find I’m changing every class in a subsystem for a simple addition, there is often a design problem.

The issue for me is not that technical debt exists (we’ve all written some – or at least I have), but how to address it. I think it’s as much a people problem as a technical one, and there are many reasons for it:

Managers discourage changes to existing code

  • Some managers don’t understand why existing code should change to accommodate new features. They ask, ‘Why couldn’t you get it right first time?’ Ironically, the more successful a product, the more likely it is to change and grow, often in unexpected ways.
  • Managers who have been developers in the past know the dangers of changing code that has no tests, and shy away from it for that reason.
  • Others have been burnt by teams that got bogged down performing epic refactorings or rewrites.
  • Finally, there is the ever-present pressure of the roadmap, which managers are typically far more exposed to than any individual developer. Under that pressure, it’s tempting for even the best manager to encourage shortcuts.
I’m maybe being unfair to managers here. Much of the above could be applied to product owners, team leads, or anyone else with a say in what’s being built.

We want to work on new code

Preferably in the hottest language using a cool new framework. It combines the appeal of the new with a great bullet point for the resume. And of course, the newer frameworks and languages can be more productive than older ones. Many (perhaps most) developers would rather attempt a rewrite than refactor legacy code. That can be the right thing to do; often it isn’t.
  • Old code still needs maintaining while the rewrite is happening.
  • In agile teams where backlogs can change rapidly, it’s easy for large rewrites to stall or be abandoned as priorities change.
  • If you are dealing with a codebase that has a lot of duplication, it may be better to reduce the duplication first, rather than introduce yet another mechanism for doing the same thing.
  • If the code is reasonably modern and has decent test coverage, refactoring is often by far the safer route – though some developers still argue for a rewrite.

We don’t recognise that there is a problem

Ignorance or apathy? I’m not sure.

We’re scared to touch it

After being burnt a couple of times by introducing subtle bugs, it’s easy to see why. Which leads to…

We don’t have the skills

Many developers these days have good skills in TDD/BDD (or at least in writing tests concurrently with the code). Fewer know the techniques for dealing with legacy code. A rewrite often feels easier.

What to do about it?

Change attitudes

I like Robert Martin’s boy scout rule: ‘Leave the campground cleaner than you found it’, which encourages a continual improvement mindset. Improve the code base in a minor way every time you make a change, even if it’s just clarifying a name or removing an unused variable.
Changing managerial attitudes can be a little trickier. Modern code is meant to be malleable, and not everyone gets that. Incremental refactoring rather than big epics can help.

Skills and Training

There are some great books, primarily ‘Working Effectively with Legacy Code’, which give practical techniques and insights.
A simple example:
Imagine we have a class that we cannot easily put under test – maybe it references many other classes or a socket or database connection, but we need to add or change some behaviour. Here are a few techniques to consider:
  • If the method in question does not reference any member variables, we can make it static and thereby write tests for it without having to instantiate the whole class. Ugly but effective.
  • We can subclass and, in the derived class, override selected methods to effectively null out dependencies.
  • We can add setter methods to override dependencies.
  • We can make private methods public to get access to them (!).
  • We can link to mock libraries.
  • When adding behaviour, we could create a small object with the new behaviour using TDD and then just call it from the legacy class. In the short term this can be pretty ugly – maybe the new class has only a single method – but over time we could move behaviour as appropriate from the legacy class to the new one.
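To make the subclass-and-override idea concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: ReportSender stands in for a legacy class that opens a real socket, and TestableReportSender is a test-only subclass that replaces the network call so the formatting logic can be exercised on its own.

```python
import socket


class ReportSender:
    """Hypothetical legacy class: hard to test because send() opens a real socket."""

    def __init__(self, host, port):
        self.host = host
        self.port = port

    def format_report(self, totals):
        # The logic we actually want to get under test.
        lines = [f"{name}: {value}" for name, value in sorted(totals.items())]
        return "\n".join(lines)

    def transmit(self, payload):
        # The awkward dependency: a real network connection.
        with socket.create_connection((self.host, self.port)) as conn:
            conn.sendall(payload.encode("utf-8"))

    def send(self, totals):
        payload = self.format_report(totals)
        self.transmit(payload)
        return len(payload)


class TestableReportSender(ReportSender):
    """Test-only subclass: overrides transmit() to null out the network call."""

    def __init__(self):
        super().__init__(host="unused", port=0)
        self.sent = []  # records payloads instead of sending them

    def transmit(self, payload):
        self.sent.append(payload)


sender = TestableReportSender()
sender.send({"b": 2, "a": 1})
print(sender.sent)
```

The legacy class itself is untouched here; the only requirement is that the dependency sits behind a method we can override. If it doesn’t, extracting it into one is usually a small and fairly safe first step.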

These techniques are highly incremental and can make the code feel worse in the short term. Maybe that’s why they aren’t used as much as they could be.
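The last technique in the list, a small new object called from the legacy class, might look something like the sketch below (Feathers calls this a sprout class). The names are hypothetical: InvoiceProcessor stands in for the legacy class we can’t easily test, and LateFeeCalculator is the new behaviour, written test-first with no dependencies of its own.

```python
class LateFeeCalculator:
    """New, hypothetical class developed with TDD; it depends on nothing else."""

    def __init__(self, daily_rate, cap):
        self.daily_rate = daily_rate
        self.cap = cap

    def fee_for(self, days_late):
        # No fee if the payment is on time; otherwise a daily rate up to a cap.
        if days_late <= 0:
            return 0
        return min(days_late * self.daily_rate, self.cap)


class InvoiceProcessor:
    """Stand-in for the legacy class; imagine the rest of it is hard to test."""

    def __init__(self, database):
        self.database = database  # the dependency that makes testing awkward

    def total_due(self, amount, days_late):
        # The only change to the legacy code: a single call to the new object.
        calculator = LateFeeCalculator(daily_rate=5, cap=50)
        return amount + calculator.fee_for(days_late)
```

The new class has a single method and looks a bit lonely, which is the short-term ugliness. The payoff is that the new behaviour is fully tested, and the untested legacy code changed by only a couple of lines.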

Another fun exercise is to try out the excellent ‘Gilded Rose’ kata, which presents a horribly messed up method and asks you to refactor it.

Practice

I learn a lot by doing dry runs. Check out the code and try out a few refactorings. Then throw that code away and try it for real. Don’t be surprised if your real refactoring works out differently – the point of the dry run is to gain confidence and context.

Personally I find improving a legacy codebase can be a rewarding activity in itself. Good luck with yours!