Paying Technical Debt: How to Rescue Legacy Code Through Refactoring

How can you get a legacy codebase under control and bring it to a new level of maturity? This post summarises my advice and lessons learned from years of working on a large legacy web application.

Legacy code can be saved by refactoring

I have good news for you! Squirrels plant thousands of new trees every year by simply forgetting where they leave their acorns. Also: your project can be saved.

No matter how awful a muddy legacy code mess your boss has bravely volunteered for you to deal with, there is a way out of the mire. There will be twists and turns along the way, and a monster behind every other tree. But, one step at a time, you will get there.

Fear no evil

Fair enough, you didn’t ask for this horrific quest. The blood-covered ‘Here Be Dragons’ sign near the entrance of the code swamp induces a strong urge to call in sick for the next couple of years. At first glance, you’d rather have some light lunch first and perhaps a leisurely stroll through the meadow after that than be the petrified peasant who has to rid said swamp of its monsters.

Unfortunately, though, you are forced past the sign by a gallant yet firm nudge from the boss, who will stay behind to defend the town and in the process have some light lunch with the duke who owns the swamp.

Barring some drastic measures, you can’t change these basic facts. However, you can turn a swamp into a magnificent meadow with just one muddy patch where cows live!

Technical debt – How did it get this way?

As you poke and prod the area just past the warning sign with a stick, you might wonder: how on earth can anyone have let this happen? Someone must have seen this coming, surely? Were the people who wrote this code that incompetent?

Possibly. But not more so than anyone else. When push comes to shove, however confident people might seem, nobody else knows what they’re doing either.

Incompetence is far from the whole story. The principle at work here is usually referred to with a metaphor: Technical Debt.

Code cancer

The idea is that during the development of any project, shortcuts will be taken. Ugly hacks will be allowed to get ahead quickly, just this once. This way, debt is accumulated: code that does the job any way it can but is completely incapable of being maintained properly, let alone built upon.

And so, bit by bit, a project’s code is corrupted. Nobody cares about this as long as they’re not the one having to touch it. But eventually, someone has to, and it will probably be you.

This principle is at work in any project, and if it isn’t, it most likely means you’re not moving fast enough. Your competition WILL take shortcuts and get new features out more quickly. As a result, your precious users will leave to join the party in the swamp next door and have a Long Island iced tea at the cocktail bar there, instead of standing in the impeccable square-meter meadow you’ve painstakingly created over the past year. Your meadow might be pretty in and of itself, but there’s just not much going on there.

So, any healthy project will accumulate some technical debt, but in order not to go bankrupt – the code getting so hard to maintain it becomes unmanageable – at some point this debt will have to be settled.

That is where the frightened and resentful peasant comes in, forced to pay the debt in the swamp owner’s stead. Paying this debt is often referred to as refactoring: swapping a bit of code with another bit that does the same but can be maintained and expanded more easily.

Persuading the customer

The duke, being the owner of the swamp, has a bit to say about which monsters you should be adding. There are always new features your customer wants to implement, the deadline for each one is last week and you should feel incompetent for not having finished all of them two weeks ago. So you will probably have a bit of explaining to do to convince him he needs to let you rebuild what he perceives as being in his possession already.

This is a critical point. If you don’t succeed in swaying your customer to start paying what he owes, technical debt will grow to the point of collapse. You will be the guy left in the rubble, held accountable.

In the end, your customer and you want the same thing: have a fun and painless job that earns everyone involved a living. This follows from a stable project built from code you can change without the whole thing coming down on top of you.

Fight for your freedom

So for your own good, you need to make the situation clear. I think the best way to do it is to describe the impending doom lurking just around the corner, what the alternative is, and let the swamp owner know your good intentions:

Explain what technical debt means, and how it makes development slow down to a halt if it gets too high. The customer has to realise in which ways this will cost him: time spent finding bugs, fixing them, and creating new ones, insufficient time to add features and advance his business.
Refactor with clear short-term goals in mind. You can add a new feature in a week, or you can refactor for a week and add it in a day, with the prospect of being able to add similar features in the future very easily. Show how option two will be more expensive short term but infinitely cheaper in the long term.
Point out that you want the best for your customer and yourself, and that these two concerns are really the same thing.
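
The arithmetic behind the second point can be made concrete for the customer. The following is only a sketch in Python under stated assumptions: a week is taken to be five working days, and every similar feature is assumed to cost a full week without the refactoring and one day with it. The function names are made up for this illustration.

```python
WEEK = 5  # working days in a week (assumption)


def days_without_refactoring(features):
    # Every similar feature costs a full week on the old code.
    return features * WEEK


def days_with_refactoring(features):
    # One week of refactoring up front, then one day per feature.
    return WEEK + features * 1


for n in (1, 2, 5, 10):
    print(n, days_without_refactoring(n), days_with_refactoring(n))
```

With these numbers the first feature costs six days instead of five, and from the second similar feature onwards the refactored route is ahead, with the gap widening for every feature after that.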

Once you have the swamp owner on your side, the hardest part is over: you now have the freedom to move the project in the right direction.

A new swamp do not make

You can recover from a huge mess; don’t replace it with a new one. Greenfield rewrites seldom work out, and most never get released.

When you finally convince the duke that people who stroll into his swamp consistently don’t come out again, he might want to build a new version. You might be tempted to comply, starting anew using shiny new tools, convinced everything will be better when you’re done. Don’t.

The main risks of a rewrite, as I see it, are:

The Big Red Button Switch: the new code has never run in production before and is guaranteed to explode on first production use.
Migration of data: Exporting and importing data from the old system while keeping up with live data coming in is likely to go wrong.
Mistakes old and new: While rewriting code you will create bugs, recreate old bugs, miss subtleties and implicit features, … Often these problems will occur in parts of the system that were not really problematic to begin with. You lose lots of time while not adding any value to the business of your customer.
Keeping up with the business: during the long time you’re working on the new project, the old project needs to keep evolving with the business needs as well. The new code will need to rebuild the old features AND add the new features continually being added to the old project.

So, don’t start over. Improve what you have.

Make problems visible

This might sound vaguely unpleasant, but the monsters in your swamp should be spitting in your face, dragons burning off the hairs on your arms, the gnome living near the mushroom-shaped rock kicking you in the shins.

These nuisances are in there ruining your project whether you notice them or not, so you need to face them in order to get rid of them. You need to be able to see what is going wrong at any point in time.

Visualise errors. Make some time every week to fix some of the errors that occur most often. Eventually, the graph will flatline, hopefully beating you to it.
Monitor your environment. This is critical in diagnosing bottlenecks and impending crises (see next paragraph).
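
As a minimal sketch of the first point, assuming errors end up as text lines in a log, finding the errors that occur most often could look like this in Python. The log format, the sample lines and the `top_errors` helper are all hypothetical and not taken from any particular monitoring tool.

```python
from collections import Counter


def top_errors(log_lines, limit=3):
    """Return the most frequent error messages with their counts.

    Assumes a hypothetical log format: "<timestamp> <LEVEL> <message>".
    """
    counts = Counter()
    for line in log_lines:
        parts = line.split(" ", 2)
        if len(parts) == 3 and parts[1] == "ERROR":
            counts[parts[2].strip()] += 1
    return counts.most_common(limit)


# Made-up sample data, for illustration only.
sample_log = [
    "2024-01-01T10:00:00 ERROR Undefined index: cart_id",
    "2024-01-01T10:00:05 INFO Order 17 created",
    "2024-01-01T10:01:00 ERROR Undefined index: cart_id",
    "2024-01-01T10:02:00 ERROR Payment gateway timeout",
]

for message, count in top_errors(sample_log):
    print(f"{count:>3}  {message}")
```

Whatever tops this list is a reasonable candidate for the weekly fixing slot.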

Now you know where and when it hurts, enabling you to administer first aid instead of potentially spending all your time scratching itches on a dying man.

Fight what hurts most

Next, it is vital to have a vision of the system as it ought to be in a perfect world, a mental picture of your ultimate meadow guiding your every move.

Scratch it in the mud so you don’t forget. By taking small intermediate steps in the direction of heaven, you will find yourself looking through the fence eventually.

The key is now to take the information you get from your monitoring tools and the yearning for Walhalla, combining the two to decide which problem to tackle first. Your biggest problem might not be a realistic goal right away, but start refactoring manageable roadblocks first and make your way there.

The crippled pixie that seems cross with you for some reason might annoy you, but she’s harmless. Your time is better spent contemplating how to kick the ogre out.

This code is mine

Fixing problems that matter is essential, but this does not imply details do not matter. In fact, they are probably just as important.

The Boy Scout rule of keeping the camping ground cleaner than you found it is rule number one in the swamp survival guide. By continually cleaning up clutter, taking care not to leave new filth anywhere, your environment will get ever cleaner until you find yourself in the pristine meadow you crave.

To achieve this, attitude is everything. Strive towards clean code.

You have to care. The code is yours and anyone who touches it has to answer to you. Sloppiness and carelessness cannot be tolerated.
The whole team has to care. However much trouble you go through to clean up a project, if somebody undoes your hard work behind you, you’re getting nowhere.
Discipline is of the essence. Once you (or any team member!) start letting things slide, it will be next to impossible to escape.
Keep taking small steps. In the right direction of course. Progress is way more important than achieving perfection.
Small victories generate momentum. You will start seeing really nice patches of code inducing you to clean up adjacent patches as well.

Build a library

A good indicator of civilisation is the number of libraries per square meter. Therefore, to introduce civility into your project, this seems like an excellent approach.

Even the most treacherous morass will have, here and there, a few spots that aren’t too bad on their own. Whenever you find a bit of code that does a thing well, move it to your brand new library so it can be reused.

Doubtless, the code you’re expected to rescue will do things completely wrong. It will somehow completely ignore and circumvent any standards or best practices that would make your life as a developer easier. Nobody will fix this for you, so there is only one way to go: start following these practices and standards in your new code.

Make available industry-standard components or modules and start using them. You will have code using old or custom ways to accomplish things, but that doesn’t mean you can’t build new stuff the proper way.

Refactor using your new tools

Once the tools are there, you can refactor existing code to use them as well. You don’t have to run a marathon sprint to do this; whenever you stumble upon a case that does things the old way, just fix that case. Small steps. You will get to a point where everything is replaced and you won’t recognise the old code anymore.

Often, code would be quite acceptable, if it weren’t for the myriad dependencies (like session access, modules or services consulted on the spot, …) scattered throughout. A good way to handle this is by pushing these out of the method or function the code resides in, so they come in through the method’s parameters. That way you make that bit of code self-contained and it becomes possible to move it around or split it up further.
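
A small sketch of what pushing a dependency out can look like, written in Python for illustration. The `SESSION` dictionary stands in for real session access, and the discount rule and all function names are hypothetical.

```python
# Stand-in for real session access (hypothetical).
SESSION = {"customer_type": "loyal"}


# Before: the rule consults the session on the spot, so it cannot be
# run, tested or moved without that session existing.
def discounted_price_before(price_cents):
    if SESSION.get("customer_type") == "loyal":
        return price_cents - price_cents // 10
    return price_cents


# After: the dependency comes in through a parameter. The function is
# now self-contained: same input, same output, no hidden state.
def discounted_price(price_cents, customer_type):
    if customer_type == "loyal":
        return price_cents - price_cents // 10
    return price_cents


# The caller is now the only place that knows about the session.
def checkout_price(price_cents):
    return discounted_price(price_cents, SESSION.get("customer_type"))
```

The behaviour is unchanged, but `discounted_price` can now be moved into a library or split up further without dragging the session along.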

And once code is self-contained, you can test it.

Building confidence – testing

A violent ogre roaming through the mud can do more damage than a violent ogre in chains. A great way to gain some control over your system is trying to put chains on the parts that are most critical and thus potentially cause the most damage. Tests, preferably automated, that run on every build.

The more automated tests you have to check the behaviour of the system, the more confidence you can have when changing things. If anything breaks, your tests will tell you something is wrong, so you can fix it before the problem makes it to the production environment.

High-level tests

One way to start testing your system is by adding acceptance tests for critical scenarios within your system. For an average e-commerce system this would definitely include the checkout process: if no orders come in, you lose money. If orders come in but there is a problem processing them, that’s something you can recover from without any customer noticing. With tests like that in place you can protect yourself from unwittingly introducing major problems.

Low-level tests

On a lower level you can apply unit tests. Using the small steps approach of pushing dependencies out of your methods one by one, the result will be code that has a clear input and a clear output. In this situation it becomes possible to cover the functionality of that code using unit tests. You define a set of inputs and declare the outputs you expect your code to produce, and your test suite will make sure that bit of code keeps behaving as expected.
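
To illustrate, here is a minimal sketch using Python's standard `unittest` module. The `shipping_cost` rule and its amounts are invented for this example; the point is only the shape of the test: a table of inputs with the outputs you expect.

```python
import unittest


def shipping_cost(order_total_cents):
    """Hypothetical business rule: orders of 50.00 or more ship for free."""
    if order_total_cents < 0:
        raise ValueError("order total cannot be negative")
    return 0 if order_total_cents >= 5000 else 495


class ShippingCostTest(unittest.TestCase):
    def test_known_inputs_produce_expected_outputs(self):
        # (input, expected output) pairs, including both sides of the boundary.
        cases = [
            (0, 495),
            (4999, 495),
            (5000, 0),
            (12000, 0),
        ]
        for order_total, expected in cases:
            with self.subTest(order_total=order_total):
                self.assertEqual(shipping_cost(order_total), expected)

    def test_negative_total_is_rejected(self):
        with self.assertRaises(ValueError):
            shipping_cost(-1)


# Run the tests in-process; a real project would let its build do this.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ShippingCostTest)
result = unittest.TextTestRunner().run(suite)
```

Because the function has no hidden dependencies, the test needs no database, session or framework to run.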

Don’t test everything

Testing every single line of code in a system is both economically unviable and unnecessary. In principle it would be great to be perfectly confident about every aspect of your system but the cost of making everything testable, writing tests and maintaining the tests will start to outweigh the benefits at some point. In my experience, your time is best spent writing tests for non-trivial business logic and parts of the system that would cause havoc should they malfunction (even if the code is actually simple).

In this way you get the most dangerous monsters in chains and the swamp will be a safer, more predictable place for it.

Isolate and replace

A useful technique that combines pretty much all of the strategic patterns listed above is the old ‘isolate-and-replace’.

Every once in a while you will come across some code that is just too intangible in its internal workings and effects to be properly refactored in small steps within polynomial time. In a case like this, consider the following approach:

Isolate the messy part, drop it in a separate method or class.
Push out dependencies so they come in as parameters.
Split logic from side-effects (like saving to a DB) using a small-steps approach.
Isolate the logic in a separate method.
Add unit tests that cover the behaviour of the logic
Rewrite the logic from scratch. If it passes the tests, it should at least be close to the behaviour displayed by the old version
Run the old and new versions of the logic side by side in the production environment. The old version is still the one actually used.
Log differences in output between the new and the old version.
Monitor the logs diligently. Whenever a difference shows up, add a new unit test for that case and fix the new version to pass the new case.
Repeat until the old and new versions consistently agree with each other.
Switch the old version of the logic with the new version with confidence.
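
The side-by-side steps above can be sketched as follows. This is an illustration in Python under assumptions, not a definitive implementation: `old_vat` and `new_vat` are made-up stand-ins for the legacy logic and its rewrite, and `run_side_by_side` is a hypothetical helper name.

```python
import logging

logger = logging.getLogger("isolate_and_replace")


def old_vat(amount_cents):
    """Stand-in for the legacy logic: 21% VAT, truncated (hypothetical)."""
    return amount_cents * 21 // 100


def new_vat(amount_cents):
    """Stand-in for the rewrite: 21% VAT, rounded half up (hypothetical)."""
    return (amount_cents * 21 + 50) // 100


def run_side_by_side(old, new, *args):
    """Return the old result; log whenever the new version disagrees."""
    old_result = old(*args)
    try:
        new_result = new(*args)
    except Exception:
        # A crash in the candidate must never affect production behaviour.
        logger.exception("new version raised for input %r", args)
        return old_result
    if new_result != old_result:
        logger.warning(
            "mismatch for input %r: old=%r new=%r", args, old_result, new_result
        )
    return old_result


# Production keeps using the old answer while differences are logged.
vat = run_side_by_side(old_vat, new_vat, 1050)
```

Here the two versions agree on an input of 1000 but not on 1050, so that input ends up in the log and becomes the next unit test. Only when the log stays quiet do you swap the return value over to the new version.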

This technique allows you to swap a car’s motor without the car breaking down, or even noticing that its internals were replaced.

Persuading yourself

The mindset and tools outlined above have proven, to me, to be a successful way of steering a system on the brink of collapse away from disaster.

Where initially I was hesitant to dedicate my efforts full-time to the task of taking on a long-term legacy challenge, I now see the many ways in which it has contributed to me becoming a better developer:

Learning to recognise bad code. You learn to see what’s bad because it’s what you have to spend most time on. The patterns you see here will not trip you up when you work on a new project because you know how it will hurt you in the future.
Back to basics. In my case, every single aspect of the system had to be questioned and re-evaluated. Nothing is holy and you get an appreciation for lots of issues modern-day frameworks handle for you.
You evolve with the system. You see new challenges arise as the system in your care grows. This way you learn which bottlenecks you need to deal with as more and more people use your work and you need to scale up to stay on top of things.
People are generating shiny new legacy constantly. Diligently. Faster than they generate garbage belts in our oceans. Knowing how to make it work is a good skill to have.
Fun and rewarding. Especially looking back on a road well traveled and realising the system you have today looks nothing like the system you first met years ago induces a real sense of accomplishment.

To bring this substantial (or lengthy, at least) post to a close:

You can be the frightened peasant tiptoeing through the swamp hoping somebody will come find you and take you to Disneyland instead, or you can take matters into your own hands, act like you own the place, and enforce your own rules. Any swamp has solid ground underneath, and you can find it.

Epilogue

A few years later, you stroll through the meadow. On the horizon, you can see the duke replacing the old ‘Here Be Dragons’ sign with a beer commercial.

He’s got a huge grin on his face.

This article was first published on blog.intracto.com.


Jeroen Moons

Jeroen Moons is a senior Web developer at Intracto Digital Agency (Belgium), working on legacy old and new. He has expertise in e-commerce and Web application conversion optimisation along with an interest in machine learning, AI, genetic algorithms, and science in general.
