Legacy Code Photo by Pixabay Contributor

Legacy Code

As developers, we know all too well that code has a lifespan. Knowing how to avoid code re-writes, how to improve software longevity, readability, maintainability and reduce the risk of bugs are all ongoing challenges. It’s clear that a good team will need to set some best practices and standards either directly through documentation or indirectly through consistency in the code. Some recent reading around the topic of legacy code (Working Effectively with Legacy Code by Michael Feathers) and technical debt inspired me to write this post to re-collect what I’ve learned. There are actually some very simple tweaks that everyone can do to make technical debt easier to deal with. When done right and often, small changes can hang together as part of a much bigger picture or help the next guy pull out some code, uplift a module or re-write some functionality from scratch. Sure enough ensuring your code reads well and that it is free from bugs will mean a focus on writing tests. Although I find that the cost of writing tests is not actually as time consuming as people think, instead it tends to be more about letting go of some utopian best practices and being willing to make some short term compromises. Here are my tips and thoughts on improving legacy code.

1. Breaking Dependencies

When we talk about legacy code the largest burden by far and large is making new changes without changing or losing any existing functionality. TDD tackles this but some classes can grow so out of control that it may seem difficult to test them. Testable code in general leads to well designed and clean code. Here are some approaches to breaking out dependencies to lead to easier testing.

2. Naming

The power of naming can often be underestimated. Every team will have their own set standards and you can normally gauge by the existing code what makes a consistent name to fit within the mix. You should choose names that make sense to your team and within your domain but not unobvious to new comers. Failing that, my personal view is that a good name is one that is short as possible and as long as it needs to be. As a rule of thumb they should be at least more than one letter unless there is a good reason such as a lambda, algebra, formulas, for loops or in contexts where they make sense. Conversely they should be shorter than around 30-35 characters. A name should be descriptive and should not use un-documented or uncommon acronyms. It can be amazing how many short hands can be used through a codebase on the assumption that everyone is familiar with their meaning. If its a function include a verb to indicate it is actually a function and try not to use ‘and’, come up with a different name instead that represents the function as a whole or break down the method further. I am guilty of this but try not to embed a variable’s type into the variable name, if anyone changes the type at any point this can be quite confusing. Think twice before including ‘Interface’ or ‘Abstract’ into your abstractions. These are a waste of characters and are not that much more expressive. This also goes to a lesser extent to Hungarian notation. For example, a IVehicle/Car could just simply be Vehicle/Car on their own and it’s normally not too difficult in modern IDEs to see that a class is either abstract or an interface. Swap out comments with well-named variables as your code should be as good and self descriptive, comments should really be used sparingly.

3. Design Roadmaps

There can be particular confusing parts to an application or there may be a lack of structure altogether. It pays to have discussions with team members about the overall architecture to reinforce understanding. Draw out some diagrams, document features and the overall design separately to the codebase. An up to date confluence page is going to help reinforce everyone’s understanding and ensure everyone is on the same page. Moreover this encourages knowledge sharing in the team and for others to do the same.

4. Delete code and comments

There are countless times where unused code and old comments have led me to a bunch of confusion. Sometimes an old way of doing something is left around or some comments surrounding a method no longer make sense. These are small things but can introduce a lot of confusion and misdirection. If you know a line or two that doesn’t belong there anymore you should spend a couple of minutes to clean it up or delete it in a separate commit.

5. Logging

Without logs debugging production issues is pure guess work. I personally advocate more logging at the risk of storing too much and over-polluting so I can at least capture issues, but you may choose to go with a more strategic logging strategy. Either way, define a standard, stick with it to make it easily searchable. Logging is going to aid your understanding on legacy code.

6. Error Handling

Putting best practices to good use when handling exceptions will pay dividends for debugging production issues. Use the right exception for an error, if unexpectedly a number is null then throw a NullPointerException, if an argument is incorrect throw an IllegalArgumentException. Try-catches should start with the most specific exception in the block and then cascade down to more generic ones so that errors are logged with greater accuracy. Re-throwing exceptions in many languages will mean you lose information about the original stack trace such as the original source and line number of the error. Personally I try to deal with an exception within the original try-catch block, log the exception and then use something similar to the Null Object pattern to notify the caller that something went wrong. If this is not possible it would be better to create a new type for your exception to best wrap the original exception and better describe the error. Last of all, I would try to standardise exceptions as much as possible as well as the messages that are thrown to improve readability.

Bad Error Handling

public String getCustomerName(int id) {
    try {
        // stream will not close automatically 
        // unless handled ina finally statement
        FileInputStream fio = new FileInputStream(filename);
        Customer c = getCustomer(fio, id);
        return c.toString();
    }
    // no specific error handling
    catch (Exception e) {
        // re-throwing error loses stack trace
        // and can cause confusion if handled twice
        throw e;
    }
    return "";
}

Better Error Handling

public String getCustomerName(int id) {
    // using a try-with statement so that the 
    // stream is automatically closed
    try (FileInputStream fio = new FileInputStream(filename)) {
        Customer c = getCustomer(fio);
        return c.toString();
    }
    catch (FileNotFoundException e2) {
        // specific error is handled first
        logError(e2);
    }
    catch (NullPointerException e1) {
        logError(e1);
        // if we really need to re-throw an exception then
        // we wrap it in a new exception to be handled 
        // by the caller
        throw new CustomerNotFoundException("Customer not found.", e1);
    }
    catch (Exception e) {
        logError(e);
    }
    return "";
}

These are just a few points I would consider important to moving legacy code forward with a particular emphasis on tests and breaking dependencies to facilitate change and refactoring without breaking functionality. Hopefully some of the techniques here reinforce what you may already know and help bring your code to a cleaner state.