I came across this article by Steve Blank – “Startup Suicide – Rewriting the Code”. It can be summed up as:
- Under the pressures of growth, some startups allow their code base to get messy.
- This mess – and the pressure to keep it working – results, over time, in a slower delivery of features.
- This leads to a failure to adapt, which is seen as an existential threat to the future of the business.
- The business decides that a rewrite of the product is the way to go.
- This leads to death.
I don’t dispute the main premise of the article, but I do dispute the title. The answer is to rewrite the product. The question is: how do you do that?
Let’s put the obvious on the table: the company is already doomed unless something changes. The failure to adapt will see the company collapse eventually – and they recognise it. Preserving the status quo is not an option.
The riskiest thing is to do nothing
Ergo, taking a risky decision can’t be considered suicide (it may be euthanasia). The old code has to go away – or, at least, not be worked on anymore (eventually). Still, there are smart decisions and dumb decisions.
- Stopping all work on the current version to write the new one? Dumb. You bring forward the fate you’re worried about (not being able to adapt and deliver change). Unless you can deliver the new version in a very short time period, don’t go here.
- Continuing to add features to the existing version while trying to get the new one up? Also dumb. First, you slow down your rate of delivery on your old version. Second, your new version is going to take longer to deliver. You probably have to make changes in both – double the work, for no more benefit. And, of course, you create a two-tier culture – nobody wants to work on the old platform.
The smart option (well, one smart option) is to work out how to build the new version and integrate with the old version. All new features get built in the new version. When you need to change an existing feature, you rewrite just that feature in the new version (with the change you want), and change the old version just enough to integrate it with the new one. And deliver the new version – integrated with the old – on short time cycles.
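One way to picture that integration is a single entry point that prefers the new version and falls back to the old one for anything not yet rewritten. The following is only a minimal sketch under invented assumptions: the feature names, the handler functions, and `dispatch` are all hypothetical, and a real product would more likely do this routing at an HTTP or UI boundary.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


# Hypothetical legacy features: left alone until they need to change.
def legacy_invoice(request: dict) -> str:
    return f"legacy invoice for {request['customer']}"


def legacy_report(request: dict) -> str:
    return f"legacy report for {request['customer']}"


# Hypothetical rewritten feature, built in the new version with the change we wanted.
def new_invoice(request: dict) -> str:
    return f"new invoice for {request['customer']} (with tax)"


legacy_handlers: Dict[str, Handler] = {
    "invoice": legacy_invoice,
    "report": legacy_report,
}
new_handlers: Dict[str, Handler] = {
    "invoice": new_invoice,
}


def dispatch(feature: str, request: dict) -> str:
    """Prefer the new version; fall back to the old one if not yet rewritten."""
    handler = new_handlers.get(feature) or legacy_handlers.get(feature)
    if handler is None:
        raise KeyError(f"unknown feature: {feature}")
    return handler(request)
```

In this sketch, `dispatch("invoice", ...)` reaches the rewritten code while `dispatch("report", ...)` still reaches the legacy code, and the caller cannot tell which side served it. Moving a feature across is a matter of adding one entry to `new_handlers`.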
This way, you move the bulk of your development into the new version – a little at the start, and ramping up as it gets stable. You deliver the new version integrated with the old – from the user’s point of view, it’s one product. You start to get immediate payoff from the investment, and you reap the benefits in the most valuable area: the places where things are changing right now. Over time, the new version comes to cover the places where things change most frequently. Sure, you’ve still got some of the crufty legacy system lying around, but if it’s not being worked on, does that matter?
Exactly how you achieve the integration will vary, depending on what the product is and the technology choices you have. My personal preference would be to move towards a modular set of services – the old application would be a client to the service, and eventually (as it gets thinner and thinner), you can put a new client in its place. The new services don’t have to do everything you need immediately – they just need to do enough. This works particularly well if the challenge is how to get onto multiple platforms (where you need multiple clients by definition).
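To make the service idea concrete, here is a small sketch, assuming a hypothetical product where pricing is the first piece of logic pulled out of the old application. `PricingService`, `LegacyApp`, and `MobileClient` are invented names for illustration, and the service is an in-process class here rather than something reached over a network.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PricingService:
    """Hypothetical new service: it does just enough, which for now is pricing."""
    prices: Dict[str, float] = field(default_factory=dict)

    def quote(self, sku: str, quantity: int) -> float:
        return round(self.prices[sku] * quantity, 2)


class LegacyApp:
    """Stand-in for the old application, now a client of the service."""

    def __init__(self, pricing: PricingService) -> None:
        self.pricing = pricing

    def render_order(self, sku: str, quantity: int) -> str:
        # The old presentation code stays; the pricing logic lives in the service.
        total = self.pricing.quote(sku, quantity)
        return f"{quantity} x {sku} = {total:.2f}"


class MobileClient:
    """A second, new client that reuses the same service."""

    def __init__(self, pricing: PricingService) -> None:
        self.pricing = pricing

    def order_summary(self, sku: str, quantity: int) -> dict:
        return {
            "sku": sku,
            "quantity": quantity,
            "total": self.pricing.quote(sku, quantity),
        }
```

Both clients get the same answer because neither owns the pricing rules any more. As more logic moves into services like this one, the old application has less and less to do, which is what makes it replaceable later.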
You do need to be careful with this approach. The biggest thing to watch out for is “don’t repeat your mistakes”. The old system started clean and got crufty for a reason – if you don’t correct that, you’ll make the new system crufty as well. You can’t just assume that it will be clean and stay clean – the natural tendency of a changing code base is to rot over time, and it takes engineering discipline – especially in the face of deadline pressures – to prevent that.
The last piece of advice here is not to focus too much on what the old system did when you deliver the new version. Instead, focus on what you want the new system to deliver now.
“I skate to where the puck is going to be, not where it has been” – Wayne Gretzky