There is a trick to rewrites: start by writing a full suite of end to end tests....

afpx · on Dec 15, 2018

This strategy works especially well if your product provides an API and your customers or partners are motivated to work with you. Then, besides your suite of tests, you can use the customers’ products as verification. That is, if the customers’ products still work, you can be confident that your rewrite works.
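A minimal sketch of that kind of verification, assuming you can replay recorded inputs through both versions. The functions `legacy_price` and `new_price`, the discount rule, and the recorded cases are all invented for illustration; they stand in for whatever the old and rewritten systems actually expose.

```python
from typing import Callable, Iterable, List, Tuple


def legacy_price(quantity: int, unit_cents: int) -> int:
    """Hypothetical stand-in for the old implementation."""
    total = 0
    for _ in range(quantity):
        total += unit_cents
    # Assumed old rule: 10% discount on orders of 10 or more items.
    if quantity >= 10:
        total = total * 90 // 100
    return total


def new_price(quantity: int, unit_cents: int) -> int:
    """Hypothetical stand-in for the rewritten implementation."""
    total = quantity * unit_cents
    return total * 90 // 100 if quantity >= 10 else total


def find_mismatches(
    old: Callable[..., int],
    new: Callable[..., int],
    cases: Iterable[Tuple[int, int]],
) -> List[tuple]:
    """Run every case through both versions and collect disagreements."""
    mismatches = []
    for args in cases:
        expected, actual = old(*args), new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches


# Inputs as they might be captured from real customer traffic (made up here).
recorded_cases = [(1, 250), (9, 199), (10, 199), (25, 1000)]
mismatches = find_mismatches(legacy_price, new_price, recorded_cases)
print(mismatches or "rewrite matches legacy on all recorded cases")
```

The old code acts as the oracle, so nobody has to write down the expected values by hand. It only covers the inputs you happened to record, though.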

I’ve done this several times with success. There was a period of about 10 years where I specialized in rewriting legacy code. In the first case, I rewrote 200k lines of C++ (COM/OLE) code in Java. Then, I joined a startup that had to scrap its entire codebase because of scaling issues. Then, I worked for two other companies that had acquired other software companies (with crappy code), and I led the rewrite and integration efforts. These are the notable ones.

Believe me, rewrites can be successful. They must be done carefully. But, there are lots of ways to manage the process and mitigate risks. I’ve done it at least 10 times, and I’ve never had a failure or substantial cost or schedule overrun.

olooney · on Dec 15, 2018

> There is a trick to rewrites: start by writing a full suite of end to end tests. Once done you'll discover that you can easily stabilize your old code and make changes with this security harness.

I would call that refactoring, whereas a rewrite means starting from scratch. I agree that refactoring is the superior choice if at all possible.

wvenable · on Dec 15, 2018

Tests fossilize a design. This is generally a good thing and allows you to focus on refactoring and bug fixes with some confidence that you won't break anything. It's also especially good for products that must adhere to a specific interface or API.

But when you want to rewrite, the last thing you want is to fossilize your design. You explicitly don't want the same system you started with; otherwise it wouldn't be a rewrite, it would be a refactoring.

afpx · on Dec 15, 2018

This isn’t necessarily the case. In my experience, the teams that want rewrites have built big monoliths with no design, lots of coupling, and lots of copy-and-paste code (basically a ‘big ball of mud’). So, what the rewrite does is introduce an architecture that decouples (often valuable) functionality into small, targeted functions and libraries. This enables easier maintenance and modification. It also fixes the brittleness and fragility.

So, the goal isn’t to keep the monolith as is. Instead, it’s to break up its functionality into small parts. Then, the parts may be joined so that the old interfaces work as they did before. But, they may also be used to build entirely new interfaces.
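As a rough sketch of what that can look like (every name here is hypothetical, including the `PRICES` table and the `"sku,quantity"` line format): the small functions carry the logic, one wrapper reproduces the old string-in, string-out interface, and a second wrapper builds a new interface from the same parts.

```python
from __future__ import annotations

# Hypothetical price table; a real system would load this from somewhere.
PRICES = {"A1": 250, "B2": 1999}


def parse_order(line: str) -> tuple[str, int]:
    """Small part: parse 'sku,quantity' into a typed pair."""
    sku, quantity = line.strip().split(",")
    return sku, int(quantity)


def order_total(quantity: int, unit_cents: int) -> int:
    """Small part: pure arithmetic, trivial to unit test."""
    return quantity * unit_cents


def format_cents(cents: int) -> str:
    """Small part: render integer cents as a decimal string."""
    return f"{cents // 100}.{cents % 100:02d}"


def process_order_line(line: str) -> str:
    """Old interface, kept intact: same input string, same output string."""
    sku, quantity = parse_order(line)
    return f"{sku}: {format_cents(order_total(quantity, PRICES[sku]))}"


def totals_by_sku(lines: list[str]) -> dict[str, int]:
    """New interface built from the same parts: aggregated cents per SKU."""
    totals: dict[str, int] = {}
    for line in lines:
        sku, quantity = parse_order(line)
        totals[sku] = totals.get(sku, 0) + order_total(quantity, PRICES[sku])
    return totals


print(process_order_line("A1,3"))
print(totals_by_sku(["A1,3", "B2,1", "A1,1"]))
```

Existing end-to-end tests can keep targeting `process_order_line`, while new callers use `totals_by_sku` without going through string formatting at all.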

Polyisoprene · on Dec 16, 2018

Thanks for putting a name to this. Happens way too often when I start rewriting old code.

I usually rename the test class of the old implementation to OldXTest and try to keep the interface of the new code similar enough to enable a reasonably quick transition into the new code with proper unit testing.
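One way this could look, as a sketch only: the shared expectations live in a mixin, and the renamed old test class and the new one each plug in their implementation. `OldSlugifier`, `Slugifier`, and `SlugifierContract` are made-up names for illustration, not anything from a real codebase.

```python
import re
import unittest


class OldSlugifier:
    """Hypothetical legacy implementation."""

    def slugify(self, title):
        out = ""
        for ch in title.lower():
            out += ch if ch.isalnum() else "-"
        while "--" in out:
            out = out.replace("--", "-")
        return out.strip("-")


class Slugifier:
    """Hypothetical rewrite with the same method name and signature."""

    def slugify(self, title: str) -> str:
        return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


class SlugifierContract:
    """Shared tests; subclasses set `implementation` to the class under test."""

    implementation = None

    def test_punctuation_collapses_to_one_dash(self):
        self.assertEqual(
            self.implementation().slugify("Hello, World!"), "hello-world"
        )

    def test_surrounding_whitespace_is_dropped(self):
        self.assertEqual(
            self.implementation().slugify("  Rewrites 101  "), "rewrites-101"
        )


class OldSlugifierTest(SlugifierContract, unittest.TestCase):
    implementation = OldSlugifier


class SlugifierTest(SlugifierContract, unittest.TestCase):
    implementation = Slugifier


# Run both test classes explicitly so the sketch works as a plain script.
suite = unittest.TestSuite()
for case in (OldSlugifierTest, SlugifierTest):
    suite.addTests(unittest.defaultTestLoader.loadTestsFromTestCase(case))
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Once the new class passes everything the old one does, `OldSlugifierTest` and the old implementation can be deleted together.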

Silhouette · on Dec 15, 2018

> There is a trick to rewrites: start by writing a full suite of end to end tests. Once done you'll discover that you can easily stabilize your old code and make changes with this security harness.

Sometimes, but it's not always that simple. Useful software tends to interact with external systems, which might not be amenable to that sort of automated end-to-end testing. Also, the objectives of a rewrite might include enabling new integrations and/or user interfaces, which deliberately don't work as drop-in replacements for what was there before, so they wouldn't be expected to pass the same end-to-end test suite. Automated testing is useful in the right context, but IME it's rarely the whole story, and there are often data migration exercises and new integration tests to be done as well.

BeetleB · on Dec 15, 2018

>Sometimes, but it's not always that simple.

I don't think he said it's simple. As someone who tried to add unit tests to legacy code, I can assure you it is anything but simple. However, it is a sound approach.

>Useful software tends to interact with external systems, which might not be amenable to that sort of automated end-to-end testing.

That's what mocks are for. Writing effective mocks is an art, though. Again - not trivial.
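For anyone unfamiliar, here is a minimal sketch using Python's standard `unittest.mock`. The payment gateway, its `charge` method, and the response shape are all hypothetical; the point is only that the code under test runs without the real external system.

```python
from unittest.mock import Mock


def charge_customer(gateway, customer_id: str, cents: int) -> str:
    """Code under test: talks to an external payment gateway (assumed API)."""
    if cents <= 0:
        raise ValueError("amount must be positive")
    response = gateway.charge(customer_id=customer_id, amount_cents=cents)
    return "paid" if response["status"] == "ok" else "declined"


# Stand-in for the real gateway: no network, fully scripted responses.
gateway = Mock()

# Scenario 1: the gateway accepts the charge.
gateway.charge.return_value = {"status": "ok"}
outcome = charge_customer(gateway, "cust-42", 1500)
# Verify the interaction, not just the return value.
gateway.charge.assert_called_once_with(customer_id="cust-42", amount_cents=1500)

# Scenario 2: the gateway declines.
gateway.charge.return_value = {"status": "card_declined"}
declined = charge_customer(gateway, "cust-42", 1500)

print(outcome, declined)
```

The mock encodes an assumption about how the real gateway responds, so it is only as good as that assumption.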

Silhouette · on Dec 15, 2018

> That's what mocks are for.

I shall respectfully disagree with you here. I mean, yes, obviously that is literally what mocks are for, but I have never found mocks to be a particularly effective or valuable tool for testing. They can take a disproportionate amount of time to write and maintain if whatever real external system they stand in for is complicated. They are inherently fragile if that system is subject to change. Most importantly, even if those tests pass, you don't actually know whether your real system is going to work, and IMHO the greatest benefits of automated testing are found where you can systematically and repeatably exercise exactly the behaviour and interactions you might see in production.

jmchuster · on Dec 15, 2018

The prerequisite here is that you first enumerate a complete spec of all the behavior your clients believe the product should have. This seems too high a burden for what you're trying to achieve.

In my experience, it works well as either 1) bottom-up: pick a section of the codebase small enough that you can fully understand it, rewrite just that piece, and repeat until complete, which gets you an exact copy of your existing application; or 2) top-down: write a completely new application from scratch, starting with understanding what your business's goals are and how to best serve your clients.

mdpopescu · on Dec 15, 2018

> This seems too high a burden for what you're trying to achieve.

Isn't this an absolute requirement for a rewrite anyway? (Your "2" case?)

rgoulter · on Dec 15, 2018

WEwLC (Working Effectively with Legacy Code) is excellent. IIRC, its definition of "legacy code" is "untested code".

End to end tests are great for providing confidence that the code does what it says on the box. They are slower than unit tests, though, and it can be tricky to track down why an end to end test failed.

A "Test Pyramid" seems a good idea to me. (Unit tests can be quick, but don't cover much of the system. E2E tests cover a lot of the system, but aren't quick. "Test Pyramid" suggests it's better to have more unit tests relative to E2E tests). "Only unit tests" or "only end to end tests" don't seem like practical things to aim for.

cc81 · on Dec 15, 2018

Unless of course you are rewriting an aging desktop application to a web application.