The history of physics is full of complex, one-off custom hardware. Reviewers have not been expected to take the full technical specs and actually build and run the exact same hardware, just to verify correctness for publication.
I doubt any physicist believes we need to get the Tevatron running again just to check decade-old measurements of the top quark. I don't understand why decade-old scientific code must meet that bar.
They didn't rebuild the Tevatron, but they were still able to rediscover the top quark in a different experimental environment (i.e. the LHC, with tons of different discovery channels), and there are lots of fits for its properties from indirect measurements (LEP, Belle).
Physics is not an exact science. If you have only one measurement (no matter whether it's software- or hardware-based), no serious physicist would fully trust the result until it was confirmed by an independent research group (by doing more than just rebuilding/copying the initial experiment, e.g. by using slightly different approximations or different models/techniques). I'm not so much into computer science, but I guess it might be a bit different there once a proof is based on rigorous math.
However, even then, I guess it's sometimes questionable whether the proof is applicable to real-world systems, and then one might be in a similar situation.
Anyway, in physics several experimental confirmations are always required for a theory. There are also several independent "software experiments" that, for example, predict the same observables. Therefore, researchers need to be able to compile and run the code of their competitors in order to compare and verify the results in detail. Bug-hunting and fixing sometimes also happens at this stage, of course. So applying the article's suggestions would have the potential to accelerate scientific collaboration.
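To make the "compare and verify" step concrete, here is a minimal sketch (my own illustration, not from the article or this thread, with made-up numbers): a simple pull computation between two independent determinations of the same observable. This kind of cross-check only becomes possible once both groups' codes can actually be compiled and run.

```python
import math

def pull(value_a, sigma_a, value_b, sigma_b):
    """Difference between two independent measurements of the same observable,
    in units of their combined uncertainty."""
    return abs(value_a - value_b) / math.sqrt(sigma_a**2 + sigma_b**2)

# Hypothetical numbers, for illustration only: two determinations of the
# top-quark mass in GeV from two independent analyses.
p = pull(172.5, 0.7, 173.1, 0.9)
print(f"pull = {p:.2f} sigma")  # below ~2 sigma is usually considered consistent
```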
Btw, I know some people who still work with the data taken at the LEP experiment, which was shut down almost 20 (!) years ago, and they have a hard time combining old detector simulations, Monte Carlo samples, etc. with new data-analysis techniques, for the exact same reasons mentioned in the article.
For large-scale experiments this is a serious problem, and it gets much more attention nowadays than it did in the LEP era, since the LHC already has obvious big-data problems to solve before its next upgrade, including on the software side.
If you could have spun up a Tevatron at will for $10, would the culture be the same today?
I suspect that software really is different in this way, and treating it like it's complex, one-off hardware is cultural inertia that's going to fade away.