Time Spent on Hardening

I recently received mail from someone working on a software-based approach to fault tolerance. Their tool makes applications more reliable, but they think it also makes developers more productive by reducing the amount of error detection and handling code they need to write.

They have never been able to find research that quantifies how much time developers spend on code for detecting and handling problems relative to the effort spent on the “happy path”. They know it’s substantial, and is (probably) increasing as applications become more distributed, but the only number they’ve found is from a 1995 book called Software Fault Tolerance, in which Dr. Flaviu Cristian says that such code often accounts for more than two-thirds of the code in production systems.

So I asked a dozen researchers I met through It Will Never Work in Theory if they knew of anything, and the answer was, “No, there isn’t anything that specifically addresses that question.” This strikes me as odd, because it wouldn’t be hard to measure and the answer would be interesting.

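To make “wouldn’t be hard to measure” concrete, here is a minimal sketch of one crude proxy: the fraction of non-blank lines in a Python codebase that sit inside `except`/`finally` blocks. The proxy and the script are my own assumption for illustration, not something my correspondent proposed, and it undercounts badly (it ignores defensive `if` checks, input validation, retries, and of course the developer time behind the code), which is exactly why a proper study would be more valuable.

```python
"""Rough proxy: what fraction of non-blank lines in a Python codebase
sit inside except/finally blocks? A sketch only, not a real measure of
error-handling effort."""

import ast
import sys
from pathlib import Path


def handler_lines(tree: ast.AST) -> set[int]:
    """Line numbers that fall inside except or finally blocks."""
    lines: set[int] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Try):
            for stmt in [s for h in node.handlers for s in h.body] + node.finalbody:
                lines.update(range(stmt.lineno, (stmt.end_lineno or stmt.lineno) + 1))
    return lines


def measure(root: Path) -> tuple[int, int]:
    """Return (error-handling lines, total non-blank lines) under root."""
    handling, total = 0, 0
    for path in root.rglob("*.py"):
        source = path.read_text(encoding="utf-8", errors="ignore")
        try:
            marked = handler_lines(ast.parse(source))
        except SyntaxError:
            continue  # skip files that don't parse
        for lineno, line in enumerate(source.splitlines(), start=1):
            if line.strip():
                total += 1
                handling += lineno in marked
    return handling, total


if __name__ == "__main__":
    handling, total = measure(Path(sys.argv[1]))
    if total:
        print(f"{handling}/{total} non-blank lines ({handling / total:.1%}) are in except/finally blocks")
```
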
People do throw around questionable numbers about the cost of bugs and bug fixing, e.g., a claim that bugs cost companies $2 trillion in 2020. Here are some other related resources my contacts were able to give me:

Again, the fact that we don’t have reliable figures for this strikes me as odd. As one of them pointed out, while everyone is throwing LLMs at often artificial and academic problems and then claiming to have improved some arbitrary metric X% over a random baseline, we still don’t know fairly basic things about software development.

My thanks to everyone who responded to my late-night email about this.