When I started revising the Software Carpentry course a couple of years ago, I thought it was going to be a how-to guide to lightweight software development practices for computational scientists and engineers. As I worked on it, though, I realized there was a deeper message:
  1. Most scientists have no idea how good or bad their code is; in fact, most don't even know how to find out. As a result, we have no idea how reliable most computational results actually are.
  2. The only way to fix this is to design quality in, then test to make sure it's there.
  3. Focusing on quality is also the key to making computational scientists more productive.
I'm not the first person to realize this: a vocal minority has been saying for years that computational science doesn't even come close to meeting the standards that experimental scientists were expected to meet two hundred years ago. Recently, though, those voices have been growing louder, spurred on in part by the agile community's emphasis on testing and testing tools. This column's books are therefore a sign of the times.

The ad for the first, Accuracy and Reliability in Scientific Computing, says that it "...investigates some of the difficulties related to scientific computing and provides insight into how to overcome them and obtain dependable results." It delivers on the first half several times over: contributor after contributor tells readers that floating-point numbers are fraught with peril, that it's important to choose numerically stable algorithms, and so on. There are lots of examples, and a good discussion of the gotchas in Fortran, C++, Python, and other languages.

But what about the second part? Testing numerical software is hard: thanks to floating-point roundoff, you can't just compare a method's result against some previously calculated value, because any small change in the order of operations inside that method may change the result just enough for "==" to fail. How are scientists supposed to deal with this? What changes should they make to JUnit-style testing frameworks in order to handle such problems? On this, unfortunately, the book remains silent (a sketch of the usual tolerance-based workaround appears after this review). There is some discussion of how to gauge relative error in simple cases, but no guidance on how to figure out when that error should ring alarm bells. The average grad student in mechanical engineering will therefore come away from this book with a deeper understanding of the problems she faces, but no better idea of how to tackle them.

That mechanical engineer would probably do better to pick up Oliveira and Stewart's Writing Scientific Software: A Guide to Good Style, whose subtitle could equally well have been "Things Computational Scientists Ought to Know". The chapter titles are a good summary of its contents: basics of computer organization, software design, data structures, design for testing and debugging, global vs. local optimization, memory bugs and leaks, Unix tools, and so on. The writing is clear (though more diagrams wouldn't have done any harm), and the case studies at the end are well thought out. The authors clearly know their intended audience well: on the one hand, they explain what a "register" is, while on the other, they assume their readers will recognize names like "Toeplitz" and "Krylov". I have to wonder, though, how much impact a couple of pages on revision control will actually have on our friend the mechanical engineer. I also wonder what message readers will take away from the fact that there's one section on testing and debugging, but four on architecture and optimization. I hope that in five years, the authors will do a second edition, and that by then, they'll be able to reverse those proportions.
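Since the book stops at diagnosing the problem, here is a minimal sketch, in Python (one of the languages it covers), of the workaround most practitioners reach for: compare a result against a reference value within a relative tolerance rather than with "==". The naive_sum function, the reference value, and the 1e-12 tolerance are my own illustrative choices, not anything from either book; picking a defensible tolerance is exactly the judgement the book leaves readers to make on their own.

```python
# A minimal sketch (mine, not from either book): exact comparison of
# floating-point results is fragile, so test within a relative tolerance.
import math
import unittest


def naive_sum(values):
    """Sum values in the order given; the result depends on that order."""
    total = 0.0
    for v in values:
        total += v
    return total


class TestNaiveSum(unittest.TestCase):
    # One large term followed by many tiny ones; the exact sum is 1.000000000000001.
    values = [1.0] + [1e-16] * 10
    expected = 1.000000000000001  # "previously calculated" reference value

    def test_exact_equality_is_fragile(self):
        forward = naive_sum(self.values)
        backward = naive_sum(reversed(self.values))
        # Summing the same numbers in a different order changes the result,
        # so comparing either one to a stored value with "==" is the wrong test.
        self.assertNotEqual(forward, backward)

    def test_relative_tolerance(self):
        for order in (self.values, list(reversed(self.values))):
            result = naive_sum(order)
            # Accept any result whose relative error is below a chosen threshold.
            self.assertTrue(math.isclose(result, self.expected, rel_tol=1e-12))


if __name__ == "__main__":
    unittest.main()
```

The same idea is available off the shelf in tools such as unittest's assertAlmostEqual and numpy.testing.assert_allclose; the hard part, as noted above, is deciding when the relative error should ring alarm bells.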
Bo Einarsson (ed.): Accuracy and Reliability in Scientific Computing. SIAM, 2005, 0898715849, 364 pages.
Suely Oliveira and David E. Stewart: Writing Scientific Software: A Guide to Good Style. Cambridge University Press, 2006, 0521858968, 316 pages.