New Book Project

By the time the Seven Years' War with France ended in 1763, Britain had lost 1512 sailors in action, and almost 100,000 to scurvy. What makes this ironic, as well as tragic, is that a Scottish surgeon named James Lind had shown twenty years earlier that a little lemon or lime juice every day was enough to prevent or cure the dreaded ailment. Lind discovered this in what may have been the first controlled clinical trial in history: he divided 12 sailors with scurvy into six pairs, and gave each pair a different treatment. Those treated with sea water, sulfuric acid, garlic, and vinegar worsened; those who received cider got slightly better, and those given lemons and limes improved dramatically.

It was more than a century before large numbers of medical practitioners began paying attention to “numerical” trials of this kind. As late as the 1950s, when Hill and Doll began tracking the effects of smoking on cancer rates, many doctors rejected the results, saying that what happened “on average” was of no help when they were faced with a specific patient. By the time Sackett coined the term “evidence-based medicine” in 1992, however, most practitioners accepted that decisions about the care of individual patients should be based on conscientious, explicit, and judicious use of current best evidence.

Until recently, this idea—that claims and practices should be based on evidence—was foreign to most software developers. For example, many programmers believe that their favorite programming language is more natural, more expressive, or easier to learn than others, but when pressed, cannot back up their claim with anything stronger than a couple of anecdotes or “but it’s obvious!”

Let’s take just one recent bandwagon: domain-specific languages, or DSLs. A recent article in IEEE Software extolled their virtues, claiming that using them improves programmer productivity. This claim might be true, but the article did not cite any data, and one could just as easily argue that DSLs actually lower productivity by increasing the cost of maintenance.

This is starting to change. Any academic who claims that a particular tool or practice makes software development faster, cheaper, or more reliable is now expected to back up that claim with some sort of empirical study. Such studies are difficult to do well, but hundreds have now been done covering almost every aspect of software development, and those that are done well are as elegant as classic experiments in physics, psychology, and other scientific disciplines.

To date, these studies have mostly been confined to academia. Most professional developers are vaguely aware of some of the results, but often get the details wrong. For example, hundreds of books, blog posts, and presentations claim that the best programmers are forty times better than the worst—or fifteen, or a hundred, or some other number. Almost none of the people repeating that claim realize that it originally came from a very small study (twelve people) run for a very short time (one afternoon) in an era of batch processing and punch cards. The reality is more complex, but also more interesting, and if people are going to base hiring and working practices on something, shouldn’t it be reality?

The goal of this book is to present the most beautiful (and important) evidence-based facts we have about software engineering. Like Beautiful Code and its successors, this book will consist of a collection of personal essays. Each contributor will pick their favorite empirical result in software engineering and explain both what we know, and why we believe it to be true. In some cases, the work they describe will be their own; in others, they will pull together the work of several people to explain how far our understanding has progressed, and what questions are still open. Two dozen people have already agreed to contribute essays, and to donate their royalties to charity.

The book is aimed primarily at professional software developers and their managers, though it should also be useful as a textbook for senior undergraduate students in computer science or software engineering. It assumes readers know how to program, and are familiar with concepts such as agile development, the waterfall model, and so on. They are probably already reading books like The Pragmatic Programmer, The Mythical Man-Month, and Design Patterns. They may or may not attend topic-specific conferences, but they definitely read technical blogs by the likes of Martin Fowler, Jon Udell, and Bruce Schneier.

If all goes well, the book will be available next summer. We hope that it will:

  • Tell developers and their managers what we actually know about software development, and why we think we know it, thereby (hopefully) dispelling a lot of the folklore, mythology, and outright BS that plagues our profession.
  • Give readers the facts they need to make sensible decisions about what working practices and tools they ought to adopt (or discard).
  • Show people what "evidence" looks like in software engineering, so that they can hold would-be gurus and salespeople to realistic standards.
  • Provide a channel for communication between the two cultures of academic research and industrial practice, to the benefit of both camps.
  • Give students a more interesting, readable, factual, and up-to-date source of knowledge than existing software engineering textbooks.

I’m really looking forward to this one—hope you’ll enjoy it too.