An article that Jorge Aranda and I wrote for American Scientist about empirical studies of software engineering is now up on the web. We hope it’s a good introduction to the area, and we look forward to your feedback. If you’d like to know more about the area, please check out our joint “neat results” blog at http://www.neverworkintheory.org/.
Inspired in part by Lambda the Ultimate, which reports on what’s new in programming language research, Jorge Aranda and I have started a new blog called “It Will Never Work in Theory” to bring you the latest results in empirical studies of software engineering. The first posts discuss:
- Rahman and Devanbu’s “Ownership, Experience, and Defects: A Fine-Grained Study of Authorship”, which found that code worked on by one developer (rather than many) is more often implicated in defects, but that a developer’s experience with a particular file (rather than the project in general) reduces defect rates.
- Stolee and Elbaum’s “Refactoring Pipe-like Mashups for End-User Programmers”, which applies the “code smells” meme to Yahoo! Pipes (and by implication shows that refactoring ideas can be applied to other end-user programming systems).
- Mockus’s “Organizational Volatility and its Effects on Software”, which found that an influx of newcomers into a project doesn’t increase fault rates (since they’re usually given simple tasks to start with), but that organizational change can still account for about 20% of faults.
Our aim in starting this blog is to continue the work begun in Making Software: to let practitioners know what researchers have discovered, and what kinds of questions they can answer, and to give researchers feedback on what’s useful, what isn’t, and what they ought to look at next. We look forward to your feedback.
We have started recruiting for the second volume of The Architecture of Open Source Applications, and while I’m mostly pleased with how it’s going, there’s one glaring problem. Here’s how the three collections I’ve edited in the past five years have broken down:
Ouch. I was very pleased that MS and AOSA 1 weren’t as bad as BC, but right now, AOSA 2 isn’t where I’d like it to be. Its contributors also almost all speak English as a first language, which isn’t representative of all the great open source work being done elsewhere. We’d welcome help addressing both problems…
In Canada, the phrase “two solitudes” refers to the lack of communication (and the lack of interest in communicating) between Anglophones and Francophones. I think of that phrase every time someone uses the phrase “theory versus practice” when talking about academia and industry. Having worked in both, I don’t think that’s the real dividing line: lots of academics actually do build things (look at Berkeley DB and SnowFlock), and the people writing optimizing compilers for IBM, or doing machine learning for Google, are as lost in the theoretical stratosphere as anyone. Instead, I think of industry and academia as two branches of an extended family that send each other Christmas cards, and occasionally show up for each other’s weddings or funerals, but aren’t in day-to-day or even year-to-year contact.
I think this divide is unhealthy, and while I failed to bridge it personally, I’m hoping that the two books I’ve worked on in the past year will encourage others to do so. The second one to come out, The Architecture of Open Source Applications, has been getting some attention among practitioners since its launch this week. It’s too early to tell if academics will pay attention to it, but I’m hoping that someone at IEEE Software or Communications of the ACM will think it’s worth bringing to the attention of their audience.
The first of the two, Making Software, didn’t draw nearly as much attention (i.e., it wasn’t Slashdotted), but I think it’s a natural and necessary complement to AOSA. MS is a summary of what we actually know about how software is developed: what studies have been done, what they found, what conclusions we can draw from them, and why anyone should believe any of it. If AOSA is “what practitioners have built”, MS is “what researchers know”; my aim in doing both back to back was to give each of those two communities something they could give the other, something that would give them an excuse to sit down together and catch up with how Aunt Yena’s sciatica is doing and oh, isn’t little Zuffi just the cutest baby you ever saw?
In my dreams, what happens next is that people use these books as an opportunity to reach out to one another. I’m still digesting notes from yesterday’s ICSE panel session on “What Industry Wants From Research”, but it’s clear that a lot of researchers would love to talk to practitioners about what problems really matter, and what would count as answers. At the same time, I think researchers could get some useful reality checks, and maybe even some redirection, by looking at what practitioners choose to describe when asked to describe the most important features of their applications.
As a first step, if you’re in academia, think about going to OSCON or Agile this year, telling the people there what you do, and listening to what they talk about when they talk about the things that they think are important. If you’re not an academic, but planning to go to either of those conferences, why not call up one of your old professors and invite them to join you? Or flip through Making Software and ask the author(s) of one of the chapters you find interesting to try it out. If nothing else, you’ll get a great t-shirt out of it…
 And yes, most people who get an undergraduate degree in CS do go out into industry, but for most of them, it’s a one-way trip, and very little of what they do or learn ever filters back to campus. To continue my analogy, they’re the young ‘uns that leave the old country to go work for Uncle Willi in Chicago, but stop writing home after mummi and vati pass away.
 I’m still baffled that there isn’t a “news for software engineering researchers” blog along the lines of Lambda the Ultimate to help people stay up to date with things like this. If I had more energy, I’d start one; if you have more energy than me, please do so.
 What I’d really like, of course, is for people to start using it as a textbook in advanced undergraduate software courses, but since those courses mostly don’t exist, that’s probably a vain hope…
 See Jorge Aranda’s post for a thoughtful summary of the answers that he, Daniela Damian, Marian Petre, and Peggy Storey uncovered before the panel session by interviewing industry practitioners.
Laurent Bossavit recently posted a critique of Steve McConnell’s chapter in Making Software on productivity differences between programmers (French original here, English translation here). In response, Steve posted this article that explains how he came to this topic when writing Code Complete, and then goes through the research he cited in his chapter, correcting the mistakes in Bossavit’s critique point by patient point. I think Steve has actually read the literature his claims are based on (Bossavit admits in several places that he hasn’t), and there are no drive-by ad hominem attacks. In short, his post is a great example of how this discussion should be conducted.
Thanks to a cold, I had time today to catch up on some long-delayed reading. Among the highlights were two pieces of work that I wish we had been able to include in Making Software. The first is Christoph Treude and Margaret-Anne Storey’s “Work Item Tagging: Communicating Concerns in Collaborative Software Development”, which looks at how programmers use tagging to label and find work items. The analysis is interesting, but the real payoff is the recommendations they make for tool builders.
The second find was “How People Debug, Revisited: An Information Foraging Theory Perspective”, which explores how developers follow “scents” when debugging, rather than forming and refuting hypotheses. Again, the analysis is interesting, but the real payoff is the recommendations at the end.
Both papers are prime examples of the kind of pragmatic research that is quietly revolutionizing software engineering. I look forward to seeing what both teams do next.
I’m going to go on record and say that this is one of the most important books about software development that has been published in the last few years. It’s easy for many of us in the industry to complain that software engineering research is years behind practice and that it is hard to construct experiments or perform studies which produce information that is relevant for practitioners, but the fact is, there are many things we can learn from published studies.
The editors of this book do a great job of explaining what we can and cannot expect from research. They also adopt a very pragmatic mindset, taking the point of view that appropriate practice is highly contextual. Research can provide us with evidence, but not necessarily conclusions.
Beyond the philosophical underpinnings, ‘Making Software’ outlines research results in a variety of areas. It gives you plenty to think about when considering various approaches on your team. The chapter ‘How Effective is Modularization?’ is worth the price of the book alone.
I recommend this book for anyone who wants to learn how to think rigorously about practice.