Inspired in part by Lambda the Ultimate, which reports on what’s new in programming language research, Jorge Aranda and I have started a new blog called “It Will Never Work in Theory” to bring you the latest results in empirical studies of software engineering. The first posts discuss:
- Rahman and Devanbu’s “Ownership, Experience, and Defects: A Fine-Grained Study of Authorship”, which found that code worked on by one developer (rather than many) is more often implicated in defects, but that a developer’s experience with a particular file (rather than the project in general) reduces defect rates.
- Stolee and Elbaum’s “Refactoring Pipe-like Mashups for End-User Programmers”, which applies the “code smells” meme to Yahoo! Pipes (and by implication shows that refactoring ideas can be applied to other end-user programming systems).
- Mockus’s “Organizational Volatility and its Effects on Software”, which found that an influx of newcomers into a project doesn’t increase fault rates (since they’re usually given simple tasks to start with), but that organizational change can still account for about 20% of faults.
Our aim in starting this blog is to continue the work begun in Making Software: to let practitioners know what researchers have discovered, and what kinds of questions they can answer, and to give researchers feedback on what’s useful, what isn’t, and what they ought to look at next. We look forward to your feedback.
Making Software
In case you were wondering, The Architecture of Open Source Applications is now averaging about 4200 page views a day. (The stats are corrupted a bit by all the clone sites that have popped up and kept our Google Analytics JavaScript in their page headers; I’ve tried putting a filter in place at GA to exclude them, but instead it excluded all data for a three-day period. #itshouldntbethishard)
In related news, translations are now under way in:
- Chinese (both Simplified and Traditional)
- French
- Japanese
- Korean
- Portuguese (both European and Brazilian)
- Russian
- Spanish
- Ukrainian
and we have the following chapters lined up for Volume 2:
| Chapter | Author(s) |
|---|---|
| Apache Derby | Tiago Espinha |
| GDB | Stan Shebs |
| The Glasgow Haskell Compiler | Simon Peyton-Jones and Simon Marlow |
| GPSD | Eric Raymond |
| Inkscape | Jon Cruz |
| jQuery | Addy Osmani |
| Iron Languages | Jeff Hardy |
| ITK | Luis Ibanez and Brad King |
| K-9 Mail | Jesse Vincent |
| Mailman | Barry Warsaw |
| matplotlib | John Hunter |
| Open MPI | Jeff Squyres |
| Parrot | Christoph Otto |
| PostgreSQL | Selena Deckelmann |
| Processing.js | Mike Kamermans |
| Puppet | Luke Kanies, Nigel Kersten, and James Turnbull |
| PyPy | Benjamin Peterson |
| SQLAlchemy | Michael Bayer |
| Twisted | Jessica McKellar |
| Yesod | Michael Snoyman |
| ZeroMQ | Martin Sustrik |
Many thanks as always to Amy Brown, my tireless co-editor, for organizing this.
Architecture of Open Source Applications
We have started recruiting for the second volume of The Architecture of Open Source Applications, and while I’m mostly pleased with how it’s going, there’s one glaring problem. Here’s how the three collections I’ve edited in the past five years, plus AOSA 2 so far, have broken down:
| Title | Female | Male | % Female |
|---|---|---|---|
| Beautiful Code | 1 | 35 | 2.7% |
| Making Software | 9 | 34 | 21% |
| AOSA 1 | 8 | 33 | 19.5% |
| AOSA 2 | 1 | 20 | 4.7% |
Ouch. I was very pleased that MS and AOSA 1 weren’t as bad as BC, but right now, AOSA 2 isn’t where I’d like it to be. Its contributors also almost all speak English as a first language, which isn’t representative of all the great open source work being done elsewhere. We’d welcome help addressing both problems…
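For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the last column from the counts in the table above. The function name `percent_female` is just an illustrative choice. Rounding to one decimal place gives slightly different figures for Beautiful Code and AOSA 2 than the table shows, which suggests those two entries were truncated rather than rounded.

```python
# Contributor counts copied from the table above: (female, male).
counts = {
    "Beautiful Code": (1, 35),
    "Making Software": (9, 34),
    "AOSA 1": (8, 33),
    "AOSA 2": (1, 20),
}


def percent_female(female, male):
    """Return female contributors as a percentage of all contributors."""
    return 100.0 * female / (female + male)


# Map each title to its share of female contributors.
shares = {title: percent_female(f, m) for title, (f, m) in counts.items()}

for title, share in shares.items():
    print(f"{title}: {share:.1f}%")
```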
Architecture of Open Source Applications, Beautiful Code, Equity, Making Software
Silly me: I want to use RapidSVN on Windows Vista. Which means I need to install WinMerge so that I can view and merge diffs. So I do that, then try to diff a file that’s in my working copy with the HEAD that’s in the repo. Which fails because it can’t create a tunnel. Say what? Oh—it can’t create a tunnel because I haven’t set an environment variable called SVN_SSH to point to an SSH client. Oh. OK. Let me go and download plink.exe and then set an environment variable to point to it (pop quiz: how many Windows users know how to set environment variables these days?) Hm. That still doesn’t work. I wonder why not? The error message doesn’t tell me, so I’ve gone from, “It would be nice to view diffs graphically when I’m pair programming” to shaving a yak in less than five minutes.
And yes, I’m sure I could figure it out. My point is, I shouldn’t have to.
Later: several people have left comments saying (roughly), “Well, what do you expect from Windows?” Personally, I don’t think any honest person can claim that Mac OS X, or any Linux distro, is any easier. The pain might show up in different places, at different times, but it’s still there all too often.
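The kind of check I wish the tools had done for me can be sketched in a few lines of Python. This is only an illustration: `check_svn_ssh` is a hypothetical helper, not part of Subversion or RapidSVN, and it only verifies that `SVN_SSH` is set and that the program it names can be found. It says nothing about whether the tunnel itself would work.

```python
import os
import shlex
import shutil


def check_svn_ssh(env=None):
    """Return a human-readable diagnosis of the SVN_SSH setting.

    `env` is any mapping of environment variables; it defaults to the
    real environment of the current process.
    """
    env = os.environ if env is None else env
    value = env.get("SVN_SSH", "").strip()
    if not value:
        return "SVN_SSH is not set"
    # posix=False leaves backslashes in Windows paths untouched; quoted
    # paths keep their quotes, so strip them from the program name.
    parts = shlex.split(value, posix=False)
    program = parts[0].strip('"')
    # shutil.which searches PATH, or checks the path directly if one is given.
    if shutil.which(program) is None:
        return f"SVN_SSH points to {program!r}, which was not found"
    return f"SVN_SSH uses {program!r}"


if __name__ == "__main__":
    print(check_svn_ssh())
```

An error message along these lines would at least have told me which of the two steps had gone wrong.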
Uncategorized
That was my daughter’s reaction when I came downstairs in a suit on Friday morning. But it was all in a good cause: Zuzel Vera Pacheco and Mike Conley, my last two grad students, received their degrees that afternoon, and I was pleased and proud to be there to shake their hands.

Learning
I got mail yesterday from a former student of a friend of mine who has just been told that he has to teach an “Intro to Software Engineering” class this fall to a bunch of third-year undergraduates. He’s not an SE guy—his background is operating systems—so he asked me what he should read to get one step ahead of his future students. As regular readers will know, I don’t think much of most traditional software engineering books: I’ve never seen most of what’s in them in the real world, and most of what I’ve needed to know hasn’t been in them.
So what did I recommend instead? Here’s my list:
- I think that what we prescribe in software engineering should be based on what we actually know from empirical studies, just as it should be in medicine or business, so Making Software is first on the list.
- Karl Fogel’s Producing Open Source Software is number two, as it’s the best description I’ve ever read of what good developers do day by day (rather than minute by minute: for that, there’s Neal Ford’s The Productive Programmer).
- I’m also partial to Henrik Kniberg’s Scrum and XP From the Trenches; it’s a bit more ideological, but still packed with lots of “been there, done that, here’s the t-shirt” advice.
- Reekie & McAdam’s A Software Architecture Primer and Rosenberg & Scott’s Use Case Driven Object Modeling with UML are the two “traditional” books on my list. They’re both slim and uncluttered; the first presents the only useful notation I’ve ever encountered for describing applications’ architectures, while the second explains what the core elements of UML are for and when to use them.
- Michael Feathers’ Working Effectively With Legacy Code remains the best book about making large systems manageable that has ever been written. It is also one of the few that acknowledges the reality most students will encounter all too soon—that of tangled, poorly documented legacy systems that cannot just be thrown away.
- Finally, The Architecture of Open Source Applications contains, as far as I know, the only lengthy descriptions anywhere of what software engineering is meant to produce: large, long-lived programs.
I feel a bit uncomfortable listing two books I’ve worked on, but my conscience goes back to sleep when I remind it that my frustration with existing books was a big part of why I started Beautiful Code, which in turn led to MS and AOSA. BC is the most wide-ranging of the three, but the other two are more focused. I hope you find them, and all the others on this list, useful.
Later: someone asked why the Gang of Four Design Patterns book isn’t on this list. The answer is:
- Undergrads don’t seem to connect with it. I had much better luck using material from Freeman and Freeman’s Head First Design Patterns (but not the book itself, since most of my students found its format as offputting as I did). My favorite DP book is Olsen’s Design Patterns in Ruby, which I blogged about a couple of years ago, but since most of my students don’t speak Ruby, I haven’t actually taught with it. Now, if Russ ever does a Python version…
- Software design and software engineering are obviously related, but they’re not the same thing. In my mind, software engineering is how we produce software, while software design is what software looks like when it’s built. Both are important, but trying to cram both into a single course is a disservice to both. (Of course, that’s an argument for not including AOSA in this list either…)
Learning
Thanks to heroic effort from Ian McDowell and Amy Brown, The Architecture of Open Source Applications is now available for the Kindle at Amazon.com for $9.99. As always, all royalties (well, all the royalties Amazon doesn’t gobble up) will go directly to Amnesty International. Now, who’d like to help us produce a professional-looking e-pub edition?
Architecture of Open Source Applications
A lot of work got done on a lot of great projects at last weekend’s Random Hacks of Kindness; one of the most exciting was the Hermes Message Carrier, an ad hoc store-and-forward way to get messages out of areas that have gone dark after disasters by swapping them between mobile devices until one of those devices gets back into the light. This video by Julia Stowell has a good succinct explanation of its ideas and purpose—it’s very cool stuff.
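To make the store-and-forward idea concrete, here is a toy sketch in Python. It is not Hermes's actual protocol or code: the `Device` class and its methods are invented for illustration, and real systems have to deal with duplicates, storage limits, and security, which this ignores. It only shows messages hopping between devices until one of them is online.

```python
class Device:
    """A mobile device that carries messages until it can deliver them."""

    def __init__(self, name, online=False):
        self.name = name
        self.online = online  # True if the device can reach the outside world
        self.stored = set()   # messages this device is currently carrying

    def compose(self, message):
        """Store a new message on this device."""
        self.stored.add(message)

    def meet(self, other):
        """Swap messages so both devices end up carrying the union."""
        merged = self.stored | other.stored
        self.stored = set(merged)
        other.stored = set(merged)

    def flush(self, delivered):
        """If online, add carried messages to `delivered`; return the new count."""
        if not self.online:
            return 0
        new = self.stored - delivered
        delivered |= new
        return len(new)


# Device a is in the dark zone, b passes by, and c is back "in the light".
a, b, c = Device("a"), Device("b"), Device("c", online=True)
a.compose("need water at shelter 4")
a.meet(b)   # a hands its message to b
b.meet(c)   # b later hands it to c
delivered = set()
c.flush(delivered)
print(delivered)
```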
Uncategorized
Jorge Aranda has posted a good summary of the panel session at ICSE on “What Industry Wants From Research”. We hope to have more news soon…
Making Software, Research
As I wrote a couple of weeks ago, one of the reasons I started The Architecture of Open Source Applications project was to fill a gap I stumbled over while teaching at the University of Toronto. There are lots of books on software architecture (an Amazon.com search for that phrase produces over 600 hits), but none of the ones I have looked at describe or analyze the architectures of a broad range of actual software systems in detail. Instead, they all spend their pages telling readers how important architecture is, and how to describe architectures using UML, Petri nets, and what-not. By analogy, it’s as if books on real (physical) architecture spent all their time talking about blueprints: how important it is to have them, different notations that can appear in them, tracking their changes over time, and on and on, without ever actually showing people the buildings those blueprints portray. I know this isn’t because the people who wrote those books weren’t familiar with real software systems; all I can think is that they believe people won’t be interested in the specific, only in the general.
Puzzling…
Architecture of Open Source Applications