“Making Software” Screencast
A screencast about Making Software is now up on Amazon. I had to talk pretty fast to fit their four-minute limit, but I think I hit the high points.
We’re starting to get feedback on Making Software, most of it positive (but some of it grumpy: “how dare your evidence contradict my cherished belief!”). Here are two recent papers that aren’t in the book, but will give you a taste of what is:
Rossbach, Hofmann, and Witchel: “Is Transactional Programming Actually Easier?” In Proc. 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. The question they set out to answer is, does software transactional memory (STM) make parallel programming easier or not? From their abstract:
In this paper, we describe a user-study in which 147 undergraduate students in an operating systems course implemented the same programs using coarse and fine-grain locks, monitors, and transactions. We surveyed the students after the assignment, and examined their code to determine the types and frequency of programming errors for each synchronization technique. Inexperienced programmers found baroque syntax a barrier to entry for transactional programming. On average, subjective evaluation showed that students found transactions harder to use than coarse-grain locks, but slightly easier to use than fine-grained locks. Detailed examination of synchronization errors in the students’ code tells a rather different story. Overwhelmingly, the number and types of programming errors the students made was much lower for transactions than for locks. On a similar programming problem, over 70% of students made errors with fine-grained locking, while less than 10% made errors with transactions.
In other words, students did better, but thought they did worse. This is interesting for a whole bunch of reasons (not least for highlighting how flaky subjective self-assessment is).
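For readers who haven't met the distinction between coarse-grain and fine-grained locking, here is a minimal sketch in Python. It is not the assignment from the study: the bank-account example and the names `CoarseBank`, `FineBank`, and `run_transfers` are made up for illustration, and Python's standard library has no transactional memory, so only the two locking styles appear.

```python
import threading


class CoarseBank:
    """One lock guards every account: simple, but all transfers serialize."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.lock = threading.Lock()

    def transfer(self, src, dst, amount):
        with self.lock:
            self.balances[src] -= amount
            self.balances[dst] += amount


class FineBank:
    """One lock per account: more concurrency, more ways to get it wrong."""

    def __init__(self, balances):
        self.balances = dict(balances)
        self.locks = {name: threading.Lock() for name in balances}

    def transfer(self, src, dst, amount):
        if src == dst:
            return  # nothing to move, and taking the same lock twice would hang
        # Always acquire locks in a fixed (sorted) order. Taking them in
        # src-then-dst order can deadlock when two threads transfer in
        # opposite directions at the same time.
        first, second = sorted((src, dst))
        with self.locks[first], self.locks[second]:
            self.balances[src] -= amount
            self.balances[dst] += amount


def run_transfers(bank, n_threads=4, n_iters=1000):
    """Half the threads move money a -> b, the other half b -> a."""

    def worker(i):
        src, dst = ("a", "b") if i % 2 == 0 else ("b", "a")
        for _ in range(n_iters):
            bank.transfer(src, dst, 1)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return bank.balances
```

The fine-grained version only works because of the lock-ordering rule in the comment. That is the sort of detail that is easy to miss, which fits the error rates the abstract reports for fine-grained locking.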
Bird, Nagappan, Murphy, Gall, and Devanbu: “An Analysis of the Effect of Code Ownership on Software Quality across Windows, Eclipse, and Firefox.”
From their abstract:
We examine the relationship between different ownership measures and software faults/failures in three large software projects drawn from different process domains: Windows Vista, the Eclipse Java IDE, and the Firefox Web Browser. We find that in all cases, measures of ownership such as the number of low-expertise developers, and the proportion of ownership for the top owner have a relationship with both pre-release faults and post-release failures. However, we find that the strength of the effects is related to the development process used. Vista shows the strongest relationship with ownership level, followed by Eclipse, and then Firefox, suggesting that the more that a project uses an open source style process, the more that team sizes rather than ownership levels affect failures. We also find reasons that low-expertise developers make changes to components and show that the removal of low-expertise contributions dramatically decreases the performance of contribution-based defect prediction.
They are painstaking in defining what they mean by “ownership”, and how they measure it, so that other people can (and should!) replicate their work. Drilling down, their conclusions are:
This is cool: we can measure important things, we can see how they relate to other important things, and (crucially) we can act on what we see. I’m looking forward to seeing what both groups do next.
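To make the ownership measures named in the abstract a bit more concrete, here is a rough sketch of how one might compute "proportion of ownership for the top owner" and a count of low-contribution developers from a component's commit log. This is not the authors' code. The function name `ownership_measures`, the input format (one author name per commit), and the 5% cut-off for a "minor" contributor are all assumptions made for illustration, so check the paper for the definitions they actually use.

```python
from collections import Counter


def ownership_measures(commit_authors, minor_threshold=0.05):
    """Summarize who owns a component, given one author name per commit.

    minor_threshold is an illustrative assumption: authors whose share of
    commits falls below it are counted as minor contributors.
    """
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("component has no commits")

    # Each author's share of all commits to this component.
    shares = {author: n / total for author, n in counts.items()}
    top_owner, top_share = max(shares.items(), key=lambda kv: kv[1])
    minor = sorted(a for a, s in shares.items() if s < minor_threshold)

    return {
        "top_owner": top_owner,
        "top_share": top_share,
        "minor_contributors": minor,
        "total_contributors": len(shares),
    }


# Hypothetical commit log for a single component.
commits = ["alice"] * 45 + ["bob"] * 4 + ["carol"] * 1
summary = ownership_measures(commits)
print(summary)
```

In this made-up log, alice is the top owner with 90% of the commits, and carol (2%) is the only contributor under the assumed threshold. Numbers like these, computed per component, are what would then be compared against fault and failure counts.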
Making Software (the collection on empirical software engineering that I helped edit) is now available on Safari Rough Cuts — chapters include:
We hope you enjoy it!
The collection of essays on evidence-based software engineering that Andy Oram and I edited has gone to production. The final title is Making Software: What Really Works, and Why We Believe It. Individual chapters will be available as Rough Cuts from O’Reilly next month, and the book itself should be on the shelves not long after.

I’d like to thank all the people who volunteered their time; in no particular order, they and their chapters are:
The Jolt Awards for best software (and book) are back: this page on the Doctor Dobb’s Journal site has the schedule and categories. It’s a shame that neither of the collections I’m helping edit right now (one on evidence-based software engineering, the other on the architecture of open source applications) will be in print in time to qualify this year, but there’s always 2011.
Tania Samsonova has posted an interesting article discussing the importance of communication skills to job success for junior developers. Drawing on the work of people like Andrew Begel and Beth Simon (who are contributing a chapter to our upcoming book on evidence-based software engineering), Tania talks about how the ability to ask questions and share ideas is a lot more important than specific technical skills. I particularly like this quote:
- Anna, congratulations: your understanding of spoken English improved a lot.
- How do you know? You rarely talk to me anyway.
- You’ve stopped smiling and nodding all the time when people talk to you.
I keep telling my students not to over-commit themselves. It’s a shame I don’t take my own advice. Here’s what I’ve currently got on the go:
Software Carpentry teaches basic software development skills to scientists and engineers. I have 80% of the funding I need to spend a year upgrading its content and delivery. I hope to raise the last 20% of the money in the next few weeks. If I can pull it off, the major challenges will be:
A professional Master’s degree in Computer Science at the University of Toronto to complement the department’s existing research Master’s. The program consists of five regular graduate courses, a course each on business skills and professional communication, and an eight-month industrial internship in which students have to show that they can translate theory into practice. We are now accepting applications for September 2010 entry, so if you’d like to learn leading-edge ideas from some of the best researchers in the world, please check it out.
Basie, our replacement for Trac, built on Django and jQuery, is coming along nicely, but I don’t know what will happen to it once I leave U of T. A few non-students are now involved in its development, but we aren’t big enough to bid for our own Google Summer of Code students. If anyone would like to get involved, please give me a shout. (I’d particularly like to hear from ex-project students—it would be nice to have an excuse to stay in touch.)
UCOSP stands for “undergraduate capstone open source projects”. Since September 2008, undergraduates from several universities in Canada and the US have been taking part in joint capstone projects in order to learn first-hand what distributed development is like. Each team has students from two or three schools, and works for a term under the supervision of a faculty or industry lead on an open source project. We’re currently trying to find $35,000 to hire a half-time administrator to run the program from September 2010 so that we can scale up from the present 45 students/term to 80, 90, or more. Again, if you’re interested, please give me a shout.
CSC302 is my regular undergraduate software engineering course. This term, six teams of students are porting Django to Python 3, adding pivot tables to Gnumeric, parallelizing parts of ILUTE, upgrading PyLint, pluginifying Selenium, and extending SpatiaLite. It could be the last regular course I teach at the University of Toronto; it has been a bit bumpy, but I’m glad the students are getting to work on real things.
Grad student supervision: Alecia, Zuzel, and Mike all have topics nailed down, and Jason is writing up. I plan to spend one morning a week in the department working with them from now through next January; I’m looking forward to seeing what they produce.
The Cowichan Problems. This one goes back to the mid-1990s, when I first realized that human performance was at least as important to overall productivity in computational science as machine performance. The idea is to use a suite of fairly simple applications, all stitched together, to benchmark the usability of parallel programming systems. A couple of undergrads updated the code last year; I’m hoping to revisit it as part of my work on Software Carpentry.
Book #1, called What Really Works?, is a Beautiful Code-style book that presents evidence-based results in software engineering. Where do bugs actually come from? Does pair programming get the job done faster? Can code metrics predict post-release fault rates? Are some programming languages intrinsically more productive than others? Each of our authors will explore one such question in a chapter-length essay; contributions are now coming in, and we’re still on track to have the book on the shelves this summer. (I’ve been talking about this subject and this book for a few months now; if you’re interested, you can view the slides.)
Book #2 is yet another collection, this time exploring the architecture of open source applications. As I said in my lightning talk at PyCon, the aim isn’t really to explain the internals of Hadoop, Parrot, and Mercurial (though I think that’s worth doing). The real aim is to teach people how to think about software architecture by showing them how architects think. We’re hoping to have chapters in for review by November, and the book out this time next year.
Book #3 is an illustrated children’s book about the universe, life, science, and global warming. I’ve had some good feedback from the editor who handled my last children’s book, but most of the work is still in front of me.
Projects I’m not working on:
Government 2.0: I enjoyed working on open data/open government projects with my students last term, but I couldn’t find any faculty at U of T willing to keep it going. I could have found Gov 2.0 stuff for CSC302, but I thought open source work would be better for them.
Two novels and half a dozen short stories. I enjoy writing fiction, but it feels like an indulgence, and I keep pushing it aside to do “serious” stuff. I’m sure that when I’m seventy I’ll regret having done that, so I hope to spend one hour a day writing fiction once I start full-time on Software Carpentry.
Jazz: I haven’t touched my sax since this time last year—it may be vanity, but I’d rather not play at all than play badly. Maybe when my daughter’s a little older…
Exercise: yeah… exercise. Maybe I’ll get my bike back on the road this week…
Over in the Agile Usability group, Larry Constantine writes:
…Capers Jones has been sharing with me some hard data summaries on a variety of development methods and practices gathered from a very large number of projects undertaken by varied organizations that contribute data on bugs, costs, etc., to his company….An interesting thing is that agile methods fare better in most measures, including total cost of ownership of final software product, than practices associated with CMM level 3 but are NOT as good as the Rational Unified Process and all three are trumped by CMM level 5…I don’t want to get into the specific numbers (the data set is proprietary anyway)…I want to raise a very different issue: What would it mean to the agile community IF these findings really were valid and true?
To which I can only reply, “Show me the data.” Seriously: if you’re not willing to show people your data and explain where and how it was collected, and how it was analyzed, we should pay exactly as much attention to you as we do to the guy in the bar who claims to have met a guy who met the guy who actually shot JFK. My greatest hope for our upcoming collection on evidence-based software engineering is that it will remind people that neither anecdotes nor trade secrets constitute proof.
If it’s Monday, I must be catching up… I spoke at CUSEC 2010 last week to about 250 students and others about evidence-based software engineering. The talk is an update of the one I gave at DevDays last October; it’s basically a pitch for an upcoming O’Reilly collection on the subject, and the slides are up on SlideShare. You can find Joey de Villa’s detailed notes on the CUSEC keynotes on his blog:
and more to come.