Two Solitudes Illustrated

Posted 2012-12-06

Jorge Aranda and I submitted a short opinion piece to Communications of the ACM in February 2012 that discussed some of the reasons people in industry and academia don't talk to each other as much as they should. Ten months later, it has ironically turned into an illustration of one of the reasons: it was six months before we received any feedback at all, and we've now waited four months for any further word. In that time, Jorge has left academia and I've taken a job with Mozilla, so we have decided to withdraw the manuscript and publish it here. We hope you find it interesting, and we would welcome comments.

Two Solitudes

Greg Wilson and Jorge Aranda

In 2001, one of us (GW) started supervising senior undergraduate projects in computer science at the University of Toronto. His main reason for doing it was because it was fun, but he was also frustrated by how little the junior programmers we were hiring straight out of school knew about actually building software. They could talk big-O 'til they went blue in the face, but many had never seen version control, had forgotten what little they had ever learned about using Make, and thought testing was something you only did when it was in the grading scheme.

This isn't a new complaint, of course. People in all fields have long complained that universities don't prepare students for the real world, while professors have always countered that their job is to teach timeless fundamentals. What surprised him, though, was how little interest the two sides seemed to have in getting to know each other, at least in software engineering. While researchers and practitioners may mix and mingle in other specialties, every software engineering conference seemed to be strongly biased to one side or the other.

For example, less than 20% of the people who attend the International Conference on Software Engineering come from industry, and most of those work in labs like Microsoft Research. Conversely, only a handful of grad students and one or two adventurous faculty attend big industrial conferences like the annual Agile get-together.

One consequence of this is that researchers and practitioners spend a lot of time talking past one another. Wilson ran into this headlong six years ago when he was asked to teach an undergraduate course on software architecture. A quick search on Amazon turned up plenty of books on the subject with words like "Practical" and "Essential" in their titles. What didn't turn up was descriptions of the actual architectures of actual systems. Every book talked about how to describe architectures, how important it was to have a good one, and so on. When it came time to actually show readers a few, though, all they offered were a couple of pages on pipe-and-filter, client-server, MVC, and possibly some kind of peer-to-peer system. And even then, most didn't discuss actual systems: the boxes in their box-and-arrow diagrams had labels like "component 1" and "component 2".

The more he thought about this, the stranger it seemed. We wouldn't think much of a university program in architecture whose graduates had never studied any real buildings in detail. We also wouldn't be surprised if those graduates were as bad at designing buildings as most freshly-minted computer scientists are at designing software.

In this case, the problem suggested its own solution. In May 2006, Wilson emailed every famous programmer he could find an address for and invited them to contribute a chapter to a book on software design. More specifically, he asked them to describe the most beautiful piece of code they'd ever seen or written, and explain what made it beautiful in their eyes.

The result, published a year later as Beautiful Code, was well received by practitioners, but uptake in academia was close to zero. As he was trying to figure out why not, he was asked to teach another undergraduate course, this time on software engineering. Once again, he discovered that there was a lot less in most textbooks than met the eye. For example, the books all described UML in great (some might say "excruciating") detail, but in his eighteen years as a professional programmer, Wilson had only ever worked with one programmer who actually used it voluntarily (a Russian mathematician who wouldn't tie his own shoes without first brushing up on knot theory). Conversely, making things installable is as big a part of developing real applications as allocating methods to classes, but most books didn't discuss it at all, and those that did seemed to think it was a question of keeping a configuration database up to date.

But practitioners were (and are) guilty of equally great sins. At industry-oriented gatherings, it seems that a strong opinion, a loud voice, and a couple of pints of beer constitute "proof" of almost any claim. Take test-driven development, for example. If you ask its advocates for evidence, they'll tell you why it has to be true; if you press them, you'll be given anecdotes, and if you press harder, people will be either puzzled or hostile.

Most working programmers simply don't know that scientists have been doing empirical studies of TDD, and of software development in general, for almost forty years. It's as if family doctors didn't know that the medical research community existed, much less what they had discovered. Once again, the first step toward bridging this gulf seemed to be to get each group to tell the other what they knew.

On the one side this led to The Architecture of Open Source Applications, in which the people behind a double dozen open source projects walk readers through the high-level design of their applications, and explain why things are the way they are and how well they've worked. Some of the programs they discuss, like Bash and Sendmail, are as old as the Internet, while others are as fresh as this morning's top questions on Stack Overflow. And while some are gems of clean design, others are, in the words of one contributor, like third-world cities, with clean, well-kept neighborhoods lying next to run-down slums that no one seems willing to clean up.

On the other side is Making Software, in which leading software engineering researchers present and discuss key discoveries. The topics range from the impact of test-driven development on productivity (probably small, if there is one at all) to whether machine learning techniques can predict fault rates in software modules (yes, with the right training). One favorite is the discovery that geographic distance between members of a development team is only a weak predictor of how many errors there are in their software; a much better predictor is how far apart they are in the company org chart.

After Making Software came out, we, along with Daniela Damian, Marian Petre, and Margaret-Anne Storey, decided to explore how practitioners perceived software development research. We interviewed several high-profile practitioners (CEOs, senior architects, managers, developers, and entrepreneurs) and asked them what they thought of their academic counterparts, and what questions they thought researchers should focus on.

Their answers were scathing, but not surprising. They saw software engineering research as dated, dogmatic, focused on pointless questions, and biased toward either big projects or toy problems. In the words of one senior architect we interviewed:

[I'm afraid] that industrial software engineers will think that I'm now doing academic software engineering and then not listen to me. (…) If I start talking to them and claim that I'm doing software engineering research, after they stop laughing, they're gonna stop listening to me. Because it's been so long since anything actually relevant to what practitioners do has come out of that environment, or at least the percentage of things that are useful that come out of that environment is so small.

This kind of criticism is understandable. After all, plenty of practitioners remember having wasted countless hours warming the bench at college while a professor droned about the vital importance of this process or that notation with minimal experience and maximum conviction. And yet, as demonstrated by Making Software, and by the increasing number of savvy papers appearing each year, research is shifting in ways that practitioners would welcome if they hadn't been conditioned by past irrelevance to dismiss everything coming out of academia.

In 2011, we presented the results from our interviews in a panel at the International Conference on Software Engineering (ICSE). Our panelists were people with one foot on each side of the divide: Lionel Briand from the Simula Research Lab; Toshiba's Tatsuhiro Nishioka; Google's John Penix; Wolfram Schulte, from Microsoft Research; Peri Tarr, from the IBM T.J. Watson Center; and David Weiss, from Iowa State University. The bad news was that the panelists confirmed the near-complete disconnect between software research and practice. The good news was that judging by comments from them and from the audience, plenty of people would like to fix that.

But achieving that will be hard, because the root problem isn't entirely one of perceptions. There actually are differences between research and practice, three of which stand out. First, the incentive structure for researchers does not reward patient cultivation of long-lasting partnerships with practitioners. Second, researchers and practitioners have different understandings of what counts as evidence. Practitioners, being trained with an engineering mindset, expect generalized and quantitative results. They want to know what by percentage productivity will improve if they adopt a new practice, but this is a level of precision that no honest scientist can offer them today. And third, most research findings offer only piecemeal improvements; it simply isn't worth a practitioner's time to fight the inertia in their organizations for gains which are both small and uncertain.

In Canada, the phrase "two solitudes" refers to the lack of communication–and the lack of interest in communicating–between Anglophones and Francophones. Over the past three years, we have learned that it's also a good description of the gulf between software engineering researchers and practitioners. They're like two branches of an extended family that send each other Christmas cards, and occasionally show up for each other's weddings or funerals, but aren't in day-to-day or even year-to-year contact.

It doesn't have to be like this, of course. Many researchers (particularly younger ones) would love to talk to practitioners about what problems really matter. At the same time, practitioners could save themselves a lot of heartache by finding out what we actually do know about how to develop software, and by learning how to tell something that has been proven from something that has merely been asserted.

Greg Wilson has been a programmer, teacher, and author. He can be found online at https://software-carpentry.org, http://aosabook.org, and http://neverworkintheory.org. He received his PhD in Computer Science from the University of Edinburgh in 1993.

Jorge Aranda is a software developer. He received his PhD in Computer Science from the University of Toronto in 2010, and until recently conducted research on coordination in software teams.

If you enjoyed this post, you may also enjoy the presentation Greg gave at MSR Vision 2020.

Categories: academia, software-engineering, opinion