Many people think that JavaScript and HTML5 are the future of the web. Respectfully, I think they're both red herrings: what makes successful open source projects work is older, less exciting, and still only kind of works. It's called "merge", and if we really want to help people collaborate on a global scale, we ought to put a lot more effort into making it easy to use.

What's merge? It's what you do after a "diff". What's diff? It's something that shows you the differences between two files in a human-readable way. More specifically, suppose that you and I are both working on a program. We're sitting in front of different machines, trying to fix different bugs or add different features, and it just so happens that we both need to change graphics.java. After we've both made our changes, the world looks like this:

[Figure: Simultaneous Editing]

At this point, we need to combine our changes. We could scroll through two copies of the file side by side, copying edits from one to the other, but we'd almost certainly miss something or make a mistake. What we should do is use a program like diff to highlight the changes for us. Or better still, we should use a tool like merge to show your version of the file on the left, mine on the right, and the merge in between:

When we're done merging, what we have is the best of both worlds—the best of your ideas combined with the best of mine. The biological term for this is recombination, and it's at least as important to evolution as its more famous cousin, mutation, because it lets good genes (or ideas) cooperate.
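For anyone who has never seen a diff, here is a minimal sketch in Python using the standard library's difflib module. The contents of graphics.java are invented for illustration, as are the names yours, mine, and show_diff; none of it comes from a real project.

```python
import difflib

# Hypothetical contents of graphics.java after each of us has edited it.
yours = [
    "void draw(Shape s) {",
    "    if (s == null) return;",
    "    s.render();",
    "}",
]
mine = [
    "void draw(Shape s) {",
    "    s.render();",
    '    log("drew " + s);',
    "}",
]


def show_diff(a, b, a_name, b_name):
    """Return the unified diff between two lists of lines."""
    return list(
        difflib.unified_diff(a, b, fromfile=a_name, tofile=b_name, lineterm="")
    )


diff_lines = show_diff(yours, mine, "graphics.java (yours)", "graphics.java (mine)")
print("\n".join(diff_lines))
```

Lines starting with - appear only in your version, and lines starting with + appear only in mine. A merge tool starts from this same information and then lets us choose which changes to keep.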

Diff and merge make open source possible. They let dozens, hundreds, or thousands of people remix their work—not just take what others have done and build on it, but give back their own changes and ideas to be stirred back into the original for further remixing:

[Figure: Recombining Ideas]

When remixing is hard, open collaboration doesn't take root [1]. Education is a prime example: at some point in their career, every teacher has picked up someone else's PowerPoint slides and used them as a starting point for their own lecture on the subject, but hardly anyone ever gives their changes back to the author of the slides they started from. It's easy to say that's because remixing isn't part of educational culture, but there's a reason it isn't: PowerPoint decks can't be diffed and merged [2]. If it takes me an hour to scroll through my slides, comparing them one by one with yours and copying changes back by hand, I'm not going to use what you send me, so you're not going to send it in the first place [3]. Going back to our biological metaphor, people who can't merge are stuck in a universe that has mutation but not recombination, and that's a really inefficient way to improve fitness.

I'm thinking about all of this now because of the IPython Notebook and Mozilla Thimble. They're both really exciting tools, but neither makes collaboration easy [4]. If I want to merge your changes to a project into my copy, I can't view them side by side in the browser and pick the pieces I want from each. Instead, I have to merge two JSON files if I'm using the Notebook and—well, I'm not sure what I'd do with Thimble. I could view the differences in the text of the HTML and CSS, but anyone who can do that can build web pages without Thimble in the first place.
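To give a feel for what "merging two JSON files" means in practice, here is a rough sketch in Python that compares two notebooks cell by cell. The notebook layout (a top-level "cells" list whose entries have a "source" string) is a deliberately simplified assumption, and the names cell_sources and diff_cells are made up for this example; a real notebook file also carries outputs and metadata, which is part of what makes doing this by hand painful.

```python
import difflib
import json

# Two hypothetical, heavily simplified notebooks stored as JSON text.
yours_json = json.dumps({"cells": [
    {"cell_type": "code", "source": "x = 1\nprint(x)"},
    {"cell_type": "markdown", "source": "# Results"},
]})
mine_json = json.dumps({"cells": [
    {"cell_type": "code", "source": "x = 2\nprint(x)"},
    {"cell_type": "markdown", "source": "# Results"},
]})


def cell_sources(notebook_text):
    """Return the source text of each cell, in order (assumed layout)."""
    notebook = json.loads(notebook_text)
    return [cell["source"] for cell in notebook["cells"]]


def diff_cells(a_text, b_text):
    """Pair cells by position and return (index, diff lines) for changed cells.

    Pairing by position is a simplification: it does not cope with cells
    that were inserted or deleted, and zip() ignores any extra trailing cells.
    """
    changes = []
    pairs = zip(cell_sources(a_text), cell_sources(b_text))
    for i, (a, b) in enumerate(pairs):
        if a != b:
            delta = list(difflib.unified_diff(
                a.splitlines(), b.splitlines(),
                fromfile=f"yours, cell {i}", tofile=f"mine, cell {i}",
                lineterm="",
            ))
            changes.append((i, delta))
    return changes


changes = diff_cells(yours_json, mine_json)
for index, delta in changes:
    print("\n".join(delta))
```

Even this toy version only reports differences as text. It does nothing to help me pick the pieces I want from each copy, which is the part that matters.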

More to the point, people shouldn't have to drop down a cognitive level or two in order to collaborate this way. Lots of graphic design tools can highlight and merge the differences between two photographs; DiffEngineX does it for Excel spreadsheets (though you need a pretty wide screen to use it effectively), and so on. There's no technical reason we can't diff and merge all our files; it's just that programmers mostly work with text, so they haven't built merging tools for other formats. (And increasingly, I believe they work with text because it's what they can diff and merge in version control...)

We're smarter when we work together. It's more fun, too, so I think tools ought to make collaboration as easy as adding a caption to a picture of a cat:

[Figure: Captioned Cat]

We were collaborating on a global scale before HTML5 and JavaScript came along, and I'm confident that we'll still be doing so ten years from now when they're both regarded as legacy technologies. If we want kids to hack web pages the way we hack code, we need to make merging as easy as reading email or uploading files to Dropbox. And if we want their teachers to remix each other's lessons, we need to show a little humility and make our methods work with their files. If we do that for them, they will learn to work the way we do and raise up a generation that thinks open collaboration is normal.

And that, my friends, would be a revolution.


  1. The exception is systems like Wikipedia that keep just one copy of each document for everyone to edit. As with Google Docs and Etherpad, that model clearly doesn't work for programming, slide decks, or other situations in which people want to try different things at the same time.
  2. PowerPoint "merging" tools like these two just concatenate multiple presentations into one, or generate a specialized deck from a template by filling in blanks with names and dates (rather like spam generators).
  3. At this point programmers often say, "Then write your slides in Markdown or LaTeX or HTML5 or some other text-based format so that merging is easy," but that's like saying, "If you take all the pictures out of your book, it'll compress much better." PowerPoint, LibreOffice, Keynote, and other WYSIWYG presentation tools have survived and thrived because they make it easy for people to mix graphics and text however they want, just as they would on a whiteboard. As this blog post shows, it's a lot harder to do this with text-based tools: I had to switch from my editor to a drawing package to create the diagrams included above, then upload them, and if you ask your browser to search for "Original Version", it still won't find that label in either of the diagrams. Given the choice between whiteboarding (which they take for granted) and merging (which they've never done before, and whose value they don't yet understand), almost everyone will choose the former.
  4. More precisely, neither makes asynchronous collaboration easy. TowTruck lets people share dynamic browser sessions in real time, which is really cool, but as noted in [1], that's a very different model from forking and merging.