Diff and Merge for ProseMirror

ProseMirror is one of the most interesting projects I’ve come across this year. As its blurb says:

“An ideal content editor produces structured, semantically meaningful documents, but does so in a way that is easy for users to understand. ProseMirror tries to bridge the gap between Markdown text editing and classical WYSIWYG editors.”

It does this by implementing a WYSIWYG-style editing interface for documents more constrained and structured than plain HTML. You can customize the shape and structure of the documents your editor creates, and tailor them to your application’s needs.

More specifically, ProseMirror stores a document as JSON, and allows the creators of custom editors to provide templates to specify what’s allowed where. For example, you can say that a page like this one has to have a header with three fields called layout, title, and date, a body consisting of paragraphs and block quotes, and a list of URLs at the end.
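To make the page example concrete, here is a minimal sketch in Python of what such a template and a check against it might look like. Everything in it is an assumption made for illustration: `PAGE_TEMPLATE`, `check_page`, and the `header`/`body`/`links` keys are hypothetical names, and this is not ProseMirror's actual schema API or document format.

```python
# Hypothetical template for a page like this one. The names and layout are
# illustrative assumptions, not ProseMirror's schema API.
PAGE_TEMPLATE = {
    "header": ["layout", "title", "date"],   # required header fields
    "body": {"paragraph", "blockquote"},     # block types allowed in the body
}


def check_page(doc):
    """Return a list of problems; an empty list means the page fits the template."""
    problems = []

    # The header must contain every required field.
    header = doc.get("header", {})
    for field in PAGE_TEMPLATE["header"]:
        if field not in header:
            problems.append(f"missing header field: {field}")

    # The body may only contain the allowed block types.
    for i, block in enumerate(doc.get("body", [])):
        if block.get("type") not in PAGE_TEMPLATE["body"]:
            problems.append(f"body[{i}]: type {block.get('type')!r} not allowed")

    # The page ends with a list of URLs.
    for i, url in enumerate(doc.get("links", [])):
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            problems.append(f"links[{i}]: not a URL")

    return problems


# A made-up page that satisfies the template.
page = {
    "header": {"layout": "post", "title": "Example", "date": "2000-01-01"},
    "body": [
        {"type": "paragraph", "text": "Some text."},
        {"type": "blockquote", "text": "A quotation."},
    ],
    "links": ["https://example.org/"],
}
```

Calling `check_page(page)` on the page above returns an empty list, while a page with a missing `date` or a `table` block in its body would come back with one message per violation.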

ProseMirror is designed for real-time collaborative editing: its author even says that its model is useless. Respectfully, I disagree: Jupyter Notebooks are stored as JSON, Tavish Armstrong and his colleagues implemented diff and merge for them back in 2014 as an undergraduate honors project, and nbdime now provides a full suite of diff and merge tools. I think it would be really exciting to implement a general JSON-based diff-and-merge for ProseMirror that leveraged the structural information in the schemas in the way that Compare++ and SemanticMerge do syntax-aware merging for code. “Slide moved” is just the beginning: once there’s a schema for tables, there’s the possibility of diffing rows and columns instead of the markup used to represent them, and if anyone ever gets around to creating a diagram plugin for ProseMirror (using a JSON encoding of SVG as a store) we might finally get the diff tool for diagrams that I’ve been yearning for.
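As a rough illustration of what diffing structure instead of characters could look like, here is a minimal sketch in Python. It assumes a document is simply a list of JSON nodes, compares whole nodes using the standard library's `difflib.SequenceMatcher`, and then pairs up identical deleted and inserted nodes to report them as moves. The function names (`node_key`, `diff_blocks`, `find_moves`) and the node layout are hypothetical; this is not how ProseMirror, nbdime, or any of the tools mentioned above work internally, and a real tool would also need to look inside changed nodes and use the schema.

```python
import json
from difflib import SequenceMatcher


def node_key(node):
    """Canonical string for a node, so structurally equal nodes compare equal."""
    return json.dumps(node, sort_keys=True)


def diff_blocks(old, new):
    """Diff two lists of JSON nodes and report whole-node inserts and deletes."""
    a = [node_key(n) for n in old]
    b = [node_key(n) for n in new]
    ops = []
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # A "replace" is treated as deletes followed by inserts.
        if tag in ("delete", "replace"):
            ops += [("delete", i, old[i]) for i in range(i1, i2)]
        if tag in ("insert", "replace"):
            ops += [("insert", j, new[j]) for j in range(j1, j2)]
    return ops


def find_moves(ops):
    """Pair a deleted node with an identical inserted node and call it a move."""
    deleted = {node_key(n): i for kind, i, n in ops if kind == "delete"}
    return [
        (deleted[node_key(n)], j, n)
        for kind, j, n in ops
        if kind == "insert" and node_key(n) in deleted
    ]


# Made-up example: the first block is moved to the end.
old_doc = [
    {"type": "paragraph", "text": "A"},
    {"type": "paragraph", "text": "B"},
    {"type": "blockquote", "text": "C"},
]
new_doc = [old_doc[1], old_doc[2], old_doc[0]]

moves = find_moves(diff_blocks(old_doc, new_doc))
```

Here `moves` holds a single entry saying that the node at position 0 in the old document now sits at position 2 in the new one, which is the kind of "block moved" message a line-based diff of the serialized JSON would not give you.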

I’ve been advocating extensible programming systems (and more generally, extensible authoring systems) for years. A few tools, like Proxima, Larch, MPS, and Xtext, showed what could be done, but the idea never caught on. I’ve also been arguing that merging is the magic that makes collaboration at scale possible, and asking (or begging) developers to implement diff and merge in LibreOffice so that the 99% of humanity who aren’t willing to dumb everything down to ASCII text can use this powerful idea with the tools and formats they already rely on. Diff and merge for ProseMirror wouldn’t be everything I want, but it would be a big step in the right direction.
