Punchcards Considered Harmful


Most people programming today have never punched a card, but all programming editors still treat code as lines of text (in other words, as if it still might have to fit onto punchcards). As I’ve been ranting for a while now, this is holding us back in ways we can barely recognize.

One example is YAML, or rather, the insistence that people must write complex nested data structures as indented lines of text. The rules are well-defined and simple cases are simple, but as anyone who has spent an hour wrestling with a Jekyll or Bookdown configuration file can attest, any complex case is an unproductive nightmare waiting to escape its cage.

So here’s a thought experiment. Imagine that every editor from Notepad to Vim to VS Code automatically displayed CSV files as editable tables. Instead of editing this:

book_filename: "r4de"
language:
  label:
    fig: "Figure "
    tab: "Table "
  ui:
    chapter_name: "Chapter "
output_dir: "docs"
delete_merged_file: false
new_session: true
rmd_files:
  - index.Rmd
  - basics.Rmd
  - tidyverse.Rmd
  - rmarkdown.Rmd
  - package.Rmd
  - references.Rmd

people would read and edit something like this:

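| book_filename      | "r4de" |              |            |
| language           |        |              |            |
|                    | label  |              |            |
|                    |        | fig          | "Figure "  |
|                    |        | tab          | "Table "   |
|                    | ui     |              |            |
|                    |        | chapter_name | "Chapter " |
| output_dir         | "docs" |              |            |
| delete_merged_file | false  |              |            |
| new_session        | true   |              |            |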

The file might actually be stored like this:

,,fig,"Figure "
,,tab,"Table "
,,chapter_name,"Chapter "

but if programmers could trust everyone’s favorite editor to render rows and columns as rows and columns, I believe that:

  1. most people would choose to use it instead of JSON, YAML, TOML, and STUMBL (OK, I made that last one up, but you weren’t sure, were you?) because

  2. they would find it easier to read and write nested structures if their editor gave them even this little bit of help and guidance.

But I don’t have any evidence for my second claim, which is where you (the ambitious grad student looking for a project) come in. Is there a difference in frustration quotient between YAML-in-a-text-editor and the same data in a table editor? Do people like the experience better if the table editor lives inside their usual editor? And can people find bugs faster or more reliably if nested structures are presented as tables rather than as indented text? My bet is “yes” for all three, but I don’t want you to trust me because I don’t want you to trust people—I want you to trust data.
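For anyone who wants a starting point for such a study, here is a minimal sketch of the storage idea described earlier: each level of nesting becomes one leading empty cell in a CSV row. It is an illustration under stated assumptions, not a finished format. The function names (`to_rows`, `from_rows`, `to_csv`) are hypothetical, and the `language`, `label`, and `ui` grouping keys are assumed to be the parents implied by the leading empty cells in the stored rows.

```python
import csv
import io

# Hypothetical config mirroring the example in this post; the grouping keys
# ("language", "label", "ui") are assumptions about the intended nesting.
config = {
    "book_filename": "r4de",
    "language": {
        "label": {"fig": "Figure ", "tab": "Table "},
        "ui": {"chapter_name": "Chapter "},
    },
    "output_dir": "docs",
    "delete_merged_file": False,
    "new_session": True,
}


def to_rows(node, depth=0):
    """Flatten a nested dict into rows; depth becomes leading empty cells."""
    rows = []
    for key, value in node.items():
        if isinstance(value, dict):
            rows.append([""] * depth + [key])       # a group: key only
            rows.extend(to_rows(value, depth + 1))  # children sit one column right
        else:
            rows.append([""] * depth + [key, value])
    return rows


def from_rows(rows):
    """Rebuild the nested dict from rows shaped like the output of to_rows."""
    root = {}
    stack = [root]  # stack[d] is the dict that receives keys at depth d
    for row in rows:
        depth = 0
        while depth < len(row) and row[depth] == "":
            depth += 1
        cells = row[depth:]
        if not cells:
            continue  # skip blank rows
        del stack[depth + 1:]  # leave any deeper groups we were inside
        if len(cells) == 1:
            child = {}
            stack[depth][cells[0]] = child
            stack.append(child)
        else:
            stack[depth][cells[0]] = cells[1]
    return root


def to_csv(rows):
    """Serialize rows as CSV text using the standard library writer."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(to_rows(config)), end="")
```

Note that the round trip through `to_rows` and `from_rows` preserves types only while the rows stay in memory. Once written as CSV text, `false` and `"false"` are indistinguishable unless the format adopts a quoting convention, which is exactly the kind of design decision a real table editor would have to make.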

And of course once this is working, the next experiment would be to add a tree editing widget to several common programming editors and see if it’s better, worse, or the same. In the sketch below, [ a | b ] is my way of showing two editable cells side by side. For fairness’ sake I think it’s essential to add these widgets to existing editors: many programmers will change operating system, citizenship, and gender before abandoning Emacs.

├──[ book_filename | "r4de" ]
├──language
│   ├──label
│   │  ├──[ fig | "Figure " ]
│   │  └──[ tab | "Table " ]
│   └──ui
│      └──[ chapter_name | "Chapter " ]
├──[ output_dir | "docs" ]
├──[ delete_merged_file | false ]
└──[ new_session | true ]
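A read-only version of the tree view above takes only a few lines, which suggests the hard part of the experiment is the editing widget rather than the rendering. The sketch below is one possible way to draw it; `render_tree` is a hypothetical helper, the `language` grouping key is an assumption, and values are formatted with `json.dumps` simply because it happens to produce the quoted strings and lowercase booleans used in this post.

```python
import json

# Hypothetical config; the "language" grouping key is an assumption.
config = {
    "book_filename": "r4de",
    "language": {
        "label": {"fig": "Figure ", "tab": "Table "},
        "ui": {"chapter_name": "Chapter "},
    },
    "output_dir": "docs",
    "delete_merged_file": False,
    "new_session": True,
}


def render_tree(node, prefix=""):
    """Return the lines of a box-drawing tree with [ key | value ] leaf cells."""
    lines = []
    items = list(node.items())
    for index, (key, value) in enumerate(items):
        last = index == len(items) - 1
        branch = "└──" if last else "├──"
        if isinstance(value, dict):
            lines.append(f"{prefix}{branch}{key}")
            # Keep the vertical rule only while siblings remain below.
            extension = "   " if last else "│  "
            lines.extend(render_tree(value, prefix + extension))
        else:
            lines.append(f"{prefix}{branch}[ {key} | {json.dumps(value)} ]")
    return lines


if __name__ == "__main__":
    print("\n".join(render_tree(config)))
```

Turning each `[ key | value ]` line into a pair of editable cells is where an editor plugin would take over, and that part depends entirely on the host editor's extension API.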

I don’t think “the other 99%” of humanity will use static site generators to write lessons unless we make them much (much) more approachable. Making configuration easier is just one part of that, but:

  1. it’s a part that can be used in many other places, and

  2. it’s a step toward finally ending the long, tyrannical reign of the punchcard.
