The Wes Montgomery of Software

I had a conversation yesterday with a data scientist who is now working for a large IT consulting company. We got to talking about the tension between the scripts that data scientists cobble together to get a particular answer and the multi-level cloud architectures with complete integration test suites that software engineers believe are the “right” way to build software that routinely makes hundred million dollar decisions. In my mind the argument about “hacked-up scripts” versus “over-engineered collections of microservices” arises from a conflation of two distinct ideas; since quadrant diagrams are almost always a sign that someone’s trying to sell you bullshit, here’s mine:

[Figure: Four quadrants of coding]

The Beatles’ version of “Twist and Shout” is harmonically, melodically, and rhythmically much simpler than Coltrane’s 1961 recording of “My Favorite Things”, but both tracks were recorded in a single take. In contrast, Ravel spent months rewriting his “Pavane for a Deceased Princess” and Beethoven spent almost two years tearing his Ninth Symphony apart and putting it back together. While every single note of each was carefully considered, the former is once again much simpler than the latter.

But here’s the thing: Coltrane and the Beatles were able to create something great in a single take because they didn’t actually do it in a single take: they had both played those particular songs many (many) times before doing it in the studio. Similarly, if you watch Bruce Springsteen and the E Street Band play “You Never Can Tell” for the first time, they pull it off because they’ve played songs like it literally thousands of times.

I think the same can be true of software. I can’t count the number of tools I’ve written over the last forty years to transform text from one format to another; asking me to write another is like asking a seasoned session musician to play a twelve-bar blues, or asking the data scientist I was speaking to yesterday to fit a linear model to some time-series data. We don’t need to write design docs or unit tests this time because we’ve worked through the problem and its variations so many times before that our fingers are just going to find the right chords.

Expertise doesn’t emerge without reflective practice: if you want to be Wes Montgomery, at some point you have to lock yourself in your room and play standards for a year. I wish more people who program and do data science knew that; more, I wish the people who hire them knew it too and gave them the time to practice so that they could do something great in just one take.

Shake it up baby…

A Language for Teaching

I’m hoping to send Software Design by Example to the publisher by the end of this month, and it has me thinking once again about what a programming language designed for teaching ought to look like. Here’s one request:

Built-in support for incremental exposition of code.

Most good books on programming interleave exposition and code in ways that most languages don’t directly support. For example, authors commonly want to write something like this to give readers a roadmap for what’s coming next:

class Grid:
    ...constants...

    def __init__(self, size):
        ...set up...

    def fill(self, value):
        ...fill entire grid with value...

    def adjacent(self, x, y):
        ...return neighbors of (x, y)...

They then want to fill in those markers one at a time further down the page or a few pages later:

    def fill(self, value):
        for i in range(self.size):
            for j in range(self.size):
                self.cells[i, j] = value

I don’t know any programming language that allows me to write this as shown. Some “simple” text processing will allow me to write something like this in my source file:

class Grid:

    ## [+fill]
    def fill(self, value):
        ## [-fill "fill entire grid with value"]
        for i in range(self.size):
            for j in range(self.size):
                self.cells[i, j] = value
        ## [-fill]
    ## [+fill]

and then slice the marked regions to produce the two versions shown above, but having used (and built) several such systems, I keep wondering why we don’t just add this to the language itself. Literate programming promised this, and while I was a zealous user for a couple of years in the late 1980s, bolting LP onto pre-existing languages proved too clunky to catch on. And yes, there are tricks like the Blank Maneuver and tools like jdc for the Jupyter notebook, but the former confuses novices (“Wait, you’re deriving a class from itself?”) and the latter doesn’t support forward markers in the original definition to show where the later code is going to go.
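
The "simple" text processing mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not any particular tool's implementation: the marker grammar is inferred from the example (a `## [+name]` line opens or closes a region, `## [-name "text"]` opens an elidable span, and `## [-name]` closes it), and `slice_region` and `SOURCE` are hypothetical names introduced here.

```python
import re

# Marker syntax assumed from the example above (not a real tool's format):
#   ## [+name]          opens or closes a named region
#   ## [-name "text"]   opens an elidable span inside that region
#   ## [-name]          closes the span
MARKER = re.compile(r'^(\s*)## \[([+-])(\w+)(?: "([^"]*)")?\]\s*$')


def slice_region(source, name, elide):
    """Return the lines of region `name`, replacing its marked span if `elide`."""
    result = []
    in_region = False
    in_span = False
    for line in source.splitlines():
        match = MARKER.match(line)
        if match is None:
            # Ordinary code: keep it unless we are outside the region
            # or inside a span that is being elided.
            if in_region and not (elide and in_span):
                result.append(line)
            continue
        indent, kind, marker_name, text = match.groups()
        if marker_name != name:
            continue  # markers for other regions are dropped from the output
        if kind == "+":
            in_region = not in_region
        elif in_region:
            if elide and not in_span:
                # Emit the placeholder at the marker's own indentation.
                result.append(f"{indent}...{text or ''}...")
            in_span = not in_span
    return result


SOURCE = '''\
class Grid:

    ## [+fill]
    def fill(self, value):
        ## [-fill "fill entire grid with value"]
        for i in range(self.size):
            for j in range(self.size):
                self.cells[i, j] = value
        ## [-fill]
    ## [+fill]
'''

outline = slice_region(SOURCE, "fill", elide=True)   # roadmap version
full = slice_region(SOURCE, "fill", elide=False)     # filled-in version
```

Calling the function twice on the same source yields the roadmap and the filled-in method, but it is still an extra processing step wedged into comments, which is exactly the clunkiness the paragraph above complains about.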

This issue may seem pretty esoteric—after all, most programmers don’t write books—but it highlights two larger points. The first is that most programmers do have to explain the work at some point, and there’s precious little in-language support for doing that. The second point is that languages don’t have any other support for incremental exposition either. For example, every textbook has diagrams, but you can’t put those in your source code: Jupyter notebooks and R Markdown files can show you the plots produced by your code in situ, but they won’t let you draw things by hand.

So here’s my suggestion for an enterprising graduate student who wants to change the world: pick half a dozen books on programming and go through them to create a catalog of explanatory techniques. Once you have that, extend your catalog by looking at slide decks and videos of whiteboard talks, and then design a little language and editor with built-in support for the top N techniques: really built-in, not wedged into specially-formatted comments or requiring extra compilation steps to see what readers are going to see. The language itself could be as small as Quorum, Hedy, or Lox: it’s just there to give users something to explain.

I often ask people what they would work on if they could work on anything. Variations of this idea have been on my list for two decades; I don’t think I have enough years left to see it through myself, but I’d be happy to chat with anyone who wants to take it on.

Four Books I'm Not Writing (Plus One)

I have bits and pieces of several overlapping technical books right now, but can’t decide which if any to complete:

  1. Building Software Together is advice for undergraduates working on their first team project (either in school or in industry). It’s the furthest along, but it’s been a few years since I was working with undergrads in large numbers so I suspect some of my advice is out of date and out of touch.

  2. Data Science for Software Engineers is an introduction to data analysis that (a) spends as much time on the messy business of getting, cleaning, and presenting data as it does on statistics and (b) uses software engineering data for all its examples. This one is more interesting to me personally, but it would be a lot of work to complete, and based on the polite disinterest I’ve received every time I’ve tried to get support for it, I suspect the audience is smaller than I’d like it to be.

  3. Designing Research Software is simultaneously a continuation of Research Software Engineering with Python, a book-length expansion of “Twelve quick tips for software design”, and a re-imagining of Software Design by Example (formerly Software Tools in JavaScript). Each chapter designs, constructs, and critiques a small version of something a research software engineer might actually build: a discrete particle simulator that illustrates the principles of object-oriented design, a file manager for large datasets that does error handling properly, a pipeline framework that records provenance in a reproducible way, and so on. Again, it’s interesting to me personally, but I suspect the audience is small.

  4. Software Engineering: A Compassionate, Evidence-Based Approach is the undergraduate software engineering textbook I think our profession needs today. Its starting points are (a) students should be introduced to the scientific study of programs, programmers, and programming and (b) now that we know how much harm software can do, we should teach prevention and remediation up front, just like the best civil and chemical engineering departments do.

The “which” part of my quandary is just my usual dithering; the “if any” goes a little deeper. Sales of technical books have been dropping steadily for twenty years: even ones as good as The Programmer’s Brain or Crafting Interpreters struggle to get through the noise. Textbooks have fared even worse, and not just because of publishers’ jam-today starve-tomorrow pricing models. I don’t think any book I could write today will ever reach as many people as Beautiful Code did 15 years ago. Maybe that shouldn’t matter to me, but it does.

And going even deeper, the two books I’m proudest of are two you’ve never read: Three Sensible Adventures and Bottle of Light. I wrote the stories in the first for my niece; it’s now out of print, but I have framed copies of the artwork up on the wall in my office and they’ve gotten me through some pretty bleak moments. The second, a middle-grade story about a world without light, is only available through one of Scholastic’s in-school programs for reluctant readers: I’ve tried periodically to buy back the rights, but so far the answer has always been “no”.

I really enjoyed writing both of them. If I could do anything with the years I have left I’d write more stories like the ones I loved growing up. I’d like to introduce you all to a cloudherd named Noxy (short for “Noxious Aftertaste”: the children in her village are given unpleasant names in order to discourage dragons from nibbling on them). I’d like you to meet a bookster’s apprentice named Erileine, a giant robotic dinosaur who may or may not be the original Santa Claus, and an orphaned clone with an odd knack for mechanical things trying to make ends meet in a post-Crunch Antarctica.

But all I’ve done in the last ten years is accumulate rejections, which makes it hard to keep writing. (And yes, I know about authors who got some-large-number of rejections before making a breakthrough sale, but I also know people who’ve won the lottery…) I know I should pick one of the above and get it over the line, but on a colder-than-spring-should-be Saturday morning, it’s all too easy to spend half an hour writing a blog post.

Family’s awake; time to make tea. Be well.

Later: the book I’d actually like to write is Sex and Drugs and Guns and Code, but I still don’t know enough and I’m not a good enough writer. Building Software Together and the software engineering textbook in the #4 spot above are partly attempts to smuggle some of those ideas past the defenses that people like my younger self have erected to protect themselves from feeling uncomfortable about themselves; one of the reasons I set BST aside was the realization that I’m not stealthy enough to pull it off.

Software Design by Example

I have moved my book-in-progress on software design from https://stjs.tech/ to a new home on this site at https://third-bit.com/sdxjs/: the old site’s name no longer matched the book’s title (which has changed from Software Tools in JavaScript to Software Design by Example), and I am trying to cut back on the number of small domains I manage. The source remains available at https://github.com/software-tools-books/stjs/; if all goes to plan, Taylor & Francis will publish it by the end of this year.

Tehanu

When I die, I shall breathe back the breath that made me live. I shall give back to the world all that I didn’t do. All that I might have been and wasn’t. All the choices I didn’t make. All the things I lost and spent and wasted. I shall give them back to the world. To the lives that haven’t been lived yet. That will be my gift back to the world that gave me the life I did live, the life I loved, the breath I breathed.

— Ursula Le Guin, Tehanu

What I Would Change in Lox for Teaching

I spent a couple of weeks last year noodling around with Lox, the little language that Bob Nystrom created for his wonderful book Crafting Interpreters. As I said then, languages like Python and JavaScript are larger than they used to be, so I’m looking for something smaller for teaching software design (say, about the same size as Pascal was in 1980). I don’t have time to make all the changes I’d like to Lox, but I hope that writing them down and explaining why will spark ideas in someone else:

  1. Add array and hash values so that x = [1, 2, 3] and y = {"a": 1, "b": 2} work. I’d like to add sets as well—I think they’re under-rated and under-used—but I’m trying to budget complexity.

  2. Add modules and some kind of import mechanism so that learners can see how to break problems down into reusable pieces and manage the namespaces that come from doing so.

  3. Add an infix string conversion operator so that print("We need " # x # " balls of yarn.") works. If the thing being stringified has the right method, the operator should call that (which turned out to be trickier to implement in Lox than I expected). This isn’t just about making print easier to use without adding varargs to the language: showing learners how to hook into a language’s runtime can be mindblowing.

  4. Replace Lox’s C-style for loops with iterators and implement a Python-style iterator protocol so that programs can loop over user-level classes. Again, the main purpose for doing this is to explore the design possibilities it opens up.

  5. Make function definition an expression rather than a statement so that users can define little functions on the fly. The idea that functions are just another kind of data is one of the most powerful in programming, and I’d like to make it as frictionless to use (and teach) as possible.

  6. Add fibers or some other lightweight concurrency mechanism (but not JS-style callbacks or promises, because brr…). Lox’s predecessor Wren provides fibers for co-operative multitasking; combined with libuv it would allow lessons on asynchronous I/O, which I believe is as important to learn now as stdio-based pipes were for my generation.

  7. Add reflection. Going back to point #5, treating code as data is a tremendously powerful idea, and everything from testing harnesses to object-relational mappers relies on it.
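
To make item 4 concrete, here is what a Python-style iterator protocol looks like in Python itself. This is only a sketch of the idea a modified Lox might borrow, not Lox code; `Countdown`, `CountdownIterator`, and `manual_loop` are hypothetical names invented for the illustration.

```python
class Countdown:
    """A user-level class that a for loop can iterate over."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Called once when a loop begins; hands back a fresh iterator.
        return CountdownIterator(self.start)


class CountdownIterator:
    """Holds the loop's state so the same Countdown can be looped over twice."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self

    def __next__(self):
        # Called once per pass through the loop; StopIteration ends it.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def manual_loop(iterable):
    """Spell out what `for item in iterable` does behind the scenes."""
    collected = []
    iterator = iter(iterable)
    while True:
        try:
            collected.append(next(iterator))
        except StopIteration:
            break
    return collected
```

The design point for teaching is that the loop construct knows nothing about `Countdown`: it only calls two methods, so learners can make their own classes loopable and see how a language feature is built from an ordinary method-call convention.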

The one other thing I really want to add that isn’t on this list is the hooks needed for a breakpointing debugger and for coverage/profiling tools. I think the debugger hooks are the most important, but there’s no point adding them without also providing a debugger; that said, a 1970s-style command-line interface would work, and once again would open a lot of teaching doors. (For example, I’d really like to be able to show students how to write scripts to control debugging sessions because boy howdy is that powerful once you have it.)
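
For a sense of how small such hooks can be, Python exposes one through `sys.settrace`, and both debuggers and coverage tools can be built on it. The sketch below is Python, not Lox, and `record_lines` and `classify` are hypothetical names used only for illustration. It records which lines of a single function run, which is the core of a coverage tool; a breakpointing debugger would pause inside the same callback instead of recording.

```python
import sys


def record_lines(func, *args):
    """Run func(*args); return its result and the line offsets it executed."""
    executed = set()
    code = func.__code__

    def tracer(frame, event, arg):
        # The interpreter calls this when a new frame starts; returning the
        # tracer asks for line-by-line events in that frame, while returning
        # None means the frame is not traced.
        if frame.f_code is not code:
            return None
        if event == "line":
            # Store offsets relative to the `def` line so results do not
            # depend on where the function sits in its file.
            executed.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever hook was installed
    return result, executed


def classify(n):
    if n < 0:
        return "negative"
    return "non-negative"


positive_result, positive_lines = record_lines(classify, 5)
negative_result, negative_lines = record_lines(classify, -1)
```

Comparing `positive_lines` with `negative_lines` shows which branch each call took. Scripting a debugging session amounts to putting more logic in the callback, which is why even a command-line interface on top of such hooks would open up a lot of lessons.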

My guess is that there’s about 100 hours of work on the bullet list and another 100 for the runtime hooks and debugger, but I think the result would be a very nice little teaching language. If only the days were longer…

A Cacophony of Explanations

I’m programming for a living for the first time in ten years. I’m enjoying it, but I’m finding some parts of the job more frustrating than I remember, and I think I’ve figured out why.

Mike Caulfield invented the term “a chorus of explanations” several years ago to explain why sites like Stack Overflow are useful. Where a lesson typically explains something once, SO and similar sites present several explanations with different levels of detail, different assumptions about the reader’s background, and quite possibly different solutions to the original problem. Most readers may be satisfied by the top answer, but others can scroll down to find one that’s a better fit for who they are, what they are ready to understand, and what they’re trying to do.

That’s great, but only as long as the underlying problem stands still: if your question concerns something that is evolving over time, many of the “solutions” you find are out of date. It takes some time to figure that out; I’ve only kept sporadic notes, but I’ll bet that the first few answers I do more than skim over are no longer relevant at least half the time.

This is a time-wasting annoyance for me, but a source of real friction for my junior colleagues. What is merely an out-of-tune chorus for me is a cacophony for them; I’d be curious to know how often those sites send them down a no-longer-productive path.

In the Karaband

The an-Ruuda spent six hundred years fighting to free themselves from their undead masters. Sunlight was often their only weapon, so it’s not surprising they burn their fallen: they believe the souls of the righteous rise up from the flames to shine down and keep us safe.

The farmers of the Regimental Kingdoms, on the other hand, bury their dead to thank the earth for its bounty. As flesh becomes soil, then barley, then flesh again, so too, they believe, does the world make new spirits out of old.

The pirates of Bantang Ini and Bantang Barra believe we’re reborn as ourselves, living our lives over and over again until we get them right. That’s why they say “Im awa pfa ta” when anything goes wrong: it means, “Oh no, not this again.” And I met a tiger from Thind who said that dead is dead and all the rest is monkey foolishness. He asked me to ring a bell to remember a friend of his, though, so perhaps tigers can be foolish too.

But in the Karaband, where I was born, we believe that when people die they get to tell the tale of their life one last time. It doesn’t matter if the story is a quiet one or filled with great deeds and poetry. All that matters is whether the person telling it enjoys hearing it again.

All of which is to say that I’m truly sorry about poisoning you, but I really do need that amulet. We have a few minutes until the paralysis runs its course, though, and I’ve had a lot of practice listening in situations like these. I know it can be hard to begin, so repeat after me: once upon a time…

Comes Round Again

Another year over; what have I learned?

  • A story I wrote after my brother died was published by On Spec; you can read it here. I received over a dozen rejections and didn’t make any other sales, but decided to self-publish a novel called Beneath Coriandel; I hope you like it.

  • Research Software Engineering with Python was published. It took a lot longer than originally planned, but I’m very grateful to have worked with Damien, Kate, Luke, Joel, and Charlotte.

  • Danielle Smalls and I published “Ten quick tips for staying safe online”.

  • RStudio laid me off. I don’t understand why, or why I wasn’t allowed to try to find another role within the company, but asteroids happen. What angers me is how they fumbled the wind-down of the instructor certification program: a lot of people worked hard to get certified, and I think they deserved better.

  • I went through a bunch of job interviews. Some companies, like Automattic, do a great job of explaining the process and trying to make it fair. Others—well, here’s what happened with Canonical.

  • I spent six months with Metabase then moved to a straight-up programming position at Deep Genomics. I’m really enjoying coding for a living again after so many years, but a little dumbfounded by how much Python now looks like Enterprise Java…

  • We mothballed the TidyBlocks project: we’d reached the limits of what we could do as volunteers and nobody wanted to fund it.

  • I finished Software Tools in JavaScript, but there hasn’t been a lot of interest, so I decided to publish it on Leanpub until I can figure out what’s wrong with my beliefs about what programmers want to learn.

  • I ran a workshop on managing research software projects to help raise money for MetaDocencia. There was much less interest in it than I expected, so once again I’m going to pause any further development until I figure out what’s wrong with my mental model of what my hoped-for audience wants.

  • I revived It Will Never Work in Theory, which publishes short reviews of software engineering research papers that practitioners might be interested in. Posts get a median of just over a hundred views, which is either a lot (compared to the number of people who read academic research papers) or not (compared to the number of people who read fact-free blather on Hacker News). I’ll write more about this in the new year.

  • I had to stop playing guitar because of tenosynovitis in my right hand. I’ve picked up the tenor sax again in its place, which I’m probably enjoying more than our neighbors.

  • I got another tattoo to commemorate my mother and sister. I hope they would have liked it.

  • Favorite book: Driftwood by Marie Brennan.

  • Favorite movie: Free Guy. It’s not great art, but it was just what I needed.

  • Favorite album: Elina Duni’s Partir.

  • Favorite moment: a waltz with my wife.

In the wake of posts about Shopify's support for white nationalists and DataCamp's attempts to cover up sexual harassment, I have had to disable comments on this blog. Please email me if you'd like to get in touch.