Blog


Cognitive Pollution

Update: I have posted a recording of the talk.

I am grateful to Prof. Alberto Bacchelli for inviting me to give a colloquium at the University of Zurich a couple of days ago. My talk was titled Cocaine and Conway’s Law, but what it was really about was how to teach young software developers about cognitive pollution.

When most existing software engineering courses talk about risk, they talk about single-point incidents like the Therac-25 or Ariane 5 flight V88, where there is a direct cause-and-effect relationship between a mistake in software and something bad happening. I think we should instead talk about things like tetraethyl lead and asbestos, or about Purdue Pharma’s role in the opioid epidemic and the fossil fuel lobby’s campaign to promote climate change denial.

In each of these cases, deliberate choices increased the general level of risk to hundreds of millions of people, but the statistical nature of that risk allowed those responsible to avoid personal accountability. I think that case studies like these will help learners understand things like the role that Meta’s algorithms played in the Rohingya genocide and think more clearly about scenarios like the one below:

It is 2035. Most men under 20 have learned what they “know” about what women like from AI chatbots trained on porn and tuned to maximize engagement. As a result, many of them believe that a frightened “no” actually means “yes please”. The people who have earned billions from these systems cannot legally be held accountable for their users’ actions.

How does that make you feel?

If you are currently teaching an undergraduate course that covers cases like these, please get in touch: I’d be grateful for a chance to learn from you.

One Small Command

Please note that I am suffering from jet lag and recovering from a bad cold while writing this, which means my proposal may well be garbage.

I’ve had a lot of conversations over the years about the differences in how software engineers and data scientists work. One example is how they manage software: software engineers tend to refactor code in place, while data scientists tend to copy a notebook or script and then modify the copy.

The problem is that software engineers build tools for software engineers, which means they don’t automatically support data scientists’ workflows. Continuing the refactor-versus-copy example, Git doesn’t have a way to explicitly say “this file started as a copy of that one”. Git has a way to say “this file was moved or renamed” (git mv), but there isn’t a corresponding git cp command because software engineers believe that you shouldn’t be doing that. You can ask Git to guess which files were copied in each commit:

git log --find-copies --diff-filter=C --stat

but (a) you probably didn’t know this existed, (b) you’re not going to remember it, and (c) Git’s heuristics often produce incorrect answers.

So let’s add git cp so that the log records copying events explicitly. That will allow us to trace the lineage of copied-and-modified notebooks and scripts (and the copied-and-modified configuration files that software engineers create because they don’t think of YAML and TOML as code). Doing this won’t solve all our traceability problems, but I think it will solve some of them, and we’ll learn something useful from its failure if it doesn’t.
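A minimal sketch of what this might look like, assuming a small shell script called git-cp somewhere on the PATH (Git runs git-foo scripts when you type git foo) and Git 2.32 or later for the --trailer option; the script name and the Copied-From trailer are my inventions, not existing Git features:

#!/usr/bin/env sh
# Hypothetical git-cp: copy a file, stage the copy, and record where it
# came from in a commit trailer so the provenance shows up in 'git log'.
# Usage: git cp old-notebook.ipynb new-notebook.ipynb
set -e
src="$1"
dst="$2"
cp "$src" "$dst"
git add "$dst"
git commit -m "Copy $src to $dst" --trailer "Copied-From: $src"

Running git log --grep='Copied-From:' afterward would then list copying events explicitly instead of asking Git to guess.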

Labwork to Leadership

Jen Heemstra’s Labwork to Leadership: A Concise Guide to Thriving in the Science Job You Weren’t Trained For is the book I’ve been hoping to find for years. It includes sections on managing yourself, managing others, and coaching future leaders, all grounded in first-hand experience and empirical research. I wish it included more on organizational change and winding projects down, but those are my preoccupations: as is, it would be a great foundation for a Carpentries-style workshop, and I hope someone creates that.

cover of Heemstra's 'Labwork to Leadership'

Story Dice

My daughter gave me a set of story dice for Christmas a couple of years ago. One of them has gone missing, which makes me wonder if there are now stories I’m no longer able to tell.

three red and two green story dice

Time to make another cup of tea. If you came in peace, be welcome.

Loading the Dishwasher

My daughter was profligate in her use of kitchenware: she would always grab a fresh glass for water instead of re-using the one that she had just emptied, and never used three saucepans to cook when she could use five. Now that she has left home, I only need to load the dishwasher once a day.

But I wish I still had to load it twice.

Time to make another cup of tea. If you came in peace, be welcome.

Time Spent on Hardening

I recently received mail from someone working on a software-based approach to fault tolerance. Their tool makes applications more reliable, but they think it also makes developers more productive by reducing the amount of error detection and handling code they need to write.

They have never been able to find research that quantifies how much time developers spend on code for detecting and handling problems relative to the effort for the “happy path”. They know it’s substantial, and (probably) increasing as applications become more distributed, but the only number they’ve found is from a 1995 book called Software Fault Tolerance, where Dr. Flaviu Cristian says that it often accounts for more than two-thirds of the code in production systems.

So I asked a dozen researchers I met through It Will Never Work in Theory if they knew of anything, and the answer was, “No, there isn’t anything that specifically addresses that question.” This strikes me as odd, because it wouldn’t be hard to measure and the answer would be interesting.
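To be clear, counting lines is not the same as measuring effort, but even the code-volume side of the question is easy to approximate. A very rough sketch, assuming a Go codebase where error handling is signalled by 'if err != nil' (which undercounts, since the handling logic itself usually spans more lines):

# Crude proxy: compare the number of standard Go error checks
# to the total number of lines of Go source in the project.
total=$(find . -name '*.go' -exec cat {} + | wc -l)
checks=$(find . -name '*.go' -exec cat {} + | grep -c 'if err != nil')
echo "$checks error checks in $total lines"

Even numbers as crude as that from a few real systems would make the conversation more concrete than it is now.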

People do throw around questionable numbers about the cost of bugs and bug fixing, e.g., the claim that bugs cost companies $2 trillion in 2020. Here are some other related resources my contacts were able to give me:

Again, the fact that we don’t have reliable figures for this strikes me as odd. As one of them pointed out, while everyone is throwing LLMs at often artificial and academic problems and then claiming to have improved some arbitrary metric X% over a random baseline, we still don’t know fairly basic things about software development.

My thanks to everyone who responded to my late-night email about this.

Later: this post made the #8 spot on Hacker News. It must have been a slow day…


Searching for Closure

For every beginning there must be an ending, but we don’t like to talk about that, particularly not in the tech industry. There are thousands of books in print about how to start a business, but only a handful about how to pass one on, and many of those are really about how to sell out at the right time.

I have experienced a lot of endings, and the most important thing I’ve learned is that they can be dignified and fulfilling if done well. I also think that preparing for the end can make it less likely, and make what happens before it more enjoyable. However, a lot of people aren’t being given the chance to wind things down gracefully. Between the Trump administration’s attack on science and the cuts big tech companies are making in the name of AI, thousands of people are being given days (or less) to end years of work.

I am therefore assembling material for a half-day workshop on project closure. If you or someone you know has ended a software project or scientific research project, I’d be very grateful if you could spare half an hour for an online interview: you can reach me by email at gvwilson@third-bit.com.

Note: all discussion will be confidential, and everyone interviewed will be able to review and veto anything that mentions them before it is seen by anyone else.

Learner Personas

There are important differences between deliberate closure (shutting a project down of your own accord and on your own timeline) and abrupt closure (shutting it down on short notice under difficult circumstances). This workshop therefore caters to two learner personas.

Vaida

Liam

Mastodon and Webbly

I was going to title this post “Two Great Tastes That Taste Great Together”, but I expect most of my readers are too young to get the reference, so I’ll just dive right in:

  1. Glitch gave literally millions of people a chance to build something on the web without having to wrestle with NPM or webpack or set up a server or deal with any of the other crap that Sumana Harihareswara has dubbed inessential weirdness. It was beautiful and useful, but it wasn’t profitable enough for Fastly to keep it alive.

  2. But the idea of a low-overhead in-the-browser way for the 99% to build things didn’t start with Glitch and hasn’t died with it either. Projects like Webbly (source here) are still trying to let people use the web to build the web. However, someone has to host these things somewhere: who’s going to do that, and where? More specifically, can we construct a hosting solution that isn’t tied to a particular company and therefore doesn’t have a single point of failure?

  3. Well, what about Mastodon? Its authors and users are deeply committed to decentralization and federation, and more people are running servers for particular communities every day. What if (wait, hear me out) what if Webbly was bundled with Mastodon so that Mastodon site admins could provide an in-the-browser page-building experience to their users simply by saying “yes” to one configuration option?

  4. Why would they do that? My answer is, “Take a look at Mastodon’s default browser interface.” It lets you add a couple of pictures and a few links to your profile, but that’s less than MySpace offered twenty years ago. I am 100% certain that if Mastodon came with an easy in-browser page builder, people would use it to create all sorts of wonderful things. (Awful ones too, of course, but Mastodon site admins already have to grapple with content moderation.)

Greenspun’s Tenth Rule is that every sufficiently complicated program contains a mediocre implementation of Lisp. Equally, I think every useful web-based tool is trying to be what Visual Basic was in the 1990s and what WordPress was to the early web: useful, right there, and a gradual ramp for new users rather than a cliff to climb. I think the sort of people who built useful little things with Glitch would do amazing things with Webbly if it was married to their social media. I also think that allowing people to create custom home pages or tweak their feeds would draw a lot of new users away from fragile, centralized systems like X and Bluesky. I know that I’ve been wrong far more often than I’ve been right, but this really does feel promising.