Blog


The Real Hardest Problem

There are only two hard things in computer science: cache invalidation and naming things.

— Phil Karlton

With respect, I think that handling interrupts is harder than either of these. Yesterday’s post explained how SimPy does this. Today, after several near misses, we’ll look at how to add it to our simulation.

A Quick Recap

Our Simulation class now includes a process that waits a random interval, chooses a random developer, and interrupts her by calling .interrupt:

class Simulation:

    def annoy(self):
        while True:
            yield self.timeout(self.t_interrupt_arrival())
            dev = random.choice(self.developers)
            dev.proc.interrupt(self.t_interrupt_duration())

When .interrupt is called, SimPy injects a simpy.Interrupt exception into the target process; the argument to .interrupt is attached to that exception object as its .cause attribute. If we’re comfortable throwing away the task we’re currently working on, the Developer process can catch the exception like this:

class Developer(Labeled):
    def work(self):
        while True:
            req = None
            try:
                req = self.sim.dev_queue.get()
                task = yield req
                yield self.sim.timeout(task.dev_required)
            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()

The trickiest part of this is canceling the outstanding request to get something from the development queue. As we discovered yesterday, if we don’t do this, the developer won’t ever get anything else from the queue.

Nearly Right

What if we don’t want to throw away our current task when we’re interrupted? What if instead we want to handle the interruption and then resume that task? Here’s an implementation that’s almost right:

    # This code is wrong.
    def work(self):
        task = None
        while True:
            req = None
            try:
                if task is None:
                    req = self.sim.dev_queue.get()
                    task = yield req
                t_start = self.sim.now
                yield self.sim.timeout(task.dev_required - task.dev_done)
                task.dev_done += self.sim.now - t_start
                if task.is_done():
                    task = None
            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()
                yield self.sim.timeout(exc.cause)

Inside the try block we get a task if we don’t already have one, then try to wait for as much time as the task still requires. When we’re done we add some time to the task and check if it’s now complete. If we’re interrupted, we cancel any outstanding request for a new task and then wait for as long as the interrupt tells us to.

There are (at least) two things wrong with this code. The first and less important is that when we’re interrupted, we throw away any work we’ve done on the task during this iteration of the loop. That’s relatively easy to fix: we just add some code to the except block to increment the .dev_done time in the task.
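
For example, a minimal sketch of that first fix, reusing the t_start and .dev_done bookkeeping already in the loop, might look like this:

            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()
                if task is not None:
                    # Keep whatever progress was made before the interrupt.
                    task.dev_done += self.sim.now - t_start
                yield self.sim.timeout(exc.cause)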

The bigger problem, though, is that sooner or later this code fails because of an uncaught Interrupt exception. The problem is that we can be interrupted inside our interrupt handler. It isn’t likely, which means it doesn’t happen very often, but if we run the simulation often enough with different random seeds, it eventually falls over.

A Corrected Version

The “fix” (I’ll explain the scare quotes around that word in a moment) is to move interrupt handling into the try block. To do that, we have to add another state variable interrupt_delay that tells the process if it’s currently supposed to be handling an interruption delay:

    def work(self):
        task = None
        interrupt_delay = None
        while True:
            req = None
            try:
                if interrupt_delay is not None:
                    yield self.sim.timeout(interrupt_delay)
                    interrupt_delay = None
                else:
                    if task is None:
                        req = self.sim.dev_queue.get()
                        task = yield req
                    yield self.sim.timeout(task.dev_required - task.dev_done)
                    if task.is_done():
                        task = None
            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()
                interrupt_delay = exc.cause

So why did I put scare quotes around the word “fix”? Because I’m still not 100% sure this works. It hasn’t failed yet, despite multiple runs with different seeds and parameters, but this code is now complex enough that I could well believe it contains a one-in-a-million edge case. I think the except block is now a critical region, i.e., that no interrupts can occur within it because none of those three lines hands control back to SimPy, but I’m not completely sure.

And yes, this code still throws away any work the developer has done on a task during a particular loop iteration if an interrupt occurs; the interrupt handler should increment task.dev_done. And yes, it’s possible for an interrupt to be interrupted: a more realistic implementation would stack interrupt delays, but honestly, if my boss interrupts me while I’m being interrupted, I don’t have any qualms about discarding the first interruption.
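
For the record, here is an untested sketch of what stacking might look like; it replaces the single interrupt_delay with a list that gets drained before returning to the task:

    # Untested sketch: stack interrupt delays instead of discarding them.
    def work(self):
        task = None
        pending_delays = []
        while True:
            req = None
            try:
                if pending_delays:
                    # Serve queued interruptions before going back to work.
                    yield self.sim.timeout(pending_delays.pop())
                else:
                    if task is None:
                        req = self.sim.dev_queue.get()
                        task = yield req
                    yield self.sim.timeout(task.dev_required - task.dev_done)
                    if task.is_done():
                        task = None
            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()
                # A fuller version would also push back the unserved remainder
                # of a delay that was itself interrupted.
                pending_delays.append(exc.cause)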

Yaks and More Yaks

My goal with this series of blog posts was to simulate the development process of a small software team. I’ve spent most of the last week learning more about SimPy; it feels like yak shaving, but without it, I don’t think I’d have confidence in the code shown above (or be able to review its AI-generated equivalent).

Handling Interruptions

The previous post in this series looked at the effects of re-work on throughput. Its version of the simulation assumed that a task that needed to be re-done was given back to the developer who had originally worked on it, and that she would tackle that task once she finished whatever she was doing at the time. In reality, though, developers (and testers) often interrupt what they’re doing, switch to another task, and then switch back. This post looks at how to simulate that in SimPy.

A Simpler Model

To start with, let’s go back to a simpler model in which one process adds new tasks to a queue at random intervals, while developers take tasks, work on them, and then mark them as done. While we’re building it, we’ll use these parameters:

PARAMS = {
    "n_dev": 1,          # number of developers
    "rng_seed": 12345,   # random number generation seed
    "t_arrival": 8.0,    # mean task arrival time
    "t_dev": 5.0,        # median development time
    "t_interrupt": 3.0,  # time between interrupts
    "t_sim": 20,         # length of simulation
}

As before, we define a Simulation class to hold all our bits and pieces. Its constructor caches the parameters, then creates the SimPy Environment, the development queue, and the developers. It also creates a simple list called .log to store log messages:

class Simulation:
    """Overall simulation."""

    def __init__(self, params):
        """Construct."""

        self.params = params
        self.env = simpy.Environment()
        self.dev_queue = simpy.Store(self.env)
        self.developers = [Developer(self) for _ in range(params["n_dev"])]
        self.log = []

For lack of anywhere better to put it, we define the task generator as a method of Simulation. The order of statements in the while loop ensures that the first task is generated at t=0:

class Simulation:
    previous code

    def generate(self):
        """Generate tasks at random intervals starting at t=0."""

        while True:
            yield self.dev_queue.put(Task(self))
            yield self.timeout(self.t_arrival())

    def process(self, proc):
        """Shortcut for running a process."""
        return self.env.process(proc)

As before, we also give Simulation some convenience methods like .process so that we can type sim.process(…) rather than sim.env.process(…).
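
The task generator above also relies on .timeout and .t_arrival, which aren’t shown in this post. Here is a sketch of both, consistent with the convenience methods defined in an earlier post in this series and assuming exponentially distributed arrival times with mean t_arrival:

class Simulation:
    previous code

    def timeout(self, time):
        """Shortcut for delaying a fixed time."""
        return self.env.timeout(time)

    def t_arrival(self):
        """Exponentially distributed time until the next task arrives."""
        return random.expovariate(1.0 / self.params["t_arrival"])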

Creating Interrupts

What’s new in this simulation is another process whose only purpose is to interrupt developers. It does this by choosing a random developer dev every few clock ticks and then calling dev.proc.interrupt():

class Simulation:
    previous code

    def annoy(self):
        """Generate annoying interruptions."""

        while True:
            yield self.timeout(self.params["t_interrupt"])
            dev = random.choice(self.developers)
            dev.proc.interrupt()

dev.proc.interrupt() only makes sense once we have looked at the constructor for the Developer class:

class Developer:
    """A generic worker."""

    def __init__(self, sim):
        """Construct."""

        self.sim = sim
        self.proc = self.sim.process(self.work())

    def work(self):
        the generator simulating a developer's behavior…

Here’s what’s going on:

  1. Developer.work is (going to be) a generator that simulates an individual developer’s behavior.

  2. Calling it as self.work() in Developer.__init__ doesn’t start a process. Instead, that call creates a generator object.

  3. We pass that generator object to self.sim.process(), which gives it to the SimPy Environment to run. Doing this means that developers start running as soon as they are created.

  4. Finally, self.sim.process(…) returns a simpy.Process object that wraps the generator. We save this in the Developer object as self.proc so that we can access it later. (This step was the one that initially confused me. Developer isn’t a SimPy process: the Process object wrapping the generator created by calling Developer.work is the process. If we want to interrupt the process, we need a reference to that Process object, and the logical place to store one is in the object that defines its behavior.)
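
Here is a minimal, self-contained sketch of that distinction (the names work and annoy are just for illustration):

import simpy

def work(env):
    try:
        yield env.timeout(10)        # pretend to work on a task
    except simpy.Interrupt as exc:
        print(env.now, "interrupted:", exc.cause)

def annoy(env, proc):
    yield env.timeout(3)
    proc.interrupt("meeting")        # inject simpy.Interrupt into the process

env = simpy.Environment()
gen = work(env)                      # calling work() only creates a generator
proc = env.process(gen)              # wrapping it in a Process schedules it to run
env.process(annoy(env, proc))
env.run()                            # prints: 3 interrupted: meeting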

So let’s jump back to Simulation.annoy. It contains the lines:

    dev = random.choice(self.developers)
    dev.proc.interrupt()

which mean:

  1. Choose a developer at random.

  2. Get a reference to that developer’s process object dev.proc.

  3. Call the object’s .interrupt method.

The last of those steps injects a simpy.Interrupt exception into the process, so let’s take a look at how we handle that.

Handling Interrupts

Here’s a simple version of a developer that can handle interrupts:

class Developer(Labeled):
    constructor

    def work(self):
        while True:
            req = None
            try:
                req = self.sim.dev_queue.get()
                task = yield req
                yield self.sim.timeout(task.dev_required)
            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()

This developer does the following each time it goes around the while loop:

  1. Create a request to get a task from the development queue.

  2. yield that request, which hands the request to the SimPy Environment and suspends this process until the request can be satisfied.

  3. The result of yielding the request is a task from the development queue, so wait a while to simulate working on that task.

  4. Unless a simpy.Interrupt exception occurs, in which case the developer cancels the request for a task and goes around the loop again.

We have run into the need to cancel before. In the first version of this code, I assumed that interrupts would only occur while a developer was working on a task, so the body of the except block was just pass:

    # This version is wrong!
    def work(self):
        while True:
            req = None
            try:
                req = self.sim.dev_queue.get()
                task = yield req
                yield self.sim.timeout(task.dev_required)
            except simpy.Interrupt as exc:
                pass

What I found was that if the interrupt occurred while the developer was waiting on the queue, and I didn’t cancel the request, subsequent requests were never satisfied. In other words, once a developer had been interrupted, she would never get any more work.
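
Here is a minimal reproduction sketch of that behavior (the names are illustrative): the stale get request stays in the store’s queue and swallows the next item, so the process’s second request is never satisfied:

import simpy

def victim(store):
    try:
        yield store.get()            # interrupted while waiting here...
    except simpy.Interrupt:
        pass                         # ...and the request is NOT canceled
    item = yield store.get()         # the stale request is queued ahead of this one
    print("got", item)               # never reached

def poke(env, store, proc):
    yield env.timeout(1)
    proc.interrupt()
    yield env.timeout(1)
    yield store.put("task")          # consumed by the stale request and lost

env = simpy.Environment()
store = simpy.Store(env)
proc = env.process(victim(store))
env.process(poke(env, store, proc))
env.run(until=10)                    # "got task" is never printed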

How It Behaves

I won’t bother to show the code that adds log messages to Simulation.log or collects the states of all the tasks when the simulation is done, but here’s the JSON output from a typical short run:

{
  "params": {
    …as shown earlier…
  },
  "messages": [
    "developer-0 0.00 start task-0",
    "developer-0 3.00 interrupted in task-0",
    "developer-0 6.00 interrupted in None",
    "developer-0 9.00 interrupted in None",
    "developer-0 12.00 interrupted in None",
    "developer-0 13.95 start task-1",
    "developer-0 15.00 interrupted in task-1",
    "developer-0 15.01 start task-2",
    "developer-0 16.68 complete task-2",
    "developer-0 18.00 interrupted in None"
  ],
  "tasks": [
    {
      "id": 0,
      "created": 0,
      "dev_required": 4.33,
      "state": ["dev_queue", "dev", "interrupted"]
    },
    {
      "id": 1,
      "created": 13.95,
      "dev_required": 5.72,
      "state": ["dev_queue", "dev", "interrupted"]
    },
    {
      "id": 2,
      "created": 15.01,
      "dev_required": 1.67,
      "state": ["dev_queue", "dev", "complete"]
    }
  ]
}

As you can see, the first two tasks are interrupted and discarded while they’re being worked on, while the developer manages to finish the third task before she’s interrupted. The next step will be to resume tasks rather than just throwing them away.

Remembrance

Jamais plus. Never again.

plaque for the victims of the Dec 6 massacre

The Effects of Rework

Having solved yesterday’s bug, we can get back to looking at the effect of re-work on throughput. As a reminder, our workflow looks like this:

+----------------+    +-----------+    +------------+    +------------+    +---------+
| task generator | -> | dev queue | -> | developers | -> | test queue | -> | testers |
+----------------+    +-----------+    +------------+    +------------+    +---------+
                            ^                                                   |
                            |                                                   |
                            +---------------------------------------------------+

The two parameters that affect re-work are p_rework_needed, which is the probability that a tester is going to send a task back for more work, and p_rework_chosen, which is the probability that a developer will choose something from her re-work queue rather than starting fresh work when both are available. The extreme cases are:

  • p_rework_needed = 0.0, meaning no tasks are ever sent back.
  • p_rework_needed = 1.0, meaning every task is always sent back. (Note that this means no task will ever be finished: instead, tasks will circulate endlessly between development and testing, and yes, sometimes it does feel like that happens in real life.)
  • p_rework_chosen = 0.0, in which case developers only re-do tasks when there is no new work queued up. (We know from previous simulations that there are almost always new tasks waiting to be started.)
  • p_rework_chosen = 1.0, in which case developers only start new tasks when nothing needs to be re-done.

Finally, this version of the simulation always sends tasks back to the developers who originally worked on them rather than allowing the first available developer to re-do someone else’s work.
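
A sketch of that routing, assuming each task records who developed it in a .developer attribute, each developer has her own rework_queue, and helper names like test_required and rework_needed that follow the patterns used elsewhere in this series:

class Tester(Labeled):
    def work(self):
        while True:
            task = yield self.sim.test_queue.get()
            yield self.sim.timeout(task.test_required)
            if self.sim.rework_needed():
                # Send the task back to whoever developed it,
                # not to a shared development queue.
                yield task.developer.rework_queue.put(task)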

Final States

The figure below shows the number of tasks in each of the five possible states at the end of the simulation as a function of p_rework_needed and p_rework_chosen:

final task states

As expected, lots of tasks are in the complete state when there is never any need for re-work, while none are when re-work is always needed. This finding may seem obvious, but this simple check uncovered a couple of bugs in the simulation.

A more interesting finding is that how developers choose tasks doesn’t seem to have much effect on how many they get done: while there is some variation, the bars stay more or less the same height when we look at each row.

We can reach the same conclusion by looking at the number of times tasks were developed and re-worked. The sizes of the circles in the plots below reflect these counts:

development count
re-work count

Again, the probability of needing re-work has an obvious impact, while the probability of choosing new work vs. re-work doesn’t seem to.

You Have to Cancel

It has taken me almost three hours to answer my earlier question about SimPy, and once I recover, I’m going to submit a couple of PRs for their documentation. To recap, the most recent version of the simulation allows testers to bounce work back to the development queue if they find bugs:

+----------------+    +-----------+    +------------+    +------------+    +---------+
| task generator | -> | dev queue | -> | developers | -> | test queue | -> | testers |
+----------------+    +-----------+    +------------+    +------------+    +---------+
                            ^                                                   |
                            |                                                   |
                            +---------------------------------------------------+

I’m going to walk through the code I wrote, the problem I had, and the solution I (think I’ve) found step by step. If there’s a simpler way, or if this is clearly documented somewhere, please let me know.

Tasks

A Task has a unique ID and keeps track of how many times it has been worked on:

class Task:
    def __init__(self, ident):
        self.ident = ident
        self.count = 0

    def __str__(self):
        return f"task-{self.ident}/{self.count}"

The Overall Simulation

Since I don’t want to have to pass a whole bunch of parameters around, Simulation stores the SimPy Environment, our single developer, our single tester, and the three queues: one for new tasks, one for tasks that need to be reworked, and one for tasks that are ready for testing:

class Simulation:
    def __init__(self):
        # Environment.
        self.env = simpy.Environment()

        # One developer and one tester.
        self.developer = Developer(self)
        self.tester = Tester(self)

        # Queues for new development work, rework, and testing.
        self.q_dev = simpy.Store(self.env)
        self.q_rework = simpy.Store(self.env)
        self.q_test = simpy.Store(self.env)

Simulation also defines the generator that creates new tasks and puts them in the ready-to-develop queue:

class Simulation:
    previous code
    def generate(self):
        i = 0
        while True:
            task = Task(i)
            i += 1
            print(f"{self.env.now:.2f} create {task}")
            yield self.q_dev.put(task)
            yield self.env.timeout(T_GEN)

Finally, Simulation defines a convenience method that runs everything:

    def run(self):
        self.env.process(self.generate())
        self.env.process(self.developer.work())
        self.env.process(self.tester.work())
        self.env.run(until=T_SIM)

For testing, the parameters are:

N_TASK = 2  # number of tasks to create
T_GEN = 1   # time between tasks
T_DEV = 1   # time to do development
T_TEST = 1  # time to do testing
T_SIM = 4   # how long to run the whole simulation

Workers

A generic Worker (either a developer or a tester) caches the object that holds the overall simulation and provides a couple of convenience methods for getting the current time and for printing itself:

class Worker:
    def __init__(self, sim):
        self.sim = sim

    @property
    def now(self):
        return self.sim.env.now

    def __str__(self):
        return f"{self.__class__.__name__}/{self.now:.2f}"

Testers

The tester gets a task from the testing queue, waits a tick to simulate work, increments the task’s internal count of how many times it has gone around the loop, and then puts it in the rework queue:

class Tester(Worker):
    def work(self):
        while True:
            # Get a task from the testing queue.
            print(f"{self}: wait for task(s)")
            task = yield self.sim.q_test.get()
            print(f"{self}: got {task}")

            # Do testing.
            yield self.sim.env.timeout(T_TEST)
            task.count += 1

            # Always put the task in the rework queue.
            yield self.sim.q_rework.put(task)
            print(f"{self}: put {task} in rework queue")

Developers (the wrong version)

This code is wrong, but will be fixed below.

My original Developer waited until a task was available in either the development queue or the rework queue, then selected one of those tasks (giving priority to rework), waited a tick to simulate work, and put the task in the testing queue:

class DeveloperWrong(Worker):
    def work(self):
        while True:
            # Get either a task for development or a task for rework.
            req_dev = self.sim.q_dev.get()
            req_rework = self.sim.q_rework.get()
            print(f"{self}: wait for task(s)")
            result = yield simpy.AnyOf(self.sim.env, [req_dev, req_rework])

            # Give priority to rework.
            if req_rework in result:
                task = result[req_rework]
                print(f"{self} got {task} from rework with {len(result.events)} events")
            elif req_dev in result:
                task = result[req_dev]
                print(f"{self} got {task} from dev with {len(result.events)} events")
            else:
                assert False, "how did we get here?"

            # Do development.
            yield self.sim.env.timeout(T_DEV)

            # Put the task in the testing queue.
            yield self.sim.q_test.put(task)
            print(f"{self}: put {task} in testing queue")

According to the documentation, simpy.AnyOf creates an object that means “give me any of the following”. Its result is a dictionary-like object whose keys are the requests and whose values are items from one or more of the queues being waited on: “or more” because it’s possible for items to be available in multiple queues simultaneously.

Output from the Buggy Version

Here’s the output using the DeveloperWrong class shown above:

0.00 create task-0/0
DeveloperWrong/0.00: wait for task(s)
Tester/0.00: wait for task(s)
DeveloperWrong/0.00 got task-0/0 from dev with 1 events
1.00 create task-1/0
DeveloperWrong/1.00: put task-0/0 in testing queue
DeveloperWrong/1.00: wait for task(s)
Tester/1.00: got task-0/0
DeveloperWrong/1.00 got task-1/0 from dev with 1 events
2.00 create task-2/0
Tester/2.00: put task-0/1 in rework queue
Tester/2.00: wait for task(s)
DeveloperWrong/2.00: put task-1/0 in testing queue
DeveloperWrong/2.00: wait for task(s)
Tester/2.00: got task-1/0
DeveloperWrong/2.00 got task-2/0 from dev with 1 events
3.00 create task-3/0
Tester/3.00: put task-1/1 in rework queue
Tester/3.00: wait for task(s)
DeveloperWrong/3.00: put task-2/0 in testing queue
DeveloperWrong/3.00: wait for task(s)
Tester/3.00: got task-2/0
DeveloperWrong/3.00 got task-3/0 from dev with 1 events

A lot is going on in those 23 lines, so let’s look at what the developer got from where:

DeveloperWrong/0.00 got task-0/0 from dev with 1 events
DeveloperWrong/1.00 got task-1/0 from dev with 1 events
DeveloperWrong/2.00 got task-2/0 from dev with 1 events
DeveloperWrong/3.00 got task-3/0 from dev with 1 events

Hm: the developer is never taking tasks from the rework queue. Is the tester putting them in?

Tester/0.00: wait for task(s)
Tester/1.00: got task-0/0
Tester/2.00: put task-0/1 in rework queue
Tester/2.00: wait for task(s)
Tester/2.00: got task-1/0
Tester/3.00: put task-1/1 in rework queue
Tester/3.00: wait for task(s)
Tester/3.00: got task-2/0

So what’s going on?

A Corrected Developer

Here’s the skeleton of the corrected developer’s work method:

class DeveloperRight(Worker):
    def work(self):
        which = "dev"
        take = None
        while True:
            # Get either a task for development or a task for rework.
            req_dev = self.sim.q_dev.get()
            req_rework = self.sim.q_rework.get()
            print(f"{self}: wait for task(s)")
            result = yield simpy.AnyOf(self.sim.env, [req_dev, req_rework])

            magic happens here

            # Do development.
            yield self.sim.env.timeout(T_DEV)

            # Put the task in the testing queue.
            yield self.sim.q_test.put(task)
            print(f"{self}: put {task} in testing queue")

The bit of magic in the middle is that as far as I can tell, all of the requests that weren’t selected have to be canceled, even if there wasn’t something in the queue at the time for the developer to take. For example, the first time the developer gets a task from the development queue, the rework queue is guaranteed to be empty, but we still have to cancel the request for something from it.

In order to demonstrate this, I’ve filled in the magic portion of work with code that alternates between taking work from the development queue (if available) and taking work from the rework queue (ditto). Of course, if there is only one event in the result from AnyOf, the code uses that, but still cancels the other request:

    def work(self):
        which = "dev"
        take = None
        while True:
            get either a task for development or a task for rework

            # Pick one.
            if len(result.events) == 2:
                print(f"{self}: ...got two events")
                if which == "dev":
                    take = "dev"
                    which = "rework"
                else:
                    take = "rework"
                    which = "dev"
            elif req_dev in result.events:
                print(f"{self}: ...got dev")
                take = "dev"
            elif req_rework in result.events:
                print(f"{self}: ...got rework")
                take = "rework"
            else:
                assert False, "how did we get here?"

            # Now take it.
            if take == "dev":
                print(f"{self}: ...taking dev")
                task = result[req_dev]
                req_rework.cancel()
            elif take == "rework":
                print(f"{self}: ...taking rework")
                task = result[req_rework]
                req_dev.cancel()
            else:
                assert False, "how did we get here?"
            print(f"{self} got {task}")

            do development and put the task in the testing queue

Here are the relevant bits of output when the simulation is run for 10 ticks instead of just 4:

DeveloperRight/0.00: ...taking dev
DeveloperRight/1.00: ...taking dev
DeveloperRight/2.00: ...taking dev
DeveloperRight/3.00: ...taking rework
DeveloperRight/4.00: ...taking dev
DeveloperRight/5.00: ...taking rework
DeveloperRight/6.00: ...taking dev
DeveloperRight/7.00: ...taking rework
DeveloperRight/8.00: ...taking dev
DeveloperRight/9.00: ...taking rework

As I said at the outset, I’m going to submit some PRs for SimPy’s documentation to shout loudly and clearly that outstanding requests need to be canceled (or possibly recycled: I haven’t tried that yet). Meanwhile, I can now get back to playing with the impact of rework fractions on throughput.

In Search of Sturdiness

I took a break from this series of posts for a couple of days to focus on looking for a new role and because I’ve hit a bit of a roadblock. In the most recent version of the simulation, testers can bounce work back to the development queue if they find bugs:

+----------------+    +-----------+    +------------+    +------------+    +---------+
| task generator | -> | dev queue | -> | developers | -> | test queue | -> | testers |
+----------------+    +-----------+    +------------+    +------------+    +---------+
                            ^                                                   |
                            |                                                   |
                            +---------------------------------------------------+

Tasks that have been sent back for more work always have higher priority than new tasks, but that’s not realistic. What I’d like to model is a system in which re-work is done by the same developer who did the initial work. I can do this by creating a re-work queue for each developer, and use SimPy’s AnyOf operator to select from either queue:

class Developer(Worker):

    def work(self):
        while True:
            rework_get = self.rework_queue.get()
            new_get = self.sim.dev_queue.get()
            result = yield simpy.AnyOf(self.sim.env, [rework_get, new_get])

            if rework_get in result.events:
                task = result[rework_get]
                new_get.cancel()
            else:
                task = result[new_get]

Notice that the “get” request for the new-work queue is canceled when there’s re-work to be done. Doing this takes care of the case in which items were available in both queues (which turns out to be very common); I found by trial and error that canceling a “get” request that is still pending simply withdraws it without raising an error, which simplifies the logic a little bit.

So here’s my problem: when there is re-work to be done and the developer is working on a new task, I’d like to interrupt the developer and have them do re-work, then resume their previous task; in other words, I’d like re-work to take immediate precedence over new work. However, if the developer is already doing re-work, I don’t want to interrupt them.

SimPy supports interrupts: if you have a process p, you can call p.interrupt() to throw an exception into it. The process can catch this and do whatever it wants:

def p():
    while True:
        try:
            normal operation
        except simpy.Interrupt:
            interrupt handler

So now what? As near as I can figure, when a developer is interrupted, it checks to see whether it’s already doing re-work or not. If it is, it puts the task associated with the re-work interrupt in its re-work queue. If not, it puts the (new) task it’s working on in the re-work queue and switches tasks. Hm, that means the developer’s queue has to be a priority queue so that re-work takes precedence over new work. And we need to keep track of time spent so far on each task. And there’s probably at least one corner case I haven’t thought of yet, because this is a concurrent system with interrupts and there are always corner cases.
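
To make that concrete, here is an untested sketch of just the interrupt handler, assuming the interrupt’s cause is the task that needs re-work, that each developer’s queue is a simpy.PriorityStore with PRI_REWORK ranking ahead of PRI_NEW, and reusing the req/task/t_start bookkeeping pattern from the work loops shown earlier on this page. It almost certainly harbors at least one of those corner cases:

            except simpy.Interrupt as exc:
                # Puts on an unbounded store succeed immediately, so they are
                # not yielded here; yielding inside the handler could itself
                # be interrupted.
                if req is not None:
                    req.cancel()                 # the lesson from before
                incoming = exc.cause             # the task that needs re-work
                if doing_rework:
                    # Already handling re-work: queue the newcomer.
                    self.rework_queue.put(simpy.PriorityItem(PRI_REWORK, incoming))
                else:
                    # Shelve the current task (recording time spent) and switch.
                    if task is not None:
                        task.dev_done += self.sim.now - t_start
                        self.rework_queue.put(simpy.PriorityItem(PRI_NEW, task))
                    task = incoming
                    doing_rework = True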

A few final thoughts:

  1. The title of this post is “In Search of Sturdiness” because I think that word best sums up what I’m looking for. I don’t care all that much if a solution is elegant or graceful or clever; I want something sturdy because I want to make more changes to the simulation and I don’t want to have to come back and re-design this part.

  2. AI tools aren’t much help here, at least not the ones I’ve tried. They’re good at generating plausible answers, but this situation needs one that’s right as well.

  3. This is where I’d like to make use of a tool like Alloy or TLA+, but they aren’t already part of my repertoire, and I don’t have time for a three-month side quest to master one of them.

  4. I’ve tried to find an experienced SimPy user, without luck. If you are one, and are willing to offer some design advice, I’d be very grateful if you would reach out.

Observability

So far we’ve been collecting whatever data we want whenever something significant happens in the simulation, but we can’t do that in the real world. Since we need to tidy up the simulation a bit, let’s make data collection a little more realistic as well.

The Overall Simulation

As before, we’re going to store the major pieces of the simulation in a Simulation object so that we don’t have to pass lots of bits and pieces to every function and method. This class holds the environment, the developers and testers, their two priority queues, and a logger:

class Simulation:
    """Overall simulation."""

    def __init__(self, params):
        """Construct."""

        self.params = params
        self.env = simpy.Environment()

        self.developers = [Developer(self) for _ in range(params["num_dev"])]
        self.dev_queue = simpy.PriorityStore(self.env)

        self.testers = [Tester(self) for _ in range(params["num_tester"])]
        self.test_queue = simpy.PriorityStore(self.env)

        self.logger = Log(self)

As before, we provide some convenience methods to get at useful properties of the underlying Environment:

class Simulation:
    previous code

    @property
    def now(self):
        """Shortcut for current time."""
        return self.env.now

    def process(self, proc):
        """Shortcut for running a process."""
        return self.env.process(proc)

    def timeout(self, time):
        """Shortcut for delaying a fixed time."""
        return self.env.timeout(time)

and put all the randomization here as well:

class Simulation:
    previous code

    def rand_task_arrival(self):
        """Task arrival time."""
        return random.expovariate(1.0 / self.params["task_arrival"])

    def rand_test_time(self):
        """Testing time."""
        return random.lognormvariate(0, 1) * self.params["median_test_time"]

    def rand_rework(self):
        """Does this task need to be reworked?"""
        return random.uniform(0, 1) < self.params["prob_rework"]

    def rand_dev_time(self):
        """Development time."""
        return random.lognormvariate(0, 1) * self.params["median_dev_time"]
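
(For what it’s worth, random.lognormvariate(0, 1) has a median of exp(0) = 1.0, which is why multiplying it by a parameter yields a distribution whose median is that parameter, as the names median_test_time and median_dev_time suggest.)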

Finally, we provide one method to generate new tasks and one to launch all the active processes in the simulation:

class Simulation:
    previous code

    def generate(self):
        """Generate tasks at random intervals starting at t=0."""

        yield self.dev_queue.put(Task(self))
        while True:
            yield self.timeout(self.rand_task_arrival())
            yield self.dev_queue.put(Task(self))

    def run(self):
        """Run the whole simulation."""

        self.process(self.generate())
        self.process(self.logger.work())
        for workers in (self.developers, self.testers):
            for w in workers:
                self.process(w.work())
        self.env.run(until=self.params["sim_time"])

Two things to note in the code above:

  1. We always generate the first task at time 0. We don’t have to do this, but it always felt weird looking at the statistics to have a small delay at the start before anything happened.

  2. The logger is an active process, just like the developers, testers, and task generator. We’ll look at it in more detail in a bit.

Developers

Earlier posts in this series showed how we give each developer (and tester) a unique ID and how we automatically create lists of them. We won’t repeat that code here, since the most interesting thing about a developer is the work she does:

class Developer(Worker):

    def work(self):
        while True:
            task = yield self.sim.dev_queue.get()

            with WorkLog(
                self,
                (Worker.State.BUSY, Worker.State.IDLE),
                task,
                (Task.State.DEV, Task.State.WAIT_TEST),
                "n_dev",
                "t_dev",
            ):
                yield self.sim.timeout(task.required_dev)

            yield self.sim.test_queue.put(task)

If we ignore WorkLog for a moment, we can see that Developer.work is:

    def work(self):
        while True:
            task = yield self.sim.dev_queue.get()
            yield self.sim.timeout(task.required_dev)
            yield self.sim.test_queue.put(task)

i.e., wait to get a task from the development queue, pause for a while to do the work, and then put the task in the testing queue.

Recording Work

So what is WorkLog? It’s a context manager, i.e., a class that does something at the start of a block and then automatically does some cleanup at the end of the block. In our case, what it does at the start is record the start time of the work and change the state flags of the worker and task:

class WorkLog:
    """Context manager to keep track of elapsed time."""

    def __init__(self, worker, worker_states, task, task_states, key_num, key_time):
        """Construct."""

        self.worker = worker
        self.worker_states = worker_states
        self.task = task
        self.task_states = task_states
        self.key_num = key_num
        self.key_time = key_time
        self.start = None

    def __enter__(self):
        """Start the clock."""
        self.start = self.worker.sim.now
        self.worker.state = self.worker_states[0]
        self.task.state = self.task_states[0]

At the end of the block—i.e., after the simulated work is done—the context manager records how long the worker spent and how many tasks it has processed, then changes the states of the task and worker again:

class WorkLog:
    previous code

    def __exit__(self, exc_type, exc_value, traceback):
        """Stop the clock."""

        elapsed = self.worker.sim.now - self.start

        self.worker.state = self.worker_states[1]
        self.worker["busy"] += elapsed
        self.worker["n_task"] += 1

        if self.task_states[1] is not None:
            self.task.state = self.task_states[1]
        self.task[self.key_num] += 1
        self.task[self.key_time] += elapsed

        return False

Using WorkLog adds eight lines to what was a three-line simulation loop, but we learned the hard way as this code evolved that keeping the bookkeeping for developers and testers in sync is error-prone. Putting that code in a class, and ensuring that the end-of-work record keeping is always done, speeds up development.

Testers

The Tester class is similar to the Developer class:

class Tester(Worker):

    def work(self):
        while True:
            task = yield self.sim.test_queue.get()

            with WorkLog(
                self,
                (Worker.State.BUSY, Worker.State.IDLE),
                task,
                (Task.State.TEST, None),
                "n_test",
                "t_test",
            ):
                yield self.sim.timeout(task.required_test)

            if self.sim.rand_rework():
                task.priority = Task.PRI_HIGH
                task.state = Task.State.WAIT_DEV
                self.sim.dev_queue.put(task)
            else:
                task.state = Task.State.COMPLETE

This class gets work from the ready-to-test queue, and then decides once the work is done whether the task needs to be re-developed or not. If it does, it goes back into the ready-to-develop queue with high priority to ensure that tasks needing rework are done before tasks that haven’t been touched yet. Without this rule, far fewer tasks are completed, and far more tasks are half-done at the simulation’s end.
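
One detail this relies on: simpy.PriorityStore keeps its items in sorted order, so putting Task objects into it directly only works if tasks can be compared. A sketch of the comparison method that would make this work, assuming the .priority attribute set above:

class Task(Labeled):
    previous code

    def __lt__(self, other):
        """Order tasks by priority so PriorityStore can sort them."""
        return self.priority < other.priority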

Logging

It’s finally time to look at the Log class that records data. Its constructor makes space to record snapshots of the states of all the tasks in the system, the lengths of all the queues, and what the developers and testers are doing:

class Log:
    """Keep a log of interesting things."""

    def __init__(self, sim):
        """Construct."""

        self.sim = sim
        self.snapshot = {
            "tasks": [],
            "queues": [],
            "workers": [],
        }

The generator that does the active work is simplicity itself: record the state of the simulation, then wait a fixed interval (which we will initially set to one clock tick):

    def work(self):
        while True:
            self._record()
            yield self.sim.timeout(self.sim.params["log_interval"])

The code to record the simulation state is a bit messy, because it depends on the implementations of various classes, but in brief it creates records for:

  1. the state of every task in the system;

  2. the lengths of the two queues; and

  3. the state of each developer and tester.

    def _record(self):
        """Record a log entry at a particular moment."""

        now = self.sim.now
        for t in Labeled._all[Task]:
            self.snapshot["tasks"].append(
                {"time": now, "id": t.id, "state": str(t.state)}
            )
        for name, queue in (("dev", self.sim.dev_queue), ("test", self.sim.test_queue)):
            self.snapshot["queues"].append(
                {"time": now, "name": name, "length": len(queue.items)}
            )
        for kind, cls in (("dev", Developer), ("test", Tester)):
            for w in Labeled._all[cls]:
                self.snapshot["workers"].append(
                    {"time": now, "kind": kind, "id": w.id, "state": str(w.state)}
                )

What Have We Learned?

The first thing we learned from this refactoring is that roughly half the code is devoted to collecting data. This won’t come as a surprise to anyone who has ever built a monitoring tool: figuring out what’s going on is often as hard as making things happen, and sometimes harder.

The second thing we’ve learned is that our simulation can still surprise us. The graph on the left shows the lengths of the development and testing queues over time, while the one on the right shows how many tasks are in each of the possible states:

Queue Lengths
Task States

The steady growth in the number of tasks waiting for a developer makes sense: we’re generating new ones faster than they can be completed. But why does the length of the testing queue rise and then fall? Is it just random variation? Or is it revealing some unexpected property of our development process? That question will have to wait for tomorrow.

What Changed Revisited

Let’s take another look at the log-scaled version of yesterday’s chart:

log-scale time plot

There hasn’t been any turnover in the team or a major rewrite of the product, so why has the number of bugs that take more than 100 hours to close been creeping up slowly but steadily? Luckily, we have access to the source code of the simulation, so we can get a definitive answer. (Yes, this is cheating…)

The first difference is that the development and testing queues are priority queues:

class Simulation:

    def __init__(self, params):
        self.params = params
        self.env = simpy.Environment()
        self.dev_queue = simpy.PriorityStore(self.env)
        self.test_queue = simpy.PriorityStore(self.env)

We have also added constants to define two priority levels—remember, a lower number is a higher priority:

PRI_REWORK = 0
PRI_NEW = 1

We have also made a small but crucial change to the class that simulates a tester:

class Tester(Labeled):

    def work(self):
        while True:
            item = yield self.sim.test_queue.get()
            priority, task = item
            yield self.sim.timeout(task["test_time"])
            if self.sim.task_rework():
                self.sim.dev_queue.put(simpy.PriorityItem(PRI_REWORK, task))

The last two lines of work say, “If testing reveals that this task needs to be re-done, put it back in the development queue with high priority.” If you’ll excuse a bit more ASCII art, this change means that our simulation is now:

+----------------+    +-----------+    +------------+    +------------+    +---------+
| task generator | -> | dev queue | -> | developers | -> | test queue | -> | testers |
+----------------+    +-----------+    +------------+    +------------+    +---------+
                            ^                                                   |
                            |                                                   |
                            +---------------------------------------------------+

From quarter to quarter, the probability of QA revealing that a task needs to be re-done has gone from 0% to 50%. As noted yesterday, we see a corresponding decrease in the total number of tasks completed each quarter:

quarter    tasks completed
   1            2222
   2            1912
   3            1525
   4             993

What these numbers don’t tell us is why. Are our testers getting better at finding bugs? Are our developers making more mistakes because they’re burning out? Or has there been some sort of change in how work is being tracked? (0% rework in the first quarter seems suspicious…) Once again, data can draw our attention to things that need to be explained, but we still have to find the explanation.