Blog


104 Days

It has been 104 days since I was laid off. In that time I have written approximately 64,000 words, of which 75% has been fiction and 25% non-fiction. (These figures don’t include email or social media.) I’ve actually written on all but 30 of those 104 days; at 71%, that puts me a little short of my 75% target but slightly ahead of the 65% of days I’ve managed over the past year.

As for time, I’m averaging about 5 hours a day of trackable activity, which includes exercise, music practice, and pro bono work as well as writing, programming, teaching, and looking for a job. I don’t really know where the rest of the day goes—I don’t believe sleep, chores, Wordle, and an episode or two of Elementary fill nineteen hours out of every twenty-four—but I’m trying not to worry about it.

Ongoing projects include:

One thing I haven’t done much of is read. I used to devour a book or two a week, but these days I find it difficult to get into most fiction, and even harder to read non-fiction. I don’t know if this is because I’m distracted by personal and world events, or whether it’s a stage of life, but I miss losing myself in someone else’s prose for a few hours at a time.

Four Traditions Revisited

Tedre and Sutinen’s paper “Three Traditions of Computing: What Educators Should Know” has shaped my thinking ever since I first read it. This summary, reproduced from the paper’s comparison table, captures their analysis:

Mathematical tradition
  Assumptions: Programs (algorithms) are abstract objects; they are correct or incorrect, as well as more or less efficient – knowledge is a priori.
  Aims: Coherent theoretical structures and systems.
  Strengths: Rigorous, results are certain, utilized in other traditions.
  Weaknesses: Limited to axiomatic systems.
  Methods: Analytic, deductive (and inductive).

Engineering tradition
  Assumptions: Programs (processes) affect the world; they are more or less effective and reliable – knowledge is a posteriori.
  Aims: Constructing useful, efficient, and reliable systems; solving problems.
  Strengths: Able to work under great uncertainty, flexible, progress is tangible.
  Weaknesses: Rarely follows rigid, preordained procedures; poor generalizability.
  Methods: Empirical, constructive.

Scientific tradition
  Assumptions: Programs can model information processes; models are more or less accurate – knowledge is a posteriori.
  Aims: Investigating and explaining phenomena, solving problems.
  Strengths: Combines deduction and induction, cumulative.
  Weaknesses: Incommensurability of results, uncertainty about what counts as proper science.
  Methods: Empirical, inductive, and deductive.

As I wrote three years ago, I’m struck now by what’s not there. I think there should be a fourth tradition, a “Humanist tradition”, that focuses on values, on how computing is used, and on how cognitive and social psychology support, shape, and limit what we can build and how we build it.

I also now think that their distinction between the engineering and scientific traditions isn’t particularly useful. In practice, they are nearly identical attempts to turn software development into an engineering discipline on par with chemical or electrical engineering. UML, requirements engineering, the use of statistical models to predict bug rates: all are signs of “engineering envy”, and by and large, practitioners have voted with their feet and not adopted them.

Instead, the overwhelming majority of the programmers I’ve worked with fall into what I used to call a “craft” tradition, but which I now think has a lot more in common with industrial design. Using Tedre and Sutinen’s categories:

I think this analysis explains why practitioners and software engineering researchers mostly talk past one another. Most researchers subscribe to what James C. Scott’s book Seeing Like a State labelled “high modernism”: they believe comprehensibility and control will come from uniformity and formalism. Practitioners, on the other hand, are defending the local traditions in which they are personally invested. In my idle moments, I wonder where we’d be if that long-ago NATO conference had adopted industrial design as a metaphor instead of engineering.

Updating Snailz

I have updated the synthetic data generator I built last year to generate datasets I can use in my SQL tutorial. I might also use it as a running example if I ever teach a course on software design in Python to researchers.

If Not Lessons, Then What?

I used to think that when I retired, I would spend my time writing short tutorials on topics I was interested in as a way to learn more about them myself. I’ve now been unemployed for three months, and while I’ve written some odds and ends, it’s not nearly as fulfilling as I expected, because I know that most people aren’t going to read a three-thousand-word exposition of discrete event simulation: they’re going to ask an LLM and get something pseudo-personalized in return.

To be clear, I don’t think this is inherently a bad thing: ChatGPT and Claude have helped me build https://github.com/gvwilson/asimpy and fix bugs in https://github.com/gvwilson/sim, and I believe I’ve learned more, and more quickly, from interacting with them than I would on my own. But they do make me feel a bit like a typesetter who suddenly finds the world is full of laser printers and WYSIWYG authoring tools.

I believe I can write a better explanation than an LLM, but (a) I can only write one, not a dozen or a hundred with slight variations to address specific learners’ questions or desires, and (b) it takes me days to do somewhat better than what an LLM can do in minutes. I believe I go off the rails less often than an LLM (though some of my former learners may disagree), but is what I produce better enough to outweigh the speed and personalization that LLMs offer? If not, what do I do instead?

First-of in asimpy

Adding a “first of” operation to asimpy required a pretty substantial redesign. The project’s home page describes what I wound up with; I think it works, but it is now so complicated that I’d be surprised if subtle bugs weren’t lurking in its corners. If you (or one of your grad students) want to try using formal verification tools on ~500 lines of Python, please give me a shout.

Trying to Understand asimpy

As a follow-on to yesterday’s post, I’m trying to figure out why the code in the tracing-sleeper branch of https://github.com/gvwilson/asimpy actually works. The files that matter for the moment are examples/sleep.py and the three files in the package that it relies on: src/asimpy/actions.py, src/asimpy/environment.py, and src/asimpy/queue.py.

I’ve added lots of print statements to all four files. To run the code:

$ git clone git@github.com:gvwilson/asimpy
$ cd asimpy
$ git switch tracing-sleeper
$ uv venv
$ source .venv/bin/activate
$ uv sync
$ python examples/sleep.py

Inside src/asimpy/actions.py there’s a class called BaseAction that the framework uses as the base of all awaitable objects. When a process sleeps, tries to get something from a queue, or does anything else that requires synchronization, it creates an instance of a class derived from BaseAction (such as the _Sleep class defined in src/asimpy/environment.py).

Now, if I understand the protocol correctly, when Python encounters ‘await obj’, it does the equivalent of:

iterator = obj.__await__()  # get an iterator
try:
    value = next(iterator)  # run to the first yield
except StopIteration as e:
    value = e.value         # get the result

After stripping out docs, typing, and print statements, BaseAction’s implementation of __await__() is just:

def __await__(self):
    yield self
    return None

Looking at the printed output, both lines are always executed, and I don’t understand why. Inside Environment.run(), the awaitable is advanced by calling:

awaited = proc._coro.send(None)

(where proc is the object derived from Process and proc._coro is the coroutine created by invoking the process’s async run() method). My mental model is that awaited should be set to self, because that’s what the first line of __await__() yields; I don’t understand why execution ever proceeds past that yield, but my print statements show that it does.

And I know execution must proceed because (for example) BaseQueue.get() in src/asimpy/queue.py successfully returns an object from the queue. This happens in the second line of that file’s _Get.__await__(), and the more I think about this the more confused I get.
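
The smallest model I can build of what seems to be happening is below. Driving a coroutine by hand with send(None) shows the same two-step behavior: the first send(None) runs __await__() up to its yield and hands that value back, and a later send(None) resumes execution after the yield. (The Action class here is a stand-in I wrote for this post, not asimpy’s actual BaseAction.)

# A stand-in for BaseAction: yield once, then return None.
class Action:
    def __await__(self):
        print("before yield")
        yield self
        print("after yield")
        return None

# A coroutine that awaits a single action, like a process awaiting _Sleep.
async def process():
    await Action()

coro = process()
first = coro.send(None)          # runs __await__ up to its yield
print("send returned:", first)
try:
    coro.send(None)              # resumes execution after the yield
except StopIteration:
    print("coroutine finished")

If that is what’s going on, both lines of __await__() execute because the environment sends None into the coroutine a second time, the next time the process is popped off the queue.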

I created this code by imitating what’s in SimPy, reasoning through what I could, and asking ChatGPT how to fix a couple of errors late at night. It did all make sense at one point, but as I try to write the tutorial to explain it to others, I realize I’m on shaky ground. ChatGPT’s explanations aren’t helping; if I find something or someone that does, I’ll update this blog post.

Introducing asimpy

I put the tutorial on discrete event simulation on hold a couple of days ago and spent a few hours building a small discrete event simulation framework of my own using async/await instead of yield. As I hoped, I learned a few things along the way.

First, Python’s await is just a layer on top of its iterator machinery (for an admittedly large value of “just”). When Python encounters await obj it does something like this:

iterator = obj.__await__()  # get an iterator
try:
    value = next(iterator)  # run to the first yield
except StopIteration as e:
    value = e.value         # get the result

Breaking this down:

  1. Call the object’s __await__() method to get an iterator.
  2. Advance that iterator to its first yield to get a value.
  3. If the iterator finishes instead of yielding, the result is whatever it returned. (That value arrives as the .value field of the StopIteration exception.)

We can simulate these steps as follows:

# Define a class whose instances can be awaited.
class Thing:
    def __await__(self):
        print("before yield")
        yield "pause"
        print("after yield")
        return "result"

# Step 1: get the __await__ iterator directly.
awaitable = Thing()
iterator = awaitable.__await__()

# Step 2: run to the first yield.
value = next(iterator)
print("iterator yielded:", value)

# Step 3: resume until completion.
try:
    next(iterator)
except StopIteration as e:
    value = e.value
    print("final result:", value)

The output is:

before yield
iterator yielded: pause
after yield
final result: result
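
The same machinery is what coroutine.send(None) drives: wrapping Thing in an async function and advancing it by hand produces the same sequence of events, and, as we’ll see below, sending None is exactly how the environment advances its processes.

async def use_thing():
    result = await Thing()
    print("await returned:", result)

coro = use_thing()
print("first send yielded:", coro.send(None))  # runs to the yield in __await__
try:
    coro.send(None)                            # resumes after the yield
except StopIteration:
    pass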

asimpy builds on this by requiring processes to be derived from a Process class and to have an async method called run:

from abc import abstractmethod

class Process:
    def __init__(self, env):
        self.env = env

    @abstractmethod
    async def run(self):
        pass


class Sleeper(Process):
    async def run(self):
        for _ in range(3):
            await self.env.sleep(2)

We can then create an instance of Sleeper and pass it to the environment, which calls its run() method to get a coroutine and schedules that coroutine for execution:

env = Environment()
s = Sleeper(env)
env.immediate(s)
env.run()

Environment.run then pulls processes from its run queue until it hits a time limit or runs out of things to execute:

import heapq

class Environment:
    def run(self, until=None):
        while self._queue:
            pending = heapq.heappop(self._queue)
            if until is not None and pending.time > until:
                break

            self.now = pending.time
            proc = pending.proc

            try:
                awaited = proc._coro.send(None)  # resume the process's coroutine
                awaited.act(proc)                # let the awaited action do its work

            except StopIteration:
                continue                         # the process has finished

The two lines in the try block do three things:

  1. Send None to the process’s coroutine to resume its execution.
  2. Get something back the next time it calls await.
  3. Run that something’s act() method.

For example, here’s how sleep works:

class Environment:
    # …as above…
    def sleep(self, delay):
        return _Sleep(self, delay)

class _Sleep:
    def __init__(self, env, delay):
        self._env = env
        self._delay = delay

    def act(self, proc):
        self._env.schedule(self._env.now + self._delay, proc)

    def __await__(self):
        yield self
        return None

  1. Environment.sleep() returns an instance of _Sleep, so await self.env.sleep(t) inside a process gives the environment an object that says, “Wait for t ticks.”
  2. When the environment calls that object’s act() method, it reschedules the process that created the _Sleep object to run again in t ticks.
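
To convince myself that these pieces fit together, here is a complete, runnable miniature of the loop described above. Most of it repeats the code shown earlier in this post; the _Pending record, the tie-breaking counter, and the way immediate() creates the coroutine are my reconstructions for this sketch, not necessarily asimpy’s actual internals.

import heapq
import itertools
from dataclasses import dataclass, field

@dataclass(order=True)
class _Pending:
    time: int
    count: int                     # tie-breaker for events at the same time
    proc: object = field(compare=False)

class _Sleep:
    def __init__(self, env, delay):
        self._env = env
        self._delay = delay

    def act(self, proc):
        self._env.schedule(self._env.now + self._delay, proc)

    def __await__(self):
        yield self
        return None

class Environment:
    def __init__(self):
        self.now = 0
        self._queue = []
        self._count = itertools.count()

    def schedule(self, time, proc):
        heapq.heappush(self._queue, _Pending(time, next(self._count), proc))

    def immediate(self, proc):
        proc._coro = proc.run()    # create the coroutine to be resumed later
        self.schedule(self.now, proc)

    def sleep(self, delay):
        return _Sleep(self, delay)

    def run(self, until=None):
        while self._queue:
            pending = heapq.heappop(self._queue)
            if until is not None and pending.time > until:
                break
            self.now = pending.time
            proc = pending.proc
            try:
                awaited = proc._coro.send(None)
                awaited.act(proc)
            except StopIteration:
                continue

class Process:
    def __init__(self, env):
        self.env = env

class Sleeper(Process):
    async def run(self):
        for _ in range(3):
            print(f"t={self.env.now}: sleeping")
            await self.env.sleep(2)
        print(f"t={self.env.now}: done")

env = Environment()
env.immediate(Sleeper(env))
env.run()

Running it prints “sleeping” at t=0, t=2, and t=4, and “done” at t=6.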

It is a bit convoluted—the environment asks the process for an object that in turn manipulates the environment—but so far it seems to be able to handle shared resources, job queues, and gates. Interrupts were harder to implement (interrupts are always hard), but they are in the code as well.

Was this worth building? It only has a fraction of SimPy’s features, and while I haven’t benchmarked it yet, I’m certain that asimpy is much slower. On the other hand, I learned a few things along the way, and it gave me an excuse to try out ty and taskipy for the first time. It was also a chance to practice using an LLM as a coding assistant: I wouldn’t call what I did “vibe coding”, but ChatGPT’s explanation of how async/await works was helpful, as was its diagnosis of a couple of bugs. I’ve published asimpy on PyPI, and if a full-time job doesn’t materialize some time soon, I might come back and polish it a bit more.

What I cannot create, I do not understand.

— Richard Feynman

Next Steps for Simulation

I’ve filed a few issues in the GitHub repository for my discrete event simulation of a small software development team in order to map out future work. The goal is to move away from what Cat Hicks calls the “brains in jars” conception of programmers by including things like this:

In order to implement these, though, I’ll need a way to simulate activities that multiple people are working on at the same time. This would be easy to do if all but the last person to join the activity sat idle waiting for it to start, but that’s not realistic. I could instead model it as a multi-person interrupt, but interrupts are hard. If you have enough experience with SimPy to offer examples, I’d be grateful. And if you’re a professor looking for capstone projects for students, please give me a shout: I think that a discrete event simulation framework based on async/await instead of yield would be just about the right size.
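
For concreteness, here is the easy-but-unrealistic version in SimPy: early arrivals signal that they are ready and then sit parked on a shared start event until the last person shows up. (The names and delays are made up for illustration.)

import simpy

def worker(env, name, delay, arrived, start):
    yield env.timeout(delay)   # finish some other task first
    print(f"t={env.now}: {name} is ready")
    arrived.succeed()          # signal readiness
    yield start                # sit idle until everyone is ready
    print(f"t={env.now}: {name} begins the shared activity")

def coordinator(env, arrivals, start):
    yield simpy.AllOf(env, arrivals)  # wait for every worker to arrive
    start.succeed()                   # release them all at once

env = simpy.Environment()
start = env.event()
arrivals = [env.event() for _ in range(3)]
for i, delay in enumerate([1, 4, 2]):
    env.process(worker(env, f"worker-{i}", delay, arrivals[i], start))
env.process(coordinator(env, arrivals, start))
env.run()

With these made-up delays, everyone is ready by t=4, at which point all three begin together; the problem is that the two early arrivals spend that time doing nothing.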