
If Not Lessons, Then What?

I used to think that when I retired, I would spend my time writing short tutorials on topics I was interested in as a way to learn more about them myself. I’ve now been unemployed for three months, and while I’ve written some odds and ends, it’s not nearly as fulfilling as I expected, because I know that most people aren’t going to read a three-thousand-word exposition of discrete event simulation: they’re going to ask an LLM and get something pseudo-personalized in return.

To be clear, I don’t think this is inherently a bad thing: ChatGPT and Claude have helped me build https://github.com/gvwilson/asimpy and fix bugs in https://github.com/gvwilson/sim, and I believe I’ve learned more, and more quickly, from interacting with them than I would on my own. But they do make me feel a bit like a typesetter who suddenly finds the world is full of laser printers and WYSIWYG authoring tools.

I believe I can write a better explanation than an LLM, but (a) I can only write one, not a dozen or a hundred with slight variations to address specific learners’ questions or desires, and (b) it takes me days to do somewhat better what an LLM can do in minutes. I believe I go off the rails less often than an LLM (though some of my former learners may disagree), but is what I produce better enough to outweigh the speed and personalization that LLMs offer? If not, what do I do instead?

First-of in asimpy

Adding a “first of” operation to asimpy required a pretty substantial redesign. The project’s home page describes what I wound up with; I think it works, but it is now so complicated that I’d be surprised if subtle bugs weren’t lurking in its corners. If you (or one of your grad students) want to try using formal verification tools on ~500 lines of Python, please give me a shout.

Trying to Understand asimpy

As a follow-on to yesterday’s post, I’m trying to figure out why the code in the tracing-sleeper branch of https://github.com/gvwilson/asimpy actually works. The files that matter for the moment are examples/sleep.py and the three files in the package that it relies on.

I’ve added lots of print statements to all four. To run the code:

$ git clone git@github.com:gvwilson/asimpy
$ cd asimpy
$ uv venv
$ source .venv/bin/activate
$ uv sync
$ python examples/sleep.py

Inside src/asimpy/actions.py there’s a class called BaseAction that the framework uses as the base of all awaitable objects. When a process wants to sleep, get something from a queue, or do anything else that requires synchronization, it creates an instance of a class derived from BaseAction (such as the _Sleep class defined in src/asimpy/environment.py).

Now, if I understand the protocol correctly, when Python encounters await obj, it does the equivalent of:

iterator = obj.__await__()  # get an iterator
try:
    value = next(iterator)  # run to the first yield
except StopIteration as e:
    value = e.value         # get the result

After stripping out docs, typing, and print statements, BaseAction’s implementation of __await__() is just:

def __await__(self):
    yield self
    return None

Looking at the printed output, both lines are always executed, and I don’t understand why. Inside Environment.run(), the process’s coroutine is advanced by calling:

awaited = proc._coro.send(None)

(where proc is an object derived from Process and proc._coro is the coroutine created by invoking the process’s async run() method). My mental model is that awaited should be set to self, because that’s what the first line of __await__() yields; I don’t understand why execution ever proceeds past that point, but my print statements show that it does.

And I know execution must proceed because (for example) BaseQueue.get() in src/asimpy/queue.py successfully returns an object from the queue. This happens in the second line of that file’s _Get.__await__(), and the more I think about this the more confused I get.
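For what it’s worth, here’s the smallest reproduction I’ve managed outside asimpy (all the names are mine). Driving the coroutine with two calls to send(None), the way Environment.run() does across two scheduler turns, does execute both lines:

class Pause:
    def __await__(self):
        print("before yield")
        yield self
        print("after yield")
        return None

async def process():
    await Pause()
    print("await completed")

coro = process()
first = coro.send(None)         # first resume: runs to the yield
print("send returned:", first)  # the Pause instance itself
try:
    coro.send(None)             # second resume: continues past the yield
except StopIteration:
    print("coroutine finished")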

I created this code by imitating what’s in SimPy, reasoning through what I could, and asking ChatGPT how to fix a couple of errors late at night. It did all make sense at one point, but as I try to write the tutorial to explain it to others, I realize I’m on shaky ground. ChatGPT’s explanations aren’t helping; if I find something or someone that does, I’ll update this blog post.

Introducing asimpy

I put the tutorial on discrete event simulation on hold a couple of days ago and spent a few hours building a small discrete event simulation framework of my own using async/await instead of yield. As I hoped, I learned a few things along the way.

First, Python’s await is just a layer on top of its iterator machinery (for an admittedly large value of “just”). When Python encounters await obj it does something like this:

iterator = obj.__await__()  # get an iterator
try:
    value = next(iterator)  # run to the first yield
except StopIteration as e:
    value = e.value         # get the result

Breaking this down:

  1. Python calls the object’s __await__() method to get an iterator.
  2. It then advances that iterator to its first yield to get a value.
  3. When the iterator finishes instead of yielding, the result of the await is whatever the iterator returned. (That value arrives as the .value field of the StopIteration exception.)

We can simulate these steps as follows:

# Define a class whose instances can be awaited.
class Thing:
    def __await__(self):
        print("before yield")
        yield "pause"
        print("after yield")
        return "result"

# Get the __await__ iterator directly.
awaitable = Thing()
iterator = awaitable.__await__()

# Run to the first yield.
value = next(iterator)
print("iterator yielded:", value)

# Resume to completion; the return value arrives via StopIteration.
try:
    next(iterator)
except StopIteration as e:
    value = e.value
    print("final result:", value)

The output is:

before yield
iterator yielded: pause
after yield
final result: result

asimpy builds on this by requiring processes to be derived from a Process class and to have an async method called run:

from abc import abstractmethod

class Process:
    def __init__(self, env):
        self.env = env

    @abstractmethod
    async def run(self):
        pass


class Sleeper(Process):
    async def run(self):
        for _ in range(3):
            await self.env.sleep(2)

We can then create an instance of Sleeper and pass it to the environment, which calls its run() method to get a coroutine and schedules that coroutine for execution:

env = Environment()
s = Sleeper(env)
env.immediate(s)
env.run()

Environment.run then pulls processes from its run queue until it hits a time limit or runs out of things to execute:

class Environment:
    def run(self, until=None):
        while self._queue:
            pending = heapq.heappop(self._queue)
            if until is not None and pending.time > until:
                break

            self.now = pending.time
            proc = pending.proc

            try:
                awaited = proc._coro.send(None)
                awaited.act(proc)

            except StopIteration:
                continue

The two key lines are in the try block:

  1. Sending None to the process’s coroutine resumes its execution and returns whatever object the coroutine awaits next.
  2. Calling that object’s act() method lets it do whatever scheduling work it represents.

For example, here’s how sleep works:

class Environment:
    # …as above…
    def sleep(self, delay):
        return _Sleep(self, delay)

class _Sleep:
    def __init__(self, env, delay):
        self._env = env
        self._delay = delay

    def act(self, proc):
        self._env.schedule(self._env.now + self._delay, proc)

    def __await__(self):
        yield self
        return None

  1. Environment.sleep() returns an instance of _Sleep, so await self.env.sleep(t) inside a process gives the environment an object that says, “Wait for t ticks.”
  2. When the environment calls that object’s act() method, it reschedules the process that created the _Sleep object to run again in t ticks.

It is a bit convoluted—the environment asks the process for an object that in turn manipulates the environment—but so far it seems to be able to handle shared resources, job queues, and gates. Interrupts were harder to implement (interrupts are always hard), but they are in the code as well.
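Putting the pieces together, here’s a self-contained toy version of the whole loop. It’s a sketch rather than asimpy itself: the real code does more, and the tie-breaking counter in the schedule queue is my own addition so the heap never has to compare processes.

import heapq
import itertools
from abc import abstractmethod

class _Sleep:
    def __init__(self, env, delay):
        self._env = env
        self._delay = delay

    def act(self, proc):
        # Reschedule the awaiting process to resume after the delay.
        self._env.schedule(self._env.now + self._delay, proc)

    def __await__(self):
        yield self
        return None

class Process:
    def __init__(self, env):
        self.env = env

    @abstractmethod
    async def run(self):
        pass

class Environment:
    def __init__(self):
        self.now = 0
        self._queue = []
        self._ids = itertools.count()  # tie-breaker for equal times

    def sleep(self, delay):
        return _Sleep(self, delay)

    def immediate(self, proc):
        proc._coro = proc.run()  # calling run() creates the coroutine
        self.schedule(self.now, proc)

    def schedule(self, time, proc):
        heapq.heappush(self._queue, (time, next(self._ids), proc))

    def run(self, until=None):
        while self._queue:
            time, _, proc = heapq.heappop(self._queue)
            if until is not None and time > until:
                break
            self.now = time
            try:
                awaited = proc._coro.send(None)  # resume to the next await
                awaited.act(proc)                # let the action reschedule
            except StopIteration:
                continue                         # the process has finished

class Sleeper(Process):
    async def run(self):
        for _ in range(3):
            print(f"t={self.env.now}: sleeping")
            await self.env.sleep(2)
        print(f"t={self.env.now}: done")

env = Environment()
env.immediate(Sleeper(env))
env.run()  # sleeps at t=0, 2, and 4, then prints "done" at t=6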

Was this worth building? It only has a fraction of SimPy’s features, and while I haven’t benchmarked it yet, I’m certain that asimpy is much slower. On the other hand, I learned a few things along the way, and it gave me an excuse to try out ty and taskipy for the first time. It was also a chance to practice using an LLM as a coding assistant: I wouldn’t call what I did “vibe coding”, but ChatGPT’s explanation of how async/await works was helpful, as was its diagnosis of a couple of bugs. I’ve published asimpy on PyPI, and if a full-time job doesn’t materialize some time soon, I might come back and polish it a bit more.

What I cannot create, I do not understand.

— Richard Feynman

Next Steps for Simulation

I’ve filed a few issues in the GitHub repository for my discrete event simulation of a small software development team in order to map out future work. The goal is to move away from what Cat Hicks calls the “brains in jars” conception of programmers by including the kinds of human factors described in those issues.

In order to implement these, though, I’ll need a way to simulate activities that multiple people are working on at the same time. This would be easy to do if all but the last person to join the activity sat idle waiting for it to start, but that’s not realistic. I could instead model it as a multi-person interrupt, but interrupts are hard. If you have enough experience with SimPy to offer examples, I’d be grateful. And if you’re a professor looking for capstone projects for students, please give me a shout: I think that a discrete event simulation framework based on async/await instead of yield would be just about the right size.

Discrete Events

My work log tells me I’ve spent 54 hours since mid-November building this discrete event simulation, which works out to a little over an hour a day. I’ve learned a few things about SimPy and Polars along the way, and depending on what happens with my job search, I may run an online workshop in 2026 to walk people through it. For now, though, I need to put this aside and concentrate on completing a couple of small contracts and revising some of the fiction I finally “finished”.

The Year That Was

Another year, another “where did the time go?” post…

Time for the day’s last cup of tea. If you came in peace, be welcome.

The Real Hardest Problem

There are only two hard things in computer science: cache invalidation and naming things.

— Phil Karlton

With respect, I think that handling interrupts is harder than either of these. Yesterday’s post explained how SimPy does this. Today, after several near misses, we’ll look at how to add it to our simulation.

A Quick Recap

Our Simulation class now includes a process that waits a random interval, chooses a random developer, and interrupts her by calling .interrupt:

class Simulation:

    def annoy(self):
        while True:
            yield self.timeout(self.t_interrupt_arrival())
            dev = random.choice(self.developers)
            dev.proc.interrupt(self.t_interrupt_duration())
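
Before looking at the Developer side, here’s the smallest standalone demonstration of the mechanics I could write (the names and numbers are mine, not the simulation’s):

import simpy

def worker(env):
    try:
        yield env.timeout(10)
        print(f"t={env.now}: finished undisturbed")
    except simpy.Interrupt as exc:
        print(f"t={env.now}: interrupted, cause={exc.cause}")

def annoyer(env, victim):
    yield env.timeout(3)
    victim.interrupt(5)  # the argument becomes exc.cause

env = simpy.Environment()
victim = env.process(worker(env))
env.process(annoyer(env, victim))
env.run()  # prints "t=3: interrupted, cause=5"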

When .interrupt is called, SimPy injects a simpy.Interrupt exception into the target process; the argument to .interrupt is attached to that exception object. If we’re comfortable throwing away the task we’re currently working on, the Developer process can catch the exception like this:

class Developer(Labeled):
    def work(self):
        while True:
            req = None
            try:
                req = self.sim.dev_queue.get()
                task = yield req
                yield self.sim.timeout(task.dev_required)
            except simpy.Interrupt:
                if req is not None:
                    req.cancel()

The trickiest part of this is canceling the outstanding request to get something from the development queue. As we discovered yesterday, if we don’t do this, the developer won’t ever get anything else from the queue.
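
To see the mechanism in isolation, here’s a standalone toy (again with names of my own invention) in which an abandoned get() silently swallows the next item:

import simpy

def fickle(env, store):
    stale = store.get()   # make a request, then abandon it...
    fresh = store.get()   # ...and make another one
    item = yield fresh
    print("fresh request got:", item)
    # stale.cancel() beforehand would have let "first" through

def producer(env, store):
    yield env.timeout(1)
    yield store.put("first")   # swallowed by the abandoned request
    yield store.put("second")  # only now does `fresh` fire

env = simpy.Environment()
store = simpy.Store(env)
env.process(fickle(env, store))
env.process(producer(env, store))
env.run()  # prints "fresh request got: second"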

Nearly Right

What if we don’t want to throw away our current task when we’re interrupted? What if instead we want to handle the interruption and then resume that task? Here’s an implementation that’s almost right:

    # This code is wrong.
    def work(self):
        task = None
        while True:
            req = None
            try:
                if task is None:
                    req = self.sim.dev_queue.get()
                    task = yield req
                t_start = self.sim.now
                yield self.sim.timeout(task.dev_required - task.dev_done)
                task.dev_done += self.sim.now - t_start
                if task.is_done():
                    task = None
            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()
                yield self.sim.timeout(exc.cause)

Inside the try block we get a task if we don’t already have one, then try to wait for as much time as the task still requires. When the wait completes, we credit the elapsed time to the task and check whether it is now done. If we’re interrupted, we cancel any outstanding request for a new task and then wait for as long as the interrupt tells us to.

There are (at least) two things wrong with this code. The first and less important is that when we’re interrupted, we throw away any work we’ve done on the task during this iteration of the loop. That’s relatively easy to fix: we just add some code to the except block to increment the .dev_done time in the task.

The bigger problem, though, is that sooner or later this code fails because of an uncaught Interrupt exception. The problem is that we can be interrupted inside our interrupt handler. It isn’t likely, which means it doesn’t happen very often, but if we run the simulation often enough with different random seeds, it eventually falls over.
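
Here’s a stripped-down reproduction of that failure mode, with names and numbers of my own invention:

import simpy

def worker(env):
    try:
        yield env.timeout(100)
    except simpy.Interrupt:
        # Handling the first interrupt. This yield can itself be
        # interrupted, and there is no enclosing try to catch that.
        yield env.timeout(10)

def annoyer(env, victim):
    yield env.timeout(1)
    victim.interrupt()  # caught by the except block
    yield env.timeout(1)
    victim.interrupt()  # arrives inside the handler: uncaught

env = simpy.Environment()
victim = env.process(worker(env))
env.process(annoyer(env, victim))
env.run()  # raises simpy.Interrupt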

A Corrected Version

The “fix” (I’ll explain the scare quotes around that word in a moment) is to move interrupt handling into the try block. To do that, we have to add another state variable interrupt_delay that tells the process if it’s currently supposed to be handling an interruption delay:

    def work(self):
        task = None
        interrupt_delay = None
        while True:
            req = None
            try:
                if interrupt_delay is not None:
                    yield self.sim.timeout(interrupt_delay)
                    interrupt_delay = None
                else:
                    if task is None:
                        req = self.sim.dev_queue.get()
                        task = yield req
                    t_start = self.sim.now
                    yield self.sim.timeout(task.dev_required - task.dev_done)
                    task.dev_done += self.sim.now - t_start
                    if task.is_done():
                        task = None
            except simpy.Interrupt as exc:
                if req is not None:
                    req.cancel()
                interrupt_delay = exc.cause

So why did I put scare quotes around the word “fix”? Because I’m still not 100% sure this works. It hasn’t failed yet, despite multiple runs with different seeds and parameters, but this code is now complex enough that I could well believe it contains a one-in-a-million edge case. I think the except block is now a critical region, i.e., that no interrupts can occur within it because none of those three lines hands control back to SimPy, but I’m not completely sure.

And yes, this code still throws away any work the developer has done on a task during a particular loop iteration if an interrupt occurs; the interrupt handler should increment task.dev_done. And yes, it’s possible for an interrupt to be interrupted: a more realistic implementation would stack interrupt delays, but honestly, if my boss interrupts me while I’m being interrupted, I don’t have any qualms about discarding the first interruption.

Yaks and More Yaks

My goal with this series of blog posts was to simulate the development process of a small software team. I’ve spent most of the last week learning more about SimPy; it feels like yak shaving, but without it, I don’t think I’d have confidence in the code shown above (or be able to review its AI-generated equivalent).