Virtual Machine

Why Build a VM?

Types

i
pub type Op {
  Halt
  Load(Int, Int)
  Add(Int, Int, Int)
  Jump(Int)
  JumpIfZero(Int, Int)
}

pub type Machine {
  Machine(ip: Int, regs: List(Int), ram: List(Int))
}

Assembler

i
pub fn assemble(ops: List(Op)) -> List(Int) {
  list.map(ops, encode)
}

fn encode(op: Op) -> Int {
  case op {
    Halt -> 0
    Load(reg, val) -> 1 * field_size * field_size + reg * field_size + val
    Add(dst, a, b) -> 2 * field_size * field_size + dst * field_size + a * 16 + b
    Jump(addr) -> 3 * field_size * field_size + addr
    JumpIfZero(reg, addr) -> 4 * field_size * field_size + reg * field_size + addr
  }
}

Executor

i
pub fn execute(machine: Machine) -> Machine {
  case list_at(machine.ram, machine.ip) {
    Error(_) -> machine
    Ok(encoded) -> {
      let op = decode(encoded)
      case op {
        Halt -> machine
        Load(reg, val) -> {
          let new_regs = set_reg(machine.regs, reg, val)
          execute(Machine(machine.ip + 1, new_regs, machine.ram))
        }
        Add(dst, a, b) -> {
          let va = get_reg(machine.regs, a)
          let vb = get_reg(machine.regs, b)
          let new_regs = set_reg(machine.regs, dst, va + vb)
          execute(Machine(machine.ip + 1, new_regs, machine.ram))
        }
        Jump(addr) ->
          execute(Machine(addr, machine.regs, machine.ram))
        JumpIfZero(reg, addr) -> {
          case get_reg(machine.regs, reg) {
            0 -> execute(Machine(addr, machine.regs, machine.ram))
            _ -> execute(Machine(machine.ip + 1, machine.regs, machine.ram))
          }
        }
      }
    }
  }
}

Running the Example

i
  let program = [
    Load(0, 10),
    Load(1, 20),
    Add(2, 0, 1),
    Halt,
  ]
  let machine = Machine(ip: 0, regs: list.repeat(0, num_regs), ram: assemble(program))
  let result = execute(machine)
  io.println(string.inspect(result.regs))

The disassembler converts the encoded program back to readable text:

load r0 10
load r1 20
add r2 r0 r1
halt

Testing

i
pub fn add_program_test() {
  let program = [Load(0, 5), Load(1, 3), Add(2, 0, 1), Halt]
  let machine = Machine(ip: 0, regs: list.repeat(0, 4), ram: assemble(program))
  let result = execute(machine)
  at(result.regs, 2)
  |> should.equal(Ok(8))
}

pub fn halt_at_start_test() {
  let program = [Halt]
  let machine = Machine(ip: 0, regs: list.repeat(0, 4), ram: assemble(program))
  let result = execute(machine)
  result.ip
  |> should.equal(0)
}

pub fn jump_if_zero_skips_when_nonzero_test() {
  // Load 1 into r0, then JumpIfZero r0 addr=0 should NOT jump
  let program = [Load(0, 1), JumpIfZero(0, 0), Load(1, 42), Halt]
  let machine = Machine(ip: 0, regs: list.repeat(0, 4), ram: assemble(program))
  let result = execute(machine)
  at(result.regs, 1)
  |> should.equal(Ok(42))
}

Check Understanding

Why encode instructions as integers?

A real CPU stores programs as bytes in memory. Using a List(Int) here mimics that: the program and data live in the same address space, and the instruction pointer is just an index. This makes it straightforward to write self-modifying programs (though doing so is a well-known source of bugs). An alternative is to keep the List(Op) as the program representation, which is simpler but means the program and data are separate. This is less realistic but easier to reason about.

The execute function calls itself on every non-Halt instruction. Why does Gleam not run out of stack space for a long program?

Because every recursive call to execute is a tail call: execute is the last thing each branch does, with no further computation pending. The Gleam compiler replaces tail calls with a jump back to the top of the function, reusing the same stack frame. This is equivalent to a while loop and uses O(1) stack space regardless of how many instructions the program runs.

Exercises

Sum 1 to 5 (20 minutes)

Write a program in List(Op) that computes 1 + 2 + 3 + 4 + 5 = 15 using a loop and JumpIfZero. Use register 0 as the counter (start at 5, decrement each iteration) and register 1 as the accumulator. Halt when the counter reaches zero. Run it with execute and assert regs[1] == 15.

Disassembler (15 minutes)

Write disassemble(words: List(Int)) -> List(String) that converts encoded words to strings like "load r0 10" and "add r2 r0 r1". Add one test per instruction type.

Multiply instruction (15 minutes)

Add Mul(dst, a, b) to Op. Choose an unused opcode (e.g. 5). Update encode, decode, execute, and (if you wrote it) the disassembler. Write a test that loads 6 and 7 into registers, multiplies them, and asserts the result is 42.

Step limit (10 minutes)

Add a max_steps: Int field to Machine. Decrement it on each instruction; when it reaches zero, halt. Change the return type to Result(Machine, String). Write a test that confirms a looping program returns an error after the step limit is reached.