class: slide-title
Software Design by Example
A Debugger
chapter
--- ## The Problem - Tracing execution with `print` statements is tedious - And impossible (or nearly so) in some situations - Single-stepping/breakpoint debugger is far more effective - Build one to understand how they work - And to show how to test interactive applications --- ## Preparation - We will want non-interactive input and output for testing - So
refactor
the virtual machine of
Chapter 25
- Pass an output stream (by default `sys.stdout`) ```py def __init__(self, writer=sys.stdout): """Set up memory.""" self.writer = writer self.initialize([]) ``` - Replace every `print` with a call to `self.write` ```py def write(self, *args): msg = "".join(args) + "\n" self.writer.write(msg) ``` --- ## Getting Input - Similarly, don't use `input` function directly ```py def __init__(self, reader=input, writer=sys.stdout): super().__init__(writer) self.reader = reader ``` ```py def read(self, prompt): return self.reader(prompt).strip() ``` --- ## Enumerating State - Old VM was either running or finished - New one has a third state: single-stepping - So define an
enumeration
```py class VMState(Enum): """Virtual machine states.""" FINISHED = 0 STEPPING = 1 RUNNING = 2 ``` - Safer than using strings (which can be mis-spelled) --- ## Running - New `run` method starts in `STEPPING` state - If it started in `RUNNING` we could never tell it to do otherwise ```py def run(self): self.state = VMState.STEPPING while True: if self.state == VMState.STEPPING: self.interact(self.ip) if self.state == VMState.FINISHED: break instruction = self.ram[self.ip] self.ip += 1 op, arg0, arg1 = self.decode(instruction) self.execute(op, arg0, arg1) ``` --- ## Interaction Cases 1. Empty line: go around again 2.
Disassemble
current instruction or show memory: do that and go around again 3. Quit: change state to `FINISHED`. 4. Run normally: change state to `RUNNING` 5. Single-step: exit loop without changing state --- ## Disassembling ```py OPS_LOOKUP = {value["code"]: key for key, value in OPS.items()} ``` - If we type in the number-to-instruction lookup table, it will eventually fall out of step - So build it from architecture description
--- ## Capturing Output - Has to be an object with a `write` method - But can save what it's given for later inspection ```py class Writer: def __init__(self): self.seen = [] def write(self, *args): self.seen.extend(args) ``` --- ## Providing Input - Need a "function" that takes a prompt and returns a string - Create a class with a `__call__` method that "reads" from a list ```py class Reader: def __init__(self, *args): self.commands = args self.index = 0 def __call__(self, prompt): assert self.index < len(self.commands) self.index += 1 return self.commands[self.index - 1] ``` --- ## Testing Disassembly ```py def test_disassemble(): source = """ hlt """ reader = Reader("d", "q") writer = Writer() execute(source, reader, writer) assert writer.seen == ["hlt | 0 | 0\n"] ``` 1. Create program (just a `hlt` instruction) 2. Create a `Reader` with the commands `"d"` and `"q"` 3. Create a `Writer` to capture output 4. Run the program 5. Check that the output is correct --- ## Is It Worth It? - Yes - Test that the debugger can single-step three times and then quit ```py def test_print_two_values(): source = """ ldc R0 55 prr R0 ldc R0 65 prr R0 hlt """ reader = Reader("s", "s", "s", "q") writer = Writer() execute(source, reader, writer) assert writer.seen == [ "000037\n" ] ``` --- class: aside ## Other Tools - [Expect][expect] often used to script command-line applications - Can be used through the [pexpect][pexpect] module - [Selenium][selenium] and [Cypress][cypress] for browser-based applications - Simulate mouse clicks, window resizing, etc. - Harder to set up and use than a simple `assert` - But so are the things they're testing --- ## Extensibility - Move every interactive operation to a method - Return Boolean to signal whether debugger should stay in interactive mode ```py def _do_memory(self, addr): self.show() return True ``` ```py def _do_step(self, addr): self.state = VMState.STEPPING return False ``` --- ## Extensibility - Modify `interact` to choose operations from a lookup table ```py def interact(self, addr): prompt = "".join(sorted({key[0] for key in self.handlers})) interacting = True while interacting: try: command = self.read(f"{addr:06x} [{prompt}]> ") if not command: continue elif command not in self.handlers: self.write(f"Unknown command {command}") else: interacting = self.handlers[command](self.ip) except EOFError: self.state = VMState.FINISHED interacting = False ``` --- ## Extensibility - Build the table in the constructor ```py def __init__(self, reader=input, writer=sys.stdout): super().__init__(reader, writer) self.handlers = { "d": self._do_disassemble, "dis": self._do_disassemble, "i": self._do_ip, "ip": self._do_ip, "m": self._do_memory, "memory": self._do_memory, "q": self._do_quit, "quit": self._do_quit, "r": self._do_run, "run": self._do_run, "s": self._do_step, "step": self._do_step, } ``` --- ## Stop Here - A
breakpoint
tells the computer to stop at a particular instruction - A
conditional breakpoint
stops if a condition is true --- ## Breakpoint Sets - Design #1: store breakpoint addresses in a set for `run` to check
--- ## What Hardware Does - Replace actual instruction with new `brk` instruction - Look up the real instruction when we hit a `brk`
--- ## Add Commands - Rely on parent class to initialize most of the table - Then add more entries ```py def __init__(self): super().__init__() self.breaks = {} self.handlers |= { "b": self._do_add_breakpoint, "break": self._do_add_breakpoint, "c": self._do_clear_breakpoint, "clear": self._do_clear_breakpoint, } ``` --- ## Setting and Clearing ```py def _do_add_breakpoint(self, addr): if self.ram[addr] == OPS["brk"]["code"]: return True self.breaks[addr] = self.ram[addr] self.ram[addr] = OPS["brk"]["code"] return True ``` ```py def _do_clear_breakpoint(self, addr): if self.ram[addr] != OPS["brk"]["code"]: return True self.ram[addr] = self.breaks[addr] del self.breaks[addr] return True ``` --- ## Running ```py def run(self): self.state = VMState.STEPPING while self.state != VMState.FINISHED: instruction = self.ram[self.ip] op, arg0, arg1 = self.decode(instruction) if op == OPS["brk"]["code"]: original = self.breaks[self.ip] op, arg0, arg1 = self.decode(original) self.interact(self.ip) self.ip += 1 self.execute(op, arg0, arg1) else: if self.state == VMState.STEPPING: self.interact(self.ip) self.ip += 1 self.execute(op, arg0, arg1) ``` --- class: summary ## Summary
[cypress]: https://www.cypress.io/ [expect]: https://en.wikipedia.org/wiki/Expect [pexpect]: https://pexpect.readthedocs.io/ [selenium]: https://www.selenium.dev/