Chapter 3: An Interpreter


  • Compilers and interpreters are just programs.
  • Basic arithmetic operations are just functions that have special notation.
  • Programs can be represented as trees, which can be stored as nested lists.
  • Interpreters recursively dispatch operations to functions that implement low-level details.
  • Programs store variables in stacked dictionaries called environments.
  • One way to evaluate a program's design is to ask how extensible it is.

Terms defined: argument, call stack, control flow, dictionary comprehension, dispatch, dynamic scoping, eager evaluation, environment, extensibility, introspection, JavaScript Object Notation, lazy evaluation, lexical scoping, parameter, parser, recursion, runtime, signature, stack frame

Chapter 2 introduced the idea that programs are just another kind of data. Similarly, the compilers and interpreters that make programs run are just programs themselves. Instead of changing the characters in a block of memory like text editors, or calculating sums and averages like spreadsheets, compilers turn text into instructions for interpreters or hardware to run.

Most real programming languages have two parts: a parser that translates the source code into a data structure in memory, and a runtime that executes the instructions in that data structure. This chapter focuses on the runtime; Chapter 17 will explore parsing, while Chapter 19 will look at more efficient ways to execute instructions.

Section 3.1: Expressions

Let’s start by building something that can evaluate simple expressions like 1+2 or abs(-3.5). We represent each expression as a list with the name of the operation as the first item and the values to be operated on as the other items. To add 1 and 2 we use:

["add", 1, 2]

and to calculate the absolute value of -3.5 we use:

["abs", -3.5]

Nothing Special

We have special symbols for addition, subtraction, and so on for historical reasons, but our list representation doesn’t distinguish between things like + and abs because it doesn’t need to. If our program is being compiled into low-level instructions for a particular CPU, it’s the compiler’s job to decide what can be done directly and what needs multiple instructions. For example, early CPUs didn’t have instructions to do division, while modern CPUs may have instructions to do addition or multiplication on multiple values at once.

We can represent more complicated expressions using nested lists. For example, abs(1+2) is:

["abs", ["add", 1, 2]]

The function to add two expressions looks like this:

def do_add(args):
    assert len(args) == 2
    left = do(args[0])
    right = do(args[1])
    return left + right

Its single parameter is a list containing the two sub-expressions to be evaluated and added. After checking that it has the right number of arguments, it calls the function do recursively to evaluate those sub-expressions. (We’ve called the function do instead of eval because Python already has a function called eval.) Once do_add has two actual values, it adds them and returns the result (Figure 3.1).

Recursive evaluation of an expression tree
Figure 3.1: Recursively evaluating the expression abs(1+2).

Arguments versus Parameters

Many programmers use the words argument and parameter interchangeably, but to make our meaning clear, we call the values passed into a function its arguments and the names the function uses to refer to them its parameters. To put it another way, parameters are part of a function’s definition, while arguments are supplied when the function is called.
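In Python terms (halve is a made-up example, not part of the interpreter):

```python
def halve(num):    # "num" is a parameter: part of the definition
    return num / 2

print(halve(10))   # 10 is an argument: supplied when the function is called
```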

do_abs, which calculates absolute value, works the same way. The only differences are that it expects one value instead of two and calculates a different return value:

def do_abs(args):
    assert len(args) == 1
    val = do(args[0])
    return abs(val)

So how does do work? It starts by checking if its input is an integer. If so, it returns that value right away because integers “evaluate” to themselves: no more calculation is needed. Otherwise, do checks that its parameter is a list and then uses the first value in the list to decide what other function to call. This process is often called dispatch.

def do(expr):
    # Integers evaluate to themselves.
    if isinstance(expr, int):
        return expr

    # Lists trigger function calls.
    assert isinstance(expr, list)
    if expr[0] == "abs":
        return do_abs(expr[1:])
    if expr[0] == "add":
        return do_add(expr[1:])
    assert False, f"Unknown operation {expr[0]}"

Finally, the main body of the program reads the file containing the instructions to execute, calls do, and prints the result:

def main():
    assert len(sys.argv) == 2, "Usage: filename"
    with open(sys.argv[1], "r") as reader:
        program = json.load(reader)
    result = do(program)
    print(f"=> {result}")

if __name__ == "__main__":

Our program is a list of lists (of lists…) so we can read it as JSON using json.load. If our program file contains:

["add", ["abs", -3], 2]

then our little interpreter prints:

=> 5
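To run this end to end without a separate program file, the pieces shown so far can be gathered into one self-contained sketch (json.loads stands in for reading a program file from disk):

```python
import json

def do_add(args):
    assert len(args) == 2
    return do(args[0]) + do(args[1])

def do_abs(args):
    assert len(args) == 1
    return abs(do(args[0]))

def do(expr):
    # Integers evaluate to themselves.
    if isinstance(expr, int):
        return expr
    # Lists trigger function calls.
    assert isinstance(expr, list)
    if expr[0] == "abs":
        return do_abs(expr[1:])
    if expr[0] == "add":
        return do_add(expr[1:])
    assert False, f"Unknown operation {expr[0]}"

program = json.loads('["add", ["abs", -3], 2]')
print(f"=> {do(program)}")  # => 5
```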

This is a lot of code to do something that Python already does, but it shows what Python (and other languages) do themselves. When we run our little interpreter with:

$ python expr.py expr.tll

Python reads our interpreter’s source code, turns it into a data structure containing operation identifiers and constants, then uses those operation identifiers to decide what functions to call. Those functions are written in C and have been compiled to machine instructions, but the cycle of lookup, call, and recurse is exactly the same.

Section 3.2: Variables

Adding up constants is a start, but our programs will be easier to read with variables that let us give names to values. We can add them to our interpreter by passing around a dictionary containing all the variables seen so far. Such a dictionary is sometimes called an environment because it is the setting in which expressions are evaluated; the dictionaries returned by the globals and locals functions introduced in Chapter 2 are both environments.

Let’s modify do_add, do_abs, do, and main to take an environment as an extra parameter and pass it on as needed:

def do_abs(env, args):
    assert len(args) == 1
    val = do(env, args[0])
    return abs(val)

Looking up variables when we need their values is straightforward. We check that we have a variable name and that the name is in the environment, then return the stored value:

def do_get(env, args):
    assert len(args) == 1
    assert isinstance(args[0], str)
    assert args[0] in env, f"Unknown variable {args[0]}"
    return env[args[0]]

To define a new variable or change an existing one, we evaluate an expression and store its value in the environment:

def do_set(env, args):
    assert len(args) == 2
    assert isinstance(args[0], str)
    value = do(env, args[1])
    env[args[0]] = value
    return value

We need to add one more function to make this all work. Our programs no longer consist of a single expression; instead, we may have several expressions that set variables’ values and then use them in calculations. To handle this, we add a function do_seq that runs a sequence of expressions one by one. This function is our first piece of control flow: rather than calculating a value itself, it controls when and how other expressions are evaluated. Its implementation is:

def do_seq(env, args):
    assert len(args) > 0
    for item in args:
        result = do(env, item)
    return result

Let’s try it out. Our test program is:

    ["set", "alpha", 1],
    ["set", "beta", 2],
    ["add", ["get", "alpha"], ["get", "beta"]]
=> 3
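The environment-passing versions can be checked with a condensed, runnable sketch; dispatch is written as an inline dictionary here purely for brevity (the next section builds a proper lookup table):

```python
def do_add(env, args):
    assert len(args) == 2
    return do(env, args[0]) + do(env, args[1])

def do_get(env, args):
    assert len(args) == 1 and isinstance(args[0], str)
    assert args[0] in env, f"Unknown variable {args[0]}"
    return env[args[0]]

def do_set(env, args):
    assert len(args) == 2 and isinstance(args[0], str)
    value = do(env, args[1])
    env[args[0]] = value
    return value

def do_seq(env, args):
    # Control flow: run each expression in turn, return the last result.
    assert len(args) > 0
    for item in args:
        result = do(env, item)
    return result

def do(env, expr):
    if isinstance(expr, int):
        return expr
    assert isinstance(expr, list)
    ops = {"add": do_add, "get": do_get, "seq": do_seq, "set": do_set}
    assert expr[0] in ops, f"Unknown operation {expr[0]}"
    return ops[expr[0]](env, expr[1:])

program = ["seq",
    ["set", "alpha", 1],
    ["set", "beta", 2],
    ["add", ["get", "alpha"], ["get", "beta"]]]
print(f"=> {do({}, program)}")  # => 3
```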

Section 3.3: Introspection Again

Before we add more operations, let’s have a look at the current state of do:

def do(env, expr):
    # Integers evaluate to themselves.
    if isinstance(expr, int):
        return expr

    # Lists trigger function calls.
    assert isinstance(expr, list)
    if expr[0] == "abs":
        return do_abs(env, expr[1:])
    if expr[0] == "add":
        return do_add(env, expr[1:])
    if expr[0] == "get":
        return do_get(env, expr[1:])
    if expr[0] == "seq":
        return do_seq(env, expr[1:])
    if expr[0] == "set":
        return do_set(env, expr[1:])
    assert False, f"Unknown operation {expr[0]}"

The sequence of if statements that decide what function to call is going to become unreadably long. Let’s use introspection to create a lookup table instead by finding and storing every function whose name starts with do_:

OPS = {
    name.replace("do_", ""): func
    for (name, func) in globals().items()
    if name.startswith("do_")
}

Line by line:

  1. We use a dictionary comprehension to create a dictionary in a single statement.

  2. We only add functions whose names start with do_.

  3. Each key-value pair in the dictionary is the name of an operation and the function that implements the operation. The operation’s name is what comes after do_ in the function’s name.

With this lookup table in hand, the code to select and run an operation is:

def do(env, expr):
    # Integers evaluate to themselves.
    if isinstance(expr, int):
        return expr

    # Lists trigger function calls.
    assert isinstance(expr, list)
    assert expr[0] in OPS, f"Unknown operation {expr[0]}"
    func = OPS[expr[0]]
    return func(env, expr[1:])

As with unit test functions in Chapter 2, the do_* functions must have exactly the same signature so that we can call any of them with an environment and a list of arguments without knowing exactly which function we’re calling. And as with finding tests, introspection is more reliable than a hand-written lookup table, but it isn’t necessarily easier to understand. If we write out the lookup table explicitly like this:

OPS = {
    "abs": do_abs,
    "add": do_add,
    "get": do_get,
    "seq": do_seq,
    "set": do_set,
}

then it’s easy to see exactly what operations are available and what their names are. If we use introspection, we have to search through the source file (or possibly several files) to find all the available operations.
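As a minimal, self-contained demonstration of the introspection approach (only two operations are defined here, and the do_ functions must appear before the table is built):

```python
def do_abs(env, args):
    return abs(do(env, args[0]))

def do_add(env, args):
    return do(env, args[0]) + do(env, args[1])

def do(env, expr):
    if isinstance(expr, int):
        return expr
    assert isinstance(expr, list)
    assert expr[0] in OPS, f"Unknown operation {expr[0]}"
    return OPS[expr[0]](env, expr[1:])

# Build the lookup table by introspection: every function whose name
# starts with "do_" becomes an operation.  ("do" itself doesn't match.)
OPS = {
    name.replace("do_", ""): func
    for (name, func) in globals().items()
    if name.startswith("do_")
}

print(sorted(OPS))                       # ['abs', 'add']
print(do({}, ["abs", ["add", 1, -4]]))  # 3
```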

Section 3.4: Statements

Now that we have recursive evaluation, function lookup, and environments, it’s easy to add more features to our little language. Our goal is to execute this program, which starts with the number 1 and doubles it four times:

    ["comment", "Double a number repeatedly"],
    ["set", "a", 1],
    ["print", "initial", ["get", "a"]],
            ["set", "a", ["add", ["get", "a"], ["get", "a"]]],
        ["leq", ["get", "a"], 10],
        ["print", "small", ["get", "a"]],
        ["print", "large", ["get", "a"]]

The simplest of the new operations is comment, which does nothing and returns None:

def do_comment(env, args):
    """Ignore instructions.
    ["comment" "text"] => None
    """
    return None

An if statement is a bit more complex. If its first argument is true it evaluates and returns its second argument (the “if” branch). Otherwise, it evaluates and returns its third argument (the “else” branch):

def do_if(env, args):
    """Make a choice: only one sub-expression is evaluated.
    ["if" C A B] => A if C else B
    """
    assert len(args) == 3
    cond = do(env, args[0])
    choice = args[1] if cond else args[2]
    return do(env, choice)

This is called lazy evaluation to distinguish it from the more usual eager evaluation that evaluates everything up front. do_if only evaluates what it absolutely needs to; most languages do this so that we can safely write things like:

if x != 0:
    return 1/x
return None

If the language always evaluated both branches then the code shown above would fail whenever x was zero, even though it’s supposed to handle that case. In this case it might seem obvious what the language should do, but most languages use lazy evaluation for and and or as well so that expressions like:

reference and reference.part

will produce None if reference is None and reference.part if it isn’t.
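To see the laziness directly, we can give one operation a visible side effect; the do_print operation below is a made-up stand-in used only for this demonstration:

```python
CALLS = []

def do_print(env, args):
    # Record that this branch actually ran.
    value = do(env, args[0])
    CALLS.append(value)
    return value

def do_if(env, args):
    # Lazy: only the chosen branch is ever evaluated.
    assert len(args) == 3
    cond = do(env, args[0])
    choice = args[1] if cond else args[2]
    return do(env, choice)

def do(env, expr):
    if isinstance(expr, int):
        return expr
    assert isinstance(expr, list)
    ops = {"if": do_if, "print": do_print}
    return ops[expr[0]](env, expr[1:])

do({}, ["if", 1, ["print", 10], ["print", 20]])
print(CALLS)  # [10] — the "else" branch was never evaluated
```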

Section 3.5: Functions

One way to evaluate the design of a piece of software is to ask how extensible it is, i.e., how easily we can add or change things [Wilson2022a]. The answer for our interpreter is now, “Pretty easily,” but for our little language is, “Not at all,” because there’s no way for users to create new operations of their own. We need to give users a way to define and call functions.

Doing this takes less than 60 lines:

  1. A function definition looks like:

    ["def", "same", ["num"], ["get", "num"]]

    It has a name, a (possibly empty) list of parameter names, and a single instruction as a body (which will usually be a "seq" instruction).

  2. Functions are stored in the environment like any other value. The value stored for the function defined above would be:

    ["func", ["num"], ["get", "num"]]

    We don’t need to store the name: that’s recorded by the environment, just like it is for any other variable.

  3. A function call looks like:

    ["call", "same", 3]

    The values passed to the function are normally expressions rather than constants, and are not put in a sub-list. The implementation:

    1. Evaluates all of these expressions.
    2. Looks up the function.
    3. Creates a new environment whose keys are the parameters’ names and whose values are the expressions’ values.
    4. Calls do to run the function’s action and captures the result.
    5. Discards the environment created two steps previously.
    6. Returns the function’s result.

  4. Instead of using a single dictionary to store an environment we use a list of dictionaries. The first dictionary is the global environment; the others store the variables belonging to active function calls.

  5. When we get or set a variable, we check the most recent environment first (i.e., the one that’s last in the list); if the variable isn’t there we look in the global environment. We don’t look at the environments in between; the exercises explore why not.
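The code later in this section calls helpers named env_get and env_set that are not shown; a sketch consistent with the rules above (check the innermost frame, then the global one) might look like this, where the write policy in env_set is one plausible assumption rather than the book's definitive version:

```python
def env_get(env, name):
    # The innermost frame first...
    if name in env[-1]:
        return env[-1][name]
    # ...then the global frame, skipping the frames in between.
    assert name in env[0], f"Unknown variable {name}"
    return env[0][name]

def env_set(env, name, value):
    # Assumed policy: update an existing local or global variable,
    # otherwise create the variable in the innermost frame.
    if name in env[-1] or name not in env[0]:
        env[-1][name] = value
    else:
        env[0][name] = value

env = [{"g": 1}, {"hidden": 2}, {"x": 3}]
print(env_get(env, "x"))  # 3 (innermost frame)
print(env_get(env, "g"))  # 1 (global frame)
```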

Scoping rules

The set of active environments makes up the program’s call stack. For historical reasons, each environment is sometimes called a stack frame. Searching through all active stack frames for a variable is called dynamic scoping. In contrast, most programming languages use lexical scoping, which figures out what a variable name refers to based on the structure of the program text.

Here’s the implementation of do_def:

def do_def(env, args):
    """Define a new function.
    ["def" name [...params...] body] => None # and define function
    """
    assert len(args) == 3
    name = args[0]
    params = args[1]
    body = args[2]
    env_set(env, name, ["func", params, body])
    return None

And here’s the implementation of do_call:

def do_call(env, args):
    """Call a function.
    ["call" name ...expr...] => env[name](*expr)
    """
    # Set up the call.
    assert len(args) >= 1
    name = args[0]
    values = [do(env, a) for a in args[1:]]

    # Find the function.
    func = env_get(env, name)
    assert isinstance(func, list) and (func[0] == "func")
    params, body = func[1], func[2]
    assert len(values) == len(params)

    # Run in new environment and discard it afterward.
    env.append(dict(zip(params, values)))
    result = do(env, body)
    env.pop()

    # Report.
    return result

Our test program and its output are:

        ["add", ["get", "num"], ["get", "num"]]
    ["set", "a", 1],
            ["set", "a", ["call", "double", ["get", "a"]]],
            ["print", ["get", "a"]]
=> None
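Putting def and call together with stacked environments, the whole mechanism can be exercised with a condensed, runnable sketch (the env_get/env_set helpers here are simplified assumptions, and dispatch is inlined for brevity):

```python
def env_get(env, name):
    # Innermost frame first, then the global frame.
    if name in env[-1]:
        return env[-1][name]
    assert name in env[0], f"Unknown variable {name}"
    return env[0][name]

def env_set(env, name, value):
    # Simplified assumption: always write into the innermost frame.
    env[-1][name] = value

def do_def(env, args):
    name, params, body = args
    env_set(env, name, ["func", params, body])
    return None

def do_call(env, args):
    name = args[0]
    values = [do(env, a) for a in args[1:]]
    func = env_get(env, name)
    assert isinstance(func, list) and func[0] == "func"
    params, body = func[1], func[2]
    assert len(values) == len(params)
    env.append(dict(zip(params, values)))  # push a stack frame
    result = do(env, body)
    env.pop()                              # discard it afterward
    return result

def do_add(env, args):
    return do(env, args[0]) + do(env, args[1])

def do_get(env, args):
    return env_get(env, args[0])

def do(env, expr):
    if isinstance(expr, int):
        return expr
    assert isinstance(expr, list)
    ops = {"add": do_add, "call": do_call, "def": do_def, "get": do_get}
    return ops[expr[0]](env, expr[1:])

env = [{}]  # list of dictionaries: the global frame comes first
do(env, ["def", "double", ["num"], ["add", ["get", "num"], ["get", "num"]]])
print(do(env, ["call", "double", 3]))  # 6
```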

Once again, Python and other languages work exactly as shown here. The interpreter (or the CPU, if we’re running code compiled to machine instructions) reads an instruction, figures out what operation it corresponds to, and executes that operation.

Section 3.6: Summary

Concept map of interpreter
Figure 3.2: Interpreter concept map.

Section 3.7: Exercises


Arrays

Implement fixed-size one-dimensional arrays: ["array", 10] creates an array of 10 elements, while other instructions that you design get and set particular array elements by index.

While loops

  1. Add a while loop using a Python while loop.

  2. Add a while loop using recursion.

Loop counters

The "repeat" instruction runs some other instruction(s) several times, but there is no way to access the loop counter inside those instructions. Modify "repeat" so that programs can do this. (Hint: allow people to create a new variable to hold the loop counter’s current value.)

Chained maps

Look at the documentation for the ChainMap class and modify the interpreter to use that to manage environments.

Better error handling

Several of the instruction functions start with assert statements, which means that users get a stack trace of TLL itself when there’s a bug in their program.

  1. Define a new exception class called TLLException.

  2. Write a utility function called check that raises a TLLException with a useful error message when there’s a problem.

  3. Add a catch statement to handle these errors.


Tracing

Add a --trace command-line flag to the interpreter. When enabled, it makes TLL print a message showing each function call and its result.

Early return

Add a "return" instruction to TLL that ends a function call immediately and returns a single value.

Variable argument lists

Add variable-length parameter lists to functions.


Scoping

  1. Our interpreter looks for variables in the current function call’s environment and in the global environment, but not in the environments in between. Why not? How could looking in those intermediate environments make programs harder to debug?

  2. The interpreter allows users to define functions inside functions. What variables can the inner function access when you do this? What variables should it be able to access? What would you have to do to enable this?