Prelude

Prelude: A Reverse Polish Notation Calculator #

In order to get a footing, we’ll start with an example: a calculator that uses postfix operator notation, sometimes called “reverse polish notation.”

In this calculator, all arguments are placed onto a stack. Whenever an operator occurs, it pops values off the stack at that point, performs the operation, and pushes the answer back on.

All operations are binary, so they take two operands from the stack and replace it with one result. To simplify matters, if the stack is has only one operand on it when an operator is called, the other operand is zero.

The calculator takes one line of calculation at a time. If there are leftover elements, it will print the top of the stack, which was the result of the last operation.

Here are some examples, with > marking the user input prompt:

> 1
1
> 1 2
2
> 2 3 +
5
> 2 3 * 4 5 * +
26
> 2 3 4 + *
14
> 2 3 4 5 * * +
62

Our Reference Implementation

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56


# this is a stack of numbers being stored for operation
number_stack = []


def push_num(n):
    number_stack.append(int(n))


def pop_num():
    # the stack has a 0 "pinned" to the bottom of it
    if len(number_stack) == 0:
        return 0
    else:
        return number_stack.pop()


while 1:
    number_stack[:] = []
    print("> ", end="")
    line = input()

    if line.startswith('q'):
        break

    # step one: break the line into pieces
    # keeping track of position
    parts = line.split()
    pos = 1  # not zero due to an offset in the prompt

    for part in parts:
        pos += len(part)

        # decide what the part of the string is. there are several options
        # 1. is it a number? append in on the stack
        if part.isdigit():
            push_num(part)
        # 2. is it an addition? pop two items and add them, then append
        elif part == "+":
            push_num(pop_num() + pop_num())
        # 3. is it a multiplication? same idea
        elif part == "x" or part == "*":
            push_num(pop_num() * pop_num())
        # 4. if it is a blank, then that means we can skip it
        # otherwise, finally if it's anything else, ther is a syntax error
        elif part != '':
            print(" " * pos + "^")
            print("syntax error: unknown token")
            break

        pos += 1  # account for the space

    # in case of a syntax error, it was printed, so continue to the next line
    if pos == len(line) + 1:
        continue
    else:
        print(number_stack[-1])

DOWNLOAD - Note, you’ll need to put spaces between each token and press enter to run a line.

At a high level, it turns out the calculator is doing many of the phases of a compiler! Sort of.

It is doing lexical analysis: splitting up the program into numbers and the operators to execute on them.

It is doing syntax analysis: identifying what operations to do, and keeping track of the order to do them in, which determines their operands.

It is using those syntax understandings to take action. It executes them immediately, which makes it an interpreter rather than a compiler, and isn’t really “translation”, but the idea is still there.

It is so simple mostly because of the simplicty of the “language.” It has only one type (integer), no variables, and not even any control flow. In fact, it is not even Turing Complete!Aside from the lack of control flow, this is because our abstract machine requires data to be accessed in stack order (LIFO).

Even if memory and the number of operations were infinite (as is the case in a Turing machine), this data access pattern makes certain problems uncomputable. In order to make it Turing complete, at least one operation that allows random access (i.e. read from the middle without deleting the data above it) is necessary. This was proven several decades ago.

It would suffice, for example, to add a second parallel stack, and let operations choose between them. It would also suffice to force data accesses into queue order (FIFO) instead.

While there is no way to make this into a language like C or Python without a total re-write, perhaps we can fix these shortcomings and come up with something more powerful in the course of these sections. And perhaps we can learn something along the way.

To make things easier to follow, and facilitate changing one piece at a time, I’m going to create a python package called calc_lang. If you’re not familiar with python packages, here’s an overview. From now on, all “final code” examples will be in the package.