Exercise 3: Simply Typed Lambda Calculus

Hand in: November 2nd (2 weeks). Update: The grammar has been updated to allow for syntactic sugar for numeric literals.

In this exercise, we reuse the combinator parsing library introduced in exercise 1.

The provided framework is self-contained and can be downloaded as zip.

Assignment

The goal of this exercise is to familiarize yourself with the simply typed λ-calculus; your work consists of implementing a type checker and a reducer for simply typed λ-terms. To make it more interesting, we extend lambda calculus with booleans and integers values, let and pairs. This time we'll fix a call by value strategy, so there will be only one reducer you need to write (actually, you can reuse some of the reducer from the previous assignment).

We first introduce the syntax for typed λ-calculus without let and pairs, which we'll add later. It might be a good idea to start implementing this language first, and later add the rest.

t ::= "true" true value
| "false" false value
| "if" t1 "then" t2 "else" t3 if
| numericLit integer
| "pred" t predecessor
| "succ" t successor
| "iszero" t iszero
| x variable
| "\" x ":" T "." t abstraction
| t t application (left associative)
| numericLiteral numeric literal
| "(" t ")"
v ::= values
| "true"
| "false"
| nv numeric value
| "\" x ":" T "." t abstraction value
nv ::= numeric values
| "0"
| "succ" nv

Note: As in the first assignment, we add syntactic sugar for numeric literals. They are desugarized to the corresponding sequence of succ succ.. 0, as described before.

The only new thing in the above rules is the type annotation for lambda abstractions. We see that the variable name is followed by colon and a type. It roughly says "this function expects an argument of type T". In the above rules, T stands for types, and here's the syntax for types:

T ::= "Bool" boolean type
| "Nat" numeric type
| T "->" T function types (right associative)
| "(" T ")"

There are three kinds of types in this language: booleans, natural numbers and function types. The type constructor "->" is right-associative — that is, the expression T1 -> T2 -> T3 stands for T1 -> (T2 -> T3) [TAPL, p.100].

Evaluation rules for this language are straight forward. You may note that they already fix the evaluation strategy to call by value, and that the type of an abstraction is ignored during evaluation. We can say that evaluation of simply typed lambda terms proceeds exactly the same as for untyped lambda terms. The operation of stripping off type annotations is called erasure and it is what enabled the addition of Java generics without modifying the virtual machine.

Computation Congruence
if true then t1 else t2 → t1
if false then t1 else t2 → t2
isZero zero → true
isZero succ NV → false
pred zero → zero
pred succ NV → NV
(λx: T.t1) v2 → [x → v2] t1
t1 → t1'
if t1 then t2 else t3 → if t1' then t2 else t3
t → t'
isZero t → isZero t'
t → t'
pred t → pred t'
t → t'
succ t → succ t'
t1 → t1'
t1 t2 → t1' t2
t2 → t2'
v1 t2 → v1 t2'

Typing rules:

Γ|— true: Bool
Γ|— false: Bool
Γ|— 0: Nat
Γ|— t: Nat
Γ|— pred t: Nat
Γ|— t: Nat
Γ|— succ t: Nat
Γ|— t: Nat
Γ|— iszero t: Bool
Γ|— t1: Bool Γ |— t2: T Γ|— t3: T
Γ|— if t1 then t2 else t3: T
x: T ∈ Γ
Γ|— x: T
Γ, x: T1|— t: T2
Γ|— λx: T1.t2: T1 -> T2
Γ|— t1: T11->T12       Γ |— t2: T11
Γ|— t1 t2: T12

The above typing rules define a typing relation, similar to the evaluation relation. However, while evaluation is a relation between terms, typing is a relation between terms, types and contexts. A typing rule like the first one can be read "under context gamma, term true has type Bool". The role of the context (or environment) is to keep around a mapping between variable names and their types. It will be used to type free variables, when they are encountered. This is illustrated by the variable rule which can be read "under context gamma, variable x has type T provided context gamma has a binding for variable x to type T".

The purpose of this type system is to prevent "bad things" to happen. So far, the only bad thing we know is stuck terms, and this type system prevents stuck terms. In other words, a term that can be assigned a type (it type checks) is guaranteed not to get stuck. The result of its evaluation will be a value.

Adding let and pairs

Let us now proceed to the addition of let:

t ::= ...
    | "let" x ":" T "=" t "in" t

We can define let in terms of the existing concepts, and this has the advantage that once the translation is done in the parser, no addition is necessary to the type checker or to the evaluator. Such an addition is called a derived form. The language that is accepted by our parser is called external language and the language understood by the evaluator (and type checker) is called internal language. And here is the translation of let in terms of abstraction and application:

"let" x ":" T "=" t1 "in" t2 --> (\x: T. t2) t1

To add pairs, we can't do the same trick so we'll need to extends the existing syntax, evaluation and typing rules:

t ::= ...
    | "{" t "," t "}"
    | "fst" t
    | "snd" t

v ::= ..
    | "{" v "," v "}"

T ::= ...
    | T "*" T (right associative)

The first form creates a new pair, and the other two are called projections, and extract the first and the second element of a pair. We add a new kind of values, pair values, and a new kind of type, for the corresponding pair type. We decide that the pair type constructor (denoted by *) takes precedence over the arrow constructor, so Nat * Nat -> Bool is parsed as (Nat * Nat) -> Bool.

The new evaluation rules are:

fst {v1, v2} → v1
snd {v1, v2} → v2
t → t'
fst t → fst t'
t → t'
snd t → snd t'
t1 → t1'
{t1, t2} → {t1', t2}
t2 → t2'
{v1, t2} → {v1, t2'}

And last, the additional typing rules:

Γ|— t1: T1     Γ|— t2: T2
Γ|— {t1, t2} : T1 * T2
Γ|— t: T1*T2
Γ|— fst t: T1
Γ|— t: T1*T2
Γ|— snd t: T2

Implementation

Here is a summary of what you need to implement for this assignment:

Input/Output

Your program should read a string from standard input until end-of-file is encountered, which represents the input program. If the program is syntactically correct, it should print its type and then print each step of the small-step reduction, starting with the input term, until it reaches a normal value (remember that simply typed lambda calculus is strongly normalizing). If there is a type error, it should print the error message and the position where it occured and skip the reduction. The provided framework already implements this behavior. You should use it as-is.

Hints