Exercise 3: Simply Typed Lambda Calculus
Hand in: November 3 (2 weeks).
In this exercise, we reuse the combinator parsing library introduced
in exercise 1.
The provided framework is self-contained and can be downloaded as
src.zip.
Assignment
The goal of this exercise is to familiarize yourself with the simply typed
λ-calculus; your work consists of implementing a type checker
and a reducer for simply typed λ-terms. To make it more
interesting, we extend lambda calculus with booleans and integers
values, let and pairs. This time we'll fix a call by value
strategy, so there will be only one reducer you need to write
(actually, you can reuse some of the reducer from the previous assignment).
We first introduce the syntax for typed λ-calculus without
let and pairs, which we'll add later. It might be
a good idea to start implementing this language first, and later add
the rest.
t |
::= |
"true" |
true value |
|
|
| |
"false" |
false value |
|
|
| |
"if" t1 "then" t2 "else" t3 |
if |
|
|
| |
numericLit |
integer |
|
|
| |
"pred" t |
predecessor |
|
|
| |
"succ" t |
successor |
|
|
| |
"iszero" t |
iszero |
|
|
| |
x |
variable |
|
|
| |
"\" x ":" T "." t |
abstraction |
|
|
| |
t t |
application |
(left associative) |
|
| |
"(" t ")" |
|
|
v |
::= |
|
values |
|
| |
"true" |
|
| |
"false" |
|
| |
nv |
numeric value |
|
| |
"\" x ":" T "." t |
abstraction value |
nv |
::= |
|
numeric values |
|
| |
"0" |
|
| |
"succ" nv |
Note:
As in the first assignment, we add syntactic sugar for numeric
literals. They are desugarized to the corresponding sequence
of succ succ.. 0
, as described before.
The only new thing in the above rules is the type annotation
for lambda abstractions. We see that the variable name is followed by
colon and a type. It roughly says "this function expects an argument
of type T". In the above rules, T stands for types, and here's the syntax
for types:
T |
::= |
"Bool" |
boolean type |
|
|
| |
"Nat" |
numeric type |
|
|
| |
T "->" T |
function types |
(right associative) |
|
|
| |
"(" T ")" |
|
|
There are three kinds of types in this language: booleans, natural
numbers and function types. The type constructor "->" is
right-associative — that is, the expression T1
-> T2 -> T3
stands for
T1 -> (T2 -> T3)
[TAPL, p.100].
Evaluation rules for this language are straight forward. You may note
that they already fix the evaluation strategy to call by value, and
that the type of an abstraction is ignored during evaluation. We can
say that evaluation of simply typed lambda terms proceeds exactly the
same as for untyped lambda terms. The operation of stripping off type
annotations is called erasure and it is what enabled the
addition of Java generics without modifying the virtual machine.
Computation |
Congruence |
if true then t1 else t2 → t1 |
if false then t1 else t2 → t2 |
isZero zero → true |
isZero succ NV → false |
pred zero → zero |
pred succ NV → NV |
(λx: T.t1) v2 → [x
→ v2] t1 |
|
t1 → t1' |
if t1 then t2 else t3 → if t1' then t2 else t3 |
|
t → t' |
isZero t → isZero t' |
|
|
|
|
|
|
Typing rules:
|
|
|
Γ|— t: Nat |
Γ|— pred t: Nat |
|
Γ|— t: Nat |
Γ|— succ t: Nat |
|
Γ|— t: Nat |
Γ|— iszero t: Bool |
|
|
Γ|— t1: Bool Γ
|— t2: T Γ|— t3: T |
Γ|— if t1 then t2 else t3:
T |
|
|
Γ, x: T1|— t: T2 |
Γ|— λx: T1.t: T1
-> T2 |
|
Γ|— t1:
T11->T12 Γ
|— t2: T11 |
Γ|— t1 t2: T12
|
|
|
The above typing rules define a typing relation, similar to the
evaluation relation. However, while evaluation is a relation between
terms, typing is a relation between terms, types and
contexts. A typing rule like the first one can be read "under
context gamma, term true has type Bool". The role of the
context (or environment) is to keep around a mapping between variable
names and their types. It will be used to type free variables, when
they are encountered. This is illustrated by the variable rule which
can be read "under context gamma, variable x has type T
provided context gamma has a binding for variable x to type T".
The purpose of this type system is to prevent "bad things" to
happen. So far, the only bad thing we know is stuck terms, and this
type system prevents stuck terms. In other words, a term that can be
assigned a type (it type checks) is guaranteed not to get
stuck. The result of its evaluation will be a value.
Adding let and pairs
Let us now proceed to the addition of let:
t ::= ...
| "let" x ":" T "=" t "in" t
We can define let in terms of the existing concepts, and
this has the advantage that once the translation is done in the
parser, no addition is necessary to the type checker or to the
evaluator. Such an addition is called a derived form. The
language that is accepted by our parser is called external
language and the language understood by the evaluator (and type
checker) is called internal language. And here is the
translation of let in terms of abstraction and application:
"let" x ":" T "=" t1 "in" t2 --> (\x: T. t2) t1
To add pairs, we can't do the same trick so we'll need to extends
the existing syntax, evaluation and typing rules:
t ::= ...
| "{" t "," t "}"
| "fst" t
| "snd" t
v ::= ..
| "{" v "," v "}"
T ::= ...
| T "*" T (right associative)
The first form creates a new pair, and the other two are called
projections, and extract the first and the second element of a
pair. We add a new kind of values, pair values, and a new kind of
type, for the corresponding pair type. We decide that the pair type
constructor (denoted by *) takes precedence over the arrow
constructor, so Nat * Nat -> Bool is parsed as (Nat *
Nat) -> Bool.
The new evaluation rules are:
|
t1 → t1' |
{t1, t2}
→ {t1', t2} |
|
t2 → t2' |
{v1, t2}
→ {v1, t2'} |
|
|
And last, the additional typing rules:
Γ|— t1: T1
Γ|— t2: T2
|
Γ|— {t1, t2} :
T1 * T2
|
|
Γ|— t: T1*T2
|
Γ|— fst t: T1
|
|
Γ|— t: T1*T2
|
Γ|— snd t: T2
|
|
Implementation
Here is a summary of what you need to implement for this assignment:
-
A parser for the given grammar (including let and pairs).
-
A type checker which given a term finds its type (or it prints
an error message).
-
A call by value reducer (small step) with the corresponding
path method which gives back the stream of intermediate
terms.
Input/Output
Your program should read a string from standard input until
end-of-file is encountered, which represents the input
program. If the program is syntactically correct, it should print
its type and then print each step of the small-step reduction, starting with the
input term, until it reaches a normal value (remember that simply
typed lambda calculus is strongly normalizing). If there is a type
error, it should print the error message and the position
where it occured and skip the reduction.
The provided framework already
implements this behavior. You should use it as-is.
Hints
- As in the previous exercise, the project is supplied with a
build.xml
file for ant, and a starting point for your project.
-
Don't forget to override the method
toString()
in the
subclasses of class Term
in order to get a clean output!
-
You should maintain positions in your abstract syntax
trees. This is done by using
positioned
around
parsers (the skeleton project already has them in place). This
method takes care of updating the position on your trees, during
parsing. Type errors should mention tree positions (have a look
at class TypeError
to see how it's done).