## Exercise 1: Arithmetic expressions

• pack your project files (source files and makefile or `build.xml`) using zip (or `tar/gzip/bzip2`, but please don't use `rar` or any other compressor)
• email your project to Iulian or Stephane, specifying the members of your group.

In this exercise, you are asked to familiarize yourself with Scala and the combinator parsing library.

• The first part will not be graded, and consists of extending an existing arithmetic evaluator with conditionals and comparison operators. (package `arithmetic` in the provided framework)
• The second part is your assignment, consisting of implementing (variations of) the NB language from chapter 3 of the TAPL book (package `tapl1` in the provided framework).

The framework you will be using can be downloaded as zip or tar.gz archive.

The API documentation for this exercise is available online.

### Technicalities

The programming assignments of this course will be done in Scala.

Scala should be installed on all computers in the exercise room, but if you plan to use your own, you can download it here. Scala has an Eclipse plugin, downloadable here and comes with support for several editors such as Emacs or Vi.

The Scala configuration on Linux and Windows is described in a separated documentation page.

The provided framework contains configuration files for Ant and Eclipse.

• Ant: `build.xml`
• Eclipse: `.classpath` and `.project` (use "File" → "Import.." → "General" → "Existing Projects into Workspace" to import the extracted framework)

### 1. Arithmetic expressions

In this first part you will modify an existing parser for arithmetic expressions. The goal is to familiarize yourself with the combinator parsing library and Scala, so your imagination can run wild and implement as many features as you want! The given framework can parse and evaluate simple arithmetic expressions, and we'd like you to (at least) extend it with booleans and comparison expressions.

Here is the grammar which is currently accepted by the parser:

```  Expr ::= Term { ('+' Term) | ('-' Term) }

Term ::= Factor { ('*' Factor) | ('/' Factor) }

Factor ::= <number> | '(' Expr ')'
```

#### A quick introduction to the combinator parsing library

The combinator parsing library comes in its own package, not surprisingly named `combinatorParsing`. It defines several classes, among which is `GenericParser` and `TokenizingParser`. The parsers you will write using this library will have to extend one of these classes. `TokenizingParser` extends `GenericParser` with functionality to recognize common tokens, like identifiers or string and integer literals, so this is what you will use most often.

You're are encouraged to have a look at the following API documentation of the combinator parsing library.

Writing a parser for the above grammar is straight forward using the combinator parsing library: nonterminals become methods, terminals are either keywords or numbers, and to link them all together one calls the methods provided by the library:

• & for sequence
• | for alternative
• rep for repetition
• opt for optional parts

For example, here is how the first production should look:

```  /** Expr ::= Term { ('+' Term) | ('-' Term) }   */
def Expr: Parser = production(Term & rep(kw("+") & Term | kw("-") & Term)
```

You probably noticed that the whole production is wrapped into a call to method `production`. This instructs the library to call a user-defined method when this grammar production has matched, passing the parsed tokens (see below).

Terminals deserve a few more words: the parsing library has to stay general, so it cannot decide beforehand what are the terminals of your grammar. It does, however, provide three stock options: identifiers, and string and numeric literals, and two configurable ones: delimiters and keywords. Delimiters are characters which can break a text into tokens even without a separating space. A common example is parenthesis. Keywords are strings which are not identifiers. Both of them can be specified by listing them in a call to 'incl' before any parsing gets done, as for example:

```  delimiters.incl('(', ')', '+', '-', '*', '/')
keywords.incl("for", "while", "do")
```

(these two sets are inherited from the `GenericParser` class)

The `BuildingParser` mixin defines an abstract type called `Output`, which is the type of values your parser will produce. In this example, it is set to `Int` since evaluation is done during parsing, and therefore it produces ints. In a more elaborate setting, this would probably be an abstract syntax tree.

The parsing library calls method `build` each time a grammar production is matched. This is the hook normally used to build up an Abstract Syntax Tree, but in this simple example it is the place where evaluation takes place. This method gets a list of tokens, and a position. One thing to note is that delimiters will be passed to this method as instances of the keyword (`KW`) token.

Once the parser recognizes the extensions of your grammar, you will need to update this method with the new cases.

The parsing library is well commented and the code is not very large, so we encourage you to read the code and try to understand how it works (be warned though, it makes heavy use of higher order functions and call by name parameters, so you should feel comfortable with Scala before diving in). And of course, experiment with it.

#### Extend our example

Extend the grammar with another expression form, if t1 then t2 else t3, and the three comparison operators <, >, =. Here are some examples that should be accepted by your modified parser:

```10 + (if (2) then 2 else 3)
if (2 < 3) then 2 else 3
10 + (if (2*3 = 3*2) then 2 else 3)
```
and one example which it shouldn't:
```0 > 0 = 0
```

In order to keep things simple, we keep the `Output` type of the parser (Int) and encode booleans to `1` and `0`.

### Assignment 1: The NB language

Hand in: November 8 (in 2 weeks).

The cryptic acronym stands for Numbers and Booleans and comes from the course book. This simple language is defined in Chapter 3 of the the TAPL book.

 ```t ::= "true" terms | "false" | "if" t1 "then" t2 "else" t3 | "0" | "succ" t | "pred" t | "iszero" t v ::= "true" values | "false" | nv nv ::= 0 numeric values | "succ" nv ```

This language has three syntactic forms: terms, which is the most general form, and two kinds of values: numeric values and boolean values. The evaluation rules are as follows:

Computation Congruence
 if true then t1 else t2 → t1 if false then t1 else t2 → t2 isZero zero → true isZero succ NV → false pred zero → zero pred succ NV → NV
 t1 → t1' if t1 then t2 else t3 → if t1' then t2 else t3
 t → t' isZero t → isZero t'
 t → t' pred t → pred t'
 t → t' succ t → succ t'

### Big step semantics

The other style of operation semantics commonly in use is called big step sematics. Instead of defining evaluation in terms of a one step reduction, it formulates the notion of a term that evaluates to a final value, written "t ⇓ v". Here is how the big step evaluation rules would look for this language:

 v ⇓ v
(B-VALUE)
 t1 ⇓ true     t2 ⇓ v2 if t1 then t2 else t3 ⇓ v2
(B-IFTRUE)
 t1 ⇓ false     t3 ⇓ v3 if t1 then t2 else t3 ⇓ v3
(B-IFFALSE)
 t1 ⇓ nv1 succ t1 ⇓ succ nv1
(B-SUCC)
 t1 ⇓ 0 pred t1 ⇓ 0
(B-PREDZERO)
 t1 ⇓ succ nv1 pred t1 ⇓ nv1
(B-PREDSUCC)
 t1 ⇓ 0 iszero t1 ⇓ true
(B-ISZEROZERO)
 t1 ⇓ succ nv1 iszero t1 ⇓ false
(B-ISZEROSUCC)

What we'd like you to do is:

• Write a parser that recognizes this language, using the combinator library
• Write a `reduce` method which performs one step of the evaluation, according to the rules listed above
• Write an `eval` method which implements a big step evaluator (one which evaluates a term down to a value, or it gets stuck when no rule applies)
• Extend the language with numeric literals which are encoded to `succ succ ... 0` during parsing.

### Implementation hints

For this project we encourage you to use an abstract syntax tree. The standard way to do this in Scala is to define case classes for each form a term can get, and construct the tree in the `build` method. The provided framework should give you an idea about how things should look.

The parsing library will parse the digit zero to a numeric literal. If you want to allow only 0, but none of the other numeric literals, you can add the following method to the object `tapl1.Arithmetic`:

```  def zero = token("0", t => t match {
case NumericLit("0") => true
case _ => false
})
def Expr: Parser = production( //...
```

The method `zero` will create a parser which accepts only numeric zero (of course, later on, when implementing the last bullet in the list, you will not need this method anymore, but deal with any numeric literal).

Besides the `reduce` and `eval` methods, your code should provide two testing methods which take a string and evaluate the program.

• The one for reductions should print each intermediary step until it finds a normal form.
• The other one should print either a value (if evaluation ended successfully), or an error message together with the stuck term (a term which is not a value, but for which there are no rules to apply).