Compilation A
Summer Semester 2003

School of Computer and Communication Sciences
Programming Methods Laboratory (LAMP)

Home
Staff
Research
Publications
Events
Teaching
CompilationA
Exam
Jobs

The Exam

The Exam is now corrected, the results can be found on the door to room INR 315. We have also made a PostScript file with examples of correct answers to the questions.

Grades:

Showing up
>= 35 points in total
>= 50 points in total
>= 60 points in total
>= 75 points in total
>= 95 points in total

There are also .5 grades halfway between two grades.

The section "Cheat Sheet" below will show you what a typical cheat sheet might look like. Such a sheet will be provided with the exam.

The section "Example Exam Questions" shows some typical exam questions that might show up on the exam. It is not the same composition or the same questions as on the real exam, the exam will contain fewer but similar questions. (On the exam each question will also show how many points it is worth, since you have 105 minutes for the exam and need 100 points you should spend approximately 1 minute per point on the exam.)

Cheat Sheet

Here are some extra information that can be useful when solving the exam questions.

BNF

Backus-Naur Form was developed by J. Backus and P. Naur for Algol 60.

A production (rule) consists of a left-hand-side (lhs) and a right-hand-side (rhs). The rhs contains terminals and non-terminals. We use | for alternatives. We use juxtaposition for concatenation. Concatenation binds stronger than |.

Names introduced on the left-hand-side can be used on the right-hand-side. A BNF grammar can be recursive, that is, the rhs of a production can refer to the symbol that is being defined in that production (or to another production which in turn refers to the symbol being defined).

EBNF

EBNF allows us to use:

( ) for grouping.
ε for the empty word.
[ E ] for one or zero occurrences of E.
{ E } for zero or more occurrences of E.

Regular expressions

A regular expression can be described by a number of EBNF rules without recursion. There are alternative ways of writing regular expressions. The most common is:

E* for {E}.
E? for [E].
E+ for E{E}.
[ac] for (“a”|”c”).
[a-d] for (“a”|”b”|”c”|”d”).
[^0-9] for anything but a digit.

JLex

In the regexp of JLex | * ? + () [] have their usual reg-exp meaning:

E* for {E}.
E? for [E].
E+ for E{E}.
[ac] for (“a”|”c”).
[a-d] for (“a”|”b”|”c”|”d”).
[^0-9] for anything but a digit.

In Jlex

Name = E

is used to define macros.

{name}

refers to the macro name. And you can use . as a shorthand for [^\n]

Example Exam Questions

The Structure of a Compiler

Glossary

Native code
Runtime
Compiler
Interpreter
Virtual Machine
Scanner
Parser
Syntactic Analysis
Semantic Analysis

Lexical Analysis

The following description of integer constants in C++ appears on page 480 of "The C++ Programming Language".

An integer constant consisting of a sequence of digits is taken to be decimal (base 10) unless it begins with 0 (digit zero). A sequence of digits starting with 0 is taken to be an octal integer (base 8). The digits 8 and 9 are not octal digits. A sequence of digits preceded by 0x or 0X is taken to be a hexadecimal integer (base 16). The hexadecimal digits include a or A through f or F. ... The type of an integer depends on its ... suffix. [an integer may have no suffix.] ... If [the integer] is suffixed by u or U, its type is ... If it is suffixed by l or L , its type is ...

Give a regular expression for C++ integer constants. Use only the operations |, *, and concatenation in your expression (no special JLex notation). You may also use parentheses and ε (which are not operations). For convenience, you may name a regular expression R by writing name = R.

Tokens

    // Read two integers
    readint x.
    readint y.
    max := 0.
    /* Return the max of the two */
    if (x>y) { max := x.} else { max := y.}.
    print "Max: ". print Max. printnl.

Let E be an eff

Write a non-recursive EBNF grammar for the language of international telephone numbers. A telephone number may start with a + and a country code then there might be an area code within parentheses followed by the subscriber number. Spaces are allowed between the country code and the area code and between the area code and the subscriber number, and between any digits in the subscriber number but nowhere else. Examples: +46 (18) 51 13 29, (021)311 87 53, 693 7593, +41(21)6936660.
Updated 08/07/03:
Write a recursive EBNF grammar for the language which consists of a sequence of the character a followed by a sequence of as many ba words. Examples: aba, aababa, aaaababababa.

An Ambigous Grammar

    R ::= RR
    R ::= R '*'
    R ::= a

Rewrite the grammar to an equivalent unambiguous grammar. Motivate why it is unambiguous. Give some example of how regular expressions are parsed by your grammar. Give precedence and associativity for the diffrent contstructs.
Rewrite the grammar so that it can be parsed by a predictive parser.

Drop the E

Convert the following EBNF grammar to a BNF grammar. (That is, get rid of "{}" and "[]".)

            S ::= 'a' { B } [ C ]
            B ::= 'b' | S
            C ::= 'd' | S

Where in a compiler are BNF grammars useful.

Grandma

Note:

₀

    S ::= if C then S
    S ::= if C then S else S
    S ::= ε
    C ::= E == E
    E ::= E = E
    E ::= E + E
    E ::= id
    E ::= ( E )

The grammar G₀ is ambiguous. Explain what this means and give an example that shows that G₀ is ambiguous.
Assume that + should be left associative, = should be right associative, and + should have a higher precedence than =. Rewrite G₀ to a grammar G₁ that expresses these rules.
Rewrite G₁ to a grammar G₂ which is suited for a top-down (predictive, LL(1)) parser.

Lexical or Syntactical

(Un)ambiguous

What does it mean that a grammar is ambiguous?
How can one tell that a grammar is ambiguous?
What imapct does it have on a compiler if the grammar is ambiguous?
Mention two methods to cope with ambiguity in a parser generator like JLex.
Give the meaning of the words "associativity" and "priority".

Take it from the top

Write a top down parser in Java for the following Grammar. You may assume that there are:

A Token nextToken() function, which returns the next token.
A function void error(String str) which signals an error.
And that the integer constants IDENT, LBRACKET, RBRACKET, LPAR, RPAR, and NUMLIT are defined.

      E ::= IDENT { LBRACKET E RBRACKET }
         |  LPAR E RPAR
         |  NUMLIT

      void E() {
...  
      }

Abstract Syntax Tree

      E ::= IDENT { LBRACKET E RBRACKET }
        |   E PLUS E
        |   E MINUS E
        |   LPAR E RPAR
        |   NUMLIT

Design an abstract syntax for the grammar.
Given your abstract syntax draw the abstract syntax tree for the expression "x [y [z + (5-a)]] [2]".

A Concrete Abstract Syntax

    Statement ::= Assignment | LoopStatement
    Statement ::= CaseStatement | ForStatement
    Assignment ::= Designator ':=' Expression
    Designator ::= id { '[' Expression ']' }
    LoopStatement ::= loop StatementSequence end
    StatementSequence ::= Statement { ';' Statement }
    CaseStatement ::= case Expression of Case { '||' Case }
    Case ::= CaseLabelList ':' StatementSequence
    CaseLabelList ::= ConstExpression { ',' ConstExpression }

The Object Formerly Known as Prince

Can you cope with scope?

Assume a language with static scope rules allowing nested scopes. How can you implement an imperative symbol table for the semantic analyzer in the compiler for such a language. The symbol table should be a global structure that is updated through side effects. Describe especially how scope is handled, both in the implementation of the symbol table and in the semantic analysis.

Semantic Analysis

+: The arguments should have the same type, the result type should be the same as the argument type.
xor: The argument should be integers, the result is an integer
/: The arguments should be floating point numbers, the result is a floating point number
:=: The expression should have the same type as the variable.

Sketch on a Java-solution for type cheking of expressions in this language which returns the type of the expression.

Intermediate Representation

Back-end

Explain the meaning of the following words in the context of compiler back-ends and interpretation:

Runtime
Environment
Virtual Machine
Static
Dynamic

Can you interpret this?

Assume that you have a compiler for a small imperative language. This compiler compiles source code to an abstract syntax tree (AST), and performs name and type analysis. The language has only two types integer and boolean. The type analysis has annotated the AST with the type of each node. At runtime the boolean false is represented by 0 (zero) and the boolean true by any other integer.

Now you are writing a interpreter in Java for this language which uses the visitor pattern over the AST. Since there is only one type at runtime (integers) you have decided that the visitor should always return an integer.

You have the following code in the AST:

       /** Tree node representing if E then Ss Else Ss. */
      public static class If extends Stat {
	/** Condition. */
	public final Expr cond;
	/** Then part of the statement. */
	public final Stat[] thenp;
	/** Else part of the statement. */
	public final Stat[] elsep;
	public If(int pos, Expr cond, Stat[] thenp, Stat[] elsep) {
	    super(pos);
	    this.cond = cond;
	    this.thenp = thenp;
	    this.elsep = elsep;
	}
	public void apply(Visitor v) {
	    v.caseIf(this);
	}
       }

And you have already written this code in the interpreter:

       /** Interpret and return the result from a tree node. */
       public int interpret(Tree tree) {
	int tmp = result;
	tree.apply(this);
	int res2 = result;
	result = tmp;
	return res2;
       }

       /** Interpret an array of trees. */
       public void interpret(Tree[] trees) {
	for (int i = 0; i < trees.length; i++)
	    interpret(trees[i]);
       }

      public void caseIf(If tree) {

 	

	

	

	

	

      }

Ad us proprium

Design an exam question and answer it. The question should be on an appropriate level of difficulty.

[an error occurred while processing this directive]

Last modified: Monday, 01-Mar-2004 14:19:13 CET