Compilation A
Summer Semester 2003
School of Computer and Communication Sciences
Programming Methods Laboratory (LAMP)

The Exam

The Exam is now corrected, the results can be found on the door to room INR 315. We have also made a PostScript file with examples of correct answers to the questions.


  1. Showing up
  2. >= 35 points in total
  3. >= 50 points in total
  4. >= 60 points in total
  5. >= 75 points in total
  6. >= 95 points in total
There are also .5 grades halfway between two grades.

The section "Cheat Sheet" below will show you what a typical cheat sheet might look like. Such a sheet will be provided with the exam.

The section "Example Exam Questions" shows some typical exam questions that might show up on the exam. It is not the same composition or the same questions as on the real exam, the exam will contain fewer but similar questions. (On the exam each question will also show how many points it is worth, since you have 105 minutes for the exam and need 100 points you should spend approximately 1 minute per point on the exam.)

Cheat Sheet

Here are some extra information that can be useful when solving the exam questions.


Backus-Naur Form was developed by J. Backus and P. Naur for Algol 60.

A production (rule) consists of a left-hand-side (lhs) and a right-hand-side (rhs). The rhs contains terminals and non-terminals. We use | for alternatives. We use juxtaposition for concatenation. Concatenation binds stronger than |.

Names introduced on the left-hand-side can be used on the right-hand-side. A BNF grammar can be recursive, that is, the rhs of a production can refer to the symbol that is being defined in that production (or to another production which in turn refers to the symbol being defined).


EBNF allows us to use:
  • ( ) for grouping.
  • ε for the empty word.
  • [ E ] for one or zero occurrences of E.
  • { E } for zero or more occurrences of E.

Regular expressions

A regular expression can be described by a number of EBNF rules without recursion. There are alternative ways of writing regular expressions. The most common is:
  • E* for {E}.
  • E? for [E].
  • E+ for E{E}.
  • [ac] for (a|c).
  • [a-d] for (a|b|c|d).
  • [^0-9] for anything but a digit.


In the regexp of JLex | * ? + () [] have their usual reg-exp meaning:
  • E* for {E}.
  • E? for [E].
  • E+ for E{E}.
  • [ac] for (a|c).
  • [a-d] for (a|b|c|d).
  • [^0-9] for anything but a digit.
In Jlex
Name = E 
is used to define macros.
refers to the macro name. And you can use . as a shorthand for [^\n]

Example Exam Questions

  1. The Structure of a Compiler
  2. Why is a compiler divided up in several stages? What stages are a compiler usually divided up into? What does these stages do?

  3. Glossary
  4. Explain the expressions:
    1. Native code
    2. Runtime
    3. Compiler
    4. Interpreter
    5. Virtual Machine
    6. Scanner
    7. Parser
    8. Syntactic Analysis
    9. Semantic Analysis

  5. Lexical Analysis
  6. The following description of integer constants in C++ appears on page 480 of "The C++ Programming Language".

    An integer constant consisting of a sequence of digits is taken to be decimal (base 10) unless it begins with 0 (digit zero). A sequence of digits starting with 0 is taken to be an octal integer (base 8). The digits 8 and 9 are not octal digits. A sequence of digits preceded by 0x or 0X is taken to be a hexadecimal integer (base 16). The hexadecimal digits include a or A through f or F. ... The type of an integer depends on its ... suffix. [an integer may have no suffix.] ... If [the integer] is suffixed by u or U, its type is ... If it is suffixed by l or L , its type is ...

    Give a regular expression for C++ integer constants. Use only the operations |, *, and concatenation in your expression (no special JLex notation). You may also use parentheses and ε (which are not operations). For convenience, you may name a regular expression R by writing name = R.

  7. Tokens
  8. Split the following "Expressive"-code into a sequence of tokens. For each token note if it has any associated value, and in that case what the value is.
        // Read two integers
        readint x.
        readint y.
        max := 0.
        /* Return the max of the two */
        if (x>y) { max := x.} else { max := y.}.
        print "Max: ". print Max. printnl. 

  9. Let E be an eff
    1. Write a non-recursive EBNF grammar for the language of international telephone numbers. A telephone number may start with a + and a country code then there might be an area code within parentheses followed by the subscriber number. Spaces are allowed between the country code and the area code and between the area code and the subscriber number, and between any digits in the subscriber number but nowhere else. Examples: +46 (18) 51 13 29, (021)311 87 53, 693 7593, +41(21)6936660.
    2. Updated 08/07/03:
      Write a recursive EBNF grammar for the language which consists of a sequence of the character a followed by a sequence of as many ba words. Examples: aba, aababa, aaaababababa.

  10. An Ambigous Grammar
  11. Here is a context free grammar for some regular expressions:
        R ::= RR
        R ::= R '*'
        R ::= a
    1. Rewrite the grammar to an equivalent unambiguous grammar. Motivate why it is unambiguous. Give some example of how regular expressions are parsed by your grammar. Give precedence and associativity for the diffrent contstructs.
    2. Rewrite the grammar so that it can be parsed by a predictive parser.

  12. Drop the E
    1. Convert the following EBNF grammar to a BNF grammar. (That is, get rid of "{}" and "[]".)
                  S ::= 'a' { B } [ C ]
                  B ::= 'b' | S
                  C ::= 'd' | S  
    2. Where in a compiler are BNF grammars useful.

  13. Grandma
  14. [Note: This question is quite hard, the questions on the exam will probably be easier than this.]

    Given this grammar G0 with the start symbol S:
        S ::= if C then S
        S ::= if C then S else S
        S ::= ε
        C ::= E == E
        E ::= E = E
        E ::= E + E
        E ::= id
        E ::= ( E )
    Where == is comparison and = is assignment.

    1. The grammar G0 is ambiguous. Explain what this means and give an example that shows that G0 is ambiguous.
    2. Assume that + should be left associative, = should be right associative, and + should have a higher precedence than =. Rewrite G0 to a grammar G1 that expresses these rules.
    3. Rewrite G1 to a grammar G2 which is suited for a top-down (predictive, LL(1)) parser.

  15. Lexical or Syntactical
  16. What is the difference between the lexical and the syntactical analysis of a compiler? What kind of information flows between the phases?

  17. (Un)ambiguous
    1. What does it mean that a grammar is ambiguous?
    2. How can one tell that a grammar is ambiguous?
    3. What imapct does it have on a compiler if the grammar is ambiguous?
    4. Mention two methods to cope with ambiguity in a parser generator like JLex.
    5. Give the meaning of the words "associativity" and "priority".

  18. Take it from the top
  19. Write a top down parser in Java for the following Grammar. You may assume that there are:

    • A Token nextToken() function, which returns the next token.
    • A function void error(String str) which signals an error.
    • And that the integer constants IDENT, LBRACKET, RBRACKET, LPAR, RPAR, and NUMLIT are defined.

             |  LPAR E RPAR
             |  NUMLIT
          void E() {

  20. Abstract Syntax Tree
  21. Consider the following grammar:
            |   E PLUS E
            |   E MINUS E
            |   LPAR E RPAR
            |   NUMLIT
    1. Design an abstract syntax for the grammar.
    2. Given your abstract syntax draw the abstract syntax tree for the expression "x [y [z + (5-a)]] [2]".

  22. A Concrete Abstract Syntax
  23. Here is a grammar for a statements in an imperative programming language.
        Statement ::= Assignment | LoopStatement
        Statement ::= CaseStatement | ForStatement
        Assignment ::= Designator ':=' Expression
        Designator ::= id { '[' Expression ']' }
        LoopStatement ::= loop StatementSequence end
        StatementSequence ::= Statement { ';' Statement }
        CaseStatement ::= case Expression of Case { '||' Case }
        Case ::= CaseLabelList ':' StatementSequence
        CaseLabelList ::= ConstExpression { ',' ConstExpression }
    Define a suitable Java representation for the abstract syntax tree.

  24. The Object Formerly Known as Prince
  25. What is a symbol table used for in a compiler? What type of information is stored in a symbol table?

  26. Can you cope with scope?
  27. Assume a language with static scope rules allowing nested scopes. How can you implement an imperative symbol table for the semantic analyzer in the compiler for such a language. The symbol table should be a global structure that is updated through side effects. Describe especially how scope is handled, both in the implementation of the symbol table and in the semantic analysis.

  28. Semantic Analysis
  29. Assume a procedural language with integers and floating point numbers (floats) as basic types and with the following operations:
    The arguments should have the same type, the result type should be the same as the argument type.
    The argument should be integers, the result is an integer
    The arguments should be floating point numbers, the result is a floating point number
    The expression should have the same type as the variable.

    Sketch on a Java-solution for type cheking of expressions in this language which returns the type of the expression.

  30. Intermediate Representation
  31. What does "Intermediate Representation" mean? What is the adavatage of using an "Intermediate Representation" in a compiler?

  32. Back-end
  33. Explain the meaning of the following words in the context of compiler back-ends and interpretation:

    1. Runtime
    2. Environment
    3. Virtual Machine
    4. Static
    5. Dynamic

  34. Can you interpret this?
  35. Assume that you have a compiler for a small imperative language. This compiler compiles source code to an abstract syntax tree (AST), and performs name and type analysis. The language has only two types integer and boolean. The type analysis has annotated the AST with the type of each node. At runtime the boolean false is represented by 0 (zero) and the boolean true by any other integer.

    Now you are writing a interpreter in Java for this language which uses the visitor pattern over the AST. Since there is only one type at runtime (integers) you have decided that the visitor should always return an integer.

    You have the following code in the AST:

           /** Tree node representing if E then Ss Else Ss. */
          public static class If extends Stat {
    	/** Condition. */
    	public final Expr cond;
    	/** Then part of the statement. */
    	public final Stat[] thenp;
    	/** Else part of the statement. */
    	public final Stat[] elsep;
    	public If(int pos, Expr cond, Stat[] thenp, Stat[] elsep) {
    	    this.cond = cond;
    	    this.thenp = thenp;
    	    this.elsep = elsep;
    	public void apply(Visitor v) {

    And you have already written this code in the interpreter:

           /** Interpret and return the result from a tree node. */
           public int interpret(Tree tree) {
    	int tmp = result;
    	int res2 = result;
    	result = tmp;
    	return res2;
           /** Interpret an array of trees. */
           public void interpret(Tree[] trees) {
    	for (int i = 0; i < trees.length; i++)
    Finish the implementation of the interpreter for the 'if'-node.
          public void caseIf(If tree) {

  36. Ad us proprium
  37. If you feel that you have studied completely different subjects of the course than the ones we have asked you about, now is your chance!

    Design an exam question and answer it. The question should be on an appropriate level of difficulty.

   [an error occurred while processing this directive]  
Last modified: Monday, 01-Mar-2004 14:19:13 CET