|
|
The Exam
The Exam is now corrected, the results can be found on the door
to room INR 315.
We have also made a PostScript file
with examples of correct answers
to the questions.
Grades:
- Showing up
- >= 35 points in total
- >= 50 points in total
- >= 60 points in total
- >= 75 points in total
- >= 95 points in total
There are also .5 grades halfway between two grades.
The section "Cheat Sheet" below will show you what a typical
cheat sheet might look like. Such a sheet will be provided with
the exam.
The section "Example Exam Questions" shows some typical exam
questions that might show up on the exam. It is not the same
composition or the same questions as on the real exam, the exam
will contain fewer but similar questions. (On the exam each
question will also show how many points it is worth, since you
have 105 minutes for the exam and need 100 points you should
spend approximately 1 minute per point on the exam.)
Cheat Sheet
Here are some extra information that can be useful when solving
the exam questions.
BNF
Backus-Naur Form was developed by J. Backus and P. Naur for Algol 60.
A production (rule) consists of a left-hand-side (lhs) and
a right-hand-side (rhs). The rhs contains terminals and non-terminals.
We use | for alternatives. We use juxtaposition for concatenation.
Concatenation binds stronger than |.
Names introduced on the left-hand-side can be used on the
right-hand-side. A BNF grammar can be recursive, that is, the
rhs of a production can refer to the symbol that is being
defined in that production (or to another production which in
turn refers to the symbol being defined).
EBNF
EBNF allows us to use:
- ( ) for grouping.
- ε for the empty word.
- [ E ] for one or zero occurrences of E.
- { E } for zero or more occurrences of E.
Regular expressions
A regular expression can be described by a number of EBNF rules
without recursion.
There are alternative ways of writing regular expressions. The
most common is:
- E* for {E}.
- E? for [E].
- E+ for E{E}.
- [ac] for (“a”|”c”).
- [a-d] for (“a”|”b”|”c”|”d”).
- [^0-9] for anything but a digit.
JLex
In the regexp of JLex | * ? + () [] have their usual reg-exp
meaning:
- E* for {E}.
- E? for [E].
- E+ for E{E}.
- [ac] for (“a”|”c”).
- [a-d] for (“a”|”b”|”c”|”d”).
- [^0-9] for anything but a digit.
In Jlex
Name = E
is used to define macros.
{name}
refers to the macro name.
And you can use . as a shorthand for [^\n]
Example Exam Questions
- The Structure of a Compiler
Why is a compiler divided up in several stages?
What stages are a compiler usually divided up into?
What does these stages do?
- Glossary
Explain the expressions:
- Native code
- Runtime
- Compiler
- Interpreter
- Virtual Machine
- Scanner
- Parser
- Syntactic Analysis
- Semantic Analysis
- Lexical Analysis
The following description of integer constants in C++ appears on
page 480 of "The C++ Programming Language".
An integer constant consisting of a sequence of digits is
taken to be decimal (base 10) unless it begins with 0 (digit
zero). A sequence of digits starting with 0 is taken to be an
octal integer (base 8). The digits 8 and 9 are not octal
digits. A sequence of digits preceded by 0x or 0X is taken to
be a hexadecimal integer (base 16). The hexadecimal digits
include a or A through f
or F. ... The type of an integer depends on its
... suffix. [an integer may have no suffix.] ... If [the
integer] is suffixed by u or U, its type is ... If it is
suffixed by l or L , its type is ...
Give a regular expression for C++ integer constants. Use
only the operations |, *, and concatenation in your
expression (no special JLex notation). You may also use
parentheses and ε (which are not operations). For
convenience, you may name a regular expression R by writing
name = R.
- Tokens
Split the following "Expressive"-code into a sequence of tokens. For each
token note if it has any associated value, and in that case what
the value is.
// Read two integers
readint x.
readint y.
max := 0.
/* Return the max of the two */
if (x>y) { max := x.} else { max := y.}.
print "Max: ". print Max. printnl.
- Let E be an eff
- Write a non-recursive EBNF grammar for the language of
international telephone numbers. A telephone number may start
with a + and a country code then there might be an area code
within parentheses followed by the subscriber number. Spaces are
allowed between the country code and the area code and between
the area code and the subscriber number, and between any
digits in the subscriber number but nowhere else. Examples: +46
(18) 51 13 29, (021)311 87 53, 693 7593, +41(21)6936660.
- Updated 08/07/03:
Write a recursive EBNF grammar for
the language which consists of a sequence of the character a
followed by a sequence of as many ba words. Examples:
aba, aababa, aaaababababa.
- An Ambigous Grammar
Here is a context free grammar for some regular expressions:
R ::= RR
R ::= R '*'
R ::= a
-
Rewrite the grammar to an equivalent unambiguous grammar.
Motivate why it is unambiguous. Give some example of how
regular expressions are parsed by your grammar. Give
precedence and associativity for the diffrent contstructs.
-
Rewrite the grammar so that it can be parsed by a predictive
parser.
- Drop the E
- Convert the following EBNF grammar to a BNF grammar.
(That is, get rid of "{}" and "[]".)
S ::= 'a' { B } [ C ]
B ::= 'b' | S
C ::= 'd' | S
- Where in a compiler are BNF grammars useful.
- Grandma
[Note: This question is quite hard, the questions
on the exam will probably be easier than this.]
Given this grammar G0 with the
start symbol S:
S ::= if C then S
S ::= if C then S else S
S ::= ε
C ::= E == E
E ::= E = E
E ::= E + E
E ::= id
E ::= ( E )
Where == is comparison and = is assignment.
- The grammar G0 is
ambiguous. Explain what this means and give an example that
shows that G0 is ambiguous.
- Assume that + should be left associative, = should be right
associative, and + should have a higher precedence than =.
Rewrite G0 to a grammar
G1 that expresses these rules.
- Rewrite G1 to a grammar
G2 which is suited for a
top-down (predictive, LL(1)) parser.
- Lexical or Syntactical
What is the difference between the lexical and the syntactical
analysis of a compiler? What kind of information flows between the
phases?
- (Un)ambiguous
- What does it mean that a grammar is ambiguous?
- How can one tell that a grammar is ambiguous?
- What imapct does it have on a compiler if the grammar
is ambiguous?
- Mention two methods to cope with ambiguity in a parser
generator like JLex.
- Give the meaning of the words "associativity"
and "priority".
- Take it from the top
Write a top down parser in Java for the following Grammar. You
may assume that there are:
- A
Token nextToken() function, which returns the
next token. - A function
void error(String str)
which signals an error. - And that the integer constants
IDENT , LBRACKET ,
RBRACKET , LPAR , RPAR , and
NUMLIT are defined.
Grammar:
E ::= IDENT { LBRACKET E RBRACKET }
| LPAR E RPAR
| NUMLIT
Code:
void E() {
...
}
- Abstract Syntax Tree
Consider the following grammar:
E ::= IDENT { LBRACKET E RBRACKET }
| E PLUS E
| E MINUS E
| LPAR E RPAR
| NUMLIT
- Design an abstract syntax for the grammar.
- Given your abstract syntax draw the abstract syntax tree for
the expression "
x [y [z + (5-a)]] [2] ".
- A Concrete Abstract Syntax
Here is a grammar for a statements in an imperative programming
language.
Statement ::= Assignment | LoopStatement
Statement ::= CaseStatement | ForStatement
Assignment ::= Designator ':=' Expression
Designator ::= id { '[' Expression ']' }
LoopStatement ::= loop StatementSequence end
StatementSequence ::= Statement { ';' Statement }
CaseStatement ::= case Expression of Case { '||' Case }
Case ::= CaseLabelList ':' StatementSequence
CaseLabelList ::= ConstExpression { ',' ConstExpression }
Define a suitable Java representation for the abstract syntax tree.
- The Object Formerly Known as Prince
What is a symbol table used for in a compiler? What type of
information is stored in a symbol table?
- Can you cope with scope?
Assume a language with static scope rules allowing nested
scopes. How can you implement an imperative symbol table for
the semantic analyzer in the compiler for such a language. The
symbol table should be a global structure that is updated
through side effects. Describe especially how scope is handled,
both in the implementation of the symbol table and in the semantic analysis.
- Semantic Analysis
Assume a procedural language with integers and floating point
numbers (floats) as basic types and with the following operations:
- +
- The arguments should have the same type, the
result type should be the same as the argument type.
- xor
- The argument should be integers, the result is an
integer
- /
- The arguments should be floating point numbers, the result
is a floating point number
- :=
- The expression should have the same type as the
variable.
Sketch on a Java-solution for type cheking of expressions in
this language which returns the type of the expression.
- Intermediate Representation
What does "Intermediate Representation" mean? What is the
adavatage of using an "Intermediate Representation" in a compiler?
- Back-end
Explain the meaning of the following words in the context of
compiler back-ends and interpretation:
- Runtime
- Environment
- Virtual Machine
- Static
- Dynamic
- Can you interpret this?
Assume that you have a compiler for a small imperative
language. This compiler compiles source code to an abstract
syntax tree (AST), and performs name and type analysis. The
language has only two types integer and
boolean. The type analysis has annotated the
AST with the type of each node. At runtime the boolean false is
represented by 0 (zero) and the boolean true by any other
integer.
Now you are writing a interpreter in Java for this language
which uses the visitor pattern over the AST. Since there is only
one type at runtime (integers) you have decided that the visitor
should always return an integer.
You have the following code in the AST:
/** Tree node representing if E then Ss Else Ss. */
public static class If extends Stat {
/** Condition. */
public final Expr cond;
/** Then part of the statement. */
public final Stat[] thenp;
/** Else part of the statement. */
public final Stat[] elsep;
public If(int pos, Expr cond, Stat[] thenp, Stat[] elsep) {
super(pos);
this.cond = cond;
this.thenp = thenp;
this.elsep = elsep;
}
public void apply(Visitor v) {
v.caseIf(this);
}
}
And you have already written this code in the interpreter:
/** Interpret and return the result from a tree node. */
public int interpret(Tree tree) {
int tmp = result;
tree.apply(this);
int res2 = result;
result = tmp;
return res2;
}
/** Interpret an array of trees. */
public void interpret(Tree[] trees) {
for (int i = 0; i < trees.length; i++)
interpret(trees[i]);
}
Finish the implementation of the interpreter for the 'if'-node.
public void caseIf(If tree) {
}
- Ad us proprium
If you feel that you have studied completely different subjects of
the course than the ones we have asked you about, now is your
chance!
Design an exam question and answer it. The question should be
on an appropriate level of difficulty.
|
|