After discussion with some of you and other people, I have decided to
drop the project. I misestimated the amount of work involved in it,
it simply is too big for this class.
This is the programming project for the course. You can do the
project in three different sizes. The base project
implements a minimal subset of the language. The standard
project implements the language as given and is what is
recommended to do. The extended version implements even more
You can sometimes use different sizes for different phases of
the compiler, but for example a standard semantic analysis will
require at least a standard parser.
You have to install JLex and java_cup. The easiest way is to do the
See also the local Java - HOWTO.
- Create a directory
- Copy JLex.jar and java_cup.jar to that directory.
- If you want you can add the jar files to your classpath with
If you want to keep your old
setenv CLASSPATH $HOME/classes/JLex.jar:$HOME/classes/java_cup.jar;
(for csh) or
CLASSPATH as well,
you can append it with
Your first task is to write a scanner for the language
"filter". We will use the tool JLex to generate a file
Scanner.java from the file
(Again, the local Java - HOWTO explains
how to call JLex in more detail).
You can download the file filter.lex,
in which some parts are already done. What is left to you is the addition of the patterns
for the tokens and the corresponding actions. [What is done, is position
management (counting of lines and columns), construction of
the object for the token (which is handed to the parser)
and a string representation of tokens, which is useful for
For the token
data this is already done. You have to
do it similarly for all the other tokens.
The specification, what kind of tokens, white space and comments
should be recognized are in the language
The patterns that you can use are on slide 15ff of lexical analysis.
There are some additional files which you should also download:
Position.java is useful for handling
positions, Report.java is a simple error
module, and Tokens.java is an interface,
which contains the definition of an integer for each token.
Finally, there is a class ScannerTest.java, which
has a main method and allows you to test your Scanner on the standard
You can use byte.filter as a sample input file:
java filter.ScannerTest < byte.filter
JLex reacts weird to comments in the pattern section, so only put
them inside commands. Also do not use spaces in patterns except
" ", in sets
[ ], or
after a backslash
The source file for the parser is filter.cup.
Here, we describe what terminals and non-terminals we have, the
precedence of the operators and the rules (productions) of the
grammar. The given file has already some definitions, you have to
add the remaining ones. The original grammar is given in the
language specification. You will have
to transform it into BNF for using it in filter.cup.
Now you can use java_cup to generate the two files Tokens.java and
Parser.java from filter.cup (See also the local Java - HOWTO
on using java_cup). Parser.java then contains the Java source-code
for the parser and Tokens.java contains the token-ids that must
also be known to the scanner.
There is also a given class ParserTest.java,
which allows you to test your parser.:
java filter.ParserTest < byte.filter
Most likely java_cup will give you warnings or error-messages.
Try to test your parser also with other input files.