Semester projects

Scala benchmark suite

keywords: scala, benchmarking

Goal

The goal of this project would be to develop a benchmark suite that can be used to assess how changes to the Scala compiler and libraries affect the space and time usage of various operations, programming patterns and idioms, etc.

offered by Aleksandar Prokopec.

MultiSets Scala Collection

The Scala collection library has sequences, sets, maps, and several other collection types, but it does not currently have MultiSets. The goal of the project is the write a complete set of APIs and implementations for MultiSets that fit seamlessly in the Scala collection library. See Programming in Scala, chapters 17, 24 and 25 for a discussion of the Scala collections API and its architecture and how to add new collection types to it.

The project can be a semester project for 2 people or a master project for one.

offered by LAMP

Distributed Collections for Nephele/PACTs

The Distributed Collections for Scala [1] currently process data by using Hadoop framework [2] as the processing engine. Goal of this project is to extend Distributed Collections to use Nephele/PACTs [3] framework as the processing back-end.

Achiveing this goal requires:
[1] Distributed Collections for Scala https://github.com/scala-incubator/distributed-collections
[2] Hadoop http://hadoop.apache.org/
[3] Nephele/PACTs http://www.stratosphere.eu

offered by Vojin Jovanovic.

Extending Scaladoc to emit API docs in .NET format

Sandcastle [1] is “an open source documentation generator from Microsoft that reads your assemblies (DLL or EXE files) and their XML Comments and auto- matically generates HTML documentation.” [2]

Sandcastle carries out the very last phase of documentation generation, ensuring that the output meets the visual cues and file formats that developers expect of API docs on .NET. This division of labor bears similarities with the workflow of Scaladoc tools on JVM, where the final generation stage can be customized. That way, two goals can be achieved simultaneously:

For comparison, the functionality of the Scaladoc front-end is usually em- bedded in .NET compilers (for example, as in C# [3] or F# [4]). In our case, we need not modify the Scala.NET compiler to emit documentation attributes in the output assemblies; and the required XML-comment file (with pointers back to source locations) is to be emitted by a new Scaladoc backend (a modular component) to be developed.

A nice to have: Mono support

Regarding Mono, it’s not clear whether the equivalent of Sandcastle (Monodoc) understands all the XML tags of Sandcastle. Therefore, it’s not a requirement for this project to generate Monodoc-compatible XML files (it’s still desirable, for example in case just a few tags can’t be handled by Monodoc we can acco- modate that). If our output works for Monodoc, fine. Quoting from [5]: Mono generally uses Monodoc. As I understand it, this is a some- what different format - but it can extract XML documentation from C# code as normal.

[1] http://sandcastle.codeplex.com/
[2] http://broadcast.oreilly.com/2010/09/build- html- documentation- for- y.html
[3] http://msdn.microsoft.com/en-us/library/b2s063f7(v=vs.71).aspx
[4] http://msdn.microsoft.com/en-us/library/dd233217.aspx
[5] http://stackoverflow.com/questions/4451082/c-mono-api-documentation-tools

offered by Miguel Garcia

Prefuse for Scala

keywords: scala, type systems, reactive programming, visualization, GUI,

Prefuse is a visualization toolkit for Java based on Java2D and Swing. It has a relatively clean separation between model classes and view classes. They communicate with each other through events and listeners. Due to Java's weaker type system compared to Scala and the fact that prefuse is implemented without generics, the framework's numerous model classes (trees, tables, graphs etc) are largely untyped.

Your task for this project is to port (a substantial part of) Prefuse to Scala with the following points in mind:

We can adapt the tasks according to your prior experience and interests. For example, you could just address the first point and leave the second for a later project. You should be experienced with Java generics. Experience with more advanced features of Scala, such as variance, path dependent types or higher-kinded types is a plus. This can be a semester project for one if you are an experienced programmer, a semester project for two or a Master's project.

References:
http://prefuse.org
https://github.com/prefuse/Prefuse
http://lamp.epfl.ch/~imaier/pub/DeprecatingObserversTR2010.pdf

offered by Ingo Maier.

SHotDraw

keywords: scala, reactive programming, design patterns, GUI

JHotDraw is a Java application and framework for structured graphics. It provides a basic version of drawing facilities found in presentation software such as MS PowerPoint or Apple Keynote or diagram editors such as Dia or Omnigraffle. JHotDraw - a port of HotDraw, originally written in Smalltalk - is a showcase for good object oriented (OO) design and implements a variety of design patterns.

Your task for this project is to write a Scala version of JHotDraw in which you This can be a challenging semester project or preferably a Master's project. References:
http://www.jhotdraw.org
http://en.wikipedia.org/wiki/Software_design_pattern
http://lamp.epfl.ch/~imaier/pub/DeprecatingObserversTR2010.pdf

offered by Ingo Maier.

Debugger for Scala Parser Combinators

keywords: Scala, debugger, parser combinators, GUI

Scala has a powerful parser combinators library which enables programmers to design language grammar using familiar Scala methods and fields. Therefore instead of spending time on learning (often powerful) tools from scratch like ANTL, JavaCC or Beaver which is often a steep curve, they can focus on the design. Parser combinators can be powerful but they also often cause troubles, especially for beginners, when the defined grammar doesn't behave as expected and get only cryptic messages. This isn't such a big issue for toy languages but people use Scala parser combinators for full blown projects and finding bugs there is a real issue.

What one would ideally like to have is a specialized debugger for parser combinators that takes the defined grammar and visualizes it in an interactive way. For example we would like to see how literals are consumed, where parser combinators backtrack because it couldn't match the pattern etc. At the moment the only way to debug bugs in your grammar is through primitive println statements which is hardly sufficient for normal Scala programmers.

Student can pick whatever visualization library he wants (for example Scala Swing wrappers). If you find it suitable you can also use Scala.React framework. Successful project would be definitely appreciated by the Scala programmers and could be also used during teaching for our courses. Prior knowledge of Scala parser combinators is not necessary if student has a good knowledge of Scala.

This can be a challenging semester project or maybe even a Master's project (depending on the scope). References:
http://lamp.epfl.ch/teaching/foundations_of_software/docs/combinator_parsing.pdf
http://www.cs.kuleuven.be/publicaties/rapporten/cw/CW491.pdf
http://lamp.epfl.ch/~imaier/pub/DeprecatingObserversTR2010.pdf
http://lamp.epfl.ch/teaching/foundations_of_software/docs/week01.pdf(second part of slides contains a really quick introduction to Scala parser combinators)

offered by Hubert Plociniczak.

Master projects

Distributed Collections DSL

keywords: scala, delite, dsl, collections, parallelism, distribution, large scale data processing, domain specific languages

Contemporary large scale data processing frameworks either provide higher level abstractions that increase expressiveness while sacrificing performance or provide low level APIs that require significant development effort to program and maintain. Goal of this project is to provide a domain specific language (DSL) with a high level API, similar to Scala Collections [1], for distributed data parallel processing. Although having a high level API it would use domain knowledge to generate highly optimized code for existing large scale processing frameworks like (Hadoop [2] or Nephele/PACTs [3]). It would be implemented as a building block for other DSLs inside the Delite framework [4].
DSL implementation would use/extend the interface from existing Delite Collections DSL, use existing optimization techniques and then generate code that targets a specific processing framework. This project would innovate in following aspects:

Requirements this project requires good prior knowledge of Scala.


[1] Scala Collections http://docs.scala-lang.org/overviews/collections/introduction.html
[2] Hadoop http://hadoop.apache.org/
[3] Nephele/PACTs http://www.stratosphere.eu
[4] Delite http://stanford-ppl.github.com/Delite/index.html

offered by Vojin Jovanovic.

Compiler Regression Testing

keywords: object-oriented programming, regression testing, compilers

The Scala compiler has an extensive test suite that we use to detect regressions. Certain files should compile, others must result in specific error messages, and yet another category of tests (indirectly) checks the code generated by the compiler by compiling and running tests and comparing their output to the expected result.

To test the type checker and other crucial, yet under-tested, parts of the compiler, more efficiently and more precisely, we'd like you to design and implement a new kind of test. The following Scala file -- note the comment -- illustrates how these tests could work:

object Test { 
  def f /*@expectedType: Int */ = 1 
}
      

The comment (or, depending on your research on how to best design this, an annotation) specifies the expected type at a certain position in a Scala program. This allows for more direct checking of the type checker, which is currently quite complicated[1]. Moreover, we'd like you to generalise this idea to implicit resolution (which implicit value gets inferred?) or [type|term]-completion in the presentation compiler (which completions are found?), and any other application you can think of!

This is a reasonably challenging project, which requires prior knowledge of Scala programming, and preferably (but not necessarily) some background in compilers. However, these tests will be used several times a day to ensure the quality of a widely used language!

[1]: http://lampsvn.epfl.ch/trac/scala/browser/scala/trunk/test/files/run/constrained-types.scala

offered by Adriaan Moors

Integrating Workstealing Abstractions in the Parallel Collection Framework

keywords: work stealing, schedulers

All languages come with collection support in the form of arrays, hash tables, linked lists etc. Scala parallel collections framework is a parallel extension to the standard collection framework in Scala. It parallelizes bulk operations like collection traversal, transformation or filtering. Currently, it includes parallel arrays, parallel hash maps and hash sets, as well as their persistent (immutable) counterparts. The framework is generic and can be further extended with new data structures.

Goal

The goal of this project is to extend the parallel collection framework with the workstealing abstraction on the level of the framework and develop a new scheduler based on this abstraction. The workstealing abstraction comes in the form of a new method called "steal" for parallel iterators, which simply returns a part of the set being iterated. It should then be shown that this approach leads to better load balancing and a better performance.

offered by Aleksandar Prokopec.

Parallel Collection Extensions

keywords:parallel, collections, data structures

All languages come with collection support in the form of arrays, hash tables, linked lists etc. Scala parallel collections framework is a parallel extension to the standard collection framework in Scala. It parallelizes bulk operations like collection traversal, transformation or filtering. Currently, it includes parallel arrays, parallel hash maps and hash sets, as well as their persistent (immutable) counterparts. The framework is generic and can be further extended with new data structures.

Goal

Your goal is to extend the parallel collection framework with binomial heaps, parallel ropes, conc-lists and parallel ordered maps and sets.

offered by Aleksandar Prokopec.

Java Virtual Machine(JVM) Micro Benchmarks

keywords: jvm, bytecode, optimization, benchmarking

The goal of this project is to provide an overview of how different changes to the bytecode generated by the Scala compiler impact the execution speed in different implementations of the Java Virtual Machine (JVM). The conclusions of this study can be further used to fine-tune the optimization and code generation phases within the Scala compiler.

Some of the questions the study should answer:

offered by Vlad Ureche.