Overview

The goal of this project is to implement a simple, conservative garbage collection technique: Mark & Sweep. This is your first graded assignment, and you have the choice of continuing using a fresh compiler and vm , provided by us, which solves the first two ungraded exercises, or you can continue using the your version.

Setting up

You have three options regarding the compiler and vm that you will be using for the rest of this course:

To get a fresh checkout, you proceed as in the first assignment (the actual output might show more patches being applied):

$ darcs get http://lamp.epfl.ch/~schinz/act-project
Copying patch 1 of 1... done!
Applying patch 1 of 1... done.
Finished getting.

Mark and Sweep

As explained in the course, the purpose of having a garbage collector is to reclaim unused memory. Mark and sweep achieves this goal by two steps: first, it marks all blocks of memory that it thinks are still in use (mark), then by inspecting all the heap it links unmarked blocks to the free list (sweep).

The mark phase starts by marking all blocks from the roots of the program (that means those locations that are live always -- the registers). When it is being marked, each block is inspected and all pointers are followed, and their destination blocks are marked too. The question is how to decide what is a pointer -- one solution is to safely approximate that any integer is one. Another would be to rely on the compiler to tell which fields of a block are pointers. A good alternative which does not require changes in the compiler is to keep a bitmap for the heap, which tells which addresses represent starting points for allocated blocks. We suggest you choose one which does not require compiler support.

The sweep phase starts from the beginning of the heap and each block is checked. If it's not marked, and not free, it will be added to the free list (actually, that depends on the way free blocks are managed, and it might not be a list, but we call it as such for simplicity). We don't require your garbage collector to coalesce adjacent free blocks, although that might improve its performance.

The allocation routine has to be changed, so that it searches for a suitable block of memory. If it does not find it, it should run the garbage collector and retry. For keeping track of free blocks, we suggest you use an internal free list as described in the course.

Requirements

Your garbage collector should not use more memory than heap_size. This means the free list, and any additional bookeeping memory you need (like for mark bits) has to fit inside the allocated heap_mem. However, this does not include the normal C call stack -- you can use a recursive mark function, and in general are allowed to declare local/global variables for pointers inside the heap.

The straight forward way of marking blocks, by changing a bit in the block itself (for instance in the size field) is not going to work unless we have a precise algorithm: we cannot change anything in the 'marked' blocks unless we are positive they are allocated memory blocks. One solution is to keep a bitmap for mark bits inside the heap, so blocks are never modified in the marking process. Another one is to keep a bitmap for allocated blocks: this has the advantage that if an address is flagged as allocated in the bitmap, we are sure it's a memory block and can mark it by setting a bit in the block itself.

Submission

Please submit a single arhived file (zip or tar.gz) containing:

Your VM should be independent on the machine it runs, and it should at least work on the Linux machines in INF1. The tricky part is that the size of int might differ between machines. The INF1 machines are 64 bits machines, while most 'usual' machines are 32 bit. If you encounter problems, check your code for assumptions about int size.