Logo EPFL
LAMP
Ecole Polytechnique Fédérale de Lausanne
Part Two: Mark and Sweep Garbage Collection
English only

The goal of this project is to implement a simple, conservative garbage collection technique: Mark & Sweep. You will continue using your modified VM (although threaded interpretation has no influence on this project).

As explained in the course, the purpose of having a garbage collector is to reclaim unused memory. Mark and sweep achieves that goal by two steps: first, it marks all blocks of memory that it thinks are still in use (mark), then by inspecting all the heap it links unmarked blocks to the free list (sweep).

The mark phase starts by marking all blocks from the roots of the program (that means those locations that are live always -- the registers). When it is being marked, each block is inspected and all pointers are followed, and their destination blocks are marked too. The question is how to decide what is a pointer -- one solution is to safely approximate that any integer is one. Another would be to rely on the compiler to tell which fields of a block are pointers. We'll use the first solution, with a twist: an integer is first checked to see if it points inside the heap area. If it does, it is considered a pointer.

The sweep phase starts from the beginning of the heap and each block is checked. If it's not marked, and not free, it will be added to the free list (actually, that depends on the way free blocks are managed, and it might not be a list, but we call it as such for simplicity). We don't require your garbage collector to coalesce adjacent free blocks, although that might improve its performance.

The allocation routine has to be changed, so that it searches for a suitable block of memory. If it does not find it, it should run the garbage collector and retry. For keeping track of free blocks, we suggest you use an internal free list as described in the course.

Your garbage collector should not use more memory than heap_size. This means the free list, and any additional bookeeping memory you need (like for mark bits) has to fit inside the allocated heap_mem. However, this does not include the normal C call stack -- you can use a recursive mark function, and in general are allowed to declare local/global variables for pointers inside the heap.

In order to hook a garbage collector in the VM, have a look at memory.h. You have to implement two functions, and pack them into the garbage_collector_t structure, then instruct the VM to use them (probably the best place to set it is in main.c).

This should be enough to make your VM use the new garbage collector.