Introduction
The goal of this first graded assignment is to write a
conservative mark&sweep garbage collector for the virtual
machine. This garbage collector should replace the current
memory
module, which does not free memory.
To begin this assignment, you can either use your current version of the project, or start anew with our own minischeme.zip that includes the solution to the two ungraded assignments.
The mark phase
The mark phase identifies the set of reachable memory blocks, and marks them so that the sweep phase can then free unmarked blocks.
Traversing the object graph requires identifying pointers to objects. Since the minischeme compiler does not produce the information necessary to differentiate pointers from integers, a conservative approach has to be taken: everything that looks like a pointer to a heap block is treated as such, which means that the block it refers is considered as reachable.
To identify pointers, the mark phase must repeatedly look at values and determine whether they represent the address of the beginning of an allocated block, or not. This should be done as efficiently as possible. One technique is to manage a bitmap with one bit per heap address, in which a bit is set if and only if the corresponding address designates the beginning of an allocated block.
Once an allocated block has been identified as (conservatively) reachable, it must be marked so that the sweep phase knows it cannot be freed. This mark bit can conveniently be stored in the least-significant bit of the size, in the block header. This is because block size is expressed in bytes, and the size of any block is at least a multiple of two.
As we have seen in the course, when a block is marked, it must be searched for pointers, and the blocks designated by these pointers must be recursively marked. For this, you are allowed to write a recursive mark function, although this solution is far from optimal.
The sweep phase
The sweep phase traverses the whole heap, block after block, and puts those that are not marked in the free list. Adjacent free blocks are coalesced. The mark bit of blocks is also cleared in the process, to guarantee that all blocks are unmarked at the beginning of the next garbage collection.
Notice that the free list is rebuilt from scratch in this phase.
Tips and tricks
Your garbage collector must at least run on the IN machines, which have a 64 bits architecture. Ideally, your code should be completely independent of the architecture size, but this might be difficult to achieve in the limited time at your disposal. Therefore, we strongly recommend that you do all your testing and debugging on a 64 bits architecture.
To test your garbage collector, write long-running programs that consume memory. To help you, we provide a single compiled program, bignums.asm, that computes the factorial of 200.
Solution submission
Your solution must be submitted as a ZIP archive containing exactly the following files - not one more, not one less:
README (short description of your solution) vm/engine.h vm/engine_switch.c vm/fail.c vm/fail.h vm/loader.c vm/loader.h vm/main.c vm/memory.h vm/memory_marksweepgc.c (your GC) vm/opcode.h
The README
file should contain a short
description of your solution and its peculiarities, as well as
the name of all its authors.
Except for vm/memory_marksweepgc.c
, the C files
should be identical to the ones you received initially. If
they are not, please explain why in the README
.
Compiling all those C files together should produce a working,
garbage-collected VM.
To help you check the contents of the archive, we provide you with a Scala script called SubCheck3.scala. When run with your archive as argument, this script should not report any problem.
The archive containing your solution should be attached to an e-mail sent to michel.schinz+acc3@gmail.com, before 23:59, April 3, 2008.
Requirements summary
To get the maximum number of points, your solution must at least satisfy all of the following constraints:
-
When run on the IN machines, your VM must be able to compute
the 30th element of the Fibonacci series using
the naïve, recursive algorithm, and a heap size of 50'000
bytes (i.e. with argument
-m 50000
). - When run on the IN machines, your VM must be able to successfully execute the bignums.asm test file we provide with the default heap size (1'000'000 bytes).
- The technique used to determine whether pointers designate allocated heap blocks must be at least as efficient as the bitmap-based one presented above.
- Your garbage collector must not allocate additional memory to store its internal data structures - e.g. the bitmap.
- The sweep phase must coalesce adjacent blocks.
- Your code must be submitted before the deadline, in the format specified above.
Solution
The file memory_marksweepgc.o
contains a working GC for minivm. This object file is compiled
for the IN machines.