The goal of this second ungraded assignment is to write a threaded version of the virtual machine engine, in order to improve its performance, and then to quantify that improvement. To start this assignment, you can either use your own solution to the first assignment, or our version of

Part 1: writing the threaded engine

Writing the threaded engine amounts to creating a new version of the engine module, called for example engine_threaded.c. Most of the code for that module can be taken directly from the current engine module, engine_switch.c. The two parts that must be adapted are the representation of instructions and the dispatching code.

In the current, switch-based engine, instructions are directly represented by their opcode: all structures that describe the various forms of instructions (instr_t, instr_r_t, instr_rr_t, etc.) start with a field of type opcode_t. In a threaded interpreter, however, instructions are represented by the address of the code that implements them. All structures have to be adapted accordingly.
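As a rough illustration of this change, the sketch below shows what the adapted structures might look like. The type name label_t and the field names (label, r1, r2) are assumptions made for this example; the actual fields in engine_switch.c may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: the address of the code implementing an instruction.
 * With gcc's labels-as-values extension, label addresses have type void *. */
typedef void *label_t;

/* Each instruction form now starts with a label instead of an opcode_t.
 * The operand fields below are assumed for illustration only. */
typedef struct { label_t label; } instr_t;
typedef struct { label_t label; uint8_t r1; } instr_r_t;
typedef struct { label_t label; uint8_t r1, r2; } instr_rr_t;

/* The label must stay first so that any instruction can be dispatched
 * through a pointer to its first field, whatever its form. */
_Static_assert(offsetof(instr_r_t, label) == 0, "label must come first");
_Static_assert(offsetof(instr_rr_t, label) == 0, "label must come first");
```

Keeping the label as the first field mirrors the role the opcode field plays in the switch-based engine: the dispatching code can read it without knowing which form of instruction it is looking at.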

Once this is done, the function that actually runs the code, engine_run, must be adapted. As explained in the course, a threaded interpreter does not have a dispatching loop. Instead, the dispatching code is duplicated at the end of every piece of code that implements an instruction. With gcc's labels-as-values extension, this dispatching code consists of a simple computed goto.
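To make the idea concrete, here is a minimal sketch of threaded dispatch on a toy machine that has nothing to do with minivm: the function threaded_sum, its four instructions and the DISPATCH macro are all invented for this example. It requires gcc (or a compatible compiler such as clang), since it relies on the labels-as-values extension.

```c
/* Toy threaded interpreter: computes n + (n-1) + ... + 1.
 * All names here are hypothetical and unrelated to the minivm engine. */
long threaded_sum(long n) {
    /* Threaded code: each cell holds the address of the code to run. */
    void *program[] = { &&l_add, &&l_dec, &&l_jgt, &&l_halt };
    void **pc = program;
    long acc = 0, counter = n;

    /* The dispatching code: fetch the next label and jump to it.
     * It is repeated after every instruction instead of living in a loop. */
    #define DISPATCH() goto *(*pc++)

    DISPATCH();

l_add:                      /* acc <- acc + counter */
    acc += counter;
    DISPATCH();
l_dec:                      /* counter <- counter - 1 */
    counter -= 1;
    DISPATCH();
l_jgt:                      /* jump to the start while counter > 0 */
    if (counter > 0)
        pc = program;
    DISPATCH();
l_halt:
    return acc;

    #undef DISPATCH
}
```

Calling threaded_sum(4) executes the add/dec/jgt sequence four times and returns 10. Note that there is no switch statement and no enclosing loop: control flows from one instruction body directly to the next.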

A minor problem that you will encounter is that labels are local to the function that contains them. Labels in engine_run are therefore not directly visible from the functions that emit code (engine_emit, engine_emit_r, etc.). One solution is to store these labels in a global array, initialised when the engine itself is initialised (i.e. when engine_setup is called). This initialisation can be done by calling engine_run in a special mode, in which it copies the addresses of its local labels to the global array and returns.
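The following sketch shows one way this could be arranged, on a toy two-instruction machine. The names demo_setup, demo_emit and demo_run stand in for engine_setup, engine_emit and engine_run, and their signatures, the opcodes and the fixed-size code buffer are all assumptions made for the example. The gcc documentation warns that label addresses may differ if the containing function is inlined or cloned, hence the attributes on run.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { OP_INC, OP_HALT, OP_COUNT } opcode_t; /* hypothetical opcodes */
typedef void *label_t;

#if defined(__GNUC__) && !defined(__clang__)
#define KEEP_LABELS_STABLE __attribute__((noinline, noclone))
#else
#define KEEP_LABELS_STABLE __attribute__((noinline))
#endif

static label_t labels[OP_COUNT];  /* global table, indexed by opcode */
static label_t code[16];          /* toy code memory (assumed size) */
static size_t code_len = 0;

/* Runs the code, or, in setup mode, only publishes the label addresses. */
KEEP_LABELS_STABLE static long run(int setup_only) {
    if (setup_only) {
        labels[OP_INC] = &&l_inc;
        labels[OP_HALT] = &&l_halt;
        return 0;
    }
    label_t *pc = code;
    long acc = 0;
    goto *(*pc++);
l_inc:
    acc++;
    goto *(*pc++);
l_halt:
    return acc;
}

void demo_setup(void) { run(1); }

/* Emission translates the opcode to a label through the global table. */
void demo_emit(opcode_t op) {
    assert(code_len < sizeof code / sizeof code[0]);
    code[code_len++] = labels[op];
}

long demo_run(void) { return run(0); }
```

With this arrangement, the emitting functions never need to see the labels themselves; after demo_setup has been called, emitting OP_INC three times followed by OP_HALT and then calling demo_run returns 3.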

Part 2: measuring performance

Once you have written and tested the code, the next step is to measure the performance gain offered by the threaded interpreter, if any. This is relatively tricky, however, because the only way to loop in minischeme is through a recursive function call, which invariably leads to memory exhaustion, as the VM doesn't free memory yet! We suggest that you either write your test program directly in minivm assembly, or that you first compile a minischeme program and then modify it by hand to make sure it runs long enough without exhausting memory.
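If you prefer to time the engines from C rather than from the shell, a small harness along the following lines may help. It is only a sketch: best_cpu_seconds and busy_work are hypothetical names, and busy_work is a placeholder for whatever function runs your test program on one of the two engines. It uses the standard clock function, which measures processor time.

```c
#include <time.h>

typedef void (*workload_fn)(void);

/* Runs the workload `repeats` times and returns the smallest CPU time
 * observed, in seconds, or -1.0 if repeats is not positive. Taking the
 * minimum reduces the influence of other activity on the machine. */
double best_cpu_seconds(workload_fn run, int repeats) {
    double best = -1.0;
    for (int i = 0; i < repeats; i++) {
        clock_t start = clock();
        run();
        clock_t stop = clock();
        double elapsed = (double)(stop - start) / CLOCKS_PER_SEC;
        if (best < 0.0 || elapsed < best)
            best = elapsed;
    }
    return best;
}

/* Placeholder workload; replace with a call that runs the VM. */
static volatile unsigned long sink;
void busy_work(void) {
    for (unsigned long i = 0; i < 1000000UL; i++)
        sink += i;
}
```

Timing the same test program once with each engine, for example with best_cpu_seconds(busy_work, 5) adapted to each, gives two numbers whose ratio is the speed-up. Make sure the program runs for long enough that the measured times are well above the resolution of the clock.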