Structured variables (arrays, records, objects)
may contain internal pointers. These must be
known to the GC so that it can traverse the
graph. Hence, the compiler must communicate to
the GC the type of each dynamic object
and the internal structure of each type.
Finding the beginning of objects:
What happens if the only pointer to an object
points somewhere in the middle of the object?
We must either be able to find the beginning
of the object, or make sure the compiler does
not generate such code.
The internal structure of activation records
& structured variables is described by
run-time templates.
Every run-time object has an extra word
that points to a type descriptor (or Temaplate), a
structure describing which words in the
object are pointers. This map is constructed
at compile-time and stored statically in the
data segment of the executable.
When the GC is invoked, registers may also contain
valid pointers. The compiler must therefore also
generate (for every point where the GC may be called)
a pointer map that describes which registers
hold live pointers at this point. For this reason,
we usually only allow the GC to run at certain points,
often the points where new is called.
We must also provide pointer maps for every function
call point. A function may call which calls
new which invokes the GC. We need to know
which words in 's activation record that at
this point contain live pointers.
How does the GC look up which pointer map belongs to
a particular call to procedure at a particular address ?
The pointer maps are indexed by the return address of !
So, to traverse the stack of activation records, the GC
looks at each frame, extracts the return address, finds the
pointer map for that address, and extracts each pointer
according to the map.
The basic idea behind Mark-and-Sweep
is to traverse and mark all the cells that
can be reached from the root cells.
A root cell is any pointer on the stack
or in global memory which points to objects
on the heap.
Once all the live cells (those which are pointed
to by a global variable or some other live cells)
have been marked, we scan through the heap and
separate the live data from the garbage.
If we are dealing with equal size
objects only (this is the case in
LISP, for example) the we scan
the heap and link all the unmarked
objects onto the free list. At the
same time we can unmark the live cells.
If we have cells of different sizes,
just linking the freed objects together
may result in heap fragmentation.
Instead we need to compact the heap,
by collecting live cells together
in a contiguous memory area on the
heap and doing the same with the garbage
cells in another area.
A straight-forward implementation of mark and sweep
may run into memory problems itself! A depth-first-search
makes use of a stack, and the size of the stack
will be the same as the depth of the object graph.
Remember that
the stack and the heap share the same memory space,
and may even grow towards eachother.
So, if we're out of luck we might run into this
situation:
the heap is full (otherwise
we wouldn't be gc:ing!),
the object graph is deep,
we run out of stack space during the
marking phase.
We're now out of memory alltogether.
Difficult to recover from!
Fortunately, there is a smart algorithm for marking
in constant space, called the Deutsch-Schorr-Waite
algorithm. Actually, it was developed simultaneously
by Peter Deutsch and by Herbert Schorr and W. M. Waite.
The basic idea is to store the DFS stack in the
object graph itself. When a new node (object)
is encountered
we set the ``marked''-bit to 1,
the node (object) is made to point to the
previous node,
two global variables current
and previous are updated.
current points to the current cell, previous
to the previously visited cell.
Use pointer reversal to encode the DFS stack
in the object graph itself.
When the DFS reaches a new cell, change a pointer
in the cell to point back to the DFS parent cell.
When we can go no deeper, return, following
the back links, restoring the links.
Align the templates on a word-boundary, then the
two lowest bit of the template-pointers will always
be zero -- use these bits for markers!
We're somehow going to have to store which pointer to
deal with next during pointer reversal. We could use
the same tricks as above, set the low order bits
of pointers we've dealt with to 1, but then we need to
search through the entire object every time for the
next pointer to deal with -- expensive.
Note that, just like in the malloc implementation,
the pointer to an object points right after the
header. So, the header is at address ptr-8, and
the data in each object is at ptr, ptr+4, etc.