| Doc. No.: | WG21/N2261 J16/07-0121 | |
|---|---|---|
| Date: | 2007-05-04 | |
| Reply to: | Hans-J. Boehm | Mike Spertus | 
| Phone: | +1-650-857-3406 | |
| Email: | Hans.Boehm@hp.com | mike_spertus@symantec.com | 
We describe a possible finalization facility that could be added to C++0x, either as part of the language or by a later TR. We believe that garbage collection in C++0x is useful with or without such a facility, and that such a facility is rarely needed. However, several issues have been raised that are difficult to address without finalization.
Here we outline those issues and the solution space.
Common cases in which explicit destruction is needed, but timing is unimportant, include:
class A {
  vector<B> v;
};
Suppose A is garbage collected and B is not. When an object of type A is garbage collected, unless a finalizer is run, all the (non-garbage collected) memory allocated by the vector A::v will be leaked. However, a finalizer can invoke A::~A() to clean up the vector.
Dynamically, such objects are relatively easy to detect: It is not difficult to add code to report not-yet-deallocated instances at process shutdown. This requires no real language extensions beyond perhaps the addition of a simple library class. But it reports such leaks only at process termination, and it does not easily accommodate static data structures whose resources are only intended to be reclaimed by the operating system at process termination.
It would clearly be preferable to detect such objects when they become unreachable without having been deallocated, for all the same reasons that such tools are commonly preferred for detecting memory leaks in non-garbage-collected programs. Finalization provides precisely the required facility, though one could invent more convenient syntax. Typically an object A requiring explicit deletion would be made to point to a "finalizable" object B. As destructor would explicitly destroy B. If B is not explicitly destroyed, it will be notified that it has become unreachable by calling its finalizer, which could then report the error.
(One might instead directly make A finalizable. This is probably less convenient in practice, since we cannot simply add a field of a library-defined class. It also raises issues related to finalization cycles that are probably best avoided.)
In this case it again seems undesirable to revert to explicit deletion in order to deal with an external resource, namely the remote reference. It may be possible to introduce remote references into arbitrary user data structures. Hence it is not obvious that remote references can be explicitly deleted with much less effort than would be required to explicitly delete all the user's data structures.
However one major issue remains for all of these. Traditional finalization in garbage collected languages is defined in terms of object reachability. But objects can easily become "unreachable" earlier than the programmer expects, due to compiler optimizations. An object may be logically in use long past the last point at which the memory associated with the object is accessed or referenced by any pointers visible to the garbage collector.
In the worst case, the compiler might decide to store all the fields of an object in registers, and then observe that the "this" pointer for the newly allocated memory is dead immediately after the allocation, and hence not save it anywhere. The garbage collector, even if run immediately after the allocation, would then discover that the memory object is no longer reachable, and run the finalizer immediately, in spite of the fact that the object fields are still in use, and in fact may remain in use after the finalizer has run.
A more typical case is the one in which we have a finalizable object F that relies on some external state vector E which is cleaned up by its finalizer. Each object instance includes a data member i that is used to locate the appropriate entry of E. A typical member function m might behave like:
void m()
{
   my_index = i;
  l:
   perform operation on E[my_index];
}
The finalizer for F
cleans up and invalidates E[i].
Consider what happens when m() is the last call on F, and the garbage collector is triggered at the label l. Although the external state E[i] is still accessible, none of the object fields are still needed at this point. As a result, the object pointer (the "this" pointer) may itself be dead, in spite of the fact that one of the object's methods is currently still running.
A possible result is that the collector enqueues F for finalization, and the finalizer runs, all before the call to m(), and the operation on E[my_index] completes. When it finally does complete, E[my_index] has been invalidated by the finalizer, and m() sees invalid data.
This kind of failure is rare, but not unheard of, in practice. It is particularly rare on 32-bit X86 hardware, since the small number of registers tends to force an object's this pointer onto the memory stack, where it is unlikely to be overwritten. Hence it is likely, but not guaranteed, that it will be visible to the garbage collector while an object's methods are executing, whether or not the this pointer is actually still live.
We have seen a lot of finalizer code that is incorrect for this reason, but are only aware of one actual failure. It may be telling that this failure was observed very shortly after a talk on the subject. The issue may just be too obscure for failures to be reported correctly.
We believe this solution is worth considering, particularly since C++ already requires that the lifetimes of variables with nontrivial destructors (e.g. smart pointers) be similarly extended. We are however concerned that this is too large and pervasive a cost for what should be a very rarely used feature. Java chose not to follow this route for that reason. However, we are not aware of empirical evaluations of its cost.
Expert intuitions seem to be that the cost should be low, except possibly on architectures with small architected register sets, such as 32-bit X86.
A more interesting, though untested, alternative is the following. Assume that finalizable objects must inherit from a class finalizable. We say that a pointer is finalizer essential if its static target type inherits form class finalizable. We then insist that no finalizable object be finalized before the end of the last lifetime of a finalizer essential pointer to it.
The crucial observation behind this is that any access to an external resource (e.g. E in the example) requires a pointer to a class that refers to the external resource. A pointer to a less derived class is insufficient. So long as the finalizer is introduced at the same stage in the class derivation process that introduces the external resource, this will also be a finalizer essential pointer. And the finalizer must generally be introduced at the same point as the external resource to ensure proper reclamation of the resource.
The reachability condition here is enforceable by generating code that keeps a pointer p visible to the garbage collector, so long as
Note that this approach works correctly whether or not the external resource is introduced by putting it in a separate finalizable leaf object introduced for the purpose, though not in this form if the external resource index is copied to both objects.
We pursue a similar approach, but further observe that, once these explicit "liveness" declarations are required, we no longer need to define finalization in terms of reachability at all. Such an object x becomes eligible for finalization once there can be no further calls to its delay_finalization() function. More precisely, this happens when there is no longer any way to extend the current execution, such that both
Note that although this criterion doesn't mention "reachability" at all, in fact the only circumstance under which a production runtime will normally be able to determine that further calls to x.delay_finalization() are impossible, is when x is unreachable. A typical implementation of delay_finalization() would simply ensure that the this pointer is visible to the garbage collector at the point of the call, i.e. after all prior memory accesses. An implementation as an opaque no-op member function would be sufficient but not optimal.
The characteristics of this approach are:
There are two common ways to expose a finalization interface:
Arguably, the reverse emulation is also possible, with some added overhead.
We somewhat arbitrarily adhere to our original proposal, which chose the first style. The second style may be a bit easier for programmers to think about, and is also worth considering.
As defined, our facility does not allow the construction of "weak hash maps" which allow keys to be reclaimed when they are otherwise inaccessible,since that would require that we allow finalization on arbitrary objects, which is inconsistent with either of the last two approaches from the preceding section.
Weak hash maps are also a useful facility. But we believe that it can be designed separately in such a way that clean-up of keys is transparent to the application. User code cannot tell whether a key has been reclaimed. Thus the problems described below are not encountered.
In this proposal, finalization-capable objects inherit for a class std::finalizable:
class finalizable {
    public:
	virtual void finalize() = 0;
	delay_finalization();
	virtual ~finalizable(); // Disables the object for finalization
}
This makes it rather intrusive to add finalization to an existing class. However, as we pointed out above, the same thing can generally be accomplished by instead adding a pointer to a small finalizable class to the original class, with only the small newly introduced class inheriting from class finalizable.
Neither solutions 1 nor 3 from above require that all finalizable objects inherit from a fixed class, but solution 2 does.
A finalization-capable object can be enabled for finalization by a library call, as described below.
Once an object is enabled for finalization, we require the programmer to specify whenever a function has finished accessing external object state that might be invalidated by finalization. This is done by invoking its delay_finalization() function. (Again this assumes solution 3 above.)
As with other finalization systems, there is no guarantee that once an object is eligible for finalization, it will actually be finalized. In our case, it is more apparent than usual why this is the case: It is blatantly undecidable whether an object is eligible for finalization.
Finalization-capable objects need to be enabled for finalization, i.e. registered for cleanup actions, using the function register_for_finalization():
class finalization_queue {
public:
        int finalize_all();
...
}
void
register_for_finalization(std::finalizable *obj,
        std::finalization_queue &q = std::system_finalization_queue>);
void
unregister_for_finalization(std::finalizable *obj);
	// implicitly invokes obj->delay_finalization();
When an object passed to register for finalization becomes eligible for
finalization, it may be pushed onto the back of the supplied
std::finalization queue.
The client may later finalize all the elements on the queue with
q.finalize_all(),
which returns the number of elements actually finalized.
The default system finalization queue periodically calls its
finalize_all()
method once immediately after the return from main() and,
if threads are sup ported, periodically from a thread holding
no user-visible locks.
It is safe to make multiple concurrent calls to finalize_all().
If an object that is already registered for finalization is registered a second time, the resulting behavior is undefined except that an object that has already been enqueued may be re-registered for finalization.
If multiple subobjects corresponding to the same allocated section of memory are registered for finalization, the finalizers are invoked in reverse order of registration.
Some subtle properties of this proposal:
For sufficiently deterministic programs, it should be possible to provide an alternate runtime that checkpoints the program state at delay_finalization() calls. Once an object is found to be eligible for finalization, we could then roll back execution to the last delay_finalization() call on that object, effectively running the finalizer at the earliest possible time.
We have no experience with such an implementation, and it is hard to evaluate how practical it would be. But it does have the potential of identifying incorrect finalizer uses during testing, at least if the finalizer is used in a library that can be called from a deterministic test harness. Thus in addition to greatly clarifying the rules for finalizer use, we have a reasonable chance of being able to test for violations.