This article offers an explanation of a useful C++ template idiom, which is then put to work in an implementation of typed object variables.
Although templates have been part of C++ for some time, the last year or so has seen a surge of interest in the potential of this feature of the language. This is partly due to the increased availability of reliable (well, reasonably reliable) implementations in popular compilers, and partly to the influence on the ANSI/ISO C++ draft standard of Stepanov and Lee's Standard Template Library. Because of the extensive use of templates in the proposed standard library for C++, we are all going to become much more familiar with using them. The library's idioms will spawn imitation and enhancement, and I predict that in a couple of years we will be spending more time talking templates than inheritance in the C++ community.
Currently, the most familiar use of template classes in C++ is to provide parameterised, type-safe collections. This idiom has clear advantages over the cumbersome inheritance-based or pointer-to-void collection classes provided by compiler vendors whose tools do not support templates. The purpose of this article, however, is to explore another template idiom, providing a class which allows us to wrap an instance of another class, maintaining access to that class's members, but adding functionality in a way that is transparent to the wrapped class. By overloading the dereferencing operator -> for the template class, we can use instances of specialisations of the template class as pointers to the contained class, while keeping control of access, lifetime and ownership. We'll look first at building a simple reference counting mechanism, before continuing to an implementation of typed object reference variables. These manage references and lifetimes of their objects, automatically deleting them when they are no longer referenced.
The key to this idiom is the overload of the dereferencing operator. By wrapping an instance of a class in a template class, and returning the address of the wrapped instance from the dereferencing operator, we can still access any of the wrapped class's members:
template <class T>
class Wrapper {
public:
    T* operator->() { return &myT; }
private:
    T myT;
};
int main() {
    Wrapper<Thing> wThing;
    wThing->Foo(); // calls Thing::Foo()
    ...
}
C++ defines special treatment for the dereferencing operator. When applied
to an object it must be followed by a member (variable or function) name.
The operand is effectively replaced in the expression by its result. If
the operator returns an object or reference which also supplies an overloaded
dereferencing operator, this too is applied, and so on until a pointer
is returned. The name of the member is then used against this to call the
requested member function or access the required member data. If the final
class which results from this process has no member of the given name,
the compiler displays an error message.
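The chaining rule can be seen in a small sketch. The names Inner, Outer and CallThroughChain, and the stub class Thing, are invented for this illustration and are not part of the article's code.

```cpp
#include <cassert>

// Hypothetical wrapped class, for illustration only.
struct Thing {
    int Foo() const { return 42; }
};

// Inner wrapper: operator-> returns a raw pointer, which ends the chain.
template <class T>
class Inner {
public:
    T* operator->() { return &myT; }
private:
    T myT;
};

// Outer wrapper: operator-> returns an object (by reference) that itself
// overloads operator->, so the compiler applies that operator as well.
template <class T>
class Outer {
public:
    Inner<T>& operator->() { return myInner; }
private:
    Inner<T> myInner;
};

// Calls Thing::Foo() through two levels of overloaded operator->.
int CallThroughChain() {
    Outer<Thing> o;
    // Outer::operator->, then Inner::operator->, then Thing::Foo()
    return o->Foo();
}
```

Writing o->Bar() instead would fail to compile, because the final class in the chain, Thing, has no member called Bar.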
As it stands, this example doesn't add anything to our class: it simply forwards calls to its contained instance. Before going on, consider what assumptions are being made for the template parameter class T. Clearly, any class wrapped by this template must have a default constructor. If it doesn't, the compiler will complain when we try to instantiate the template with the class. This would appear to restrict us somewhat, as it is not always the case that an interesting class has a default constructor. We certainly don't want to provide our template class with a constructor for each possible parameter class.
There are two ways around the problem. Firstly we could define a constructor for the template class which takes as an argument a const reference to an instance of the parameter class. Here is our new template class:
template <class T>
class Wrapper1 {
public:
    Wrapper1(const T& rT) : myT(rT) {}
    T* operator->() { return &myT; }
private:
    T myT;
};
int main() {
    Wrapper1<Thing> wThing(Thing(10,20));
    ...
}
This makes it possible to initialise the contained instance with a temporary
object created just for the purposes of this initialisation. This works,
but is inefficient: we have created two instances of the parameter class,
not just one. Moreover, this idiom in turn relies on the parameter class
having a copy constructor, either generated by the compiler or defined
by the class implementer. A class may have chosen to prevent copying of
its instances by declaring the copy constructor private: such a class cannot
be wrapped by this template. On the plus side, we have the advantage of
being able to initialise the template class from an already-existing instance
of the wrapped class. It doesn't have to be a temporary object created
for the purpose.
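To make the cost concrete, the following sketch counts how many instances one such initialisation creates. The counters on Thing and the function InstancesCreatedByOneWrap are assumptions added for the demonstration; Wrapper1 is as given above.

```cpp
#include <cassert>

// Hypothetical class that counts how its instances are created.
struct Thing {
    static int constructed;  // built from arguments
    static int copied;       // built by the copy constructor
    Thing(int a, int b) : x(a), y(b) { ++constructed; }
    Thing(const Thing& other) : x(other.x), y(other.y) { ++copied; }
    int x, y;
};
int Thing::constructed = 0;
int Thing::copied = 0;

template <class T>
class Wrapper1 {
public:
    Wrapper1(const T& rT) : myT(rT) {}
    T* operator->() { return &myT; }
private:
    T myT;
};

// Returns the total number of Thing instances created by one wrap.
int InstancesCreatedByOneWrap() {
    Thing::constructed = 0;
    Thing::copied = 0;
    Wrapper1<Thing> wThing(Thing(10, 20));  // temporary, then a copy into myT
    assert(wThing->x == 10 && wThing->y == 20);
    return Thing::constructed + Thing::copied;
}
```

The temporary is bound to the const reference parameter, so the compiler cannot build it directly in place: one instance is constructed from the arguments and a second is copied from it.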
A second alternative is to store a pointer to the wrapped class in the template class, rather than a whole instance:
template <class T>
class Wrapper2 {
public:
    Wrapper2(T* pT) : my_pT(pT) {}
    T* operator->() { return my_pT; }
private:
    T* my_pT;
};
int main() {
    Thing* pThing = new Thing(10,20);
    Wrapper2<Thing> wThing(pThing);
    ...
    delete pThing;
    // DANGER: wThing now invalid!
}
Now only one instance of the wrapped class is created. We preserve the
advantage from the previous version of being able to wrap an existing instance,
and can use the template class to wrap a class whose copy constructor is
not accessible. But now we need to manage the pointer which we pass to
the template constructor. If all we want is a wrapper for a single instance
in a defined lifetime, having to delete the instance separately is cumbersome
and potentially error-prone. We might try to create the new instance as
a temporary pointer in the constructor call for the template class, and
have the template delete the pointer itself in its destructor:
template <class T>
class Wrapper3 {
public:
    Wrapper3(T* pT) : my_pT(pT) {}
    ~Wrapper3() { delete my_pT; }
    T* operator->() { return my_pT; }
private:
    T* my_pT;
};
int main() {
    Wrapper3<Thing> wThing(new Thing(10,20));
    ...
}
This means that the ownership of the pointer is held by the template
class instance, so if we pass a pointer to an already-existing instance,
we have to be sure not to delete it. Moreover, we had better not pass in
the address of a stack or static instance of the wrapped class:
...
Thing danger(5,10);
Wrapper3<Thing> wThing(&danger);
...
When wThing goes out of scope and its destructor is called, an attempt will be made to delete memory held on the stack. This is always bad news: the behaviour is undefined, and in practice the heap or adjacent data on the stack is likely to be corrupted. Using this solution requires some care and discipline, but we will see that it can be a useful tactic.
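Used correctly, with a pointer obtained from new, the wrapper's lifetime determines the lifetime of the wrapped object. The sketch below illustrates this; the live-instance counter on Thing and the function LiveThingsInsideScope are hypothetical additions for the purpose.

```cpp
#include <cassert>

// Hypothetical class that tracks how many instances are alive.
struct Thing {
    static int alive;
    Thing(int, int) { ++alive; }
    ~Thing() { --alive; }
};
int Thing::alive = 0;

template <class T>
class Wrapper3 {
public:
    Wrapper3(T* pT) : my_pT(pT) {}
    ~Wrapper3() { delete my_pT; }  // the wrapper owns the pointer
    T* operator->() { return my_pT; }
private:
    T* my_pT;
};

// Reports the number of live Things while the wrapper is in scope.
int LiveThingsInsideScope() {
    Wrapper3<Thing> wThing(new Thing(10, 20));
    return Thing::alive;  // value is read before wThing is destroyed
}
```

After the function returns, Thing::alive is back to zero: the wrapper's destructor has deleted the object without any explicit delete in the calling code.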
Often, we need to manage objects on the basis of the number of references to an instance. We can use this template idiom to provide a wrapper for any class which we want to be able to use in this way. To keep things simple, the counted class provides facilities for manual increment and decrement of references: we will look at a way of automating this later. Here is the class: for now, we assume that a default constructor is available for parameter classes.
template <class T>
class Counted {
public:
    Counted() : Count(0) {}
    ~Counted() { ASSERT(Count == 0); }
    unsigned GetRef() { return ++Count; }
    unsigned FreeRef() { ASSERT(Count > 0); return --Count; }
    T* operator->() { return &myT; }
private:
    T myT;
    unsigned Count;
};
A couple of ASSERTs check that no references exist for the object on deletion,
and that decrementing the count is only done when the count is greater
than zero. Both members which update the reference count return the new
value of the count.
There is an inconsistency between member functions defined on the template itself, and those we can access on the wrapped instance by virtue of the overloaded dereferencing operator. Given an instance of the template class, we must mix the dot and arrow notations depending on what is being called:
Counted<Thing> t;
t.GetRef(); // Call Counted<Thing>::GetRef()
...
t->Foo(); // Call Thing::Foo()
...
t.FreeRef();
Another drawback arises because the overloaded dereferencing operator applies to an object, not a pointer. If we decided to pass around pointers to Counted<Thing>, calls to the wrapped instance would need to dereference the pointer first. This is inconvenient.
void TryIt(Counted<Thing>* p) {
    p->GetRef(); // Call Counted<Thing>::GetRef()
    (*p)->Foo(); // Call Thing::Foo()
    p->Foo(); // NB Won't work!
    p->FreeRef();
}
However, we can hide much of this behind a second template class to manage
automatic reference counting and object deletion. Instances of this class
act like object variables in Smalltalk and similar languages, in which
a variable is effectively always a reference of some sort (in C++ terms,
a reference or pointer), and never a 'whole' object on the stack or in
static storage. This restriction on the way objects can be created and
used has advantages over C++'s position of complete compatibility with
C. In particular, assignment of one object variable to another simply means
sharing a reference, not copying an object: the complication of copy constructors
and assignment operators can be ignored. Among more recent languages to
adopt this model rather than a 'declare-anywhere' usage are Borland's Delphi
and Java, a rational C++ from Sun Microsystems.
We'll start with a variation on the reference-counted template, using a pointer to the embedded instance of the parameter class, initialised from a pointer passed to the constructor. An assertion in the body of the constructor offers some protection against invalid initialisation.
template <class T> class ObjVar;
template <class T>
class Counted
{
    friend class ObjVar<T>;
private:
    // Initialisers are listed in member declaration order.
    Counted(T* pT) : my_pT(pT), Count(0)
        { ASSERT(pT != 0); }
    ~Counted() { ASSERT(Count == 0); delete my_pT; }
    unsigned GetRef() { return ++Count; }
    unsigned FreeRef() { ASSERT(Count!=0); return --Count; }
    T* const my_pT;
    unsigned Count;
};
Now that this class has become, effectively, a helper class, all members
are private, and friendship is granted to the class which will finally
implement the object variable itself. This protection could also be managed
by declaring the template class Counted within the private section of the
declaration of ObjVar. However, current compilers differ in their ability
to cope with nested template classes: the solution using friend also has
the advantage of simplifying the declaration of the ObjVar class itself.
The Counted class no longer overloads the dereferencing operator. The friend declaration to the corresponding ObjVar class means the pointer to the actual object can be retrieved efficiently. Alternatively we could provide an inline accessor member function to return the pointer to the wrapped instance.
We can now define the object variable class to wrap a pointer to an instance of this counted class. We manage sharing of the reference by overloading assignment and providing a copy constructor. Here is the declaration of the class:
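template <class T> class ObjVar;
// Declared ahead of the class so that the friend declaration below
// names this function template rather than a non-template function.
template <class T>
bool operator==(const ObjVar<T>& lhs, const ObjVar<T>& rhs);
template <class T>
class ObjVar
{
public:
    ObjVar();
    ObjVar(T* pT);
    ~ObjVar();
    ObjVar(const ObjVar<T>& rVar);
    ObjVar<T>& operator=(const ObjVar<T>& rVar);
    T* operator->();
    const T* operator->() const;
    friend bool operator== <>(const ObjVar<T>& lhs,
                              const ObjVar<T>& rhs);
    bool Null() const { return m_pCounted == 0; }
    void SetNull() { UnBind(); }
private:
    void UnBind();
    Counted<T>* m_pCounted;
};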
In practice, I would expect all the functions to be inline for performance
reasons. The following discussion of each of the members presents them
out-of-line for ease of reference.
The default constructor creates an ObjVar in which the pointer to the counted object is set to zero, making it in effect the null object. A member function Null() is provided to test for this, and a SetNull() function to lose the binding to the current object.
The one-argument constructor creates (using new) an instance of the counted class with the passed pointer, then increments the reference count on this object. The destructor calls UnBind (described below):
template<class T>
ObjVar<T>::ObjVar()
    : m_pCounted(0) {}
template<class T>
ObjVar<T>::ObjVar(T* pT)
{
    m_pCounted = new Counted<T>(pT);
    m_pCounted->GetRef();
}
template<class T>
ObjVar<T>::~ObjVar()
{
    UnBind();
}
The member function UnBind is called whenever the ObjVar instance loses
a reference to the wrapped counted instance. If in the process the reference
count becomes zero, the wrapped instance can safely be deleted. The ObjVar's
pointer to the counted instance is set to zero to indicate that the current
variable now references the null object.
template<class T>
void ObjVar<T>::UnBind()
{
    if (!Null() && m_pCounted->FreeRef() == 0)
        delete m_pCounted;
    m_pCounted = 0;
}
The overloaded dereferencing operator simply returns the object held in
the currently referenced counted instance. We throw an exception if an
attempt is made to apply the operator to a null object variable.
template<class T>
T* ObjVar<T>::operator->()
{
    if (Null())
        throw NulRefException();
    return m_pCounted->my_pT;
}
template<class T>
const T* ObjVar<T>::operator->() const
{
    if (Null())
        throw NulRefException();
    return m_pCounted->my_pT;
}
Two versions of this operator are provided, one non-const (returning a
pointer to the wrapped class), the other const (returning a pointer to
const T). This allows us to declare and use constant object variables,
through which it is impossible to change the object being wrapped. Note
that this does not mean that an object referenced by a constant ObjVar
cannot change. Just as with raw C++ pointers and references, another object
may hold a non-const ObjVar to the same underlying object.
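The effect of the two overloads can be shown with a cut-down stand-in. Ref below is a hypothetical class that keeps only the pair of dereferencing operators (no counting, no ownership); Thing and ChangeSeenThroughConstRef are likewise invented for the sketch.

```cpp
#include <cassert>

// Hypothetical wrapped class with one const and one non-const member.
struct Thing {
    Thing() : value(0) {}
    int Get() const { return value; }
    void Set(int v) { value = v; }
    int value;
};

// Cut-down stand-in for ObjVar: it only forwards, and does not own.
template <class T>
class Ref {
public:
    explicit Ref(T* p) : m_p(p) {}
    T* operator->() { return m_p; }              // used on non-const Ref
    const T* operator->() const { return m_p; }  // used on const Ref
private:
    T* m_p;
};

// A change made through a non-const variable is visible through a
// const variable that refers to the same object.
int ChangeSeenThroughConstRef() {
    Thing t;
    Ref<Thing> writer(&t);
    const Ref<Thing> reader(&t);
    writer->Set(5);        // fine: yields Thing*
    // reader->Set(6);     // would not compile: yields const Thing*
    return reader->Get();  // const members remain callable
}
```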
Things get more complicated in the copy constructor and assignment operator. Remember that copying and assigning mean sharing the reference, not copying the object. For the copy constructor, we must store the pointer to the counted object from the argument to the constructor, and increment its reference count:
template<class T>
ObjVar<T>::ObjVar(const ObjVar<T>& rVar)
{
    m_pCounted = rVar.m_pCounted;
    if (!Null())
        m_pCounted->GetRef();
}
In the assignment operator, we must detach the current counted object,
decrement its reference count and delete it if the count reaches zero, then
attach the counted object from the argument, and increment its reference count.
As always, we must take care to deal properly with self-assignment. The
issue is dealt with here by incrementing the argument (right-hand
side) reference count before decrementing the count for the current
(left-hand-side) object. In the case of self-assignment, this will leave
the reference count unchanged and consequently prevent deletion of the
counted instance. The argument's pointer is also saved in a local variable
before UnBind is called: UnBind sets m_pCounted to zero, and in a
self-assignment that is the very pointer we are about to copy.
template<class T>
ObjVar<T>& ObjVar<T>::operator=(const ObjVar<T>& rVar)
{
    Counted<T>* pNew = rVar.m_pCounted; // save before UnBind() zeroes it
    if (pNew != 0)
        pNew->GetRef();
    UnBind();
    m_pCounted = pNew;
    return *this;
}
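The pieces described so far can be assembled into a compact, self-contained sketch for experimenting with assignment. It is a simplified variant under stated assumptions, not the definitive class: members are defined inline, assert stands in for both ASSERT and the exception, and RefCount, Thing and the two test functions are hypothetical helpers added only to observe the behaviour.

```cpp
#include <cassert>

template <class T> class ObjVar;

// Helper holding the wrapped pointer and its reference count.
template <class T>
class Counted {
    friend class ObjVar<T>;
    explicit Counted(T* pT) : my_pT(pT), Count(0) { assert(pT != 0); }
    ~Counted() { assert(Count == 0); delete my_pT; }
    unsigned GetRef() { return ++Count; }
    unsigned FreeRef() { assert(Count != 0); return --Count; }
    T* const my_pT;
    unsigned Count;
};

template <class T>
class ObjVar {
public:
    ObjVar() : m_pCounted(0) {}
    explicit ObjVar(T* pT) : m_pCounted(new Counted<T>(pT)) {
        m_pCounted->GetRef();
    }
    ObjVar(const ObjVar& rVar) : m_pCounted(rVar.m_pCounted) {
        if (!Null()) m_pCounted->GetRef();
    }
    ~ObjVar() { UnBind(); }
    ObjVar& operator=(const ObjVar& rVar) {
        Counted<T>* pNew = rVar.m_pCounted;  // saved before UnBind()
        if (pNew != 0) pNew->GetRef();       // increment first...
        UnBind();                            // ...then release the old one
        m_pCounted = pNew;
        return *this;
    }
    T* operator->() { assert(!Null()); return m_pCounted->my_pT; }
    bool Null() const { return m_pCounted == 0; }
    // Hypothetical helper, for observation only.
    unsigned RefCount() const { return Null() ? 0 : m_pCounted->Count; }
private:
    void UnBind() {
        if (!Null() && m_pCounted->FreeRef() == 0) delete m_pCounted;
        m_pCounted = 0;
    }
    Counted<T>* m_pCounted;
};

// Hypothetical wrapped class that tracks live instances.
struct Thing {
    static int alive;
    Thing() { ++alive; }
    ~Thing() { --alive; }
};
int Thing::alive = 0;

// Self-assignment must leave the variable bound, with the count unchanged.
bool SelfAssignmentKeepsObject() {
    ObjVar<Thing> a(new Thing);
    ObjVar<Thing>& alias = a;
    a = alias;
    return !a.Null() && a.RefCount() == 1 && Thing::alive == 1;
}

// Ordinary assignment shares the reference rather than copying the Thing.
unsigned CountAfterAssignment() {
    ObjVar<Thing> a(new Thing);
    ObjVar<Thing> b;
    b = a;
    assert(Thing::alive == 1);
    return a.RefCount();
}
```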
Finally, we declare equality and not-equals operators. Two object variables
are equal only if they wrap the same underlying instance. This implements
a test for object identity, by comparing the addresses of the counted
instances. Since each wrapped object belongs to exactly one counted instance,
this is equivalent to comparing the addresses of the wrapped objects, and it
remains safe when either variable is null. This is exactly what we want for
object variables, but might in some cases cause problems in contexts where the
equality operator is expected to compare values rather than identities (for
example, when instantiating STL container classes). In this case, you may
prefer to dereference the pointers before testing for equality.
template<class T>
bool operator==(const ObjVar<T>& lhs,
                const ObjVar<T>& rhs)
{
    return lhs.m_pCounted == rhs.m_pCounted;
    // or, to compare values of non-null variables,
    // *(lhs.m_pCounted->my_pT) == *(rhs.....)
}
template<class T>
bool operator!=(const ObjVar<T>& lhs,
                const ObjVar<T>& rhs)
{
    return !(lhs == rhs);
}
Note that the inequality operator does not need to be defined as a friend
of the ObjVar class, as it simply negates the result of the test for equality.
With a little ingenuity, it is also possible to implement object variables without an intermediate reference-counted class. While this saves a level of indirection in access to the contained object, it introduces the overhead of an additional pointer in the class. This pointer is used to reference an integer holding the reference count, which means it can be shared between all references to the wrapped object. However, because the only data held in one of our object variable instances is a single pointer to a reference-counted object, passing and returning object variable instances is efficient.
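One possible shape for this alternative is sketched below. It is an assumption about how such a class might look, not the article's implementation: the class name ObjVar2, the member names, the RefCount helper and the test class Thing are all invented here, and assert replaces the exception for brevity.

```cpp
#include <cassert>

// Object variable holding the object pointer and a pointer to a
// heap-allocated count shared by every variable bound to that object.
template <class T>
class ObjVar2 {
public:
    ObjVar2() : m_pT(0), m_pCount(0) {}
    explicit ObjVar2(T* pT) : m_pT(pT), m_pCount(new unsigned(1)) {
        assert(pT != 0);
    }
    ObjVar2(const ObjVar2& r) : m_pT(r.m_pT), m_pCount(r.m_pCount) {
        if (m_pCount) ++*m_pCount;
    }
    ~ObjVar2() { UnBind(); }
    ObjVar2& operator=(const ObjVar2& r) {
        T* pT = r.m_pT;                 // saved before UnBind() clears them
        unsigned* pCount = r.m_pCount;
        if (pCount) ++*pCount;          // increment first: self-assignment safe
        UnBind();
        m_pT = pT;
        m_pCount = pCount;
        return *this;
    }
    T* operator->() { assert(m_pT != 0); return m_pT; }  // one indirection
    bool Null() const { return m_pT == 0; }
    // Hypothetical helper, for observation only.
    unsigned RefCount() const { return m_pCount ? *m_pCount : 0; }
private:
    void UnBind() {
        if (m_pCount && --*m_pCount == 0) {
            delete m_pT;
            delete m_pCount;
        }
        m_pT = 0;
        m_pCount = 0;
    }
    T* m_pT;             // the wrapped object
    unsigned* m_pCount;  // the extra pointer: shared reference count
};

// Hypothetical wrapped class that tracks live instances.
struct Thing {
    static int alive;
    explicit Thing(int v) : value(v) { ++alive; }
    ~Thing() { --alive; }
    int value;
};
int Thing::alive = 0;

// Three variables end up sharing one Thing and one count.
unsigned CountAfterSharing() {
    ObjVar2<Thing> a(new Thing(7));
    ObjVar2<Thing> b(a);
    ObjVar2<Thing> c;
    c = b;
    assert(c->value == 7);
    return a.RefCount();
}
```

The trade-off is visible in the data members: dereferencing reads m_pT directly, but each variable now carries two pointers instead of one.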
These object variables are straightforward to use, and can take the place of ordinary pointers or references to objects in many places. Let's assume we're building a simulation which allows a computer user to explore a virtual world. The user can navigate through a number of virtual locations, interacting with objects and other users in the current location. This could be part of a system for designing buildings, or simply an interactive game. A class Agent represents a user, and is always associated with a current location. An instance of Agent is created in an initial location, so the Agent constructor takes an ObjVar for the Location class.
class Agent {
public:
    Agent(ObjVar<Location> InitialLoc)
        : CurrentLoc(InitialLoc) {}
    void Enter(ObjVar<Location> NewLoc)
        { CurrentLoc = NewLoc; }
    ObjVar<Location> GetCurrentLoc()
        { return CurrentLoc; }
private:
    ObjVar<Location> CurrentLoc;
};
Don't forget, while it looks as if whole objects are being passed and assigned
in these Agent member functions, the actual instances of Location are held
as pointers and wrapped by the ObjVar template. We get the convenience
of direct use of variables without having to worry about null pointers
being passed or assigned.
The requirements for a Location are somewhat more complex. A Location needs to keep track of all Agents currently present. These will need to be kept on a list of some sort, and the natural solution would usually be to instantiate a templated list class with pointers to the objects in question. Putting pointers to objects in lists can cause problems with ownership: if these objects are contained in more than one list, which is responsible for deleting these objects? The issue with a list of ObjVars is clearer: once an ObjVar is removed from such a list the ObjVar itself is automatically deleted: only if the reference count of the wrapped object consequently becomes zero is the object itself reclaimed. Assuming some standard list accessors on the list template class, here is part of Location's implementation:
class Location {
public:
    void AgentEnters(ObjVar<Agent> A)
        { Agents.Add(A); }
    void AgentLeaves(ObjVar<Agent> A)
        { Agents.Remove(A); }
private:
    List<ObjVar<Agent> > Agents;
};
How do we add and remove agents from this world? We'll make this the responsibility
of an object representing the world as a whole. Here's part of the declaration
of a World class:
class World {
public:
    ...
    ObjVar<Agent> CreateAgent();
    void RemoveAgent(ObjVar<Agent> A);
    ...
private:
    List<ObjVar<Location> > Locations;
    List<ObjVar<Agent> > Agents;
    ObjVar<Location> InitialLoc;
};
Creating an agent involves declaring an instance of the Agent object variable
type initialised with a new agent and adding her to the world's list of
Agents. The world must remember an initial location in which to create
agents.
ObjVar<Agent> World::CreateAgent()
{
    // NB dot syntax!
    ASSERT(!InitialLoc.Null());
    ObjVar<Agent> A(new Agent(InitialLoc));
    InitialLoc->AgentEnters(A);
    Agents.Add(A);
    return A;
}
Removing an agent is just as simple, but raises issues of visibility. It
is possible that in his travels the Agent has entered into relationships
with other objects. Ideally, we deal with this by managing at some level
each and every association from the agent's point of view. This could be
done using a notification mechanism, or might involve the individual agent
remembering each of these associations. Conversely we can rely on the object
variable itself to defer actual deletion of the Agent object until no more
references exist. This implies that an Agent should enter a 'zombie' state,
which can be checked on access.
void World::RemoveAgent(ObjVar<Agent> A)
{
    Agents.Remove(A);
    A->GetCurrentLoc()->AgentLeaves(A);
    A->Kill();
}
// ... and, in agent.cpp ...
void Agent::Kill()
{
    // Set current location to Nul - NB dot syntax!
    CurrentLoc.SetNull();
    // Set current state to zombie
    m_bZombie = true;
}
If this were done using pointers rather than object variables, we would
have no choice but to delete the Agent at this stage. If we had missed
updating some of the objects which might have referenced this Agent, we
would be left in a situation in which future access to the just-deleted
agent would be possible, with all-too-familiar results.
There are a couple of drawbacks to this technique. One problem arises from the interpretation of the this pointer. When an agent enters a location, it would be convenient for the agent object itself to notify the old location that it has left and the new location that it has entered, as follows:
void Agent::Enter(ObjVar<Location> NewLoc)
{
    CurrentLoc->AgentLeaves(this); // NB!
    CurrentLoc = NewLoc;
    CurrentLoc->AgentEnters(this); // NB!
}
However, the pointer to the current agent this is a C++ pointer-to-agent,
not an object variable. The compiler will convert this pointer into a temporary
object variable in each of the calls to the Location member functions:
this ObjVar will believe it is the first reference to the wrapped agent,
and when it goes out of scope at the end of the function, the wrapped agent
pointer will be deleted! We can prevent the compiler from making the conversion
automatically by using the new explicit keyword in
the declaration of the ObjVar template constructor:
template <class T>
class ObjVar
{
public:
    ObjVar();
    explicit ObjVar(T* pT);
    ...
This at least means that accidental conversions and consequent deletions
of wrapped objects cannot occur. But it does leave a hole in the idiom,
and forces us to make a third object responsible for such interactions:
in the current case, this third object must of course hold the identities
of both Agent and Location as ObjVars. Surprisingly, there is a way around
this problem, which will have to wait for a subsequent article as space
is already short here.
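The effect of the keyword can be checked at compile time. In the sketch below, ImplicitVar and ExplicitVar are hypothetical stub classes that differ only in the explicit keyword (they take no ownership and do no counting), and std::is_convertible from the standard header type_traits reports whether an implicit conversion from a raw pointer exists.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the wrapped class.
struct Agent {};

// Stub with a converting constructor: Agent* converts silently.
template <class T>
class ImplicitVar {
public:
    ImplicitVar(T*) {}
};

// Stub with an explicit constructor: conversion must be spelled out.
template <class T>
class ExplicitVar {
public:
    explicit ExplicitVar(T*) {}
};

// True if a raw Agent* can be passed where an ImplicitVar is expected.
bool ImplicitAllowsConversion() {
    return std::is_convertible<Agent*, ImplicitVar<Agent> >::value;
}

// True if a raw Agent* can be passed where an ExplicitVar is expected.
bool ExplicitAllowsConversion() {
    return std::is_convertible<Agent*, ExplicitVar<Agent> >::value;
}
```

With the explicit constructor, a call such as AgentLeaves(this) is rejected by the compiler instead of quietly creating a second, independent owner of the agent.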
A second issue also arises from objects which hold mutual references. If these ever become the only references held by these objects, then we end up with objects which can never be reclaimed by the ObjVar mechanism - in effect, a good old-fashioned memory leak. The situation extends to cycles of three or more links. To avoid this, we must take care that at least one object is responsible for breaking the cycle.
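The leak is easy to reproduce. The sketch below uses std::shared_ptr from the standard library as a stand-in for ObjVar, since it is also a reference-counted smart pointer and suffers from the same problem; Node and LiveNodesAfterScope are hypothetical names. A std::weak_ptr, which does not contribute to the count, is used only to tidy up after the measurement.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class holding a counted reference to a partner.
struct Node {
    static int alive;
    std::shared_ptr<Node> partner;
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Builds two nodes that reference each other, optionally breaks the
// cycle, and reports how many nodes survive the end of the scope.
int LiveNodesAfterScope(bool breakCycle) {
    std::weak_ptr<Node> observer;  // does not keep the node alive
    {
        std::shared_ptr<Node> a(new Node);
        std::shared_ptr<Node> b(new Node);
        a->partner = b;
        b->partner = a;            // mutual references: a cycle
        observer = a;
        if (breakCycle)
            a->partner.reset();    // one object breaks the cycle
    }
    int survivors = Node::alive;
    // Tidy up so that the demonstration itself does not leak.
    if (std::shared_ptr<Node> a = observer.lock())
        a->partner.reset();
    return survivors;
}
```

When the cycle is left intact, both nodes outlive every variable that referred to them; when one link is cleared first, both are reclaimed as usual.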
In spite of this, these typed object variables demonstrate the power of templates beyond managing collections of objects. Reference-counted smart pointers offer the flexibility of pointers in implementing associations but, by taking on the burden of reference counting and management of sharing, largely free us from worrying about the deletion of individual object instances. Although the extra indirections make function calls marginally slower, the only penalty in code size arises from specialising the template class member functions: in particular, the fact that ObjVars contain only a pointer makes them efficient to pass into and return from functions. By using object variables to wrap specialised instances of abstract base classes, we can benefit both from polymorphism and from a measure of automatic memory management.
Myers[95]: N. Myers, 'A new and useful template technique: "Traits"', C++ Report, May 1995, 33-35.
Stepanov[95]: A. Stepanov and M. Lee, The Standard Template Library, ISO Programming Language C++ Project, Doc No: X3J16/94-0095, WG21/N0482, May 1994. This material is now incorporated in ANSI[95].
Stroustrup[91]: B. Stroustrup, The C++ Programming Language, 2nd Edition, Addison Wesley, 1991.
Stroustrup[94]: B. Stroustrup, 'Making a vector fit for a standard', C++ Report, October 1994, 30-34.
Veldhuizen[95a]: T. Veldhuizen, 'Using C++ template metaprograms', C++ Report, May 1995, 36-43.
Veldhuizen[95b]: T. Veldhuizen, 'Expression Templates', C++ Report, June 1995, 27-31.
Vilot[94]: M. J. Vilot, 'An Introduction to the Standard Template Library', C++ Report, October 1994, 22-29.