Monday, August 24, 2009

Object Ownership and Lifetime: Argument against garbage collection and reference counting

One of the biggest issues I came face to face with again while trying out C++ on the iPhone is object lifetime. To put it another way, it is the prevention of memory leaks.

It is an issue that had long been pushed to the back of my mind and forgotten, since I had been doing my development in Java and C#. There, you simply create new object instances and rely on the runtime to garbage collect them properly.

That would probably work on the iPhone too, if you stick to Objective-C and use the autorelease pool. Simply mark all new object instances as autoreleased and let the application 'garbage collect' them when the autorelease pool is popped from the stack.

A long-standing approach to simulating garbage collection in C++ is reference counting. Every time a part of the application requires a strong reference to an object, it increments the object's reference count. Every time a holder of such a reference goes out of scope, it decrements the reference count of every object it held strongly. When an object's count reaches zero, the object is deleted/released/freed.
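As a rough sketch of that mechanism, assuming a hypothetical intrusive base class (the names `RefCounted`, `retain`, `release`, and `Texture` are mine for illustration and are not taken from COM or any other framework):

```cpp
#include <cassert>

// Hypothetical intrusive reference-counted base class (illustration only).
class RefCounted {
public:
    RefCounted() : count_(1) {}            // the creator holds the first reference
    void retain() { ++count_; }            // someone takes a strong reference
    void release() {                       // someone gives up a strong reference
        assert(count_ > 0);
        if (--count_ == 0) delete this;    // last reference gone: destroy the object
    }
    int refCount() const { return count_; }
protected:
    virtual ~RefCounted() {}               // only release() is allowed to destroy
private:
    int count_;
};

// Counts live textures so the effect of release() can be observed.
static int g_liveTextures = 0;

class Texture : public RefCounted {
public:
    Texture() { ++g_liveTextures; }
protected:
    ~Texture() { --g_liveTextures; }
};

// Walks through the life cycle and returns the number of textures still alive.
int refCountDemo() {
    Texture* t = new Texture();  // count is 1
    t->retain();                 // a second user appears: count is 2
    t->release();                // first user is done: count is 1, still alive
    t->release();                // last user is done: count is 0, object deleted
    return g_liveTextures;
}
```

Note that every user has to remember its matching `release()`, which is part of the overhead I am complaining about.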

But personally, I dislike this approach. It is how COM works. It seems like unnecessary overhead. Why?

Because for a given program, you can easily identify a natural owner for each object. Object usage is shared. Object ownership is not.

For example, assume that I have a texture that is used to render two different game objects. I would identify the need for a texture manager that owns the texture. The two game objects do not own the texture; they only use it.
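A minimal sketch of that split between owning and using, assuming a C++11 compiler. The classes `Texture`, `TextureManager`, and `GameObject` and their members are hypothetical names for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical texture type; a real one would wrap pixel or GPU data.
struct Texture {
    explicit Texture(const std::string& n) : name(n) {}
    std::string name;
};

// The single owner: creates textures and deletes them in its destructor.
class TextureManager {
public:
    TextureManager() {}
    ~TextureManager() {
        for (auto& entry : textures_) delete entry.second;
    }
    TextureManager(const TextureManager&) = delete;             // one owner only
    TextureManager& operator=(const TextureManager&) = delete;

    // Returns a non-owning pointer; creates the texture on first request.
    Texture* get(const std::string& name) {
        assert(!name.empty());
        auto it = textures_.find(name);
        if (it != textures_.end()) return it->second;
        Texture* t = new Texture(name);
        textures_[name] = t;
        return t;
    }
    std::size_t count() const { return textures_.size(); }
private:
    std::map<std::string, Texture*> textures_;
};

// A user of the texture: holds the pointer but never deletes it.
class GameObject {
public:
    explicit GameObject(const Texture* texture) : texture_(texture) {}
    const Texture* texture() const { return texture_; }
private:
    const Texture* texture_;  // non-owning
};
```

Two game objects built from `manager.get("grass")` end up pointing at the same texture, and only the manager ever calls `delete`.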

The primary driving reason for reference counting is to guarantee that when your texture manager releases the texture, the game objects still have access to a valid texture.

But this argument is flawed, in my opinion. When such a thing happens, it usually indicates that your program has entered an unknown state. Why did you unload the texture when you knew that two game objects needed it? Or, to put it the other way around, are you unaware of what is using or needing what at any particular moment of the application's execution?

Of course, there are times when you DO need to unload a texture to free up enough memory for another texture, and you do know at that point in time that the two game objects are 'dormant'. They are not being drawn and can request that the texture be loaded back in at a later stage (this is more like the resource handle concept, which I will describe in a later post). But if you had reference counting, you would not be able to free up the memory in this case, because the dormant objects would still hold their references! That is reduced flexibility!

But back to the original argument. Since we can easily identify the owners of resources and objects, the argument for reference counting is void.

For this game development, I am sticking to a few 'managed containers', and constraining myself to 'managed pointers' to help own my objects.

To sidetrack slightly, a managed pointer is a template class that manages the lifetime of an object. When such a pointer goes out of scope, the managed object is deleted/released/freed. The most popular implementation of this is probably std::auto_ptr, found in the Standard C++ Library. There are no naked pointers in my program. I use managed pointers as class members or as local variables in a function.
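To make the idea concrete, here is a minimal sketch of such a template, assuming a C++11 compiler. `ScopedPtr` and `Sprite` are hypothetical names, and this is not how std::auto_ptr itself is implemented. (std::auto_ptr was later deprecated in C++11 and removed in C++17; std::unique_ptr is the standard replacement.)

```cpp
#include <cassert>

// Hypothetical minimal managed pointer: deletes its object when it goes out of scope.
template <typename T>
class ScopedPtr {
public:
    explicit ScopedPtr(T* p = nullptr) : ptr_(p) {}
    ~ScopedPtr() { delete ptr_; }

    ScopedPtr(const ScopedPtr&) = delete;             // exactly one owner
    ScopedPtr& operator=(const ScopedPtr&) = delete;

    T* get() const { return ptr_; }
    T& operator*() const { assert(ptr_); return *ptr_; }
    T* operator->() const { assert(ptr_); return ptr_; }
private:
    T* ptr_;
};

// Counts live sprites so the automatic delete can be observed.
static int g_liveSprites = 0;

struct Sprite {
    Sprite() { ++g_liveSprites; }
    ~Sprite() { --g_liveSprites; }
    int frame = 0;
};

// Returns the number of live sprites observed inside the scope.
int scopedPtrDemo() {
    ScopedPtr<Sprite> sprite(new Sprite());  // local variable owns the sprite
    sprite->frame = 3;
    return g_liveSprites;  // the sprite is deleted right after this value is taken
}
```

There is no explicit `delete` in `scopedPtrDemo`; leaving the function is what frees the sprite.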

To extend the idea, I created managed containers. They are simply extensions of common containers such as array, vector, list, and map. In their destructors, they iterate over all the items and delete/release/free them. Again, they are used as class members or as local variables in a function.
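A sketch of what such a container might look like for the vector case, assuming a C++11 compiler. `ManagedVector` and `Particle` are hypothetical names, and a fuller version would also guard against `push_back` throwing:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical managed container: a vector of pointers that owns its items.
template <typename T>
class ManagedVector {
public:
    ManagedVector() {}
    ~ManagedVector() {
        for (T* item : items_) delete item;  // delete every owned item
    }
    ManagedVector(const ManagedVector&) = delete;             // one owner only
    ManagedVector& operator=(const ManagedVector&) = delete;

    // Takes ownership of item and returns it so the caller can keep using it.
    T* add(T* item) {
        assert(item != nullptr);
        items_.push_back(item);
        return item;
    }
    T* at(std::size_t i) const { return items_.at(i); }
    std::size_t size() const { return items_.size(); }
private:
    std::vector<T*> items_;
};

// Counts live particles so the container's cleanup can be observed.
static int g_liveParticles = 0;

struct Particle {
    Particle() { ++g_liveParticles; }
    ~Particle() { --g_liveParticles; }
};
```

Used as a local variable or class member, the container releases everything added to it when it is destroyed.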

Back to game development, what I end up with is as follows:
1. Each stage of the game is actually a single object
2. The stage object has a few managed containers, which manage the top-level objects of the stage, such as game objects, texture resources, and script resources.
3. Extra care is taken that every time a new top-level object is created, it is added to the managed containers.
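The three points above could be sketched roughly as follows, assuming a C++11 compiler. Here `std::vector<std::unique_ptr<T>>` stands in for my managed containers, and `Stage`, its `create...` functions, and the three structs are hypothetical names for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

struct Texture    { std::string name; };
struct Script     { std::string name; };
struct GameObject { const Texture* texture; };  // uses the texture, does not own it

// One object per stage; it owns every top-level object of that stage.
class Stage {
public:
    // Creation goes through the stage, so nothing can be created without an owner.
    Texture* createTexture(const std::string& name) {
        textures_.push_back(std::unique_ptr<Texture>(new Texture{name}));
        return textures_.back().get();
    }
    Script* createScript(const std::string& name) {
        scripts_.push_back(std::unique_ptr<Script>(new Script{name}));
        return scripts_.back().get();
    }
    GameObject* createGameObject(const Texture* texture) {
        assert(texture != nullptr);
        objects_.push_back(std::unique_ptr<GameObject>(new GameObject{texture}));
        return objects_.back().get();
    }
    std::size_t objectCount() const { return objects_.size(); }
private:
    // Members are destroyed in reverse declaration order, so the game objects
    // go first and the textures they point to outlive them.
    std::vector<std::unique_ptr<Texture>> textures_;
    std::vector<std::unique_ptr<Script>> scripts_;
    std::vector<std::unique_ptr<GameObject>> objects_;
};
```

Routing creation through the stage is one way to take the 'extra care' in point 3: there is no code path that creates a top-level object without handing it to a container.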

But we still have a big problem. For any given game execution, there are two types of object instances.
1. Resources and objects that last throughout the execution of a stage. Textures and scripts belong to this category.
2. Game object instances that are dynamically created and destroyed during the execution of a stage. Enemy spawns, particle effects, etc. belong to this category.

A game object can hold references to objects of type 1 and not have any problems at all. If, however, a game object holds references to objects of type 2, we have a problem.

Objects of type 2 seemed like a big problem at first glance. Upon closer examination, however, they ended up not being a problem at all, because it goes back to the argument above. An object really should not reference another object that has been destroyed. If we reach that state, we have a bug. We should not go on the defensive and build in reference counting to prevent it. Instead, we should go on the offensive, and encourage such bugs to surface as soon as possible.
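One possible way to make such a bug surface early is a debug check on every access. This is only a sketch of one option, not necessarily what I will end up using, and `LiveRegistry`, `Enemy`, and `CheckedEnemyRef` are hypothetical names. Each object gets a unique id while it is alive, and a non-owning reference asserts that the id is still registered before handing out the object:

```cpp
#include <cassert>
#include <set>

// Hypothetical debug registry: hands out ids and remembers which are still alive.
class LiveRegistry {
public:
    LiveRegistry() : nextId_(1) {}
    unsigned add() { unsigned id = nextId_++; live_.insert(id); return id; }
    void remove(unsigned id) { live_.erase(id); }
    bool isAlive(unsigned id) const { return live_.count(id) != 0; }
private:
    unsigned nextId_;
    std::set<unsigned> live_;
};

// A dynamically created game object that registers itself while alive.
struct Enemy {
    explicit Enemy(LiveRegistry& r) : registry(r), id(r.add()), health(100) {}
    ~Enemy() { registry.remove(id); }
    LiveRegistry& registry;
    unsigned id;
    int health;
};

// Non-owning reference that fails loudly if the target has been destroyed.
class CheckedEnemyRef {
public:
    CheckedEnemyRef(const LiveRegistry& r, Enemy* e)
        : registry_(r), enemy_(e), id_(e->id) {}   // id copied while the enemy is alive
    bool valid() const { return registry_.isAlive(id_); }
    Enemy& get() const {
        // In a debug build this stops the program at the first bad access.
        assert(valid() && "reference to a destroyed Enemy");
        return *enemy_;
    }
private:
    const LiveRegistry& registry_;
    Enemy* enemy_;
    unsigned id_;
};
```

The check does not keep the enemy alive and does not try to recover; it only turns a silent dangling access into an immediate failure.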

Because after all, there is no reasonable way to recover gracefully from such a bug :)

My conclusion? Garbage collection really is a big help, as it takes away the burden of releasing memory. But when you think hard about it, you can actually identify the owner and lifetime of each object instance properly. Once you reach that point, garbage collection looks so much more nondeterministic by comparison (beyond the promise that your garbage will be collected at some time during the execution).

In fact, garbage collection (and reference counting) actually hides bugs. You have objects that live longer than they should, because some live objects are holding on to references to them. You lose sight of who the true owners are. You get weird and unexpected behavior because an object that you thought was released actually got triggered.

Do some C++. It is good for your mind.
