Monday, May 19, 2008

Casting, Subclassing, and the isa field

I was recently updating the XMPP framework, and I ran into a bit of a dilemma. Within the XMPP framework there are 3 types of elements: IQ, Presence and Message. I wanted to extend NSXMLElement to have custom subclasses for each type of element. The subclasses didn't have any variables of their own. Each just added convenient methods for accessing common information. But here's the catch: I was working with Apple's XML classes, which only return plain NSXMLElement instances.

So how do I convert an NSXMLElement object to an instance of an object which extends NSXMLElement?

The experienced Objective-C developer would be quick to point out that we could use categories for this. A category allows one to add methods to an existing class. This would be a solution, since I said I didn't need to add any variables to the subclass. However, this didn't completely satisfy me. In the end, we're still just returning instances of NSXMLElement, so those using the framework would be able to call methods from ANY of the categories. They could easily call a method that's supposed to be for an IQ element on a Presence element, and it would work, and the compiler wouldn't even warn them. I wanted something more foolproof. Plus, what if I wanted slightly different implementations of the same method for each type?
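To make the category problem concrete, here is a minimal sketch. The category name and the isGetIQ method are hypothetical, made up for illustration (they are not taken from the XMPP framework), and it assumes a Mac build where NSXMLElement is available.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category: a convenience method meant only for IQ elements.
@interface NSXMLElement (IQ)
- (BOOL)isGetIQ;
@end

@implementation NSXMLElement (IQ)
- (BOOL)isGetIQ {
    // Messaging nil is safe: a missing "type" attribute simply yields NO.
    return [[[self attributeForName:@"type"] stringValue] isEqualToString:@"get"];
}
@end

int main(void) {
    @autoreleasepool {
        NSXMLElement *presence = [NSXMLElement elementWithName:@"presence"];
        // This is not an IQ element, yet the call compiles without a
        // warning and runs: every NSXMLElement gets every category method.
        NSLog(@"isGetIQ on a presence element: %d", [presence isGetIQ]);
    }
    return 0;
}
```

Nothing in the type system ties isGetIQ to IQ elements, which is exactly the gap a real subclass would close.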

So again: How do we convert an NSXMLElement object to an instance of an object which extends NSXMLElement?

Can we just cast it???

return (PresenceElement *)someXMLElement;

The compiler says yes, but the runtime says no. But why exactly does the runtime say no?

Every time you [[alloc] init] an object, you're actually allocating all the storage you need to store an instance of that class. In other words, you're allocating storage for all the variables the class needs. Let's look at a quick example.

@interface Car : NSObject {
    float milesPerGallon;
}
- (void)doSomething;
@end

@interface Hybrid : Car {
    float batteryCharge;
}
- (void)doSomethingElse;
@end

So if we [[alloc] init] a Hybrid object, we allocate all the storage needed for an NSObject, a Car (1 float) and a Hybrid (1 float). But what about the methods? The compiled implementation of the code for the methods is not part of the object instance. (That would be pretty dumb, wouldn't it!) It's part of the class itself. And this is where NSObject comes in. Part of the NSObject class is a field called "isa" which points to the class of the object. So when you send the message "doSomethingElse" to an instance of Hybrid, the runtime uses the isa field to find the Hybrid class, and invokes the proper method.
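You can poke at this from code. The sketch below uses the Objective-C 2.0 runtime functions object_getClass, class_getInstanceMethod and class_getSuperclass to look at where the isa field leads; the empty method bodies are placeholders added so that the example compiles.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

@interface Car : NSObject {
    float milesPerGallon;
}
- (void)doSomething;
@end

@implementation Car
- (void)doSomething {}
@end

@interface Hybrid : Car {
    float batteryCharge;
}
- (void)doSomethingElse;
@end

@implementation Hybrid
- (void)doSomethingElse {}
@end

int main(void) {
    @autoreleasepool {
        Hybrid *hybrid = [[Hybrid alloc] init];

        // object_getClass returns the class that the object's isa refers to.
        Class cls = object_getClass(hybrid);
        NSLog(@"class: %s", class_getName(cls));

        // The method belongs to the class, not to the instance.
        Method m = class_getInstanceMethod(cls, @selector(doSomethingElse));
        NSLog(@"doSomethingElse found on class: %d", m != NULL);

        // Inherited methods are found by following the superclass chain.
        NSLog(@"superclass: %s", class_getName(class_getSuperclass(cls)));
    }
    return 0;
}
```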

This is explained in excellent detail, with diagrams, here.

Now say we extend the Hybrid class WITHOUT adding any variables:

@interface MyHybrid : Hybrid
- (void)recharge;
@end

Can we simply cast a Hybrid to MyHybrid? The answer is no, because the allocated object still has an isa field which is pointing to the Hybrid class. If we simply cast it, and attempt to invoke "recharge", the runtime will follow the isa field to the Hybrid class, and it won't find a method named "recharge" in that class or in any of its superclasses.

However, an [[alloc] init] of MyHybrid is almost exactly the same as an [[alloc] init] of Hybrid. They're the exact same size in RAM. In fact, the only difference would be the isa field. Instead of pointing to the Hybrid class, it would obviously be pointing to the MyHybrid class.
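Both claims can be checked with a quick sketch. To keep it short, Hybrid is simplified here to inherit straight from NSObject (an assumption of this example, not the hierarchy above), and the method bodies are placeholders.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Simplified for the sketch: Hybrid derives directly from NSObject.
@interface Hybrid : NSObject {
    float batteryCharge;
}
- (void)doSomethingElse;
@end

@implementation Hybrid
- (void)doSomethingElse {}
@end

// Adds a method but no instance variables.
@interface MyHybrid : Hybrid
- (void)recharge;
@end

@implementation MyHybrid
- (void)recharge {}
@end

int main(void) {
    @autoreleasepool {
        Hybrid *hybrid = [[Hybrid alloc] init];

        // The cast only changes what the compiler believes about the pointer.
        MyHybrid *cast = (MyHybrid *)hybrid;
        NSLog(@"still a Hybrid: %d", object_getClass(cast) == [Hybrid class]);
        NSLog(@"responds to recharge: %d",
              [cast respondsToSelector:@selector(recharge)]);

        // With no new instance variables, both classes need the same storage.
        NSLog(@"same instance size: %d",
              class_getInstanceSize([Hybrid class]) ==
              class_getInstanceSize([MyHybrid class]));
    }
    return 0;
}
```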

What I'm trying to say is this: You can convert a Hybrid object to a MyHybrid object simply by changing the isa field. Like this:

@implementation MyHybrid
+ (MyHybrid *)fromHybrid:(Hybrid *)hybrid {
    MyHybrid *result = (MyHybrid *)hybrid;
    result->isa = [MyHybrid class]; // repoint the object at the subclass
    return result;
}
@end

Remember, this ONLY works if the subclass doesn't add any instance variables of its own. I'm not sure what would happen if you used this technique absent-mindedly, but I'm guessing it would be BAD.
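A slightly more defensive variant is sketched below. It swaps the class with the runtime function object_setClass instead of assigning to isa directly (later versions of the runtime deprecate direct isa access), and it refuses the conversion when the instance sizes differ. The size check is an assumption of this sketch, a cheap guard against the "subclass added variables" mistake rather than a complete safety net, and Hybrid is again simplified to inherit from NSObject.

```objc
#import <Foundation/Foundation.h>
#import <objc/runtime.h>

// Simplified for the sketch: Hybrid derives directly from NSObject.
@interface Hybrid : NSObject {
    float batteryCharge;
}
@end

@implementation Hybrid
@end

@interface MyHybrid : Hybrid
+ (MyHybrid *)fromHybrid:(Hybrid *)hybrid;
- (void)recharge;
@end

@implementation MyHybrid
+ (MyHybrid *)fromHybrid:(Hybrid *)hybrid {
    // Guard: bail out if MyHybrid would need more (or less) storage
    // than the object that was actually allocated.
    if (class_getInstanceSize([MyHybrid class]) !=
        class_getInstanceSize(object_getClass(hybrid))) {
        return nil;
    }
    // Same effect as assigning to the isa field, via the runtime API.
    object_setClass(hybrid, [MyHybrid class]);
    return (MyHybrid *)hybrid;
}

- (void)recharge {
    batteryCharge = 1.0f; // placeholder behavior for the example
}
@end

int main(void) {
    @autoreleasepool {
        Hybrid *hybrid = [[Hybrid alloc] init];
        MyHybrid *mine = [MyHybrid fromHybrid:hybrid];

        // Same object in memory, now reporting a different class.
        NSLog(@"same pointer: %d", (id)mine == (id)hybrid);
        NSLog(@"responds to recharge: %d",
              [mine respondsToSelector:@selector(recharge)]);
        [mine recharge];
    }
    return 0;
}
```

Note that the conversion happens in place: every existing pointer to the original Hybrid now refers to a MyHybrid as well.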


Kevin Perry said...

This technique is known as "isa-swizzling". KVO uses it itself to work its magic:

Robbie Hanson said...

That's good to know. This just goes to show that a little knowledge can potentially be dangerous. If anyone uses this technique they should recognize that it's rather low level, and should be used with caution and understanding.