Saturday, September 27, 2008

NSXML on the iPhone

We recently ran into a huge roadblock. We had created an XMPP framework in Cocoa, and we were hoping it would easily port to the iPhone. Turns out we were wrong, because Apple decided to make the NSXML classes (NSXMLDocument, NSXMLElement, NSXMLNode, etc.) private! They are supposedly in the private Office framework, but for all intents and purposes private frameworks can't be used. So there we were, heavily invested in NSXML, and Apple pulls the rug out from under our feet. Some web searching quickly revealed that we weren't the only ones in this particular situation. It would seem that Apple is more interested in supporting gaming these days...

So with the NSXML class cluster missing from the iPhone, what are our options?

- The NSXMLParser is available. This gives you a SAX parser, and would be helpful if we only needed to read XML. The problem was, we needed to generate a bunch of XML too, so this only gets us halfway there. Plus if we went this route, we'd have to rewrite all the existing code that already uses NSXMLDocument.

- libxml2 is available on the iPhone. This is a C library that's been around for a long time, and it's been available in OS X since 10.3.9 or so (maybe even earlier). This is certainly an option... but it's in C! Surely somebody's made an Objective-C wrapper for it, right? Most people think NSXML is a wrapper around libxml2. But even if that's true, it doesn't help us.

- The Coconut Framework is an Objective-C wrapper around libxml. It's distributed under the GPL (but with the possibility of LGPL or BSD if you talk to the author). It uses a different API than NSXML, likely because it's been around for a while and NSXML was only added in 10.4. Also, it seems you can only use it to read XML, not generate it.

- TouchXML is another Objective-C wrapper around libxml. It's under a non-restrictive MIT license, and operates as an NSXML replacement. BUT - it gives you read-only support for XML.

- Google's GDataXML classes were the closest I could get. Distributed under the Apache license, they offer an Objective-C wrapper around libxml, and they operate as an NSXML replacement. Plus they offer the ability to read XML, and generate XML with methods such as elementWithName:, addAttribute:, addChild:, etc.

- We present another solution below. Keep reading...

So as a framework developer, with other developers in mind, I had to decide what to do.

If I used Google's GDataXML classes, then I could quickly patch the problem. (I'd only need to define the NSXML classes as GDataXML classes on the iPhone.) But it would offer only a small, small subset of what NSXML offers. Developers using the framework on the iPhone would have to realize this and compensate. Alternatively, I could switch all my code to use the GDataXML classes, but then developers on OS X would reasonably think I'm an idiot.

Also, as I was skimming the source code of all these frameworks I realized something - they are all rather primitive, and don't always operate as NSXML does, and sometimes behave counter-intuitively in an Objective-C world. For example:

// node1 and node2 are instance variables
- (void)method1
{
    node1 = [[NSXMLElement elementWithName:@"node1"] retain];
    node2 = [[NSXMLElement elementWithName:@"node2"] retain];
    [node1 addChild:node2];
}

- (void)method2
{
    NSLog(@"node2: %@", [node2 name]);
    [node1 release];   // node2 is still retained, so it should stay valid
    NSLog(@"node2: %@", [node2 name]);
}

Run this code, and you get the name of node2 printed twice. This makes sense from an Objective-C standpoint. (node2 was never released, so why would its name disappear?) But what happens with other frameworks? Try something similar with TouchXML and it doesn't work the same way. Why? Because freeing node1 calls xmlFreeNode, which in turn frees all of its children, including the xmlNodePtr that node2 was wrapping. It works fine with GDataXML - but only because they "cheat" and copy xmlNode subtrees in their addChild: method, which seems a little wasteful considering how most people would use the API. For example:

NSXMLElement *queryNode, *usernameNode, *digestNode, *resourceNode;

queryNode = [NSXMLElement elementWithName:@"query" URI:@"jabber:iq:auth"];
usernameNode = [NSXMLElement elementWithName:@"username" stringValue:username];
digestNode = [NSXMLElement elementWithName:@"digest" stringValue:digest];
resourceNode = [NSXMLElement elementWithName:@"resource" stringValue:resource];

[queryNode addChild:usernameNode];
[queryNode addChild:digestNode];
[queryNode addChild:resourceNode];

NSXMLElement *iqNode = [NSXMLElement elementWithName:@"iq"];
[iqNode addAttributeWithName:@"type" stringValue:@"set"];
[iqNode addChild:queryNode];

In the example above, the GDataXML classes would create the username node once, then copy it so it can be added. The same would happen to the digest and resource nodes. And then the whole queryNode would get copied as well when it's added to the iqNode. Build up a big XML fragment like this, or build up many XML fragments, and there's a lot of wasteful copying going on. And building XML fragments is what our XMPP framework is all about.
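To make the trade-off concrete, here is a minimal sketch in plain C. It is a toy model, not libxml2's, TouchXML's, or GDataXML's actual code: the Node struct and every function name below are hypothetical stand-ins. It shows the two behaviors described above: linking a child in place means freeing the parent also frees the child, while copying on add avoids that at the cost of extra allocations.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libxml-style tree node. */
typedef struct Node {
    const char *name;
    struct Node *first_child;
    struct Node *last_child;
    struct Node *next_sibling;
} Node;

static int g_allocs = 0;  /* nodes allocated so far */
static int g_frees = 0;   /* nodes freed so far */

static Node *node_new(const char *name) {
    Node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->name = name;
    g_allocs++;
    return n;
}

/* Append child to parent's child list without copying it. */
static void node_link(Node *parent, Node *child) {
    if (parent->last_child != NULL)
        parent->last_child->next_sibling = child;
    else
        parent->first_child = child;
    parent->last_child = child;
}

/* Deep-copy a node and all of its descendants. */
static Node *node_copy(const Node *src) {
    Node *dst = node_new(src->name);
    for (const Node *c = src->first_child; c != NULL; c = c->next_sibling)
        node_link(dst, node_copy(c));
    return dst;
}

/* Free a node and, recursively, every descendant. */
static void node_free(Node *n) {
    Node *c = n->first_child;
    while (c != NULL) {
        Node *next = c->next_sibling;
        node_free(c);
        c = next;
    }
    free(n);
    g_frees++;
}

/* Link in place: releasing only node1 also frees node2. */
int frees_when_parent_released(void) {
    Node *node1 = node_new("node1");
    Node *node2 = node_new("node2");
    node_link(node1, node2);
    g_frees = 0;
    node_free(node1);  /* node2 is gone too; any wrapper holding it dangles */
    return g_frees;
}

/* Copy on add: count the allocations needed to build the auth iq above. */
int allocations_for_auth_iq(void) {
    g_allocs = 0;
    Node *query = node_new("query");
    Node *username = node_new("username");
    Node *digest = node_new("digest");
    Node *resource = node_new("resource");
    node_link(query, node_copy(username));
    node_link(query, node_copy(digest));
    node_link(query, node_copy(resource));
    Node *iq = node_new("iq");
    node_link(iq, node_copy(query));  /* copies query plus its 3 children */
    int total = g_allocs;
    /* Originals and copies are independent trees, so each is freed once. */
    node_free(iq);
    node_free(query);
    node_free(username);
    node_free(digest);
    node_free(resource);
    return total;
}
```

In this model, frees_when_parent_released() returns 2 (both nodes are freed by releasing only the parent), and allocations_for_auth_iq() returns 12, compared with the 5 nodes that would be allocated if children were linked in place. Call either function from a main() of your own to experiment.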

There is an inherent danger in using an XML API that acts differently than Apple's. What happens if Apple makes its NSXML classes public in a future update? We'd probably want to switch the framework to use them - but what if iPhone developers using the framework had already adapted to the alternative XML library? If that library operates differently than NSXML, then the switch could break a lot of code. Plus there's another problem: a lot of iPhone developers are also Mac developers, and they would want to write a single library that can be used both on the desktop and on the phone. In fact, the XMPP framework itself fits into this category.

An optimal solution would be something like this:

- An Objective-C wrapper around libxml
- Ability to read XML
- Ability to generate XML
- Same API as NSXML
- Behaves the same as NSXML
- Behaves like a true Objective-C class

So I decided to write just this. I'm releasing the source code for iPhone developers everywhere in hopes that they never have to directly use libxml themselves, and so that if Apple ever does make their NSXML classes public, the transition can be seamless. I'm calling the framework KissXML, in honor of TouchXML which inspired me to write it. (Sorry Google, I didn't find out about your code until I was already halfway done.)

KissXML Google Code Project Page



Frank said...

A quick question on the XMPP/KissXML combination. It seems that your XMPP implementation uses initWithXMLString in NSXMLElement, however this method is not implemented in your DDXMLElement class. How do you manage to have these two bits working together in a single project? Perhaps the latest SVN versions are just not in sync?


Robbie Hanson said...

Hi Frank,

The initWithXMLString you're referring to is used in XMPPElement's initWithCoder: method. This initWithCoder method would only be used if one was passing XMPPElements between distributed objects, or archiving it. Since the XMPP framework doesn't directly need any of this stuff, it's not a direct problem for it.

However, it does generate a warning, which is annoying. And I do need DO/archiving abilities for a project I'm working on, so this will be fixed at some point in the future.

Frank said...

Great, thanks! When I get warnings that methods don't exist, I start to think something must be wrong. I'll carry on and ignore it, thanks!

Drunknbass said...

can this be used as a replacement to NSXMLParser? if so how? i have found the NSXMLParser on iphone to leak pretty bad.

Robbie Hanson said...

Hi Drunknbass,

Yes, the framework can be used as a replacement for the NSXMLParser SAX parser. And it's very, very fast too. In some preliminary benchmarks, it often outperforms the actual NSXML classes for raw parsing of XML.

You should be aware of one limitation though. This isn't a limitation of this specific framework, just a limitation of the DOM model (as opposed to SAX parsing). If you want to parse extremely big files (say 10 MB or more), then your RAM usage with this or any other DOM framework may be a problem on the iPhone platform.

Eric J said...

KissXML is working very well for me, wish I had found it before wasting a weekend on TouchXML (no offense, guys).

The one thing I'm having trouble with is namespaces. The feed I'm reading has a generic namespace (xmlns=""), and I can't figure out how to configure the DDXML to assign a prefix for this or accept an XPATH without it.

Robbie Hanson said...

Hi Eric,

Can you post some sample XML, and XPath queries to the KissXML Mailing List?

Anonymous said...

Thanks for this effort! Will be using KissXML for new iPhone project. I'm new to Objective C and Apple development in general, but why on earth is this functionality (read/write XML) not built in from day one? What about SOAP functionality? Apple seriously dropped the ball in this respect IMO.

Anyway, Cheers and Thanks!

manutencao iphone said...

When I launch the iPhone 3.1.3 simulator from this 3.2.3 setup, I get an error from the simulator app stating it can't find the SDK with a "Switch SDK" button. If I pick the 3.1.3 SDK it still fails.


Gaurav said...

Hey..first off...amazing work with the framework..however..could u temme how to generate an XML Doc..i jus need some basic info in it..

for ex: i've tried this out..

DDXMLElement *root = (DDXMLElement *)[DDXMLNode elementWithName:@"UserDetails"];
DDXMLElement *childElement1 = [[DDXMLElement alloc] initWithName:@"Name"];
[childElement1 setStringValue:@"myNameHere"];
[root addChild:childElement1];
[childElement1 release];

DDXMLElement *childElement2 = [[DDXMLElement alloc] initWithName:@"UserName"];
[childElement2 setStringValue:@"coder"];
[root addChild:childElement2];
[childElement2 release];
//NSError *err;
DDXMLDocument *xmlDoc=[[DDXMLDocument alloc] initWithRootElement:root]; //What SHOULD COME HERE??

it would be really nice if u could help me out with this..

ccko said...


I want to ask a question.

I added the code below to replace a node,

but dealloc in DDXMLNode.m throws "pointer being freed was not allocated".

Thanks for your help.

+(void)replaceChild:(DDXMLNode *)A with:(DDXMLNode *)B