That's one of the problems with writing code. It's basically like a long math problem. And in the end if your answer doesn't match what's in the back of the book, the only thing you know is that there's a mistake somewhere between step 1 and step 80. Happy hunting!
So for the benefit of the community, I'm going to break down an example step-by-step.
First an overview of the stream communication, with data coming from the server in blue, data being sent from the client in orange, and my comments in gray. The authentication will be for user "test" and password "secret".
<auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='DIGEST-MD5'/>
There are actually no spaces or formatting anywhere in the
challenge above. I added them for readability. The data is
encoded in base64, and here's what it says:
Again, remember that there are really no spaces or formatting in the
response above. I added them simply for readability. The payload
is encoded in base64 and says (without spaces/formatting):
No spaces, base64, etc:
The big thing to figure out is the response (37991b870e0f6cc757ec74c47877472b) value. How do we go about getting this value?
We'll need to hash (md5) a bunch of values together to get this response. Here's what we'll need:
The username and password are what we're using to authenticate.
Where did the realm come from? Sometimes it's supplied by the server inside the challenge. In this particular example it wasn't, so we used the domain identifier of the server. Our server is an ejabberd server running on the local machine, which is why it has a funny name. But in your case it would be something like "deusty.com", "gmail.com", "jabber.org", etc. In other words, if your JID is "johnDoe@deusty.com" then your realm (if not otherwise stated in the challenge) is "deusty.com".
The nonce and qop values were supplied by the server in the challenge. The nonce will be different for each challenge the server sends. In this case it looks like a random value.
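To make the "supplied by the server in the challenge" part concrete, here's a rough Python sketch of decoding a challenge and pulling out its directives. The sample challenge is built inside the snippet purely for illustration: only the nonce comes from this example, and the qop, charset and algorithm values are my assumptions about what a typical server sends. `parse_challenge` is a hypothetical helper name, not part of any library.

```python
import base64
import re

def parse_challenge(b64_payload: str) -> dict:
    """Decode a base64 DIGEST-MD5 challenge into a dict of directives."""
    decoded = base64.b64decode(b64_payload).decode("utf-8")
    # Each directive is key=value; the value may be quoted or bare.
    pairs = re.findall(r'([A-Za-z-]+)=(?:"([^"]*)"|([^,]*))', decoded)
    return {key: quoted or bare for key, quoted, bare in pairs}

# Illustrative challenge: the nonce is the one from this article,
# the remaining directives are assumed for the sake of the example.
sample = base64.b64encode(
    b'nonce="392616736",qop="auth",charset=utf-8,algorithm=md5-sess'
).decode("ascii")

challenge = parse_challenge(sample)
nonce = challenge["nonce"]
qop = challenge["qop"]
# No realm in this challenge, so fall back to the server's domain.
realm = challenge.get("realm", "osXstream.local")
print(nonce, qop, realm)
```

The `challenge.get("realm", ...)` fallback mirrors the rule above: use the server's realm if it sent one, otherwise the domain.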
cnonce is a random string we provide. To make one yourself you could simply use a random number generator. Mine is a UUID (universally unique identifier), and I chose to use UUIDs because they're so simple to generate using Core Foundation's CFUUID.
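If you're not on a Mac, any UUID generator will do. As a small sketch in Python (the `make_cnonce` name is just mine, and the upper-casing is only there to match the look of the cnonce in this example):

```python
import uuid

def make_cnonce() -> str:
    # uuid4() is randomly generated; any sufficiently random string would work.
    return str(uuid.uuid4()).upper()

cnonce = make_cnonce()
print(cnonce)
```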
digest-uri is easy to make: xmpp/[domainID]
So if you were johnDoe@deusty.com, then your digest-uri would be "xmpp/deusty.com"
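In code, that rule might look something like this. `digest_uri_for` is a hypothetical helper, and it assumes a JID of the usual user@domain/resource shape:

```python
def digest_uri_for(jid: str) -> str:
    """Build the digest-uri (xmpp/[domainID]) from a JID."""
    bare = jid.split("/", 1)[0]        # drop any /resource part
    domain = bare.split("@", 1)[-1]    # drop the user@ part, if present
    return "xmpp/" + domain

print(digest_uri_for("johnDoe@deusty.com"))  # xmpp/deusty.com
```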
And nc is the nonce count. It's the count of how many requests (including this one) you've sent to the server using this nonce, written as eight hex digits. But since we're only going to be sending one packet, we really don't have to worry about it. Just make it 00000001 like I did.
Step 1: Combine username:realm:password and md5 hash them.
So we hash "test:osXstream.local:secret" (without the quotes) and store the result in a variable called HA1data.
Here's the trick - normally when you hash stuff you get a result in hex values. But we don't want this result as a string of hex values! We need to keep the result as raw data! If you were to do a hex dump of this data you'd find it to be "3a4f5725a748ca945e506e30acd906f0". But remember, we need to operate on its raw data, so don't convert it to a string.
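The raw-versus-hex distinction is easy to see in Python's hashlib, where `digest()` returns the 16 raw bytes and `hexdigest()` returns the 32-character string. A minimal sketch of step 1, using the variable name from above:

```python
import hashlib

# Step 1: hash username:realm:password and keep the RAW 16 bytes.
ha1_data = hashlib.md5(b"test:osXstream.local:secret").digest()

print(len(ha1_data))   # 16 raw bytes, this is what step 2 works on
print(ha1_data.hex())  # hex dump, for inspection only
```

If you had called `hexdigest()` here and carried that string into step 2, you'd be hashing 32 ASCII characters instead of 16 bytes, and the final response would come out wrong.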
Step 2: We need to combine the result from step 1 with the nonce and cnonce.
So we hash [HA1data]:nonce:cnonce
But wait, the result from step 1 is in raw data format, and the new stuff is a string. So we convert our string ":392616736:05E0A6E7-0B7B-4430-9549-0FE1C244ABAB" into raw utf-8 data, and append this to the end of HA1data.
Step 3: Hash the data from step 2, and store its hex value in a string HA1.
The value of the string will be "b9709c3cdb60c5fab0a33ebebdd267c4".
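Steps 1 through 3 together can be sketched like this in Python. `compute_ha1` is a name I made up for illustration, and the inputs are the ones from this example:

```python
import hashlib

def compute_ha1(username: str, realm: str, password: str,
                nonce: str, cnonce: str) -> str:
    # Step 1: raw (not hex) MD5 of username:realm:password
    ha1_data = hashlib.md5(
        f"{username}:{realm}:{password}".encode("utf-8")
    ).digest()
    # Step 2: append ":nonce:cnonce" as UTF-8 bytes to the raw digest
    step2 = ha1_data + f":{nonce}:{cnonce}".encode("utf-8")
    # Step 3: hash again, and this time keep the hex string
    return hashlib.md5(step2).hexdigest()

ha1 = compute_ha1("test", "osXstream.local", "secret",
                  "392616736", "05E0A6E7-0B7B-4430-9549-0FE1C244ABAB")
print(ha1)
```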
Step 4: Hash the string AUTHENTICATE:[digest-uri]. So we'll be hashing "AUTHENTICATE:xmpp/osXstream.local", and we store its hex value in a string HA2.
The value of the string will be "2b09ce6dd013d861f2cb21cc8797a64d".
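Step 4 is the simplest of the lot. A quick Python sketch (again, `compute_ha2` is only an illustrative name):

```python
import hashlib

def compute_ha2(digest_uri: str) -> str:
    # Step 4: hex MD5 of "AUTHENTICATE:" followed by the digest-uri
    return hashlib.md5(f"AUTHENTICATE:{digest_uri}".encode("utf-8")).hexdigest()

ha2 = compute_ha2("xmpp/osXstream.local")
print(ha2)
```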
Step 5: Hash: HA1:nonce:nc:cnonce:qop:HA2
So we'll be hashing:
Store its hex value as the result. It should be 37991b870e0f6cc757ec74c47877472b, the response value we set out to compute.
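Putting all five steps in one place, here's a rough end-to-end sketch in Python. The function names are mine, and the default of qop="auth" is an assumption (use whatever qop your server's challenge actually offered):

```python
import hashlib

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def digest_response(username, realm, password, nonce, cnonce,
                    digest_uri, nc="00000001", qop="auth"):
    # Steps 1-3: HA1 from the raw digest plus ":nonce:cnonce"
    ha1_data = hashlib.md5(
        f"{username}:{realm}:{password}".encode("utf-8")
    ).digest()
    ha1 = md5_hex(ha1_data + f":{nonce}:{cnonce}".encode("utf-8"))
    # Step 4: HA2 from AUTHENTICATE:digest-uri
    ha2 = md5_hex(f"AUTHENTICATE:{digest_uri}".encode("utf-8"))
    # Step 5: HA1:nonce:nc:cnonce:qop:HA2, all as hex/plain strings
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}".encode("utf-8"))

response = digest_response(
    "test", "osXstream.local", "secret",
    "392616736", "05E0A6E7-0B7B-4430-9549-0FE1C244ABAB",
    "xmpp/osXstream.local",
)
print(response)
```

If your output doesn't match what the server expects, print the intermediate HA1 and HA2 values and compare them against the ones given in the steps above. That narrows the mistake down to one step instead of five.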
And we're done! That's the hard part. Now you just package up the response value along with all the other stuff, encode it as base64, and send it across the wire.
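The packaging step could be sketched like this. The choice of which directives get quoted follows my reading of the DIGEST-MD5 RFC, and the qop and charset values are assumptions; `build_response_payload` is a hypothetical helper:

```python
import base64

# Directives that are sent as quoted strings; the rest go out bare.
QUOTED = {"username", "realm", "nonce", "cnonce", "digest-uri"}

def build_response_payload(fields: dict) -> str:
    parts = []
    for key, value in fields.items():
        parts.append(f'{key}="{value}"' if key in QUOTED else f"{key}={value}")
    # No spaces or line breaks: just comma-separated pairs, then base64.
    return base64.b64encode(",".join(parts).encode("utf-8")).decode("ascii")

payload = build_response_payload({
    "username": "test",
    "realm": "osXstream.local",
    "nonce": "392616736",
    "cnonce": "05E0A6E7-0B7B-4430-9549-0FE1C244ABAB",
    "nc": "00000001",
    "qop": "auth",            # assumed
    "digest-uri": "xmpp/osXstream.local",
    "response": "37991b870e0f6cc757ec74c47877472b",
    "charset": "utf-8",       # assumed
})

stanza = f"<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>{payload}</response>"
print(stanza)
```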
A few other things I should mention:
Sometimes the server sends back the rspauth in another challenge element. Servers do this because this is how the example is given in the original RFC. The client then has to send an empty response to this prior to receiving the success element. Hopefully, in the future, servers will put an end to this insanity and implement it like my example, since this is how it probably should be. 1, 2
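If you want to check the rspauth value rather than just acknowledge it, my understanding of the RFC is that it's computed exactly like the response, except that step 4 hashes ":[digest-uri]" with nothing in front of the colon instead of "AUTHENTICATE:[digest-uri]". A hedged sketch, with made-up function names and qop="auth" assumed:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def digest_value(username, realm, password, nonce, cnonce, digest_uri,
                 method="AUTHENTICATE", nc="00000001", qop="auth"):
    ha1_data = hashlib.md5(
        f"{username}:{realm}:{password}".encode("utf-8")
    ).digest()
    ha1 = md5_hex(ha1_data + f":{nonce}:{cnonce}".encode("utf-8"))
    # method is "AUTHENTICATE" for our response, empty for the server's rspauth
    ha2 = md5_hex(f"{method}:{digest_uri}".encode("utf-8"))
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}".encode("utf-8"))

args = ("test", "osXstream.local", "secret", "392616736",
        "05E0A6E7-0B7B-4430-9549-0FE1C244ABAB", "xmpp/osXstream.local")

our_response = digest_value(*args)
expected_rspauth = digest_value(*args, method="")
print(expected_rspauth)
```

Comparing `expected_rspauth` with the rspauth the server sends tells you the server really knows your password too, which is the point of that extra round trip.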
After you've authenticated, you'll still need to bind your resource.
After binding, some servers require you to initiate a session before they'll communicate with you.