Sunday, September 2, 2007

Example please...

I've been working hard recently on a native Cocoa XMPP library. (If you don't know what XMPP is, you can read the Wikipedia article. It's the protocol behind Jabber and Google Talk.) One of the difficulties I ran into was implementing SASL digest authentication. The XMPP RFC gives an example, but doesn't tell you how the proper response was computed. After some googling, I stumbled upon the RFC for Digest Authentication as a SASL Mechanism (RFC 2831). There I found a rather cryptic (yet detailed) algorithm for creating the proper responses. Of course, I couldn't apply the algorithm to the XMPP example in the RFC, since the author didn't bother to tell us what password he used. Fortunately the SASL document had an example. The only problem was that I couldn't get my code to match it. Where was I going wrong?

That's one of the problems with writing code. It's basically like a long math problem. And in the end if your answer doesn't match what's in the back of the book, the only thing you know is that there's a mistake somewhere between step 1 and step 80. Happy hunting!

So for the benefit of the community, I'm going to break down an example step-by-step.

First, an overview of the stream communication. The server sends the stream features, the challenge, and the success element; the client sends the auth and response elements; my comments appear in between. The authentication will be for user "test" with password "secret".




<stream:features>
<mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
<mechanism>DIGEST-MD5</mechanism>
</mechanisms>
</stream:features>


<auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='DIGEST-MD5'/>


<challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
bm9uY2U9IjM5MjYxNjczNiIscW9wPSJhdXRoIixjaGFyc2V0PXV0Zi04LGFsZ29yaXRobT1tZDUtc2Vzcw==
</challenge>


There are actually no spaces or line breaks anywhere in the
challenge above; any formatting was added for readability. The data is
encoded in base64, and here's what it says:


nonce="392616736",qop="auth",charset=utf-8,algorithm=md5-sess


<response xmlns="urn:ietf:params:xml:ns:xmpp-sasl">
dXNlcm5hbWU9InRlc3QiLHJlYWxtPSJvc1hzdHJlYW0ubG9jYWwiLG5vbmNlPSIzOTI2MTY3MzYiLGNub25jZT0iMDVFMEE2RTctMEI3Qi00NDMwLTk1NDktMEZFMUMyNDRBQkFCIixuYz0wMDAwMDAwMSxxb3A9YXV0aCxkaWdlc3QtdXJpPSJ4bXBwL29zWHN0cmVhbS5sb2NhbCIscmVzcG9uc2U9Mzc5OTFiODcwZTBmNmNjNzU3ZWM3NGM0Nzg3NzQ3MmIsY2hhcnNldD11dGYtOA==
</response>


Again, remember that there are really no spaces or line breaks in the
response above. I added them simply for readability. The payload
is encoded in base64 and says (without spaces/formatting):


username="test",realm="osXstream.local",nonce="392616736",
cnonce="05E0A6E7-0B7B-4430-9549-0FE1C244ABAB",nc=00000001,
qop=auth,digest-uri="xmpp/osXstream.local",
response=37991b870e0f6cc757ec74c47877472b,charset=utf-8


<success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
cnNwYXV0aD1mMDc5YmIwY2FhYjFhMWYwZjViMDk1MDAxMGRjZWU0Yw==
</success>


No spaces, base64, etc:
rspauth=f079bb0caab1a1f0f5b0950010dcee4c
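
If you want to check the decoding yourself, here is a minimal Python sketch (Python is only a stand-in here; it is not the language of the library). The helper name parse_digest_payload is made up for this illustration, and it assumes that values never contain commas or escaped quotes, which holds for the payloads in this example but not for every server.

```python
import base64

def parse_digest_payload(b64_text: str) -> dict:
    """Decode a base64 SASL payload and split it into key/value pairs.

    Simplification (assumption): values contain no commas or escaped quotes.
    """
    decoded = base64.b64decode(b64_text).decode("utf-8")
    fields = {}
    for pair in decoded.split(","):
        key, _, value = pair.partition("=")
        fields[key.strip()] = value.strip().strip('"')  # drop optional quotes
    return fields

# The challenge and success payloads from the stream above
challenge = parse_digest_payload(
    "bm9uY2U9IjM5MjYxNjczNiIscW9wPSJhdXRoIixjaGFyc2V0PXV0Zi04LGFsZ29yaXRobT1tZDUtc2Vzcw=="
)
success = parse_digest_payload(
    "cnNwYXV0aD1mMDc5YmIwY2FhYjFhMWYwZjViMDk1MDAxMGRjZWU0Yw=="
)

print(challenge["nonce"], challenge["qop"], challenge["algorithm"])
print(success["rspauth"])
```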




The big thing to figure out is the response (37991b870e0f6cc757ec74c47877472b) value. How do we go about getting this value?

We'll need to hash (md5) a bunch of values together to get this response. Here's what we'll need:

username=test
password=secret
realm=osXstream.local
nonce=392616736
qop=auth
cnonce=05E0A6E7-0B7B-4430-9549-0FE1C244ABAB
digest-uri=xmpp/osXstream.local
nc=00000001

The username and password are what we're using to authenticate.

Where did the realm come from? Sometimes it's supplied by the server inside the challenge. In this particular example it wasn't, so we used the domain identifier of the server. Our server is an ejabberd server running on the local machine, which is why it has a funny name. But in your case it would be something like "deusty.com", "gmail.com", "jabber.org", etc. I.e., if your JID is "johnDoe@deusty.com" then your realm (if not otherwise stated in the challenge) is "deusty.com".

The nonce and qop values were supplied by the server in the challenge. The nonce will be different for each challenge the server sends. In this case it looks like a random value.

cnonce is a random string we provide. To make one yourself you could simply use a random number generator. Mine is a UUID (universally unique identifier); I chose UUIDs because they're so simple to generate with Core Foundation's CFUUID.
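
As a rough Python equivalent (an assumption for illustration, not what the Cocoa library does), the standard uuid module can produce a cnonce in the same shape as the one used in this example. The function name make_cnonce is hypothetical.

```python
import uuid

def make_cnonce() -> str:
    """Return a random UUID string, upper-cased to match the example's style."""
    return str(uuid.uuid4()).upper()

cnonce = make_cnonce()
print(cnonce)  # different on every run
```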

digest-uri is easy to make: xmpp/[domainID]
So if you were johnDoe@deusty.com, then your digest-uri would be "xmpp/deusty.com".
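
A small Python sketch of how the realm and digest-uri could be derived from a JID, following the rules described above. The helper names (domain_of, default_realm, digest_uri) are invented for this example, and the JID parsing is deliberately simplistic.

```python
def domain_of(jid: str) -> str:
    """Return the domain part of a JID such as 'user@domain/resource'."""
    bare = jid.split("/", 1)[0]    # drop an optional resource
    return bare.split("@", 1)[-1]  # drop an optional user part

def default_realm(jid: str, challenge_realm=None) -> str:
    """Use the realm from the challenge if there is one, else the JID's domain."""
    return challenge_realm if challenge_realm else domain_of(jid)

def digest_uri(jid: str) -> str:
    """Build the digest-uri as xmpp/[domainID]."""
    return "xmpp/" + domain_of(jid)

print(default_realm("johnDoe@deusty.com"))  # deusty.com
print(digest_uri("johnDoe@deusty.com"))     # xmpp/deusty.com
```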

And nc is the nonce count: the number of requests (including the current one) that you've sent to the server with this nonce, written as eight hexadecimal digits. But since we're only going to be sending one, we really don't have to worry about it. Just make it 00000001 like I did.
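
If you ever do need to count, the formatting is just a zero-padded, eight-digit hexadecimal number. A tiny Python sketch (the function name format_nc is made up here):

```python
def format_nc(count: int) -> str:
    """Format the nonce count as eight lowercase hexadecimal digits."""
    return format(count, "08x")

print(format_nc(1))  # 00000001
```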

Step 1: Combine username:realm:password and md5 hash them.
So we hash "test:osXstream.local:secret" (without the quotes) and store the result in a variable called HA1data.

Here's the trick - normally when you hash stuff you get a result in hex values. But we don't want this result as a string of hex values! We need to keep the result as raw data! If you were to do a hex dump of this data you'd find it to be "3a4f5725a748ca945e506e30acd906f0". But remember, we need to operate on its raw data, so don't convert it to a string.
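
Here is step 1 as a minimal Python sketch, using the standard hashlib module as a stand-in for whatever MD5 routine you have. The variable name ha1_data mirrors HA1data above. The point is to call digest(), which gives the 16 raw bytes, rather than hexdigest(), which gives a 32-character string.

```python
import hashlib

username, realm, password = "test", "osXstream.local", "secret"

# digest() returns the raw bytes of the hash; keep these for step 2.
ha1_data = hashlib.md5(f"{username}:{realm}:{password}".encode("utf-8")).digest()

print(len(ha1_data))   # 16
print(ha1_data.hex())  # hex dump, for inspection only; should match the value quoted above
```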

Step 2: We need to combine the result from step 1 with the nonce and cnonce.
So we hash [HA1data]:nonce:cnonce

But wait, the result from step 1 is in raw data format, and the new stuff is a string. So we convert our string ":392616736:05E0A6E7-0B7B-4430-9549-0FE1C244ABAB" into raw utf-8 data, and append this to the end of HA1data.

Step 3: Hash the data from step 2, and store its hex value in a string HA1.
The value of the string will be "b9709c3cdb60c5fab0a33ebebdd267c4".
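
Steps 2 and 3 could look like this in Python (again only a sketch with hashlib; the variable names are chosen to mirror the text). Note that the raw bytes from step 1 are concatenated with the UTF-8 bytes of the ":nonce:cnonce" string, and only the final result is turned into hex.

```python
import hashlib

nonce = "392616736"
cnonce = "05E0A6E7-0B7B-4430-9549-0FE1C244ABAB"

# Step 1 again, so this snippet runs on its own: raw bytes, not hex
ha1_data = hashlib.md5(b"test:osXstream.local:secret").digest()

# Step 2: append ":nonce:cnonce" as UTF-8 bytes to the raw hash
combined = ha1_data + f":{nonce}:{cnonce}".encode("utf-8")

# Step 3: hash the combined bytes and keep the hex string this time
ha1 = hashlib.md5(combined).hexdigest()
print(ha1)
```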

Step 4: Hash the string AUTHENTICATE:[digest-uri]. So we'll be hashing "AUTHENTICATE:xmpp/osXstream.local", and we store its hex value in a string HA2.
The value of the string will be "2b09ce6dd013d861f2cb21cc8797a64d".
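
Step 4 is an ordinary string hash, so a Python sketch of it is short (hashlib assumed as before):

```python
import hashlib

digest_uri = "xmpp/osXstream.local"

# Hash the plain string and keep the hex form
ha2 = hashlib.md5(f"AUTHENTICATE:{digest_uri}".encode("utf-8")).hexdigest()
print(ha2)
```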

Step 5: Hash: HA1:nonce:nc:cnonce:qop:HA2
So we'll be hashing:
b9709c3cdb60c5fab0a33ebebdd267c4:392616736:00000001:05E0A6E7-0B7B-4430-9549-0FE1C244ABAB:auth:2b09ce6dd013d861f2cb21cc8797a64d

Store its hex value as the result. It should be:
37991b870e0f6cc757ec74c47877472b
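
Putting the five steps together, here is one way to write the whole calculation in Python. This is a sketch under the assumptions of this example (qop=auth, no authzid); the function name digest_md5_response is made up for this illustration.

```python
import hashlib

def md5_raw(data: bytes) -> bytes:
    return hashlib.md5(data).digest()

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def digest_md5_response(username, realm, password, nonce, cnonce,
                        digest_uri, nc="00000001", qop="auth"):
    """Compute the DIGEST-MD5 response value following steps 1 to 5."""
    # Step 1: raw MD5 of username:realm:password
    ha1_data = md5_raw(f"{username}:{realm}:{password}".encode("utf-8"))
    # Steps 2 and 3: raw hash + ":nonce:cnonce", hashed again, kept as hex
    ha1 = md5_hex(ha1_data + f":{nonce}:{cnonce}".encode("utf-8"))
    # Step 4: hex MD5 of AUTHENTICATE:digest-uri
    ha2 = md5_hex(f"AUTHENTICATE:{digest_uri}".encode("utf-8"))
    # Step 5: hex MD5 of HA1:nonce:nc:cnonce:qop:HA2
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}".encode("utf-8"))

response = digest_md5_response(
    username="test",
    realm="osXstream.local",
    password="secret",
    nonce="392616736",
    cnonce="05E0A6E7-0B7B-4430-9549-0FE1C244ABAB",
    digest_uri="xmpp/osXstream.local",
)
print(response)
```

If your own implementation disagrees with the final value, comparing the intermediate values from steps 1, 3 and 4 should show where it diverges.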

And we're done! That's the hard part. Now you just package up the response value along with all the other stuff, encode it as base64, and send it across the wire.
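
The packaging step might look like this in Python. It is a sketch that reproduces the field order and quoting of the example response above; the function name build_response_payload is hypothetical, and other servers may expect additional fields.

```python
import base64

def build_response_payload(username, realm, nonce, cnonce, nc, qop,
                           digest_uri, response):
    """Assemble the comma-separated field list and base64-encode it."""
    text = (
        f'username="{username}",realm="{realm}",nonce="{nonce}",'
        f'cnonce="{cnonce}",nc={nc},qop={qop},'
        f'digest-uri="{digest_uri}",response={response},charset=utf-8'
    )
    return base64.b64encode(text.encode("utf-8")).decode("ascii")

payload = build_response_payload(
    username="test",
    realm="osXstream.local",
    nonce="392616736",
    cnonce="05E0A6E7-0B7B-4430-9549-0FE1C244ABAB",
    nc="00000001",
    qop="auth",
    digest_uri="xmpp/osXstream.local",
    response="37991b870e0f6cc757ec74c47877472b",
)

# Wrap the payload in the response element, with no added whitespace
stanza = f'<response xmlns="urn:ietf:params:xml:ns:xmpp-sasl">{payload}</response>'
print(stanza)
```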

A few other things I should mention:

Sometimes the server sends back the rspauth in another challenge element. Servers do this because this is how the example is given in the original RFC. The client then has to send an empty response to this prior to receiving the success element. Hopefully, in the future, servers will put an end to this insanity and implement it like my example, since this is how it probably should be. 1, 2
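
For servers that send rspauth in a second challenge, here is a hedged Python sketch of what the client side could do. According to RFC 2831, rspauth is computed like the response, except that the string hashed in step 4 is ":" followed by the digest-uri (no "AUTHENTICATE"). The function names are invented for this example, and the HA1 constant is the hex value from step 3 above.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def expected_rspauth(ha1, nonce, nc, cnonce, qop, digest_uri):
    """Compute the value the server should send as rspauth (per RFC 2831)."""
    ha2 = md5_hex(f":{digest_uri}".encode("utf-8"))  # note: no AUTHENTICATE prefix
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}".encode("utf-8"))

EMPTY_RESPONSE = "<response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>"

def reply_to_rspauth_challenge(fields: dict, expected: str) -> str:
    """Return the empty response stanza if the server's rspauth checks out."""
    if fields.get("rspauth") != expected:
        raise ValueError("server rspauth does not match")
    return EMPTY_RESPONSE

expected = expected_rspauth(
    ha1="b9709c3cdb60c5fab0a33ebebdd267c4",
    nonce="392616736",
    nc="00000001",
    cnonce="05E0A6E7-0B7B-4430-9549-0FE1C244ABAB",
    qop="auth",
    digest_uri="xmpp/osXstream.local",
)
print(expected)
```

If the server follows the RFC, the printed value should equal the rspauth it sent. Checking it is how the client confirms that the server also knows the password.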

After you've authenticated, you'll still need to bind your resource.

After binding, some servers require you to initiate a session before they'll communicate with you.

18 comments:

Anonymous said...

I am having the same trouble building the XMPP client's response. I fail at the final step of the SASL challenge-response: the XMPP server sends "not authorized". I have one question. I can reproduce your result up to step 2 of making the response, but I cannot get the same data in step 3 ("b970...."). Is the UTF-8 encoded data the same as pure ASCII, for example for "3a4f5..."? In other words, is the "3a4f572..." raw data string the same as the "3a4f572..." UTF-8 encoded data, since we only use ASCII characters?
http://en.wikipedia.org/wiki/UTF-8

Could you explain how to convert the raw data to UTF-8 encoded data?
Why do we use UTF-8 encoding for ASCII characters? Are the contents of the data the same?
Is the UTF-8 encoded data different from the pure raw data?
Or could you show us a good UTF-8 encoding sample?

PS: Sorry, I'm an asian. English is not well.

Anonymous said...

Sorry, I forgot to give my name, "mam", in the comment above.

For example, I calculated step 2 of making the response.
(Of course, this is a bad example...)
Why?

A1=
HA1data:nonce:cnonce=
"3a4f572a5a748ca945e506e30acd90906f0:392616736:05E0A6E7-0B7B-4430-9549-0FE1C244ABAB"
* concatenate as string...

--->
Step-3:
H(A1) =
df7947fbd06f285b831c996f5f1f96af

Why shouldn't we calculate it as above?

--- mam ---.

Robbie Hanson said...

Hi anonymous,

I'll do my best to help you get the problem solved. It sounds like you're having problems with the really tricky step.

First, here's example source code. This may clear up a few things. I'll try to expand upon it in another comment.

(you may have to copy and paste to see it all)

SSCrypto *crypto = [[[SSCrypto alloc] init] autorelease];

// Inputs for step 1 (username:realm:password) and step 4 (AUTHENTICATE:digest-uri)
NSString *HA1str = [NSString stringWithFormat:@"%@:%@:%@", username, realm, password];
NSString *HA2str = [NSString stringWithFormat:@"AUTHENTICATE:%@", digestURI];

// Step 1: MD5 of username:realm:password, kept as raw bytes (no hex conversion)
[crypto setClearTextWithString:HA1str];
NSData *HA1dataA = [crypto digest:@"MD5"];
// Step 2: ":nonce:cnonce" as UTF-8 bytes, to be appended to the raw hash
NSData *HA1dataB = [[NSString stringWithFormat:@":%@:%@", nonce, cnonce] dataUsingEncoding:NSUTF8StringEncoding];

NSMutableData *HA1data = [NSMutableData dataWithCapacity:([HA1dataA length] + [HA1dataB length])];
[HA1data appendData:HA1dataA];
[HA1data appendData:HA1dataB];

// Step 3: hash the combined bytes; this time keep the hex string
[crypto setClearTextWithData:HA1data];
NSString *HA1 = [[crypto digest:@"MD5"] hexval];

// Step 4: hex MD5 of AUTHENTICATE:digest-uri
[crypto setClearTextWithString:HA2str];
NSString *HA2 = [[crypto digest:@"MD5"] hexval];

// Step 5: hex MD5 of HA1:nonce:nc:cnonce:qop:HA2 (nc and qop are hard-coded)
NSString *responseStr = [NSString stringWithFormat:@"%@:%@:00000001:%@:auth:%@",
                         HA1, nonce, cnonce, HA2];

[crypto setClearTextWithString:responseStr];
NSString *response = [[crypto digest:@"MD5"] hexval];

return response;

Robbie Hanson said...

Here is a better explanation of the "raw data" vs "hex string" difference.

First, note that hashing (such as an md5 hash) works on the raw binary data. Think 1's and 0's...

In step 1, the result (in raw data, printed in hex) from the hash is this:
3a4f5725a748ca945e506e30acd906f0

However, if you convert this to a string, then you convert it to an array of characters "3a4f..." But wait! This is a problem. Because what is the raw data of a character? It's very different! The raw data, printed in hex, of the UTF-8 character 3 is 33. And a = 61. So if you convert it to a string, then its raw data becomes this:
33613466 35373235 61373438 63613934 35653530 36653330 61636439 30366630

(spaces added for readability)

This is the difference between "raw" and string values. Why did the SASL Digest RFC people decide to make it so complicated? I don't know. Digest access authentication in HTTP doesn't do any of this "raw" data stuff...
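
To make the difference concrete, here is a small Python sketch (Python is used only for illustration; hashlib stands in for the MD5 routine). It hashes the step 2 input twice: once with the raw 16 bytes, and once with the 32 ASCII characters of the hex string, which is the mistake described above.

```python
import hashlib

raw = hashlib.md5(b"test:osXstream.local:secret").digest()  # 16 raw bytes
as_hex_text = raw.hex().encode("utf-8")  # 32 bytes: the character codes of the hex digits
tail = b":392616736:05E0A6E7-0B7B-4430-9549-0FE1C244ABAB"

right = hashlib.md5(raw + tail).hexdigest()          # raw bytes + string bytes
wrong = hashlib.md5(as_hex_text + tail).hexdigest()  # hex string + string bytes

print(len(raw), len(as_hex_text))  # 16 32
print(as_hex_text.hex())           # the 3361... dump shown above, without spaces
print(right != wrong)              # True

# If all you have is the hex string, bytes.fromhex() converts it back to raw bytes.
recovered = bytes.fromhex(raw.hex())
```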

Anonymous said...

Dear Robbie Hanson,

Oh it is very clear explanation.
string "3" is actually "33" binary data=(ASCII code).

I will try it.
Thank you very much.

For example, I saved the string "3a4f" in a text file vv with an editor and converted it with "nkf -w80 vv", and I got the same answer, "3a4f".
This means that the contents of the file are 0x33 0x61 0x34 0x66.
If we keep thinking this way, we get into trouble. I was stuck in this loop for 2 months.

--mam--

Anonymous said...

Great write-up. I couldn't find this info anywhere.

Anonymous said...

Nice article!
Great! Fantastic! Thank you very much :)
(I wrote jabber server ;) )

Anonymous said...

Here is a PHP function, str2hex, useful for getting a correct "raw format".
(Do not forget to loop over the entire string!)


function str2hex($string) {
    $hex = "";
    // Append each byte as two lowercase hex digits (same output as bin2hex()).
    for ($i = 0; $i < strlen($string); $i++) {
        $hex .= sprintf("%02x", ord($string[$i]));
    }
    return $hex;
}

Brmm said...

Great post, the only one I could find on the internet which explains SASL auth response for XMPP.

I know it's been a long time since you posted this, but..

I don't understand the critical part with the raw data. If you're not allowed to save the MD5 hash output as a String because it has to stay raw data, then how do you put it in a variable anyway? What is the data type then, so you can work with it?

I'm using this for an ActionScript project and I'm saving it in a ByteArray, but that doesn't seem to work for me. I always get the wrong hash output.

I think I'm not following here.

grz

Robbie Hanson said...

Hi Brmm,

In Cocoa we save it in NSData, which is just a byte buffer.

Anonymous said...

I am not familiar with the language you are using.

For the 'tricky part', could you please post a php version? I am still not coming up with the same values you are using and it breaks down when I combine the $HA1data with the nonce and cnonce strings. I come up with a completely different value. I am beginning to wonder if this can be done in php.

Thanks for your help!!

kai said...

At last I got this right! Thank you very much for this post! It helps a lot. :D

Anonymous said...

2011? Better late than never. Thank you for the nice example. It was still quite some trouble for me to figure out how to get it done with C++... no abstraction whatsoever. Anybody attempting this with C++ should just store nonce+cnonce as unsigned char*; the same goes for the intermediate HA1 result. Concatenate and hash.

Anonymous said...

just wanted to say: thanks! with this example i was able to pinpoint a bug in my code. now it works!

Chamini Perera said...

I'm also trying to implement this with C++. It didn't work for me; it gave me the same result Mam got (as mentioned above). Please, any ideas on how to implement this with C++?

Andrew Poltavchenko said...

Thank you, people! This article helped me a lot. Maybe I'm stupid, but only after reading this text did I make progress. :)

P.S. I'm sorry, I'm Russian and English is my weak side. :)

John said...

Stumbled across this while looking for a walkthrough for exactly this. I was able to step by step create and test a method to reproduce your results!

Many thanks for taking the time to post this explanation!

Thank you!

Anonymous said...

Hi everyone,

Great article, very useful to have actual test values when debugging auth.

in PHP, the easiest way I found to do the "tricky part" is to use the pack function
$Y = pack('H32', md5($X));

PHP sample code can be found here : https://github.com/Gugli/jabberd2/blob/master/tools/sendupdatepacket.php

-Gugli-