Saturday, September 11, 2010

WebSocket Draft 76 Algorithm Example

The latest versions of Google Chrome and Safari have switched to draft 76 of the WebSocket protocol. Draft 76 is drastically different from draft 75 and is not backwards compatible with it. In other words, if your server doesn't support d76, then the newest browsers won't work with the cool WebSocket stuff you wrote.

This means server/framework developers will need to upgrade their WebSocket support (if they haven't already). The trouble is, the spec is unnecessarily difficult to read. And I'm not the only one who thinks so:

The specification document is just not readable unless you want to go completely insane. [1]

it's both over engineered and absolutely badly documented via those "specs" [2]


If you want a quick and understandable overview of the changes, I'd recommend this page.

But if you're looking for a good quality example with step-by-step instructions (or maybe just a unit test to use while you're implementing the code), you'll find them few and far between. So I thought I would provide one.

The example comes from the spec, but I'll break down the steps one-by-one:

Incoming Request:

GET / HTTP/1.1
Upgrade: WebSocket
Connection: Upgrade
Host: example.com
Origin: http://example.com
Sec-WebSocket-Key1: 18x 6]8vM;54 *(5:  {   U1]8  z [  8
Sec-WebSocket-Key2: 1_ tx7X d  <  nw  334J702) 7]o}` 0

Tm[K T2u

Outgoing Response:

HTTP/1.1 101 WebSocket Protocol Handshake
Upgrade: WebSocket
Connection: Upgrade
Sec-WebSocket-Origin: http://example.com
Sec-WebSocket-Location: ws://example.com/

fQJ,fN/4F4!~K~MH

So how do we calculate the response body? (The fQJ,fN... part) Very carefully.

The first thing to note is that the request body is exactly 8 bytes. We are displaying it here as if each byte were interpreted as ASCII text. Similarly, the response body is exactly 16 bytes, again displayed as if each byte were ASCII text.

Here is an overview of the steps we need to perform:

  1. Extract each digit from Sec-WebSocket-Key1 and concatenate them

  2. Count the number of spaces in Sec-WebSocket-Key1

  3. Divide #1 by #2

  4. Convert #3 to a 32-bit big-endian integer (network byte order)

  5. Repeat steps 1-4 for Sec-WebSocket-Key2

  6. Concatenate #4, #5, and the request body

  7. Perform MD5 digest of #6


Sec-WebSocket-Key1 was given as
18x 6]8vM;54 *(5:  {   U1]8  z [  8

So this gives us the number 1868545188 for step 1. And there are 12 spaces (step 2).
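Steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration, not code from any particular server. Note that the header value is written here with its runs of consecutive spaces intact (a rendered web page can collapse them into single spaces); the exact spacing shown is my assumption, chosen to match the count of 12.

```python
# Sec-WebSocket-Key1 from the example request. The runs of consecutive
# spaces matter; this spacing is assumed so that the count comes out to 12.
key1 = "18x 6]8vM;54 *(5:  {   U1]8  z [  8"

# Step 1: keep only the digit characters and read them as one decimal number.
key_number = int("".join(ch for ch in key1 if ch in "0123456789"))

# Step 2: count the space characters (plain 0x20 spaces only).
spaces = key1.count(" ")

print(key_number, spaces)  # 1868545188 12
```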

1868545188 / 12 = 155712099 (step 3)

To convert this value to big-endian (network byte order) you might use something like this, or a similar language-dependent function. The end result, if printed in hexadecimal, is <0947fa63>.
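In Python, for instance, steps 3 and 4 might look like the sketch below, using the standard struct module. The variable names are mine. The little-endian line is only there for contrast, to show what the wrong byte order looks like.

```python
import struct

key_number = 1868545188  # digits of Sec-WebSocket-Key1, concatenated
spaces = 12              # number of spaces in Sec-WebSocket-Key1

# Step 3: integer division. The example divides evenly; a remainder here
# would suggest the key was mangled somewhere along the way.
quotient, remainder = divmod(key_number, spaces)
assert remainder == 0

# Step 4: ">I" means big-endian (network byte order), unsigned 32-bit.
part1 = struct.pack(">I", quotient)
print(quotient, part1.hex())  # 155712099 0947fa63

# For contrast only: little-endian packing gives the bytes reversed.
print(struct.pack("<I", quotient).hex())  # 63fa4709
```

The result is a 4-byte value, not a number and not a hex string. The hex is just a convenient way to print it.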

Repeating these steps for Sec-WebSocket-Key2:
1733470270 / 10 = 173347027 (<0a5510d3>)
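Since steps 1-4 are identical for both keys, it may be convenient to wrap them in one helper. The function name key_to_bytes and its error handling are hypothetical, and the spacing inside the key2 string is assumed so that it contains the 10 spaces used above.

```python
import struct

def key_to_bytes(key: str) -> bytes:
    """Steps 1-4 for one Sec-WebSocket-Key header value (hypothetical helper)."""
    digits = "".join(ch for ch in key if ch in "0123456789")
    spaces = key.count(" ")
    if not digits or spaces == 0:
        raise ValueError("key needs at least one digit and one space")
    number = int(digits)
    if number % spaces != 0:
        raise ValueError("number is not a multiple of the space count")
    # Big-endian unsigned 32-bit integer, returned as 4 raw bytes.
    return struct.pack(">I", number // spaces)

# Sec-WebSocket-Key2 from the example request (spacing assumed: 10 spaces).
key2 = "1_ tx7X d  <  nw  334J702) 7]o}` 0"
print(key_to_bytes(key2).hex())  # 0a5510d3
```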

At this point you have a 4-byte value from steps 1-4 (<0947fa63>) and a 4-byte value from step 5 (<0a5510d3>). We are now supposed to concatenate these with the request body.

Are we concatenating strings? No.
Are we concatenating numbers? No.
We are concatenating raw bytes. Think 0's and 1's.

If we print these raw bytes in hexadecimal we get:
Part 1: 0947fa63
Part 2: 0a5510d3
Part 3: 546d5b4b20543275

Part 3 came from "Tm[K T2u" in the request body. This is the 8 bytes you read from the socket after the request headers. If you took those 8 bytes and printed them as a series of 8 ASCII characters, you would get "Tm[K T2u". If you printed those 8 bytes as hexadecimal, you would get "546d5b4b20543275".

So if we concatenate these parts we get <0947fa630a5510d3546d5b4b20543275>. Notice this is 16 bytes total.
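As a quick sketch of what "concatenating raw bytes" means in practice, here it is with Python's bytes type. The names part1, part2, part3 and challenge are just labels I picked.

```python
part1 = bytes.fromhex("0947fa63")  # 4 bytes derived from Sec-WebSocket-Key1
part2 = bytes.fromhex("0a5510d3")  # 4 bytes derived from Sec-WebSocket-Key2
part3 = b"Tm[K T2u"                # the 8 bytes read after the request headers

# "+" on bytes objects joins the raw bytes; no text or number conversion.
challenge = part1 + part2 + part3

print(part3.hex())                      # 546d5b4b20543275
print(len(challenge), challenge.hex())  # 16 0947fa630a5510d3546d5b4b20543275
```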

We're almost done. The last step is to calculate the MD5 hash of <0947fa630a5510d3546d5b4b20543275>. Remember, you are passing this data to the MD5 routine as raw bytes. Don't convert it to a string, or anything like that.

You should get <66514a2c664e2f344634217e4b7e4d48> as the result. This is the result printed in hexadecimal. If we instead interpreted the 16 bytes as 16 ASCII characters, we would get:
fQJ,fN/4F4!~K~MH
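If you want to check this yourself, a minimal sketch with Python's hashlib is below. The thing to watch is that the input is the 16 raw bytes, and that you take digest() (raw bytes) rather than hexdigest() (a 32-character string) as the value to send.

```python
import hashlib

# The 16 raw bytes: key1 part + key2 part + request body.
challenge = bytes.fromhex("0947fa630a5510d3546d5b4b20543275")

digest = hashlib.md5(challenge).digest()  # 16 raw bytes

print(digest.hex())            # 66514a2c664e2f344634217e4b7e4d48
print(digest.decode("ascii"))  # fQJ,fN/4F4!~K~MH
```

Hashing the hex text itself (the 32-character string) would produce a different, wrong result.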

You are to send these raw 16 bytes immediately after the HTTP response headers.
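Putting it all together, here is a rough end-to-end sketch that doubles as a unit test for the example. Everything about its shape is my own assumption: the function names, the headers-as-dict input, the hard-coded ws:// scheme, and the lack of validation. The key strings are written with the spacing assumed to give 12 and 10 spaces.

```python
import hashlib
import struct

def key_to_bytes(key: str) -> bytes:
    # Steps 1-4: digits / spaces, packed as big-endian 32-bit (no validation).
    digits = "".join(ch for ch in key if ch in "0123456789")
    return struct.pack(">I", int(digits) // key.count(" "))

def build_handshake_response(headers: dict, body: bytes, path: str = "/") -> bytes:
    # Steps 5-7: concatenate the raw bytes and take the MD5 digest.
    token = hashlib.md5(
        key_to_bytes(headers["Sec-WebSocket-Key1"])
        + key_to_bytes(headers["Sec-WebSocket-Key2"])
        + body
    ).digest()
    lines = [
        "HTTP/1.1 101 WebSocket Protocol Handshake",
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Origin: " + headers["Origin"],
        "Sec-WebSocket-Location: ws://" + headers["Host"] + path,
    ]
    # Headers, a blank line, then the 16 raw digest bytes.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + token

request_headers = {
    "Host": "example.com",
    "Origin": "http://example.com",
    "Sec-WebSocket-Key1": "18x 6]8vM;54 *(5:  {   U1]8  z [  8",
    "Sec-WebSocket-Key2": "1_ tx7X d  <  nw  334J702) 7]o}` 0",
}
response = build_handshake_response(request_headers, b"Tm[K T2u")
print(response[-16:])  # b'fQJ,fN/4F4!~K~MH'
```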

Hope this helps someone.

 

11 comments:

Jitka Dařbujanová said...

Thanks for the article. Little mistake: Part3 should be 546d5b4b20543275

Robbie Hanson said...

Good catch! Fixed it.

Anonymous said...

Nice article, but it's not working for me. In step 4 you say to convert the value to big-endian. When I converted it to big-endian I got "7204148657912807424". In step 6, should I concatenate the value of key1 as big-endian (as I wrote) or as hexadecimal (0947fa63)? And should I concatenate the third key (the request body) as a string?

春天 said...

Thank you.
I searched for this for a long time.
I am from China.

Stefan said...

this really helped a lot ! thanks.

晏德 said...

Thank you for your article. It helped me find the problem in my program!!

magicmanpepe said...

Hey Deusty! Thanks for adding websocket support to cocoahttpserver! One request: would you be kind enough to create a Websocket client that can connect and communicate with a Websocket server? All the other cocoa websocket libraries do not support the new draft spec. Thanks!

Anonymous said...

Thanks for the info. It helped a lot.
For those who still have trouble with this, I can recommend Wireshark for troubleshooting the communication between client and server. For example, use echo.websocket.org as an echo server to investigate the server response.

Brent Gulanowski said...

WebSockets is a fast-moving target. The protocol spec has changed its name and gone through nine more versions of the draft. The important point is that it is a draft.

http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-09

Perhaps this is why the various Cocoa samples out there don't support binary frames yet (or even control frames).

Anonymous said...

Thanks for the great blog post and for CocoaHTTPServer in general...!

Any chance that we'll see an update in support of RFC 6455? Seems like iOS6's Mobile Safari now speaks this latest version of WebSocket...

Thanks!