Section 1.3 "Opening Handshake" of draft-ietf-hybi-thewebsocketprotocol-17 describes Sec-WebSocket-Key
as follows:
To prove that the
According to the RFC 6455 WebSocket standard,
first part:
.. the server has to prove to the client that it received the
client's WebSocket handshake, so that the server doesn't accept
connections that are not WebSocket connections. This prevents an
attacker from tricking a WebSocket server by sending it carefully
crafted packets using XMLHttpRequest [XMLHttpRequest] or a form
submission.
...
For this header field, the server has to take the value (as present
in the header field, e.g., the base64-encoded [RFC4648] version minus
any leading and trailing whitespace) and concatenate this with the
Globally Unique Identifier (GUID, [RFC4122]) "258EAFA5-E914-47DA-
95CA-C5AB0DC85B11" in string form, which is unlikely to be used by
network endpoints that do not understand the WebSocket Protocol.
second part:
The |Sec-WebSocket-Key| header field is used in the WebSocket opening
handshake. It is sent from the client to the server to provide part
of the information used by the server to prove that it received a
valid WebSocket opening handshake. This helps ensure that the server
does not accept connections from non-WebSocket clients (e.g., HTTP
clients) that are being abused to send data to unsuspecting WebSocket
servers.
So, because the value of the GUID is specified in the standard, it is unlikely (possible, but with very small probability) that a server unaware of WebSockets will use it. It does not provide any security (secure WebSockets, wss://, does); it just ensures that the server understands the WebSocket protocol.
Indeed, as you've mentioned, if you are aware of WebSockets (which is what is being checked), you could pretend to be a WebSocket server by sending the correct response. But if you then do not behave correctly (e.g. form frames correctly), it will be treated as a protocol violation. You can, in fact, write a WebSocket server that is incorrect, but there would not be much use in it.
Another purpose is to prevent clients from accidentally requesting a WebSocket upgrade without expecting one (say, by adding the corresponding headers manually and then expecting something else). Browsers prohibit setting Sec-WebSocket-Key and other related headers with the setRequestHeader method.
The RFC 6455 spec shows the (minimum) 4 lines that the server needs to send back to the client (browser). The hardest part is confirming that your WebSocket server's C code is doing the right calculations. Here's a short PHP script (PHP is easy to install on all OSes) that will properly calculate the key to reply with. Hard-code the key you get from the client (browser) into the 2nd line below:
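<?php
$client_websocket_key = "IRhw449z7G0Mov9CahJ+Ow=="; // Sec-WebSocket-Key value sent by the client
$concat = $client_websocket_key . "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"; // append the fixed GUID from RFC 6455
$ascii_sha1 = sha1( $concat ); // hex digest; print this one for debugging, not used for real value.
$sha1 = sha1( $concat, true ); // raw 20-byte binary digest
echo base64_encode( $sha1 ); // value for the Sec-WebSocket-Accept response header
?>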
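For cross-checking, the same calculation can be sketched in Python (chosen here only because it ships with SHA-1 and base64 in its standard library). The function names `compute_accept` and `build_handshake_response` are hypothetical, and the four response lines follow the example response in RFC 6455. The sample key and its expected result are the ones given in RFC 6455 section 1.3, so they are a convenient known-good test vector.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(client_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a Sec-WebSocket-Key value."""
    # Strip surrounding whitespace, append the GUID, SHA-1 the result,
    # then base64-encode the raw 20-byte digest (not the hex string).
    digest = hashlib.sha1((client_key.strip() + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_handshake_response(client_key: str) -> str:
    """Build a minimal four-line 101 response (hypothetical helper)."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {compute_accept(client_key)}\r\n"
        "\r\n"
    )


# Sample key from RFC 6455, section 1.3.
print(compute_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If your C server produces a different value for the RFC sample key, the usual culprit is base64-encoding the hex digest instead of the raw digest.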
I'm inclined to agree.
Nothing of importance would change if the client ignored the value of the Sec-WebSocket-Accept header.
Why? Because the server is not proving anything by doing this calculation (other than that it has the code to do the calculation). Just about the only thing it rules out is a server that simply replies with a canned response.
The exchange of headers (e.g. with fixed 'key' and 'accept' values) is already sufficient to rule out any accidental connection with something that is not at least trying to be a WebSocket server; and if it's trying, the requirement that it do this calculation is hardly an impediment to its succeeding.
The RFC claims:
".. the server has to prove to the client that it received the client's WebSocket handshake, so that the server doesn't accept connections that are not WebSocket connections."
and:
"This helps ensure that the server does not accept connections from non-WebSocket clients .."
Neither of these claims makes any sense. The server is never the one rejecting the connection, because it is the one computing the hash, not the one checking it.
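To make that division of labor concrete, here is a minimal Python sketch of the check as it happens on the client side. The names `expected_accept` and `client_accepts` are hypothetical, and the response headers are assumed to have already been parsed into a dict; a real client would also validate the status line and the Upgrade and Connection headers.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sent_key: str) -> str:
    """Value the client expects back for the key it sent."""
    digest = hashlib.sha1((sent_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def client_accepts(sent_key: str, response_headers: dict) -> bool:
    """Return True if the server's Sec-WebSocket-Accept matches the sent key."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    headers = {name.lower(): value.strip() for name, value in response_headers.items()}
    received = headers.get("sec-websocket-accept")
    if received is None:
        return False
    return received == expected_accept(sent_key)


# The client, not the server, is the side that rejects a bad handshake.
key = "dGhlIHNhbXBsZSBub25jZQ=="  # sample key from RFC 6455
print(client_accepts(key, {"Sec-WebSocket-Accept": "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="}))  # True
print(client_accepts(key, {"Sec-WebSocket-Accept": "not-the-right-value"}))  # False
```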
This sort of exchange would make some sense if the magic GUID were not fixed, but were instead a shared secret between client and server. In that case the exchange would allow the server to prove to the client that it had the shared secret without revealing it.
What the RFC is unclear about is that the "Sec-WebSocket-Key" header from the client should be random on each request. This means any cached result from a proxy will contain an invalid "Sec-WebSocket-Accept" reply header, so the WebSocket connection will fail instead of unintentionally reading cached data.
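As a rough illustration of a per-request random key, RFC 6455 specifies the header value as a base64-encoded, randomly selected 16-byte nonce. A minimal Python sketch, with `new_websocket_key` as a hypothetical helper name:

```python
import base64
import os


def new_websocket_key() -> str:
    """Generate a fresh Sec-WebSocket-Key: 16 random bytes, base64-encoded."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


# Each handshake gets its own key, so the expected accept value differs too.
first = new_websocket_key()
second = new_websocket_key()
print(first)
print(second)
```

Because 16 bytes always encode to 24 base64 characters, the key has the same shape as the "IRhw449z7G0Mov9CahJ+Ow==" example above.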
Mostly for cache busting.
Imagine a transparent reverse-proxy server watching HTTP traffic go by. If it doesn't understand WS, it could mistakenly cache a WS handshake and reply with a useless 101 to the next client.
Using a nonce (the key) and requiring a basic challenge-response that is rather specific to WS ensures the server actually understood that this was a WS handshake, which in turn tells the client that the server will indeed be listening on the port. A caching reverse-proxy would never implement that hashing logic "by mistake".
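The cache scenario can be simulated in a few lines of Python. This is only a sketch under stated assumptions: the "proxy" is modeled as a dict holding the first response verbatim, and `compute_accept` and `new_key` are hypothetical helper names.

```python
import base64
import hashlib
import os

# Fixed GUID defined by RFC 6455.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(key: str) -> str:
    """Sec-WebSocket-Accept value a real WS server would return for this key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def new_key() -> str:
    """Fresh 16-byte nonce, base64-encoded, as the client sends it."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


# First client completes a genuine handshake; a WS-unaware proxy
# (hypothetical) mistakenly stores the 101 response headers.
first_key = new_key()
cached_response = {"Sec-WebSocket-Accept": compute_accept(first_key)}

# A second client sends a fresh nonce and is served the cached response.
second_key = new_key()
replay_ok = cached_response["Sec-WebSocket-Accept"] == compute_accept(second_key)

# The stale accept value does not match the new key, so the client
# fails the handshake instead of treating cached data as a live socket.
print(replay_ok)
```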