I wrote a WebSocket frame decoder in Java:
private byte[] decodeFrame(byte[] _rawIn) {
int maskIndex = 2;
byte[] maskBytes = new byte[4];
Parsing a raw WebSocket frame is easy enough, but you have to inspect the header one byte at a time.
Here's a rough example (the RawParse class further down).
I left a few TODOs for you to work out on your own (after reading the RFC 6455 spec, of course).
Things you can validate:
Base Framing Protocol: RFC 6455 - Section 5.2
Client-to-Server Masking: RFC 6455 - Section 5.3
Fragmentation: RFC 6455 - Section 5.4
Control Frames: RFC 6455 - Section 5.5
Close Frames: RFC 6455 - Section 5.5.1
Data Frames: RFC 6455 - Section 5.6
While you can validate at the individual frame level, you will find that some of the validations above are validations of state and behavior between multiple frames. You can find more of these kinds of validations in Sending and Receiving Data: RFC 6455 - Section 6.
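The cross-frame part of this can be sketched as a small state tracker. This is only an illustration under assumptions: the class name FragmentationValidator and its onFrame method are made up for this example, and it checks nothing beyond the sequencing rules of Section 5.4.

```java
/** Hypothetical helper: tracks whether a fragmented message is in progress (RFC 6455, Section 5.4). */
final class FragmentationValidator
{
    private boolean messageInProgress = false;

    /** Returns null if the frame is acceptable in the current state, else a description of the violation. */
    String onFrame(int opcode, boolean fin)
    {
        boolean control = (opcode & 0x08) != 0;
        if (control)
        {
            // Control frames may be interleaved with fragments, but must not be fragmented themselves
            return fin ? null : "fragmented control frame";
        }
        if (opcode == 0x0)
        {
            if (!messageInProgress)
            {
                return "continuation frame without a message in progress";
            }
        }
        else if (messageInProgress)
        {
            return "new data frame while a fragmented message is in progress";
        }
        // A data message stays open until a frame with FIN set arrives
        messageInProgress = !fin;
        return null;
    }
}
```

You would keep one instance per connection and feed it the opcode and FIN bit of every frame, in arrival order.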
However, if you have extensions in the mix, then you will also need to process the frames from the point of view of the negotiated extension stack as well. Some tests above would appear to be invalid when an extension is being used.
Example: If you have a compression extension (RFC 7692, such as permessage-deflate) in use, then the validation of a TEXT payload cannot be done with the raw frame off the network, as you must first pass the frame through the extension. Note that the extension can change the fragmentation to suit its needs, which might mess up your validation as well.
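To make "pass the frame through the extension" concrete, here is a minimal sketch of inflating a permessage-deflate payload with java.util.zip.Inflater. It rests on several assumptions: the class name DeflatePayload is made up, the payload has already been unmasked and reassembled from all of its fragments, and a fresh Inflater per message is only correct when context takeover is disabled (or for the first message on a connection).

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

/** Hypothetical helper: inflates one complete permessage-deflate message payload (RFC 7692, Section 7.2.2). */
final class DeflatePayload
{
    static byte[] inflate(byte[] compressed)
    {
        // The sender strips the trailing 00 00 FF FF; the receiver appends it again
        byte[] input = Arrays.copyOf(compressed, compressed.length + 4);
        input[compressed.length + 2] = (byte)0xFF;
        input[compressed.length + 3] = (byte)0xFF;

        Inflater inflater = new Inflater(true); // true = raw DEFLATE, no zlib header
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try
        {
            inflater.setInput(input);
            byte[] chunk = new byte[4096];
            int n;
            while ((n = inflater.inflate(chunk)) > 0)
            {
                out.write(chunk, 0, n);
            }
        }
        catch (DataFormatException e)
        {
            throw new IllegalArgumentException("Payload is not valid DEFLATE data", e);
        }
        finally
        {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

Only the inflated bytes are meaningful for checks such as UTF-8 validation of a TEXT message. The example in RFC 7692, Section 7.2.3.1 (payload bytes f2 48 cd c9 c9 07 00) is a handy input to try this with.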
package websocket;

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class RawParse
{
    public static class Frame
    {
        byte opcode;
        boolean fin;
        byte[] payload;
    }

    public static Frame parse(byte[] raw)
    {
        // easier to do this via ByteBuffer
        ByteBuffer buf = ByteBuffer.wrap(raw);

        // Fin + RSV + OpCode byte
        Frame frame = new Frame();
        byte b = buf.get();
        frame.fin = ((b & 0x80) != 0);
        boolean rsv1 = ((b & 0x40) != 0);
        boolean rsv2 = ((b & 0x20) != 0);
        boolean rsv3 = ((b & 0x10) != 0);
        frame.opcode = (byte)(b & 0x0F);

        // TODO: add control frame fin validation here
        // TODO: add frame RSV validation here

        // Masked + Payload Length
        b = buf.get();
        boolean masked = ((b & 0x80) != 0);
        int payloadLength = (0x7F & b);
        int byteCount = 0;
        if (payloadLength == 0x7F)
        {
            // 8 byte extended payload length
            byteCount = 8;
        }
        else if (payloadLength == 0x7E)
        {
            // 2 bytes extended payload length
            byteCount = 2;
        }

        // Decode Payload Length: the extended length is big-endian and
        // replaces the 7-bit marker value (126 or 127) entirely
        if (byteCount > 0)
        {
            long extendedLength = 0;
            while (--byteCount >= 0)
            {
                b = buf.get();
                extendedLength |= (long)(b & 0xFF) << (8 * byteCount);
            }
            // a Java array cannot hold more than Integer.MAX_VALUE bytes
            if ((extendedLength < 0) || (extendedLength > Integer.MAX_VALUE))
            {
                throw new IllegalArgumentException("Payload length not supported: " + extendedLength);
            }
            payloadLength = (int)extendedLength;
        }

        // TODO: add control frame payload length validation here

        byte[] maskingKey = null;
        if (masked)
        {
            // Masking Key
            maskingKey = new byte[4];
            buf.get(maskingKey,0,4);
        }

        // TODO: add masked + maskingkey validation here

        // Payload itself
        frame.payload = new byte[payloadLength];
        buf.get(frame.payload,0,payloadLength);

        // Demask (if needed)
        if (masked)
        {
            for (int i = 0; i < frame.payload.length; i++)
            {
                frame.payload[i] ^= maskingKey[i % 4];
            }
        }

        return frame;
    }

    public static void main(String[] args)
    {
        Frame closeFrame = parse(hexToByteArray("8800"));
        System.out.printf("closeFrame.opcode = %d%n",closeFrame.opcode);
        System.out.printf("closeFrame.payload.length = %d%n",closeFrame.payload.length);

        // Examples from http://tools.ietf.org/html/rfc6455#section-5.7
        Frame unmaskedTextFrame = parse(hexToByteArray("810548656c6c6f"));
        System.out.printf("unmaskedTextFrame.opcode = %d%n",unmaskedTextFrame.opcode);
        System.out.printf("unmaskedTextFrame.payload.length = %d%n",unmaskedTextFrame.payload.length);
        System.out.printf("unmaskedTextFrame.payload = \"%s\"%n",new String(unmaskedTextFrame.payload,StandardCharsets.UTF_8));

        Frame maskedTextFrame = parse(hexToByteArray("818537fa213d7f9f4d5158"));
        System.out.printf("maskedTextFrame.opcode = %d%n",maskedTextFrame.opcode);
        System.out.printf("maskedTextFrame.payload.length = %d%n",maskedTextFrame.payload.length);
        System.out.printf("maskedTextFrame.payload = \"%s\"%n",new String(maskedTextFrame.payload,StandardCharsets.UTF_8));
    }

    public static byte[] hexToByteArray(String hstr)
    {
        if ((hstr.length() % 2) != 0)
        {
            throw new IllegalArgumentException(String.format("Invalid string length of <%d>",hstr.length()));
        }

        byte[] buf = new byte[hstr.length() / 2];
        for (int i = 0; i < buf.length; i++)
        {
            // two hex digits per byte: high nibble first, then low nibble
            int hi = Character.digit(hstr.charAt(2 * i),16);
            int lo = Character.digit(hstr.charAt((2 * i) + 1),16);
            if ((hi < 0) || (lo < 0))
            {
                throw new IllegalArgumentException("Invalid hex digit in byte " + i);
            }
            buf[i] = (byte)((hi << 4) | lo);
        }
        return buf;
    }
}
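As a starting point for some of the TODOs, the frame-level checks could look roughly like the sketch below. It is not a complete validator: the class FrameChecks, its check method and the fromClient parameter are assumptions made for this example, and the RSV bits and all cross-frame rules are left out.

```java
/** Hypothetical helper covering a few single-frame rules from RFC 6455, Sections 5.1, 5.2 and 5.5. */
final class FrameChecks
{
    /** Returns null if the header fields pass these checks, else a description of the first violation. */
    static String check(int opcode, boolean fin, long payloadLength, boolean masked, boolean fromClient)
    {
        boolean known = (opcode >= 0x0 && opcode <= 0x2) || (opcode >= 0x8 && opcode <= 0xA);
        if (!known)
        {
            return "reserved opcode (Section 5.2)";
        }
        boolean control = (opcode & 0x08) != 0;
        if (control && !fin)
        {
            return "control frames must not be fragmented (Section 5.5)";
        }
        if (control && payloadLength > 125)
        {
            return "control frame payload exceeds 125 bytes (Section 5.5)";
        }
        if (opcode == 0x8 && payloadLength == 1)
        {
            return "close payload must be empty or start with a 2-byte status code (Section 5.5.1)";
        }
        if (fromClient && !masked)
        {
            return "frames sent by a client must be masked (Section 5.1)";
        }
        if (!fromClient && masked)
        {
            return "frames sent by a server must not be masked (Section 5.1)";
        }
        return null;
    }
}
```

Returning a message rather than throwing keeps the decision of how to fail the connection with the caller; on a violation you would normally send a close frame with status 1002 (protocol error).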
I'm brand new to websockets, but I've been looking into them lately, so I may be able to help with this.
http://tools.ietf.org/html/rfc6455#section-5.2 gives a high-level view of the data frame. The opcode is the last four bits of the first byte, so you get it with a bit mask rather than a shift: raw_in[0] & 0x0F keeps the low four bits (mask 0000 1111), whereas raw_in[0] & 0xF0 would keep the high four (mask 1111 0000), which hold FIN and the three reserved bits. Opcode 0001 is a text frame, opcode 0010 is a binary frame, and so on. So if you only want to accept text frames, simply test that the last four bits of the first byte are 0001.
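A small sketch of that test, assuming the first byte of the frame has already been read (the class and method names are made up for illustration):

```java
/** Hypothetical helpers for picking fields out of the first header byte. */
final class FirstByte
{
    /** The low four bits hold the opcode. */
    static int opcode(byte first)
    {
        return first & 0x0F;
    }

    /** The highest bit is the FIN flag. */
    static boolean fin(byte first)
    {
        return (first & 0x80) != 0;
    }

    /** Opcode 0x1 marks a text frame. */
    static boolean isText(byte first)
    {
        return opcode(first) == 0x1;
    }
}
```

For example, a first byte of 0x81 (binary 1000 0001) is a final text frame, while 0x82 is a final binary frame.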
The websocket protocol does not include checksums of any kind, if that's what you're looking for. If there's an error in a data frame the only way you'll know is because the data comes out wrong or because subsequent frames come out "funny" (unexpected opcode, longer or shorter than expected, etc).
The first safeguard against an application connecting to a websocket server it wasn't designed for is the HTTP websocket handshake. When it doesn't include an Upgrade: websocket, Sec-WebSocket-Key or Sec-WebSocket-Version: 13 header, it isn't even an RFC 6455 websocket client and must be rejected.
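As a rough sketch, such a check could look like the following. It assumes the request headers have already been parsed into a map of name to value; the class and method names are hypothetical, and a full handshake check has more to verify (the Connection header, the HTTP method, the decoded length of the key, and so on).

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: checks the three handshake headers mentioned above. */
final class HandshakeCheck
{
    static boolean looksLikeRfc6455(Map<String, String> headers)
    {
        // HTTP header names are case-insensitive, so copy into a case-insensitive map
        Map<String, String> h = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        h.putAll(headers);

        String upgrade = h.get("Upgrade");
        String key = h.get("Sec-WebSocket-Key");
        String version = h.get("Sec-WebSocket-Version");

        return upgrade != null && upgrade.trim().equalsIgnoreCase("websocket")
            && key != null && !key.trim().isEmpty()
            && version != null && version.trim().equals("13");
    }
}
```

When the only problem is an unsupported version, the RFC asks the server to answer with 426 Upgrade Required and a Sec-WebSocket-Version header listing what it does support.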
The second safeguard works against clients which are speaking websocket, but were designed for a different application. This is the Sec-WebSocket-Protocol: something header. This header is optional, but should be a string which identifies the application the client wants to use. When the value doesn't match the application(s) the server expects, the client should be rejected.
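The header may list several comma-separated subprotocols, so the match is usually "first offered value the server knows". A minimal sketch, with made-up class and method names and the server's supported set passed in by the caller:

```java
import java.util.Set;

/** Hypothetical helper: picks a subprotocol from a Sec-WebSocket-Protocol header value. */
final class SubprotocolCheck
{
    /** Returns the first offered subprotocol that the server supports, or null if none matches. */
    static String select(String headerValue, Set<String> supported)
    {
        if (headerValue == null)
        {
            return null;
        }
        for (String offered : headerValue.split(","))
        {
            String name = offered.trim();
            if (supported.contains(name))
            {
                return name;
            }
        }
        return null;
    }
}
```

A null result means the client asked for nothing the server offers, which is the case where you would reject it; a non-null result is what the server echoes back in its own Sec-WebSocket-Protocol response header.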
The last safeguard, against clients which think that they speak websocket and connect to the right server but actually have a bug in their websocket protocol implementation, is the reserved bits.
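There are no illegal values for the masking key, and only a few for the length: a control frame may carry at most 125 bytes of payload, and the most significant bit of a 64-bit length must be 0. A wrong length within those limits will cause the next frame to begin after interpreting not enough or too much data as payload, but this can be hard to detect. The only sign that this has happened is when the first byte of an alleged frame doesn't make sense.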
The 2nd, 3rd and 4th bits of a frame (RSV1, RSV2 and RSV3) are reserved and, according to the RFC, "MUST be 0 unless an extension is negotiated [...] If a nonzero value is received [...] the receiving endpoint MUST Fail the WebSocket Connection.". An extension has to be negotiated in the handshake before it may use these bits (permessage-deflate from RFC 7692, for example, uses RSV1). So when one of these bits is non-zero and no negotiated extension defines it, something is wrong.
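In code, the reserved-bit check is a single mask. The sketch below assumes the caller knows which RSV bits, if any, the negotiated extensions have claimed; the class, method and parameter names are made up for illustration.

```java
/** Hypothetical helper: validates the RSV1-3 bits of the first header byte. */
final class RsvCheck
{
    /**
     * allowedRsvMask holds the RSV bits (0x40 = RSV1, 0x20 = RSV2, 0x10 = RSV3)
     * that a negotiated extension has given a meaning to; pass 0 when no extension is in use.
     */
    static boolean rsvValid(byte first, int allowedRsvMask)
    {
        int rsv = first & 0x70;
        return (rsv & ~allowedRsvMask) == 0;
    }
}
```

With no extension negotiated you would call rsvValid(first, 0); with permessage-deflate negotiated, rsvValid(first, 0x40) lets RSV1 through and still rejects RSV2 and RSV3.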
If you want, you can add further safeguards at your protocol level, like a specific magic byte value every message has to start and/or end with (keep in mind that there are multi-fragment messages and a browser can use them when it feels like doing so). The application I'm developing at the moment uses JSON payloads, so when a message isn't a valid JSON string starting with { and ending with }, I know the client is broken (or my server's frame decoding method is, which is far more likely).