18.1 WebSocket Fundamentals
The WebSocket protocol provides full-duplex, bidirectional communication over a single TCP connection. This section covers the protocol fundamentals, connection lifecycle, frame structure, and the HTTP upgrade mechanism.
Protocol Overview
WebSocket enables persistent, low-latency communication between clients and servers, eliminating the request-response overhead of HTTP. The protocol begins with an HTTP upgrade handshake and then switches to the WebSocket protocol for frame-based messaging.
Key Characteristics:
- Full-duplex: simultaneous bidirectional communication
- Frame-based: structured binary protocol
- Persistent connection: one TCP connection stays open for the lifetime of the session
- Low overhead: minimal header size after initial upgrade
- Binary-safe: supports both text and binary data
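Protocol Versions: WebSocket is standardized in RFC 6455, and version 13 is the only version in general use. Unlike HTTP, it has a single stable version. RFC 6455 defines the opening handshake over HTTP/1.1; RFC 8441 separately defines how to bootstrap a WebSocket over HTTP/2, but not every client implements it, so check what your JDK's HTTP client actually negotiates.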
HTTP Upgrade Handshake
WebSocket connections begin with an HTTP upgrade request. The client sends an HTTP/1.1 request with special upgrade headers:
// Client initiates upgrade request
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13
Sec-WebSocket-Extensions: permessage-deflate
Key Upgrade Headers:
- Upgrade: websocket - requests a switch to the WebSocket protocol
- Connection: Upgrade - marks Upgrade as a hop-by-hop header that applies to this connection
- Sec-WebSocket-Key - base64-encoded random 16-byte nonce
- Sec-WebSocket-Version - protocol version (13 is current)
- Sec-WebSocket-Extensions - optional extensions, such as permessage-deflate compression
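The request headers listed above could be assembled roughly as follows. This is a minimal sketch: the names `HandshakeHeaders`, `newClientKey`, and `upgradeHeaders` are illustrative and not part of any JDK API, and in practice `java.net.http.HttpClient` generates these headers itself.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the request headers for a WebSocket opening handshake.
final class HandshakeHeaders {
    private static final SecureRandom RANDOM = new SecureRandom();

    // Sec-WebSocket-Key is a base64-encoded, randomly chosen 16-byte nonce.
    static String newClientKey() {
        byte[] nonce = new byte[16];
        RANDOM.nextBytes(nonce);
        return Base64.getEncoder().encodeToString(nonce);
    }

    // Returns the mandatory upgrade headers in a stable insertion order.
    static Map<String, String> upgradeHeaders(String host) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Host", host);
        headers.put("Upgrade", "websocket");
        headers.put("Connection", "Upgrade");
        headers.put("Sec-WebSocket-Key", newClientKey());
        headers.put("Sec-WebSocket-Version", "13");
        return headers;
    }
}
```

A 16-byte nonce always encodes to 24 base64 characters, which is a quick sanity check when inspecting handshakes in a network trace.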
The server responds with status 101 and a Sec-WebSocket-Accept header derived from the client's key:
// Server response
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
Sec-WebSocket-Extensions: permessage-deflate
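Response Key Derivation: The server appends a fixed GUID to the client's key, computes the SHA-1 hash of the result, and base64-encodes the digest.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class WebSocketKeyUtils {
    // GUID fixed by RFC 6455; it is the same for every connection
    private static final String MAGIC = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    public static String deriveResponseKey(String clientKey) {
        try {
            String concatenated = clientKey + MAGIC;
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] hash = sha1.digest(concatenated.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1, so this is not expected
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}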
// Usage
String clientKey = "dGhlIHNhbXBsZSBub25jZQ==";
String responseKey = WebSocketKeyUtils.deriveResponseKey(clientKey);
// responseKey = "s3pPLMBiTxaQ9kYGzzhZRbK+xOo="
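On the client side, the same derivation is what lets the client confirm that the server understood the handshake. The sketch below shows one way such a check might look; `HandshakeValidator` is a hypothetical name, the header map is assumed to use lower-cased names, and the `Connection` header is assumed to hold a single token rather than a comma-separated list.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;

// Hypothetical client-side check of the server's handshake response.
final class HandshakeValidator {
    private static final String GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

    // Recomputes the value the server should place in Sec-WebSocket-Accept.
    static String expectedAccept(String clientKey) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] hash = sha1.digest((clientKey + GUID).getBytes(StandardCharsets.US_ASCII));
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Assumes header names were lower-cased by the caller.
    static boolean isValid(int status, Map<String, String> headers, String clientKey) {
        return status == 101
                && "websocket".equalsIgnoreCase(headers.get("upgrade"))
                && "upgrade".equalsIgnoreCase(headers.get("connection"))
                && expectedAccept(clientKey).equals(headers.get("sec-websocket-accept"));
    }
}
```

If any of the three checks fails, the client should treat the handshake as failed and close the connection instead of sending frames.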
Frame Structure
Once the upgrade completes, communication switches to WebSocket frames. Each frame contains metadata and payload data.
Frame Header Format:
  0                   1                   2                   3
  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
 +-+-+-+-+-------+-+-------------+-------------------------------+
 |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
 |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
 |N|V|V|V|       |S|             |   (if payload len==126/127)   |
 | |1|2|3|       |K|             |                               |
 +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
 |     Extended payload length continued, if payload len == 127  |
 + - - - - - - - - - - - - - - - +-------------------------------+
 |                               |Masking-key, if MASK set to 1  |
 +-------------------------------+-------------------------------+
 | Masking-key (continued)       |          Payload Data         |
 +-------------------------------- - - - - - - - - - - - - - - - +
 :                     Payload Data continued ...                :
 +---------------------------------------------------------------+
Frame Fields:
- FIN (1 bit): Final fragment of message (1 = final, 0 = more frames coming)
- RSV1-RSV3 (3 bits): Reserved for extensions (normally 0)
- Opcode (4 bits): Frame type (see below)
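- MASK (1 bit): 1 if the payload is masked; required for client→server frames, and servers must not mask
- Payload Length (7, 7+16, or 7+64 bits): values 0-125 are the length itself; 126 means the next 2 bytes hold the length; 127 means the next 8 bytes hold it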
- Masking Key (32 bits): Client-to-server XOR mask key
- Payload Data: Actual message data
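The field layout above can be made concrete with a small header encoder. This is a sketch under stated assumptions: `FrameHeaderEncoder` is a hypothetical helper, the reserved bits are always written as 0, and the masking key and payload are left to the caller.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: encodes the first 2 to 10 header bytes of a WebSocket frame.
final class FrameHeaderEncoder {
    static byte[] encode(boolean fin, int opcode, boolean masked, long payloadLength) {
        if (opcode < 0 || opcode > 0xF) {
            throw new IllegalArgumentException("opcode must fit in 4 bits");
        }
        if (payloadLength < 0) {
            throw new IllegalArgumentException("payload length must be non-negative");
        }
        // 2 base bytes plus at most 8 bytes of extended length
        ByteBuffer buf = ByteBuffer.allocate(10);
        // First byte: FIN bit, RSV1-RSV3 left as 0, then the 4-bit opcode
        buf.put((byte) ((fin ? 0x80 : 0) | opcode));
        int maskBit = masked ? 0x80 : 0;
        if (payloadLength <= 125) {
            buf.put((byte) (maskBit | (int) payloadLength));
        } else if (payloadLength <= 0xFFFF) {
            buf.put((byte) (maskBit | 126));
            buf.putShort((short) payloadLength); // 16-bit length, big-endian
        } else {
            buf.put((byte) (maskBit | 127));
            buf.putLong(payloadLength);          // 64-bit length, big-endian
        }
        byte[] header = new byte[buf.position()];
        buf.flip();
        buf.get(header);
        return header;
    }
}
```

For instance, an unmasked final text frame with a 5-byte payload yields the two bytes 0x81 0x05, while a 256-byte binary payload needs the 16-bit form 0x82 0x7E 0x01 0x00.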
Opcode Values:
public enum FrameOpcode {
CONTINUATION(0x0), // Continuation frame
TEXT(0x1), // Text frame
BINARY(0x2), // Binary frame
CLOSE(0x8), // Close frame
PING(0x9), // Ping frame (keep-alive)
PONG(0xA); // Pong frame (ping response)
private final int code;
FrameOpcode(int code) {
this.code = code;
}
public int getCode() {
return code;
}
}
Control Frames:
- Close frames (0x8): Initiate connection closure with status code and reason
- Ping frames (0x9): Keep-alive or liveness probes; either endpoint may send them
- Pong frames (0xA): Replies to a ping that echo the ping's payload; an endpoint may also send an unsolicited pong
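Control frames are subject to two extra rules in RFC 6455: they must not be fragmented, and their payload is limited to 125 bytes. A receiver might check these rules as sketched below; `ControlFrames` and its method names are illustrative only.

```java
// Hypothetical helper: checks the RFC 6455 constraints on control frames.
final class ControlFrames {
    // Control opcodes are 0x8 through 0xF, i.e. the high bit of the 4-bit opcode is set.
    static boolean isControl(int opcode) {
        return (opcode & 0x8) != 0;
    }

    // A control frame must have FIN = 1 and at most 125 payload bytes.
    static boolean isValidControlFrame(int opcode, boolean fin, long payloadLength) {
        return isControl(opcode) && fin && payloadLength >= 0 && payloadLength <= 125;
    }
}
```

A frame that fails this check is a protocol error, for which close code 1002 is the usual response.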
Text vs Binary Frames
WebSocket frames can carry text or binary data. The first frame's opcode determines the message type.
Text Frames: Text frames (opcode 0x1) contain UTF-8 encoded text. Multiple frames can form a single message using continuation frames.
public class TextFrameHandler {
private StringBuilder messageBuffer = new StringBuilder();
public void onText(CharSequence data, boolean isLast) {
messageBuffer.append(data);
if (isLast) {
String completeMessage = messageBuffer.toString();
processMessage(completeMessage);
messageBuffer = new StringBuilder();
}
}
private void processMessage(String message) {
// Process complete text message
System.out.println("Received: " + message);
}
}
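At the protocol level, the UTF-8 requirement applies to the whole message: an individual fragment may end in the middle of a multi-byte character. An endpoint that works with raw frame payloads would therefore validate after reassembly. The sketch below shows one way to do that with a strict `CharsetDecoder`; `Utf8Check` is a hypothetical name, and code that uses `java.net.http.WebSocket` does not need it because the listener already receives decoded text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: strict UTF-8 validation of a reassembled text message.
final class Utf8Check {
    static boolean isValidUtf8(byte[] message) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(message));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A message that fails this check is typically rejected with close code 1007.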
Binary Frames: Binary frames (opcode 0x2) contain raw binary data without encoding constraints. They suit payloads such as images, compressed content, or custom serialization formats.
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class BinaryFrameHandler {
    private final List<ByteBuffer> messageFrames = new ArrayList<>();

    public void onBinary(ByteBuffer data, boolean isLast) {
        // Copy each fragment: the WebSocket implementation may reclaim the
        // caller's buffer once the listener's CompletionStage completes
        ByteBuffer copy = ByteBuffer.allocate(data.remaining());
        copy.put(data);
        copy.flip();
        messageFrames.add(copy);
        if (isLast) {
            ByteBuffer completeMessage = assembleMessage();
            processBinaryData(completeMessage);
            messageFrames.clear();
        }
    }

    private ByteBuffer assembleMessage() {
        int totalSize = messageFrames.stream()
                .mapToInt(ByteBuffer::remaining)
                .sum();
        ByteBuffer combined = ByteBuffer.allocate(totalSize);
        messageFrames.forEach(combined::put);
        combined.flip();
        return combined;
    }

    private void processBinaryData(ByteBuffer data) {
        // Process complete binary message
        byte[] bytes = new byte[data.remaining()];
        data.get(bytes);
        System.out.println("Received binary: " + bytes.length + " bytes");
    }
}
Fragmentation and Continuation
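A message can be split across multiple frames. The first frame carries the message's opcode (text or binary) with FIN = 0, each following frame uses the continuation opcode (0x0), and the last frame sets FIN = 1. With Java's WebSocket API, the boolean argument of sendText and the last flag of onText express the same boundary.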
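import java.net.http.WebSocket;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class FragmentationExample implements WebSocket.Listener {
    private static final int CHUNK_CHARS = 1000; // chars per part, not bytes
    private StringBuilder incomingMessage = new StringBuilder();

    // Sends a message in parts of up to CHUNK_CHARS characters each
    public CompletableFuture<WebSocket> sendLargeMessage(WebSocket webSocket, String message) {
        CompletableFuture<WebSocket> chain = CompletableFuture.completedFuture(webSocket);
        int length = message.length();
        int start = 0;
        do {
            int end = Math.min(start + CHUNK_CHARS, length);
            // Avoid cutting a surrogate pair in half
            if (end < length && Character.isHighSurrogate(message.charAt(end - 1))) {
                end--;
            }
            String chunk = message.substring(start, end);
            boolean isLast = (end == length);
            // Start each send only after the previous one completes;
            // overlapping sends fail with IllegalStateException
            chain = chain.thenCompose(ws -> ws.sendText(chunk, isLast));
            start = end;
        } while (start < length);
        return chain;
    }

    // Receives a message that may arrive in several parts
    @Override
    public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean isLast) {
        webSocket.request(1); // ask for the next part or message
        incomingMessage.append(data);
        if (isLast) {
            String complete = incomingMessage.toString();
            incomingMessage = new StringBuilder();
            // Process complete message
            return processMessage(complete);
        }
        return CompletableFuture.completedStage(null);
    }

    private CompletionStage<?> processMessage(String message) {
        System.out.println("Received complete message: " + message.length() + " chars");
        return CompletableFuture.completedStage(null);
    }
}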
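At the frame level, the opcode and FIN rules for a fragmented message can be illustrated without any network code. The following sketch plans the frames for a payload; `Fragmenter` and `Fragment` are hypothetical names, and splitting on byte boundaries is an assumption that the protocol permits because UTF-8 validity is judged on the whole message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: plans the frames of a fragmented message.
final class Fragmenter {
    static final class Fragment {
        final boolean fin;
        final int opcode;
        final byte[] payload;

        Fragment(boolean fin, int opcode, byte[] payload) {
            this.fin = fin;
            this.opcode = opcode;
            this.payload = payload;
        }
    }

    // firstOpcode is 0x1 (text) or 0x2 (binary); later frames use 0x0 (continuation).
    static List<Fragment> split(byte[] message, int firstOpcode, int maxFragmentSize) {
        if (maxFragmentSize <= 0) {
            throw new IllegalArgumentException("maxFragmentSize must be positive");
        }
        List<Fragment> frames = new ArrayList<>();
        int offset = 0;
        do {
            int end = Math.min(offset + maxFragmentSize, message.length);
            boolean fin = (end == message.length);        // only the last frame sets FIN
            int opcode = (offset == 0) ? firstOpcode : 0x0;
            frames.add(new Fragment(fin, opcode, Arrays.copyOfRange(message, offset, end)));
            offset = end;
        } while (offset < message.length);
        return frames;
    }
}
```

An empty message still produces exactly one frame, with FIN set and the original opcode.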
Connection Lifecycle
WebSocket connections follow a state machine:
States:
- CONNECTING: Initial upgrade handshake in progress
- OPEN: Upgrade completed, bidirectional communication active
- CLOSING: Close frame sent/received, cleanup in progress
- CLOSED: Connection terminated
public class ConnectionLifecycleListener implements WebSocket.Listener {
private enum State {
CONNECTING, OPEN, CLOSING, CLOSED
}
private volatile State state = State.CONNECTING;
private long connectedTime;
@Override
public void onOpen(WebSocket webSocket) {
state = State.OPEN;
connectedTime = System.currentTimeMillis();
System.out.println("WebSocket OPEN");
webSocket.request(1); // Without an initial request, no messages are delivered
}
@Override
public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
if (state == State.OPEN) {
System.out.println("Message: " + data);
webSocket.request(1); // Request next message
}
return CompletableFuture.completedStage(null);
}
@Override
public CompletionStage<?> onBinary(WebSocket webSocket, ByteBuffer data, boolean last) {
if (state == State.OPEN) {
System.out.println("Binary data: " + data.remaining() + " bytes");
webSocket.request(1);
}
return CompletableFuture.completedStage(null);
}
@Override
public CompletionStage<?> onClose(WebSocket webSocket, int statusCode, String reason) {
// onClose is invoked at most once, as the last callback, when the peer's Close message arrives.
// CLOSING applies earlier, when this side has called sendClose and is waiting for the peer's Close.
state = State.CLOSED;
System.out.println("Connection closed: " + statusCode + " " + reason);
return CompletableFuture.completedStage(null);
}
@Override
public void onError(WebSocket webSocket, Throwable error) {
System.err.println("WebSocket error: " + error.getMessage());
state = State.CLOSED;
}
}
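The state machine itself can be written down as a transition table, which makes illegal transitions easy to detect in tests. The set of allowed transitions below is an assumption drawn from the four states listed above (including abrupt failures that skip CLOSING); `ConnectionState` is a hypothetical type and not part of the JDK API.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the connection lifecycle with explicit allowed transitions.
enum ConnectionState {
    CONNECTING, OPEN, CLOSING, CLOSED;

    Set<ConnectionState> allowedNext() {
        switch (this) {
            case CONNECTING:
                return EnumSet.of(OPEN, CLOSED);    // handshake succeeded or failed
            case OPEN:
                return EnumSet.of(CLOSING, CLOSED); // close handshake started, or abrupt failure
            case CLOSING:
                return EnumSet.of(CLOSED);          // close handshake finished or timed out
            default:
                return EnumSet.noneOf(ConnectionState.class); // CLOSED is terminal
        }
    }

    boolean canTransitionTo(ConnectionState next) {
        return allowedNext().contains(next);
    }
}
```

A listener could call `canTransitionTo` before each state change and log or throw when a callback arrives in an unexpected order.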
Keep-Alive with Ping/Pong
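Either endpoint may send ping frames, and the receiver must answer each one with a pong carrying the same payload. Java's WebSocket implementation sends that pong reply automatically, so a client that wants to monitor connection health sends its own pings and tracks the pongs that come back.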
import java.net.http.WebSocket;
import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PingPongHandler implements WebSocket.Listener {
    private final ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    private volatile long lastPongTime = System.currentTimeMillis();

    @Override
    public void onOpen(WebSocket webSocket) {
        // Every 30 seconds, check for a recent pong and send the next ping
        scheduler.scheduleAtFixedRate(() -> {
            long timeSinceLastPong = System.currentTimeMillis() - lastPongTime;
            if (timeSinceLastPong > 60_000) { // 60 second timeout
                System.out.println("Pong timeout detected, aborting connection");
                // Code 1006 is never sent on the wire, so drop the connection instead
                webSocket.abort();
                scheduler.shutdown();
            } else {
                webSocket.sendPing(ByteBuffer.allocate(0));
            }
        }, 30, 30, TimeUnit.SECONDS);
        webSocket.request(1);
    }

    @Override
    public CompletionStage<?> onPing(WebSocket webSocket, ByteBuffer message) {
        // The WebSocket implementation replies with a pong automatically
        System.out.println("Received ping");
        webSocket.request(1);
        return CompletableFuture.completedStage(null);
    }

    @Override
    public CompletionStage<?> onPong(WebSocket webSocket, ByteBuffer message) {
        // Track pong reception for timeout detection
        lastPongTime = System.currentTimeMillis();
        System.out.println("Received pong");
        webSocket.request(1);
        return CompletableFuture.completedStage(null);
    }

    public void close() {
        scheduler.shutdown();
    }
}
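The timeout rule in the handler is easier to test when it is separated from the scheduler and the wall clock. The sketch below isolates it behind an injectable clock; `PongDeadline` is a hypothetical class, and treating a gap exactly equal to the timeout as still alive is a design choice rather than a protocol requirement.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: tracks whether a pong has been seen within a timeout.
final class PongDeadline {
    private final LongSupplier clockMillis;
    private final long timeoutMillis;
    private volatile long lastPongMillis;

    PongDeadline(LongSupplier clockMillis, long timeoutMillis) {
        this.clockMillis = clockMillis;
        this.timeoutMillis = timeoutMillis;
        this.lastPongMillis = clockMillis.getAsLong(); // connection start counts as a pong
    }

    // Call from onPong.
    void recordPong() {
        lastPongMillis = clockMillis.getAsLong();
    }

    // True once more than timeoutMillis has elapsed since the last pong.
    boolean isExpired() {
        return clockMillis.getAsLong() - lastPongMillis > timeoutMillis;
    }
}
```

In production the clock argument would typically be `System::currentTimeMillis`, while tests can pass a lambda that returns a controlled value.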
Close Frames and Status Codes
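A close frame may carry a 2-byte status code in network byte order, followed by optional UTF-8 reason text. Because control frames are limited to 125 payload bytes, the reason can be at most 123 bytes long.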
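The close payload layout can be sketched as a small codec. `ClosePayload` is a hypothetical helper; returning 1005 for an empty payload follows the RFC 6455 convention that this code means "no status code was present" and is only ever reported locally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: encodes and decodes the payload of a Close frame.
final class ClosePayload {
    static byte[] encode(int statusCode, String reason) {
        byte[] reasonBytes = reason.getBytes(StandardCharsets.UTF_8);
        if (reasonBytes.length > 123) {
            throw new IllegalArgumentException("reason exceeds 123 bytes");
        }
        return ByteBuffer.allocate(2 + reasonBytes.length)
                .putShort((short) statusCode) // 2-byte status code, big-endian
                .put(reasonBytes)
                .array();
    }

    static int statusCode(byte[] payload) {
        if (payload.length < 2) {
            return 1005; // no status code present
        }
        return ((payload[0] & 0xFF) << 8) | (payload[1] & 0xFF);
    }

    static String reason(byte[] payload) {
        if (payload.length <= 2) {
            return "";
        }
        return new String(payload, 2, payload.length - 2, StandardCharsets.UTF_8);
    }
}
```

Encoding status 1000 with the reason "bye" produces the five bytes 0x03 0xE8 0x62 0x79 0x65.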
Standard Status Codes:
public class WebSocketCloseCodes {
// 1000-1003: General closure and protocol-level codes
public static final int NORMAL_CLOSURE = 1000; // Normal closure
public static final int GOING_AWAY = 1001; // Endpoint going away (server shutdown, page navigation)
public static final int PROTOCOL_ERROR = 1002; // Protocol error
public static final int UNSUPPORTED_DATA = 1003; // Unsupported data type
// 1007-1009: Problems with a received message
public static final int INVALID_FRAME_PAYLOAD = 1007; // Payload inconsistent with message type (e.g. invalid UTF-8)
public static final int POLICY_VIOLATION = 1008; // Policy violation
public static final int MESSAGE_TOO_BIG = 1009; // Message too large
// 1010-1011: Endpoint-specific failures
public static final int MISSING_EXTENSION = 1010; // Client expected an extension the server did not negotiate
public static final int INTERNAL_SERVER_ERROR = 1011; // Server hit an unexpected condition
// 1012-1013: Registered with IANA after RFC 6455; sent by servers
public static final int SERVICE_RESTART = 1012; // Service restart
public static final int TRY_AGAIN_LATER = 1013; // Try again later
// 1015: Reserved; reported locally on TLS failure and never sent in a close frame
public static final int TLS_HANDSHAKE_ERROR = 1015;
}
public class GracefulCloseHandler {
private WebSocket webSocket;
public void closeGracefully() {
// Send close frame with normal closure code
webSocket.sendClose(
WebSocketCloseCodes.NORMAL_CLOSURE,
"Closing connection"
);
}
public CompletionStage<?> handleRemoteClose(
WebSocket webSocket, int statusCode, String reason) {
System.out.println("Remote close: " + statusCode + " - " + reason);
if (statusCode == WebSocketCloseCodes.NORMAL_CLOSURE) {
System.out.println("Normal closure");
} else if (statusCode >= 4000 && statusCode <= 4999) {
System.out.println("Application-defined error");
}
// WebSocket automatically responds to close
return CompletableFuture.completedStage(null);
}
}
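Beyond the individual constants, RFC 6455 divides close codes into ranges, which is why the handler above treats 4000-4999 as application-defined. The sketch below classifies a code by range; `CloseCodeRanges` and its `Category` names are hypothetical labels for the ranges the RFC describes.

```java
// Hypothetical helper: classifies close codes by the ranges defined in RFC 6455.
final class CloseCodeRanges {
    enum Category { INVALID, PROTOCOL, REGISTERED, PRIVATE }

    static Category categorize(int code) {
        if (code >= 1000 && code <= 2999) return Category.PROTOCOL;   // reserved for the protocol itself
        if (code >= 3000 && code <= 3999) return Category.REGISTERED; // libraries and frameworks, via IANA
        if (code >= 4000 && code <= 4999) return Category.PRIVATE;    // private use by applications
        return Category.INVALID;                                      // 0-999 and 5000+ are not used
    }

    // 1005, 1006 and 1015 are reserved for local reporting and must not appear in a Close frame.
    static boolean mayAppearOnWire(int code) {
        return categorize(code) != Category.INVALID
                && code != 1005 && code != 1006 && code != 1015;
    }
}
```

An application that defines its own codes would pick them from the 4000-4999 range and agree on their meaning with the peer out of band.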
Masking for Security
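Client-to-server frames are masked to prevent cache poisoning attacks against intermediaries such as proxies that might misread frame contents as HTTP. The client picks a fresh, unpredictable 32-bit masking key for every frame. Masking is not encryption and provides no confidentiality; use wss: (TLS) for that.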
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

public class MaskingExample {
/**
* Apply XOR mask to frame payload
*/
public static byte[] applyMask(byte[] payload, byte[] maskKey) {
byte[] masked = new byte[payload.length];
for (int i = 0; i < payload.length; i++) {
masked[i] = (byte) (payload[i] ^ maskKey[i % 4]);
}
return masked;
}
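// Reuse one SecureRandom instead of creating a new instance per key
private static final SecureRandom RANDOM = new SecureRandom();
/**
* Generate a fresh, unpredictable 32-bit masking key (one per frame)
*/
public static byte[] generateMaskKey() {
byte[] maskKey = new byte[4];
RANDOM.nextBytes(maskKey);
return maskKey;
}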
public static void main(String[] args) {
String message = "Hello, WebSocket!";
byte[] payload = message.getBytes(StandardCharsets.UTF_8);
byte[] maskKey = generateMaskKey();
byte[] masked = applyMask(payload, maskKey);
byte[] unmasked = applyMask(masked, maskKey); // Apply mask twice = original
System.out.println("Original: " + message);
System.out.println("Masked length: " + masked.length);
System.out.println("Unmasked: " + new String(unmasked, StandardCharsets.UTF_8));
}
}
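Combining the header layout with masking, decoding one complete frame from a byte array might look like the sketch below. `SimpleFrameDecoder` is a hypothetical helper that assumes the whole frame is already in memory and ignores the reserved bits; a real decoder would also handle partial reads and enforce size limits.

```java
import java.nio.ByteBuffer;

// Hypothetical helper: decodes one complete frame that is fully available in memory.
final class SimpleFrameDecoder {
    static final class Frame {
        final boolean fin;
        final int opcode;
        final byte[] payload;

        Frame(boolean fin, int opcode, byte[] payload) {
            this.fin = fin;
            this.opcode = opcode;
            this.payload = payload;
        }
    }

    static Frame decode(byte[] raw) {
        if (raw.length < 2) {
            throw new IllegalArgumentException("frame header needs at least 2 bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(raw);
        int b0 = buf.get() & 0xFF;
        int b1 = buf.get() & 0xFF;
        boolean fin = (b0 & 0x80) != 0;
        int opcode = b0 & 0x0F;
        boolean masked = (b1 & 0x80) != 0;
        long length = b1 & 0x7F;
        if (length == 126) {
            if (buf.remaining() < 2) throw new IllegalArgumentException("truncated 16-bit length");
            length = buf.getShort() & 0xFFFF;   // 16-bit extended length
        } else if (length == 127) {
            if (buf.remaining() < 8) throw new IllegalArgumentException("truncated 64-bit length");
            length = buf.getLong();             // 64-bit extended length
        }
        long available = buf.remaining() - (masked ? 4 : 0);
        if (length < 0 || length > available) {
            throw new IllegalArgumentException("truncated or oversized frame");
        }
        byte[] maskKey = new byte[4];
        if (masked) {
            buf.get(maskKey);
        }
        byte[] payload = new byte[(int) length];
        buf.get(payload);
        if (masked) {
            for (int i = 0; i < payload.length; i++) {
                payload[i] ^= maskKey[i % 4];   // XOR with the key byte at position i mod 4
            }
        }
        return new Frame(fin, opcode, payload);
    }
}
```

RFC 6455 gives the masked text frame 0x81 0x85 0x37 0xfa 0x21 0x3d 0x7f 0x9f 0x4d 0x51 0x58 as an example, and this sketch decodes it to the text "Hello".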
HTTP/2 WebSocket Support
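WebSocket over HTTP/2 (RFC 8441) runs each WebSocket on its own HTTP/2 stream, so it can share one TCP connection with other requests while keeping WebSocket semantics. Whether a given client actually negotiates it depends on the implementation; the example below only configures the HttpClient to prefer HTTP/2.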
public class Http2WebSocketClient {
private HttpClient httpClient;
public Http2WebSocketClient() {
// Configure HttpClient to prefer HTTP/2
this.httpClient = HttpClient.newBuilder()
.version(HttpClient.Version.HTTP_2)
.build();
}
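public CompletableFuture<WebSocket> connectViaHttp2(String uri) {
// The URI scheme stays ws: or wss: regardless of the HTTP version.
// RFC 8441 bootstraps the WebSocket with an extended CONNECT request on an HTTP/2 stream.
// Assumption to verify for your JDK: the built-in client may still perform an HTTP/1.1 upgrade here.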
return httpClient.newWebSocketBuilder()
.buildAsync(
URI.create(uri),
new WebSocketListener()
);
}
private static class WebSocketListener implements WebSocket.Listener {
@Override
public void onOpen(WebSocket webSocket) {
System.out.println("WebSocket connected");
webSocket.request(1);
}
@Override
public CompletionStage<?> onText(WebSocket webSocket, CharSequence data, boolean last) {
System.out.println("Message: " + data);
webSocket.request(1);
return CompletableFuture.completedStage(null);
}
@Override
public void onError(WebSocket webSocket, Throwable error) {
error.printStackTrace();
}
}
}
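For reference, RFC 8441 replaces the Upgrade handshake with an extended CONNECT request that carries a `:protocol` pseudo-header. The sketch below builds the request headers such a handshake would use; `ExtendedConnectHeaders` is a hypothetical helper for illustration and does not correspond to any `java.net.http` API.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: request headers for bootstrapping a WebSocket over HTTP/2 (RFC 8441).
final class ExtendedConnectHeaders {
    static Map<String, String> forUri(URI wsUri) {
        String scheme;
        if ("wss".equalsIgnoreCase(wsUri.getScheme())) {
            scheme = "https";
        } else if ("ws".equalsIgnoreCase(wsUri.getScheme())) {
            scheme = "http";
        } else {
            throw new IllegalArgumentException("not a WebSocket URI: " + wsUri);
        }
        String path = wsUri.getRawPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        if (wsUri.getRawQuery() != null) {
            path += "?" + wsUri.getRawQuery();
        }
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put(":method", "CONNECT");
        headers.put(":protocol", "websocket");
        headers.put(":scheme", scheme);
        headers.put(":path", path);
        headers.put(":authority", wsUri.getRawAuthority());
        headers.put("sec-websocket-version", "13");
        return headers;
    }
}
```

The Sec-WebSocket-Key and Sec-WebSocket-Accept headers are not used in this handshake, and the server signals success with a 2xx status instead of 101.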
Summary
The WebSocket protocol provides efficient bidirectional communication over a persistent TCP connection initiated via HTTP upgrade. Key concepts include:
- Protocol Versioning: Single stable version (13), bootstrapped over HTTP/1.1, with HTTP/2 bootstrapping defined separately in RFC 8441
- Upgrade Handshake: HTTP upgrade with header-based key derivation
- Frame Types: Text, binary, and control frames with continuation support
- Lifecycle Management: CONNECTING → OPEN → CLOSING → CLOSED states
- Keep-Alive: Ping/pong frames for connection health monitoring
- Security: XOR masking for client-to-server frames
- Status Codes: Standardized close codes for graceful termination
Understanding these fundamentals is essential for implementing robust WebSocket clients in Java.