Chat, live notifications, collaborative editing, multiplayer games, trading dashboards — all need the server to push data to the client the instant something happens. Plain HTTP can't: it's request/response, always initiated by the client. WebSockets solve this with a persistent, full-duplex connection over which either side can send at any time. Understanding the options — polling, long-polling, SSE, and WebSockets — and how to scale stateful connections is the core of any real-time design, like our chat app.

⚡ Quick Takeaways
  • HTTP is client-initiated request/response — the server can't push, so real-time needs another approach.
  • WebSockets give a persistent, full-duplex connection — after an HTTP upgrade handshake, both sides send freely over one long-lived TCP connection.
  • The handshake starts as HTTP with an Upgrade header → 101 Switching Protocols → the connection becomes ws:// (or wss:// over TLS).
  • Know the alternatives — long-polling (hacky push), and SSE (server→client only, simpler) — and pick by whether you need bidirectional.
  • Scaling is the hard part — connections are stateful and long-lived; you need many open sockets, sticky routing, and a pub/sub backplane to fan out across servers.
  • Heartbeats + reconnection keep connections healthy and recover from drops.
tldr

WebSockets upgrade an HTTP connection into a persistent, full-duplex channel so the server can push to clients in real time. For one-way server→client streams, SSE is simpler; for true bidirectional, use WebSockets; long-polling is the fallback. The real engineering challenge is scale: connections are stateful and long-lived, so you need sticky routing, capacity for huge numbers of idle sockets, and a pub/sub backplane (e.g. Redis) to deliver a message to clients connected to other servers. Add heartbeats and reconnection for resilience.

The Problem: HTTP Is One-Way

HTTP's model is simple and scalable: a client sends a request, the server responds, done. But it means the server has no way to initiate — it can only answer. For anything where the server needs to tell the client "a new message arrived" or "the price changed," that's a fundamental mismatch. The naive workaround is polling (the client asks "anything new?" every few seconds), which is wasteful — mostly empty responses — and laggy. The history of real-time on the web is a series of increasingly better answers to this.
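To put rough numbers on "wasteful and laggy", here is a back-of-the-envelope sketch. It assumes events are evenly spaced, which real traffic is not; the function names and the example figures are illustrative, not measurements.

```python
def empty_poll_fraction(poll_interval_s: float, event_gap_s: float) -> float:
    """Fraction of polls that come back empty, assuming evenly spaced events."""
    if poll_interval_s >= event_gap_s:
        return 0.0  # every poll finds at least one new event
    return 1.0 - poll_interval_s / event_gap_s


def worst_case_lag_s(poll_interval_s: float) -> float:
    """An event that lands just after a poll waits a full interval to be seen."""
    return poll_interval_s


# Polling every 2 s for something that changes about once a minute:
wasted = empty_poll_fraction(2, 60)   # 29 of every 30 polls return nothing
lag = worst_case_lag_s(2)             # and an update can still be 2 s late
```

Polling faster shrinks the lag but raises the share of empty responses, which is the trade-off the later techniques remove.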

The Evolution: Polling → Long-Polling → SSE → WebSockets

What WebSockets Are

A WebSocket is a single, long-lived TCP connection that supports full-duplex communication — both client and server can send messages at any time, independently, with low overhead per message (no HTTP headers per message). It begins life as an ordinary HTTP request so it works through existing web infrastructure, then upgrades: once established, it's a bare bidirectional message pipe (ws://, or wss:// over TLS).
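To make "low overhead per message" concrete, the sketch below builds a single unfragmented text frame the way RFC 6455 lays it out for server→client traffic. It is a teaching aid rather than a full implementation: the function name is made up here, and it ignores fragmentation, control frames, and the 4-byte masking key that client→server frames must add.

```python
import struct


def encode_text_frame(message: str) -> bytes:
    """Encode one unmasked (server→client) WebSocket text frame."""
    payload = message.encode("utf-8")
    header = bytearray([0x81])  # FIN=1, opcode=0x1 (text)
    n = len(payload)
    if n <= 125:
        header.append(n)                  # length fits in the second byte
    elif n <= 0xFFFF:
        header.append(126)                # marker: 16-bit length follows
        header += struct.pack("!H", n)
    else:
        header.append(127)                # marker: 64-bit length follows
        header += struct.pack("!Q", n)
    return bytes(header) + payload


frame = encode_text_frame("Hello")
overhead = len(frame) - len("Hello")  # 2 bytes of framing for a short message
```

A short message costs 2 bytes of framing here, versus the hundreds of bytes of headers a typical HTTP request and response carry.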

the upgrade handshake
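# client → server (looks like HTTP; the Host value is a placeholder)
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

# server → client (Sec-WebSocket-Accept is derived from the client's key)
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=

# → connection is now a full-duplex WebSocket; either side sends anytime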
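In the full protocol (RFC 6455), the server's 101 response also carries a Sec-WebSocket-Accept header, which proves the server understood the upgrade rather than blindly echoing a cached HTTP response. A minimal sketch of how that value is derived from the client's Sec-WebSocket-Key; the function name is illustrative, while the GUID is the fixed constant from the RFC.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455; every server appends the same value.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a client's handshake key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# The sample key from the handshake above:
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client recomputes the same value and drops the connection if it does not match.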

WebSockets vs SSE vs Long-Polling

Aspect     | Long-Polling            | SSE                           | WebSockets
Direction  | Server→client (faked)   | Server→client only            | Full-duplex
Connection | Repeated requests       | One long-lived HTTP           | One persistent socket
Overhead   | High (re-connect churn) | Low                           | Lowest per message
Complexity | Low                     | Low (built into browsers)     | Higher
Use for    | Legacy fallback         | Feeds, notifications, tickers | Chat, collab, games

The decision rule: if the client only needs to receive updates, SSE is simpler and rides on plain HTTP. If the client also sends frequently (typing in a chat, moving in a game), use WebSockets. Long-polling is the universal fallback when neither is available.
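Part of why SSE counts as the simpler option is that its wire format is just lines of text on an open HTTP response with Content-Type text/event-stream. A small sketch of formatting one event; the helper name and the sample event are hypothetical, and a real server would also need to keep the response open and flush after each event.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None,
               event_id: Optional[str] = None) -> str:
    """Format one Server-Sent Event as it appears on the wire."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")      # lets the browser resume via Last-Event-ID
    if event is not None:
        lines.append(f"event: {event}")      # named event type
    for part in data.split("\n"):
        lines.append(f"data: {part}")        # multi-line data repeats the field
    return "\n".join(lines) + "\n\n"         # a blank line ends the event


chunk = format_sse('{"symbol": "ACME", "price": 101.5}', event="price", event_id="7")
```

There is no equivalent client→server channel in this format, which is exactly the limit the decision rule turns on.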

Scaling WebSockets: the Real Challenge

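Anyone can open a WebSocket; the hard part is operating millions of them. Unlike stateless HTTP requests that any server can handle, a WebSocket is a stateful, long-lived connection pinned to one specific server. That creates three problems: traffic for a client must keep reaching the server that holds its socket (sticky routing), each server must hold a very large number of mostly idle open sockets, and a message often has to reach clients whose connections live on other servers (fan-out).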

The Fan-Out Problem and the Backplane

Suppose Alice and Bob are in the same chat room but their WebSockets are connected to different servers. When Alice sends a message, the server holding her connection has no direct way to reach Bob's connection on another server. The solution is a pub/sub backplane: servers publish incoming messages to a shared bus (commonly Redis pub/sub or Kafka), and every server subscribed to that room receives the message and pushes it down to its own connected clients.

fan-out across servers via a pub/sub backplane
Alice ──ws──▶ WS-server-1 ──publish "room42"──▶ ┌───────────────┐
                                                │ Redis pub/sub │
Bob   ──ws──▶ WS-server-2 ◀──subscribe "room42"─┤  (backplane)  │
                  │                             └───────────────┘
                  └──push──▶ Bob   (server-2 delivers to its own client)

  every WS server subscribes to the rooms its clients are in;
  the backplane bridges connections that live on different servers

This backplane is the defining piece of WebSocket scaling — without it, horizontal scaling breaks because messages can't cross server boundaries.
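The Alice and Bob flow can be sketched in a few lines. This is an in-memory simulation, not a Redis client: InMemoryBackplane stands in for the shared bus, WsServer for one WebSocket server process, and a plain list plays the role of each client's socket. All class and method names are invented for the illustration.

```python
from collections import defaultdict


class InMemoryBackplane:
    """Stand-in for a pub/sub bus: every subscriber of a channel gets each message."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # channel -> callbacks

    def subscribe(self, channel, callback):
        self._subscribers[channel].append(callback)

    def publish(self, channel, message):
        for callback in list(self._subscribers[channel]):
            callback(channel, message)


class WsServer:
    """One WebSocket server; it only knows about its own connected clients."""

    def __init__(self, name, backplane):
        self.name = name
        self.backplane = backplane
        self.rooms = defaultdict(dict)  # room -> {user: inbox}

    def connect(self, user, room):
        if room not in self.rooms:
            # First local client in this room: subscribe once on the backplane.
            self.backplane.subscribe(room, self._on_backplane_message)
        inbox = []  # stands in for the client's socket
        self.rooms[room][user] = inbox
        return inbox

    def receive_from_client(self, room, sender, text):
        # Never deliver directly; publish so every server with members hears it.
        self.backplane.publish(room, {"from": sender, "text": text})

    def _on_backplane_message(self, room, message):
        for inbox in self.rooms[room].values():
            inbox.append(message)  # push down each local connection


bus = InMemoryBackplane()
server_1, server_2 = WsServer("WS-server-1", bus), WsServer("WS-server-2", bus)
alice = server_1.connect("alice", "room42")
bob = server_2.connect("bob", "room42")
server_1.receive_from_client("room42", "alice", "hi Bob")
# bob now holds the message even though Alice is on a different server;
# alice also gets her own message echoed back in this simple version.
```

Routing even local deliveries through the bus keeps one code path for every message, at the cost of an extra hop; a production design might short-circuit local recipients.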

Heartbeats and Reconnection

Long-lived connections die silently — a laptop sleeps, a network blips, a proxy times out an idle connection — often without a clean close. Two mechanisms keep things healthy. Heartbeats (WebSocket ping/pong frames at intervals) detect dead connections so the server can reclaim resources and the client knows to reconnect. Reconnection logic on the client re-establishes the socket after a drop, ideally with exponential backoff (to avoid a thundering herd when a server restarts) and a way to resume — replaying missed messages since the last received ID so nothing is lost across the gap.
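The three pieces in that paragraph (dead-connection detection, backoff, resume) reduce to small pure functions. The sketch below assumes a 20-second ping interval, a 30-second backoff cap, "full jitter" randomization, and messages stored as dicts with an increasing integer id; none of these values come from a specific library, so treat them as placeholders.

```python
import random


def is_dead(last_pong_at: float, now: float,
            ping_interval: float = 20.0, missed_allowed: int = 2) -> bool:
    """Treat a connection as dead once several pings in a row went unanswered."""
    return now - last_pong_at > ping_interval * missed_allowed


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0,
                  rng=random.random) -> float:
    """Exponential backoff with full jitter: random delay below min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * (2 ** attempt))


def messages_since(log: list, last_seen_id: int) -> list:
    """Messages the client missed while disconnected, in order."""
    return [m for m in log if m["id"] > last_seen_id]


# A client that last saw message 2 reconnects and asks to resume:
log = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}, {"id": 3, "text": "c"}]
missed = messages_since(log, last_seen_id=2)  # only message 3 is replayed
```

The jitter matters as much as the exponent: without it, every client dropped by the same restart would retry on the same schedule and arrive together.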

Use Cases

Pitfalls

takeaway

WebSockets turn HTTP's one-way request/response into a persistent, full-duplex channel for true real-time. Choose SSE when you only push server→client, WebSockets when the client also sends, long-polling as the fallback. The hard part isn't opening a socket — it's scaling stateful connections: sticky routing, capacity for millions of idle sockets, and above all a pub/sub backplane to fan messages out across servers, plus heartbeats and reconnection for resilience.

🎯 interview hot-takes

Why not just HTTP for real-time? HTTP is client-initiated request/response — the server can't push; polling is wasteful and laggy.
WebSocket vs SSE? SSE is one-way (server→client) over plain HTTP and simpler; WebSockets are full-duplex for when the client also sends frequently.
How does the handshake work? An HTTP request with Upgrade: websocket → 101 Switching Protocols → the connection becomes a persistent bidirectional socket.
How do you scale WebSockets? Sticky routing to the server holding the connection, capacity for many idle sockets, and a pub/sub backplane (Redis/Kafka) to fan messages out to clients on other servers.
How do you keep connections healthy? Ping/pong heartbeats to detect dead sockets, and client reconnection with backoff and message resume.
