Understanding WebRTC peer-to-peer connection setup

WebRTC is an open standard that allows applications to add real-time communication capabilities. Typical examples of apps that use WebRTC are multimedia chat apps such as Discord.

WebRTC is designed for peer-to-peer connections, which means data such as video or audio streams can flow directly between two peers instead of via a server. Setting up such a peer-to-peer connection is not inherently difficult, but the standard allows for many options to be configured.

As a result, WebRTC connections require a fair amount of configuration. When confronted with the connection setup for the first time, it is easy to get lost in the details, which is why I want to provide a high-level overview of which parts are involved in a WebRTC peer-to-peer connection and how these parts interact.

Setting up a WebRTC peer-to-peer connection

Setting up a WebRTC peer-to-peer connection can roughly be divided into three steps:

  1. Creating and sending an SDP Offer
  2. Creating and sending an SDP Answer
  3. Gathering and sending ICE candidates

The figure here schematically depicts these three steps.

[Figure: Overview]

Before we can talk about creating a peer-to-peer connection, we first need to understand the nature of the Signalling channel and the role it plays in the connection setup.

The Signalling channel

The Signalling channel is an integral part of the procedure to set up a WebRTC peer-to-peer connection. Via the Signalling channel, peers can send SDP offers, SDP answers and ICE candidates to each other.

It is important to understand that, while essential for the connection setup, the signalling channel is not part of WebRTC. Any connection between two peers can therefore be used for signalling; theoretically, nothing stops you from using email to set up your WebRTC connection. Typically, signalling is done via a signalling server.
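To make the role of the signalling channel more concrete, here is a minimal TypeScript sketch. The `SignalMessage` type and the `InMemorySignalling` class are hypothetical names introduced for illustration; they are not part of WebRTC. The sketch delivers messages within a single process, whereas a real application would more likely talk to a signalling server over the network.

```typescript
// Hypothetical message shapes for the three kinds of data exchanged during setup.
type SignalMessage =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string };

type Listener = (from: string, message: SignalMessage) => void;

// Toy signalling channel: delivers messages between peers in the same process.
class InMemorySignalling {
  private listeners = new Map<string, Listener>();

  // Register a peer under an id so that other peers can reach it.
  join(peerId: string, listener: Listener): void {
    this.listeners.set(peerId, listener);
  }

  // Deliver a message; returns false if the target peer is unknown.
  send(from: string, to: string, message: SignalMessage): boolean {
    const listener = this.listeners.get(to);
    if (!listener) return false;
    listener(from, message);
    return true;
  }
}
```

Because the channel only moves opaque messages around, the transport behind `send` could be swapped for anything that reaches the other peer without changing the rest of the setup.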

Step 1 - Creating and sending an SDP Offer

First, the “calling” peer A needs to create a Session Description Protocol (SDP) offer and send it via the Signalling channel to the other peer B. SDP is a text format, specified in an RFC (currently RFC 8866), for describing multimedia sessions so that the participants can agree on how to set them up. SDP offers and answers have the same structure and contain information such as the media types, media formats, and transport protocols to be used.
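To give an impression of what such a description contains, the following sketch extracts the media lines from a shortened, hand-written SDP text. The sample is an assumption made for illustration (real offers contain many more lines), and `parseMediaSections` is a hypothetical helper, not part of any WebRTC API.

```typescript
// Shortened, hand-written SDP for illustration; real offers are much longer.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

interface MediaSection {
  media: string;     // media type, e.g. "audio" or "video"
  port: number;
  protocol: string;  // transport protocol
  formats: string[]; // media formats (payload type numbers)
}

// Extract the media description lines: m=<media> <port> <proto> <fmt> ...
function parseMediaSections(sdp: string): MediaSection[] {
  return sdp
    .split(/\r?\n/)
    .filter((line) => line.startsWith("m="))
    .map((line) => {
      const [media = "", port = "0", protocol = "", ...formats] = line.slice(2).split(" ");
      return { media, port: Number(port), protocol, formats };
    });
}

const sections = parseMediaSections(sampleSdp);
```

Here `sections` holds one entry for the audio line and one for the video line, each with its transport protocol and format list.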

Step 2 - Creating and sending an SDP Answer

Secondly, once the “receiving” peer B has received the SDP offer, it has to create an SDP answer and return it to A via the Signalling channel.
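Steps 1 and 2 together form the offer/answer exchange, which can be sketched roughly as follows. The `NegotiatingPeer` interface is a hypothetical, reduced shape whose method names mirror those of the browser's `RTCPeerConnection`; `SendDescription` stands in for whatever signalling channel is used and is likewise an assumption.

```typescript
// Hypothetical minimal shapes; the method names mirror RTCPeerConnection.
interface SessionDescription {
  type: "offer" | "answer" | "pranswer" | "rollback";
  sdp?: string;
}

interface NegotiatingPeer {
  createOffer(): Promise<SessionDescription>;
  createAnswer(): Promise<SessionDescription>;
  setLocalDescription(description: SessionDescription): Promise<void>;
  setRemoteDescription(description: SessionDescription): Promise<void>;
}

// Hypothetical signalling hook: delivers a description to the other peer.
type SendDescription = (description: SessionDescription) => Promise<void>;

// Step 1, run on peer A: create the offer, store it locally, send it to B.
async function sendOffer(peerA: NegotiatingPeer, send: SendDescription): Promise<void> {
  const offer = await peerA.createOffer();
  await peerA.setLocalDescription(offer);
  await send(offer);
}

// Step 2, run on peer B: apply the received offer, then create and return an answer.
async function answerOffer(
  peerB: NegotiatingPeer,
  offer: SessionDescription,
  send: SendDescription,
): Promise<void> {
  await peerB.setRemoteDescription(offer);
  const answer = await peerB.createAnswer();
  await peerB.setLocalDescription(answer);
  await send(answer);
}

// Back on peer A: applying the answer completes the offer/answer exchange.
async function acceptAnswer(peerA: NegotiatingPeer, answer: SessionDescription): Promise<void> {
  await peerA.setRemoteDescription(answer);
}
```

Each peer ends up with a local and a remote description: A holds its own offer and B's answer, while B holds A's offer and its own answer.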

Step 3 - Gathering and sending ICE candidates

Then, both peers gather Interactive Connectivity Establishment (ICE) candidates with the help of one or more Session Traversal Utilities for NAT (STUN) servers. An ICE candidate describes a network endpoint at which a peer may be reachable, such as its public-facing address as observed by a STUN server. Such a description can later be used for creating a peer-to-peer connection. The process of collecting ICE candidates is called “gathering”, and it typically takes a while to complete.
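An ICE candidate is exchanged as a single line of text. The sketch below parses the leading fields of such a line; the sample candidate is made up (it uses an address from a documentation-only range), and `parseCandidate` is a hypothetical helper. The type field distinguishes, among others, `host` candidates (a local interface), `srflx` candidates (the public address observed by a STUN server) and `relay` candidates (an address on a TURN server).

```typescript
interface ParsedCandidate {
  foundation: string;
  component: number; // 1 = RTP, 2 = RTCP
  transport: string; // "udp" or "tcp"
  priority: number;
  address: string;
  port: number;
  type: string;      // "host", "srflx", "prflx" or "relay"
}

// Parse "candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ..."
function parseCandidate(line: string): ParsedCandidate | null {
  const match = /^(?:a=)?candidate:(\S+) (\d+) (\S+) (\d+) (\S+) (\d+) typ (\S+)/.exec(line);
  if (!match) return null;
  const [, foundation, component, transport, priority, address, port, type] = match;
  return {
    foundation,
    component: Number(component),
    transport,
    priority: Number(priority),
    address,
    port: Number(port),
    type,
  };
}

// Made-up server-reflexive candidate, as it might result from a STUN lookup.
const exampleCandidate = parseCandidate(
  "candidate:842163049 1 udp 1677729535 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321",
);
```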

There are many publicly available STUN servers (e.g. here) that can be used for your WebRTC projects. If a peer cannot be reached directly through its NAT, “Traversal Using Relays around NAT” (TURN) servers can be used as a fallback. TURN servers relay packets between peers and theoretically should always work. However, such servers have high bandwidth requirements because all traffic is routed through them, so they can easily become a bottleneck for a WebRTC application with many clients.
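STUN and TURN servers are passed to the peer connection as configuration. The sketch below builds such a configuration; the interfaces mirror the shape of the browser's `RTCConfiguration` and `RTCIceServer` dictionaries, while `buildConfiguration`, the host names and the credentials are placeholders assumed for illustration.

```typescript
// Reduced shapes mirroring the browser's RTCIceServer / RTCConfiguration dictionaries.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface PeerConfiguration {
  iceServers: IceServer[];
  iceTransportPolicy?: "all" | "relay";
}

interface TurnSettings {
  url: string;
  username: string;
  credential: string;
}

// Hypothetical helper: STUN first, TURN as an optional fallback.
function buildConfiguration(
  stunUrl: string,
  turn?: TurnSettings,
  relayOnly = false,
): PeerConfiguration {
  const iceServers: IceServer[] = [{ urls: stunUrl }];
  if (turn) {
    iceServers.push({ urls: turn.url, username: turn.username, credential: turn.credential });
  }
  // "relay" restricts ICE to TURN candidates; "all" permits direct connections too.
  return { iceServers, iceTransportPolicy: relayOnly ? "relay" : "all" };
}

// Placeholder hosts and credentials; replace them with servers you operate or trust.
const configuration = buildConfiguration("stun:stun.example.org:3478", {
  url: "turn:turn.example.org:3478",
  username: "demo-user",
  credential: "demo-secret",
});
// In a browser, this object would be passed as: new RTCPeerConnection(configuration)
```

Setting `relayOnly` to `true` can be handy for checking that the TURN fallback works at all, at the cost of routing every packet through the relay.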

Finally, peers send their collected ICE candidates to the other peer via the Signalling channel. Typically, instead of waiting until the gathering is completed, peers send each ICE candidate to the other peer as soon as they receive it, until a connection can be established. So the gathering and the sending in Step 3 are not strictly performed in order, but typically happen in parallel and are repeated.
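Forwarding candidates as they arrive can be sketched as follows. The interfaces are hypothetical, reduced shapes; the names `onicecandidate` and `addIceCandidate` mirror the corresponding members of the browser's `RTCPeerConnection`, and the `send` callback stands in for the signalling channel.

```typescript
// Hypothetical minimal shapes mirroring the parts of RTCPeerConnection used here.
interface CandidateInit {
  candidate?: string;
  sdpMid?: string | null;
  sdpMLineIndex?: number | null;
}

interface CandidateSource {
  onicecandidate: ((event: { candidate: CandidateInit | null }) => void) | null;
}

interface CandidateSink {
  addIceCandidate(candidate: CandidateInit): Promise<void>;
}

// Sending side: forward each candidate as soon as it has been gathered.
function forwardCandidates(
  pc: CandidateSource,
  send: (candidate: CandidateInit) => void,
): void {
  pc.onicecandidate = (event) => {
    // A null candidate signals that gathering has finished.
    if (event.candidate) send(event.candidate);
  };
}

// Receiving side: hand a candidate from the signalling channel to the connection.
function receiveCandidate(pc: CandidateSink, candidate: CandidateInit): Promise<void> {
  return pc.addIceCandidate(candidate);
}
```

Both peers would typically run both functions, since each side gathers its own candidates and receives those of the other.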

Conclusion

Setting up a WebRTC peer-to-peer connection consists of three high-level steps: creating and sending an SDP offer, creating and sending an SDP answer, and gathering ICE candidates with the help of STUN/TURN servers and sending them to the other peer. Offers, answers, and ICE candidates are sent and received via a Signalling channel, which is an arbitrary communication channel between two peers.