Building Your Own VoIP Solution: A Guide to STUN/TURN Servers and JavaScript
Published 3/4/2023 | 13 min read
VoIP, or Voice over Internet Protocol, is a technology that allows users to make voice and video calls over the internet. In order to establish a direct connection between two clients, they need to exchange information about their IP addresses and network configurations. However, some networks use firewalls, NATs, or other security measures that make it difficult for VoIP clients to communicate directly. This is where STUN and TURN servers come in.
What are STUN/TURN servers?
STUN, or Session Traversal Utilities for NAT, is a protocol that enables clients to discover their public IP addresses and network configuration, even if they are behind a NAT or firewall. STUN servers work by sending the client a response containing its public IP address and port, which the client can then use to establish a direct connection with another client.
TURN, or Traversal Using Relays around NAT, is a protocol that lets clients relay their traffic through a third-party server when a direct connection is not possible. A TURN server receives traffic from one client, forwards it to the other client, and relays the responses back. TURN is typically used as a fallback when the addresses discovered through STUN do not lead to a working connection, or as the primary path when clients cannot communicate directly at all.
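In WebRTC, the difference between the two shows up in the ICE candidates a client gathers: addresses learned from a STUN server appear as "srflx" (server-reflexive) candidates, and addresses allocated on a TURN server appear as "relay" candidates. The following is a minimal sketch of reading that type out of a candidate line. The function name candidateType and the sample lines (which use documentation-only IP ranges) are made up for illustration.

```javascript
// Classify an ICE candidate line by its "typ" field.
//   host  -> a local interface address
//   srflx -> server-reflexive: the public address reported by a STUN server
//   prflx -> peer-reflexive: learned from the peer during connectivity checks
//   relay -> an address allocated on a TURN server
function candidateType(candidateLine) {
  const match = / typ (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? match[1] : null;
}

// Illustrative candidate lines, not captured from a real session.
const samples = [
  'candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host',
  'candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321',
  'candidate:3 1 udp 41885439 198.51.100.20 61000 typ relay raddr 203.0.113.7 rport 54321',
];

for (const line of samples) {
  console.log(candidateType(line), '<-', line);
}
```

In a browser, the candidate line is available as event.candidate.candidate inside an RTCPeerConnection icecandidate handler. If a call only ever produces host candidates, the STUN/TURN configuration is probably not being reached.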
Using STUN/TURN servers in JavaScript
In JavaScript, STUN and TURN servers can be used with WebRTC, a technology that enables real-time communication between web browsers. WebRTC provides a set of APIs that allow developers to create peer-to-peer connections, stream media, and exchange data. By using STUN and TURN servers, developers can ensure that their WebRTC applications work reliably in any network environment.
Building your own VoIP solution
If you want to build your own VoIP solution without using a VoIP provider, you can use WebRTC and STUN/TURN servers to create a peer-to-peer communication system. This approach can be useful for companies that want to have complete control over their communication infrastructure, or for developers who want to experiment with new technologies.
To build your own VoIP solution, you will need to create a signaling server that facilitates the exchange of signaling messages between clients. Signaling messages carry session descriptions (the offers and answers that list each client's media capabilities) and ICE candidates (the IP addresses and ports at which a client may be reachable). Once clients have exchanged signaling messages, they can establish a direct connection using WebRTC.
In short, STUN and TURN servers let VoIP clients reach each other across NATs and firewalls, and WebRTC provides the browser APIs that use them. The rest of this guide walks through the pieces you need: a signaling server, a STUN/TURN server, and the browser code that ties them together.
Step 1: Setting up the signaling server
The signaling server is a crucial component in WebRTC-based communication because it helps establish the initial connection between two peers. It's responsible for exchanging metadata and negotiation messages that enable the establishment of the peer-to-peer connection. Essentially, the signaling server serves as an intermediary between two peers, allowing them to communicate and exchange information about their WebRTC capabilities, such as video and audio codecs, bandwidth, resolution, and other important details.
Setting up a signaling server in JavaScript requires a few steps. First, we need to create a server that can receive and transmit messages between peers. Next, we need to agree on a signaling protocol for exchanging the necessary metadata between peers. WebRTC does not mandate one: established protocols such as SIP or XMPP can be used, and many applications simply send their own JSON messages over a WebSocket connection, which is the approach taken here.
To create a WebRTC signaling server in Node.js, you can use the "ws" package to handle WebSocket connections between the client and the server. Install the package, then create the server:
npm install ws
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });

// Connected users keyed by id; each entry is { id, ws, otherUser }.
const users = {};

// Fallback: forward any other message type unchanged to the user named in message.to.
const handleSignalingMessage = (user, message) => {
  const recipient = users[message.to];
  if (recipient) {
    // The socket lives on recipient.ws; the user object itself has no send().
    recipient.ws.send(JSON.stringify(message));
  }
};

// Forward an ICE candidate to the other peer.
const handleICECandidate = (user, message) => {
  const recipient = users[message.to];
  if (recipient) {
    recipient.ws.send(JSON.stringify(message));
  }
};

// Forward an offer to the callee and remember who is talking to whom.
const handleOffer = (user, message) => {
  const recipient = users[message.to];
  if (recipient) {
    user.otherUser = recipient.id;
    recipient.otherUser = user.id;
    recipient.ws.send(JSON.stringify({
      type: 'offer',
      offer: message.offer,
      // Use the id the server assigned rather than trusting message.from.
      from: user.id,
    }));
  }
};

// Forward an answer back to the caller.
const handleAnswer = (user, message) => {
  const recipient = users[message.to];
  if (recipient) {
    recipient.ws.send(JSON.stringify({
      type: 'answer',
      answer: message.answer,
      from: user.id,
    }));
  }
};

// Dispatch a parsed message by its type.
const handleUserMessage = (user, message) => {
  switch (message.type) {
    case 'offer':
      handleOffer(user, message);
      break;
    case 'answer':
      handleAnswer(user, message);
      break;
    case 'ice-candidate':
      handleICECandidate(user, message);
      break;
    default:
      handleSignalingMessage(user, message);
      break;
  }
};

wss.on('connection', (ws) => {
  console.log('Connection established');
  const user = {
    id: Math.random().toString(36).slice(2),
    ws,
    otherUser: null,
  };
  users[user.id] = user;

  // Tell the client which id it was assigned; peers address each other by these ids.
  ws.send(JSON.stringify({ type: 'welcome', userId: user.id }));

  ws.on('message', (message) => {
    try {
      const parsedMessage = JSON.parse(message.toString());
      handleUserMessage(user, parsedMessage);
    } catch (err) {
      console.error('Failed to handle message', err);
    }
  });

  ws.on('close', () => {
    console.log('Connection closed');
    delete users[user.id];
    // Let the paired peer know that this user has gone away.
    if (user.otherUser) {
      const recipient = users[user.otherUser];
      if (recipient) {
        recipient.ws.send(JSON.stringify({
          type: 'user-disconnected',
          userId: user.id,
        }));
      }
    }
  });
});
The code above implements a WebRTC signaling server over the WebSocket protocol, using the ws library for Node.js. The signaling server acts as a middleman that lets WebRTC peers exchange the information they need to establish a peer-to-peer connection.
The server listens for WebSocket connections on port 8080. When a connection is established, the connection event fires; the handler assigns the new user a random id, stores the user in users, and sets up a message event listener on the socket. When a message arrives, the code tries to parse it as JSON; if parsing fails, it logs an error and ignores the message.
If the message parses, the code checks the type property of the object. If the type is offer, the server forwards the offer to the user named in the message's to field and records the pairing in otherUser. If the type is answer, it forwards the answer to the user named in to in the same way. If the type is ice-candidate, it forwards the candidate to the other peer. Messages of any other type are passed along unchanged to the user named in to.
When a WebSocket connection is closed, the code removes the user from users and, if the user was paired with someone, sends that peer a user-disconnected message.
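The server relays whatever JSON it receives, so a malformed message only fails once it reaches the other browser. One option is to check the shape of each message before forwarding it. The sketch below is a hypothetical helper, not part of the ws library: the name validateSignal and the "candidate" payload field for ice-candidate messages are assumptions, and unlike the default branch of the server's switch statement it rejects unknown types instead of relaying them.

```javascript
// Message types the signaling server handles explicitly.
const RELAYED_TYPES = new Set(['offer', 'answer', 'ice-candidate']);

// Assumed payload field per type; "candidate" is a hypothetical name for the ICE payload.
const PAYLOAD_FIELD = { offer: 'offer', answer: 'answer', 'ice-candidate': 'candidate' };

// Return an error string if the message should be rejected, or null if it looks relayable.
function validateSignal(message) {
  if (message === null || typeof message !== 'object' || Array.isArray(message)) {
    return 'message must be a JSON object';
  }
  if (!RELAYED_TYPES.has(message.type)) {
    return 'unsupported message type';
  }
  if (typeof message.to !== 'string' || message.to === '') {
    return 'missing recipient id in "to"';
  }
  if (message[PAYLOAD_FIELD[message.type]] == null) {
    return `missing "${PAYLOAD_FIELD[message.type]}" payload`;
  }
  return null;
}

// A well-formed offer, followed by one that lacks a recipient.
console.log(validateSignal({ type: 'offer', to: 'abc123', offer: { type: 'offer', sdp: '...' } }));
console.log(validateSignal({ type: 'offer', offer: { type: 'offer', sdp: '...' } }));
```

Such a check would sit in the message handler, right after JSON.parse and before the message is dispatched by type.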
Step 2: Setting up STUN/TURN servers
STUN and TURN servers are used for NAT traversal in WebRTC communication. STUN servers are used to discover public IP addresses and ports of the client's device, while TURN servers are used as a fallback in case peer-to-peer connection establishment is not possible due to NAT/firewall restrictions.
To set up a STUN/TURN server, you can use an open-source implementation like Coturn. Here's an example of setting up Coturn as a TURN server:
1. Install Coturn using a package manager like apt-get or brew:
sudo apt-get install coturn
2. Edit the Coturn configuration file at /etc/turnserver.conf to specify the listening IP address and port, as well as the TURN server credentials. The lt-cred-mech line enables the long-term credential mechanism, which the static user entry relies on:
listening-ip=SERVER_IP_ADDRESS
listening-port=3478
lt-cred-mech
realm=example.com
user=username:password
Replace SERVER_IP_ADDRESS with the IP address of your server, example.com with your domain name, and username:password with the credentials you want to use to access the TURN server.
3. Start the Coturn server:
sudo service coturn start
You can now use the URL of your Coturn server in your WebRTC application as a TURN server URL.
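As a rough sketch of what that URL looks like, the helper below turns the values from the configuration above into entries for WebRTC's iceServers list. The function name iceServersFor and the sample host are hypothetical. It also assumes that the same Coturn instance answers STUN requests on its listening port, which is Coturn's usual behavior but worth confirming against your own setup.

```javascript
// Build iceServers entries for a single Coturn instance.
// host, username and password correspond to the values used in turnserver.conf.
function iceServersFor({ host, port = 3478, username, password }) {
  return [
    // STUN needs no credentials.
    { urls: `stun:${host}:${port}` },
    // TURN uses the username/password from the "user=" line.
    { urls: `turn:${host}:${port}`, username, credential: password },
  ];
}

// Example with placeholder values.
const iceServers = iceServersFor({
  host: 'turn.example.com',
  username: 'username',
  password: 'password',
});
console.log(JSON.stringify(iceServers, null, 2));
```

The resulting array can be passed as the iceServers option when constructing an RTCPeerConnection.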
Client side/browser implementation
To use STUN and TURN servers in WebRTC communication, we need to add the ICE servers to the RTCPeerConnection configuration object. Here's an example of a createPeer function that includes STUN/TURN servers:
function createPeer() {
  const peer = new RTCPeerConnection({
    iceServers: [
      {
        // Google's public STUN server
        urls: 'stun:stun.l.google.com:19302',
      },
      {
        // Replace with your own TURN server and credentials
        urls: 'turn:your-turn-server.com:3478',
        username: 'your-username',
        credential: 'your-password',
      },
    ],
  });
  return peer;
}
Note that in the above example, we are using a STUN server provided by Google (stun.l.google.com:19302). You can replace it with your own STUN server if you have one. Similarly, for the TURN server, you need to provide the URL, username, and password for your server.
To get users connected from the browser, we need to create an offer and send it to the other user. Here's an example of how to do it:
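navigator.mediaDevices.getUserMedia({ video: true, audio: true })
  .then(stream => {
    // Show the local preview.
    const localVideo = document.getElementById('local-video');
    localVideo.srcObject = stream;

    const peer = createPeer();
    stream.getTracks().forEach(track => {
      peer.addTrack(track, stream);
    });

    // Placeholder ids; they must match the ids the signaling server uses for routing.
    const currentUserId = 'test-123';
    const otherUserId = 'test-456';

    const socket = new WebSocket('ws://localhost:8080');

    // The native WebSocket API has send(), not emit(), and can only send once the connection is open.
    socket.addEventListener('open', () => {
      peer.createOffer()
        .then(offer => peer.setLocalDescription(offer))
        .then(() => {
          socket.send(JSON.stringify({
            type: 'offer',
            from: currentUserId,
            to: otherUserId,
            offer: peer.localDescription,
          }));
        });
    });
  });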
In the above example, we use getUserMedia to get the user's local media stream and add its tracks to the peer connection with the addTrack method. We then create an offer with the createOffer method, set it as the local description with the setLocalDescription method, and send the offer to the other user through the signaling server.
On the other end, when the other user receives the offer, they can create an answer and send it back to the first user. Here's an example of how to do it:
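const socket = new WebSocket('ws://localhost:8080');

// Kept outside the handler so the 'answer' branch can reach it. On the calling
// side, this must be the RTCPeerConnection that created the offer.
let peer;

// The native WebSocket API delivers every message through the 'message' event.
socket.addEventListener('message', (event) => {
  const message = JSON.parse(event.data);

  if (message.type === 'offer') {
    // Callee: accept the offer, attach local media, then answer.
    peer = createPeer();
    peer.setRemoteDescription(message.offer)
      .then(() => navigator.mediaDevices.getUserMedia({ video: true, audio: true }))
      .then(stream => {
        const localVideo = document.getElementById('local-video');
        localVideo.srcObject = stream;
        stream.getTracks().forEach(track => {
          peer.addTrack(track, stream);
        });
        return peer.createAnswer();
      })
      .then(answer => peer.setLocalDescription(answer))
      .then(() => {
        socket.send(JSON.stringify({
          type: 'answer',
          answer: peer.localDescription,
          to: message.from,
        }));
      });
  } else if (message.type === 'answer') {
    // Caller: applying the answer completes the offer/answer exchange.
    peer.setRemoteDescription(message.answer);
  }
});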
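In the above example, the receiving user sets the offer as the remote description with the setRemoteDescription method, gets the local media stream with getUserMedia, and adds its tracks to the peer connection with the addTrack method. They then create an answer with the createAnswer method, set it as the local description, and send it back through the signaling server. When the first user receives the answer, they pass it to setRemoteDescription, which completes the offer/answer exchange.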
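Offers and answers are only part of the exchange: each browser also has to send the ICE candidates it gathers from the STUN/TURN servers to the other side, using the ice-candidate message type that the signaling server already relays. The following is a minimal sketch of that wiring. The function name wireIceCandidates and the "candidate" field name are assumptions, and it presumes the remote description has already been set by the time candidates arrive; a production client would typically queue early candidates.

```javascript
// Connect a peer connection's ICE candidates to the signaling socket.
// peer:   an RTCPeerConnection (for example, one returned by createPeer)
// socket: an open WebSocket connected to the signaling server
// from, to: the ids of the local user and the other user
function wireIceCandidates(peer, socket, from, to) {
  // Send every locally gathered candidate to the other user.
  // A null candidate marks the end of gathering and is not sent.
  peer.onicecandidate = (event) => {
    if (event.candidate) {
      socket.send(JSON.stringify({
        type: 'ice-candidate',
        from,
        to,
        candidate: event.candidate,
      }));
    }
  };

  // Apply candidates received from the other user.
  socket.addEventListener('message', (event) => {
    const message = JSON.parse(event.data);
    if (message.type === 'ice-candidate' && message.candidate) {
      peer.addIceCandidate(message.candidate).catch((err) => {
        console.error('Failed to add ICE candidate', err);
      });
    }
  });
}

// Usage in the browser, after creating the peer and the socket:
// wireIceCandidates(peer, socket, currentUserId, otherUserId);
```

Both the calling and the receiving side need this wiring, since each side gathers its own candidates.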
Conclusion
STUN/TURN servers are essential components of VoIP technology that enable real-time communication over the internet. They help overcome the limitations of NAT devices and firewalls, which can prevent VoIP clients from reaching each other directly. Using WebRTC, developers can leverage STUN/TURN servers to build their own VoIP solutions. While building a VoIP solution can be challenging, it can be a rewarding experience that provides complete control over the technology used.