Building Your Own VoIP Solution: A Guide to STUN/TURN Servers and JavaScript
Published 3/4/2023 | 13 min read

VoIP, or Voice over Internet Protocol, is a technology that allows users to make voice and video calls over the internet. In order to establish a direct connection between two clients, they need to exchange information about their IP addresses and network configurations. However, some networks use firewalls, NATs, or other security measures that make it difficult for VoIP clients to communicate directly. This is where STUN and TURN servers come in.



What are STUN/TURN servers?

STUN, or Session Traversal Utilities for NAT, is a protocol that enables clients to discover their public IP addresses and network configuration, even if they are behind a NAT or firewall. STUN servers work by sending the client a response containing its public IP address and port, which the client can then use to establish a direct connection with another client.
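To make the exchange concrete, here is a minimal Node.js sketch of the two halves of a STUN binding exchange: building a Binding Request and decoding the XOR-MAPPED-ADDRESS attribute from a response. The function names are illustrative, only IPv4 is handled, and the UDP transport and most validation are left out. Browsers do all of this internally, so the sketch is only meant to show what "discovering the public address" involves.

```javascript
const crypto = require('crypto');

const MAGIC_COOKIE = 0x2112a442;    // fixed value defined by the STUN specification
const BINDING_REQUEST = 0x0001;     // message type of a Binding Request
const XOR_MAPPED_ADDRESS = 0x0020;  // attribute carrying the client's public address

// Build a 20-byte Binding Request header with no attributes.
function buildBindingRequest(transactionId = crypto.randomBytes(12)) {
  const header = Buffer.alloc(20);
  header.writeUInt16BE(BINDING_REQUEST, 0); // message type
  header.writeUInt16BE(0, 2);               // message length (no attributes)
  header.writeUInt32BE(MAGIC_COOKIE, 4);    // magic cookie
  transactionId.copy(header, 8);            // 96-bit transaction id
  return header;
}

// Walk the attributes of a response and decode an IPv4 XOR-MAPPED-ADDRESS.
// Returns { address, port } or null when the attribute is absent.
function parseXorMappedAddress(message) {
  const end = Math.min(message.length, 20 + message.readUInt16BE(2));
  let offset = 20; // attributes start right after the header
  while (offset + 4 <= end) {
    const type = message.readUInt16BE(offset);
    const attrLength = message.readUInt16BE(offset + 2);
    const value = offset + 4;
    if (type === XOR_MAPPED_ADDRESS && message[value + 1] === 0x01) {
      // The port is XORed with the top 16 bits of the cookie, the address with all 32.
      const port = message.readUInt16BE(value + 2) ^ (MAGIC_COOKIE >>> 16);
      const raw = (message.readUInt32BE(value + 4) ^ MAGIC_COOKIE) >>> 0;
      const address = [raw >>> 24, (raw >>> 16) & 0xff, (raw >>> 8) & 0xff, raw & 0xff].join('.');
      return { address, port };
    }
    // Attribute values are padded to a multiple of 4 bytes.
    offset = value + attrLength + ((4 - (attrLength % 4)) % 4);
  }
  return null;
}
```

The address and port returned by the server are the ones the NAT assigned to the outgoing request, which is what the client later advertises to its peer.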

TURN, or Traversal Using Relays around NAT, is a protocol that lets clients relay their traffic through a third-party server when a direct connection is not possible. A TURN server allocates a relay address for a client, receives traffic on the client's behalf, and forwards it in both directions between the client and its peer. TURN is typically used as a fallback when the addresses discovered through STUN do not lead to a working direct connection, or as the primary path when clients cannot reach each other directly at all. Because all media then flows through the relay, it costs more bandwidth and adds latency.

Using STUN/TURN servers in JavaScript

In JavaScript, STUN and TURN servers can be used with WebRTC, a technology that enables real-time communication between web browsers. WebRTC provides a set of APIs that allow developers to create peer-to-peer connections, stream media, and exchange data. By configuring STUN and TURN servers, developers can make their WebRTC applications connect reliably across most network environments, including those behind NATs and firewalls.



Building your own VoIP solution

If you want to build your own VoIP solution without using a VoIP provider, you can use WebRTC and STUN/TURN servers to create a peer-to-peer communication system. This approach can be useful for companies that want to have complete control over their communication infrastructure, or for developers who want to experiment with new technologies.

To build your own VoIP solution, you will need to create a signaling server that relays signaling messages between clients. Signaling messages carry session descriptions (offers and answers, which describe media capabilities) and ICE candidates (the IP addresses and ports at which a client might be reachable). Once clients have exchanged these messages, they can establish a direct connection using WebRTC.
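As a rough illustration of what such messages can look like, here is a sketch of a JSON message shape together with a small validator. The type, from, to, and offer fields follow the examples used in this guide; the candidate field name and the validator itself are assumptions made for illustration, not part of WebRTC.

```javascript
// Message types exchanged through the signaling server.
const SIGNALING_TYPES = ['offer', 'answer', 'ice-candidate'];

// Returns true when a parsed message has a known type, a recipient,
// and the payload that belongs to its type.
function isValidSignalingMessage(message) {
  if (typeof message !== 'object' || message === null) return false;
  if (!SIGNALING_TYPES.includes(message.type)) return false;
  if (typeof message.to !== 'string' || message.to.length === 0) return false;
  switch (message.type) {
    case 'offer':
      return Boolean(message.offer && message.offer.sdp);
    case 'answer':
      return Boolean(message.answer && message.answer.sdp);
    default: // 'ice-candidate'
      return Boolean(message.candidate);
  }
}

// Example offer message; the SDP body is shortened for readability.
const exampleOffer = {
  type: 'offer',
  from: 'test-123',
  to: 'test-456',
  offer: { type: 'offer', sdp: 'v=0\r\n...' },
};

console.log(isValidSignalingMessage(exampleOffer)); // true
```

Validating messages before forwarding them keeps malformed input from one client away from the other client's WebRTC calls.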

In short, STUN and TURN servers are essential components of a VoIP setup: STUN lets clients discover how they appear from the internet, and TURN relays traffic when no direct path exists. Combined with WebRTC and a signaling server, they are enough to build a peer-to-peer communication system without a VoIP provider. The rest of this guide walks through each piece.



Step 1: Setting up the signaling server

The signaling server is a crucial component in WebRTC-based communication because it helps establish the initial connection between two peers. It's responsible for exchanging metadata and negotiation messages that enable the establishment of the peer-to-peer connection. Essentially, the signaling server serves as an intermediary between two peers, allowing them to communicate and exchange information about their WebRTC capabilities, such as video and audio codecs, bandwidth, resolution, and other important details.

Setting up a signaling server in JavaScript requires a few steps. First, we need to create a server that can receive and transmit messages between peers. Next, we need to agree on a format for the metadata and negotiation messages the peers exchange. WebRTC does not mandate a signaling protocol: established options include SIP and XMPP, and many applications simply send custom JSON messages over a WebSocket connection, which is the approach used here.

To create a WebRTC signaling server in Node.js, you can use the ws package to handle WebSocket connections between the client and the server. Here is an example:

  
# Install ws using npm
npm install ws



  
const WebSocket = require('ws');

const wss = new WebSocket.Server({ port: 8080 });

const users = {};

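// Each entry in `users` is a { id, ws, otherUser } record, so messages
// must be sent through the record's `ws` socket.

// Forward any other message type unchanged to the user named in `to`
const handleSignalingMessage = (user, message) => {
  const recipient = users[message.to];
  if (recipient) {
    recipient.ws.send(JSON.stringify(message));
  }
}

// Forward an ICE candidate to the user named in `to`
const handleICECandidate = (user, message) => {
  const recipient = users[message.to];
  if (recipient) {
    recipient.ws.send(JSON.stringify(message));
  }
}

// Forward an offer to the callee and remember who the caller is talking to
const handleOffer = (user, message) => {
  user.otherUser = message.to;
  const recipient = users[message.to];
  if (recipient) {
    recipient.ws.send(JSON.stringify({
      type: 'offer',
      offer: message.offer,
      from: message.from,
    }));
  }
}

// Forward an answer back to the caller
const handleAnswer = (user, message) => {
  const recipient = users[message.to];
  if (recipient) {
    recipient.ws.send(JSON.stringify({
      type: 'answer',
      answer: message.answer,
      from: message.from,
    }));
  }
}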

const handleUserMessage = (user, message) => {
  switch (message.type) {
    case 'offer':
      handleOffer(user, message);
      break;
    case 'answer':
      handleAnswer(user, message);
      break;
    case 'ice-candidate':
      handleICECandidate(user, message);
      break;
    default:
      handleSignalingMessage(user, message);
      break;
  }
}

wss.on('connection', (ws) => {
  console.log('Connection established');

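  const user = {
    id: Math.random().toString(36).slice(2),
    ws,
    otherUser: null,
  };
  users[user.id] = user;

  // Assumption: the client needs to learn its server-assigned id so that
  // peers can address each other; the 'welcome' message type is illustrative.
  ws.send(JSON.stringify({ type: 'welcome', id: user.id }));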

  ws.on('message', (message) => {
    try {
      const parsedMessage = JSON.parse(message);
      handleUserMessage(user, parsedMessage);
    } catch (err) {
      console.error('Failed to parse message', err);
    }
  });

  ws.on('close', () => {
    console.log('Connection closed');
    delete users[user.id];
    if (user.otherUser) {
      const recipient = users[user.otherUser];
      if (recipient) {
        // `recipient` is a user record; send through its socket
        recipient.ws.send(JSON.stringify({
          type: 'user-disconnected',
          userId: user.id,
        }));
      }
    }
  });
});

The code above shows an implementation of a WebRTC signaling server using the WebSocket protocol, as implemented by the ws library in Node.js. The signaling server acts as a middleman for WebRTC peers to exchange information required to establish a peer-to-peer connection.

The server listens for WebSocket connections on port 8080. When a connection is established, the connection event fires, the server creates a user record with a random id, stores it in the users map, and attaches a message listener to the socket. When a message arrives, the code tries to parse it as JSON; if parsing fails, it logs an error and ignores the message.

If parsing succeeds, the code dispatches on the message's type property. Offers, answers, and ICE candidates (type ice-candidate) are each forwarded to the single user named in the message's to field, not broadcast to everyone. Handling an offer also records the recipient as the sender's otherUser. Any other message type is forwarded unchanged to the same recipient.

When a WebSocket connection is closed, the code removes the user from the users map and, if that user had sent an offer, sends a user-disconnected message to the peer recorded in otherUser.
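The forwarding logic can be tried out without a network by passing in stub sockets. The following is a dependency-free sketch rather than a drop-in replacement for the server: routeMessage and createStubSocket are hypothetical helpers, and overwriting from with the sender's id is a design choice added here so that a client cannot impersonate another user.

```javascript
// Forward a parsed message to the user named in message.to.
// `users` maps user ids to { id, ws } records, as in the server above.
function routeMessage(users, sender, message) {
  const recipient = users[message.to];
  if (!recipient) return false; // unknown recipient: nothing is sent
  // Stamp the sender id on the server so a client cannot spoof `from`.
  recipient.ws.send(JSON.stringify({ ...message, from: sender.id }));
  return true;
}

// Stub socket that records what would have been sent over the wire.
function createStubSocket() {
  const sent = [];
  return { sent, send: (data) => sent.push(data) };
}

const users = {
  alice: { id: 'alice', ws: createStubSocket() },
  bob: { id: 'bob', ws: createStubSocket() },
};

routeMessage(users, users.alice, {
  type: 'offer',
  to: 'bob',
  offer: { type: 'offer', sdp: '...' },
});

console.log(users.bob.ws.sent.length); // 1
```

Keeping the routing in a plain function like this makes it straightforward to unit test before wiring it to real WebSocket connections.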



Step 2: Setting up STUN/TURN servers

STUN and TURN servers are used for NAT traversal in WebRTC communication. STUN servers are used to discover public IP addresses and ports of the client's device, while TURN servers are used as a fallback in case peer-to-peer connection establishment is not possible due to NAT/firewall restrictions.

To set up a STUN/TURN server, you can use an open-source implementation like Coturn. Here's an example of setting up Coturn as a TURN server:

1. Install Coturn using a package manager like apt-get or brew:

  
sudo apt-get install coturn

2. Edit the Coturn configuration file at /etc/turnserver.conf to specify the listening IP address and port, as well as the TURN server credentials:

  
listening-ip=SERVER_IP_ADDRESS
listening-port=3478
realm=example.com
# Long-term credential mechanism, which WebRTC clients use with static users
lt-cred-mech
user=username:password

Replace SERVER_IP_ADDRESS with the IP address of your server, example.com with your domain name, and username:password with the credentials you want to use to access the TURN server.

3. Start the Coturn server:

  
sudo service coturn start

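You can now use your Coturn server in your WebRTC application as a TURN server, with a URL of the form turn:SERVER_IP_ADDRESS:3478 and the credentials configured above. Coturn also answers STUN requests on the same port, so stun:SERVER_IP_ADDRESS:3478 can serve as your STUN server URL.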
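A fixed username and password end up in browser code, where any visitor can read them. Coturn also supports time-limited credentials derived from a shared secret (the use-auth-secret and static-auth-secret options in turnserver.conf, used instead of a user= line). Below is a minimal sketch of how a backend could derive such credentials, assuming that mode is enabled; the function name, the one-hour lifetime, and the injectable clock are illustrative choices.

```javascript
const crypto = require('crypto');

// Derive short-lived TURN credentials from a secret shared with the TURN server.
// The username is "<expiry unix time>:<label>" and the credential is the
// base64-encoded HMAC-SHA1 of that username, keyed with the shared secret.
function createTurnCredentials(
  secret,
  userId,
  ttlSeconds = 3600,
  nowSeconds = Math.floor(Date.now() / 1000), // injectable for testing
) {
  const username = `${nowSeconds + ttlSeconds}:${userId}`;
  const credential = crypto
    .createHmac('sha1', secret)
    .update(username)
    .digest('base64');
  return { username, credential };
}

// Hypothetical usage: hand the result to the browser, which passes it as the
// username and credential of the TURN entry in its iceServers list.
const turnCredentials = createTurnCredentials('replace-with-your-secret', 'test-123');
console.log(turnCredentials.username.endsWith(':test-123')); // true
```

The secret itself never leaves the server; the browser only sees a username and credential that stop working once the expiry time has passed.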



Client side/browser implementation

To use STUN and TURN servers in WebRTC communication, we need to add the ICE servers to the RTCPeerConnection configuration object. Here's an example of a createPeer function that includes STUN/TURN servers:

  
function createPeer() {
  const peer = new RTCPeerConnection({
    iceServers: [
      {
        urls: 'stun:stun.l.google.com:19302',
      },
      {
        urls: 'turn:your-turn-server.com:3478',
        username: 'your-username',
        credential: 'your-password',
      },
    ],
  });
  
  // ...
  
  return peer;
}

Note that in the above example, we are using a STUN server provided by Google (stun.l.google.com:19302). You can replace it with your own STUN server if you have one. Similarly, for the TURN server, you need to provide the URL, username, and password for your server.
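When testing a new TURN server, it helps to rule out direct connections so that a successful call proves the relay works. One way to do that is the iceTransportPolicy option of the RTCPeerConnection configuration. The sketch below builds the configuration as a plain object; buildIceConfig and its parameter names are hypothetical, and the URLs and credentials are the placeholders used above.

```javascript
// Build an RTCPeerConnection configuration from plain settings.
// With forceRelay set, ICE only uses relay candidates, so a call connects
// only if the TURN server is reachable and accepts the credentials.
function buildIceConfig({ stunUrl, turnUrl, username, credential, forceRelay = false }) {
  return {
    iceServers: [
      { urls: stunUrl },
      { urls: turnUrl, username, credential },
    ],
    iceTransportPolicy: forceRelay ? 'relay' : 'all',
  };
}

const config = buildIceConfig({
  stunUrl: 'stun:stun.l.google.com:19302',
  turnUrl: 'turn:your-turn-server.com:3478',
  username: 'your-username',
  credential: 'your-password',
  forceRelay: true,
});

// In the browser: const peer = new RTCPeerConnection(config);
console.log(config.iceTransportPolicy); // relay
```

For normal operation, leave forceRelay off so that direct connections are preferred and the relay is only used when needed.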

To get users connected from the browser, we need to create an offer and send it to the other user. Here's an example of how to do it:

  
// Get local media stream
navigator.mediaDevices.getUserMedia({ video: true, audio: true })
  .then(stream => {
    const localVideo = document.getElementById('local-video');
    localVideo.srcObject = stream;
  
    // Create peer connection
    const peer = createPeer();
  
    // Add local stream to peer connection
    stream.getTracks().forEach(track => {
      peer.addTrack(track, stream);
    });

    const currentUserId = 'test-123'; // get currentUserId somehow
    const otherUserId = 'test-456'; // get otherUserId somehow

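    // Our signaling server
    const socket = new WebSocket('ws://localhost:8080');

    // The browser WebSocket API has no emit(); wait for the socket to open,
    // then send JSON strings with send()
    socket.addEventListener('open', () => {
      // Create offer and set local description
      peer.createOffer()
        .then(offer => peer.setLocalDescription(offer))
        .then(() => {
          // Send offer to other user
          socket.send(JSON.stringify({
            type: 'offer',
            from: currentUserId,
            to: otherUserId,
            offer: peer.localDescription,
          }));
        });
    });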
  });



In the above example, we are using getUserMedia to get the user's local media stream, and then adding it to the peer connection using the addTrack method. We then create an offer using the createOffer method, set the local description using the setLocalDescription method, and send the offer to the other user using the signaling server.
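The offer alone is usually not enough to connect: each side also has to pass the ICE candidates it gathers (including those learned from the STUN and TURN servers) to the other side. Here is a minimal sketch of that exchange, using the ice-candidate message type that the signaling server forwards. The helper names and the candidate field are assumptions, and peer and socket stand for an RTCPeerConnection and an open WebSocket.

```javascript
// Send each locally gathered ICE candidate to the other user.
function sendLocalCandidates(peer, socket, from, to) {
  peer.onicecandidate = (event) => {
    // A null candidate signals that gathering has finished.
    if (!event.candidate) return;
    socket.send(JSON.stringify({
      type: 'ice-candidate',
      from,
      to,
      candidate: event.candidate,
    }));
  };
}

// Apply a candidate received from the other user.
// Call this only after setRemoteDescription has been called on `peer`.
function handleRemoteCandidate(peer, message) {
  return peer.addIceCandidate(message.candidate);
}
```

Both the caller and the callee would call sendLocalCandidates right after creating their peer connection, and route incoming ice-candidate messages to handleRemoteCandidate.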

On the other end, when the other user receives the offer, they can create an answer and send it back to the first user. Here's an example of how to do it:

  
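// Our signaling server
const socket = new WebSocket('ws://localhost:8080');

// Shared by both handlers. On the caller's side, assign the connection
// that sent the offer to this variable.
let peer = null;

// Callee: received an offer from the other user
const handleOffer = async ({ offer, from }) => {
  // Create peer connection
  peer = createPeer();

  // Set remote description
  await peer.setRemoteDescription(offer);

  // Add local media stream to peer connection
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  const localVideo = document.getElementById('local-video');
  localVideo.srcObject = stream;

  stream.getTracks().forEach(track => {
    peer.addTrack(track, stream);
  });

  // Create answer and set local description
  const answer = await peer.createAnswer();
  await peer.setLocalDescription(answer);

  // Send answer to other user
  socket.send(JSON.stringify({
    type: 'answer',
    answer,
    to: from,
  }));
};

// Caller: received an answer from the other user
const handleAnswer = async ({ answer }) => {
  // Set remote description
  await peer.setRemoteDescription(answer);
};

// The browser WebSocket API has no on('offer'); every message arrives
// through the 'message' event and is dispatched on its type
socket.addEventListener('message', (event) => {
  const message = JSON.parse(event.data);
  if (message.type === 'offer') {
    handleOffer(message).catch(console.error);
  } else if (message.type === 'answer') {
    handleAnswer(message).catch(console.error);
  }
});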

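In the above example, the callee sets the received offer as the remote description with setRemoteDescription, gets the local media stream with getUserMedia, and adds its tracks to the peer connection with addTrack. It then creates an answer with createAnswer, sets it as the local description, and sends it back through the signaling server. When the caller receives the answer, it sets it as its own remote description, which completes the offer/answer exchange.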
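Once the connection is up, each side still has to display the media it receives. A minimal sketch, assuming the page contains a video element for the remote party (the remote-video id mentioned below is hypothetical, as is the helper name):

```javascript
// Show incoming media once the remote peer's tracks arrive.
// `peer` is an RTCPeerConnection; `videoElement` is a <video> element.
function attachRemoteStream(peer, videoElement) {
  peer.ontrack = (event) => {
    const [remoteStream] = event.streams;
    // Audio and video tracks of one stream fire separate events; assign once.
    if (remoteStream && videoElement.srcObject !== remoteStream) {
      videoElement.srcObject = remoteStream;
    }
  };
}
```

In the browser this could be called as attachRemoteStream(peer, document.getElementById('remote-video')) right after the peer connection is created, on both the caller's and the callee's side.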



Conclusion

STUN and TURN servers are essential components of VoIP technology that enable real-time communication over the internet. They help overcome the limitations of NAT devices and firewalls, which can otherwise prevent VoIP clients from reaching each other. Using WebRTC, developers can leverage STUN/TURN servers to build their own VoIP solutions. While building a VoIP solution can be challenging, it can be a rewarding experience that provides complete control over the technology used.


