If you’re ready to build a WebRTC-based live video application, the most important architectural decision you need to make is which media server to use.

A purely Peer-to-Peer (P2P) WebRTC video call does not require a media server. All video and audio media is transferred between the peers, and all your application has to do is establish the P2P connection using a process known as signaling. 

However, the vast majority of WebRTC applications use media servers in order to provide features such as group chat, recording, broadcast, transcriptions, or just to improve quality across users on different devices.

We’ve talked about media servers many times in our videos at WebRTC.ventures. You may want to check those out as you weigh your options.

You may also want to watch the presentation that Alberto and I did at TADSummit on Architecting your WebRTC Application, which covers a lot of these topics.

I will also mention that our fabulous intern Altanai Bisht just wrote a great technical blog post on Configuring Asterisk as a WebRTC SFU Media Server!

Which media server should you choose?

There are many options, ranging from tightly controlled commercial APIs to open source projects. In this blog post, I will cover the five main factors on which you should base your decision.

Five factors to determine your WebRTC media server selection:

  1. Upfront budget
  2. Operational budget
  3. Scaling expectations
  4. Feature expectations
  5. Availability of expertise

 

Factor #1 – Upfront Budget

Some CPaaS (Communications Platform as a Service) providers, such as daily.co, stress how easy it is to drop their functionality into an application. Others offer a starter kit or sample projects that give you a head start on building applications with relatively straightforward UI requirements.

Choosing a media server option that has sample projects or a drop-in UI component can help reduce the upfront cost of building your application. Keep in mind that sample projects are by definition very incomplete. So even if you can get the video part running quickly, you will still have to build lots of functionality around it.

Factor #2 – Operational Budget

What will the media server cost? This is crucial to understand up front since your business model likely depends on it.

Most CPaaS providers, such as Vonage and Twilio, offer usage-based pricing. These are easy to start with, and you won’t have to invest in cloud servers of your own or the DevOps staff to manage them. That can be very nice, although as your app scales the cost may become burdensome unless you can negotiate a bulk rate directly with the provider.

Open source solutions offer a lower operational cost after a certain volume is reached. You’ll have to invest more up front to build out the cloud infrastructure. But once it’s up and running, there’s no big usage-based bill from a vendor each month, only the cost of your own infrastructure and the staff who run it.
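To make that trade-off concrete, here is a minimal sketch of the break-even arithmetic. Every figure in it (the per-minute price, the fixed monthly infrastructure cost, and the marginal cost per minute) is a made-up placeholder rather than a quote from any provider, so substitute your own numbers.

```python
from typing import Optional


def cpaas_monthly_cost(participant_minutes: float, price_per_minute: float) -> float:
    """Usage-based bill: you pay only for the minutes you consume."""
    return participant_minutes * price_per_minute


def self_hosted_monthly_cost(
    participant_minutes: float,
    fixed_monthly: float,
    marginal_per_minute: float,
) -> float:
    """Fixed cost (servers, DevOps time) plus a small per-minute cost (e.g. bandwidth)."""
    return fixed_monthly + participant_minutes * marginal_per_minute


def break_even_minutes(
    price_per_minute: float,
    fixed_monthly: float,
    marginal_per_minute: float,
) -> Optional[float]:
    """Monthly volume above which self-hosting is cheaper.

    Returns None when the usage-based price is not higher than your own
    marginal cost, because self-hosting then never catches up.
    """
    saving_per_minute = price_per_minute - marginal_per_minute
    if saving_per_minute <= 0:
        return None
    return fixed_monthly / saving_per_minute


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to illustrate the calculation.
    volume = break_even_minutes(
        price_per_minute=0.004,
        fixed_monthly=3000.0,
        marginal_per_minute=0.001,
    )
    print(f"Break-even at roughly {volume:,.0f} participant-minutes per month")
```

Below the break-even volume the usage-based option is cheaper; above it, the fixed investment starts to pay off. The sketch ignores the upfront build cost from Factor #1, which you would amortize into the fixed monthly figure.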

Factor #3 – Scaling Expectations

This is closely related to the previous factor around your operational budget. One advantage of going with a CPaaS media server is that you are outsourcing all the complexity of scaling a cluster of video servers to others. You still need to scale your web app, but the video part should scale automatically without you doing anything.

However, you’ll need to build out your own scaling infrastructure with most open source solutions. While this will give you ultimate control and insight over your solution, you will need to invest more work up front. 
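As a rough illustration of what sizing your own infrastructure involves, the sketch below estimates how many media server instances a given load needs. It assumes an SFU-style server where every participant publishes one stream and receives one from each other participant. The per-instance capacity and headroom figures are hypothetical placeholders that you would replace with numbers from your own load tests.

```python
def forwarded_streams(participants: int) -> int:
    """Streams the server forwards for one room.

    Assumption: each participant receives one stream from every other
    participant, so a room of n people needs n * (n - 1) forwarded streams.
    """
    if participants < 0:
        raise ValueError("participants must not be negative")
    return participants * max(participants - 1, 0)


def instances_needed(
    rooms: int,
    participants_per_room: int,
    streams_per_instance: int,
    headroom_percent: int = 30,
) -> int:
    """Instances required, keeping headroom_percent of each instance in reserve."""
    usable = streams_per_instance * (100 - headroom_percent) // 100
    if usable <= 0:
        raise ValueError("no usable capacity left after headroom")
    total = rooms * forwarded_streams(participants_per_room)
    # Integer ceiling division; always keep at least one instance running.
    return max(1, -(-total // usable))


if __name__ == "__main__":
    # Hypothetical load: 100 concurrent rooms of 6 people each,
    # on instances assumed to handle 500 forwarded streams.
    print(instances_needed(rooms=100, participants_per_room=6, streams_per_instance=500))
```

Note how quickly the forwarded stream count grows with room size. That growth is the reason the CPaaS promise of automatic scaling is worth something, and it is the work you take on when you self-host.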

Also consider the specific features of your application. For example, open source media servers differ in how they handle recordings. One of those architectures may suit you better than another, depending on how many recordings your application will be making as it scales.

Factor #4 – Feature Expectations

Speaking of features like recording, what else will your application need to do? You can increasingly find advanced functionality built into CPaaS offerings for things like spatial audio, blurred backgrounds, and much more.

However, a more advanced telephony integration may require building your own open source media server configuration. Or, you may want to get into more bleeding-edge features like the OffscreenCanvas API or WebCodecs. In these scenarios, you will likely need to choose an open source media server that will let you do more work under the hood or further customize your implementation beyond what a commercial offering can provide.

Factor #5 – Availability of Expertise

The most advanced application architecture in the world is meaningless if you can’t find someone to build it for you! Consider the technical skills of your in-house talent, as well as those you can contract with, such as our team at WebRTC.ventures. When looking at open source media servers, consider the language each one is written in, how active its user community is, which company leads the project, and whether that company offers consulting services around its work.

Let’s get started!

Building a WebRTC application is rarely simple, and finding the technical talent to build your application is hard. There is no single answer. You need to consider all of these factors when making the decision best for your specific application.
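One simple way to weigh all five factors together is a weighted scoring matrix. The sketch below is only an illustration of that idea: the option names, the weights, and the 1 to 5 scores are invented for the example and do not describe any real product.

```python
from typing import Dict, List, Tuple

FACTORS = (
    "upfront_budget",
    "operational_budget",
    "scaling",
    "features",
    "expertise",
)


def rank_options(
    weights: Dict[str, int],
    scores: Dict[str, Dict[str, int]],
) -> List[Tuple[str, int]]:
    """Return (option, weighted total) pairs, best first.

    weights: how much each factor matters to your project.
    scores:  how well each option does on each factor (higher is better).
    """
    totals = []
    for option, factor_scores in scores.items():
        total = sum(weights[f] * factor_scores[f] for f in FACTORS)
        totals.append((option, total))
    # Highest total first; ties are broken alphabetically for stable output.
    return sorted(totals, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Hypothetical project: tight upfront budget, small in-house team.
    weights = {
        "upfront_budget": 3,
        "operational_budget": 2,
        "scaling": 2,
        "features": 1,
        "expertise": 2,
    }
    scores = {
        "hosted_cpaas": {
            "upfront_budget": 5, "operational_budget": 2,
            "scaling": 5, "features": 3, "expertise": 4,
        },
        "self_hosted_open_source": {
            "upfront_budget": 2, "operational_budget": 4,
            "scaling": 3, "features": 5, "expertise": 2,
        },
    }
    for option, total in rank_options(weights, scores):
        print(option, total)
```

Change the weights and the ranking can flip, which is the point: the right media server depends on which factors matter most to your application.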

We are here to help! We can help you determine the best media server to use for your unique situation. Then, our team of experts can build out the solution for you or in collaboration with your internal team. Contact us today!
