WebRTC Connectivity Solution with Focus on Quality and Scale

The whole is greater than the sum of its parts

We released a PR today about the AudioCodes WebRTC Connectivity Solution. In this post, I want to provide some technical insight into this solution and explain why the whole is greater than the sum of its parts.

A typical approach to connecting WebRTC with an existing enterprise VoIP network would follow one of these two architectural concepts:

  • Browser using Opus with some kind of proprietary signaling -> WebRTC to SIP GW -> SBC -> Transcoding to G.7xx -> IP Phone on a SIP network
  • Browser using G.711 with some kind of proprietary signaling -> WebRTC to SIP GW -> SBC -> [Optional: Transcoding from G.711 to G.7xx] -> IP Phone on a SIP network

Both cases introduce some issues.

The external WebRTC to SIP GW carries a price.

Transcoding from Opus to G.7xx is something you would want to avoid unless there is no other option. Opus requires hefty CPU resources (much more than most audio codecs), so transcoding it introduces quality issues while increasing CAPEX.

The other alternative, using G.711 over the open internet, is not a good option as G.711 wasn’t built for sustaining quality over unmanaged networks.

Traffic traverses between networks and different network types (WiFi, Wireline Ethernet, public Internet and enterprise networks) while devices themselves can vary as well. Handling of network impairments should be done in the entity that connects between these networks as it is familiar with the requirements of each and has the per-session knowledge of source and destination of traffic. While the two links above use VoLTE as an example in the post, similar issues arise when traversing WebRTC to non-WebRTC networks and the algorithms described in these posts serve the need of this architecture as well.

Another missing piece in typical deployments is the ability to monitor the traffic, know what is happening on the network and influence SBC decisions in order to improve quality.

Putting quality and scale at the front

The solution announced was designed with these two factors at the forefront, and that is what makes it stand out from the crowd.

Integrated solution

In our solution, WebRTC is supported in the SBC itself. This includes WebSockets, DTLS and other WebRTC “special” behavior. This architecture simplifies deployment and management and reduces delay (in other words, increases quality).

Minimal transcoding scenarios

By supporting Opus in the AudioCodes IP Phone, we hold the rope at both ends. On the one hand, voice over the open Internet uses the Opus codec, which is purpose-built for this task. On the other hand, transcoding is not required as the IP Phone also supports Opus. This significantly increases the number of calls supported while reducing CAPEX.
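To make the point concrete, here is a minimal sketch of the decision that drives transcoding. The function name and the codec lists are hypothetical and only illustrate the principle: transcoding is needed only when the two call legs share no codec.

```python
from typing import Optional, Sequence


def pick_common_codec(browser_codecs: Sequence[str],
                      phone_codecs: Sequence[str]) -> Optional[str]:
    """Return the first browser-preferred codec the phone also supports.

    None means the two legs share no codec, so a box in the middle
    would have to transcode.
    """
    phone_set = {codec.lower() for codec in phone_codecs}
    for codec in browser_codecs:
        if codec.lower() in phone_set:
            return codec
    return None


# Hypothetical codec lists, for illustration only.
browser = ["opus", "PCMU", "PCMA"]
legacy_phone = ["G729", "G722"]
opus_phone = ["opus", "G729", "G722"]

for name, phone in (("legacy phone", legacy_phone), ("Opus phone", opus_phone)):
    codec = pick_common_codec(browser, phone)
    if codec is None:
        print(f"{name}: no shared codec, transcoding required")
    else:
        print(f"{name}: end-to-end {codec}, no transcoding")
```

With Opus on the phone, the first match is Opus and the media can pass through untouched.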


For signaling we use SIP over WebSockets. We found this to be a good solution when the goal is to connect to existing networks.
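For readers who have not seen SIP over WebSockets (RFC 7118) on the wire, the sketch below builds a minimal REGISTER request as a client might send it in a WebSocket message. It is an illustration only, not the AudioCodes implementation: the user, domain and helper name are assumptions, and a real client would also handle authentication and the WebSocket handshake (which negotiates the "sip" subprotocol).

```python
import uuid

SIP_WS_SUBPROTOCOL = "sip"  # WebSocket subprotocol token defined by RFC 7118


def build_register(user: str, domain: str, secure: bool = True) -> str:
    """Build a minimal SIP REGISTER request for a WebSocket transport."""
    # The Via transport is "WSS" over TLS and "WS" otherwise.
    via_transport = "WSS" if secure else "WS"
    # A browser has no routable address of its own, so the RFC 7118
    # examples use a random host under the reserved ".invalid" domain.
    local_host = f"{uuid.uuid4().hex[:12]}.invalid"
    aor = f"sip:{user}@{domain}"
    lines = [
        f"REGISTER sip:{domain} SIP/2.0",
        f"Via: SIP/2.0/{via_transport} {local_host};branch=z9hG4bK{uuid.uuid4().hex[:10]}",
        "Max-Forwards: 70",
        f"From: <{aor}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <{aor}>",
        f"Call-ID: {uuid.uuid4().hex}",
        "CSeq: 1 REGISTER",
        f"Contact: <sip:{user}@{local_host};transport=ws>",
        "Expires: 600",
        "Content-Length: 0",
    ]
    # SIP headers are CRLF-terminated and followed by an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical user and domain, for illustration only.
message = build_register("alice", "example.com")
print(message)
```

Apart from the Via transport and the Contact address, this is ordinary SIP, which is why it maps so easily onto existing SIP networks.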

Another advantage of the decision to use SIP for signaling is the connection to WebRTC API platforms. WebRTC API platforms typically use proprietary signaling. For connecting to existing networks they build an adaptor, which is typically SIP. This basic SIP implementation can’t connect to just any existing enterprise SIP network, but going through the SBC allows it to do so. While this SIP adaptor varies and is not always pure WebRTC over WebSockets, removing the need for it to implement a full WebRTC to SIP GW simplifies the interconnection.

Voice quality enhancement

As mentioned, traffic traversing between networks typically suffers from call quality degradation. There are different implementations of media engines on the client side that are optimized for the network with which the client plans to work. When the client connects with another client on a network that it wasn’t built to connect with, voice call quality issues increase. Having the SBC as a demarcation point between the networks allows for the utilization of the AudioCodes voice enhancement algorithms for improving audio quality.

Detect & correct

The AudioCodes Session Experience Management (SEM) not only monitors and detects voice call quality issues, it also works in harmony with the SBC to allow for smart, quality-based routing and quality improvements.

Why is this important?

A WebRTC GW doesn’t stand by itself; it needs a multitude of capabilities and supporting elements to ensure effective and high quality service. This is why the whole is indeed greater than the sum of its parts when bringing together all that is required for an end-to-end WebRTC enterprise connectivity solution.

Garfield & Odie

The NSA is After My Grandmother’s Secret Lasagna Recipe

Commenter: Hi, can I ask a WebRTC related question?

WebRTC forum moderator: Of course, you are in the right place.

Commenter: My name is Garfield.

WebRTC forum moderator: Ahh OK, are you related to the cat?

Commenter: Seriously? When’s the last time a cat asked questions in the WebRTC forum?

WebRTC forum moderator: Good point.  How can I assist you today, Mr. ahh Garfield?

Commenter: I have a problem with a family secret which has been passed down from mouth to ear for centuries.

WebRTC forum moderator: And how is it related to the WebRTC forum?

Commenter: Well, I am quite old and I realized that this is a good time to pass the secret to the next generation. I have been told that WebRTC has the most secure media channel there is. So I thought this would be a good place to do it.

WebRTC forum moderator: If I may ask, how old are you?

Commenter: 15 years old.

WebRTC forum moderator: Excuse me? Ahh. Why don’t you pass your secret to your children?

Commenter: They’re on the other side of the ocean and I am afraid to do it on an international call.

WebRTC forum moderator: Why?

Commenter: Hey, I may be old but I’m not so naive to let the NSA learn about my grandmother’s secret lasagna recipe.

WebRTC forum moderator: I get your point. You heard right about WebRTC. Let me explain: The security mechanism in WebRTC is quite different from that of other calling systems. WebRTC mandates that all media channels (voice, video and data) be secured using the DTLS protocol.

The browsers exchange a Datagram Transport Layer Security (DTLS) handshake on every media (voice, video and data) channel. Once these DTLS handshakes are completed, the media is encrypted and begins flowing between the browsers using the Secure Real-time Transport Protocol (SRTP).

Other calling systems use SDES (common with SIP); in SDES the crypto keys are carried in the SDP over the signaling interface and can be used by the servers in the signaling path to decrypt the media.

SDES, which is used more in the SIP world, would help interface more easily to existing SIP-based infrastructure. Although there is consensus that DTLS will be mandatory for WebRTC to support, until two weeks ago, Google Chrome supported both SDES and DTLS. The most recent Chrome version no longer supports SDES. Mozilla Firefox never supported SDES.
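The difference between the two mechanisms is visible in the SDP itself: SDES puts the key material in an a=crypto line, while DTLS-SRTP only advertises a certificate fingerprint in an a=fingerprint line and derives the keys in the DTLS handshake on the media path. The sketch below classifies an SDP body accordingly. The function name and the SDP fragments are made up for illustration, and the fragments are deliberately incomplete with placeholder values.

```python
def srtp_keying(sdp: str) -> str:
    """Classify how SRTP keys are negotiated, based on SDP attributes."""
    # Collect attribute names such as "a=crypto" or "a=fingerprint".
    attrs = {
        line.strip().split(":", 1)[0]
        for line in sdp.splitlines()
        if line.strip().startswith("a=")
    }
    has_dtls = "a=fingerprint" in attrs  # only a certificate hash is exposed
    has_sdes = "a=crypto" in attrs       # the key itself travels in signaling
    if has_dtls and has_sdes:
        return "both"
    if has_dtls:
        return "dtls-srtp"
    if has_sdes:
        return "sdes"
    return "none"


# Incomplete SDP fragments with placeholder values, for illustration only.
DTLS_OFFER = """\
v=0
m=audio 9 UDP/TLS/RTP/SAVPF 111
a=setup:actpass
a=fingerprint:sha-256 <certificate hash goes here>
a=rtpmap:111 opus/48000/2
"""

SDES_OFFER = """\
v=0
m=audio 49170 RTP/SAVP 0
a=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:<base64 master key and salt>
a=rtpmap:0 PCMU/8000
"""

print(srtp_keying(DTLS_OFFER))
print(srtp_keying(SDES_OFFER))
```

Any server that sees the SDES offer sees the key; a server that sees the DTLS offer sees only a hash of the certificate.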

In short, when using WebRTC end-to-end, eavesdropping is not an option; your grandmother’s secret lasagna recipe is safe.

Commenter: Many thanks. By the way, what happens if recording is a necessity, such as in contact centers or where regulation requires it?

WebRTC forum moderator: In such cases, there are a few options. One option is using the WebRTC client API for recording; however, it’s a local solution. A straightforward approach for an organization is to use a WebRTC GW with forking abilities, or a Session Border Controller with WebRTC termination capabilities that supports SIPREC for recording. The SBC then maintains a DTLS call leg toward each party of the call and, in the middle, forks the call to the recording server using the standard SIPREC protocol.

WebRTC Recording with SIP-Rec

Commenter: It’s strange that you are mentioning SBCs in this context, since a few weeks ago I read that SBCs are not required in the WebRTC world.

WebRTC forum moderator: That’s true when making a call between two browsers, but there is a need for a border element between the WebRTC world and the SIP world, a WebRTC GW (SBC), to terminate WebSocket, DTLS and ICE and to handle Opus transcoding.

Commenter: That’s very intriguing! So do you think it is safe to send Pomodoro Siciliano over the WebRTC data channel too?

WebRTC forum moderator: I would recommend starting moderately with the basil.

Commenter: Meeow…..

WebRTC forum moderator: Oh my……..

2 Box WebRTC GW

Why Not All WebRTC GWs Were Born Equal

WebRTC, although not finalized in standards bodies, is already being deployed in different networks and segments. One can categorize WebRTC deployments as either an Island or as Open.

An Island – All communications of the service run within the closed network of the provider of the service. An Island deployment can use any proprietary technology it wishes and can change behaviour between versions without the need to worry about backwards compatibility as long as it doesn’t change the interfaces with which users/developers interact.

Open – The provider of the service doesn’t control all the components and therefore, must stick to standard protocols. By way of illustration, think of connecting a home user from the browser to an enterprise SIP network or a service provider’s voice network.

Now, enough with theory; let’s look at reality.

  • In reality, some Islands need to be connected to the external world.
  • In reality, most of those “Open” deployments require alignment between vendors and a server component to connect between them, AKA an SBC.

Why is this important?

When looking at WebRTC in the context of existing communications and networks, there is no point or need to reinvent the wheel. That said, in cases where there is a need to bridge WebRTC into existing telephony networks (say SIP), WebRTC should be added as an interface to the existing connection point.

Looking at some of the ways WebRTC GWs are being designed today, we can see that many vendors have taken the route of adding WebRTC through an external box as illustrated below.

WebRTC GW as an External Box

One might say that the WebRTC GW is all that is required but there are many reasons why eventually such deployments end up requiring an SBC in the architecture. These reasons may include: Security, SIP Trunk and SIP networks interoperability, QoS and regulatory compliance.

What is the downside of a two box solution?

In addition to the obvious need to manage and maintain two boxes or SW components, there are also technological downside factors to this type of architecture.

In this architecture, each of the two components will take care of different functionalities – the GW will provide the WebRTC specific functionalities while the SBC will handle the generic SIP functionalities as well as the per vendor/network functionalities.

The GW functionality

GW functionality includes:

  • Terminate WebSockets and any other proprietary/standard signaling used on the WebRTC leg
  • Terminate DTLS
  • Handle ICE functionality
  • Handle RTP multiplexing, as in WebRTC all RTP and RTCP streams are sent on the same port
  • Support for Extended Secure RTP Profile for RTCP-Based Feedback (RTP/SAVPF)
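Because STUN (used by ICE), DTLS, RTP and RTCP can all arrive on the same port, the box terminating the WebRTC leg has to sort incoming packets before it can do anything else. The sketch below shows one common way to do this, using the first-byte ranges from RFC 7983 and the RTCP packet type range from RFC 5761. It is a simplified illustration with a hypothetical function name, not a description of any particular product.

```python
def classify_packet(data: bytes) -> str:
    """Classify a UDP payload received on a multiplexed WebRTC port."""
    if not data:
        return "unknown"
    first = data[0]
    if first <= 3:
        return "stun"            # ICE connectivity checks
    if 20 <= first <= 63:
        return "dtls"            # DTLS handshake and records
    if 128 <= first <= 191:      # RTP version 2: top two bits are 10
        # With RTP/RTCP multiplexing, the second byte is an RTCP packet
        # type in 192-223; otherwise it is the RTP marker bit plus
        # payload type.
        if len(data) >= 2 and 192 <= data[1] <= 223:
            return "rtcp"
        return "rtp"
    return "unknown"


# Hand-made packet prefixes, for illustration only.
print(classify_packet(bytes([0x00, 0x01])))  # STUN Binding Request header
print(classify_packet(bytes([0x16, 0xFE])))  # DTLS handshake record
print(classify_packet(bytes([0x80, 0x6F])))  # RTP, payload type 111
print(classify_packet(bytes([0x80, 0xC8])))  # RTCP Sender Report (type 200)
```

This sorting only works if the RTP payload types in use stay out of the 64 to 95 range, which is why RFC 5761 reserves that range when multiplexing is enabled.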

The SBC functionality

SBC functionality includes:

  • Support for enhanced SIP features such as class 5 features, music on hold and forking
  • Support for vendor specific functionality and call signaling.

Putting it all together

Due to the nature of functionality supported by each one of these components, both signaling and media need to flow through them both. This presents a significant challenge to the architecture as it increases complexity and media delay.

WebRTC GW in an SBC

What is the alternative architecture?

Similar to other interfaces and vendor specific functionalities supported by an SBC, WebRTC can be included within the SBC itself. By taking this approach and allowing this functionality to be turned on and off dynamically, you avoid the issues described above.


Have you deployed or are you looking to deploy a WebRTC GW? Which of the approaches do you find most fits your needs?