Privacy Security

What Do You Know about OTT Voice Usage in Your Enterprise?

Encryption doesn’t always equal privacy

IT departments have all the means necessary to manage voice communication over the enterprise network and know the ins and outs of those communications. Some enterprises have compliance requirements to which they must adhere, some have security considerations and others have reasons to “know what’s happening in their network”.

With traditional VoIP systems, achieving the above is a relatively simple task.

However, OTT VoIP traffic is a different story, and most enterprises settle for one of the following options:

  • Block it
  • Live with the reality

The question is, are these the only two options available and what do enterprises really want to do about OTT VoIP?

Border TURN server

Some enterprises are adding a new entity to the border of their network, a border TURN server that forces all VoIP media traffic to go through it. This includes enterprise managed VoIP as well as OTT. VoIP media that doesn’t go through the border TURN server is blocked.

Adding this entity and blocking all VoIP media that doesn’t go through the border TURN server creates a problem for WebRTC communication because only one TURN server address can be provided in the peer connection establishment procedure. Since many services require media to go through an application TURN server, the border TURN server is left out of the flow and the media, not passing through it, is blocked.

In a post I published last week together with Dan Burnett (co-editor of the WebRTC standards) on the blog where we publish updates about what takes place at the IETF and W3C with regard to WebRTC, we talked about RETURN. In a nutshell, RETURN encapsulates the two TURN servers into one by adding the border TURN server as a configuration option in the browser. Details and illustrations can be found in the original post.
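To illustrate the end result RETURN aims for, here is a rough sketch of a peer connection configuration that ends up containing both the application TURN server and the border TURN server. The addresses and helper are hypothetical, and note that with RETURN the border server would be provisioned in the browser itself (e.g. via enterprise policy) rather than added by the page:

```javascript
// Hypothetical sketch: merging the application TURN server with an
// enterprise border TURN server into one ICE configuration. With
// RETURN, the browser would discover and chain the border server on
// its own; this merge merely illustrates the resulting server list.
function buildIceConfig(appTurnServers, borderTurnServers) {
  return {
    iceServers: [...appTurnServers, ...borderTurnServers],
  };
}

const config = buildIceConfig(
  [{ urls: 'turn:turn.app-service.example.com:3478', username: 'app', credential: 'secret' }],
  [{ urls: 'turns:border.enterprise.example.com:443', username: 'emp', credential: 'token' }]
);

console.log(config.iceServers.length); // 2
```

The object above is in the shape accepted by the `RTCPeerConnection` constructor; without RETURN, only the application's own servers would appear in it.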

Since WebRTC media is always encrypted, what is the point in requiring it to pass through the border TURN server?

What can be extracted from encrypted communication?

Since all media flows through the border TURN server, there are some basic things it can “know” – such as the source, destination and length of a call.

With this knowledge, the server can block calls from blacklisted addresses, limit or monitor call duration, and collect this information.

These capabilities are pretty basic and I wanted to know whether there was more a border TURN server could detect in an encrypted media stream. To better understand, I turned to Yossi Zadah (who is already well known on this blog) and to Ilan Shallom. Ilan is a professor at Ben-Gurion University and founder of a speech recognition company that is today part of AudioCodes. His technology is the brain behind our VocaNOM solution.

Some might be surprised to learn that there is a significant amount of information that can be extracted from an encrypted media stream. There are studies that show it is possible to identify the language of the conversation. Other studies show it is possible to unveil the identity of the speakers on such a call and even create approximate transcripts of encrypted VoIP calls by identifying words in the stream.

There is also a thesis specifically relating to Skype using Silk (from back in 2011) that details information that can be learned from such conversations.
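To get an intuition for how such analysis is possible at all, consider that a variable-bit-rate codec produces small packets during silence and larger ones during speech, and encryption does not hide packet sizes. A toy sketch (the function and the threshold value are arbitrary assumptions for illustration, not taken from any of the studies):

```javascript
// Illustrative only: with a variable-bit-rate codec, encrypted packet
// sizes still leak information. This toy classifier flags likely
// silence (comfort noise) vs. active speech from SRTP payload
// lengths. The 40-byte threshold is an assumption for illustration,
// not a property of any specific codec.
function classifyPackets(payloadLengths, silenceThreshold = 40) {
  return payloadLengths.map(len => (len < silenceThreshold ? 'silence' : 'speech'));
}

const pattern = classifyPackets([12, 15, 95, 110, 102, 14]);
console.log(pattern); // ['silence', 'silence', 'speech', 'speech', 'speech', 'silence']
```

The studies mentioned above go much further, correlating sequences of packet sizes with phonemes and languages, but the underlying signal is the same.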

Key Takeaways

  • Border TURN servers are being deployed at enterprises. Though they impose problems on WebRTC communication, RETURN is planned by the IETF as a solution.
  • Given the limitations border TURN servers impose on OTT traffic, my personal view is that they would be counterproductive in most cases, as they limit Bring Your Own OTT (BYOO) in the enterprise.
  • If you thought that your WebRTC call is private… think again.

WebRTC Connectivity Solution with Focus on Quality and Scale

The whole is greater than the sum of its parts

We released a PR today about the AudioCodes WebRTC Connectivity Solution. In this post, I want to provide technical insight into this solution, explaining why the whole is greater than the sum of its parts.

A typical approach to connecting WebRTC with an existing enterprise VoIP network would follow one of these two architectural concepts:

  • Browser using Opus with some kind of proprietary signaling -> WebRTC to SIP GW -> SBC -> Transcoding to G.7xx -> IP Phone on a SIP network
  • Browser using G.711 with some kind of proprietary signaling -> WebRTC to SIP GW -> SBC -> [Optional: Transcoding from G.711 to G.7xx] -> IP Phone on a SIP network

Both cases introduce some issues:

The external WebRTC to SIP GW carries a price.

Transcoding from Opus to G.7xx is something you would want to avoid unless there is no other option. Opus requires hefty CPU resources (much more than most audio codecs), and transcoding introduces quality issues while increasing CAPEX.

The other alternative, using G.711 over the open internet, is not a good option as G.711 wasn’t built for sustaining quality over unmanaged networks.

Traffic traverses between networks and different network types (WiFi, wireline Ethernet, the public Internet and enterprise networks), while devices themselves can vary as well. Handling of network impairments should be done in the entity that connects these networks, as it is familiar with the requirements of each and has per-session knowledge of the source and destination of traffic. While the posts linked above use VoLTE as an example, similar issues arise when traversing from WebRTC to non-WebRTC networks, and the algorithms described in those posts serve the needs of this architecture as well.

Another missing piece in typical deployments is the ability to monitor the traffic, know what is happening on the network and impact SBC decisions for quality improvements.

Putting quality and scale at the front

The solution announced was designed with these two factors at the forefront, and that is what makes it stand out from the crowd.

Integrated solution

In our solution, WebRTC is supported in the SBC itself. This includes WebSockets, DTLS and other WebRTC “special” behavior. This architecture simplifies deployment and management as well as reduces delay (or in other words, increases quality).

Minimal transcoding scenarios

By supporting Opus in the AudioCodes IP Phone, we hold the rope at both ends. On one hand, voice over the open Internet uses the Opus codec, which is purpose-built for this task. On the other hand, transcoding is not required, as the IP Phone also supports Opus. This significantly increases the number of calls supported while reducing CAPEX.


For signaling we use SIP over WebSockets. We found this to be a good solution when the goal is to connect to existing networks.

Another advantage of the decision to use SIP for signaling is the connection to WebRTC API platforms. These platforms typically use proprietary signaling and, for connecting to existing networks, build an adaptor which is typically SIP. Such a basic SIP implementation can’t connect to just any existing enterprise SIP network, but going through the SBC allows it to do so. While this SIP adaptor varies and is not always pure SIP over WebSockets, eliminating the need for the adaptor to implement a full WebRTC to SIP GW simplifies the interconnection and makes the task easier.
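For a flavor of what SIP over WebSockets looks like on the wire, here is a rough sketch of building a minimal REGISTER request as it would be sent over the WebSocket connection (RFC 7118). All names and addresses are made up, and a real implementation would also add Call-ID, Max-Forwards and Contact headers, among others:

```javascript
// Sketch: a minimal SIP REGISTER request as sent over a WebSocket
// ('sip' subprotocol, RFC 7118). The Via transport is WS and the
// addresses are hypothetical; this is not a complete SIP stack.
function buildRegister({ user, domain, wsInstance }) {
  return [
    `REGISTER sip:${domain} SIP/2.0`,
    `Via: SIP/2.0/WS ${wsInstance};branch=z9hG4bK-${Math.random().toString(36).slice(2, 10)}`,
    `From: <sip:${user}@${domain}>;tag=1234`,
    `To: <sip:${user}@${domain}>`,
    'CSeq: 1 REGISTER',
    'Content-Length: 0',
    '', '', // blank line terminating the headers
  ].join('\r\n');
}

const msg = buildRegister({
  user: 'alice',
  domain: 'enterprise.example.com',
  wsInstance: 'df7jal23ls0d.invalid',
});
console.log(msg.split('\r\n')[0]); // "REGISTER sip:enterprise.example.com SIP/2.0"
```

The point is that the message itself is ordinary SIP; only the transport (WebSocket instead of UDP/TCP) changes, which is what makes bridging to an existing SIP network straightforward.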

Voice quality enhancement

As mentioned, traffic traversing between networks typically suffers from call quality degradation. There are different implementations of media engines on the client side that are optimized for the network with which the client plans to work. When the client connects with another client on a network it wasn’t built to connect with, voice call quality issues increase. Having the SBC as a demarcation point between the networks allows for the utilization of the AudioCodes voice enhancement algorithms to improve audio quality.

Detect & correct

The AudioCodes Session Experience Management (SEM) not only monitors and detects voice call quality issues, it also works in harmony with the SBC to allow for smart, quality based, routing and quality improvements.

Why is this important?

A WebRTC GW doesn’t stand by itself; it needs a multitude of capabilities and supporting elements to ensure effective and high quality service. This is why the whole is indeed greater than the sum of its parts when bringing all that is required for an end-to-end WebRTC Enterprise connectivity solution.


The NSA is After My Grandmother Secret Lasagna Recipe

Commenter: Hi, can I ask a WebRTC related question?

WebRTC forum moderator: Of course, you are in the right place.

Commenter: My name is Garfield.

WebRTC forum moderator: Ahh OK, are you related to the cat?

Commenter: Seriously? When’s the last time a cat asked questions in the WebRTC forum?

WebRTC forum moderator: Good point.  How can I assist you today, Mr. ahh Garfield?

Commenter: I have a problem with a family secret which has been passed down from mouth to ear for centuries.

WebRTC forum moderator: And how is it related to the WebRTC forum?

Commenter: Well, I am quite old and I realized that this is a good time to pass the secret to the next generation. I have been told that WebRTC has the most secure media channel there is. So I thought this would be a good place to do it.

WebRTC forum moderator: If I may ask, how old are you?

Commenter: 15 years old.

WebRTC forum moderator: Excuse me? Ahh. Why don’t you pass your secret to your children?

Commenter: They’re on the other side of the ocean and I am afraid to do it on an international call.

WebRTC forum moderator: Why?

Commenter: Hey, I may be old but I’m not so naive to let the NSA learn about my grandmother’s secret lasagna recipe.

WebRTC forum moderator: I get your point. You heard right about WebRTC. Let me explain: the security mechanism in WebRTC is quite different from other calling systems. WebRTC mandates that all media channels (voice, video and data) use the DTLS protocol for security.

The browsers exchange a Datagram Transport Layer Security (DTLS) handshake on every media (voice, video and data) channel. Once these DTLS handshakes are completed, the media is encrypted and begins flowing between the browsers using the Secure Real-time Transport Protocol (SRTP).

Other calling systems use SDES (common in SIP); in SDES, the crypto keys are carried in the SDP via the signaling interface and can be used by servers in the signaling path to decrypt the media.

SDES, which is used more in the SIP world, would make it easier to interface with existing SIP-based infrastructure. Although there is consensus that DTLS support will be mandatory for WebRTC, until two weeks ago Google Chrome supported both SDES and DTLS. The most recent Chrome version no longer supports SDES. Mozilla Firefox never supported SDES.
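The difference shows up directly in the SDP: a DTLS-SRTP offer carries an a=fingerprint attribute (a certificate hash verified during the DTLS handshake), while an SDES offer carries a=crypto lines with the keys exposed to the signaling path. A toy sketch of telling them apart (the function name and sample SDP fragments are mine):

```javascript
// Sketch: distinguishing DTLS-SRTP from SDES by inspecting the SDP.
// DTLS-SRTP offers advertise a certificate fingerprint; SDES offers
// carry the SRTP keys in the clear in 'a=crypto' lines.
function detectKeying(sdp) {
  if (/^a=fingerprint:/m.test(sdp)) return 'dtls-srtp';
  if (/^a=crypto:/m.test(sdp)) return 'sdes';
  return 'unknown';
}

const dtlsOffer = 'v=0\r\na=fingerprint:sha-256 AB:CD:EF\r\n';
const sdesOffer = 'v=0\r\na=crypto:1 AES_CM_128_HMAC_SHA1_80 inline:abc\r\n';
console.log(detectKeying(dtlsOffer)); // "dtls-srtp"
console.log(detectKeying(sdesOffer)); // "sdes"
```

This is also why SDES is weaker for privacy: anything that sees the signaling sees the keys, whereas with DTLS-SRTP the keys never leave the media path.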

In short, when using WebRTC end-to-end, eavesdropping is not an option; your grandmother’s lasagna secret recipe is safe.

Commenter: Many thanks, by the way, what happens if recording is a necessity such as in Contact Centers or where regulation requires it?

WebRTC forum moderator: In such cases, there are a few options. One option is using the WebRTC client API for recording; however, that is a local solution. A more straightforward approach for an organization is to use a WebRTC GW with forking abilities, or a Session Border Controller that supports SIPREC for recording and has WebRTC termination capabilities. The SBC then maintains a DTLS call leg toward each party of the call and, in the middle, forks the call to the recording server using the standard SIPREC protocol.

WebRTC Recording with SIP-Rec

Commenter: It’s strange that you are mentioning SBCs in this context, since a few weeks ago I read that SBCs are not required in the WebRTC world.

WebRTC forum moderator: That’s true when making a call between two browsers, but there will be a need for a border element between the WebRTC world and the SIP world, a WebRTC GW (SBC), to terminate WebSockets, DTLS and ICE, and to handle Opus transcoding.

Commenter: That’s very intriguing! So do you think it is safe to send Pomodoro Siciliano over the WebRTC data channel too?

WebRTC forum moderator: I would recommend starting moderately with the basil.

Commenter: Meeow…..

WebRTC forum moderator: Oh my……..


Things I Picked Up at WebRTC World East 2014

[Post is better viewed on the blog Website]

Last week was a busy one at WebRTC World in Atlanta with several parallel tracks, demos, keynote sessions and many discussions with long-time friends from the industry and some new friends I made at this event.

And herein lies my (expected) problem. I’m mainly still seeing the same people I used to meet at the SIP events 10+ years ago, and not enough new faces. We were shown some customer examples by TokBox and a live demo of an easyRTC customer, but this wasn’t enough. As an industry, we haven’t yet managed to open WebRTC to the general public of web developers. Could it be that WebRTC World is an event more tailored for VoIP experts? Tsahi and Chris are taking a stab at this challenge next week together with Google at the KrankyGeek show that will follow Google I/O.

With all the hype around WebRTC these past couple of years, many are now disappointed as they don’t see the growth happening at the “expected” pace. In the conference’s welcome notes, Phil Edholm addressed this issue nicely, presenting WebRTC in the context of Geoffrey Moore’s theory of the Technology Adoption Lifecycle.

The Chasm - Technology Adoption Lifecycle

Clearly, WebRTC will proliferate only once it is commonly used by web developers. Usage by the existing VoIP communication industry alone will not bring the communications capabilities enabled by WebRTC to the applications and scenarios we envision today. WebRTC, as Phil presented, is driving the communication webification wave. By the end of this decade, the way in which we communicate will change. We just need patience for this technology to take off.

Waves - Communications Webification



Google, Serge Lachapelle

The presentation by Serge covered 2 main topics. He started with the story of how WebRTC came to life: from a gap identified by the Chrome team, “Human communication is not possible in browsers”, to the decision to acquire GIPS and the official announcement of WebRTC. There were 2 big challenges any company looking to launch communications services would run into: building and binding together the voice and video technology, and overcoming all the legal and royalty issues related to them. Since Serge already had 10 years of experience in breaking down those walls, he knew these challenges must be solved for the general public of developers and innovators in order to make WebRTC successful. Hence the decision to acquire a leading media engine vendor and invest significant money and effort into solving the patents tangle.

The technology challenges led nicely to the second part of his presentation, which talked about upcoming (expected very soon) releases where a lot of focus has been put, among other things, on quality, WebRTC over wireless networks, faster connection time and better adaptation to network changes. This is, of course, a short list. We will need to see the release notes of the upcoming Chrome 36 & 37.

In an earlier event in London, Serge spoke about the priority being given to solving WebRTC for mobile. I would have liked to hear in-depth details on that part of the picture. More might come to light next week when Serge covers this topic at the KrankyGeek Show where he will talk about Mobile WebRTC. I will not be there but will surely keep an eye on that half day event.


Microsoft, Bernard Aboba

This was an interesting technical presentation. It started with the directive of Microsoft CEO Satya Nadella for Mobile First, Cloud First continuing with the requirements for achieving high quality real-time communications on mobile. The presentation also covered the area of ORCA and how it will be adopted in WebRTC.

Bernard mentioned that the various Microsoft products such as IE, Lync, Skype, Yammer and others are all independent regarding their decision relating to WebRTC adoption. With Microsoft increasing their transparency about their IE plans through their IE platform status website (a place to see what the Dev team is working on) and the IE Developer Channel (where pre-beta versions of IE can be found), I hope we will soon learn more about Microsoft’s plans for WebRTC.


Avaya, Gary Barnett

There were several keynotes by vendors but I’m mentioning this one as it looks like Avaya is right on the money with their current plans for WebRTC within the scope of their Collaboration Environment engine, given the company’s market position, customers and DNA. Different from other communications vendors who are trying to copy what others have already done and follow in the footsteps of the API platform service providers, Avaya is using WebRTC to enhance their current offering.

In his presentation, Gary talked about how WebRTC plugs into Collaboration Environment and makes use of existing capabilities such as speech analytics, with the addition of web content brought to the service of the contact center agent and manager.

The presentation was given only in the context of Collaboration Environment and the contact center segment. There are, of course, other products and services such a communications vendor could launch by utilizing WebRTC, but no information was given in this regard.


TokBox, Ian Small

As usual, Ian gave a great presentation seasoned with good live demos. On this occasion, as at his previous keynote at WebRTC West in Santa Clara, Ian put a lot of focus on media processing capabilities. These are complex things to do, but bandwidth adaptation, dominant speaker detection and dynamic layout changing were done by video companies many years ago. The nice thing about what TokBox does is that it makes all this accessible to the web world in a complete and comprehensive solution. They could have just bought all these media processing capabilities from a 3rd party and used them.


AudioCodes at WebRTC World Atlanta

In a previous post I talked about the Open vs. Island types of WebRTC deployments; AudioCodes falls into the “Open” category as an enabler of high-quality communication across VoIP networks and vendors. As such, we presented the SBC with a WebRTC interface, including support for DTLS and other goodies, as well as the Opus-enabled IP Phone. From this perspective, AudioCodes is well differentiated from other comparable vendors, who demonstrated products that are more in the GW category. The key differences AudioCodes presented were as follows:

  • Adding WebRTC to the SBC rather than as an external box
  • Opting for Opus all the way instead of G.711 or mandatory transcoding on the GW
  • Taken together, this means a more efficient, lower cost, higher quality solution

In addition to our booth, Alan Percy and I took part in a total of 8 sessions.

Were you in Atlanta last week? I’m looking forward to hearing your take on this event in the comments section below.

You weren’t there and want more insight as to what took place in the above sessions or others? Feel free to drop us a note in the comments section below or to contact us directly.


Why not All WebRTC GWs Were Born Equal


WebRTC, although not yet finalized in the standards bodies, is already being deployed in different networks and segments. One can categorize WebRTC deployments as either an Island or as Open.

An Island – All communications of the service run within the closed network of the provider of the service. An Island deployment can use any proprietary technology it wishes and can change behavior between versions without the need to worry about backwards compatibility, as long as it doesn’t change the interfaces with which users/developers interact.

Open – The provider of the service doesn’t control all the components and therefore, must stick to standard protocols. By way of illustration, think of connecting a home user from the browser to an enterprise SIP network or a service provider’s voice network.

Now enough with theory and let’s look at reality.

  • In reality, some Islands need to be connected to the external world.
  • In reality, most of those “Open” deployments require alignment between vendors and a server component to connect them – aka an SBC.

Why is this important?

When looking at WebRTC in the context of existing communications and networks, there is no point or need to reinvent the wheel. That said, in cases where there is a need to bridge WebRTC into existing telephony networks (say SIP), WebRTC should be added as an interface to the existing connection point.

Looking at some of the ways WebRTC GWs are being designed today, we can see that many vendors have taken the route of adding WebRTC through an external box as illustrated below.

WebRTC GW as an External Box

One might say that the WebRTC GW is all that is required, but there are many reasons why such deployments eventually end up requiring an SBC in the architecture. These reasons may include security, SIP trunk and SIP network interoperability, QoS and regulatory compliance.

What is the downside of a two-box solution?

In addition to the obvious need to manage and maintain two boxes or SW components, there are also technological downsides to this type of architecture.

In this architecture, each of the two components will take care of different functionalities – the GW will provide the WebRTC specific functionalities while the SBC will handle the generic SIP functionalities as well as the per vendor/network functionalities.

The GW functionality

GW functionality includes:

  • Terminate WebSockets and any other proprietary/standard signaling used on the WebRTC leg
  • Terminate DTLS
  • Handle ICE functionality
  • Handle RTP multiplexing, as in WebRTC all RTP and RTCP streams are sent on the same port
  • Support for Extended Secure RTP Profile for RTCP-Based Feedback (RTP/SAVPF)
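Two of the items above are visible directly in the SDP of the WebRTC leg. As a small illustrative sketch (the function and sample offer are mine, not AudioCodes code):

```javascript
// Sketch: two WebRTC-specific traits a GW must handle, as they
// appear in the SDP – RTP/RTCP multiplexing on a single port
// ('a=rtcp-mux') and the feedback-enabled secure profile
// (RTP/SAVPF in the audio m-line).
function inspectWebRtcLeg(sdp) {
  return {
    rtcpMux: /^a=rtcp-mux/m.test(sdp),
    savpf: /^m=audio \d+ (UDP\/TLS\/)?RTP\/SAVPF /m.test(sdp),
  };
}

const offer = 'v=0\r\nm=audio 9 UDP/TLS/RTP/SAVPF 111 0\r\na=rtcp-mux\r\n';
console.log(inspectWebRtcLeg(offer)); // { rtcpMux: true, savpf: true }
```

On the SIP leg, the GW or SBC would typically rewrite these into the plain RTP/AVP or RTP/SAVP forms that legacy endpoints expect.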

The SBC functionality

SBC functionality includes:

  • Support for enhanced SIP features such as class 5 features, music on hold and forking
  • Support for vendor specific functionality and call signaling.

Putting it all together

Due to the nature of functionality supported by each one of these components, both signaling and media need to flow through them both. This presents a significant challenge to the architecture as it increases complexity and media delay.

WebRTC GW in an SBC

What is the alternative architecture?

Similar to other interfaces and vendor-specific functionalities supported by an SBC, WebRTC can be included within the SBC itself. By taking this approach, and allowing this functionality to be turned on and off dynamically, you avoid the problematic issues addressed above.


Have you deployed or are you looking to deploy a WebRTC GW? Which of the approaches do you find most fits your needs?


Make it Short, it is Expensive

Determining the value of communications


Mothers-in-law are a sensitive subject. Anyone who has a mother-in-law knows this very well.

Now, don’t get me wrong. Most people would probably envy me for mine. She got into this post because she is a nature reserve for the technology of the ’60s and ’70s. She made a conscious decision to remain planted back in those days in many areas; technology is just one of them. It’s a long story and this isn’t the best place to go into details about it :). Last week I was in Spain with my wife trekking in the Pyrenees, and our kids were left with our extended family. Two days during the week were spent with my mother-in-law at our house. When I called and my 6-year-old son picked up the phone, he started waffling his usual nonsense. In the background I heard my mother-in-law saying to him, “make it short, this is a very expensive call”.

The value of voice goes down to zero

This is something I heard many years ago and one I use myself. But this statement needs to be placed in the right context. When I call from my vacation, I have many ways of making a free call, or a call so cheap that it practically falls into the “free” category. My son can waffle for as long as he wants. I can call through a mobile application (called Bphone) provided by my local service provider that takes my home number with me and allows me to make calls from anywhere over WiFi at the cost of a local call, as if I were calling from home. This happens to be an application AudioCodes provided to Bezeq (the local service provider) for this service. I have plenty of other options for calling the PSTN using Viber, Skype or the AudioCodes enterprise mobility application. All of these allow for calls from anywhere at a free/flat rate or a very low cost.

But this doesn’t mean that the value of voice goes down to zero. Voice is still the #1 revenue source for service providers. It has value for consumers and surely for enterprises, value that goes far beyond the call itself. There are services attached to calls in the enterprise environment.

A good post written by Yossi Zadah is scheduled for release next Monday that takes a look at the value of voice calls through the prism of call toll fraud, so stay tuned.

The value of communication is determined by the service in which it is embedded

The value of voice, as well as video, presence and messaging, is not in simply connecting such sessions but needs to be viewed in the context of the service in which it is embedded. If voice/video communication is embedded in an insurance company’s self-service website, where a user can speak with an agent when running into issues purchasing his insurance, the value of the call is not the number of cents it costs but rather the fact that the deal was closed instead of the customer going to the competitor. There is a multitude of similar examples, such as remote learning and group collaboration. In all these cases, the value of the call is the cost saved or the revenue earned. The value perceived by the provider of this specific service is higher than the value the Communications Service Provider (CSP) receives for the call. Therefore, packaging the calling service in a way that is easy to embed into other services will allow the communications service provider to extract more value from it. This naturally leads us to the web and to WebRTC.

WebRTC as a catalyst

WebRTC makes communications ubiquitous across web services. It renders the world of VoIP communications accessible to web developers, not only to VoIP experts. WebRTC is a catalyst for communication revenue as it allows combining communications with web services. Moreover, it allows connecting these services with existing enterprise communications platforms through SBCs that reside within the enterprise domain or in the cloud. With WebRTC, communications become a web feature that allows for an increase in conversion rates and revenue from web-based services; therefore, its true value becomes the value of the service and not of the call itself.

Returning to the phrase “The value of voice goes down to zero”, I would coin a new phrase: “The value of voice is equal to the value of the service in which it resides.”

What is your view on the value of voice? Feel free to express your views and comment to this post.


1 Answer to WebRTC Signaling


A lot of opinionated information has been written about the debated topic of WebRTC signaling. An example of a good and well-balanced technical post is the one on WebRTC Hacks written by Victor Pascual.

I am excited to be participating in a panel on this topic next week at WebRTC Global Summit in London and I thought it would be a good idea to provide some points about this topic beforehand. If you happen to be around please come and pay us a visit, we are at booth #6.

What is the debate about?

There are 2 fundamental items the industry is debating:

  • Should WebRTC define signaling?
  • When building a product/service, should signaling be based on existing standard protocols or on proprietary ones?

The answer to the first question is easy. Since WebRTC was born to serve web developers, not telecom VoIP geeks, one could never imagine everything WebRTC might be used for. This fact requires taking the “less is more” approach and defining only the minimal must, thus leaving signaling out of the WebRTC definition scope and putting it in the hands of those building each specific solution.

The second question seems more complex, as the many opinions out there testify. Some think proprietary JSON-based signaling is the only answer. Others think standard signaling is the answer, pitching for SIP or XMPP. Another opinion I enjoyed debating at the WebRTC 2013 conference in Paris was that WebRTC was “born” for IMS (needless to say, I didn’t support that point of view).


So what is the 1 answer for WebRTC signaling?

If it wasn’t clear to this point, the answer is simply – it depends.

The decision of what signaling to use when building a product or a service depends on its nature and the solution for which it was designed.

The primary distinction required for deciding if a standard or proprietary approach fits best is whether the solution goes into a service island or if it needs to connect with an existing VoIP network.

In the case of a service island, proprietary signaling will typically be chosen because it is the easy approach. However, if advanced telephony functions already well defined in standard protocols such as SIP are required, there is no point in reinventing the wheel. It is perfectly OK to pick and choose the functionalities of SIP needed in the implementation and ignore the rest.

On the other hand, if the solution is about allowing a WebRTC service to connect with existing standard VoIP networks such as SIP, the natural signaling choice would be SIP.

Last but not least, if you are providing an end-to-end solution that includes the WebRTC clients as well as the server, whatever signaling is hidden under the hood doesn’t really interest the developer building the application. What does interest the developer is how simple it is to use your APIs for integrating your client into his product.

It would be interesting to get your comments to this post detailing your view on this subject and how you decided to deal with this matter in your implementation.


Moving WebRTC Into Your Network Through the Front Door

As part of my work with WebRTC, I get a chance to speak with different types of companies about their WebRTC plans. When speaking with companies that have existing VoIP products and services, the conversation usually moves to how WebRTC should be added to their offering, what the additional service benefits are, and how to architect the solution. The typical requirement is to leave the existing deployment untouched and bridge WebRTC into the existing network through some sort of a GW. Where the logical function of the GW should be located and what the network architecture should look like are usually the questions debated. To answer them, I decided to write this post.

Image credit: Muhammad Mahdi Karim

To demonstrate the consideration points and options, I will use the example of a contact center. For the sake of this example, let’s take a contact center that has both PSTN lines and SIP trunks from a service provider. All traffic inside the contact center is via SIP, where some of the agents work on premises and others are home agents who are “called in” to handle contact center peak traffic.


Architecture Options when Adding WebRTC

In my discussions with customers and partners who are looking to add WebRTC into their existing networks, the architecture alternatives considered were:

  • A dedicated GW
  • Adding a WebRTC interface to their current core server
  • Adding WebRTC through an SBC

Before and After WebRTC

Before

Pre WebRTC Contact Center Connectivity

  • Traffic comes from a service provider over SIP trunks or the PSTN
  • All traffic in the contact center is SIP
  • Home agents are connected over IP (SIP), but this is done in a secured and controlled manner
  • The contact center core server is placed inside the contact center network. Security is handled by other elements, so it is protected from denial of service attacks, call fraud and other security vulnerabilities
  • Calls use G.729 or G.711. Transcoding, if required, is handled in the contact center network



Adding WebRTC into the game creates new requirements and a new type of traffic source. With WebRTC, users browsing the website of the company serviced by the contact center can call in directly from the browser. Their traffic runs over the Internet directly to the contact center.

WebRTC includes two voice codecs: G.711 and Opus. These are the codecs that come built into the browser. If the intention is to eliminate the need for a download, calls must be initiated using one of these two codecs.

Since G.711 was not built for running over the open Internet and doesn’t include resiliency mechanisms, it is beneficial to initiate calls with Opus. The optimal approach would be to run Opus end-to-end from the browser to the agent, but in cases where this is not possible, it is best to keep Opus on the open Internet leg and transcode when entering the contact center network.

Alternatives for adding WebRTC to the Contact Center
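In practice, preferring Opus on the browser leg usually comes down to reordering the payload types on the audio m-line of the SDP so that Opus is listed first in the negotiation. A minimal sketch of that munging (the SDP snippet is a trimmed, illustrative offer, not one captured from a real browser):

```python
def prefer_codec(sdp, codec_name):
    """Return SDP with the given codec's payload type moved to the
    front of the audio m-line, making it the preferred codec."""
    lines = sdp.splitlines()
    # Find the payload type mapped to the codec in the rtpmap
    # attributes, e.g. "a=rtpmap:111 opus/48000/2" -> "111".
    payload = None
    for line in lines:
        if line.startswith("a=rtpmap:") and codec_name.lower() in line.lower():
            payload = line[len("a=rtpmap:"):].split()[0]
            break
    if payload is None:
        return sdp  # codec not offered; leave the SDP untouched
    out = []
    for line in lines:
        if line.startswith("m=audio"):
            parts = line.split()
            header, payloads = parts[:3], parts[3:]
            payloads = [payload] + [p for p in payloads if p != payload]
            line = " ".join(header + payloads)
        out.append(line)
    return "\r\n".join(out)

offer = "\r\n".join([
    "m=audio 9 UDP/TLS/RTP/SAVPF 0 8 111",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=rtpmap:111 opus/48000/2",
])
munged = prefer_codec(offer, "opus")
```

The same reordering can be applied on the server side before the offer reaches the contact center, keeping Opus on the Internet leg without touching the browser code.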

  • The contact center is required to have a “leg” in the public Internet domain
  • Quality of service is not managed. Even though in many cases the quality is good, supporting SLA requires the capacity to manage and monitor quality of experience (QoE)
  • Supporting Opus requires either adding a new compute-intensive transcoding function or adding Opus to the agent’s client


Alternatives Comparison for adding WebRTC to Contact Center

The comparison above doesn’t relate to any vendor-specific product but rather looks at common functionalities of such products. Given this, the following conclusion should be viewed in the context of the actual functionality supported by the specific products considered.

The comparison shows that in the contact center example of this post, adding WebRTC to the existing internal contact center server yields high risk and is therefore not a recommended alternative.

The selection between a pure GW and an SBC depends on priorities. If security and QoS are of high priority, the comparison leans towards the SBC alternative.
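Since the conclusion changes with how each criterion is weighted, one way to make the trade-off explicit is a simple weighted score. The criteria and per-alternative scores below are illustrative placeholders, not taken from any specific product comparison:

```python
# Illustrative 1-5 scores per alternative and criterion.
scores = {
    "dedicated GW": {"security": 3, "qos": 3, "time_to_market": 4, "cost": 4},
    "core server":  {"security": 2, "qos": 3, "time_to_market": 2, "cost": 3},
    "SBC":          {"security": 5, "qos": 5, "time_to_market": 3, "cost": 2},
}

def rank(scores, weights):
    """Return the alternatives sorted by weighted score, best first."""
    totals = {
        name: sum(weights[crit] * score for crit, score in per_crit.items())
        for name, per_crit in scores.items()
    }
    return sorted(totals, key=totals.get, reverse=True)

# When security and QoS dominate the decision, the SBC comes out on top.
security_first = rank(scores, {"security": 5, "qos": 4,
                               "time_to_market": 1, "cost": 1})
```

Shifting the weights towards time to market and cost flips the ranking towards the dedicated GW, which is exactly the "priority focus" point made above.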

Going native with WebRTC


If you are reading this blog there is a good chance you have heard about WebRTC and are well aware of the various products and services around it. Centering on enterprise communication and how WebRTC is being realized in this segment, the market is pretty much focused on one solution – a GW.

Now don’t get me wrong, WebRTC GWs are very important; you can’t really do without them if you want to overcome slow deployment cycles and stay current with technology advancements. The point is that a GW is not enough. Let’s delve deeper into that.

The Typically Proposed “WebRTC for Enterprise” Architecture

If you look at some of the architectures used today for bringing WebRTC into enterprise networks, they typically comprise a WebRTC GW and a media server that transcodes Opus to some other common codec, say G.729. Some options also include RESTful APIs for configuration and for creating services on top. On the audio side, there are cases where G.711 is used end-to-end, but from a quality perspective this option is not preferred when going over the open Internet, even though there are ways to add resiliency and improve quality when G.711 is used.

A typical architecture of WebRTC GW Deployment



The architecture described above is great and it will do the job. The question is, at what price.

Basically, this architecture is an “easy” way to bridge WebRTC into an enterprise network. You put in a big box that brute-forces everything to what you have running on your network today. If that big box doesn’t provide the required capacity, you just add another one.

There is another option

The most “expensive” component in GWing WebRTC into an enterprise network is transcoding. The way to work around this is by adding native support for Opus to the end devices. Doing so yields quality improvement, cost reduction and preserved privacy. You can find a detailed technical overview of why going native with WebRTC media on the end devices is important in an earlier blog post I published.

The reality is that Opus transcoding is extremely compute intensive, so placing this task on the server side will take a significant capacity toll on your system.
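To get a feel for that capacity toll, a back-of-the-envelope calculation helps. The per-channel CPU figures below are illustrative placeholders (real numbers depend heavily on the CPU, the Opus complexity setting and the sample rate), but the shape of the math holds:

```python
def max_channels(cores, core_mhz, mhz_per_channel):
    """Rough count of concurrent transcoding channels a server can
    sustain, assuming each channel eats a fixed share of CPU cycles."""
    return int(cores * core_mhz / mhz_per_channel)

CORES, CORE_MHZ = 16, 2500

# Illustrative per-channel costs: an Opus encode/decode pair is far
# heavier than a G.711 pack/unpack pair (numbers are assumptions).
g711_channels = max_channels(CORES, CORE_MHZ, mhz_per_channel=10)
opus_channels = max_channels(CORES, CORE_MHZ, mhz_per_channel=200)
ratio = g711_channels / opus_channels  # each Opus channel costs ~20x here
```

Even with generous assumptions, every Opus transcoding channel displaces an order of magnitude more lightweight channels on the same hardware, which is why pushing Opus to the end devices pays off.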

The flip side is that putting Opus on the IP Phone is complex. Assuming you are going for a SW upgrade and not a HW change, it requires flexibility in the phone architecture and hard work to get Opus running on it.

The question of why adding Opus to existing IP Phones is complex has interested me for a long time, so I had a chat with our experts: Eli Shoval, who runs our DSP Group, and Oren Klimker, a team leader in this group.

The challenges in running Opus on an IP Phone can be summarized as:

  • Processing power – Since not all IP Phones were born equal, there needs to be optimization work and actual rewriting of some codec parts to make Opus run best on the IP Phone processor
  • Memory – This includes both footprint and run-time memory requirements. Opus is a feature-rich codec that serves a wide variety of implementation scenarios; additionally, since the sampling rate of Opus is higher than that of traditional VoIP codecs, the memory required per audio channel is increased
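The sampling-rate point is easy to quantify: per-channel audio buffers scale linearly with the sample rate, so moving from the 8 kHz of traditional narrowband codecs to Opus at 48 kHz multiplies buffer memory by six. A quick sketch of that arithmetic (16-bit PCM and a 20 ms frame are assumed):

```python
def frame_bytes(sample_rate_hz, frame_ms, bytes_per_sample=2):
    """Size of one PCM audio frame buffer (16-bit samples)."""
    return sample_rate_hz * frame_ms // 1000 * bytes_per_sample

narrowband = frame_bytes(8_000, 20)   # e.g. G.711/G.729 input at 8 kHz
fullband = frame_bytes(48_000, 20)    # Opus at 48 kHz

# The growth is exactly the sampling-rate ratio: 48/8 = 6x more
# buffer memory per channel, before even counting codec state.
growth = fullband // narrowband
```

On a phone with tight RAM, that 6x on every buffer in the audio path, plus the larger Opus codec state, is what forces the selective-implementation work described below.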

This in turn yields two main tasks required to overcome the MIPS and memory challenges:

  • Optimization – This work includes optimized implementation of some components of the Opus codec, both for performance on the specific phone’s SoC (System on Chip) and for memory consumption
  • Selective implementation – Part of the rewrite work needs to include removal of certain functions not required on an IP Phone

But there is Opus 1.1, doesn’t that solve the issue?

The short answer to this is NO.

In detail, there are two reasons why Opus 1.1 doesn’t remove the need for native Opus support on the end devices but rather makes it even more essential.

The first reason is simple: transcoding will always add delay, reduce quality and impose cost on the server implementation, so whenever possible it is better to avoid transcoding altogether.
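The added delay is straightforward to budget. Using commonly cited algorithmic delays (G.729: 10 ms frame plus 5 ms look-ahead; Opus: 20 ms frame plus 6.5 ms look-ahead), a transcoding hop must buffer a full frame of the target codec before it can re-encode, ignoring processing time:

```python
# Commonly cited one-way algorithmic delays (frame + look-ahead), in ms.
# Processing time is ignored here; it adds a few more ms per hop.
CODEC_DELAY_MS = {
    "opus": 20 + 6.5,  # 20 ms frame, 6.5 ms look-ahead
    "g729": 10 + 5,    # 10 ms frame, 5 ms look-ahead
}

def transcode_penalty_ms(from_codec, to_codec):
    """Extra one-way delay a transcoding hop adds: the re-encoder must
    buffer a full frame (plus look-ahead) of the target codec."""
    return CODEC_DELAY_MS[to_codec]

# Opus end-to-end avoids the penalty entirely; Opus -> G.729 at the GW
# adds roughly 15 ms one way before any processing time.
penalty = transcode_penalty_ms("opus", "g729")
```

Fifteen-plus milliseconds per direction may sound small, but it comes on top of network, jitter-buffer and playout delay, and it is pure loss: it buys nothing in quality.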

The second reason lies in the details of the Opus 1.1 improvements. There is a pretty long list of changes; some, such as the surround encoding improvements, don’t really touch the IP Phone requirements that much. What I want to take a closer look at is the speed improvements. As it turns out, the team that built Opus 1.1 focused on improving the codec’s speed when running on ARM processors with NEON (do I hear mobile?), reaching up to a 40% improvement. On the other hand, for x86 architectures there is no real improvement, and in some cases things even got a bit worse. This can be seen in the diagrams below.



Opus 1.1 performance on ARM Cortex-A9 and i7-3520M




This means that Opus 1.1 doesn’t bring good news to transcoding servers, but it does improve performance on some of the end devices.


As explained in this post and in earlier ones, the preferred architecture for deploying WebRTC on any network, and specifically on enterprise networks, is one with end-to-end media and no transcoding. The target should be to minimize transcoding to the cases where it is a must, such as traffic going through a GW to the PSTN or over SIP trunks. In all other cases, going native with WebRTC on the end devices is the preferred architecture.