“Try to make that call again and if things go bad yet again please let me know”
Enterprise telephony networks of the past were rather simple from a quality assurance perspective. The move to IP was gradual, typically starting within the enterprise via the move to IP PBXs in the branches, adding inter-branch connectivity over IP and later switching to full IP also on the access side. This was typically done using MPLS lines that provided quality assurance but at a high price.
Quality of Experience (QoE) is critical for enterprise communication networks. I spoke about this topic from a bit of a different perspective at the last Upperside WebRTC Conference and shared the presentation as well. The move to an all IP architecture introduced quality challenges as enterprise networks are complex with multiple sites and a need to connect efficiently between them. Challenges are magnified as enterprises break loose from MPLS chains and move to unmanaged IP networks. Issues that arise from such a change span all VoIP disciplines from call control through media routing to codecs. This makes effective troubleshooting of telephony problems and keeping a consistent good quality of experience, a hard nut to crack.
Driving blind through the network
If you would ask an IT manager at any given point in time the status of his VoIP network, he would be able to give basic statistical information about call activity, packet loss and bandwidth utilization. But now comes the tricky part – understanding what the user experience is like or why this experience was bad on a specific call. The typical answer to such a complaint about call quality is similar to the quote at the beginning of this post but the thing is that when making the call again, the problem usually wouldn’t happen.
By the time the problem does happen again, too much time has passed, context is lost and IT is back to square one trying to figure out what happened and what needs to be done to fix it. VoIP is an application that runs on top of the IP network. For monitoring and resolving VoIP problems, IT needs a proper “pair of glasses” built specifically for the nature of VoIP applications and networks.
Fundamental requirements for effective network monitoring
In order to improve user experience, it is required to understand the root cause of the problem and act upon such findings. A few fundamental capabilities are required in order for a monitoring system to provide this capability. Basically, it boils down to a holistic view of the network and aggregation of calls statistics of different interfaces. Monitoring a single call and acting upon that is not going to solve the root cause of the issues you have in your interface between office A and office B. But looking at the overall flow of calls between locations will enable you to identify network issues.
Take as an example QoS marking mis configuration – such an issue may result in sporadic quality issues and identifying the root cause of the issue will be possible only once the monitoring system has identified that this happens only on one specific media route in which a router is breaking the QoS marking.
It’s clear that regular network monitoring tools will not be helpful in diagnosing the problem and that a proper pair of “glasses” that can find network misconfigurations which result in bad quality of experience is required.
Based on this, the fundamental requirements from a monitoring system would include:
- IT time saving – Help in finding the root cause of the problem in a few simple clicks
- Real Time analysis – Showing what really happens in the VoIP network when the problem occurs. It is great to be able to just look at the screen and see what exactly is happening throughout the entire VoIP network.
- Notifications – Getting alerts on real problems in the VoIP network before they are reported by a user, or worse, being escalated to higher management.
- Trend statistics – This relates to that holistic network view that allows for identifying issues over time and keeps quality issues in context. Good examples of this are quality issues that happen repeatedly while another activity is taking place on the network. Such an “activity” can be a backup system running high network traffic or a user downloading large files.
- Monitoring the call control – analysis of the media is not enough as many cases of bad user experience are related to call control issues such as dropped calls and incorrect routing of calls that result in call establishment failure.
- Network probes – In order to monitor the network of a distributed enterprise you need to have a probe in each location that will collect and report network statistics and quality information.
- Demarcation point in each branch – The demarcation point located at the edge of the local network will allow for the isolation of problems and assist in understanding if the root cause of an issue is on the enterprise network or related to the provider of the access or telephony service.
- Remote access & visibility to all sites – Once problems are identified, you need to be able to act upon the findings and make changes in remote locations.
In conclusion, enterprise real-time communications solutions comprise multiple parties, from the access providers through telephony service providers to server and end user equipment. Monitoring the network, understanding the root cause of issues and changing what is required to be changed in order to avoid further quality issues, is a difficult task. The road to achieving this goal runs through both minimizing the number of vendors involved and using a monitoring solution that has a holistic view of the entire network and of each branch.
At AudioCodes, we have been working with such cases for many years. As a company that started from the voice codec level and grew to become a complete one-stop-shop for VoIP communication, we understand the concerns and issues enterprises are experiencing when moving to an all IP environment. For that reason, we have developed a Session Experience Manager solution (SEM) that is part of our One Voice offering.
Feel free to relate your stories about quality issues and conclusions from these cases in comments to this post.