© 2022 Black Swan Telecom Journal • protecting and growing a robust communications business
September 2011
Are network operators prepared for the Pandora’s box of security and performance risks created by an all-IP network? If not, they could be getting all the ills of the Internet with few of its benefits.
But why choose IP if it has so many weaknesses? Initially data networks were based on ATM or Frame Relay, but the popularity of the Internet drove operators to IP. IP works well for its intended purpose, but about five years ago the introduction of time-sensitive services, like IPTV and VoIP, showed the limitations of IP technology. Of course there are all sorts of features in ATM and TDM networks to handle real-time communications that are missing in IP.
Happily, the weakness in IP traffic management is being addressed by deep packet inspection (DPI). By peering deep inside the packet, you can understand what type of traffic it is and how it ought to be handled. Similarly, the weakness in IP service assurance is being addressed by Network Behavioral Analysis (NBA): gathering and analyzing traffic metrics at various points in the network to better detect security and performance issues.
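To make the DPI idea concrete, here is a minimal Python sketch of classifying a packet by looking at its payload and mapping the result to a handling policy. The signatures, the class names and the `POLICY` table are illustrative assumptions, not any vendor's actual rules; a production DPI engine uses far richer protocol decoding and would not rely on a few leading bytes.

```python
# Toy payload classifier: illustrative signatures only, not a real DPI engine.

def classify_payload(payload: bytes) -> str:
    """Return a coarse traffic class based on the first bytes of the payload."""
    # HTTP requests and responses begin with a method name or "HTTP/".
    if payload.startswith((b"GET ", b"POST ", b"HEAD ", b"HTTP/")):
        return "web"
    # SIP (VoIP signaling) requests and responses.
    if payload.startswith((b"INVITE ", b"REGISTER ", b"SIP/2.0")):
        return "voip-signaling"
    # RTP version 2: the top two bits of the first byte are binary 10,
    # and the fixed header is 12 bytes long. This is a weak heuristic.
    if len(payload) >= 12 and (payload[0] >> 6) == 2:
        return "rtp-media"
    return "unknown"


# Hypothetical handling policy per traffic class (assumed for illustration).
POLICY = {
    "web": "best-effort",
    "voip-signaling": "priority",
    "rtp-media": "low-latency",
    "unknown": "best-effort",
}


def handling_for(payload: bytes) -> str:
    """Map a payload to the handling policy of its traffic class."""
    return POLICY[classify_payload(payload)]
```

For example, `handling_for(b"INVITE sip:bob@example.com SIP/2.0")` would fall into the priority class, while an ordinary Web request would be treated as best-effort.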
Steven Shalita, the VP of Marketing at NetScout, is with us today to talk about network behavioral analysis (NBA), and why NBA is becoming necessary to provide security and service assurance in IP networks. Steve has been with NetScout for three years with a two-year interlude at Alcatel-Lucent’s IP Division and time at Redback and Cisco. Steve delves into how IP complicates service assurance and service delivery, and shows an approach to safeguarding the end-user experience and the service provider’s profitability.
James Heath: Steve, NetScout has been providing service assurance and traffic management solutions for a long time to service providers and enterprises. How do these two different markets approach network security in your experience?
Steve Shalita: Enterprises often use our products for cyber defense, but that’s less common among our service provider customers. Their interest in us centers on the user experience, through service quality and service management. That’s because, in the past, service providers felt there was little chance their own networks were threatened, so their security efforts concentrated on selling enterprises ways to protect data in motion, like a VPN.
With service providers moving away from the traditional TDM technology to the much more dynamic environment that IP offers, traffic patterns will be more unpredictable and the number of different terminal devices and applications will be greater. So, just as the threat risk increases, traffic flow visibility decreases.
But telecom has been doing IP networking for at least a decade, so what’s different? What are the issues that are causing fits for the service providers?
In a way, service providers have had it easy in the past because they were never forced to do a full cut-over to IP. They could upgrade their systems region-by-region.
With LTE it’s different. Wireless companies always had the advantage of a strong standards group, 3GPP, which provided a stock network architecture, so every LTE network will be all-IP. This rollout is having a revolutionary impact on the perceived need for service monitoring.
You see, IP networks need to be instrumented quite differently from TDM networks. With TDM, all types of links — SONET/SDH, ATM, etc. — were provisioned between the connection points. So to monitor TDM, monitoring the signaling channel was sufficient to service-assure voice in 2G, 3G and circuit-switched networks. However, in the IP environment, monitoring the signaling is no longer enough. You must also turn your attention to the data plane.
Now the data plane is really hard to analyze, not only from a capacity and volume point of view but also in terms of extracting information and understanding the dynamics of an IP session. The stateless data sessions of Web surfing are a great example. Every time you click a new Web page, a new connection is made, and each one requires authentication, so you’re constantly hitting the DNS servers, the AAA servers, the wireless HLR and similar devices.
The other complication for wireless telecoms is the growing popularity of Wi-Fi hotspots and femtocells. Both use third-party networks to carry wireless traffic back to the mobile core. This lack of visibility all the way out to the handset is a big management concern. At least with NetScout you can see the traffic coming in and can classify it.
You are not a traditional security vendor: you used NBA initially to provide service assurance and then moved into security. What advantages do you provide over others who came from the cyber defense space?
I think the simple answer to that is investment leverage. Much of the technology required to do security-specific anomaly detection is already resident in our service-assurance solution. And because we can leverage our skill of finding deviations and abnormalities in service traffic, it saves a service provider the expense of buying a purpose-built security solution.
Most of our telecom customers deploy our nGenius Infinistream appliance because it gives them not only our Adaptive Session Intelligence (ASI) capability (granular, session-oriented transaction metadata) but also the ability to store packets for historical analysis. Having the nGenius Infinistream store those packets lets them reconstruct a data conversation as if it were happening in real time and forensically analyze it for both service delivery and security anomalies.
So the investment made to assure service delivery can be leveraged to get the incremental benefit of security visibility.
The traffic volumes of service provider networks are usually huge so you obviously can’t monitor everything. Are there some rules of thumb or standard operating procedures that service providers follow to prioritize what needs monitoring?
For a mobile network the No. 1 location of problem generation is DNS, whether it be DNS flooding or other types of performance issues. So operators typically start in that area, in the authentication or federation layer, which includes DNS, the AAA server and the HLR.
The next place they tend to look at is the mobile core, the common gateways between the radio access network (RAN) and the voice or data transport networks. Because all traffic flows through these gateways, monitoring them in the mobile core lets you see all elements of a conversation. You’re seeing the protocols, the setup, data and voice traffic and the individual subscribers.
The next priority is to expand to the data center where applications are hosted, then to the app store where services or applications are delivered to the user. From there they tend to push out to the RAN and its links. So each point they cover will give them another increment of visibility.
As for monitoring the cell-site backhauls or out at the RNCs, BSCs, and NodeBs, that isn’t as prevalent today because of cost, and because it’s unnecessary. You get a great amount of visibility just by monitoring the mobile core. You have the granularity of seeing down to the cell site, down to the subscriber. So a typical deployment footprint covers these various connection points and the core.
Obviously, monitoring of the network traffic can turn up two kinds of anomalies: a security-related anomaly such as a bot or infection, and a performance anomaly, both of which could threaten an outage. Can you give us some examples of such outages that you’ve detected?
Carriers are pretty guarded about sharing information like this and unfortunately all my anecdotes would be too specific and identify a carrier. Yet if you look at the most spectacular telecom outages as examples (and I am not saying we detected any of them), these outages at NTT, AT&T and Verizon all started out as little things that could have been detected and averted.
These problems, which could have begun as something as simple as a route flap or a DNS problem, manifested themselves over time and eventually escalated to a complete outage. Whether that’s security-related or performance-related, whether it’s intentional or unintentional, our system would detect those deviations very early on, before large numbers of users are impacted.
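The general idea of catching a deviation early can be sketched as a rolling baseline with a threshold on how far a new sample strays from it. The Python below is a minimal illustration under stated assumptions: the metric (for instance, DNS queries per second at a monitoring point), the window size, the warm-up length and the three-sigma threshold are all hypothetical choices, and this is not a description of how NetScout's products work.

```python
from collections import deque
from statistics import mean, pstdev


class BaselineDetector:
    """Flag samples that deviate sharply from a rolling baseline."""

    def __init__(self, window: int = 30, threshold: float = 3.0, warmup: int = 5):
        self.history = deque(maxlen=window)  # recent "normal" samples
        self.threshold = threshold           # allowed deviation, in std devs
        self.warmup = warmup                 # samples needed before judging

    def observe(self, value: float) -> bool:
        """Return True if value is anomalous relative to the baseline."""
        anomalous = False
        if len(self.history) >= self.warmup:
            mu = mean(self.history)
            # Floor sigma so a perfectly flat baseline cannot divide by zero.
            sigma = max(pstdev(self.history), 1e-9)
            anomalous = abs(value - mu) / sigma > self.threshold
        # Only normal samples update the baseline, so a sustained flood
        # does not gradually become the new "normal".
        if not anomalous:
            self.history.append(value)
        return anomalous
```

Fed a steady series of values around 100 followed by a sudden 500, `observe` returns `False` for the steady samples and `True` for the spike. Keeping anomalous samples out of the baseline is a design choice: it holds the alarm during a prolonged event, at the cost of adapting more slowly to a legitimate shift in traffic.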
Do you have any advice for service providers or enterprises on what they might do to look at their security problems and how NetScout is really able to help?
The security risks are real and they threaten service providers’ abilities to provide service assurance. Service-assurance risk is compounded by IP traffic patterns and the dramatic increase in traffic volume brought by broadband access. At the same time, users are becoming less tolerant of bad service even as they want to use more devices running more applications.
It’s becoming vitally important to make the proper investment to address those challenges so they can improve their situational awareness. NBA, the analysis of the information gained by peering deep inside packets at certain points in the network, is an essential way to address internal network security and network performance.
But the IP packet-flow visibility provided by NBA is not the end in itself. It is the means to the end. In IP networks, packet analysis is the only way to extract the information needed to manage your traffic. That’s the strength of NBA. We’re good at it and that’s been the key to our success.
This article first appeared in Billing and OSS World.
Copyright 2011 Black Swan Telecom Journal