© 2022 Black Swan Telecom Journal: protecting and growing a robust communications business
Longstanding thought patterns are hard to shake. When you hear the word “assurance,” what’s the first thing that comes to mind?
For me, it’s the image of a 1980s AT&T NOC (Network Operations Center) — a massive control room with high ceilings. There’s a huge network map on the wall, and two dozen techies are monitoring three CRT screens apiece as they respond to fault and performance alarms.
OK, how realistic is this NOC paradigm as a service assurance model for today’s IP-centric carnival?
In the circuit-based telephony world, watching the status of a few thousand network nodes was a satisfactory answer to assuring network health. But today, the number of alarm points to monitor has scaled to the millions — and to the point of being ridiculous. No operator has the money to constantly monitor the performance issues of individual smartphone users.
Folks, it’s time for operators to get real and invest more in proactive analytics than in reactive alarms. What’s more, the gap between network planning and assurance has to be closed. Why? Because the cause of today’s customer experience problem is yesterday’s lack of good data for capacity planning.
Kelvin Hall, a self-professed data geek, big data architect, and one of SAS’ lead consultants in network analytics, has firsthand experience of the severe problems network operations teams face, and of the big opportunities that await them through a greater mastery of analytics techniques.
Dan Baker: Kelvin, you are inside the network ops of some very large carriers. What are you seeing?
Kelvin Hall: When I work with big network teams at telcos, it’s readily apparent that they are not used to working with big computer data.
Information is not the problem. The network teams have big silos of data. But it’s a matter of knowing what to do with it. The network guys know how to keep the network running, but in many cases they are too busy rolling out LTE to really get excellent on the data side.
And by combining analytics with all that rich network data, well, there’s a gold mine of information there waiting to be discovered and exploited.
Why haven’t the carriers taken action and implemented big data solutions already?
Well, in many ways, large carriers are their own worst enemy when it comes to innovating from within. I’ve worked at more than one large telco, and I’ll tell you it’s much easier to get things done as an outside consultant.
As an employee, you’re rather constrained in what you can do. But as an outsider, you have the power to come in and tell them what they really need to hear — and not what they want to hear.
Other consultants I’ve worked with have noticed the same thing. When they began to work as employees of a large carrier, they were shocked at the lack of flexibility and got frustrated.
It also sounds like there’s a big mismatch between the knowledge resident at the carriers and the expertise they need.
The fast expansion of the data in the network brings its own set of problems. In some places it’s doubling every nine months, but the equipment is not handling the growth. And because it’s so new, the network teams don’t recognize the problems they are having with these network elements. Plus there are generally no alarms to notify them.
I see big mistakes being made. For instance, one carrier I know was growing its network based on the customer’s home address. And that seems silly considering you can use real data to figure out where you need to add capacity.
They were making big rollouts of LTE deployment, but eight of the 10 regions where they were adding capacity were prepaid areas that had no LTE handsets. So they were growing the network where it wasn’t needed!
So capacity planning is critical, and it requires that you triangulate three data streams: handset features, cell tower capability, and the service you give the customer.
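As an editorial illustration of that three-way join, here is a minimal Python sketch. It is not SAS's method: the data, field names, and the `lte_demand_by_region` function are all hypothetical, and it assumes usage records already identify the subscriber, the serving cell, and the volume. It sums traffic from LTE-capable handsets on cells that lack LTE, so a region with no LTE handsets shows zero demand for an LTE upgrade.

```python
# Hypothetical sample data; names and values are illustrative assumptions only.
handsets = {  # subscriber -> does the handset support LTE?
    "sub1": True, "sub2": False, "sub3": False, "sub4": True,
}
towers = {  # cell -> (region, is the cell already LTE-capable?)
    "cellA": ("north", False),
    "cellB": ("south", False),
}
usage = [  # (subscriber, serving cell, megabytes) where traffic actually occurred
    ("sub1", "cellA", 500), ("sub4", "cellA", 300),
    ("sub2", "cellB", 400), ("sub3", "cellB", 250),
]

def lte_demand_by_region(handsets, towers, usage):
    """Sum traffic from LTE-capable handsets on non-LTE cells, per region."""
    demand = {}
    for sub, cell, mb in usage:
        region, cell_has_lte = towers[cell]
        demand.setdefault(region, 0)  # list every observed region, even at zero
        if handsets.get(sub, False) and not cell_has_lte:
            demand[region] += mb
    return demand

demand = lte_demand_by_region(handsets, towers, usage)
print(demand)
```

In this toy data the "south" region carries traffic but none of it comes from LTE-capable handsets, which mirrors the prepaid-area rollout described above.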
Can you give us an example of how analytics can help in service assurance?
How about this one? In one state within a mobile operator’s territory, they were experiencing a 5 percent dropped-call rate, which is very high. But using analytics, we were able to trace that problem to a configuration change that affected the whole region. The operator was able to turn that into a 0.5 percent dropped-call rate.
It was an issue they had in the mobile switching center (MSC) affecting many cells.
The problem is they couldn’t take action on that using traditional service assurance tools. Alarm triggers are too little, too late, and too broad to find what you’re looking for. Instead you need analytics to sift through the data, look at the statistics and see how things are trending.
Another example: One particular software release of the iPhone happened to have a bunch of issues — but only with certain switches. Only certain MSCs and SGSNs had trouble with it.
So imagine somebody in the NOC trying to address that issue. The customer calls to complain, but where do I start? I’ve got 60 devices out there. You know something’s wrong, but where do you go?
But when you apply real analytics to it, you can turn that around and say: This cell is performing badly with a particular type of handset or whatever the case may be, and you can pinpoint the issues more precisely.
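To make that kind of pinpointing concrete, here is a small Python sketch, offered as an assumption-laden illustration rather than the tooling used in the engagement. The call records, handset labels, switch names, and the 2 percent threshold are all hypothetical. It groups calls by handset release and switch, computes the dropped-call rate per group, and flags the combinations that stand out.

```python
from collections import Counter

def drop_rates(calls):
    """Return {(handset, msc): dropped-call rate} from (handset, msc, dropped) records."""
    total, dropped = Counter(), Counter()
    for handset, msc, was_dropped in calls:
        total[(handset, msc)] += 1
        if was_dropped:
            dropped[(handset, msc)] += 1
    return {key: dropped[key] / total[key] for key in total}

def outliers(rates, threshold):
    """List the (handset, msc) pairs whose rate exceeds the threshold."""
    return sorted(key for key, rate in rates.items() if rate > threshold)

# Hypothetical records: one handset release misbehaves on one switch only.
calls = (
    [("phoneX-v2", "MSC1", False)] * 95 + [("phoneX-v2", "MSC1", True)] * 5
    + [("phoneX-v2", "MSC2", False)] * 99 + [("phoneX-v2", "MSC2", True)] * 1
    + [("phoneY-v1", "MSC1", False)] * 99 + [("phoneY-v1", "MSC1", True)] * 1
)

rates = drop_rates(calls)
flagged = outliers(rates, threshold=0.02)  # threshold is an arbitrary assumption
print(flagged)
```

Neither the handset alone nor the switch alone looks unusual in this data; only the combination does, which is why a single broad alarm would miss it.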
Where do you start when you come in and consult with network teams?
The first thing we usually look for is stranded assets. And to do that you overlay usage on top of the network inventory. And it’s not unusual to find 4 percent of their spending is sitting on network elements that were never turned up, or the customer’s gone and they continue paying bills to a partner for leased facilities.
Also comparing their best-performing cells versus their worst-performing cells, you’re able to see where the problems are — not from alarm triggers or waiting on the traditional threshold alerts that they look at.
Instead, by looking at data from a customer’s perspective, you can readily see what’s performing badly and what’s performing well.
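The stranded-asset overlay mentioned at the start of this answer can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the consultant's actual procedure: the circuit IDs, monthly costs, and usage figures are invented, and an asset is treated as stranded simply when no usage is recorded against it.

```python
def stranded_assets(inventory_cost, usage):
    """Overlay usage on inventory; return unused elements and their share of spend."""
    stranded = sorted(e for e in inventory_cost if usage.get(e, 0) == 0)
    total = sum(inventory_cost.values())
    stranded_cost = sum(inventory_cost[e] for e in stranded)
    share = stranded_cost / total if total else 0.0
    return stranded, share

# Hypothetical inventory (element -> monthly cost) and usage (element -> traffic).
inventory_cost = {"ckt-001": 4000, "ckt-002": 3000, "ckt-003": 2600, "ckt-004": 400}
usage = {"ckt-001": 120.5, "ckt-002": 88.0, "ckt-003": 15.2}

stranded, share = stranded_assets(inventory_cost, usage)
print(stranded, f"{share:.0%} of spend")
```

A real check would likely need a time window and a tolerance for near-zero traffic; those choices are left out here to keep the idea visible.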
At another operator, we found 6 percent of mobile data was being dropped.
Wow, 6 percent is an incredible amount to lose. Of course, depending on where that data was lost, it won’t translate to a 6 percent revenue loss.
Yes, let me qualify that a bit. Say the user has a 2 gig limit on data usage, well that’s included with their plan. But with 6 percent of the data usage not being recorded, a lot of customers were going over their limit and not being charged for it.
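A quick worked example shows why unrecorded usage hides overage. The Python below is a sketch under simplifying assumptions that are not in the interview: the 6 percent loss is applied uniformly to every subscriber, the cap is taken as 2,000 MB, and the subscriber's actual usage of 2,100 MB is invented for illustration.

```python
def recorded_usage(actual_mb, loss_rate):
    """Usage the billing system sees when a fraction of records is lost."""
    return actual_mb * (1 - loss_rate)

def billed_overage(actual_mb, cap_mb, loss_rate):
    """Overage the operator can charge, based on recorded usage only."""
    return max(0.0, recorded_usage(actual_mb, loss_rate) - cap_mb)

CAP_MB = 2000   # assumed 2 GB plan allowance
LOSS = 0.06     # assumed uniform 6 percent of usage not recorded
actual = 2100   # hypothetical subscriber who really used 2.1 GB

true_overage = max(0, actual - CAP_MB)          # what should have been billed
billed = billed_overage(actual, CAP_MB, LOSS)   # what the records support
print(true_overage, billed)
```

Under these assumptions the subscriber is 100 MB over the cap, yet recorded usage stays below it, so no overage is charged.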
What sort of technology do you bring in to help?
Well, being a data geek, I do lots of coding and am pretty adept at SQL. But actually SAS’ newest product, SAS Visual Analytics, has a complete drag-and-drop capability that network guys really like, so the user no longer needs to do coding.
Another technology we bring in is Hadoop. You know, at telcos, it’s not unusual to get 10 billion rows a day off of one network element. Trouble is, it’s expensive to put those 10 billion rows in a database.
But with Hadoop, you can store huge amounts of data at low cost — and it can be either structured or unstructured data. You’re talking about a cost savings of 10 percent to 20 percent of traditional database costs.
Now Hadoop is not a database, and you can’t access it as such. So at SAS we help the operator use Hadoop properly, and when you do that, the access speed can be just as fast as using a database.
Kelvin, thanks for the great real-world examples.
SAS Institute is one of 40 key analytics suppliers profiled in TRI’s new market research report, The Telecom Analytics & Big Data Solutions Market.
Copyright 2014 Black Swan Telecom Journal