March 2014

From Alarms to Analytics: The Paradigm Shift in Service Assurance

Longstanding thought patterns are hard to shake.  When you hear the word “assurance,” what’s the first thing that comes to mind?

For me, it’s the image of a 1980s AT&T NOC (Network Operations Center) — a massive control room with high ceilings.  There’s a huge network map on the wall, and two dozen techies are monitoring three CRT screens apiece as they respond to fault and performance alarms.

OK, how realistic is this NOC paradigm as a service assurance model for today’s IP-centric carnival?

In the circuit-based telephony world, watching the status of a few thousand network nodes was a satisfactory answer to assuring network health.  But today, the number of alarm points to monitor has scaled to the millions — and to the point of being ridiculous.  No operator has the money to constantly monitor the performance issues of individual smartphone users.

Folks, it’s time for operators to get real and invest more in proactive analytics than in reactive alarms.  What’s more, the gap between network planning and assurance has to be closed.  Why?  Because the cause of today’s customer experience problem is yesterday’s lack of good data for capacity planning.

Kelvin Hall, a self-professed data geek, big data architect, and one of SAS’s lead consultants in network analytics, has firsthand experience of the severe problems network operations teams face, and of the big opportunities that await them through a greater mastery of analytics techniques.

Dan Baker: Kelvin, you are inside the network ops of some very large carriers.  What are you seeing?

Kelvin Hall: When I work with big network teams at telcos, it’s readily apparent that they are not used to working with big computer data.

Information is not the problem.  The network teams have big silos of data.  But it’s a matter of knowing what to do with it.  The network guys know how to keep the network running, but in many cases they are too busy rolling out LTE to really get excellent on the data side.

Combine analytics with all that rich network data, and there’s a gold mine of information waiting to be discovered and exploited.

Why haven’t the carriers taken action and implemented big data solutions already?

Well, in many ways, large carriers are their own worst enemy when it comes to innovating within.  I’ve worked at more than one large telco, and I’ll tell you it’s much easier to get things done as an outside consultant.

As an employee, you’re rather constrained in what you can do.  But as an outsider, you have the power to come in and tell them what they really need to hear — and not what they want to hear.

Other consultants I’ve worked with have noticed the same thing.  When they began to work as employees of a large carrier, they were shocked at the lack of flexibility and got frustrated.

It also sounds like there’s a big mismatch between the knowledge resident at the carriers and the expertise they need.

The fast expansion of the data in the network brings its own set of problems.  In some places it’s doubling every nine months, but the equipment is not handling the growth.  And because it’s so new, the network teams don’t recognize the problems they are having with these network elements.  Plus there are generally no alarms to notify them.

I see big mistakes being made.  For instance, one carrier I know was growing its network based on customers’ home addresses.  That seems silly considering you can use real usage data to figure out where you need to add capacity.

They were making big LTE rollouts, but eight of the 10 regions where they were adding capacity were prepaid areas that had no LTE handsets.  So they were growing the network where it wasn’t needed!

So capacity planning is critical, and it requires that you triangulate three data streams: handset features, cell tower capability, and the service you give the customer.
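
To make that triangulation concrete, here is a minimal Python sketch.  It is not the method used at any carrier.  It assumes three hypothetical per-region data sets (handset counts, LTE cell counts, and plan type) and flags regions where LTE capacity exists but very few handsets can use it.  All names, numbers, and the 5 percent cutoff are illustrative assumptions.

```python
# Hypothetical per-region inputs; every name and number here is illustrative.
handsets = {  # handset features: how many devices are LTE-capable
    "north": {"lte": 12000, "total": 40000},
    "south": {"lte": 150, "total": 30000},
}
towers = {  # cell tower capability: LTE cells deployed per region
    "north": {"lte_cells": 20},
    "south": {"lte_cells": 35},
}
plans = {"north": "postpaid", "south": "prepaid"}  # service sold to the customer


def lte_mismatch(handsets, towers, plans, min_lte_share=0.05):
    """Return regions that have LTE cells but a small share of LTE handsets."""
    flagged = []
    for region in sorted(towers):
        counts = handsets.get(region, {"lte": 0, "total": 0})
        share = counts["lte"] / counts["total"] if counts["total"] else 0.0
        if towers[region]["lte_cells"] > 0 and share < min_lte_share:
            flagged.append((region, plans.get(region, "unknown"), round(share, 3)))
    return flagged


mismatches = lte_mismatch(handsets, towers, plans)
```

With this sample data, only the prepaid "south" region is flagged, because it has the most LTE cells and almost no LTE handsets.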

Network Analytics Functions

Can you give us an example of how analytics can help in service assurance?

How about this one?  In one state within a mobile operator’s territory, they were experiencing a 5 percent dropped-call rate, which is very high.  But using analytics, we were able to trace that problem to a configuration change that affected the whole region.  The operator was able to bring that down to a 0.5 percent dropped-call rate.

It was an issue they had in the mobile switching center (MSC) affecting many cells.

The problem is they couldn’t take action on that using traditional service assurance tools.  Alarm triggers are too little, too late, and too broad to find what you’re looking for.  Instead you need analytics to sift through the data, look at the statistics and see how things are trending.
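
The difference between a fixed alarm and a trend check can be sketched in a few lines of Python.  This is an illustration only: the 8 percent alarm threshold, the seven-day window, the 1.5x ratio, and the sample drop rates are all assumptions, not figures from the operator described above.

```python
from statistics import mean

ALARM_THRESHOLD = 0.08  # hypothetical fixed alarm level for the drop rate


def trend_shift(daily_rates, window=7, ratio=1.5):
    """Return True when the mean of the last `window` days exceeds
    `ratio` times the mean of all earlier days."""
    if len(daily_rates) < 2 * window:
        return False  # not enough history to compare against
    baseline = mean(daily_rates[:-window])
    recent = mean(daily_rates[-window:])
    return baseline > 0 and recent > ratio * baseline


# Two quiet weeks at 0.5 percent, then a week at 5 percent
# (for example, after a configuration change).
rates = [0.005] * 14 + [0.05] * 7

alarm_fired = any(rate > ALARM_THRESHOLD for rate in rates)
shift_found = trend_shift(rates)
```

In this made-up series the fixed alarm never fires, because 5 percent stays under the threshold, while the trend check reports a shift against the earlier baseline.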

Another example: One particular software release of the iPhone happened to have a bunch of issues — but only with certain switches.  Only certain MSCs and SGSNs had trouble with it.

So imagine somebody in the NOC trying to address that issue.  The customer calls to complain, but where do I start?  I’ve got 60 devices out there.  You know something’s wrong, but where do you go?

But when you apply real analytics to it, you can turn that around and say: This cell is performing badly with a particular type of handset or whatever the case may be, and you can pinpoint the issues more precisely.
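
As a rough Python sketch of that kind of pinpointing, the code below groups call records by cell and handset type and flags the combinations with a high drop rate.  The record layout, the cell and handset names, the 5 percent threshold, and the minimum sample size are all hypothetical choices for illustration.

```python
from collections import defaultdict


def make_calls(cell, handset, total, dropped):
    """Build hypothetical call records; the first `dropped` calls are marked dropped."""
    return [{"cell": cell, "handset": handset, "dropped": i < dropped}
            for i in range(total)]


def worst_combinations(calls, threshold=0.05, min_calls=20):
    """Return ((cell, handset), drop_rate) pairs above `threshold`, worst first."""
    counts = defaultdict(lambda: [0, 0])  # (cell, handset) -> [dropped, total]
    for call in calls:
        key = (call["cell"], call["handset"])
        counts[key][0] += int(call["dropped"])
        counts[key][1] += 1
    flagged = [(key, dropped / total)
               for key, (dropped, total) in counts.items()
               if total >= min_calls and dropped / total > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


calls = (make_calls("cell_A", "handset_X", 100, 1)
         + make_calls("cell_A", "handset_Y", 100, 12)
         + make_calls("cell_B", "handset_X", 100, 1)
         + make_calls("cell_B", "handset_Y", 100, 2))
worst = worst_combinations(calls)
```

With this sample data, only the combination of cell_A and handset_Y is flagged, at a 12 percent drop rate.  The same grouping could be keyed on switch and software release instead.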

Where do you start when you come in and consult with network teams?

The first thing we usually look for is stranded assets.  To do that, you overlay usage on top of the network inventory.  It’s not unusual to find that 4 percent of an operator’s spending is sitting on network elements that were never turned up, or on leased facilities where the customer is gone and the operator continues to pay bills to a partner.
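
The overlay of usage on inventory can be sketched in Python as follows.  This is a minimal illustration under stated assumptions: the inventory records, element IDs, costs, and usage figures are invented, and "stranded" is taken to mean an element with no recorded usage at all.

```python
def stranded_assets(inventory, usage):
    """Return (IDs of elements with no recorded usage,
    their share of total monthly cost)."""
    stranded = [item for item in inventory
                if usage.get(item["element_id"], 0) == 0]
    total_cost = sum(item["monthly_cost"] for item in inventory)
    stranded_cost = sum(item["monthly_cost"] for item in stranded)
    share = stranded_cost / total_cost if total_cost else 0.0
    return [item["element_id"] for item in stranded], share


# Hypothetical inventory and usage data; all values are illustrative.
inventory = [
    {"element_id": "cell_001", "monthly_cost": 480},
    {"element_id": "cell_002", "monthly_cost": 480},
    {"element_id": "leased_link_7", "monthly_cost": 40},
]
usage = {"cell_001": 1250.0, "cell_002": 830.5}  # element -> traffic carried

stranded_ids, stranded_share = stranded_assets(inventory, usage)
```

Here the leased link carries no traffic, so it is reported as stranded and accounts for 4 percent of the monthly spend in this made-up inventory.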

Also, by comparing their best-performing cells against their worst-performing cells, you’re able to see where the problems are, rather than relying on alarm triggers or waiting on the traditional threshold alerts that they look at.

Instead, by looking at data from a customer’s perspective, you can readily see what’s performing badly and what’s performing well.

At another operator, we found 6 percent of mobile data was being dropped.

Wow, 6 percent is an incredible amount to lose.  Of course, depending on where that data was lost, it won’t translate to a 6 percent revenue loss.

Yes, let me qualify that a bit.  Say the user has a 2 gig limit on data usage, well that’s included with their plan.  But with 6 percent of the data usage not being recorded, a lot of customers were going over their limit and not being charged for it.
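
A small worked example in Python shows how unrecorded usage hides overage.  It assumes, purely for illustration, that the 6 percent loss applies uniformly to every customer and that a customer actually uses 2.1 gigabytes against a 2 gigabyte limit; neither assumption comes from the operator’s data.

```python
def billed_overage_gb(actual_gb, cap_gb=2.0, recorded_fraction=0.94):
    """Return the overage (in GB) that billing sees when only
    `recorded_fraction` of actual usage is recorded."""
    recorded_gb = actual_gb * recorded_fraction
    return max(0.0, recorded_gb - cap_gb)


actual_usage = 2.1  # hypothetical customer, slightly over the 2 GB limit

seen_by_billing = billed_overage_gb(actual_usage)                  # 6 percent lost
true_overage = billed_overage_gb(actual_usage, recorded_fraction=1.0)

# Usage level at which a customer first appears over the limit.
break_even_gb = 2.0 / 0.94
```

Under these assumptions the customer is over the limit, but billing records only about 1.97 gigabytes and charges no overage.  A customer would have to use more than about 2.13 gigabytes before any overage showed up.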

What sort of technology do you bring in to help?

Well, being a data geek, I do lots of coding and am pretty adept at SQL.  But actually SAS’s newest product, SAS Visual Analytics, has a complete drag-and-drop capability that network guys really like, so the user no longer needs to do coding.

Another technology we bring in is Hadoop.  You know, at telcos, it’s not unusual to get 10 billion rows a day off of one network element.  Trouble is, it’s expensive to put those 10 billion rows in a database.

But with Hadoop, you can store huge amounts of data at low cost — and it can be either structured or unstructured data.  You’re talking about a cost savings of 10 percent to 20 percent of traditional database costs.

Now Hadoop is not a database, and you can’t access it as such.  So at SAS we help the operator use Hadoop properly, and when you do that, the access speed can be just as fast as using a database.

Kelvin, thanks for the great real-world examples.

SAS Institute is one of 40 key analytics suppliers profiled in TRI’s new market research report, The Telecom Analytics & Big Data Solutions Market.

Copyright 2014 Black Swan Telecom Journal

Kelvin Hall

Kelvin D. Hall, Principal Telecommunications Consultant at SAS, has served the telecom industry for over 30 years.  Most recently Kelvin served as the director of analytics servicing AT&T network engineering.  A pioneer in network data algorithms, he takes an analytical approach that pinpoints detailed network problems and exposes issues that were not previously visible.  Contact Kelvin via

Black Swan Solution Guides & Papers

Swans of a Feather

  • A Big Data Starter Kit in the Cloud: How Smaller Operators Can Get Rolling with Advanced Analytics interview with Ryan Guthrie — Medium to small operators know “big data” is out there alright, but technical staffing and cost issues have held them back from implementing it.  This interview discusses the advantages of moving advanced analytics to the cloud where operators can get up and running faster and at lower cost.
  • Telecoms Failing Badly in CAPEX: The Desperate Need for Asset Management & Financial Visibility interview with Ashwin Chalapathy — A 2012 PwC report put the telecom industry on the operating table, opened the patient up, and discovered a malignant cancer: poor network CAPEX management, a problem that puts telecoms in grave financial risk.  In this interview, a supplier of network analytics solutions provides greater detail on the problem and lays out its prescription for deeper asset management, capacity planning and data integrity checks.
  • History Repeats: The Big Data Revolution in Telecom Service Assurance interview with Olav Tjelflaat — The lessons of telecom software history teach that new networks and unforeseen industry developments have an uncanny knack for disrupting business plans.  A service assurance incumbent reveals its strategy for becoming a leader in the emerging network analytics and assurance market.
  • From Alarms to Analytics: The Paradigm Shift in Service Assurance interview with Kelvin Hall — In a telecom world with millions of smart devices, the service assurance solutions of yesteryear are not getting the job done.  So alarm-heavy assurance is now shifting to big data solutions that deliver visual, multi-layered, and fine-grained views of network issues.  A data architect who works at large carriers provides an inside view of the key service provider problems driving this analytics shift.
  • The Shrink-Wrapped Search Engine: Why It’s Becoming a Vital Tool in Telecom Analytics interview with Tapan Bhatt — Google invented low cost, big data computing with its distributed search engine that lives in mammoth data centers populated with thousands of commodity CPUs and disks.  Now search engine technology is available as “shrink wrapped” enterprise software.  This article explains how this new technology is solving telecom analytics problems large and small.
  • Harvesting Big Data Riches in Retailer Partnering, Actionable CEM & Network Optimization interview with Oded Ringer — In the analytics market there’s plenty of room for small solution firms to add value through a turnkey service or cloud/licensed solution.  But what about large services firms: where do they play?  In this article you’ll learn how a global services giant leverages data of different types to help telcos: monetize retail partnerships, optimize networks, and make CEM actionable.
  • Raising a Telco’s Value in the Digital Ecosystem: One Use Case at a Time interview with Jonathon Gordon — The speed of telecom innovation is forcing software vendors to radically adapt and transform their business models.  This article shows how a deep packet inspection company has expanded into revenue generation, particularly for mobile operators.  It offers a broad palette of value-adding use cases from video caching and parental controls to application-charging and DDoS security protection.
  • Radio Access Network Data: Why It’s Become An Immensely Useful Analytics Source interview with Neil Coleman — It’s hard to overstate the importance of Radio Access Network (RAN) analytics to a mobile operator’s business these days.  This article explains why RAN data, which lives in the air interface between the base station and the handset, can be used for business benefit in network optimization and customer experience.
  • Analytics Biology: The Power of Evolving to New Data Sources and Intelligence Gathering Methods interview with Paul Morrissey — Data warehouses create great value, yet it’s now time to let loose non-traditional big data platforms that create value in countless pockets of operational efficiency that have yet to be fully explored.  This article explains why telecoms must expand their analytics horizons and bring on all sorts of new data sources and novel intelligence gathering techniques.
  • Connecting B/OSS Silos and Linking Revenue Analytics with the Customer Experience by Anssi Tauriainen — Customer experience analytics is a complex task that flexes B/OSS data to link the customer’s network experience and actions to improve it and drive greater revenue.  In this article, you’ll gain an understanding of how analytics data needs to be managed across various customer life cycle stages and why it’s tailored for six specific user groups at the operator.
  • Meeting the OTT Video Challenge: Real-Time, Fine-Grain Bandwidth Monitoring for Cable Operators interview with Mark Trudeau — Cable operators in North America are being overwhelmed by the surge in video and audio traffic.  In this article you’ll learn how Multi Service Operators (MSOs) are now monitoring their traffic to make critical decisions to protect QoS and monetize bandwidth.  Also featured is expert perspective on trends in network policy, bandwidth caps, and customer care issues.
  • LTE Analytics: Learning New Rules in Real-Time Network Intelligence, Roaming and Customer Assurance interview with Martin Guilfoyle — LTE is telecom’s latest technology darling, and this article goes beyond the network jargon to explain the momentous changes LTE brings.  The interview delves into the marriage of IMS, high QoS service delivery via IPX, real-time intelligence and roaming services, plus the new customer assurance hooks that LTE enables.
