The success of a Major League Baseball franchise today is built around one or two multi-million-dollar sluggers, a David Ortiz or an Albert Pujols.
That’s a theory of mine. But as a casual fan of baseball, what do I know? The truth is that baseball has become yet another analytic science, and the data proves that “couch potato experts” like me are often way off the mark.
Fortunately I ran into Peter Mueller, a student of the “grand old game” and the CTO of ATS, a boutique analytics firm in New Jersey serving a few Tier 1 telcos in North America.
In my interview with Peter, he walks us through his own big data/cloud “epiphany,” then lays out a couple of great examples of how analytics technology benefits both the telco customer and the big data solution vendor.
Dan Baker: Peter, what’s your take on my baseball theory?
Peter Mueller: Dan, I learned a lot reading Moneyball by Michael Lewis, a book that has since become a movie. That book shed light on new baseball strategies made possible by analytics.
People used to measure a ball player’s worth by simple measures like batting average and RBIs (runs batted in). But newer statistics like on-base percentage and slugging percentage have proven to be much better predictors of a player’s or team’s success.
Actually, the money the players make is correlated more with the income their team makes from selling ballpark seats, television rights, and team merchandise. And if you look back to 2003, the Florida Marlins beat the New York Yankees 4 games to 2 in the World Series. And yet the Marlins’ player salaries were only about one third of the Yankees’ payroll.
Very interesting. So what about analytics in telecom? As CTO at ATS you’ve been in an ideal position to watch big data and analytics evolve. How do you see this trend breaking with the past?
Dan, I think it’s realizing you’re no longer limited to the Oracle database way of doing things. The Oracle database was never that good at handling very large volumes of data in its native format. To be successful in this area, you had to spend a lot of time converting and inserting your data into ever-costlier hardware and software configurations.
Once you got there, you were limited to a Structured Query Language (SQL) view of the world. That’s fine when you need to manage transactions such as bank records, where there are a lot of reads, writes, and locking of records. When you take money out of an ATM, the new account balance needs to appear instantly on the display. But the new era of analytics calls for a richer set of analytic tools.
There is a world of difference between banking transactions and telecom CDRs. CDRs are a one-way ticket: nobody is going back to the switch to rewrite something. Not only that: CDRs come at you in huge volumes, like getting blasted with a firehose.
Yes, the rise of Teradata in the 1990s was in direct response to this dilemma. Teradata was a database designed for analytics and reporting: a new concept at the time.
Yes, Teradata with its parallel processing was one of the first to crack the problem and eat away at the Oracle behemoth. Today, we’re seeing Open Source and the public cloud pick up on that trend.
Another issue is that the data coming off the switches is usually not plain ASCII text. The Bellcore AMA format, for instance, is very hard to read because it’s binary. In GSM, you also get binary-encoded records. So all of these need to be translated and reformatted.
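To make that translation step concrete, here is a minimal sketch of turning a stream of fixed-width binary call records into plain CSV rows. The field layout below is hypothetical and greatly simplified; real Bellcore AMA and GSM records use much richer, vendor-specific encodings.

```python
# Minimal sketch: convert fixed-width binary call records into CSV rows.
# The record layout is illustrative only -- it is NOT the actual AMA spec.
import csv
import struct
import sys

RECORD = struct.Struct(">10s10sIB")  # caller, called, duration (sec), cause code

def decode(path, out=sys.stdout):
    writer = csv.writer(out)
    writer.writerow(["caller", "called", "duration_sec", "cause"])
    with open(path, "rb") as f:
        while (chunk := f.read(RECORD.size)):
            if len(chunk) < RECORD.size:
                break  # skip a truncated trailing record
            caller, called, duration, cause = RECORD.unpack(chunk)
            writer.writerow([caller.decode("ascii").rstrip("\x00"),
                             called.decode("ascii").rstrip("\x00"),
                             duration, cause])

if __name__ == "__main__":
    decode(sys.argv[1])
```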
So it’s here where new tools and hardware have revolutionized everything. With Hadoop and MapReduce, you can set up your own streaming jobs to run through these records. Moore’s Law is still working, too: commodity processing boxes have allowed IT to take on mass processing jobs they could never afford to do before.
The IT groups at our operator customers have also become very clever at parsing their own records, and switch vendors have started outputting their records in a more useful way. Together, that sets the table for more powerful analytics engines to come in and start making sense of all this “big data”.
How did your shop actually get started using big data?
Well, our first use of Hadoop was a duct tape and glue sort of thing. We hijacked some old servers and were amazed to see Hadoop working in our own data center, running fairly large analyses on commodity hardware. From there, we made the jump to public cloud providers and greatly lowered our costs. Your readers should definitely explore that.
At ATS, we’ve kind of settled on Amazon and Google Cloud Compute. We like the more durable storage you get on Amazon’s S3 platform, plus it’s nice to rent rather than own your computing resources. If a medium-sized IT shop were to set up a 20-node Hadoop cluster in-house, they’d find it expensive and very tough to configure. But with the cloud, you can expand that 20-node cluster to 200 nodes, say, once a month when you need it for 3 hours.
So there’s some truth to those rumors about the “death of the server”. Computing power has become a utility: it no longer needs to be in an air conditioned room down the hall.
Weren’t you reluctant to port your code to Hadoop, given you’ve developed in other formats for decades?
Actually, no. Our code ports pretty well into Hadoop, and frankly that speaks more highly of Hadoop than of our code. The trick is to embrace the MapReduce framework, which isn’t much more than a specific way of counting things.
For example, if you had twenty decks of shuffled cards and a room full of 20 people, what would be the best way to find and count all the aces? Your programmer might come up with one way, but MapReduce forces you to split the job into manageable chunks and have each node report back what it found.
So, the task for any software engineer looking to retrofit their software for Hadoop is to separate the smaller tasks (“I found 3 aces in this pile, 6 in that one”) from the final tabulation (“It turns out there are 80 aces in total”). Then you let Hadoop manage the intricacies of setting up and babysitting all the jobs and tasks. And you’re done.
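As an illustration (not ATS’s actual code), here is roughly what that split looks like as a pair of Hadoop Streaming scripts in Python, using Peter’s card example with one card per input line:

```python
#!/usr/bin/env python3
# mapper.py -- each node scans its own pile of cards (one card per line)
# and emits a (key, count) pair for every ace it finds.
import sys

for line in sys.stdin:
    card = line.strip().lower()
    if card.startswith("ace"):          # e.g. "ace of spades"
        print("ace\t1")
```

```python
#!/usr/bin/env python3
# reducer.py -- Hadoop routes all the "ace" pairs here; we just tabulate.
import sys

total = 0
for line in sys.stdin:
    _key, count = line.rstrip("\n").split("\t")
    total += int(count)
print(f"ace\t{total}")
```

Pointed at a directory of card files with the hadoop-streaming jar, the mapper runs on each node’s slice of the data and the reducer produces the single total; Hadoop handles the shuffling, retries, and babysitting in between.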
Peter, how do you use big data in your projects with telecom clients?
Let me give you an example. A wireless carrier came to us and said, “We’ve got a retention problem and we need your help identifying who is likely to churn. We’ll supply you with any data set you need to look at: CDRs, billing data, plus the anonymized list of customers who disconnected from us in the last 6 months.”
OK, fine. With that clear mission, we put our thinking hats on and came up with a series of questions to explore.
One of those questions, about “dropped calls,” is a biggie. Wireless operators wonder whether they should sign off on multi-billion-dollar CAPEX spends. The engineers say that erecting more wireless towers will reduce dropped calls. But in the big scheme of things, does it really matter?
So, Dan, let me put you on the spot and ask: what’s your gut feel? In your opinion, is the number of dropped calls a reliable predictor of churn?
I would definitely say yes: dropped calls should be a reliable predictor of churn.
You said, “yes”, and in fact the executives at the wireless carrier agreed with you. But surprisingly, the answer is “no”. By running the big data, we found dropped calls were not a predictor, at least for the carrier customer we worked with.
We’re still not sure why. Maybe it says a lot about what we’ve come to expect from our wireless phones. People don’t expect the grass to be greener on the other side of the fence.
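For readers curious what “running the big data” on a question like this can look like, here is a minimal sketch of one such check; the file and column names (dropped_call_rate, churned_6mo) are hypothetical, not the carrier’s actual schema.

```python
# Minimal sketch: does a subscriber's dropped-call rate separate churners
# from non-churners?  Field names below are illustrative only.
import pandas as pd

df = pd.read_csv("subscriber_features.csv")   # one row per anonymized subscriber

# Bucket subscribers by dropped-call rate and compare observed churn rates.
df["drop_bucket"] = pd.qcut(df["dropped_call_rate"], q=4,
                            labels=["low", "mid-low", "mid-high", "high"])
print(df.groupby("drop_bucket", observed=True)["churned_6mo"].mean())

# A flat churn rate across buckets -- what ATS found -- means dropped calls
# add little predictive power; a steep gradient would say the opposite.
```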
Now, conversely, your intuition about other things that cause churn may turn out to be correct. For instance, “right-sizing” your customers turns out to be very important.
By right-sizing, I mean reaching out to customers who are on the wrong plan. So if I have you on a 4 Gigabyte plan for $120 a month, that’s a problem if you are only using 1 Gigabyte a month, because you are far more likely to be persuaded by an ad from another carrier to switch. The gap between what you are paying and what you are using is just too large.
But you need big data to prove that case, because CFOs don’t like the idea of proactively reaching out to highly profitable customers and offering them a cheaper plan when it’s so seductive to just view those customers as “pure profit”.
You have to show them (again, with cold, hard data) that the lifetime value of a customer drops the more they are over-paying. On the flip side, the same type of analysis can show you which customers are ripe for an upsell.
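As a simple illustration of that kind of screen (with made-up field names and thresholds, not the carrier’s real billing schema), one could flag over-provisioned and under-provisioned customers from a monthly usage feed like this:

```python
# Minimal sketch of a "right-sizing" screen over a monthly usage feed.
# Columns and thresholds are illustrative only.
import pandas as pd

bills = pd.read_csv("monthly_usage.csv")   # subscriber_id, plan_gb, price, used_gb

bills["utilization"] = bills["used_gb"] / bills["plan_gb"]

# Heavily over-provisioned customers (e.g. paying for 4 GB, using 1 GB):
# prime targets for a proactive move to a cheaper, better-fitting plan.
oversized = bills[bills["utilization"] < 0.35].sort_values("utilization")

# The same frame, filtered the other way, surfaces upsell candidates.
upsell = bills[bills["utilization"] > 0.9]

print(oversized[["subscriber_id", "plan_gb", "used_gb", "price"]].head())
print(upsell[["subscriber_id", "plan_gb", "used_gb", "price"]].head())
```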
Peter, thanks for these great stories and perspective. So to sum it all up, what’s the key virtue of becoming a big data shop?
To me, Dan, the biggest benefit of big data technology is that it reduces the cost of failure.
The truth is that most new ideas -- in revenue assurance or network performance or churn prediction -- start with a hunch. And going from a hunch to a proof of concept takes work, because you need an efficient way to test that concept.
In revenue assurance, you can float all kinds of ideas about how people are losing revenue, but most of those ideas don’t pan out. So it’s very much a numbers game. Then again, in baseball, even if you bat .300, you’re considered very good.
And customers love us for that virtue. They bring us in on nothing more than a fuzzy idea and we can run with it at fairly low cost. And we can come back and say, “You know what? We looked at a billion of your records, and this idea you had about predicting churn seems to make sense in this area, but not in 10 other areas.” Once we separate the wheat from the chaff, we can move quickly to implement.
NOTE: ATS is one of 41 vendors profiled in TRI’s 574-page research report: The Telecom Analytics & Big Data Solutions Market.
Copyright 2014 Black Swan Telecom Journal