
August 2013

When Big Data is Too Big: The Value of Real-Time Filtering and Formatting

James Heath, a former colleague now working at Sprint, recently gave me an intriguing definition of “Big Data.” He said, “You know your data is ‘big’ when the cheapest and best way to transport it is to carry it on an airplane.”

It’s a good one, James.  Maybe the only thing I would add is the dimension of time.  If you need your data analyzed five minutes after it’s left a router, an airplane trip doesn’t really help.  And that’s quite common in telecom: the volume of network traffic is often so huge it outstrips the ability of the analytics engine to process it fast enough.

Rick Aguirre, CEO of Dallas-based Cirries Technologies, is a guy who knows something about managing these extra-extra large data sets.  His company has made a business out of helping carriers filter their big data down to a size that can be analyzed in a matter of seconds, minutes, and hours.

Dan Baker: Rick, tell us a little about your firm.  When did you get rolling and what’s your mission in the “big data” space?

Rick Aguirre: Sure, Dan.  We are a small but rapidly expanding company, and although “big data” is our business, we actually don’t offer an analytics application.  Instead we built a real-time mediation and filtering engine that’s very fast and can quickly get data into any format the client wants.  In short, we feed a lot of analytic applications with the right data at the right time.

Thankfully we survived the deepest trough of the recession in 2008/2009, and from 2010 onward we’ve seen fast and profitable growth.  Companies like Nokia-Siemens and RedHat have come to us for help and we also serve -- through resellers -- clients in places like the Philippines, Mexico and Canada.  Some of the large U.S. carriers, of course, are also important clients and we sell to them direct.

How does your system work and how do you add value?

It starts when an operator’s system or analytics platform has a limited throughput or needs to have the data presented in a different manner.  Say you’ve captured data off the routers at the rate of 600,000 events a second, but the analytics engine can’t handle more than 15,000 events a second.  How do you get the data down to a volume you can manage?

The answer is our Maestro Data Controller.  In this case it filters the data, which is one of many data manipulation functions we provide to our carrier clients.

So at the entry point of our real-time data parsing engine, we have “resource adapters” that can collect and transform any data structure -- a protocol, a syslog, a flat file.  We then process that data in real-time, where we do filtering, mediation, enrichment, correlation, or even apply policies to it.  In the end, we transform the network data into enhanced information or what we call smart data.
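The adapter, filter and enrich stages described here can be pictured with a small Python sketch. It is illustrative only and is not Maestro’s implementation: the `key=value` log layout, the `SITE_NAMES` lookup table and every function name are assumptions made for the example. The helper `required_keep_ratio` simply works out the arithmetic from the example above: to feed a 15,000 events-a-second engine from a 600,000 events-a-second stream, only about 2.5 percent of events can pass the filter.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional

Event = Dict[str, str]

# Hypothetical enrichment table mapping router addresses to site names.
SITE_NAMES = {"10.0.0.1": "edge-dallas-1"}


def required_keep_ratio(input_rate: float, engine_rate: float) -> float:
    """Fraction of events that may pass so the engine is not overrun."""
    return min(1.0, engine_rate / input_rate)


def parse_syslog_line(line: str) -> Optional[Event]:
    """Assumed 'resource adapter': parse 'key=value key=value' text into a dict."""
    fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
    return fields or None


def keep_errors(event: Event) -> bool:
    """Example filter policy: keep only error-severity events."""
    return event.get("severity") == "error"


def add_site(event: Event) -> Event:
    """Example enrichment: attach a site name looked up from the router address."""
    site = SITE_NAMES.get(event.get("router", ""), "unknown")
    return {**event, "site": site}


def run_pipeline(
    lines: Iterable[str],
    keep: Callable[[Event], bool],
    enrich: Callable[[Event], Event],
) -> Iterator[Event]:
    """Adapt each raw line, drop what the policy rejects, enrich the rest."""
    for line in lines:
        event = parse_syslog_line(line)
        if event is None or not keep(event):
            continue
        yield enrich(event)
```

Because `run_pipeline` is a generator, events stream through one at a time rather than being held in memory, which is the property that matters when the input is far larger than what the downstream engine can absorb.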

I like to say, “Give us all your data and we’ll filter and transform it so you’re only getting the relevant data.”

And the secret to doing that is the engine we’ve developed, which can process a million events a second on a single quad-core CPU.  Combined with Maestro’s distributed data collection capabilities, the result is very high throughput at low cost.  For one Carrier Grade Network Address Translation (CGNAT) application, Cirries demonstrated 1 million events a second, while the competitive alternatives were at 15,000 to 250,000 events per second.

Rick, can you give us a feel for some of the use cases for your platform?

Dan, there are a variety of use cases.  Let me walk you through some of the more interesting and important ones.

Data Monetization through a Third Party

The attitude of carriers today is, “Hey, I’ve sunk a ton of money in my network, so give me a way to generate more revenue from that investment -- even if it’s through non-traditional channels.”

This is the goal of what some are calling “data monetization,” in which a carrier sells real-time data to retailers and marketing agencies because the information is relevant and timely for a certain set of mobile subs.

Here in Dallas, there’s a Dick’s Sporting Goods store off the major interstate and tens of thousands of commuters pass that store every day on the way home from work.

Now a wireless carrier serving Dallas knows when subscribers are on that highway because the cell sites are gathering that intelligence.  The trouble: you can’t afford to look at all those cell site records.  You need a way to drastically filter down the information and put it in a format the wireless carrier can use.

So our system collects the records, performs the relevant functions, and then publishes the targeted mobile subscribers to a real-time queue, signaling that a prospect attractive to a retailer like Dick’s Sporting Goods is coming into the cell.  So if Dick’s has a promotion going on, it can extract the information and send a discount coupon to the subscriber’s phone.
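As a rough sketch of that filter-and-publish step, the Python below keeps only events from cells near the store and puts each matching subscriber on a queue once. The event fields (`subscriber`, `cell_id`), the cell identifiers and the function name are hypothetical, and the standard-library `queue.Queue` stands in for whatever real-time queue a deployment would actually use.

```python
import queue
from typing import Dict, Iterable, Set


def publish_prospects(
    events: Iterable[Dict[str, str]],
    target_cells: Set[str],
    out: queue.Queue,
) -> int:
    """Publish one message per subscriber seen entering a target cell.

    Returns the number of messages placed on the queue.
    """
    already_published: Set[str] = set()
    for event in events:
        subscriber = event["subscriber"]
        if event["cell_id"] not in target_cells:
            continue  # the vast majority of cell-site records are dropped here
        if subscriber in already_published:
            continue  # avoid sending the same prospect twice
        already_published.add(subscriber)
        out.put({"subscriber": subscriber, "cell_id": event["cell_id"]})
    return len(already_published)
```

A retailer-facing application would then read from the queue and decide, under its own rules, whether a promotion applies to each prospect.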

Here’s the thing: the analytics system supporting that use case could be simple or very sophisticated.  Either way, you can’t enable it unless you can quickly extract and filter the big traffic down to a size you can get your arms around.

Network Optimization & Migration

Another very common use case is network optimization and migration.

Now, the deployment of new mobile networks like LTE is causing the big operators to migrate traffic to handle the huge growth in mobile data.  But mobile backhaul is only one example.  There’s plenty of optimizing going on in backbone and MPLS networks too.

The data is often readily available.  And the problem is the same: the data needs to be pared down to a workable volume and formatted to enable good decisions and predict trends.

Often the data from their core router tells them they are running at 95 percent capacity, but figuring out where capacity is needed at the edge routers is a challenge.  So we help them collect that data and report on where the traffic is heavy in real-time.
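One simple way to picture “where the traffic is heavy” is a per-router byte count over a window of flow records, sorted from busiest to quietest. The sketch below assumes each record is a `(router_name, byte_count)` pair; that record shape and the function name are illustrative, not taken from the product.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def heaviest_routers(
    records: Iterable[Tuple[str, int]], top_n: int = 3
) -> List[Tuple[str, int]]:
    """Sum bytes per edge router and return the top_n busiest, heaviest first."""
    totals: Counter = Counter()
    for router, n_bytes in records:
        totals[router] += n_bytes
    return totals.most_common(top_n)
```

Run over successive time windows, the same aggregation would show how the load on each edge router is trending.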

In some cases, the routers are exhausted and they don’t even have ports available to generate records to monitor capacity.  In cases like that we generate records off the raw packets.  In that way, the carrier doesn’t need to disturb the network or buy new equipment.

Luckily our development system is flexible enough to alter things rapidly and recompile.  We don’t have to hard code all this.

Someone could come to us and say, “We want to look at A, B, and C and here are our syslogs.” Well, in a week we can come back and show them a prototype system and the performance they could get out of that system.

Achieving Balance in Network Peering

Rather than doing a lot of buying and selling of interconnect traffic, the large operators have peering relationships where they agree to trade traffic in a balanced way.

So there’s a need to understand where calls and data are coming from and determine how far they are carrying bits of information.  If I carried another operator’s data from Los Angeles to New York, the other operator will transport data from other points to maintain parity in the relationship.

We don’t run analytics ourselves, but we can aggregate traffic based on a policy, generate a report, and throw up a dashboard or graph showing the current status of peering.
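To make the idea of parity concrete, here is a minimal sketch that weights each traffic record by how far it was carried and compares the two directions. The record layout, the direction labels and the use of byte-miles as the measure are all assumptions for illustration; an actual peering agreement defines its own terms.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumed record shape: (direction, bytes carried, distance carried in miles)
Record = Tuple[str, int, float]


def byte_miles_by_direction(records: Iterable[Record]) -> Dict[str, float]:
    """Total bytes x miles for each direction of the peering relationship."""
    totals: Dict[str, float] = defaultdict(float)
    for direction, n_bytes, miles in records:
        totals[direction] += n_bytes * miles
    return dict(totals)


def peering_ratio(
    totals: Dict[str, float],
    ours: str = "carried_for_peer",
    theirs: str = "carried_by_peer",
) -> float:
    """Ratio of what we carried for the peer to what the peer carried for us.

    A value near 1.0 suggests a balanced relationship under this measure.
    """
    theirs_total = totals.get(theirs, 0.0)
    if theirs_total == 0:
        return float("inf")
    return totals.get(ours, 0.0) / theirs_total
```

A dashboard could plot this ratio over time and flag it when it drifts outside whatever band the two operators have agreed on.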

Saving Large Volumes to an Unstructured Data Store

One other important use case is the need to collect and store massive amounts of data for a somewhat indefinite period.

For instance, regulators require large carriers to store tons of data for law enforcement to comply with CALEA, the Communications Assistance for Law Enforcement Act.

Well, we help the operator store that data in an unstructured data store, say a RedHat or Hadoop format.  Then the carrier uses unstructured search to pull that data out whenever it needs to.

Rick, it sounds like your filtering is a much needed service.  Tell me, do you partner with any of the analytics software vendors today?

We’ve explored OEM relationships with a few analytics firms, but the only software alliance we have today is with RedHat.

But I think OEM business makes a lot of sense.  We’d definitely like to find a way to team with analytics companies and be their front end.  Each party contributes what it’s good at.  The analytics firm has its algorithms and a knowledge of how to implement marketing programs.  Then, firms like Cirries have the pre-processing knowhow that would allow the analytics vendor to expand the number of big data use cases it serves.

I also think our approach can be a cost saver for carriers, especially those who are creating or buying analytics solutions in silos.  We can enable a more horizontal way of acquiring and mediating the data.  The analytics applications on top would no longer need to be concerned about real-time filtering and formatting because one system down below manages all that.

Copyright 2013 Black Swan Telecom Journal

Rick Aguirre

Rick Aguirre, a telecommunications industry veteran who specializes in bringing industry-changing technologies to market, is president of Cirries Technologies, Inc. He may be contacted through the company’s website,
