
August 2013

The Shrink-Wrapped Search Engine: Why It’s Becoming a Vital Tool in Telecom Analytics

In its early days, Google faced a computing challenge of epic proportions: how to catalog the internet so it could profitably grow its web search and advertising business.

The mission required massive disk space and lightning-fast response, and building a solution around standard commercial hardware and databases would have been far too costly.  So what did Google do?  It invented a globally distributed search engine that lives in many mammoth data centers, each populated with thousands of commodity CPUs and disks.

Fifteen years later, the same technology that drives Google and other search engines has been productized into software, basically shrink-wrapped search engines.  They are quickly becoming a standard tool in business analytics.

Splunk is a fast-growing success story in this new software category.  The firm IPOed in 2012 and now has 5,600 customers and $198 million in annual revenue (2012).

Now here to explain the Splunk business and how its product plays in the telecom industry is Tapan Bhatt, the firm’s senior director of solutions marketing.

Dan Baker: Tapan, it would be great if you could explain what your software solution is about.

Tapan Bhatt: Dan, when people say “big data”, they often think of Twitter or Facebook data, but at Splunk we consider social media as just one example of the broader category of machine data.  Machine data is everything coming off mobile devices, emails, web pages, security logs, and much more.  IDC estimates that 90% of the data an organization owns is machine data that lives outside relational databases.

So Splunk allows companies to have full access to that information for many purposes: analytics, compliance, troubleshooting, and security.  Splunk software is especially useful where the data is either too costly to integrate in relational databases or the data needs to be accessed in real time.

Splunk is completely agnostic to data source and data type, so even with widely varying sources and formats you can still make sense of the data.  Let’s take an example of a customer escalation issue at a telecom company:

You start off with multiple data sources, say, order processing, middleware, care/IVR data, and Twitter feeds.  At first, the data may not make sense; some of it is simply blocks of text.  But by picking up a customer ID here and a product ID there, you can correlate and trace the entire process of the telecom service.

Let’s say the customer tries to place an order, but an error in the middleware makes the order fail.  The customer then calls customer service and is put on hold for 10 minutes, so the customer gets frustrated and complains on Twitter.  By analyzing the machine data, you can correlate the customer ID from the call all the way back to the original order and identify the middleware error that ultimately resulted in the tweet, so you can address the issue.

This is an example of what’s possible with machine data.
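The correlation described above can be pictured in a few lines of Python.  This is only a sketch of the idea, not Splunk’s implementation, and every field name and sample record here is invented:

```python
# Sketch of cross-source correlation via shared IDs (all fields invented).
order_log = [
    {"customer_id": "C42", "order_id": "O9", "event": "order_submitted"},
]
middleware_log = [
    {"order_id": "O9", "event": "provisioning_error", "code": 500},
]
care_log = [
    {"customer_id": "C42", "event": "call", "hold_minutes": 10},
]

def trace_customer(cust_id):
    """Stitch together one customer's journey across the three sources."""
    events = [e for e in order_log if e.get("customer_id") == cust_id]
    # Follow the order ID from the order log into the middleware log.
    order_ids = {e["order_id"] for e in events if "order_id" in e}
    events += [e for e in middleware_log if e.get("order_id") in order_ids]
    events += [e for e in care_log if e.get("customer_id") == cust_id]
    return events

print(trace_customer("C42"))  # order, middleware error, then the care call
```

The key point is that no single log tells the story; the trail only appears once the IDs are followed across sources.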

What’s the technology behind Splunk?

Basically, the technology comes in three parts: data collection, data indexing, and an analytics query language called the Search Processing Language (SPL).  The collection function pulls in data in any format; the indexing then organizes the data for fast analysis with SPL.  What’s really unique about analytics in Splunk is that it happens on demand, with the schema applied on the fly, versus the traditional slow and costly practice of normalizing data and loading it into a warehouse before analysis can begin.

The core technology is the ability to search and correlate data across many discrete datasets.  You can do reporting or alerting; you can create custom dashboards in the interface.  Then there are software development kits and APIs that allow third party developers to build things on top of Splunk.

One of the hardest things to wrap my mind around is the fact that Splunk machine data is not stored in a database: the fields and rows of an Excel spreadsheet are a paradigm that’s hard to shake.

Yes, underneath Splunk, the data is not stored in a relational database.  Think of Splunk as a highly efficient file system that lives across distributed computers and allows you to do real-time search and analytics.

Machine data is relatively unstructured.  Each item has a time stamp, but most everything else is unstructured.  Data warehouses and relational databases rely on a schema, which means you need to understand the metadata structure up front.  With Splunk, the schema is extracted at search time, after the data is indexed, so a new data source is available immediately.
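A minimal sketch of what schema-on-read means, assuming an invented key=value log format: raw lines are stored untouched, and fields only come into existence when a search runs.

```python
import re

# Raw machine data is indexed as-is; no schema is declared up front.
# The log format below is invented for illustration.
indexed_events = [
    "2013-08-01T10:02:11 user=alice status=200",
    "2013-08-01T10:02:14 user=bob status=500",
]

def search(predicate, events):
    """Extract fields at search time, not at load time (schema on read)."""
    field_re = re.compile(r"(\w+)=(\S+)")
    for raw in events:
        fields = dict(field_re.findall(raw))
        if predicate(fields):
            yield fields

errors = list(search(lambda f: f.get("status") == "500", indexed_events))
print(errors)  # just bob's failed request
```

Because the pattern lives in the query rather than in the storage layer, a differently formatted source can be searched the moment it lands.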

In a data warehouse, you have connectors within an ETL (Extract, Transform, Load) process that periodically capture new records and add them to the system.  Splunk has no such connectors.  Instead it uses Forwarders, which listen to data and append it to the file system automatically.
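In spirit, a Forwarder behaves like a continuous tail on a growing file: anything appended since the last read is picked up and shipped.  A toy single-pass version in plain Python (the real Forwarder is a long-running agent with far more machinery):

```python
import io

def forward_new_lines(stream, offset, sink):
    """Read anything appended past `offset`, hand it to `sink`,
    and return the new offset for the next pass."""
    stream.seek(offset)
    for line in stream:
        sink.append(line.rstrip("\n"))
    return stream.tell()

log = io.StringIO("first event\n")          # stands in for a log file
collected = []
pos = forward_new_lines(log, 0, collected)  # picks up "first event"
log.write("second event\n")                 # something appends to the log
pos = forward_new_lines(log, pos, collected)  # picks up only the new line
print(collected)
```

Contrast this with an ETL batch job: nothing is transformed or staged, the new bytes simply flow onward as they appear.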

Now the other beauty of this approach is it saves money: you avoid the cost of database licenses and you can scale to terabyte size deployments linearly using commodity hardware.  Multiple machines are processing the data in parallel as the data volume increases.  This significantly reduces your hardware costs.

In a great many use cases, the Splunk approach compares very favorably with relational databases.  And when it comes to search and ad hoc correlations, Splunk delivers greater power than a database, especially since you can ask richer questions of the data and you’re not constrained to a certain type of search.

And yet I understand you now allow users to pull data directly from relational databases.

Yes, four months ago we launched DBConnect and it’s been one of the hottest downloads off our website.

DBConnect enables Splunk users to enrich machine data with structured data from multiple databases.  So if you pull in data such as customer IDs, product IDs, or the regions customers live in, those fields are inserted into the machine data.  These lookups give users access to richer profile information.
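The enrichment described here is essentially a lookup join.  A hypothetical sketch, with the customer table and all field names invented:

```python
# Hypothetical customer table, as it might be pulled from a relational DB.
customer_db = {
    "C42": {"name": "A. Customer", "region": "Northeast", "plan": "unlimited"},
}

machine_events = [
    {"timestamp": "2013-08-01T10:02:11", "customer_id": "C42", "event": "login"},
]

def enrich(events, table):
    """Insert profile fields from the structured table into each event."""
    for e in events:
        profile = table.get(e.get("customer_id"), {})
        yield {**e, **profile}

enriched = list(enrich(machine_events, customer_db))
print(enriched[0]["region"])  # Northeast
```

The machine event keeps its original fields; the database merely contributes the profile attributes the logs never carried.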

What about customers and users in the telecom market?

Dan, we serve a broad range of industries in finance, government, media, telecom and others.  One of our largest installations is one of North America’s largest mercantile exchanges where Splunk monitors 25 million trades a day for security.

Our telecom customers include firms like Verizon, China Mobile, Telstra, CenturyLink, Taiwan Mobile, KDT, and NTT.  And those firms are using Splunk in a wide number of use cases — in security, quality of service monitoring, network capacity planning, and fraud analysis to name a few.

A good way to explain the possibilities in telecom is to run through a quick use case in the mobile world, so why don’t we do that.

Mobile Service Profitability & Optimization Case

We’re using a hypothetical mobile operator that offers an unlimited song download service and is eager to analyze usage of that service on its mobile devices to optimize the service and its profitability.  The operator connects three machine data sources to Splunk:

  • RADIUS authentication data is used to track log-ins and verify customers are accessing the right resources and services.
  • Web data is brought in to find out exactly what songs are being accessed.
  • Business Process Management (BPM) logs are a collection of logs and other transactional data pulled from the middleware stack, generated by order management and billing applications.

Now correlating the data from these three very different machine data types is not a trivial exercise, yet doing so is a powerful feature of the Splunk engine.

For example, RADIUS authenticates and tracks the identity of a particular user in a particular IP session.  But what happens when the user logs in an hour later and starts a new session?  How do you group the webpages an individual user visited on a particular day?

Well, what you can do is ask Splunk to merge the RADIUS logs and the web logs to create a “transaction” and use that transaction to track user activity sequentially, no matter how many sessions were started or websites were visited.
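One way to picture that merge, with invented field names: the RADIUS log says which user held which IP, and the web hits are then grouped under the user no matter how many sessions occurred (timing and IP reuse are ignored here for simplicity):

```python
# RADIUS log: which user authenticated on which IP session (fields invented).
radius_log = [
    {"session": "s1", "user": "alice", "ip": "10.0.0.5"},
    {"session": "s2", "user": "alice", "ip": "10.0.0.9"},  # logs in again later
]
# Web log is keyed only by IP; by itself it cannot name the user.
web_log = [
    {"ip": "10.0.0.5", "url": "/songs/123"},
    {"ip": "10.0.0.9", "url": "/songs/456"},
]

def build_transactions(radius, web):
    """Stitch web hits to users via the RADIUS IP-to-user mapping."""
    ip_to_user = {r["ip"]: r["user"] for r in radius}
    transactions = {}
    for hit in web:
        user = ip_to_user.get(hit["ip"])
        transactions.setdefault(user, []).append(hit["url"])
    return transactions

print(build_transactions(radius_log, web_log))
```

Even though alice’s two hits arrived on different IPs in different sessions, the transaction view shows them as one sequential activity trail.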

And once the transactions are created, you can search for specific events.  You can geographically map the BPM data for a particular time period to, say, track purchases of iPhones from a particular zip code and show the results on a Google map.

You can create a CEO-level dashboard in Splunk that tracks average revenue per user or the number of iPhone orders, perhaps.  And you can also do external lookups, say against a reverse DNS lookup file, allowing you to dramatically enrich the data in Splunk.  With DBConnect, you can bring in data directly from a relational database.

Finally you can distribute reports.  If you want, you can schedule a PDF file of a particular report to be distributed to the marketing group at 9:00 am every Monday.  Likewise, you can have the results dumped to CSV files for viewing in Excel.  Plus APIs in Splunk enable report access in third party analytics solutions.

Sounds like the solution is very flexible.  Tell me, how do you price Splunk?

We charge based on the volume of data indexed per day.  Our pricing is tiered: it starts at 500 megabytes to 1 gigabyte per day, then 1 to 5 gigabytes, 5 to 10 gigabytes, and so on.

At the low end, the cost of one use case is in the $5,000 to $10,000 range.  Of course, the biggest companies are paying more than $1 million a year.

Another advantage is that your time-to-value is fairly quick, with only a small investment.  Compared with a traditional database provider, you could easily spend $1 million on software and another $4 million on services, and that might take you a year to deploy.  At Splunk, however, our revenue from services is basically negligible.

By the way, the Splunk engine scales all the way from desktop to enterprise.  The same product that processes only 500 megabytes of data for one customer runs at another processing 100 terabytes a day, with analytics capabilities on approximately 10 petabytes of historical data.

How quickly can users get up to speed on the product?

What gets people started with Splunk is downloading the product for free from our website and trying it out in their area of interest.

We have a one-hour tutorial and in that time you can learn how to index the files.

There’s a straightforward way through APIs to use SQL commands, but you don’t need to know SQL.  The vast majority of users employ the Search Processing Language (SPL), which features over 100 statistical commands.

Using SPL, the user can extract particular fields and even utilize some predictive features in the language.
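SPL itself is beyond the scope of a quick quote, but the flavor of an aggregate plus a naive forecast can be sketched in plain Python, with made-up numbers:

```python
from statistics import mean

# Made-up daily download counts for the hypothetical song service.
daily_downloads = [120, 135, 128, 150, 160, 155, 170]

# An aggregate over events, in the spirit of a statistical command.
average = round(mean(daily_downloads), 1)

# A naive forecast, in the spirit of a predictive feature:
# project the next day as the mean of the last three days.
forecast = round(mean(daily_downloads[-3:]), 1)

print(average, forecast)
```

Real SPL runs computations like these directly inside the search pipeline, so the raw events never have to leave Splunk for a separate tool.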

The learning curve is quite fast.  We have a guy on staff who was an Oracle DBA in a previous life.  He says that within a day or two he was able to get 90% of the functionality he had on a sophisticated database.

Thank you, Tapan.

Copyright 2013 Black Swan Telecom Journal

Tapan Bhatt

Tapan Bhatt is Senior Director of Product Marketing at Splunk.  He has broad experience in corporate marketing, product management, and business development.  Prior to joining Splunk in 2011, he held key marketing roles at Vendavo and Siebel Systems.

He has a BS degree in Chemical Engineering from the Birla Institute of Technology and Science and an MBA from the Graduate School of Business, University of Chicago.
