'Big Data' Is the New 'Cloud'
Copyright 2014 by Virgo Publishing.
Posted on: 07/16/2012

By Hyoun Park

"Big Data" is here, and it is even more confusing than the “cloud." Incomplete and obsolete definitions are being used to define Big Data, which confuses customers and vendors alike. How can we get past the confusion in the market and identify opportunities to successfully help companies implement Big Data?

First, let's start with the definition of Big Data. More than a decade ago, META Group analyst Doug Laney introduced the challenges of data growth with the three Vs: volume, velocity and variety. This description is still used today as the starting point for describing Big Data. However, as data has evolved over the past decade, we now start to ask ourselves what each of these Vs really means:

  • Does "volume" refer to the size of a database? Or size of a data object? Or the cumulative size of all data within an organization?
  • Does "velocity" refer to the speed of data coming in? The speed of data acquisition? The speed of data processing? The speed of visualizing and charting data?
  • And does "variety" refer to a variety of data types? Or a variety of sources? Or a variety of applications supported?

Although the answer could be "All of the Above," this isn't a helpful way to think about Big Data or to start designing potential product suites to help companies with their business needs. To be less clever and more straightforward, let's just say that Big Data is any sort of data that can't be stored or analyzed by a business's existing database or analytics solution. And let's start breaking out the concept of Big Data into different types of use cases based on the type of Big Data being used and the tools involved in supporting Big Data.

The increasing size of data volumes we can call Expanding Big Data. This refers to data that is outgrowing its original database or data warehouse. As this data grows from gigabytes to terabytes, companies typically seek Big Data appliances that combine hardware with either a Hadoop distribution (open-source software for distributed computing) or a data warehouse. Examples include the Dell Apache Hadoop solution, EMC Greenplum Data Computing Appliance, IBM Netezza, NetApp Hadoopler, Oracle Big Data Appliance and Teradata Extreme Data Appliance.

Networked and correlated data on a massive scale is Social Big Data. The most obvious examples of Social Big Data come from social media monitoring, where marketing and service professionals seek to monitor Facebook, Twitter and other social networks to identify brand sentiment and business opportunities. Although social media monitoring solutions such as Salesforce's Radian6, Marketwire's Sysomos and Crimson Hexagon can provide social monitoring and analytics, companies seeking to integrate social information with existing product and customer information need to bring this information in-house. In this case, data integration technologies become important in combining social data with CRM and other traditional enterprise data transactions. This data integration consists of three steps often abbreviated as ETL:

  1. Extracting data from external sources
  2. Transforming data to the proper format and cleaning up bad data
  3. Loading the data to a database for analysis

Data integration companies include Birst, IBM, Informatica, MicroStrategy, NetApp, Oracle, Pervasive, Pentaho, SAP and SAS.
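The three ETL steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any of the vendors named here implement integration: the sample social data, the customers table, and the function names extract, transform, load and run_etl are all hypothetical, and an in-memory SQLite database stands in for the enterprise CRM system.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a social monitoring tool (made-up sample data).
RAW_SOCIAL_CSV = """customer_id,network,sentiment
101,Twitter,positive
102,facebook,NEGATIVE
,Twitter,positive
103,Twitter,unknown
"""

VALID_SENTIMENTS = {"positive", "negative", "neutral"}


def extract(raw_text):
    """Step 1: read rows from the external source."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Step 2: normalize formats and drop rows with bad data."""
    cleaned = []
    for row in rows:
        customer_id = row["customer_id"].strip()
        sentiment = row["sentiment"].strip().lower()
        # Skip rows with a missing ID or an unrecognized sentiment label.
        if not customer_id.isdigit() or sentiment not in VALID_SENTIMENTS:
            continue
        cleaned.append((int(customer_id), row["network"].strip().title(), sentiment))
    return cleaned


def load(conn, records):
    """Step 3: load the cleaned records into a database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS social_mentions "
        "(customer_id INTEGER, network TEXT, sentiment TEXT)"
    )
    conn.executemany("INSERT INTO social_mentions VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(raw_text=RAW_SOCIAL_CSV):
    """Run all three steps, then join social data with (hypothetical) CRM data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(101, "Acme Corp"), (102, "Globex")],
    )
    load(conn, transform(extract(raw_text)))
    return conn.execute(
        "SELECT c.name, s.network, s.sentiment "
        "FROM customers c JOIN social_mentions s ON s.customer_id = c.customer_id "
        "ORDER BY c.customer_id"
    ).fetchall()


if __name__ == "__main__":
    for row in run_etl():
        print(row)
```

The final join is the point of bringing social data in-house: once the cleaned mentions sit next to customer records, they can be analyzed in the context of existing enterprise data.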

The challenges of complex unstructured data are Messy Big Data. Although databases provide a valuable structure for analyzing data, most data doesn't neatly fit into a database. As companies seek to analyze video, voice, pictures and other unstructured data, they require tools that fall outside traditional data analysis. Each of these specialized forms of data typically requires its own form of analysis. For video, tools such as HStreaming and SenSen Networks are used to analyze the actual video content. For voice, there are a number of speech analytics companies that have traditionally focused on the contact center space, such as Avaya's Aurix, CallMiner, HP Autonomy, Nexidia, NICE Systems and UTOPY.

The ease of use for Big Data and integration of Big Data with standard business analytics make up Easy Big Data. One of the most challenging aspects of Big Data is that it is complex and difficult to analyze. Hadoop analysis is not intuitive, but a number of visualization vendors have started to focus on the importance of being able to access and analyze Big Data. Companies focused on making Big Data easier to use and visualize include MicroStrategy, Pervasive, Pentaho, Splunk, QlikView, Tableau Software and TIBCO Spotfire.

Machine and sensor data for large operational deployments constitute Automated Big Data. This use case is most common in manufacturing organizations that use time-series analysis of transactional processes. Process historians created by companies such as GE and Rockwell have traditionally supported these needs. Interestingly, as traditional enterprises increasingly use more sensors and machines to support IT, marketing, retail and other efforts, they will increasingly need to conduct similar analysis regarding the quality and deviation of their machine environments and may need this type of solution.
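As a rough illustration of what deviation analysis on time-series sensor data can look like, the Python sketch below flags readings that stray far from the average of the readings just before them. It is not how GE or Rockwell process historians work; the function name flag_deviations, the window size, the threshold and the sample temperature readings are all assumptions made for this example.

```python
from statistics import mean, stdev


def flag_deviations(readings, window=5, threshold=3.0):
    """Return indexes of readings that deviate sharply from the recent trend.

    Each reading is compared with the mean of the previous `window` readings
    and flagged if it lies more than `threshold` sample standard deviations
    away. Both parameters are illustrative defaults, not industry standards.
    """
    flagged = []
    for i in range(window, len(readings)):
        recent = readings[i - window:i]
        mu = mean(recent)
        sigma = stdev(recent)
        if sigma == 0:
            # A perfectly flat history: any change at all counts as a deviation.
            if readings[i] != mu:
                flagged.append(i)
            continue
        if abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical temperature readings from a single machine sensor.
SAMPLE_READINGS = [20.0, 20.1, 19.9, 20.0, 20.1, 20.0, 35.0, 20.1, 19.9, 20.0]

if __name__ == "__main__":
    print(flag_deviations(SAMPLE_READINGS))
```

In this sample only the 35.0 reading is flagged. A production system would typically need more robust statistics, since a single spike inflates the rolling standard deviation for the readings that follow it.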

As you can see, many types of Big Data exist in the general enterprise world, and many more could be categorized once industry verticals such as financial services, government, health care and others are considered. All of these use cases, technologies and considerations are part of the concept of Big Data.

As your clients start to think about Big Data, you should ask the following questions to better understand what they are truly looking for:

  • Is the company focusing on data storage, strategic data analysis, real-time data analysis, or line-of-business visualization? As companies move up this scale of Big Data needs, they progress from pure cloud or server-based storage to a Hadoop or data warehouse implementation and eventually to a visualization solution that can directly show these results to line-of-business employees. Storage is relatively straightforward, but the need for real-time performance and line-of-business insight allows the value of Big Data to scale.
  • What level of context does the company really need? If companies are simply looking for sentiment analysis associated with unstructured voice or social media traffic, they actually may not be looking for the traditional "Big Data" solutions. Instead, they may be looking for voice analytics or social media monitoring solutions. It is only when companies are ready to analyze these data sources in the context of CRM, ERP and other enterprise data sources that the world of Big Data products starts to come into play. But companies need a basic understanding of their data before they are ready to fully dig into the insight of Big Data.
  • Does the company need to integrate Big Data and traditional data analytics approaches? Big Data potentially can become a new IT silo that creates its own headaches and problems if the infrastructure and management are completely separated from the traditional analytics team. Companies need a data integration road map to make sure that these new Big Data tools can be connected to all relevant enterprise applications and data repositories.

By asking a few specific questions about what a company is trying to achieve with Big Data, channel partners can discover whether there are Big Data solutions that align with their current business and become a trusted partner to companies seeking to better understand their own needs for Big Data.

Hyoun Park is a principal analyst at Nucleus Research where he conducts and oversees primary investigative research on analytics, big data, business analytics, social software and enterprise mobility. He combines his telco, sabermetrics, social networking and expense management backgrounds to describe key value propositions associated with analytics and emerging business technologies. Park holds a bachelor's degree in women's and gender studies from Amherst College and a master's degree in business administration from Boston University.


Hear more from Nucleus Research's Hyoun Park in the two-hour session, "Big Data — What's the Big Deal and How Can Its Power Be Harnessed in the Cloud?," at the Channel Partners Conference & Expo, Sept. 12-14, in Orlando.