The insideBIGDATA Guide to Streaming Analytics is a new resource for enterprise thought leaders who want strategic insight into this fast-moving area of technology. Many enterprises find themselves at a key inflection point in the big data timeline with respect to streaming analytics technology. Streaming analytics offers enterprises a large opportunity for direct financial and market growth, and companies are already deploying it across a broad variety of use cases. The vendor and technology landscape is complex, and open source options are multiplying. It’s important to choose a platform that supplies a proven, pre-integrated, performance-tuned stack, ease of use, enterprise-class reliability, and the flexibility to protect the enterprise from rapid technology changes. Perhaps the most important reason to evaluate this technology now is that a company’s competitors are very likely implementing enterprise-wide real-time streaming analytics already and may soon gain significant advantages in customer perception and market share. The complete insideBIGDATA Guide to Streaming Analytics is available for download from the insideBIGDATA White Paper Library.
Streaming analytics platforms give businesses a way to extract strategic value from data-in-motion, much as traditional analytics tools operate on data-at-rest. Instead of historical analysis, the goal of streaming analytics is to enable near real-time decision making by letting companies inspect, correlate and analyze data even as it flows into applications and databases from numerous sources. Streaming analytics allows companies to perform event processing against massive volumes of data streaming into the enterprise at high velocity.
Streaming analytics technologies enable action based on an analysis of a series of events that have just happened. Further, modern streaming analytics tools support large data volumes and sophisticated query processing. Instead of thousands or tens of thousands of events per second, a streaming analytics platform can process millions and even tens of millions of events per second. Because data in a streaming analytics environment is processed before it lands in a database, the technology supports much faster decision making than is possible with traditional data analytics technologies. With traditional analytics you gather information, store it and analyze it later. This is called “at-rest analytics.” With streaming technologies the analysis is done as the data arrives.
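To make the contrast between at-rest and streaming analysis concrete, here is a minimal Python sketch. It is an illustration only, not how any particular platform works: the function names and sample readings are hypothetical, and a real deployment would read from a message queue or socket rather than a list.

```python
from typing import Iterable, Iterator, List


def at_rest_average(stored: List[float]) -> float:
    """At-rest style: the data is stored first and analyzed afterwards."""
    return sum(stored) / len(stored)


def streaming_average(events: Iterable[float]) -> Iterator[float]:
    """Streaming style: the result is updated as each event arrives."""
    count = 0
    mean = 0.0
    for value in events:
        count += 1
        # Incremental mean update; no history of past events is kept.
        mean += (value - mean) / count
        yield mean


# Hypothetical sensor readings standing in for a live event stream.
readings = [10.0, 20.0, 30.0, 40.0]
running = list(streaming_average(readings))
```

The streaming version has an up-to-date answer after every event, while the at-rest version can only answer once all of the data has been collected. Both agree on the final value.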
Common use cases for streaming analytics abound. For example, dashboards and visualization software integrated on top of streaming analytics platforms can help enterprises visualize and monitor their business in real time. Such tools can be used to monitor changing customer attitudes through social media sentiment analysis. Similarly, streaming analytics capabilities can be used to enable real-time alerts or to seize new business opportunities, such as making promotional offers to customers based on their geographical location at a specific time. Streaming analytics capabilities are also vital in security monitoring because they give organizations a way to quickly correlate seemingly disparate events to detect threat patterns and evaluate risks. Government agencies have used these capabilities to monitor the security of both network and physical assets.
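The real-time alerting and event correlation described above can be sketched with a sliding time window. The example below is a simplified illustration under stated assumptions: the event schema (a timestamp plus a source identifier), the 60-second window and the threshold of three events are all hypothetical, and production platforms typically offer windowing as a built-in feature.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Iterator, Tuple

# Hypothetical event schema: (timestamp in seconds, source identifier).
Event = Tuple[float, str]


def detect_bursts(
    events: Iterable[Event],
    window_s: float = 60.0,
    threshold: int = 3,
) -> Iterator[Event]:
    """Yield an alert when one source produces `threshold` events within `window_s`."""
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    for ts, source in events:
        times = recent[source]
        times.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while times and ts - times[0] > window_s:
            times.popleft()
        if len(times) >= threshold:
            yield ts, source


# Hypothetical stream, e.g. failed logins tagged by originating host.
stream = [(0.0, "a"), (10.0, "b"), (20.0, "a"), (30.0, "a"), (200.0, "a")]
alerts = list(detect_bursts(stream))
```

Here source "a" triggers an alert at the 30-second mark because three of its events fall inside one window. The later event at 200 seconds does not, since the earlier ones have expired by then.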
Genesis of Streaming Analytics
Streaming analytics technology grew out of demand by enterprises that experienced a strong upward trajectory of data volume, velocity and variety as well as a need to ingest and evaluate this data to quickly make strategic business decisions.
Streaming Analytics Tools
There are many technology options for streaming analytics today and the ecosystem is evolving fast. The big enterprise players in this space include SAP, IBM, Informatica, Software AG, Oracle and TIBCO. Open source streaming analytics projects such as Apache Storm and Spark Streaming have also generated a lot of attention recently. Early adopters of these technologies have included major Internet companies like Twitter, Groupon, Spotify, Yelp, Uber, Pinterest and the Weather Channel, as well as other large organizations including Cisco, Bosch, Rockwell Automation, Schneider Electric, Emory University Hospital, UCLA Department of Neurosurgery and others. There are also pure-play technology vendors such as DataTorrent with Apache Apex, and Cask (formerly Continuuity), which has teamed with AT&T Labs on an open source project called Tigon, a real-time stream processing framework built on top of Hadoop and HBase. StreamAnalytix, another prominent member of this ecosystem, is a streaming analytics platform based on a best-of-breed open source technology stack.
Many of these tools support query processing out of the box and offer relatively easy-to-use, intuitive visual interfaces for running queries.
In addition, many vendors offer hosted streaming analytics services that suit companies with cloud applications. Amazon’s Kinesis, Google’s Cloud Dataflow and Microsoft’s Azure, for instance, all support real-time processing and data analytics. Other options include Apache Samza from LinkedIn and Twitter’s Heron, which might become open source soon.
Further, the leading Hadoop distributions, including MapR, Hortonworks and Cloudera, have all embraced streaming analytics in their architectures.
Over the next few weeks we will explore these streaming analytics topics:
- Streaming Analytics – An Overview
- The Business Value of Streaming to the Enterprise
- Selecting a Streaming Architecture
- Case Studies – How are Enterprises Using Streaming Analytics
- StreamAnalytix by Impetus