I recently caught up with Dr. William Bain, Founder and CEO at ScaleOut Software, to discuss his perspectives on real-time analytics as well as the benefits and challenges for businesses adopting this technology. ScaleOut Software was founded in 2003 by Dr. William L. Bain. Dr. Bain has a Ph.D. (1978) in electrical engineering/parallel computing from Rice University, and he has worked at Bell Labs research, Intel, and Microsoft. He founded and ran three start-up companies prior to joining Microsoft. In the most recent company (Valence Research), he developed a distributed Web load-balancing software solution that was acquired by Microsoft and is now called Network Load Balancing within the Windows Server operating system. Dr. Bain holds several patents in computer architecture and distributed computing. As a member of the screening committee for the Seattle-based Alliance of Angels, Dr. Bain is actively involved in entrepreneurship and the angel community.
Daniel D. Gutierrez – Managing Editor, insideBIGDATA
insideBIGDATA: Everyone is talking about real-time analytics. How does real-time insight into live systems and data streams differ from today’s business intelligence? Are they one and the same?
Dr. William Bain: For more than a decade, big data technologies have powered business intelligence by analyzing petabyte datasets in minutes or hours to help guide strategic decision making. However, in today’s fast-paced, data-driven economy, traditional business intelligence does not fully meet the needs of many enterprises. Spurred by trends such as the Internet of Things and the demand for customer personalization in e-commerce, businesses need to make sense of the flood of pertinent data in real time, take immediate action, and capture perishable opportunities before the moment is lost. Called “operational” intelligence, this important capability opens the door to the next generation of business intelligence.
insideBIGDATA: What competitive advantages can businesses gain from real-time insights into their live data?
Dr. William Bain: Quick responses to changing market conditions are crucial to maintaining a competitive edge. While early real-time analytics applications focused on supplying timely information to human operators, such as helping dispatchers quickly direct utility trucks to downed power lines, this technology has evolved to enable automated decision making, allowing real-time analytics to identify fast-changing trends and provide immediate feedback to live, mission-critical systems without a human in the loop. Businesses that can capture perishable opportunities using real-time analytics enjoy a powerful competitive advantage that can impact the bottom line.
The good news is that real-time analytics can be implemented cost-effectively using today’s commodity servers; there’s no need for special-purpose hardware. This technology can track and respond to live events within tens to hundreds of milliseconds (faster than a human can react), making it suitable for a wide range of applications and opening up exciting possibilities. There are countless examples: helping shoppers select products on web sites, tracking telemetry from patient monitors to identify impending medical events, detecting intrusions within physical or information systems, catching fraudulent financial transactions, and many more.
insideBIGDATA: What determines when real-time insights are beneficial? Can you provide some examples of use cases?
Dr. William Bain: Capturing the value of real-time analytics requires companies to rethink how analytics technology is integrated into their mission-critical systems. While business intelligence is typically implemented using offline systems in the data warehouse, real-time analytics creates operational intelligence within online systems, integrating directly into manufacturing lines, IoT telemetry paths, and e-commerce web sites. This enables companies to immediately reap the benefits of live, actionable data insights.
Running real-time analytics within live systems opens up new scenarios that were previously beyond the reach of business analytics. For example, this technology can bring operational intelligence to the Internet of Things (IoT), helping to manage devices ranging from consumer wearables to manufacturing systems. It can also dramatically enhance the shopping experience for online customers, enabling “in the moment” recommendations and omni-channel personalization; this helps e-commerce companies use fast-changing shopping data and social media to build much deeper, one-to-one relationships.
insideBIGDATA: What challenges do enterprises face moving to real-time analytics models?
Dr. William Bain: Both IT and business managers within enterprises face new challenges when implementing real-time analytics for operational intelligence. For this technology to be effective, it must be able to extract and analyze data from existing, mission-critical systems. This means that real-time analytics must be integrated into these systems and provide useful feedback fast and reliably.
To deliver high value to the business, real-time analytics must embody domain-specific algorithms that incorporate knowledge of specific patterns of interest to the business process. For example, a manufacturing system needs to detect variations in telemetry that indicate a high likelihood of impending machine failure. Likewise, a real-time analytics system tracking a trucking or rental car fleet must understand what driving patterns are unusual enough to signal dispatchers. Implementing these domain-specific algorithms requires intimate knowledge of the business process being optimized.
insideBIGDATA: How can organizations incorporate real-time data analytics into their business? What are the enabling technologies?
Dr. William Bain: Recent innovations in a technology called “in-memory computing” have made it the engine of real-time analytics. This exciting new technology bypasses the limitations of traditional analytics systems used in the data warehouse and makes it possible to track live systems. In particular, this technology can meet real-time requirements while delivering the scalability needed to handle large applications with thousands of data sources generating live data streams. It also incorporates techniques for ensuring its continuous availability within mission-critical systems.
Although some in-memory computing techniques (such as Apache Spark) have been employed in the data warehouse, they have focused primarily on accelerating offline analysis of very large data sets or integrating data streams into batch processing. Because their approach to application design and system availability is tailored to the data warehouse, they are not well suited for live systems. In contrast, a specific in-memory technology called “in-memory data grids” directly targets real-time analytics on live data and enables the creation of powerful applications for generating operational intelligence.
Unlike traditional stream processing platforms, which focus on extracting information from incoming data streams, in-memory data grids enable applications to easily combine data streams with a constantly evolving model of their data sources. This allows these applications to draw on historic data that puts real-time patterns into perspective for deeper introspection, and it results in more effective feedback to live systems. For example, a real-time patient monitoring system can make use of current drug dosages as well as evolving patient history over prior minutes and hours to create a richer context for telemetry and empower better decision-making. In-memory data grids move stream processing and real-time analytics to where the data lives, delivering high value with fast response times and continuous availability.
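The patient-monitoring example above can be sketched in miniature. The following single-process Python stand-in (my illustration, not ScaleOut's API) keeps an in-memory model per data source, keyed by a hypothetical patient ID, and analyzes each incoming reading next to that evolving state rather than in isolation; a real in-memory data grid would partition and replicate this state across servers. The field names, dosage threshold, and alert rule are all assumptions for illustration.

```python
from collections import defaultdict, deque

class PatientState:
    """In-memory model of one data source: a monitored patient."""
    def __init__(self):
        self.dosage_mg = 0.0                 # current drug dosage
        self.heart_rates = deque(maxlen=60)  # recent telemetry history

# Single-process stand-in for a distributed in-memory data grid:
# state lives in memory, keyed by data source, so each event is
# combined with the source's evolving model instead of being
# extracted from the stream alone.
grid = defaultdict(PatientState)

def on_dosage(patient_id, mg):
    """Update the patient's model when a dosage event arrives."""
    grid[patient_id].dosage_mg = mg

def on_heart_rate(patient_id, bpm):
    """Analyze a live reading in the context of the patient's history."""
    state = grid[patient_id]
    state.heart_rates.append(bpm)
    baseline = sum(state.heart_rates) / len(state.heart_rates)
    # Context-dependent rule (illustrative only): an elevated rate is
    # treated as more urgent when a drug dose is already high.
    if bpm > 1.3 * baseline and state.dosage_mg > 100:
        return "alert"
    return "ok"
```

The point of the sketch is the design choice it illustrates: because the model lives next to the stream-processing logic, each reading is judged against dosages and history accumulated over prior minutes and hours, which is what enables richer feedback than stateless stream extraction.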