Achieving the best outcomes from data analysis, whatever the data's provenance (databases, systems, storage, etc.), matters enormously in a world swimming in data. Denodo uses the technology known as data virtualization to get speedy and insightful results from data regardless of its origin. We sat down with Suresh Chandrasekaran, Senior Vice President at Denodo, to get a much better understanding of why his company focuses 100% on this solution.
insideBIGDATA: Why does the world need data virtualization and what does the Denodo Platform offer?
Suresh Chandrasekaran: I could argue that everything that the world needs, except for love and kindness, can be made better with data. Every cool innovation, in technology or in business – be it in the area of internet of things, hyper-connectedness, real-time and predictive analytics or customer intimacy – is driven by better use of increasingly complex and disparate data. Simply put, data virtualization is the shortest path between the need for data and its fulfillment. Unlike other integration technologies, it lets the data live wherever it already is but provides integrated access to the users, speeding up time to solution. It lets business leaders, data scientists, innovators, operational decision makers, you and me, “do our thing” to achieve better outcomes; all without having to worry about how to access, store, convert, integrate, copy, move, secure, and distribute data upfront and on an ongoing basis. So everyone needs data virtualization, and Denodo is both the leader in data virtualization and also the only company in the world that doesn’t just dabble in it, but makes it 100 percent of its focus.
The benefits of data virtualization in access, time, cost and flexibility are well documented. A video on our website describes how our customer, The Climate Corporation, integrates massive big data analytics which consists of weather sensor data and crop yield models with insurance sales and marketing systems in the cloud using data virtualization. It also describes how they were able to do 3x more using only a third of the staff while providing a flexible architecture for meeting changing future needs of the business using data virtualization. And that’s tremendous. Big and small organizations – from AAA to Vodafone, Biogen Idec to Nationwide, US Dept of Energy to National Institutes of Health – have gone public with their stories of how data virtualization helps their mission.
insideBIGDATA: Can you please describe Denodo’s solutions?
Suresh Chandrasekaran: The Denodo Platform delivers the capability to access any kind of data from anywhere it lives without necessarily moving it to a central location like a data warehouse. It then exposes that data to various users and analytical/business applications as virtual data services in a way that is meaningful to the users, in real-time, with high performance, using caching and minimal data movement only as needed. That is data virtualization in a nutshell. Imagine an hour-glass architecture where at the bottom you have many sources and at the top many users, but a thin, lightweight layer in the middle to efficiently bridge data access, integration and delivery, while minimizing replication and cost. Data virtualization has been compared to earlier forms of data federation. It does that as well as real-time distributed query optimization but it also does much more. It provides abstraction and decoupling between physical systems/location and logical data needs. It includes tools for semantic integration of structured to highly unstructured data. It enables intelligent caching and selective persistence to balance source and application performance. It layers in security, governance and data services delivery capabilities that may not be available or a good match between original sources and intended new applications.
While data virtualization brings a lot of capability, it is very important to recognize that it is not a direct replacement for either traditional replication-based integration tools (ETL, ELT, Data Replication, etc.) or message-driven integration hubs (ESB, Cloud and Web Services hubs). Often, data virtualization is used either between sources and these integration tools or between them and applications to provide faster time-to-solution and more flexibility. However, for rapid prototyping and for some use cases it can also eliminate the need for heavier integration altogether.
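To make the hour-glass picture above concrete, here is a minimal Python sketch of a virtual view: two sources stay where they are, a thin layer joins them at query time, and an optional short-lived cache keeps repeated queries from hitting the sources. This is an illustration of the general idea only, not Denodo's API or implementation; `VirtualView`, `crm_customers`, `billing_balances` and the sample rows are all hypothetical names and data invented for the example.

```python
import time
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
# Hypothetical source type: any callable that fetches rows from wherever they live.
Source = Callable[[], List[Row]]


class VirtualView:
    """Joins two sources at query time instead of copying them into a warehouse."""

    def __init__(self, left: Source, right: Source, key: str, ttl_seconds: float = 0.0):
        self.left, self.right, self.key = left, right, key
        self.ttl_seconds = ttl_seconds  # 0 disables caching
        self._cache: Optional[List[Row]] = None
        self._cached_at = 0.0

    def _joined(self) -> List[Row]:
        # Serve from the cache while it is fresh, so sources are not hit on every query.
        age = time.monotonic() - self._cached_at
        if self._cache is not None and age < self.ttl_seconds:
            return self._cache
        right_by_key = {row[self.key]: row for row in self.right()}
        joined = [
            {**row, **right_by_key[row[self.key]]}
            for row in self.left()
            if row[self.key] in right_by_key
        ]
        self._cache, self._cached_at = joined, time.monotonic()
        return joined

    def query(self, **filters: object) -> List[Row]:
        """Return joined rows whose fields equal the given filter values."""
        return [
            row for row in self._joined()
            if all(row.get(name) == value for name, value in filters.items())
        ]


# Stand-in sources; in practice these might be a database, a web service, a Hadoop table.
def crm_customers() -> List[Row]:
    return [{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]


def billing_balances() -> List[Row]:
    return [{"customer_id": 1, "balance": 250.0}, {"customer_id": 2, "balance": 0.0}]


customer_360 = VirtualView(crm_customers, billing_balances, key="customer_id", ttl_seconds=30)
print(customer_360.query(name="Acme"))
```

The consumer asks `customer_360` a question without knowing that the answer spans two systems, which is the abstraction and decoupling described above. A real platform would also push filters down to the sources and optimize the distributed query; this sketch deliberately leaves that out.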
insideBIGDATA: What sets Denodo apart from other providers?
Suresh Chandrasekaran: Three things set Denodo apart: Performance, Broad Spectrum and Ease-of-Use. These points are inter-related because they emanate from the same fact that Denodo is focused on data virtualization and its broadest application in hundreds if not thousands of use cases from world-class customers.
- Performance – Just as a vehicle’s performance is not measured purely on straight-line runs on the Bonneville Salt Flats, data virtualization performance depends on both platform prowess and knowledge of best practice scenarios. Nationwide Insurance posted their DV results in a presentation and said “Performance was met or exceeded”. IBM’s co-sell partnership with Denodo came after extensively testing our data virtualization solution with IBM PureData System for Analytics and IBM BigInsights. To meet the high expectations of these world-leading companies, Denodo provides not only advanced features and functionality in the product but also a library of solutions and best practices delivered through our training and services offerings to help the customer reach superb results for their specific scenario.
- Broad Spectrum – You can get a taste of data virtualization, or more correctly data federation, from other tools in the market: some embedded in BI tools for data blending, some as an add-on to an ETL platform for prototyping use, and some from network or cloud services vendors. Denodo customers, on the other hand, chose us because even though they planned to start using data virtualization for Agile BI or Logical Data Warehousing or Big Data Analytics infrastructure, they anticipated future use for web and unstructured data, or for provisioning mobile and cloud applications with agile RESTful data services. Put simply, they want a flexible information architecture for the long term. Denodo offers a breadth of features and experience that few can match.
- Ease-of-Use – Integration is not easy, but the out-of-box experience and success in initial projects is key to an organization making wider use of data virtualization. So we deliver a single integrated data virtualization platform that includes all the necessary components. This approach delivers the right mix of simplicity without sacrificing deeper, sometimes hidden functionality for the different roles that may use it such as a business-savvy analyst, modeller, developer, administrator, and so on. To supplement that we offer an extensive array of self-driven online tutorials, training courses and professional services programs.
insideBIGDATA: Data Virtualization has gained a ton of momentum over the years. What does this mean for Denodo?
Suresh Chandrasekaran: I agree. Whether you call it the tipping point or mainstream adoption or ‘being in the zone’ – it is a great time to be doing data virtualization and/or helping others to do it. What it means for us is not only scaling our own organization and efforts as the leading vendor of data virtualization, but enabling our customers, partners and the community at large to do so as well. We have developed a lot of expertise on “Data Virtualization Patterns” which we are using to enable them to drive successful DV adoption in different areas. These patterns are evolving in waves. For example, Agile BI, Information as a Service (Self-Service BI), and Logical Data Warehousing originally set off the trend and are becoming mainstream use cases of data virtualization. Gartner, Forrester, IDC, TDWI and several leading consultants are actively encouraging the use of data virtualization today. Big data and real-time analytics are now fanning the flames. Although the initial focus of Big Data has been on specific storage/processing platforms like Hadoop, a true Big Data Analytics Platform Architecture has to have data virtualization to efficiently take in all the disparate real-time, unstructured and streaming data and disseminate actionable analytics to the right applications or users in real-time in a hybrid analytical environment. The other two major patterns that are boosting data virtualization are the provisioning of data services for agile application development (mobile and cloud) and better use of unstructured Web, public data.gov and social data in the enterprise.
insideBIGDATA: What is your relationship like with the Big Data/Hadoop community and what does this mean technologically for you?
Suresh Chandrasekaran: Denodo is active in the evolution of Big Data/Hadoop and the broader NoSQL ecosystem and is providing the tools to enable wider adoption of these data management platforms. Hadoop is great for many new use cases and yet it is just another data management platform, and we will see many more as new data types or business needs emerge to expose gaps in current platforms. This proves the need for virtualization: new platforms come along for good reasons, and you must unify the data across all platforms and make it easier to access for the broader masses.
Working with the Hadoop community, Denodo has developed both native and SQL-ized access to Hadoop distributions including IBM BigInsights, Cloudera, Hortonworks, MapR and Amazon S3/EC2. As our recent IBM partnership, use cases from companies such as The Climate Corporation and Wolters-Kluwer, and our latest webinar on Big Data virtualization patterns demonstrate, data virtualization can accelerate the value and insights of Big Data/Hadoop analytics in many ways. So when we work with these companies we are focused not just on better and higher performing ways to connect to Hadoop (Hive, HBase, Impala, HDFS, etc.) but also on the overall solution architecture of Big Data.
While Hadoop is gaining traction very fast, some other SQL and NoSQL technologies are also going to grow in use, including integrated and in-memory RDBMS systems, graph, columnar, and document databases, high performing data warehouse appliances, streaming analytics and web services. In this respect, it is noteworthy that of all the data virtualization vendors, Denodo alone uses an “Extended Relational Model” at its core which is more friendly to hierarchical, semantic and unstructured data formats even while providing a Relational or SQL friendly modeling paradigm. This enables Denodo customers to be more flexible in their choice of the right tool for the job and evolve a hybrid mix of data and analytics technologies that seamlessly appears as a unified data and analytics framework. While SQL-ification of Big Data/Hadoop will continue, that is not always the best way to use Hadoop. So the flexibility you get with Denodo is key.
insideBIGDATA: In some ways these technologies (Big Data/Hadoop & DV) are in the nascent stages. What does the future hold for your company and the industry as a whole?
Suresh Chandrasekaran: Hadoop is a nascent technology growing very fast. Data virtualization could be described as a resurgent technology that is growing fast … its original principles are time-tested while the implementation is new and smarter. In fact, the two complement each other very well. What I mean by resurgent is that data federation has been around for a while but had its share of problems and jaded users, because it was a good idea with bad implementation. Data virtualization has risen from those ashes to be functionally more than data federation, but also better and smarter in its implementation, taking advantage of advances in database, semantic, networking, memory and computing technologies. Once people recognize this difference in a modern data virtualization platform like Denodo, they want to use it more often and are amazed at what it can do for them.
Considering we’re still early in the possibilities for Big Data and data virtualization, the future is very exciting. What this means for us is more partnerships and solution/pattern development and education to guide customers on the best uses of data virtualization in both analytics and other contexts.
But I also want to provide a word of caution. Big Data/Hadoop projects today have an intense focus on the storage/processing technologies and specific ones at that. What is important to consider upfront is the need to integrate both the source data feeding these systems and the resulting analytics from them with the rest of the enterprise business intelligence and applications ecosystem, while allowing for maximum flexibility. The best way to do that is to ensure that data virtualization is a key component of your overall Big Data strategy and architecture.