In this special guest feature, Dave Wang of Databricks enumerates the main reasons that data analytics in the cloud is becoming a top priority for enterprises in 2015. Dave Wang is Senior Product Marketing Manager at Databricks, a company founded by the creators of Apache Spark that aims to help clients with cloud-based big data processing using Spark. Prior to Databricks, he worked with high-tech, retail, and banking clients at McKinsey & Company. He also held senior software engineering positions at Northrop Grumman and Arch Rock Corporation (acquired by Cisco in 2010). He received his MBA from MIT Sloan and his MS and BS in Electrical and Computer Engineering from Carnegie Mellon University.
Countless organizations want to build a distinctive competitive advantage through data analytics. Nevertheless, they have struggled in this pursuit because many of the technologies currently available still involve antiquated software, cumbersome systems, disparate solutions, and complex integrations, not to mention the staggering costs of infrastructure hardware and the personnel needed to support it.
The cloud is rapidly transforming the ways in which organizations deploy and consume data analytics, just as it has transformed many other categories of technology. According to analyst firm IDC, “over the next five years spending on cloud-based big data and analytics solutions will grow three times faster than spending for on-premise solutions.” Here are the main reasons that data analytics in the cloud is becoming a top priority for enterprises in 2015:
Cloud provides a more flexible deployment model for powerful open source software
Open source platforms such as Apache Spark are precipitating the transition by providing simpler and faster data processing capabilities. While these open source platforms are extremely powerful, deploying them on-premise remains difficult because of the high cost of hardware, long lead times, and the large investment required in supporting personnel.
The cloud will change this paradigm, allowing organizations to immediately harness the power of open source software without upfront investment and to pay only for the resources they consume. This will lower the barrier to starting data analytics projects, allow organizations to run more experiments, and ultimately yield more insights from data.
Cloud makes analytical tools simpler to learn and easier to use
One major challenge with traditional on-premise software is how difficult it is to learn and use: unintuitive UIs, fragmented sources of documentation, and no help available outside of heavyweight professional services (an army of people wearing blue shirts). To top it all off, feedback from customers takes a long time to show up in the product because the release cycles are so long.
Cloud-based software sets a new standard for usability. In stark contrast to their on-premise counterparts, cloud-based analytical tools can be drastically simpler to learn and easier to use. Browser-based UIs are graphical and intuitive, documentation can be dynamically linked to the most current online resources, and innovative new channels of customer support become possible, such as allowing support engineers to troubleshoot a customer problem remotely. Most of all, cloud-based tools have short release cycles, on the order of weeks instead of months, which helps them keep pace with changing customer needs.
Cloud enables experts to tackle hard analytics problems through collaboration
The hardest analytical problems require technical, mathematical, and domain expertise to solve. Case in point: detecting fraud in an online marketplace requires domain experts to define patterns of abnormal behavior, mathematicians to build algorithms that detect the behavior, and developers to implement the algorithms. Rarely would one find all these capabilities in a single individual. Therefore, effective analytical problem solving requires a platform that facilitates collaboration and enforces teamwork best practices, such as documenting work and sharing insights.
Cloud-based analytics platforms are inherently multi-user and foster close collaboration throughout an organization. When combined with additional capabilities such as an interactive notebook environment and data visualization, a cloud-based analytics platform is a natural place to bring together experts with different specialties to tackle hard business problems.
Growing ecosystem of cloud-native business applications needs a centralized platform for analysis
New applications being developed today are, more often than not, being built in the cloud. As the ecosystem of these cloud-based applications continues to grow, the amount of data being generated in the cloud will also grow, and being able to extract insights from cloud-native data will become increasingly critical to every business.
Cloud-based analytics platforms are also a natural fit for processing the data being generated in the cloud. Because many cloud-based applications are already built with connectors to each other (such as CRM and marketing automation tools), a cloud-based analytics platform can easily become a centralized location for analyzing data from different cloud-based data sources as well.
Cloud is the best place to effectively deploy an entire data pipeline
Organizations need to build an entire data pipeline to extract maximum value from their data. That is, they must first ingest, transform, and explore the data, and then use the insights gleaned from those steps to build complex models, develop new products, and deploy these products to customers. Doing so is exceedingly difficult on-premise because of the number of disparate capabilities one must integrate.
The cloud, on the other hand, is a good place to integrate these capabilities in a single unified environment. A cloud-based platform allows an organization to connect to a wide variety of data sources, gain better productivity with user-friendly tools, collaborate more effectively, and serve data products to a broad audience through the cloud. Being able to easily deploy an entire data pipeline, especially the last crucial step of serving data products, is the best reason cloud-based data analytics will take center stage in 2015.