In this special technology white paper, “3 Reasons In-Cluster Analytics is a Big Deal,” you’ll learn how recent advances within the Apache Hadoop ecosystem have boosted Hadoop’s viability as an analytics environment, above and beyond being a good place to store data. Specifically, you’ll see how the “Looker on Hadoop” solution leverages these advances to fully support SQL on Hadoop as the best way to access big data, making in-cluster analytics of data in Hadoop a reality.
Say you have a lot of data sitting in a Hadoop cluster and you need to analyze it. How do you go about that? This white paper identifies two approaches: exporting the data into a relational analytical engine, or analyzing it directly within the Hadoop cluster using one of the several available SQL-on-Hadoop technologies.
You’ll also see a brief overview of a series of major improvements in the leading SQL-on-Hadoop technologies: Hive, Impala, Spark and Presto. The white paper presents Looker as building on the improvements these SQL-on-Hadoop technologies have achieved in recent months to deliver the future of enterprise analytics. Looker enables users to analyze and model all their Hadoop data in-cluster, putting the “big” into big data analytics without sacrificing speed or ease of use.
The white paper includes the following sections:
- Two approaches to analyzing data in a Hadoop cluster
- Four major solutions: Hive, Impala, Spark and Presto
- Delivering on the promise of big data