Three Reasons In-Cluster Analytics is a BIG DEAL


Recent technology advances within the Apache Hadoop ecosystem have provided a big boost to Hadoop’s viability as an in-cluster analytics environment, above and beyond just being a good place to store data. Leveraging these advances, Looker can now fully support SQL on Hadoop as the best way to access big data, making in-cluster analytics of data in Hadoop a reality. This is a big deal: it meets a huge demand, it shows how rapidly the technologies have evolved, and it delivers on one of the most significant unmet promises of big data analytics.

Say you have a lot of data that you need to analyze, sitting in a Hadoop cluster.

The First Approach has been something of a default for many organizations. You use the Hadoop environment for collecting and transforming data, and then export that data into a relational analytical database (for example, Redshift or Vertica) for the actual analysis. So you get the advantages of high-speed, powerful analysis, but only after you’ve moved the data out of Hadoop and into a more familiar environment. This approach is more complicated than it needs to be. It’s also pricey, and getting more so as data volumes grow.

The Second Approach is more elegant. You analyze the data directly within the Hadoop cluster using one of the several available SQL-on-Hadoop technologies, eliminating the need to move the data into a separate database. Unfortunately, it hasn’t been easy to get that to work. Until recently, these tools have not provided the speed or the depth of capability that most organizations need.
