Data Science 101: Real-time Analytics using Cassandra, Spark and Shark

In the video below, Evan Chan (Software Engineer at Ooyala) describes his experience using the Spark and Shark frameworks to run real-time queries on top of Cassandra data. He starts by surveying the Cassandra analytics landscape, including Hadoop and Hive, and touches on the use of custom input formats to extract data from Cassandra. He then dives into Spark and Shark, two memory-based cluster computing frameworks, and explains how they enable often dramatic improvements in query speed and productivity. This talk was given at Cassandra Day Silicon Valley 2014. A companion SlideShare presentation is also available.

 
