Continuum Analytics Brings Serious Analytics to Hadoop

Continuum Analytics, the creator and driving force behind Anaconda, the leading modern open source analytics platform powered by Python, announced advancements in Anaconda bringing high performance advanced analytics to Hadoop. In addition to providing Python and R packages alongside Hadoop clusters, Anaconda will include a distributed processing framework for Hadoop that interacts directly with HDFS and YARN. With Anaconda’s new capabilities inside Hadoop, data scientists can finally achieve lightning fast processing of computationally intensive machine learning analytics to realize the full value of their Big Data.

Anaconda, already known for high performance Python, is now delivering high performance for Hadoop. Anaconda is leading the Open Data Science movement, opening up Hadoop through a Python gateway that interacts directly with YARN and HDFS. This allows all Anaconda functionality, including other high performance analytics based on R and MPI, to work with the Hadoop ecosystem. It also bridges the gap between High Performance Computing (HPC) and Big Data to help enterprises unlock the value of the data tied up in their Hadoop clusters.

“There’s so much potential in Hadoop, yet enterprise customers still struggle to unlock all of the computing power in their clusters even with the latest execution engines like Spark. Enterprise customers demand flexibility, high performance and efficient use of memory to scale up their Big Data workloads, especially for heavy duty machine learning. Continuum Analytics is helping enterprises extract value from Big Data,” said Peter Wang, co-founder and CTO of Continuum Analytics. “Anaconda empowers enterprises to get high performance and interactive analytics not only by leveraging the Open Data Science ecosystem–including Python and R–but also by leveraging investments in HPC using MPI. This is breakthrough technology for organizations that want to get high value and high impact data science solutions from Hadoop.”

Anaconda provides high performance computing professionals with a bridge into the Big Data and Hadoop ecosystem using the integrating power of Python. Many HPC professionals have resisted the move to Hadoop because of the lack of proven high performance analytics available on Hadoop. Now these seasoned professionals can run their existing MPI-based advanced analytics against Hadoop data stores. Similarly, by using the same HDFS and YARN bridge, the vast R community will be able to realize performance gains using parallel R analytics in Hadoop. For enterprises concerned with authentication to their Hadoop data, Anaconda will also include single sign-on (SSO) via Kerberos.

The new capabilities in Anaconda will be generally available by April of 2016.

