Altiscale Increases Performance and Reliability of Hadoop and Spark Platform with Updated Altiscale Data Cloud

Altiscale, Inc., a leading provider of Big Data-as-a-Service, introduced Altiscale Data Cloud 4.0, featuring major upgrades to core Hadoop components, such as HDFS and YARN, and an expanded Spark-as-a-Service offering that supports all major recent versions of Apache Spark. With this release, customers get increased performance, scalability and stability on the Altiscale Data Cloud platform. In addition, Altiscale Data Cloud 4.0 meets the Hadoop ecosystem standards being established by the ODPi, so applications that adhere to the ODPi standard can run on Altiscale or on any Hadoop distribution that meets ODPi specifications.

For the typical enterprise, running Hadoop and Spark is complex and resource intensive, with the challenges only increasing as data volumes expand. By offering a Hadoop and Spark platform in the cloud, with full operations support and elastic scalability, Altiscale ensures that customers can focus on the value they get from Big Data, while liberating them from the hassle of data management.

Spark attracts wide interest for its ability to rapidly process large volumes of data. However, organizations often struggle to get started with Spark. Altiscale solves this problem by ensuring that customers have the support and operations expertise they need to be successful with Spark quickly. Altiscale Data Cloud 4.0 features expert advisory services to help customers establish analytical jobs, full operational support to ensure that workloads complete successfully and a high performance cloud infrastructure that has been specifically built and tuned for fast Big Data processing. The Spark-as-a-Service offering comes with Hadoop YARN for resource management, as well as a full workbench of data science tools.

“We have been using Spark on the Altiscale Data Cloud for the past year and have been really excited about its performance and scalability. We see great promise in Spark, and we also want to ensure that we are fully covered for all of our analytical needs, including MapReduce. At Altiscale, experts are looking out for us to ensure that we get the latest, greatest, production-ready features,” said Satya Ramachandaran, SVP of engineering and managing director, MarketShare. “Altiscale provides a full Big Data solution, including Hadoop, MapReduce and Spark, so we know that we can get all of our jobs done, not just Spark jobs. And since it’s a fully managed service, I never have to worry about scaling, resource contention or underlying software upgrades.”

Altiscale Data Cloud 4.0 also addresses the challenges that organizations face with the rapid advancements in Spark development, which can hinder applications that were built for earlier versions. To help customers absorb the change at their own pace, Altiscale offers full support of all major recent Apache Spark versions (1.5.0, 1.4.1, 1.3.1). Customers will not only have access to the latest features and performance improvements for Spark, but will also be able to run prior versions that may be necessary for already-built analytical applications or data analysis.

“Altiscale is dedicated to providing its customers with a full breadth of production-ready big data analytical options. That’s why we’ve been active in the Spark community from the very beginning,” said Raymie Stata, CEO and founder, Altiscale. “It’s also why Altiscale is ensuring that we support all major recent versions of Spark. Spark is evolving so rapidly that we want to ensure anything our customers rely on for Big Data analytics continues to be there for them.”

Altiscale CTO David Chaiken recently presented “Running Spark and MapReduce in Production” at Hadoop Summit San Jose; the talk is also available as a video.

The latest version of the Altiscale Data Cloud also features the following updated capabilities:

  • Apache Spark 1.5.0, which provides improved performance and stability. It offers enhanced support for Data Science APIs, especially with advances in DataFrames features and improved support for the R Language, making it a compelling release for data scientists.
  • Apache Hadoop 2.7.1, so that customers can utilize YARN’s resource manager to deploy and manage long-running data access applications in Hadoop.
  • Apache Hive 1.2.0, for access to enhanced SQL semantics and major performance improvements.
  • Apache Pig 0.15.0, which now provides the capability to run Hive UDFs inside Pig as well as improved stability for Pig on Tez.

There are also improvements to the workflow manager, Apache Oozie (4.2.0) and to Apache Tez (0.7.0). The latest version of the Altiscale Data Cloud meets the standards being established by the ODPi, which means that any application built to meet ODPi standard specifications can easily run on the Altiscale Data Cloud.

