Hadoop, Spark or Both?

In this contributed article, tech writer Blake Davies asks the question: Spark or Hadoop? This question has recently sparked various discussions throughout online communities. Even though the two work on different principles, they can be applied in the same way for various uses. While Hadoop is a household name in the world of big data processing, Spark is still building a name for itself and it’s doing so with “style”.

Pepperdata® Code Analyzer for Apache Spark Highlights Performance Bottlenecks for Developers

Pepperdata, the DevOps for Big Data company, announced Pepperdata Code Analyzer for Apache Spark, which provides Spark application developers the ability to identify performance issues and connect them to particular blocks of code within an application. Code Analyzer is a new product that follows on the heels of Pepperdata Application Profiler, which provides Hadoop and Spark developers with actionable recommendations for improving job performance.

MapR Releases New Ecosystem Pack with Optimized Security and Performance for Apache Spark

MapR Technologies, Inc., the provider of the Converged Data Platform that converges the essential data management and application processing technologies on a single, horizontally scalable platform, announced its next major release of the MapR Ecosystem Pack (MEP) program. MEP is a broad set of open source ecosystem projects that enable big data applications running on the MapR Converged Data Platform with inter-project compatibility.

Databricks Launches New Edition of Its Spark-Based Cloud Platform for Data Engineers

Databricks, the company founded by the creators of the popular Apache Spark project and provider of the leading Spark-based cloud platform for data science, announced an edition of its cloud platform optimized specifically for data engineering workloads, called Databricks for Data Engineering.

Impetus Technologies Announces StreamAnalytix 3.0 Featuring Support for Apache Spark-Based Batch Processing

Impetus Technologies, a big data thought leader and software solutions company, announced StreamAnalytix™ 3.0 featuring support for Apache Spark-based batch processing and enriched online and offline machine learning features, helping enterprises maximize the performance of their analytical models and achieve the most favorable business outcomes. The newest version adds to the stream processing capabilities driven […]

Data as a Critical Element in the Discovery and Delivery of Smart Energy

In this contributed article, Jules S. Damji, an Apache Spark Community Evangelist with Databricks, shows how, as the value of data continues to grow, the next-generation smart grid should become a reality, benefiting utility companies and consumers alike.

Pepperdata Integrates Performance into DevOps for Big Data

Pepperdata, the Big Data performance company, announced it is expanding its product portfolio with Pepperdata Application Profiler, providing Hadoop and Spark developers with easy-to-understand recommendations for improving job performance. Application Profiler is currently available in early access and will be generally available in the second quarter of 2017.

EnterpriseDB Announces New Apache Spark Connector to Speed Postgres Big Data Processing

EnterpriseDB® (EDB™), the database platform company for digital business, announced the general availability of a new version of the EDB Postgres Data Adapter for Hadoop with compatibility for the Apache Spark cluster computing framework. The new version gives organizations the ability to combine analytic workloads based on the Hadoop Distributed File System (HDFS) with operational data in Postgres, using an Apache Spark interface.

Percipient Launches SparkPLUS to Solve Apache Spark’s Out-of-memory Problems

Percipient, a Singapore-based startup, is launching a revolutionary solution to address the memory issues incurred by users of the open source platform Apache Spark. By delivering unified data a priori to the Spark platform, Percipient’s SparkPLUS solution is able to multiply the platform’s computing space, thereby greatly enhancing its utility for real-time and analytical applications.

Monte Carlo Simulations in Ad-Lift Measurement Using Spark

In this talk from Spark Summit East 2016, Prasad Chalasani explores some of the challenges that arise in setting up scalable simulations in a specific application, and shares some solutions and lessons learned along the way, in the realms of mathematics and programming.