The inside Spark channel is a resource for professionals looking to learn about the benefits of Apache Spark.

Data as a Critical Element in the Discovery and Delivery of Smart Energy

In this contributed article, Jules S. Damji, an Apache Spark Community Evangelist with Databricks, shows how, as the value of data continues to grow, the next-generation smart grid should become a reality, benefiting utility companies and consumers alike.

Pepperdata Integrates Performance into DevOps for Big Data

Pepperdata, the Big Data performance company, announced it is expanding its product portfolio with Pepperdata Application Profiler, providing Hadoop and Spark developers with easy-to-understand recommendations for improving job performance. Application Profiler is currently available in early access and will be generally available in the second quarter of 2017.

EnterpriseDB Announces New Apache Spark Connector to Speed Postgres Big Data Processing

EnterpriseDB® (EDB™), the database platform company for digital business, announced the general availability of a new version of the EDB Postgres Data Adapter for Hadoop with compatibility for the Apache Spark cluster computing framework. The new version gives organizations the ability to combine analytic workloads based on the Hadoop Distributed File System (HDFS) with operational data in Postgres, using an Apache Spark interface.

The Leaky Pipeline Problem - Making Your Mark as a Woman in Big Data

insideBIGDATA was on hand for the recent Spark Summit East 2017 conference in Boston, and one of the more compelling presentations was by Kavitha Mariappan, VP Marketing at Databricks. The talk focused on the premise that despite the tremendous growth and opportunities in big data today, women still play a small role in this arena.

Percipient Launches SparkPLUS to Solve Apache Spark’s Out-of-memory Problems

Percipient, a Singapore-based startup, is launching a solution to address the memory issues encountered by users of the open source platform Apache Spark. By delivering unified data a priori to the Spark platform, Percipient’s SparkPLUS solution is able to multiply the platform’s computing space, thereby greatly enhancing its utility for real-time and analytical applications.

Monte Carlo Simulations in Ad-Lift Measurement Using Spark

In this talk from Spark Summit East 2016, Prasad Chalasani explores some of the challenges that arise in setting up scalable simulations in a specific application, and shares some solutions and lessons learned along the way, in the realms of mathematics and programming.

ODPi Publishes Operations Specification Providing Developers Consistency Across Application Management Tools

ODPi, a nonprofit organization accelerating the open ecosystem of big data solutions, announced the availability of ODPi 2.0, which includes the first release of the ODPi Operations Specification and the Runtime Specification 2.0, to standardize the development model for big data solution and application providers and help enterprises improve installation and management of Hadoop-based applications.

The Touchy-Feely Side of Spark

In this special guest feature, Alex Bordei, head of product management at Bigstep, offers 5 examples of how Apache Spark has maximized its user experience – its feel.

Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming

In the talk below, Michael Armbrust gives an overview of some of the exciting new APIs available in Spark 2.0, namely Datasets and Structured Streaming. Together, these APIs are bringing the power of Catalyst, Spark SQL’s query optimizer, to all users of Spark.

IBM Unleashes the Power of Machine Learning with Watson-enabled Data Platform

IBM (NYSE:IBM) announced IBM Watson Data Platform to help companies gain more valuable insights from data. The platform delivers what IBM describes as the world’s fastest data ingestion engine and cognitive-powered decision-making to data professionals, allowing them to collaborate in the IBM Cloud with the services they prefer. IBM is also making IBM Watson Machine Learning Service available, which aims to make machine learning simple through an intuitive, self-service interface.