The Touchy-Feely Side of Spark

In this special guest feature, Alex Bordei, head of product management at Bigstep, offers 5 examples of how Apache Spark has maximized its user experience – its feel.

Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming

In the talk below, Michael Armbrust gives an overview of some of the exciting new APIs available in Spark 2.0, namely Datasets and Structured Streaming. Together, these APIs are bringing the power of Catalyst, Spark SQL’s query optimizer, to all users of Spark.

Big Data Events by MapR Coming to So Cal

MapR, the company behind the Converged Data Platform, is hosting two timely big data events in Southern California. If you find yourself in the Los Angeles area, specifically Newport Beach in Orange County or Santa Monica, please consider registering now.

IBM Unleashes the Power of Machine Learning with Watson-enabled Data Platform

IBM (NYSE:IBM) announced IBM Watson Data Platform to help companies gain more valuable insights from data. The platform delivers the world’s fastest data ingestion engine and cognitive-powered decision-making to data professionals, allowing them to collaborate in the IBM Cloud, with the services they prefer. IBM is also making IBM Watson Machine Learning Service available – making machine learning simple with an intuitive, self-service interface.

Databricks Sets New World Record for CloudSort Benchmark Using Apache Spark at $1.44 Per Terabyte

Databricks®, the company founded by the team that created the popular Apache® Spark™ project, announced that, in collaboration with industry partners, it has broken the world record in the CloudSort Benchmark, a third-party industry benchmarking competition for processing large datasets.

New Business Intelligence Performance Benchmark Reveals Strong Innovation Amongst Open-Source Projects

AtScale, the company providing business users with speed, security and simplicity for BI on Hadoop, released the results of its reference performance study: The Business Intelligence Benchmark for SQL-on-Hadoop engines.

Interview: Mike Perez, Vice President of Services at Kinetica

I recently caught up with Mike Perez, Vice President of Services at Kinetica, to talk about GPU-accelerated databases and discuss how Kinetica’s new Install Accelerator and Application Accelerator programs are helping customers quickly integrate Kinetica into their environments.

Singapore Startup Develops Ultra-Fast Big Data Unification Technology by Avoiding MapReduce

Singapore-based startup Percipient has developed a way to sidestep a common Hadoop big data function, thereby cutting query processing time by a factor of more than 15.

3 Reasons In-Cluster Analytics is a Big Deal

In this special technology white paper, 3 Reasons In-Cluster Analytics is a Big Deal, you’ll learn about how recent technology advances within the Apache Hadoop ecosystem have provided a big boost to Hadoop’s viability as an analytics environment—above and beyond just being a good place to store data.

Apache Spark Survey Reveals Increased Growth in Users and New Workloads Including Exploratory Data Science and Machine Learning

In order to better understand Apache Spark’s growing role in big data, Taneja Group conducted a major market research project, surveying approximately 7,000 people. The sample was made up of technical and managerial job roles from around the world directly involved in big data.