The Touchy-Feely Side of Spark

In this special guest feature, Alex Bordei, head of product management at Bigstep, offers 5 examples of how Apache Spark has maximized its user experience – its feel.

Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming

In the talk below, Michael Armbrust gives an overview of some of the exciting new APIs available in Spark 2.0, namely Datasets and Structured Streaming. Together, these APIs are bringing the power of Catalyst, Spark SQL’s query optimizer, to all users of Spark.

Building a Business Case for Data Quality

The infographic below was developed by Experian Data Quality as a byproduct of its recent survey of 402 management-level professionals. The infographic covers how managers feel about data. Data is your organization’s most valuable asset, and having good data quality is necessary for sustained success.

The GridGain In-Memory Data Grid

In this special technology white paper, The GridGain In-Memory Data Grid, you’ll learn that with the cost of system memory dropping 30% every 12 months, in-memory computing has become the first choice for a variety of workloads across all industries. In-memory computing can provide a lower TCO for data processing systems while providing an unparalleled performance advantage.

Databricks Sets New World Record for CloudSort Benchmark Using Apache Spark at $1.44 Per Terabyte

Databricks®, the company founded by the team that created the popular Apache® Spark™ project, announced that, in collaboration with industry partners, it has broken the world record in the CloudSort Benchmark, a third-party industry benchmarking competition for processing large datasets.

New Business Intelligence Performance Benchmark Reveals Strong Innovation Amongst Open-Source Projects

AtScale, the company providing business users with speed, security and simplicity for BI on Hadoop, released the results of its reference performance study: The Business Intelligence Benchmark for SQL-on-Hadoop engines.

Performance Optimization of Deep Learning Frameworks on Modern Intel Architectures

In this video from the Intel HPC Developer Conference, Elmoustapha Ould-ahmed-vall from Intel describes how the company is doubling down on optimizing machine learning frameworks for Intel platforms. Using open source frameworks as a starting point, surprising speedups are possible using Intel technologies.

Apache Spark Survey Reveals Increased Growth in Users and New Workloads Including Exploratory Data Science and Machine Learning

In order to better understand Apache Spark’s growing role in big data, Taneja Group conducted a major market research project, surveying approximately 7,000 people. The sample was made up of technical and managerial job roles from around the world directly involved in big data.

Datawatch Brings “Data Socialization” to Self-Service Analytics

Datawatch Corporation (NASDAQ-CM: DWCH) announced its strategic vision and product road map for making “data socialization” a reality for all business users of self-service data preparation.

How Big Data Can Change Your Business and How to Let It

In this contributed article, tech writer Linda Gimmeson takes a look at how big data technology can change your business in fundamental ways as well as steps to jump-start the effort.