The inside Spark channel is a resource for professionals looking to learn about the benefits of Apache Spark.

Big Data Analytics Receive a “Spark” In the Arm

In this special guest feature, Anand Venugopal, head of StreamAnalytix at Impetus Technologies, discusses real-time streaming analytics applications and how companies can use Apache Spark for data processing and analytics functionality. Real-time data and analytics processes are the central nervous system of today’s enterprise, which makes it no surprise that the global revenue in the business intelligence (BI) and analytics software market is forecast to reach $22.8 billion by the end of 2020.

Big Data, Hadoop & Cloud: Tackling a Chain of Emerging Challenges

In this special guest feature, Chandra Ambadipudi, CEO of Clairvoyant, provides a compelling tour de force through the recent history of the big data industry and how Hadoop and the cloud have made steady acceleration possible. Also offered are recommendations for how to address several challenges faced by enterprises with respect to big data cloud implementations.

Top 5 Mistakes When Writing Spark Applications

In the presentation below from Spark Summit 2016, Mark Grover goes over the top 5 things that he’s seen in the field that prevent people from getting the most out of their Spark clusters. When some of these issues are addressed, it is not uncommon to see the same job running 10x or 100x faster with the same clusters, the same data, just a different approach.

The Data Scientist’s Guide to Apache Spark

Looking to dive deeper into the more cutting-edge machine learning use cases in Apache Spark? To successfully use Spark’s advanced analytics capabilities, including large-scale machine learning and graph analysis, check out The Data Scientist’s Guide to Apache Spark, from our friends over at Databricks.

Databricks Launches Delta To Combine the Best of Data Lakes, Data Warehouses and Streaming Systems

Databricks, provider of the leading Unified Analytics Platform and founded by the team who created Apache Spark™, announced Databricks Delta, the first unified data management system that provides the scale and cost-efficiency of a data lake, the query performance of a data warehouse, and the low latency of a streaming ingest system. Databricks Delta, a […]

Apache Spark Expands With Cypher, Neo4j’s ‘SQL For Graphs,’ Adds Declarative Graph Querying

Neo4j, a leader in connected data, announced that it has released the preview version of the Cypher for Apache Spark (CAPS) language toolkit. This combination allows big data analysts to incorporate graphs and graph algorithms in their work, which will dramatically broaden how they reveal connections in their data.

Impetus Technologies Delivers Visual Spark Studio – A New, Free Development Tool to Accelerate Spark Adoption in Enterprises

Impetus Technologies, a big data software products and services company, announced the immediate availability of Visual Spark Studio™, a new standalone tool aimed at addressing the increasing demand for Spark-based analytic and data processing solutions in enterprises.

Databricks Secures $140 Million to Accelerate Analytics and Artificial Intelligence in the Enterprise

Databricks, provider of the leading Unified Analytics Platform and founded by the team who created Apache Spark™, announced it has secured $140 million in a Series D funding round led by Andreessen Horowitz. New Enterprise Associates and Battery Ventures also participated.

Interview: Ash Munshi, CEO at Pepperdata

I recently caught up with Ash Munshi, CEO at Pepperdata, to get a rundown on his company, a sense of how big data and DevOps are related, some highlights on new product offerings, and his view of where Pepperdata is headed in the future.

IBM Combines All-Flash and Storage Software Optimized for Hortonworks

IBM (NYSE: IBM) announced a new all-flash, high-performance data and file management solution for enterprise clients running exabyte-scale big data analytics, cognitive and AI applications. The combined flash and storage software solution has been certified with the Hortonworks Data Platform (HDP) to provide clients with more choice in selecting the right platform for their big data analytics on data processing engines like Hadoop and Spark.