
Spark MLlib: Making Practical Machine Learning Easy and Scalable


In this talk, Xiangrui Meng of Databricks shares his experience in developing MLlib. The talk covers both the higher-level ML pipeline APIs that make MLlib easy to use and the lower-level optimizations that let MLlib scale to massive data sets.

Why Data Quality without Data Integrity is No Match for Today’s Business Demands

Bobby Koritala

In this special guest feature, Bobby Koritala, Chief Product Officer of Infogix, discusses data management best practices and why data quality without data integrity is no match for today’s business demands.

Book Review: Doing Math with Python


When one of my favorite independent tech book publishers, No Starch Press, notified me about their new title “Doing Math with Python,” I was energized to review what potentially could be a good new resource for budding data scientists.

OpsDataStore Unveils Solution to Ensure Online Service Quality for Dynamic IT Environments


OpsDataStore, the company delivering a solution to improve online service quality across heterogeneous and rapidly changing applications and IT infrastructures, announced the general availability of OpsDataStore 1.0 — the industry’s first big data back end for all IT management data.

Redis: Setting Big Data on Fire


For those unfamiliar with Redis, it is an open source, in-memory data structure server. Redis was originally conceived to solve a problem that required speed and simplicity, but it soon became clear that it had applications far beyond its original intent. Redis has since grown to include many data structures that resolve very complex programming problems with simple commands executed within the data store.

Discovering Alpha Through Automation


In this special guest feature, Dr. Venkat Srinivasan, Chairman and CEO of Rage Frameworks, Inc., outlines how big data can help active investment managers see success in financial markets in pursuit of Alpha.

Apixio Iris Platform Powers Accurate Patient Risk Adjustment for Improved Healthcare


Apixio Inc., the data science company for healthcare, announced that its HCC Profiler solution is improving care delivery and chronic disease management with rich and accurate patient profiles.

Updated WANdisco Fusion Platform Offers Hybrid Cloud, Active Back-up for Enterprises


WANdisco (LSE: WAND), a leading provider of continuous-availability software for global enterprises to meet the challenges of Big Data, announced major updates to its flagship WANdisco Fusion Platform.

Splice Machine Announces Version 2.0 of its RDBMS: A Hybrid In-Memory Architecture Powered by Hadoop and Spark


Splice Machine announced the 2.0 version of its RDBMS, a hybrid in-memory RDBMS powered by Hadoop and Spark. Splice Machine’s version 2.0 delivers a database solution that incorporates the proven scalability of Hadoop, ANSI SQL, ACID transactions, and the in-memory performance of Spark.

Datameer Selected by to Optimize Customer Website Experience with Big Data Analytics


In an effort to improve its overall customer experience,, a leader in online and mobile travel, has selected Datameer’s self-service big data analytics solution to better understand customer behavior, analyze offer effectiveness, and improve operational processes.