The State of Data Management – Why Data Warehouse Projects Fail

Based on new research commissioned by SnapLogic and conducted by Vanson Bourne, who surveyed 500 IT Decision Makers (ITDMs) at medium and large enterprises across the US and UK, this whitepaper explores the data management challenges organizations are facing, the vital role data warehouses play, and the road to success.

Interview: Global Technology Leader PNY

This whitepaper download is a reprint of our recent interview with our friends over at PNY, covering a variety of topics affecting data scientists who work on big data problem domains, including how “Big Data” is becoming increasingly accessible through big clusters with disk-based databases, small clusters with in-memory data, single systems with in-CPU-memory data, and single systems with in-GPU-memory data. Answering our inquiries were Bojan Tunguz, Senior System Software Engineer, NVIDIA, and Carl Flygare, NVIDIA Quadro Product Marketing Manager, PNY.

A Comprehensive Guide to Evaluating Customer Data Platforms (CDPs)

This white paper by our friends over at HGS Digital aims to help you evaluate Customer Data Platform (CDP) vendors in several key areas. Investing in a CDP should be done with a long-term aim: the systems from which data is imported and the systems to which data is exported may change over time, but the CDP becomes the master repository of data and should be leveraged by the marketing organization for a very long time.

New Study Details Importance of TCO for HPC Storage Buyers

Total cost of ownership (TCO) is often assumed to be an important consideration for buyers of HPC storage systems. Because TCO is defined differently by HPC users, it’s difficult to make comparisons based on a predefined set of attributes. With this fact in mind, our friends over at Panasas commissioned Hyperion Research to conduct a worldwide study that asked HPC storage buyers about the importance of TCO in general, and about specific TCO components that have been mentioned frequently in the past two years by HPC storage buyers.

Real-Time Analytics from Your Data Lake: Teaching the Elephant to Dance

This whitepaper from Imply Data Inc. introduces Apache Druid and explains why delivering real-time analytics on a data lake is so hard, approaches companies have taken to accelerate their data lakes, and how they leveraged the same technology to create end-to-end real-time analytics architectures.

Introducing Apache Druid

This whitepaper provides an introduction to Apache Druid, including its evolution, core architecture and features, and common use cases. Founded by the authors of the Apache Druid database, Imply provides a cloud-native solution that delivers real-time ingestion, interactive ad-hoc queries, and intuitive visualizations for many types of event-driven and streaming data flows.

insideBIGDATA Guide to Optimized Storage for AI and Deep Learning Workloads

This new technology guide from DDN shows how optimized storage has a unique opportunity to become much more than a siloed repository for the deluge of data constantly generated in today’s hyper-connected world: a platform that shares and delivers data to create competitive business value. The intended audience for this guide includes enterprise thought leaders (CIOs, director-level IT, etc.), along with data scientists and data engineers who are seeking guidance on infrastructure for AI and DL, particularly specialized hardware. The emphasis of the guide is on “real world” applications, workloads, and present-day challenges.

How to Plan and Launch Your Modern Data Catalog

Implementing a data catalog helps every member of your data community discover and use the best data and analytics resources for their projects, achieve faster results, and make better decisions. Data catalogs illuminate tribal knowledge and spur collaboration, both of which are key elements of collective data empowerment. Are you ready to plan and launch your modern data catalog? Data.world says: let’s get started.

insideBIGDATA Guide to Data Platforms for Artificial Intelligence and Deep Learning

This insideBIGDATA technology guide explores how current implementations for AI and DL applications can be deployed using new storage architectures and protocols specifically designed to deliver data with high throughput, low latency, and maximum concurrency.

Five Things to Consider When Choosing a Data Catalog

The self-service data analytics journey often begins with a data catalog. Download the new white paper from Unifi Software for insight into what to consider when choosing a data catalog in today’s market.