Six Platform Investments from Intel to Facilitate Running AI and HPC Workloads Together on Existing Infrastructure

Because HPC technologies today offer substantially more power and speed than their legacy predecessors, enterprises and research institutions benefit from combining AI and HPC workloads on a single system. Six platform investments from Intel will help reduce obstacles and make HPC and AI deployment even more accessible and practical.

The insideBIGDATA IMPACT 50 List for Q4 2019

The team here at insideBIGDATA is deeply entrenched in following the big data ecosystem of companies from around the globe. We’re in close contact with most of the firms making waves in the technology areas of big data, data science, machine learning, AI and deep learning. Our inbox is filled each day with new announcements, commentaries, and insights about what’s driving the success of our industry, so we’re in a unique position to publish our quarterly IMPACT 50 List of the most important movers and shakers in our industry. These companies have proven their relevance by the way they’re impacting the enterprise through leading edge products and services. We’re happy to publish this evolving list of the industry’s most impactful companies!

Book Excerpt: Defining Data Models

The following article is an excerpt (Chapter 3) from the book Hands-On Big Data Modeling by James Lee, Tao Wei, and Suresh Kumar Mukhiya, published by our friends over at Packt. The excerpt addresses the concept of big data, its sources, and its types, and lays a theoretical foundation for data modeling and data management. Readers will also get their hands dirty setting up a platform where they can work with big data.

Big Models are Carbon Efficient if you Share Them

Dr. Samuel Leeman-Munk, research statistician developer at SAS, and Dr. Xiaolong Li, staff scientist at SAS, explore the carbon emissions generated by training a high-performing NLP model, and offer examples and best practices showing how sharing these models with hundreds of other researchers and millions of customers paints a less dire picture of AI’s effect on the environment.

Interview: Haoyuan Li, Founder and CTO of Alluxio

I recently caught up with Haoyuan Li, Founder and CTO of Alluxio, to examine the next critical piece in the data evolution: Data Orchestration – the missing piece in building hybrid and multi-cloud analytical architectures. Haoyuan Li received his Ph.D. in Computer Science from UC Berkeley, where he worked in the AMPLab. At the AMPLab, he created the Alluxio (formerly Tachyon) open source data orchestration system, co-created Apache Spark Streaming, and became a founding committer of Apache Spark.

“Above the Trend Line” – Your Industry Rumor Central for 10/7/2019

Above the Trend Line: your industry rumor central is a recurring feature of insideBIGDATA. In this column, we present a variety of short, time-critical news items grouped by category, such as M&A activity, people movements, funding news, financial results, industry alignments, customer wins, rumors, and general scuttlebutt floating around the big data, data science, and machine learning industries, including behind-the-scenes anecdotes and curious buzz.

Is Your Legacy Industry About to be Disrupted?

In this contributed article, Greg Robinson, Managing Director for 4490 Ventures, discusses what’s next for entrepreneurs, now that the commoditization of consumer-facing IoT has made such incredible progress and hardware is readily and cheaply available. Here’s a hint: it’s not simply about building the next layer of the tech stack.

insideBIGDATA Guide to Optimized Storage for AI and Deep Learning Workloads – Part 3

Artificial Intelligence (AI) and Deep Learning (DL) represent some of the most demanding workloads in modern computing history, presenting unique challenges to compute, storage and network resources. In this technology guide, insideBIGDATA Guide to Optimized Storage for AI and Deep Learning Workloads, we’ll see how traditional file storage technologies and protocols like NFS starve AI workloads of data, reducing application performance and impeding business innovation. A state-of-the-art AI-enabled data center should work to concurrently and efficiently service the entire spectrum of activities involved in DL workflows, including data ingest, data transformation, training, inference, and model evaluation.

insideBIGDATA Guide to Optimized Storage for AI and Deep Learning Workloads – Part 2

Artificial Intelligence (AI) and Deep Learning (DL) represent some of the most demanding workloads in modern computing history, presenting unique challenges to compute, storage and network resources. In this technology guide, insideBIGDATA Guide to Optimized Storage for AI and Deep Learning Workloads, we’ll see how traditional file storage technologies and protocols like NFS starve AI workloads of data, reducing application performance and impeding business innovation. A state-of-the-art AI-enabled data center should work to concurrently and efficiently service the entire spectrum of activities involved in DL workflows, including data ingest, data transformation, training, inference, and model evaluation.

DAOS Delivers Exascale Performance Using HPC Storage So Fast It Requires New Units of Measurement

Forget what you previously knew about high-performance storage and file systems. New I/O models for HPC such as Distributed Asynchronous Object Storage (DAOS) have been architected from the ground up to make use of new NVM technologies such as Intel® Optane™ DC Persistent Memory Modules (Intel Optane DCPMMs). With latencies measured in nanoseconds and bandwidth measured in tens of GB/s, new storage devices such as Intel DCPMMs redefine the measures used to describe high-performance nonvolatile storage.