insideBIGDATA Guide to Data Platforms for Artificial Intelligence and Deep Learning


With AI and DL, storage is a cornerstone for handling the deluge of data constantly generated in today’s hyperconnected world. It is a vehicle that captures and shares data to create business value. In this technology guide, insideBIGDATA Guide to Data Platforms for Artificial Intelligence and Deep Learning, we’ll see how current implementations of AI and DL applications can be deployed using new storage architectures and protocols specifically designed to deliver data with high throughput, low latency, and maximum concurrency.

The target audience for the guide is enterprise thought leaders and decision makers who understand that enterprise information is being amassed like never before and that a data platform is both an enabler and accelerator for business innovation.


Enterprise competitive success now depends on how fast valuable data assets can be consumed and analyzed to yield important business insights. Technologies such as artificial intelligence (AI) and deep learning (DL) are facilitating this strategy, and the increased efficiency of these learning systems can define the extent of an organization’s competitive advantage.

Many companies are strongly embracing AI. A March 2018 IDC spending guide on worldwide investment in cognitive and AI systems indicates that spending will reach $19.1 billion in 2018, an increase of 54.2% over the amount spent in 2017. Further, spending is forecast to grow to $52.2 billion by 2021. By all indications, this is an industry on an upward trajectory, but limiting factors such as data storage and networking bottlenecks must be addressed to ensure the maximum benefit from AI and DL applications.
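For a sense of scale, the arithmetic behind those figures can be worked through in a few lines. The sketch below uses only the three numbers quoted above; the derived 2017 baseline and the compound annual growth rate are back-of-the-envelope values computed here, not figures published by IDC, and the variable names are illustrative.

```python
# Figures quoted from the IDC spending guide cited above (USD billions).
spend_2018 = 19.1
growth_2018 = 0.542   # 54.2% increase over 2017
spend_2021 = 52.2

# Implied 2017 spending: the 2018 figure divided by (1 + growth rate).
# This is a derived estimate, not a number reported in the article.
implied_2017 = spend_2018 / (1 + growth_2018)

# Implied compound annual growth rate across the three years 2018 -> 2021.
# Also a derived estimate, assuming smooth year-over-year growth.
years = 2021 - 2018
implied_cagr = (spend_2021 / spend_2018) ** (1 / years) - 1

if __name__ == "__main__":
    print(f"Implied 2017 spending: ${implied_2017:.1f} billion")
    print(f"Implied 2018-2021 CAGR: {implied_cagr:.1%}")
```

Under these assumptions, the forecast implies spending growing by roughly two-fifths each year between 2018 and 2021.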

Enterprise machine learning algorithms have historically been implemented using traditional compute architectures, where system throughput and data access latencies are determined by pairing compute and storage resources through the same network interconnections that serve other business applications. With AI and DL, the increasing volume and velocity of arriving data are stressing these legacy architectures. Although compute has made great strides with GPUs, the legacy file storage solutions commonly found in enterprise data centers haven’t kept pace.

Data is the New Source Code

Data’s role in the future of business cannot be overstated. DL is about growing autonomous capability by learning from very large amounts of data. In many ways, data is the new source code. An AI data platform must enable and streamline the entire workflow. AI and DL workflows are non-linear: they are not a process that starts, ends, and then moves on to the next iteration. Instead, the operations in the workflow happen concurrently and continuously (as depicted in the wheel graphic below). The goal is to iterate, completing each step as fast as possible through the acceleration afforded by a parallel storage architecture, and to get the wheel going while allowing customers to grow their infrastructure seamlessly as data sets grow and workflows evolve. Data is ingested, then indexed and curated, before being used for training, validation, and inference; all of these steps happen concurrently and continuously. Data continues to be collected as training occurs and as models move to production. The wheel gets bigger and more engaged as workflows evolve.
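To make the idea of concurrent workflow stages concrete, here is a minimal sketch, assuming a toy three-stage pipeline (ingest, curate, train) connected by in-memory queues. The stage functions, the record fields, and the queue sizes are all hypothetical illustrations; they are not taken from the guide or from any particular data platform, and a real deployment would read from shared storage rather than Python queues.

```python
import queue
import threading

def ingest(out_q, n_items):
    """Hypothetical ingest stage: emits raw records, then a stop sentinel."""
    for i in range(n_items):
        out_q.put({"id": i, "raw": f"sample-{i}"})
    out_q.put(None)  # sentinel: no more data

def curate(in_q, out_q):
    """Hypothetical curation stage: labels each record as it arrives."""
    while True:
        item = in_q.get()
        if item is None:
            out_q.put(None)  # pass the sentinel downstream
            break
        item["label"] = item["id"] % 2  # placeholder for real indexing/curation
        out_q.put(item)

def train(in_q, seen):
    """Hypothetical training stage: consumes curated records as they arrive."""
    while True:
        item = in_q.get()
        if item is None:
            break
        seen.append(item["id"])  # placeholder for a real training step

def run_pipeline(n_items=8):
    # Bounded queues let every stage run at once while limiting how far
    # a fast stage can get ahead of a slow one.
    raw_q = queue.Queue(maxsize=4)
    curated_q = queue.Queue(maxsize=4)
    seen = []
    stages = [
        threading.Thread(target=ingest, args=(raw_q, n_items)),
        threading.Thread(target=curate, args=(raw_q, curated_q)),
        threading.Thread(target=train, args=(curated_q, seen)),
    ]
    for t in stages:
        t.start()
    for t in stages:
        t.join()
    return seen

if __name__ == "__main__":
    print(run_pipeline())
```

The point of the sketch is only that training begins on the first curated records while ingest is still producing new ones, rather than waiting for each stage to finish before the next starts.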

Over the next few weeks we will explore these topics surrounding data platforms for AI & deep learning:

If you prefer, the complete insideBIGDATA Guide to Data Platforms for Artificial Intelligence and Deep Learning is available for download from the insideBIGDATA White Paper Library, courtesy of DDN.


