insideBIGDATA Guide to Data Platforms for Artificial Intelligence and Deep Learning – Part 4

With AI and DL, storage is a cornerstone of handling the deluge of data constantly generated in today’s hyperconnected world. It is a vehicle that captures and shares data to create business value. In this technology guide, insideBIGDATA Guide to Data Platforms for Artificial Intelligence and Deep Learning, we’ll see how current implementations for AI and DL applications can be deployed using new storage architectures and protocols specifically designed to deliver data with high throughput, low latency and maximum concurrency.

The target audience for the guide is enterprise thought leaders and decision makers who understand that enterprise information is being amassed like never before and that a data platform is both an enabler and accelerator for business innovation.

Accelerated, Any-scale AI Solutions

DataDirect Networks (DDN) is a global technology leader, with two decades of experience in designing, implementing, and optimizing storage solutions that enable enterprises to generate value and accelerate time-to-insight from their data, both on premises and in the cloud. Engineered as an AI data platform, DDN’s A³I (Accelerated, Any-Scale AI) is designed to maximize business value and optimize the whole AI environment: applications, compute, containers, and networks. It enables and accelerates AI applications and streamlines DL workflows using a scalable shared parallel architecture and protocol.

One of the foundational elements for success with AI and DL is establishing the centrality of data in the AI workflow. Many existing enterprise projects might depend on a collection of smaller data sets at this point in time, but as AI methodologies mature, and as more enterprises engage AI, the amount of data that can be applied to a problem, as well as the performance that can be used to process it, is becoming more and more important. The DDN shared parallel storage architecture embraces this notion.

DDN’s large data ingest capability can run concurrently with training and inference processes. It easily handles large and diverse data streams from multiple sources. The DDN shared parallel storage architecture incorporates unique features that accelerate, streamline and secure end-to-end AI and DL workflows.

The following is a short list of benefits of A³I for AI and DL deployments:

  • DDN A³I is a turnkey AI data platform, fully integrated and optimized for AI and DL applications. It has been thoroughly tested with widely used CPU and GPU computing platforms and with AI and DL applications, and can be easily integrated into any IT environment.
  • The parallel architecture and protocol extend the performance of NVMe from the disk all the way to the application for maximum acceleration.
  • Data is delivered with high throughput, low latency and massive concurrency to achieve full GPU saturation. This ensures that all compute cycles are put towards productive use.
  • The shared architecture allows multiple systems to access data simultaneously, enabling multiple phases of the workflow to happen concurrently and continuously.
  • The platform provides flexible and seamless scaling of performance, capacity and capability to match evolving workflow needs.
  • The platform can ingest, process and deliver heterogeneous data from a wide variety of sources, and supports mixed workloads.
  • DDN’s shared parallel file systems use metadata to enable file-system-level tagging of assets, making it easy for applications to find the data they’re looking for based on these metadata tags.
  • The platform includes robust data protection and integrity capabilities, and can be architected for maximum availability.
  • These solutions are designed, deployed and supported by DDN’s global R&D and field engineering organizations.
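The metadata tagging benefit in the list above can be pictured with a small sketch. This is not DDN's API or implementation, which the guide does not describe. `TagIndex`, its methods, the file paths and the tag names are all hypothetical, and the index lives in memory purely to illustrate how an application might look up assets by tag rather than by directory layout.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory stand-in for file-system-level metadata tags (hypothetical)."""

    def __init__(self):
        # Maps each (key, value) tag pair to the set of paths carrying it.
        self._paths_by_tag = defaultdict(set)

    def tag(self, path, **tags):
        """Attach key=value tags to a file path."""
        for key, value in tags.items():
            self._paths_by_tag[(key, value)].add(path)

    def find(self, **criteria):
        """Return the sorted paths that carry every requested tag."""
        if not criteria:
            return []
        # One candidate set per requested tag; a path must appear in all of them.
        matches = [self._paths_by_tag.get(pair, set()) for pair in criteria.items()]
        return sorted(set.intersection(*matches))


# Hypothetical assets and tags, for illustration only.
index = TagIndex()
index.tag("/data/img_0001.jpg", modality="image", split="train")
index.tag("/data/img_0002.jpg", modality="image", split="validation")
index.tag("/data/clip_0001.wav", modality="audio", split="train")

train_images = index.find(modality="image", split="train")
print(train_images)
```

A training job could ask for every asset tagged `split="train"` without knowing where the files sit in the directory tree, which is the convenience the bullet describes.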

DDN storage solutions are unique in that they are continuously tested and optimized with commonly used AI and DL applications and various networking and computing platforms. DDN technology is tightly integrated with GPUs, providing optimal data fulfillment from storage-to-GPU and GPU-to-GPU, for the fastest and most efficient use of computing resources. DDN has equipped its laboratories with leading GPU compute platforms to provide unique benchmarking and testing capabilities for AI and DL workloads.

Engineered from the ground up for the AI data platform, DDN A³I solutions are fully optimized to accelerate AI applications and streamline DL workflows for greatest productivity. DDN A³I solutions make AI-powered innovation easy, with faster performance, effortless scale, and simplified operations, all backed by the data-at-scale experts. The DDN shared parallel architecture fully saturates GPUs and ensures all efforts go towards productive AI use.
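To make GPU saturation concrete, the following back-of-the-envelope sketch estimates the read throughput storage must sustain so that GPUs never wait on input. The function names and all workload numbers are hypothetical and are not DDN measurements. The model assumes input delivery is the only bottleneck, which real training pipelines rarely satisfy exactly.

```python
def required_read_throughput(num_gpus, samples_per_sec_per_gpu, bytes_per_sample):
    """Bytes per second that storage must deliver so no GPU waits on input."""
    return num_gpus * samples_per_sec_per_gpu * bytes_per_sample


def gpu_busy_fraction(delivered_bytes_per_sec, required_bytes_per_sec):
    """Upper bound on the fraction of time GPUs stay busy.

    Assumes input delivery is the only bottleneck (a simplification).
    """
    if required_bytes_per_sec <= 0:
        raise ValueError("required_bytes_per_sec must be positive")
    return min(1.0, delivered_bytes_per_sec / required_bytes_per_sec)


# Hypothetical workload: 8 GPUs, each consuming 1,000 images/s of 150 kB each.
required = required_read_throughput(8, 1_000, 150_000)

# Hypothetical storage that delivers only 600 MB/s to the training job.
busy = gpu_busy_fraction(600_000_000, required)

print(f"required: {required / 1e9:.1f} GB/s, GPU busy fraction: {busy:.0%}")
```

In this made-up case, storage delivering half the required rate caps GPU utilization at 50 percent, which is why sustained delivery throughput is treated as a first-order design constraint rather than an afterthought.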

  • The DDN AI200 is an all-NVMe flash appliance that provides an efficient, reliable and easy-to-use data storage system for AI and DL applications. The AI200 reference architectures are designed in collaboration with NVIDIA® to provide the highest performance, optimal efficiency, and flexible growth for NVIDIA® DGX-1™ servers. With AI200, Caffe applications running on a DGX-1 server demonstrate 2.4x increased image throughput and 2x shorter completion times. TensorFlow training applications demonstrate double image throughput and complete twice as fast on a DGX-1 server using an AI200 solution.
  • The DDN AI7990 is a hybrid storage appliance that allows an intermix of performance flash and large-capacity disk in a high-density system for ultimate flexibility. The AI7990 keeps DGX-1 servers saturated with data, ensuring maximum utilization while also managing demanding data operations, from bursty ingest to large-scale data transformations. Used for image, voice and sound recognition, TensorFlow DL applications depend on large and diverse data sets with rich media content. DDN storage solutions provide the capacity needed to store and deliver massive heterogeneous data sets. They sustain the performance required to ensure data saturation of multiple GPUs engaged in distributed, accelerated training of node-based, multi-layered deep neural networks.

This is the fourth in a series of articles, appearing over the next few weeks, that explores topics surrounding data platforms for AI and deep learning.

If you prefer, the complete insideBIGDATA Guide to Data Platforms for Artificial Intelligence and Deep Learning is available for download from the insideBIGDATA White Paper Library, courtesy of DDN.

