NVIDIA Launches Large Language Model Cloud Services

NVIDIA today announced two new large language model (LLM) cloud AI services, the NVIDIA NeMo Large Language Model Service and the NVIDIA BioNeMo LLM Service. Both enable developers to adapt LLMs and deploy customized AI applications for content generation, text summarization, chatbots and code development, as well as for protein structure and biomolecular property prediction.

The NeMo LLM Service allows developers to rapidly tailor a number of pretrained foundation models on NVIDIA-managed infrastructure, using a training method called prompt learning. The BioNeMo LLM Service is a cloud application programming interface (API) that expands LLM use cases beyond language and into scientific applications to accelerate drug discovery for pharma and biotech companies.

“Large language models hold the potential to transform every industry,” said Jensen Huang, founder and CEO of NVIDIA. “The ability to tune foundation models puts the power of LLMs within reach of millions of developers who can now create language services and power scientific discoveries without needing to build a massive model from scratch.”

NeMo LLM Service Boosts Accuracy With Prompt Learning, Accelerates Deployments

With the NeMo LLM Service, developers can use their own training data to customize foundation models ranging from 3 billion parameters up to Megatron 530B, one of the world’s largest LLMs. The process takes just minutes to hours compared with the weeks or months required to train a model from scratch.

Models are customized with prompt learning, which uses a technique called p-tuning. This allows developers to use just a few hundred examples to rapidly tailor foundation models that were originally trained with billions of data points. The customization process generates task-specific prompt tokens, which are then combined with the foundation models to deliver higher accuracy and more relevant responses for specific use cases.
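
To make the idea concrete, the following is a minimal sketch of prompt learning on a toy model, not NVIDIA's implementation or the NeMo API. Every name in it (FROZEN_W, forward, train_prompt, EXAMPLES) is hypothetical, and the "foundation model" is just a fixed linear read-out over mean-pooled embeddings. The sketch also trains the prompt vectors directly, whereas p-tuning as usually described produces them through a small prompt encoder network. What it does show is the core mechanism described above: the model's weights stay frozen, and only a handful of prompt vectors are learned from a few examples and then prepended to each input.

```python
import math

DIM = 4          # embedding width of the toy model (assumption)
NUM_PROMPT = 2   # number of trainable virtual prompt tokens (assumption)

# Frozen stand-in for a foundation model: these weights are never updated.
FROZEN_W = [0.5, -0.3, 0.8, 0.1]

# A few labeled examples: (list of token embeddings, binary label).
EXAMPLES = [
    ([[1.0, 0.0, 0.0, 0.0]], 1),
    ([[0.0, 1.0, 0.0, 0.0]], 1),
    ([[0.0, -1.0, 0.0, 0.0]], 1),
    ([[0.0, 0.0, -1.0, 0.0]], 1),
    ([[-2.0, 0.0, -1.0, 0.0]], 0),
    ([[0.0, 3.0, -2.0, 0.0]], 0),
]

def forward(prompt, token_embeddings):
    """Prepend prompt vectors, mean-pool, and apply the frozen read-out."""
    sequence = prompt + token_embeddings
    pooled = [sum(v[i] for v in sequence) / len(sequence) for i in range(DIM)]
    logit = sum(w * x for w, x in zip(FROZEN_W, pooled))
    return 1.0 / (1.0 + math.exp(-logit))

def mean_loss(prompt, examples):
    """Average binary cross-entropy over the examples."""
    total = 0.0
    for token_embeddings, label in examples:
        pred = forward(prompt, token_embeddings)
        total += -math.log(pred) if label == 1 else -math.log(1.0 - pred)
    return total / len(examples)

def train_prompt(examples, steps=200, lr=0.5):
    """Gradient descent on the prompt vectors only; FROZEN_W is untouched."""
    prompt = [[0.0] * DIM for _ in range(NUM_PROMPT)]
    for _ in range(steps):
        for token_embeddings, label in examples:
            pred = forward(prompt, token_embeddings)
            seq_len = NUM_PROMPT + len(token_embeddings)
            # Gradient of cross-entropy w.r.t. each prompt coordinate.
            scale = (pred - label) / seq_len
            for vec in prompt:
                for i in range(DIM):
                    vec[i] -= lr * scale * FROZEN_W[i]
    return prompt
```

Calling train_prompt(EXAMPLES) returns the learned prompt vectors, and mean_loss can be used to compare them against an all-zero prompt. Because the model itself never changes, a separate set of prompt vectors could in principle be trained for each task and swapped in at inference time.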

Developers can customize for multiple use cases using the same model and generate many different prompt tokens. A playground feature provides a no-code option to easily experiment and interact with models, further boosting the effectiveness and accessibility of LLMs for industry-specific use cases.

Once ready to deploy, the tuned models can run on cloud instances, on-premises systems or through an API.

BioNeMo LLM Service Enables Researchers to Tap Power of Massive Models

The BioNeMo LLM Service includes two new BioNeMo language models for chemistry and biology applications. It provides support for protein, DNA and biochemical data to help researchers discover patterns and insights in biological sequences.

BioNeMo enables researchers to expand the scope of their work by leveraging models that contain billions of parameters. These larger models can store more information about the structure of proteins and the evolutionary relationships between genes, and can even generate novel biomolecules for therapeutic applications.

Cloud API Provides Access to Megatron 530B, Other Ready-Made Models

In addition to tuning foundation models, the LLM services include the option to use ready-made and custom models through a cloud API.

This gives developers access to a broad range of pretrained LLMs, including Megatron 530B. It also provides access to T5 and GPT-3 models created with the NVIDIA NeMo Megatron framework — now available in open beta — to support a broad range of applications and multilingual service requirements.

Leaders in automotive, computing, education, healthcare, telecommunications and other industries are using NeMo Megatron to pioneer services for customers in Chinese, English, Korean, Swedish and other languages.


The NeMo LLM and BioNeMo LLM services and cloud APIs are expected to be available in early access starting next month. Developers can apply now for more details.

The beta release of the NeMo Megatron framework is available from NVIDIA NGC™ and is optimized to run on NVIDIA DGX™ Foundry and NVIDIA DGX SuperPOD™, as well as accelerated cloud instances from Amazon Web Services, Microsoft Azure and Oracle Cloud Infrastructure.

To experience the NeMo Megatron framework, developers can try NVIDIA LaunchPad labs at no charge.
