NVIDIA Launches Inference Platforms for Large Language Models and Generative AI Workloads

NVIDIA launched four inference platforms optimized for rapidly emerging generative AI workloads, helping developers quickly build specialized, AI-powered applications that deliver new services and insights. The platforms combine NVIDIA’s full stack of inference software with the latest NVIDIA Ada, NVIDIA Hopper™ and NVIDIA Grace Hopper™ processors, including the NVIDIA L4 Tensor Core GPU and the NVIDIA H100 NVL GPU, both launched at GTC.

Intel’s Habana Labs Launches Second-Generation AI Processors for Training and Inferencing

Intel announced that Habana Labs, its data center team focused on AI deep learning processor technologies, launched its second-generation deep learning processors for training and inference: Habana® Gaudi®2 and Habana® Greco™. These processors fill an industry gap by giving data center customers high-performance, high-efficiency deep learning compute for both training workloads and inference deployments, while lowering the barrier to AI entry for companies of all sizes.

TensorRT 8 Provides Leading Enterprises Fast AI Inference Performance

NVIDIA launched TensorRT™ 8, the eighth generation of the company’s AI inference software, which cuts inference time in half for language queries, enabling developers to build the world’s best-performing search engines, ad recommendation systems and chatbots and offer them from the cloud to the edge.