By Max A. Cherney
(Reuters) – Nvidia on Tuesday announced a new configuration of its advanced artificial intelligence chips designed to speed up generative AI applications.
The new version of the Grace Hopper Superchip boosts the amount of high-bandwidth memory, which will give the design the capacity to power larger AI models, according to Nvidia’s vice president of hyperscale and HPC, Ian Buck. The configuration is optimized to perform AI inference functions that effectively power generative AI applications such as ChatGPT.
Nvidia’s Grace Hopper Superchip design stitches together one of the company’s H100 graphics processing units (GPUs) with an Nvidia-designed central processor.
“Having larger memory allows the model to remain resident on a single GPU, without requiring multiple systems or multiple GPUs to run,” Buck said in a conference call with reporters.
The underlying AI models that power generative AI apps capable of producing human-like text and images continue to grow in size. As models grow, they require more memory to run without connecting separate chips and systems, which degrades performance.
“The additional memory simply increases the performance of the GPU,” Buck said.
The new configuration, called GH200, will be available in the second quarter of next year, Buck said.
Nvidia plans to sell two flavors: a version that includes two chips that customers can integrate into systems, and a complete server system that combines two Grace Hopper designs.
(Reporting by Max A. Cherney in San Francisco; Editing by Marguerita Choy)