Nvidia has unveiled the H200, its newest high-end chip designed for training and deploying AI models.
Announced on Monday, November 13, 2023, this new GPU is set to supercharge the capabilities of AI models by incorporating 141GB of next-generation HBM3e memory.
The H200 represents an evolution from its predecessor, the H100, which has been instrumental in the AI ecosystem.
Excitement around Nvidia’s AI GPUs has boosted its stock by a phenomenal 230% in 2023, and the company is forecasting around $16 billion in revenue for its fiscal third quarter, marking a 170% increase from the previous year.
A key feature of the H200 is its enhanced performance in inference, which refers to the process of using a trained AI model to make predictions or decisions based on new, unseen data.
This is distinct from the training phase of a model, where the AI learns patterns from a large dataset.
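For readers less familiar with the distinction, here is a minimal PyTorch sketch contrasting the two phases; the toy linear model is purely illustrative and stands in for the far larger networks these GPUs actually run. Training updates the model’s weights from data, while inference simply runs the trained model on new inputs.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # toy stand-in for a large AI model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

# Training: learn patterns from a (toy) dataset by updating weights.
inputs, targets = torch.randn(8, 4), torch.randn(8, 2)
optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)
loss.backward()    # compute gradients
optimizer.step()   # update model weights

# Inference: apply the trained model to new, unseen data; no weight updates.
model.eval()
with torch.no_grad():  # disable gradient tracking
    prediction = model(torch.randn(1, 4))
```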
Nvidia’s own benchmarks suggest the H200’s inference performance is nearly double that of the H100 when running Meta’s Llama 2 large language model (LLM).
Expected to ship in the second quarter of 2024, the H200 will likely start racking up immense orders from AI companies around the world, except in China, Iran, and Russia, where US export controls bar sales of high-end AI hardware.
The H200 will be compatible with existing systems using the H100, allowing AI companies to upgrade without needing to change their server systems or software.
It will be available in four-GPU and eight-GPU configurations on Nvidia’s HGX complete server systems, as well as in the GH200 Grace Hopper Superchip, which pairs the H200 GPU with an Arm-based Grace CPU.
However, the H200’s position as Nvidia’s fastest AI chip might be short-lived. To meet the enormous demand for its GPUs, Nvidia plans to shift from a two-year to a yearly release cadence, in a bid to keep the AI industry thoroughly in its pocket.
Another chip, the B100, built on the all-new Blackwell architecture, is already in the works and could be announced and released in 2024.