Nvidia’s new H100 GPUs have broken several records in the latest round of MLPerf AI training benchmarks.
MLPerf was founded by a consortium of researchers, academics, and other specialists who built benchmarks to test how quickly systems can deploy and run AI models. In essence, MLPerf is a series of tests designed to measure the speed and efficiency of machine learning (ML) hardware, software, and services.
Nvidia, the world leader in AI hardware, tested a cluster of 3,584 H100 GPUs to showcase the chips’ formidable speed.
The cluster, co-developed by AI startup Inflection AI and managed by CoreWeave, a cloud service provider specializing in GPU-based workloads, completed a training benchmark based on the GPT-3 model in less than 11 minutes.
In other words, the cluster trained a GPT-3 equivalent model with some 175bn parameters in about the same time it takes to brew a coffee or walk the dog. While we don’t know how long it took OpenAI to train GPT-3, it certainly wasn’t 11 minutes.
The H100 GPUs set records across all eight MLPerf tests, showcasing their raw power and versatility. Here are the results:
- Large language model (GPT-3): 10.9 minutes
- Natural language processing (BERT): 0.13 minutes (8 seconds)
- Recommendation (DLRMv2): 1.61 minutes
- Object detection, heavyweight (Mask R-CNN): 1.47 minutes
- Object detection, lightweight (RetinaNet): 1.51 minutes
- Image classification (ResNet-50 v1.5): 0.18 minutes (11 seconds)
- Image segmentation (3D U-Net): 0.82 minutes (49 seconds)
- Speech recognition (RNN-T): 1.65 minutes
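For a quick like-for-like comparison, the per-workload times above can be normalized to seconds. This is just an illustrative sketch using the figures from the list; the dictionary keys are shorthand labels, not official MLPerf workload names:

```python
# MLPerf v3.0 training times reported above, in minutes
results = {
    "GPT-3 (LLM)": 10.9,
    "BERT (NLP)": 0.13,
    "DLRMv2 (recommendation)": 1.61,
    "Mask R-CNN (object detection, heavy)": 1.47,
    "RetinaNet (object detection, light)": 1.51,
    "ResNet-50 v1.5 (image classification)": 0.18,
    "3D U-Net (image segmentation)": 0.82,
    "RNN-T (speech recognition)": 1.65,
}

# Convert minutes to seconds and list the workloads fastest-first
in_seconds = sorted((mins * 60, name) for name, mins in results.items())
for secs, name in in_seconds:
    print(f"{name}: {secs:.0f} s")
```

Run this way, the BERT result (about 8 seconds) and ResNet-50 result (about 11 seconds) match the parenthetical conversions given above.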
In the latest round of benchmarking, dubbed v3.0, MLPerf also updated its test for recommendation systems, the algorithms that suggest products or services to users based on their past behavior.
The new test uses a larger dataset and a more current AI model to better emulate the challenges faced by service providers. Nvidia was the only company to submit results on this updated benchmark.
MLCommons, the AI and technology consortium that administers MLPerf, recently announced the newest findings from its benchmarking tests. The primary round, v3.0, assesses how efficiently systems train machine learning models; a second round, Tiny v1.1, examines ML applications for ultra-compact, low-power devices.
Submitters in the MLPerf v3.0 round included ASUSTek, Azure, Dell, Fujitsu, GIGABYTE, H3C, IEI, Intel & Habana Labs, Krai, Lenovo, NVIDIA, NVIDIA + CoreWeave, Quanta Cloud Technology, Supermicro, and xFusion.
Overall, submissions showed performance improvements of up to 1.54x over the past six months, and of 33x to 49x since the first round (v0.5) in 2019, illustrating the pace of progress in machine learning systems.
Nvidia topped round v3.0 thanks to its ultra-high-end H100 chips, a lead it will likely maintain for the foreseeable future.