The world’s tech companies are hungry for high-end GPU chips, and manufacturers can’t seem to churn out enough to meet demand.
Behind every high-profile AI model is a legion of GPUs working tirelessly – and even the industry’s A-list players can’t get enough of them.
Training AI models requires immense computing resources, but manufacturers are relatively thin on the ground, and high-end GPUs aren’t something you can spin up overnight. The vast majority of demand has fallen on industry veteran Nvidia’s shoulders, pushing its market cap to $1tn this year.
Right now, few seem safe from the GPU shortage – and the further away from Nvidia you are, the lower your chances of getting hold of its chips.
In May, OpenAI CEO Sam Altman told the US Senate, “We’re so short on GPUs, the less people that use the tool, the better.”
A recent decline in GPT-4’s performance led many to speculate that OpenAI was unable to meet demand, forcing it to alter and tune aspects of its models.
Meanwhile, in China, the GPU shortage has created a rather bizarre black market where business buyers have to engage in shady deals for Nvidia’s A100 and H100 chips on the floors of the SEG skyscraper in Shenzhen – a cyberpunk-esque scenario ripped straight out of a Deus Ex video game.
Microsoft’s annual report recently highlighted the extended shortage of AI chips as a potential risk factor for investors.
The report says, “We continue to identify and evaluate opportunities to expand our datacenter locations and increase our server capacity to meet the evolving needs of our customers, particularly given the growing demand for AI services.”
It goes on, “Our datacenters depend on the availability of permitted and buildable land, predictable energy, networking supplies, and servers, including graphics processing units (‘GPUs’) and other components.”
The insatiable appetite for GPUs
Computing power is a significant bottleneck for AI development, but few forecasted demand of this magnitude.
Had this level of demand been predictable, there would be more AI chip manufacturers around than Nvidia and a handful of startups. Nvidia controls at least 84% of the market by some estimates, while AMD and Intel are only just getting into the game.
“Nobody could’ve modeled how fast or how much this demand is going to increase,” said Raj Joshi, a senior vice president at Moody’s Investors Service. “I don’t think the industry was ready for this kind of surge in demand.”
In its May earnings call, Nvidia announced it had “procured substantially higher supply for the second half of the year” to meet the rising demand for AI chips.
AMD, meanwhile, stated that it’s set to unveil its answer to Nvidia’s AI GPUs closer to the end of the year. “There’s very strong customer interest across the board in our AI solutions,” said AMD CEO Lisa Su.
Some industry experts suggest that the chip shortage may ease in two to three years as Nvidia’s competitors expand their offerings. Several startups are now working day and night to meet this explosive demand.
Any business capable of making high-end chips suitable for AI workloads will do well, but it’s a rare category, as GPUs are exceptionally slow and difficult to research and build.
AI has to become leaner
Relatively fresh-faced AI developers like Inflection are rushing to build colossal training stacks.
After raising a mighty $1.3bn, Inflection plans to assemble a GPU cluster of 22,000 high-end H100 chips.
For perspective, Nvidia, in collaboration with CoreWeave, recently smashed AI training benchmarks with a cluster of 3,584 chips – including completing a large language model (LLM) training benchmark based on GPT-3 in under 11 minutes.
While the quest for power among AI’s leading players revolves around stacking GPUs in what’s starting to look like a feudal land grab, others are focusing on leaning out AI models to get more mileage out of current technology.
For instance, developers in the open-source community recently found ways of running LLMs on compact devices such as MacBooks.
“Necessity is the mother of invention, right?” Sid Sheth, founder and CEO of AI startup d-Matrix, told CNN. “So now that people don’t have access to unlimited amounts of computing power, they are finding resourceful ways of using whatever they have in a much smarter way.”
Moreover, the shortage of GPUs is welcome news to those wishing for AI development to slow down – does the technology really need to move faster than it already is?
Probably not. As Sheth puts it, “Net-net, this is going to be a blessing in disguise.”