Open source AI champion Hugging Face is making $10 million in GPU compute available to the public in a bid to ease the financial burden of model development faced by smaller dev teams.
Founded in 2016, Hugging Face has become a go-to source of open source AI models optimized to run on a wide variety of hardware. CEO Clément Delangue regards open source as the way forward for AI innovation and adoption, so his biz is making a bounty of compute resources available to whoever needs it. The program, dubbed ZeroGPU, will be made available via its application hosting service and run atop Nvidia's older A100 accelerators – $10 million worth of them – on a shared basis.
As for how Hugging Face gets around dedicating entire GPUs to individual users, there's no shortage of ways to achieve this – time-slicing, Nvidia's Multi-Instance GPU (MIG) partitioning, or its Multi-Process Service (MPS), among others – depending on the level of isolation required. However, all of these approaches run into practical limits, the big one being memory. Based on the support docs, Hugging Face appears to be using the 40GB variant of the A100. Even running 4-bit quantized models, that's only enough grunt to hold the weights of a single 80 billion parameter model – and because the key-value cache eats into the same memory pool, the practical limit will be lower still.
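The arithmetic behind that ceiling is straightforward. A minimal back-of-envelope sketch follows; the model configuration used for the key-value cache estimate is purely illustrative, not a figure from Hugging Face:

```python
# Back-of-envelope VRAM estimates for a 40GB A100.
# The KV-cache config below is a hypothetical 80B-class model, for illustration only.

def weight_vram_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory for model weights alone: parameters x bits / 8, in GB (1e9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                seq_len: int, bytes_per_elem: int = 2) -> float:
    """Per-request key-value cache: 2x (keys and values) per layer, fp16 elements."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem / 1e9

# An 80B-parameter model at 4-bit precision fills the whole card with weights:
print(f"80B @ 4-bit weights: {weight_vram_gb(80, 4):.1f} GB")   # 40.0 GB

# ...leaving nothing for the KV cache. Assumed config: 80 layers,
# 8 KV heads, head dim 128, 4,096-token context:
print(f"KV cache @ 4k tokens: {kv_cache_gb(80, 8, 128, 4096):.2f} GB")
```

Since weights alone saturate the 40GB card, anything that must coexist with them – the KV cache, activations, CUDA overhead – pushes the realistic model-size ceiling below 80 billion parameters.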