
Alibaba cutting NVIDIA GPU use for AI models by 82%
18 Oct 2025
Alibaba Group Holding has unveiled a computing pooling solution, Aegaeon, which reduces the number of NVIDIA graphics processing units (GPUs) needed to serve its artificial intelligence (AI) models by 82%.
The system was beta tested in Alibaba Cloud's model marketplace for over three months. During this period, it cut the number of NVIDIA GPUs required from 1,192 to 213.
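The 82% figure follows directly from the two GPU counts reported above. As a quick check, the arithmetic can be sketched as follows (the variable and function names are illustrative, not from the paper):

```python
# GPU counts as reported for the beta test in Alibaba Cloud's marketplace.
gpus_before = 1192
gpus_after = 213


def reduction_percent(before: int, after: int) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


gpus_saved = gpus_before - gpus_after
print(f"GPUs saved: {gpus_saved}")
print(f"Reduction: {reduction_percent(gpus_before, gpus_after):.1f}%")
```

Rounded to the nearest whole number, the result matches the 82% cited in the article.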
Aegaeon can serve dozens of models simultaneously
Innovation details
The research paper on Aegaeon was presented at the 31st Symposium on Operating Systems Principles (SOSP) in Seoul, South Korea.
The study highlights that Alibaba Cloud's system can serve dozens of models with up to 72 billion parameters each.
This is a notable advance in managing and optimizing resources for AI workloads, especially given the high demand for NVIDIA GPUs in this field.
System improves efficiency by pooling GPU power
Cost efficiency
The researchers from Peking University and Alibaba Cloud, including Alibaba's Chief Technology Officer Zhou Jingren, have highlighted Aegaeon's role in tackling the high costs of serving concurrent large language model (LLM) workloads.
The system is a major step toward improving efficiency by pooling GPU power.
It allows one GPU to serve multiple models at once, thus reducing resource allocation inefficiencies that often plague cloud service providers like Alibaba Cloud and ByteDance's Volcano Engine.
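To see why pooling helps, consider a toy comparison between giving every model its own GPUs and letting models share them. This is only a rough sketch under strong assumptions: the per-model loads are hypothetical numbers, each load is expressed as a fraction of one GPU's capacity, and memory limits and model-switching costs are ignored. It is not Aegaeon's actual scheduling algorithm, which the article does not describe.

```python
import math

# Hypothetical load per model, as a fraction of one GPU's capacity.
# Several rarely used models and two busier ones (values are made up).
model_loads = [0.05, 0.1, 0.02, 0.6, 1.4]


def dedicated_gpus(loads):
    """Each model gets its own GPUs, so even a tiny load occupies one GPU."""
    return sum(max(1, math.ceil(load)) for load in loads)


def pooled_gpus(loads):
    """Models share GPUs, so only the combined load has to be covered."""
    return max(1, math.ceil(sum(loads)))


print("Dedicated:", dedicated_gpus(model_loads))
print("Pooled:", pooled_gpus(model_loads))
```

In this toy case the dedicated layout needs six GPUs while the pooled layout needs three, because the lightly used models no longer each hold an entire GPU.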
Aegaeon addresses resource inefficiency issue
Model demand
The researchers also noted that a few models, such as Alibaba's Qwen and DeepSeek, are more popular for inference than others.
This leads to resource inefficiency, with 17.7% of GPUs allocated to serve only 1.35% of requests in Alibaba Cloud's marketplace.
The Aegaeon system could address this problem by optimizing GPU usage across different models and their respective workloads.
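The size of this mismatch can be expressed as a single ratio. The short calculation below uses only the two percentages quoted above; the variable names are illustrative:

```python
# Shares reported for Alibaba Cloud's marketplace.
gpu_share_percent = 17.7       # share of GPUs allocated to the less popular models
request_share_percent = 1.35   # share of requests those models actually receive

# How many times larger the GPU share is than the request share.
overallocation_ratio = gpu_share_percent / request_share_percent
print(f"GPU share is about {overallocation_ratio:.1f}x the request share")
```

In other words, these models hold roughly 13 times more GPU capacity than their share of traffic would suggest.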