
Back in December 2023 — an age in the tech world — Google unveiled what it called an AI Hypercomputer, described as an innovative supercomputer architecture combining performance-optimized hardware, open software, leading machine learning frameworks, and flexible consumption models.
Google Cloud competes against Microsoft and Amazon for enterprise customer dollars, so its stated goal for the AI Hypercomputer is to “increase efficiency and productivity across AI training, tuning, and serving”.
Paying Google Cloud customers gain virtual access to the AI Hypercomputer’s hardware and software in order to train their own AI models and build applications.
At that time, Google noted how customers like Salesforce and Lightricks were already training and serving large AI models using Google Cloud’s TPU v5p AI Hypercomputer — and already seeing results.
Today at Google Cloud Next 2024 – its annual conference taking place this week in Las Vegas – Google announced massive upgrades to the AI Hypercomputer platform and highlighted even more high-profile customers taking advantage of it.
Google Cloud AI Hypercomputer Upgraded with More Powerful Nvidia Chips

First among these upgrades: Google’s Tensor Processing Unit (TPU) v5p, previously touted as its “most powerful, scalable, and flexible AI accelerator thus far”, is now generally available to customers.
Google Cloud will also upgrade its A3 virtual machine (VM) family, powered by Nvidia H100 Tensor Core GPUs (each packing 80 billion transistors), to A3 Mega for enhanced performance and accessibility. A3 Mega will become available to customers starting in May.
In addition, Google Cloud plans to integrate Nvidia’s Blackwell GPUs into its offerings, expanding support for high-performance computing (HPC) and AI workloads with VMs powered by Nvidia HGX B200 and GB200 NVL72 platforms that are “designed specifically to address even the most demanding AI, data analytics, and HPC workloads”.
Meanwhile, the liquid-cooled GB200 NVL72 will deliver real-time LLM inference and massive-scale training performance for trillion-parameter models.