At GTC 2025, NVIDIA officially announced the Blackwell Ultra B300, a new GPU for data centers and artificial intelligence.
Compared to the B200, the new product offers 50% more memory and 50% higher FP4 compute performance. NVIDIA says the chip is designed for "the era of reasoning": more complex language models and generative AI.
The B300 base unit will be accompanied by the new B300 NVL16 server board, the GB300 DGX Station, and a full GB300 NVL72 rack. Eight NVL72 racks form the Blackwell Ultra DGX SuperPOD: 288 Grace processors, 576 Blackwell Ultra GPUs, 300 TB of HBM3e memory, and 11.5 ExaFLOPS of FP4 performance. These can be combined into supercomputing deployments that NVIDIA calls "AI factories".
Using FP4 precision on the B300, together with the new Dynamo library, optimizes the performance of models such as DeepSeek. NVIDIA says the NVL72 rack can deliver up to 30 times the performance of a comparable Hopper configuration. That figure reflects the full set of new technologies, including faster NVLink, larger memory, and other improvements.
In the example provided by NVIDIA, Blackwell Ultra delivers up to 1,000 tokens per second on the DeepSeek R1-671B model, while Hopper manages only about 100 tokens per second. That is a 10x increase in throughput, cutting the time to serve a large request from 1.5 minutes to roughly 10 seconds.
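The quoted numbers are easy to sanity-check. The sketch below assumes a request size of 9,000 tokens (a value chosen so the Hopper service time matches the quoted 1.5 minutes; NVIDIA did not state the exact request size):

```python
# Sanity check of NVIDIA's quoted throughput figures:
# 1,000 tokens/s on Blackwell Ultra vs. 100 tokens/s on Hopper.
hopper_tps = 100    # tokens per second (quoted)
ultra_tps = 1000    # tokens per second (quoted)

speedup = ultra_tps / hopper_tps        # 10x throughput

# Assumed request size: 9,000 tokens, picked so that Hopper
# takes the quoted 1.5 minutes (90 seconds) to serve it.
request_tokens = 9000

hopper_time = request_tokens / hopper_tps   # 90 s = 1.5 minutes
ultra_time = request_tokens / ultra_tps     # 9 s, ~10 s in NVIDIA's rounding

print(speedup, hopper_time, ultra_time)    # → 10.0 90.0 9.0
```

The 9-second result lines up with NVIDIA's rounded "10 seconds" claim, so the figures are internally consistent.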
B300 devices are expected to start shipping by the end of the year, and this time there will presumably be no production delays of the kind that held back the previous generation. NVIDIA notes that it earned $11 billion in revenue from the Blackwell B200/B100 last fiscal year and expects the new generation to bring in even more.
Source: Tom's Hardware