NVIDIA Dynamo Increases Inference Performance While Lowering Costs for Scaling Test-Time Compute; Inference Optimizations on NVIDIA Blackwell Boost Throughput by 30x on DeepSeek-R1
SAN JOSE, …
Month: March 2025
AI is now mainstream and driving unprecedented demand for AI factories — purpose-built infrastructure dedicated to AI training and inference — and the production of intelligence. Many of these AI factories will be gigawatt-scale. Bringing up a single gigawatt AI factory is an extraordinary act of engineering and logistics — requiring tens of thousands of
NVIDIA today announced the next evolution of the NVIDIA Blackwell AI factory platform, NVIDIA Blackwell Ultra — paving the way for the age of AI reasoning.
NVIDIA today unveiled NVIDIA Spectrum-X™ and NVIDIA Quantum-X silicon photonics networking switches, which enable AI factories to connect millions of GPUs across sites while drastically reducing energy consumption and operational costs.
NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for deploying generative AI and reasoning models in large-scale distributed environments. The framework boosts the number of requests served by up to 30x when running the open-source DeepSeek-R1 models on NVIDIA Blackwell.
NVIDIA announced world-record DeepSeek-R1 inference performance at NVIDIA GTC 2025. A single NVIDIA DGX system with eight NVIDIA Blackwell GPUs can achieve over 250 tokens per second per user, or a maximum throughput of over 30,000 tokens per second, on the massive, state-of-the-art 671-billion-parameter DeepSeek-R1 model. These rapid advancements in performance at both ends of the performance…
Every second, businesses worldwide are making critical decisions. A logistics company decides which trucks to send where. A retailer figures out how to stock its shelves. An airline scrambles to reroute flights after a storm. These aren’t just routing choices — they’re high-stakes puzzles with millions of variables, and getting them wrong costs money and,
Scientists and engineers of all kinds are equipped to solve tough problems a lot faster with NVIDIA CUDA-X libraries powered by NVIDIA GB200 and GH200 superchips. Announced today at the NVIDIA GTC global AI conference, developers can now take advantage of tighter automatic integration and coordination between CPU and GPU resources — enabled by CUDA-X
NVIDIA today unveiled partnerships with industry leaders T-Mobile, MITRE, Cisco, ODC (a portfolio company of Cerberus Capital Management) and Booz Allen Hamilton on the research and development of AI-native wireless network hardware, software and architecture for 6G.
General Motors and NVIDIA today announced they are collaborating on next-generation vehicles, factories and robots using AI, simulation and accelerated computing.