Matilda Bailey is a networking specialist whose career has focused on the evolution of cellular and next-generation wireless technologies. Known for deconstructing complex infrastructure trends, she has become a go-to expert on how hardware integration fuels the next wave of supercomputing. Her insights are particularly valuable now, as traditional scientific research converges with the rapid growth of artificial intelligence. In this conversation, we explore the details of high-performance architecture and what these leaps in processing power mean for the future of global innovation.
The discussion centers on the shift toward rack-scale supercomputing, where the physical boundaries between central and graphics processing units are increasingly blurred to maximize efficiency. We explore the specific performance benchmarks that set new standards for fluid dynamics and climate modeling, as well as the transition from standard AI tools to autonomous, agentic systems. Furthermore, we examine how major global research institutions are preparing to house these liquid-cooled giants to tackle the most pressing scientific questions of our time.
With the Vera Rubin platform integrating Vera CPUs and Rubin GPUs via NVLink-C2C, how does this level of architectural synergy change the game for data center operators compared to previous generations?
This level of integration represents a fundamental shift in how we think about the anatomy of a supercomputer, moving away from fragmented components toward a truly unified, rack-scale system. By weaving together Vera CPUs and Rubin GPUs with NVLink-C2C interconnects, ConnectX-9 SuperNICs, and BlueField-4 DPUs, the platform creates a seamless highway for data that eliminates traditional bottlenecks. We are looking at a setup that supports up to 144 GPUs in a single rack, all managed through advanced direct liquid cooling to handle the immense thermal output. This architecture allows for a staggering seven exaflops of AI performance, providing a dense, powerhouse solution that simplifies the physical footprint while massively scaling the computational output for large-scale operators.
Scientific research often requires extreme precision, so how significant is the inclusion of native FP64 support in this new architecture for fields like climate modeling and fluid dynamics?
For the scientific community, native FP64 precision is not just a luxury; it is the bedrock of accuracy for complex simulations in fields like geoscience and computational fluid dynamics. The Vera Rubin platform delivers five petaflops of native double-precision computing performance, ensuring that the fundamental mathematical workloads remain reliable and precise. When you combine that with the AI-driven capabilities of the system, researchers no longer have to choose between the speed of modern AI and the rigorous precision of traditional HPC. It allows for a hybrid approach where climate models can be both incredibly fast and scientifically exact, which is a vital balance for predicting long-term environmental shifts.
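To make the precision point concrete, here is a minimal Python sketch of why long-running simulations tend to prefer 64-bit arithmetic. It is an illustration only, not code from the Vera Rubin software stack: the helper names (`to_float32`, `accumulate`) and the step count are assumptions chosen for the example, and single precision is emulated on the CPU by rounding through the standard `struct` module.

```python
import struct

def to_float32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def accumulate(step, n, single_precision):
    """Add `step` to a running total `n` times.

    When `single_precision` is True, the step and every intermediate
    total are rounded to FP32, emulating a single-precision accumulator.
    """
    if single_precision:
        step = to_float32(step)
    total = 0.0
    for _ in range(n):
        total += step
        if single_precision:
            total = to_float32(total)
    return total

# Hypothetical workload: 100,000 time steps of size 0.1.
N_STEPS = 100_000
EXACT = 10_000.0

err32 = abs(accumulate(0.1, N_STEPS, single_precision=True) - EXACT)
err64 = abs(accumulate(0.1, N_STEPS, single_precision=False) - EXACT)

print(f"FP32 accumulated error: {err32:.6g}")
print(f"FP64 accumulated error: {err64:.6g}")
```

The single-precision total drifts visibly from the exact answer, while the double-precision total stays within a tiny fraction of it. Real solvers use more careful summation than this loop, so treat the size of the gap as indicative rather than representative.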
We have seen a massive leap in memory bandwidth with this new generation. What kind of real-world performance boosts should researchers expect when running memory-bound applications?
The jump in memory bandwidth is one of the most impressive technical feats here, offering a 2.8-fold increase over the previous Blackwell generation. This is a game-changer for memory-bound fluid dynamics applications, where we are projecting performance boosts of up to four times. When you are dealing with the movement of air over a wing or the flow of water in a cooling system, that extra bandwidth ensures the processors aren’t sitting idle waiting for data to arrive. It essentially clears the traffic jams that have historically slowed down discovery, making the entire scientific process feel much more fluid and responsive.
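One common way to reason about memory-bound performance is the roofline model, in which attainable throughput is the smaller of peak compute and bandwidth multiplied by arithmetic intensity. The sketch below applies that first-order model to the 2.8-fold bandwidth figure quoted above. The baseline peak, baseline bandwidth, and the two arithmetic intensities are hypothetical placeholders rather than published specifications, and this is not the method behind the projection in the interview.

```python
def attainable_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Roofline estimate: limited by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

# Hypothetical baseline accelerator (illustrative numbers only).
BASE_PEAK_GFLOPS = 40_000.0
BASE_BANDWIDTH_GBS = 8_000.0

# Bandwidth gain quoted in the interview.
BANDWIDTH_GAIN = 2.8

def speedup(flops_per_byte, peak_gain=1.0):
    """Speedup of the new roofline over the baseline for one kernel."""
    old = attainable_gflops(BASE_PEAK_GFLOPS, BASE_BANDWIDTH_GBS, flops_per_byte)
    new = attainable_gflops(
        BASE_PEAK_GFLOPS * peak_gain,
        BASE_BANDWIDTH_GBS * BANDWIDTH_GAIN,
        flops_per_byte,
    )
    return new / old

# Low arithmetic intensity, typical of stencil-style CFD kernels (assumed value).
stencil_speedup = speedup(flops_per_byte=0.25)

# High arithmetic intensity, typical of dense matrix kernels (assumed value).
dense_speedup = speedup(flops_per_byte=50.0)

print(f"memory-bound kernel speedup:  {stencil_speedup:.2f}x")
print(f"compute-bound kernel speedup: {dense_speedup:.2f}x")
```

Under this simple model, a purely memory-bound kernel speeds up in step with bandwidth (about 2.8x), while a compute-bound kernel does not benefit at all. Any gain beyond the bandwidth ratio would have to come from other factors, which the interview does not break down.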
Major institutions like the Leibniz Supercomputing Centre and Los Alamos are already planning deployments. How are these specific organizations tailoring this technology to their unique research goals?
It is fascinating to see how differently these powerhouses are using the same underlying technology to push the boundaries of science. The Leibniz Supercomputing Centre is planning to deploy the Blue Lion supercomputer by 2027, which is expected to offer roughly 30 times the computing power of its current system for astrophysics and life sciences. Meanwhile, Los Alamos is taking a more diversified approach with a trio of systems (Mission, Vision, and Veritas) to cover everything from national security to open scientific research. Veritas, in particular, is an interesting build because it pairs Rubin GPUs with standalone Vera CPU partitions specifically to explore the frontiers of agentic AI.
There is a lot of talk about the shift toward agentic AI. How does the Vera Rubin architecture specifically address the ten-fold increase in simulation demand that these autonomous systems are expected to generate?
Agentic AI represents a move from a tool that merely responds to queries to a system that can autonomously execute complex, multi-step tasks, which naturally puts a massive strain on infrastructure. Early data indicates that these autonomous agents can increase simulation demand by up to ten times, requiring a platform that can handle training and real-time analysis simultaneously. The Vera Rubin stack is designed to be that single, versatile infrastructure where researchers can train foundation models and deploy surrogate models without switching environments. It provides the raw horsepower and high-speed networking needed to keep up with an AI that is constantly iterating and running its own simulations in the background.
What is your forecast for the future of agentic AI in scientific research?
I believe we are entering an era where AI will act as a primary investigator, capable of running thousands of simultaneous experiments in a virtual environment to find the most viable path for physical testing. As systems like the ones at NERSC and Los Alamos come online, the sheer volume of data-intensive research will explode, likely leading to a decade of “accelerated discovery” in energy exploration and quantum chemistry. The infrastructure is finally moving past being a simple calculator and becoming a true partner in the scientific method, which will fundamentally change the speed at which we solve global challenges. This transition will require us to rethink not just our hardware, but how we structure the very process of inquiry itself.
