Can DriveNets and Accton’s AI White Boxes Transform Networking?

October 15, 2024

In a significant advance for AI networking, DriveNets and Accton Technology have unveiled new AI networking white boxes built on Broadcom's latest Jericho-3-AI and Ramon-3 ASICs. The announcement marks the first commercial availability of white boxes explicitly designed to meet the growing demand for AI networking capabilities. By integrating DriveNets' scalable software with Accton's high-performance hardware, the solution aims to deliver a robust system capable of supporting AI and machine learning clusters of up to 32,000 GPUs connected via 800Gbps interfaces.

Groundbreaking Features and Architecture

Distributed Disaggregated Chassis (DDC) Scheduled Fabric Architecture

At the heart of this announcement lies the Distributed Disaggregated Chassis (DDC) scheduled fabric architecture embedded within the new white boxes. This architecture is engineered to deliver a scalable and easy-to-deploy networking solution, meeting the rising needs of hyperscalers and large enterprises dedicated to building expansive AI clusters. According to DriveNets and Accton, the DDC’s capabilities have been successfully validated through rigorous proofs of concept with leading AI clients, demonstrating its efficacy in real-world scenarios.

The DDC’s scheduled fabric architecture offers a flexible solution by decoupling the hardware and software layers, which allows for independent scaling. This separation grants organizations the ability to expand their AI infrastructure horizontally without the complexities tied to traditional systems. More importantly, it simplifies network management and deployment, enabling faster and more efficient responses to the increasing demands of AI workloads. Additionally, by supporting up to 32,000 GPUs with 800Gbps interfaces, the architecture provides a significant performance enhancement over existing solutions.

Addressing Hyperscalers and Large Enterprises

The targeted audience for these solutions includes hyperscalers and large enterprises, sectors that have seen an unprecedented surge in AI and ML applications. These entities require robust and scalable infrastructure to handle the computational demands of extensive AI operations. The new white boxes offer a comprehensive solution designed to address these needs effectively. According to DriveNets, the integration of Broadcom’s innovative ASICs into these white boxes ensures an unmatched performance level, reducing job completion times and enhancing overall system reliability.

Moreover, the flexibility offered by the DDC architecture translates into lowered operational costs, as organizations can scale their infrastructure in response to increasing AI demands without a corresponding rise in complexity. The architecture’s performance optimizations also extend to ensure resilience and fault tolerance, crucial aspects for maintaining continuous operations in high-stakes environments. By leveraging these advancements, hyperscalers and large enterprises can feasibly extend their AI capabilities, unlocking new avenues for innovation and competitive advantage.

Detailed Specifications of the New Accton White Boxes

NCP-5 and Broadcom’s Jericho-3-AI ASIC

The new product line includes two noteworthy additions: the NCP-5 and the NCF-2, each designed to bring cutting-edge connectivity and performance to AI networking. The NCP-5 is equipped with Broadcom’s Jericho-3-AI ASIC and supports 18 network ports and 20 fabric ports, each offering 800Gbps. This significant bandwidth is aimed at providing the necessary high-speed connections that modern AI applications require, ensuring seamless data exchange and processing across extensive AI clusters.
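As a rough sanity check on those numbers (our illustration, not a figure from the announcement), the NCP-5's stated port counts imply the following aggregate bandwidth on each side of the box:

```python
# Back-of-envelope bandwidth math for the NCP-5, based only on the
# port counts stated above: 18 network ports and 20 fabric ports,
# each running at 800 Gbps.
PORT_SPEED_GBPS = 800
NETWORK_PORTS = 18
FABRIC_PORTS = 20

# Aggregate capacity on each side of the box, in Tbps.
network_capacity_tbps = NETWORK_PORTS * PORT_SPEED_GBPS / 1000  # toward GPUs/servers
fabric_capacity_tbps = FABRIC_PORTS * PORT_SPEED_GBPS / 1000    # toward the NCF fabric

print(f"Network-facing capacity: {network_capacity_tbps:.1f} Tbps")  # 14.4 Tbps
print(f"Fabric-facing capacity:  {fabric_capacity_tbps:.1f} Tbps")   # 16.0 Tbps

# The fabric side has more capacity than the network side, leaving
# headroom for the overhead of a scheduled (cell-based) fabric.
assert fabric_capacity_tbps >= network_capacity_tbps
```

The modest fabric-side oversubscription in the other direction (16 Tbps of fabric capacity against 14.4 Tbps of network capacity) is consistent with a non-blocking scheduled-fabric design, though the vendors have not published the exact speedup ratio.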

The Jericho-3-AI ASIC embedded in the NCP-5 functions as the critical component that facilitates this high level of performance. By managing multiple high-speed connections simultaneously, it directly influences the efficiency and speed of data transfer within AI networks. This technology allows the NCP-5 to support the vast data throughput needed for advanced AI and ML applications, reducing latency and improving the overall job completion times. Enhancements like these position the NCP-5 as a pivotal element for any enterprise looking to scale its AI operations effectively.

NCF-2 and Broadcom’s Ramon-3 ASIC

Complementing the NCP-5 is the NCF-2, which leverages Broadcom’s Ramon-3 ASIC. This model supports an impressive 128 fabric ports, each offering 800Gbps, echoing the high performance seen in its counterpart while emphasizing scalability. This design allows the NCF-2 to facilitate extensive data transfer capabilities, making it an ideal fit for large-scale AI networks that require a robust backend to manage extensive computational tasks.

The Ramon-3 ASIC's design is tailored to the intricacies of AI-driven workloads, enhancing the performance of the underlying network architecture. This focus on scalability lets organizations incorporate the NCF-2 into existing infrastructures without significant hurdles, enabling seamless expansion as AI and ML demands grow. Furthermore, the integration of Broadcom's advanced ASICs ensures that both the NCF-2 and NCP-5 can deliver the high performance and reliability demanded by large enterprises and hyperscalers.
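To illustrate how these two boxes combine at the quoted 32,000-GPU scale, here is a simple back-of-envelope calculation. The one-GPU-per-800G-port assumption is ours for illustration; the vendors have not published a reference topology:

```python
import math

# Illustrative scale math for a DDC built from NCP-5 and NCF-2 boxes.
# Assumes one GPU per 800 Gbps network port -- an assumption for
# illustration, not the vendors' stated reference design.
PORT_SPEED_GBPS = 800
NCP_NETWORK_PORTS = 18   # GPU-facing ports per NCP-5
NCF_FABRIC_PORTS = 128   # fabric ports per NCF-2

target_gpus = 32_000

# NCP-5 boxes needed if every network port carries one GPU.
ncps_needed = math.ceil(target_gpus / NCP_NETWORK_PORTS)
print(f"NCP-5 boxes needed: {ncps_needed}")  # 1778

# Aggregate fabric bandwidth of a single NCF-2.
ncf_capacity_tbps = NCF_FABRIC_PORTS * PORT_SPEED_GBPS / 1000
print(f"NCF-2 fabric capacity: {ncf_capacity_tbps:.1f} Tbps")  # 102.4 Tbps
```

Even under these simplified assumptions, the 128-port radix of the NCF-2 is what keeps the fabric layer shallow at this scale: each NCF-2 can terminate fabric links from over a hundred NCP-5 boxes.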

Testing and Performance Validation

Extensive Tests and Results

To validate the capability and performance of these innovative white boxes, extensive tests were conducted at Accton’s lab in Taiwan. The hardware was assessed using Spirent’s AI workload emulation solution alongside Intel Gaudi servers fitted with 32 GPUs. This rigorous testing environment aimed to mirror real-world AI workloads closely, providing a comprehensive analysis of the architecture’s capabilities. According to the results, the new architecture showcased more than a 30% improvement in Job Completion Time (JCT) compared to traditional Ethernet Clos architecture, a significant milestone in AI networking performance.

These findings highlight the operational efficiencies that organizations can achieve by adopting the new white boxes. The enhanced JCT performance means reduced time spent on data processing tasks, which directly translates to more efficient use of computational resources and faster realization of AI-driven insights and innovations. The use of Spirent’s AI workload emulation solution further underscores the robustness of the testing procedures, presenting strong evidence that these new solutions are ready for real-world deployment and scalable application.
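To make the reported figure concrete (our worked example with a hypothetical baseline, reading the ">30% improvement" as a 30% reduction in JCT):

```python
# What a 30% Job Completion Time (JCT) improvement means in practice.
# The 10-hour baseline is hypothetical, chosen for illustration only.
baseline_jct_hours = 10.0   # e.g., a training job on a traditional Ethernet Clos
improvement = 0.30          # reported >30% JCT improvement, read as a 30% reduction

ddc_jct_hours = baseline_jct_hours * (1 - improvement)
hours_saved = baseline_jct_hours - ddc_jct_hours
print(f"DDC JCT: {ddc_jct_hours:.1f} h, saving {hours_saved:.1f} h per job")  # 7.0 h, 3.0 h
```

For GPU clusters of this size, where a single training run can occupy thousands of accelerators for days, shaving roughly a third off completion time compounds into substantial capacity and cost gains.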

Parity with InfiniBand Solutions

One of the most compelling aspects of the test results is the demonstrated performance parity with InfiniBand solutions, long considered the gold standard in high-performance AI and ML networking. The introduction of these Broadcom-powered white boxes positions DriveNets and Accton’s solution as a formidable competitor in the market. This achievement is particularly notable because it offers a viable alternative without compromising performance, thereby providing organizations a choice that aligns with open-standard networking solutions.

This newfound parity allows enterprises to consider transitioning from proprietary systems to more open-standard configurations, which can result in cost savings and operational efficiencies. It also signifies a broader trend toward decentralization and open innovation within the AI and ML domains, encouraging a more diverse range of technology solutions. The ability to offer comparable performance with existing top-tier solutions like InfiniBand underscores the technological advancements realized through this collaboration between DriveNets and Accton.

Executive Insights and Future Prospects

Ryan Donnelly’s Perspective

Ryan Donnelly, Chief Operating Officer of DriveNets, highlighted the growing demand for the new Broadcom ASICs and their potential to support high-scale AI clusters. He emphasized that these ASICs excel in performance without compromising on hardware diversity, an essential factor in scaling AI infrastructure. According to Donnelly, this balance of performance and flexibility positions the new white boxes as versatile tools that can adapt to evolving AI requirements, serving both the current and future needs of enterprises.

Donnelly’s comments highlight the strategic importance of integrating Broadcom’s ASICs into DriveNets’ product lineup. By leveraging these innovations, DriveNets can offer solutions that not only meet but exceed current market expectations. This adaptability is particularly crucial as AI technology continues to evolve rapidly, demanding new approaches to network design and management. DriveNets’ commitment to delivering high-performance, scalable solutions underscores their leadership role in the AI networking landscape, fostering further advancements in this dynamic field.

Mike Wong’s Contributions

Mike Wong, Head of Product Management at Accton, also provided valuable insights, particularly about Accton’s strengths in engineering and design. He noted that Accton’s OCP-compliant open networking white box switches are designed to deliver the necessary performance and reliability for today’s AI infrastructures. Wong highlighted that their engineering prowess ensures these white boxes meet the highest standards required for AI backends, reinforcing the notion that the company is at the forefront of AI networking innovations.

Wong’s statements emphasize the critical role of design and engineering excellence in developing cutting-edge AI solutions. Accton’s dedication to creating reliable, high-performance hardware assures clients that their networking needs will be met with uncompromising quality. The partnership with DriveNets further enhances these attributes, resulting in solutions that are both robust and scalable, perfectly suited for the demands of modern AI applications. By focusing on engineering excellence, Accton continues to solidify its position as a leading provider of AI networking hardware.

Conclusion

The launch of these white boxes, built on Broadcom's Jericho-3-AI and Ramon-3 ASICs, marks the first time such high-performance hardware purpose-built for AI networking has been made commercially available. The collaboration pairs DriveNets' scalable software with Accton's cutting-edge hardware, resulting in a powerful system capable of supporting artificial intelligence and machine learning clusters of up to 32,000 GPUs interlinked through 800Gbps interfaces.

The integration of these sophisticated technologies aims to handle the substantial data loads that AI requires, enabling more efficient processing and faster data transfer times. This partnership signifies a transformative step in the AI networking landscape, promising to enhance the performance and scalability required for large-scale AI projects. As AI continues to evolve and penetrate various industries, the need for such robust and high-capacity networking solutions will only grow, making this development particularly timely and impactful.
