"Tomahawk Ultra is optimized for the tightly coupled, low-latency communication patterns found in both high-performance computing systems and AI clusters. With ultra-low latency switching and adaptable optimized Ethernet headers, it provides predictable, high-efficiency performance for large-scale simulations, scientific computing, and synchronized AI model training and inference.
When deployed with Scale-Up Ethernet (SUE; the specification is publicly available), Tomahawk Ultra enables sub-400ns XPU-to-XPU communication latency, including the switch transit time, setting a new benchmark for tightly synchronized AI compute at scale.
By reducing Ethernet header overhead from 46 bytes to just 10 bytes, while maintaining full Ethernet compliance, Tomahawk Ultra dramatically improves network efficiency. This optimized header is adaptable per application, offering both flexibility and performance gains across diverse HPC and AI workloads.
Tomahawk Ultra incorporates lossless fabric technology that eliminates packet drops during high-volume data transfer. With Link Layer Retry (LLR), the switch detects link errors using Forward Error Correction and automatically retransmits packets, avoiding drops at the wire level. Simultaneously, Credit-Based Flow Control (CBFC) prevents the buffer overflows that traditionally caused packet loss. Together, these mechanisms create a truly lossless Ethernet fabric, delivering the level of reliability demanded by today's most data-intensive workloads.
Tomahawk Ultra also accelerates performance through In-Network Collectives, addressing one of the most persistent bottlenecks in AI and machine learning workloads. Rather than burdening XPUs with collective operations like AllReduce, Broadcast, or AllGather, Tomahawk Ultra executes these directly within the switch chip. This can reduce job completion time and improve utilization of expensive compute resources. Importantly, this capability is endpoint-agnostic, enabling immediate adoption across a wide range of system architectures and vendor ecosystems.
Designed with innovations in topology-aware routing to support advanced HPC topologies including Dragonfly, Mesh, and Torus, Tomahawk Ultra is also compliant with the Ultra Ethernet Consortium (UEC) standard and embraces the openness and rich ecosystem of Ethernet networking.
Introducing SUE-Lite
As part of Broadcom's Ethernet-forward strategy for AI scale-up, the company has introduced SUE-Lite, an optimized version of the SUE specification tailored for power- and area-sensitive accelerator applications. SUE-Lite retains the key low-latency and lossless characteristics of full SUE, while further reducing the silicon footprint and power consumption of Ethernet interfaces on AI XPUs and CPUs.
This lightweight variant enables easier integration of standards-compliant Ethernet fabrics in AI platforms, promoting broader adoption of Ethernet as the interconnect of choice in scale-up architectures.
Platform for AI Scale-Up and HPC Scale-Out
Together with the 102.4 Tbps Tomahawk 6, Tomahawk Ultra forms the foundation of a unified Ethernet architecture: enabling scale-up Ethernet for AI, and scale-out Ethernet for HPC and distributed workloads.
Now Shipping
Tomahawk Ultra is 100% pin-compatible with Tomahawk 5, ensuring a very fast time-to-market. It is shipping now for deployment in rack-scale AI training clusters and supercomputing environments."
Source: https://www.techpowerup.com/338944/...ultra-ethernet-switch-for-hpc-and-ai-scale-up
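To put the 46-byte vs. 10-byte header claim in perspective, here is a quick back-of-the-envelope sketch in Python. It only uses the two header sizes quoted in the release; the payload sizes are my own hypothetical picks, and I'm treating "header overhead" as the only per-packet overhead (no preamble, inter-frame gap, or anything else the release doesn't mention), so take it as a rough illustration rather than Broadcom's own math.

```python
def wire_efficiency(payload_bytes: int, header_bytes: int) -> float:
    """Return the fraction of transmitted bytes that are payload."""
    if payload_bytes <= 0 or header_bytes < 0:
        raise ValueError("payload must be positive and header non-negative")
    return payload_bytes / (payload_bytes + header_bytes)


# Header sizes quoted in the press release.
STANDARD_HEADER_BYTES = 46
OPTIMIZED_HEADER_BYTES = 10

# Hypothetical payload sizes, chosen only for illustration.
for payload in (64, 256, 1024):
    standard = wire_efficiency(payload, STANDARD_HEADER_BYTES)
    optimized = wire_efficiency(payload, OPTIMIZED_HEADER_BYTES)
    print(f"{payload:5d} B payload: {standard:.1%} -> {optimized:.1%}")
```

Under these assumptions a 64-byte payload goes from roughly 58% to roughly 86% payload efficiency, while at 1024 bytes the difference shrinks to a few percentage points. So the smaller header matters most for the small messages typical of tightly coupled scale-up traffic.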
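The release doesn't explain how credit-based flow control avoids buffer overflow, so here is a toy Python model of the general idea: the sender may only transmit while it holds a credit, and the receiver hands a credit back each time it frees a buffer slot. The class name, function name, and parameters are all hypothetical, and this is a generic credit scheme, not a description of how Tomahawk Ultra implements CBFC.

```python
from collections import deque
from typing import List, Optional, Tuple


class CreditLink:
    """Toy model of one sender/receiver pair using credit-based flow control."""

    def __init__(self, buffer_slots: int) -> None:
        if buffer_slots < 1:
            raise ValueError("receiver needs at least one buffer slot")
        self.buffer_slots = buffer_slots
        self.credits = buffer_slots  # sender's count of free receiver slots
        self.rx_buffer = deque()
        self.dropped = 0

    def try_send(self, packet_id: int) -> bool:
        """Send only if a credit is available; otherwise the sender holds the packet."""
        if self.credits == 0:
            return False
        self.credits -= 1
        if len(self.rx_buffer) >= self.buffer_slots:
            self.dropped += 1  # should never trigger while credits are honoured
        else:
            self.rx_buffer.append(packet_id)
        return True

    def drain_one(self) -> Optional[int]:
        """Receiver consumes one packet and returns a credit to the sender."""
        if not self.rx_buffer:
            return None
        packet_id = self.rx_buffer.popleft()
        self.credits += 1
        return packet_id


def run(num_packets: int, buffer_slots: int, drain_every: int) -> Tuple[List[int], int]:
    """Offer one packet per tick; the receiver drains one packet every `drain_every` ticks."""
    if drain_every < 1:
        raise ValueError("drain_every must be at least 1")
    link = CreditLink(buffer_slots)
    delivered: List[int] = []
    next_packet = 0
    tick = 0
    while len(delivered) < num_packets:
        tick += 1
        if next_packet < num_packets and link.try_send(next_packet):
            next_packet += 1
        if tick % drain_every == 0:
            packet_id = link.drain_one()
            if packet_id is not None:
                delivered.append(packet_id)
    return delivered, link.dropped


delivered, dropped = run(num_packets=20, buffer_slots=4, drain_every=3)
print(f"delivered {len(delivered)} packets in order, dropped {dropped}")
```

Even though the sender here is three times faster than the receiver, nothing is dropped: the sender simply stalls when it runs out of credits. The cost is back-pressure rather than loss, which is the trade-off a lossless fabric is making.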