All You Can Eat Core Buffet! Cerebras Wafer Scale Engine 2

clockdogg

[H]ard|Gawd
Joined
Dec 12, 2007
Messages
1,176
Hungry for more cores? There's 850,000 on the plate.


https://www.techpowerup.com/281313/...ors-40-gb-onboard-sram-850-000-cores-12-wafer

Using just 15kW, thanks to TSMC's 7 nm process, expect this wafer to work as a smart waffle maker. Maybe the smartest waffle maker ever.

Also... good news... if TSMC can put this monster on the table, it must mean we'll have our fill of measly little GPU snacks in no time.
 
I wonder what a processor like this could do for AI in games like ARK, Serious Sam, ATLAS, Supreme Commander, PSO NGS, Ultimate Epic Battle Simulator 2, cooperative gaming, PvE in general, and so on?

Ya know, VR Chat is always complaining that their game is very CPU-intensive. I wonder if this could help with any of that?
 
I want to know what the defect rate is for something the size of a full silicon wafer. Does one defect ruin the whole thing, or do they just disable what's essentially a GPU's worth of silicon?
 
I want to know what the defect rate is for something the size of a full silicon wafer. Does one defect ruin the whole thing, or do they just disable what's essentially a GPU's worth of silicon?
Considering they're using almost the same node size (7-8 nm), the GPU is 2.53x as dense. The difference is almost certainly down to the spacing needed between dies and to redundant circuitry, so that defective sections can be shut off to salvage the rest.
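For anyone who wants to sanity-check density claims like that, it's just transistors per unit area. Here's the arithmetic as a rough sketch, using published approximate figures for the WSE-2 and an NVIDIA A100 (GA100) as the assumed comparison GPU; the exact ratio depends entirely on which GPU you compare against.

```python
# Transistor-density arithmetic behind comparisons like the one above.
# All figures are published approximations.
wse2_transistors = 2.6e12     # Cerebras WSE-2, TSMC 7 nm
wse2_area_mm2    = 46_225     # ~215 mm x 215 mm

gpu_transistors  = 54.2e9     # NVIDIA A100 (GA100), TSMC 7 nm (assumed comparison chip)
gpu_area_mm2     = 826

wse2_density = wse2_transistors / wse2_area_mm2 / 1e6   # Mtransistors/mm^2
gpu_density  = gpu_transistors / gpu_area_mm2 / 1e6

print(f"WSE-2: {wse2_density:.1f} M/mm^2, A100: {gpu_density:.1f} M/mm^2, "
      f"ratio: {gpu_density / wse2_density:.2f}x")
```

With these particular figures the gap comes out smaller than 2.53x, which is why it matters which GPU (and which generation) the comparison uses.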
 
I'm curious what kind of cooling solution exists for this thing at 15kW. Hot!
 
I'm curious what kind of cooling solution exists for this thing at 15kW. Hot!
A bit late, but I had some spare time to watch videos today. This big chip is integrated and sold as part of a standalone system with liquid cooling ;)

Good for distributed computing work.

 

STH - A Cerebras CS-2 Engine Block Bare on the SC22 Show Floor

While we talk about servers having to move to liquid cooling because of density, we are talking about 2kW/U servers or perhaps accelerator trays with 8x 800W or 8x 1kW parts. For the WSE/WSE-2, all of that power and cooling needs to be delivered to a single large wafer, meaning that even things like the thermal expansion rates of different materials matter. The other implication is that virtually everything on this assembly appears liquid-cooled.
[Image: Patrick with Cerebras WSE-2 at ISC 2022, Hamburg]

[Image: Cerebras CS-2 WSE-2 engine block]
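The funny thing is that the heat flux per unit area isn't actually extreme; it's the total heat in one assembly (plus the thermal-expansion issue STH mentions) that forces liquid cooling. A quick back-of-the-envelope, using the 15 kW and wafer-area figures from the thread and an assumed 700 W / 814 mm² for an SXM-class GPU die:

```python
# Rough heat-flux comparison. WSE-2 figures are from the article;
# the GPU figures are illustrative assumptions (SXM-class accelerator).
wse2_power_w  = 15_000
wse2_area_mm2 = 46_225

gpu_power_w   = 700
gpu_area_mm2  = 814

wse2_flux = wse2_power_w / (wse2_area_mm2 / 100.0)   # W/cm^2
gpu_flux  = gpu_power_w / (gpu_area_mm2 / 100.0)

print(f"WSE-2: {wse2_flux:.0f} W/cm^2 vs GPU die: {gpu_flux:.0f} W/cm^2")
```

So per square centimeter the wafer runs cooler than a flagship GPU die; the hard part is pulling 15 kW out of a single mechanical assembly without warping it.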
 
Considering they're using almost the same node size (7-8 nm), the GPU is 2.53x as dense. The difference is almost certainly down to the spacing needed between dies and to redundant circuitry, so that defective sections can be shut off to salvage the rest.
 

STH - 100M USD Cerebras AI Cluster Makes it the Post-Legacy Silicon AI Winner

The current Phase 1 has 32 CS-2’s and over 550 AMD EPYC 7003 “Milan” CPUs (note: Andrew Feldman, CEO of Cerebras told me they were using Milan) just to feed the Cerebras CS-2’s with data. While 32 GPUs are four NVIDIA DGX H100 systems these days, 32 Cerebras CS-2’s are like 32 clusters of NVIDIA DGX H100’s each on single chips with interconnects on the large chip. This is more like hundreds (if not more) of DGX H100 systems, and that is just Phase 1.
[Image: Cerebras Condor Galaxy 1 AI supercomputer]
 
So from what I'm understanding, they built a giant fucking chip? Congrats, but that hardly seems practical. Imagine having 5 people pick up a chip just to install it.

I don't have to imagine too hard, since I've seen how big the old SGI and Sun "personal graphics supercomputers" were: minifridge-sized, 200+ lbs. machines with really beefy graphics options that replaced entire CPU modules!

This is basically the 21st-century equivalent, and unlike that older big iron that could get away with old-fashioned heatsinks and fans loud enough to be mistaken for jet turbines, it seems like liquid cooling is mandatory. You've probably got enough waste heat on this massive chip that it could double as an electric skillet once you get a stress test going. Anyone wanna port Prime95 to this thing?

Actually, that does raise another question: what software even runs on this thing? What architecture is it? Having lots of cores for embarrassingly parallel workloads is one thing, but Amdahl's Law will kick you hard in the teeth if single-threaded performance isn't up to par with current CPU tech.
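To put numbers on the Amdahl's Law point: even a tiny serial fraction caps the benefit of a huge core count. A quick sketch (the 5% serial fraction below is just an illustrative assumption):

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Maximum speedup per Amdahl's Law for a workload where
    `parallel_fraction` of the runtime can be spread over n_cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)

# Even with 850,000 cores, a 95%-parallel workload tops out near 20x:
print(round(amdahl_speedup(0.95, 850_000), 2))  # ~20.0
```

That's why these things only make sense for embarrassingly parallel work like neural-net training, where the parallel fraction is effectively 100%.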
 
Actually, that does raise another question: what software even runs on this thing? What architecture is it? Having lots of cores for embarrassingly parallel workloads is one thing, but Amdahl's Law will kick you hard in the teeth if single-threaded performance isn't up to par with current CPU tech.
It is fed by AMD EPYC host CPUs, so x86-64 on that side. The Cerebras AI processors most likely have their own programming language coded specifically for their FP16 AI workloads, probably in a similar fashion to CUDA with NVIDIA GPUs.
 