
Nvidia acqui-hires Groq

Marees

Comment by a Twitter user:

https://x.com/andrawesbahou/status/2003989932524896326?s=20

I think Nvidia acquired Groq *now*, to hedge against DRAM prices and capacity. If you didn't know, DRAM prices exploded recently. While nvidia GPUs are HBM (i.e DRAM heavy), Groq chips rely on in-chip SRAM. There are architectural and compiler-dependent ways to do this, so likely why they got Groq's world-class compiler team.
 
My thought was they wanted some lower power stuff for inference where ASICs can shine.
I saw something about maintaining the timing/coherence across multiple servers / GPUs

Maybe some network synchronization kind of stuff ??
 
Groq is about using older nodes and in-memory compute.

There was talk of a risk: as a generalist, Nvidia could fall behind on inference per watt/dollar against companies doing only inference (like Groq). This deal counters that risk.


I saw something about maintaining the timing/coherence across multiple servers / GPUs

Maybe some network synchronization kind of stuff ??
Deterministic-timing inference tech could make synchronization really easy, and a low-latency system would not hurt...
 
My thought was they wanted some lower power stuff for inference where ASICs can shine.
I figured they wanted to improve their PTX-to-assembly translation times. Nvidia has been hitting a wall with the hardware virtual machines that do the translation as the complexity of the tasks increases.
The RISC-V processors that do that have come a long way over the last decade, but they aren't keeping up, and Nvidia needed a breakthrough.

Probably cheaper to buy the tech than to risk building the next decade of hardware around a licensing agreement.
 
Combining Groq's on-chip memory with Nvidia's advanced CoWoS/NVLink ability to make giant multi-die stitched chips could make them quite something.

They can also take full control of the most expensive part, memory. Quite the vertical integration move: it becomes just them and TSMC (or Samsung/Intel as a pure fab partner). Nvidia's money and leading expertise in the chiplet world could open the door to 50-60 Groq-like chips on a SoW (they would not need that many of those to make a 70-140B model fit), built on what will become a very cheap, mature older node, such as the TSMC 4N-class process made mostly for Nvidia, or even 7 nm, which is still going.

They could also put some CPU/GPU tiles and regular memory alongside it, to get large capacity while running inference for a specific agent out of the on-die SRAM.
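
A rough back-of-the-envelope sketch of the "how many chips to make a model fit" arithmetic. The 230 MB of SRAM per chip is the figure publicly quoted for Groq's first-generation chip and is used here only as an assumption; the function name and the 1-byte and 2-byte weight sizes are illustrative, and KV cache and activations are ignored.

```python
import math

def chips_to_hold_weights(params_billion, bytes_per_param, sram_mb_per_chip=230):
    """Minimum number of chips whose combined SRAM holds the weights alone.

    sram_mb_per_chip=230 is an assumed per-chip SRAM size, not a spec for
    any future Nvidia/Groq part.
    """
    weight_bytes = params_billion * 1e9 * bytes_per_param
    sram_bytes = sram_mb_per_chip * 1e6
    return math.ceil(weight_bytes / sram_bytes)

for params in (70, 140):
    for bytes_per_param in (1, 2):  # 8-bit and 16-bit weights
        n = chips_to_hold_weights(params, bytes_per_param)
        print(f"{params}B @ {bytes_per_param} byte/param: {n} chips")
```

Under these assumptions a 70B model at 1 byte per parameter needs 305 chips; more SRAM per chip or lower-precision weights would lower that count.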
 
https://x.com/GavinSBaker/status/2004562536918598000

According to Baker, Nvidia is building a lineup that could stay the best at everything in publicly available AI for a while.

Rubin CPX (regular GDDR): can have giant memory capacity to do prefill
Rubin HBM: giant bandwidth and good capacity, for training as it already does and for high-density inference (giant model in one place)
Groq: lower memory quantity, ultra-high bandwidth, low density, for super-fast inference, i.e. the decoding stage using the Rubin CPX prefill output

Giant intake (book, codebase, video, etc.) -> Rubin CPX creates a prefill cache; there you want Nvidia's general high TOPS and a giant amount of memory (that does not need bandwidth).

Transfer that cache to the Rubin SRAM part (Groq tech) via NVLink, with the model weights all on the chips rather than in external RAM, and the speed-per-dollar jump could again be extraordinary: something like the original GPT-4 could go from 50 tps on H100 to 2,000+...
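
A minimal sketch of why memory bandwidth caps decode speed, assuming single-stream decoding where every generated token reads all the weights once. The 70 GB weight footprint and the 160 TB/s aggregate SRAM bandwidth are made-up placeholders, not figures for GPT-4 or any real product; 3.35 TB/s is the HBM bandwidth listed for the H100 SXM.

```python
def decode_tps_upper_bound(weight_gb, bandwidth_tb_s):
    """Upper bound on tokens/s when each token streams all weights once.

    Ignores compute time, KV-cache traffic and batching, so real numbers
    will be lower (or higher per device when requests are batched).
    """
    return (bandwidth_tb_s * 1e12) / (weight_gb * 1e9)

WEIGHT_GB = 70            # hypothetical weight footprint
HBM_TB_S = 3.35           # H100 SXM HBM bandwidth
SRAM_AGG_TB_S = 160       # hypothetical aggregate on-chip SRAM bandwidth

hbm_tps = decode_tps_upper_bound(WEIGHT_GB, HBM_TB_S)
sram_tps = decode_tps_upper_bound(WEIGHT_GB, SRAM_AGG_TB_S)
print(f"HBM-bound:  {hbm_tps:.0f} tokens/s")
print(f"SRAM-bound: {sram_tps:.0f} tokens/s")
print(f"ratio:      {sram_tps / hbm_tps:.1f}x")
```

The ratio is simply the ratio of the two bandwidths, which is why moving the weights into on-chip SRAM is the lever for the decoding stage.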

https://newsletter.semianalysis.com/p/another-giant-leap-the-rubin-cpx-specialized-accelerator-rack

Because the prefill stage during inference tends to heavily utilize compute (FLOPS) and only lightly use memory bandwidth, running prefill on a chip with lots of expensive HBM featuring very high memory bandwidth is a waste.
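
A small sketch of the prefill-versus-decode point in roofline terms, assuming a dense model where each weight costs about 2 FLOPs per token and is read from memory once per pass. The function names, the 8,192-token prompt, and the ridge point of 300 FLOPs/byte are illustrative assumptions, and attention and KV-cache traffic are ignored.

```python
def arithmetic_intensity(tokens_per_weight_read, bytes_per_param=2):
    """FLOPs per byte of weights read: ~2 FLOPs per parameter per token."""
    return 2 * tokens_per_weight_read / bytes_per_param

def limiting_resource(intensity, ridge_flops_per_byte):
    """Compare against the hardware ridge point (peak FLOPS / bandwidth)."""
    if intensity >= ridge_flops_per_byte:
        return "compute-bound"
    return "bandwidth-bound"

RIDGE = 300  # hypothetical accelerator ridge point, FLOPs per byte

prefill = arithmetic_intensity(tokens_per_weight_read=8192)  # whole prompt at once
decode = arithmetic_intensity(tokens_per_weight_read=1)      # one token per pass

print("prefill:", prefill, limiting_resource(prefill, RIDGE))
print("decode: ", decode, limiting_resource(decode, RIDGE))
```

Prefill reuses each weight across the whole prompt, so it lands far to the compute side of the ridge, while single-token decode sits far to the bandwidth side.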

That kind of move, with that kind of money (which amounted to spending the last quarter's profit on something), shows how incredibly hard AMD/Intel would have to work to be anything more than a strategic plan B in that space.
 
We will have to see if it is too soon for the 6000-series desktop GPUs, but given how small the DLSS model is, it could possibly run entirely in on-chip SRAM, on a part of the GPU, like Groq does.

That would make it near instant (i.e. 1080p upscaled running almost as fast as 1080p, in a deterministic amount of time, which could be nice for frame generation) and leave the rest of the GPU alone to do its thing.
 
In an email to employees, Nvidia CEO Jensen Huang reportedly referred directly to Groq’s chips.

“We plan to integrate Groq’s low-latency processors into the Nvidia AI factory architecture, extending the platform to serve an even broader range of AI inference and real-time workloads,” he reportedly wrote.

However, Huang confirmed in a media Q&A at CES yesterday that Groq’s technology would not become a part of Nvidia’s main data center roadmap.

“[Groq is] very, very different, and I’m not expecting anything there to replace what we do with Vera Rubin and our next generation,” Huang said. “There’s no reasonable [or] good way to do something better than Vera Rubin that we know of, and this doesn’t change that. However, we might be able to add their technology in a way that allows us to do something incremental that the world hasn’t been able to do yet.”


https://www.eetimes.com/what-is-groq-nvidia-deal-really-about/


one of Groq’s biggest selling points so far has been that it is not Nvidia—that it is a viable and inexpensive second-source for sovereign AI infrastructure. That is likely now defunct and future sovereign buyers will be subject to Nvidia’s negotiation tactics and supply chain constraints anyway.
 