Nvidia's $3,000 mini AI supercomputer draws scorn from Raja Koduri and Tiny Corp — AI server startup

erek


"Koduri later elaborated that - in contrast to the big FP4 claims - by his calculations, the FP16 performance of the Project Digits AI supercomputer wasn’t that impressive. Koduri estimated that the FP16 performance of the upcoming GeForce RTX 5070 and even the $250 Intel Arc B580 “seems close” to what a Project Digits machine could muster."

Source: https://www.tomshardware.com/tech-i...r-startup-suggests-users-just-buy-a-gaming-pc
 
In some ways it's not that impressive. 20 Arm cores really isn't that much, and the GPU die does not equal clusters of HPC GPUs. Not really worthy of the supercomputer title; perhaps "super single-board computer." The claims against it may be overblown, though: this appears to be one of the most powerful single-board computers for AI, and perhaps other workloads. The 128GB of high-bandwidth memory is extremely useful when even top-tier consumer GPUs are constrained on memory capacity. Models on single systems could certainly be optimized for such a unique and powerful device, and Nvidia already has a strong hold on the software side of AI. I think you could make clusters of these via the available networking as well.
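Back-of-the-envelope version of Koduri's math, assuming the headline ~1 PFLOP figure is sparse FP4 and that throughput halves at each step (both assumptions on my part, not official specs):

```python
# Rough sketch of the precision-scaling math (assumed figures, not official specs)
fp4_sparse_pflops = 1.0              # headline Project Digits number, assumed sparse FP4
fp4_dense = fp4_sparse_pflops / 2    # drop 2:1 structured sparsity
fp8_dense = fp4_dense / 2            # halve throughput per precision doubling
fp16_dense = fp8_dense / 2
print(f"estimated FP16 dense: {fp16_dense * 1000:.0f} TFLOPS")  # ~125 TFLOPS
```

That lands around 125 TFLOPS FP16 dense, which is roughly why the 5070 and B580 comparisons come up.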
 
jfc, let's just go back to 8-bit PCs if everything is gonna be nybbles.

How many 8-bit cores can you pack into a wafer on modern nodes?
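Napkin math, with every number a rough assumption (Z80 at ~8,500 transistors, ~125M transistors/mm2 for a 4nm-class node, full 300mm wafer with no edge loss):

```python
import math

# Napkin math: how many Z80-class 8-bit cores fit on one 300mm wafer?
# All figures below are rough assumptions, not measured numbers.
z80_transistors = 8_500                     # classic Z80 transistor count
density_per_mm2 = 125e6                     # ~4nm-class logic density, transistors/mm^2
wafer_area_mm2 = math.pi * (300 / 2) ** 2   # ~70,686 mm^2, ignoring edge loss

cores = wafer_area_mm2 * density_per_mm2 / z80_transistors
print(f"~{cores / 1e9:.1f} billion 8-bit cores per wafer")
```

So on the order of a billion. Good luck with the interconnect.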
 
so shittalker extraordinaire says it's no good. must be bad then.
 
I mean, the FP4 numbers are also close to the 5070's... probably because that's around the GPU they put in it.

Those are really impressive FP4 numbers, and that will be a very common use case for it.
 
Those are really impressive FP4 numbers, and that will be a very common use case for it.
I am trying to imagine doing PCA or SVD on 4-bit data... or numerically solving ODEs in 4 bits. Or FEA... etc. Just such a weird flex to get excited about. Yay, a bunch of Z80s. I mean, I love my TI calculators, but still...
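A toy numpy sketch of why: round a matrix to 16 uniform levels (a crude stand-in for 4-bit) and watch the singular values drift:

```python
import numpy as np

# Toy illustration: uniform 4-bit-style quantization vs. SVD accuracy.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))

# Quantize to 16 uniform levels over A's range (crude stand-in for FP4)
lo, hi = A.min(), A.max()
q = np.round((A - lo) / (hi - lo) * 15) / 15 * (hi - lo) + lo

s_full = np.linalg.svd(A, compute_uv=False)
s_q = np.linalg.svd(q, compute_uv=False)
print("worst relative drift in singular values:",
      np.abs(s_full - s_q).max() / s_full.max())
```

Even on this tiny, well-behaved example the spectrum moves noticeably; real exploratory analysis would fare worse.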
 
I am trying to imagine doing PCA or SVD on 4-bit data... or numerically solving ODEs in 4 bits. Or FEA... etc. Just such a weird flex to get excited about. Yay, a bunch of Z80s. I mean, I love my TI calculators, but still...
It is obviously an AI machine, and considering the demand, is it a weird flex for a hardware maker to make? Managing to do what they do in 4 bits (or 3, or even fewer) is impressive.
 
It is obviously an AI machine, and considering the demand, is it a weird flex for a hardware maker to make? Managing to do what they do in 4 bits (or 3, or even fewer) is impressive.
So here's the thing. If you're gonna be training a model, you tend to do a LOT of exploratory data analysis. That doesn't really work great in 4 bits. Maybe once you've done that work you could use this to run the PyTorch model training, but that's a small part (albeit an important one).

What I am getting at is that I see zero value for model training in this. It seems like a machine that does inference. So if you don't want to rent cloud compute and want to host your own prebuilt model, then this makes sense. And the 128GB means you can fit big models.

But that's it... beyond that I don't see any value. Honestly, Copilot's pretty handy (as an example). I don't know what I need that it can't do, such that I'd want to pay to host my own model.
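For scale, a napkin check on what 128GB actually fits at 4-bit, assuming ~0.5 bytes per parameter plus a rough 20% overhead for KV cache and activations (both assumptions):

```python
# Napkin fit check: 4-bit weights ~0.5 bytes/param, plus assumed ~20% overhead
for params_b in (8, 70, 200):            # model sizes in billions of parameters
    gb = params_b * 0.5 * 1.2            # GB of weights + rough runtime overhead
    print(f"{params_b}B @ 4-bit: ~{gb:.0f} GB -> fits in 128 GB: {gb < 128}")
```

So yeah, 4-bit 70B-class models fit with plenty of headroom.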
 
Keeping your data private and in-house? I "love" how this is not even a consideration now. Just as the doctor Microsoft ordered.
Hosting on AWS behind a load balancer is private. I love how people are unable to think now. Just as social media ordered.
 
Cloud pricing for an 80GB-or-more VRAM system is not particularly cheap right now either, right?

https://cloud.google.com/compute/vm-instance-pricing

According to this, a single H100 (80GB) GPU costs $8,074.53 a month, or $11.061 an hour, on Google. Amazon has some $4-5-an-hour options that could do, but a use case that just needs very large VRAM doesn't need H100 levels of grunt; it could be vastly cheaper and still quick.

There is also robot/factory/vehicle usage, anything with real-time imagery, just to get rid of some latency versus a cloud solution.
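Quick break-even math with those rates (the $4.50 figure is my stand-in for the Amazon range):

```python
# Break-even sketch: $3,000 box vs. renting by the hour (rates from above)
device_cost = 3_000.0
for rate in (11.061, 4.50):        # Google H100 rate above; assumed ~$4-5/hr option
    hours = device_cost / rate
    print(f"at ${rate}/hr, the box pays for itself after ~{hours:,.0f} GPU-hours")
```

Call it roughly 270 hours against the H100 rate, or about 670 hours against the cheap option.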
 
Cloud pricing for an 80GB-or-more VRAM system is not particularly cheap right now either, right?

https://cloud.google.com/compute/vm-instance-pricing

According to this, a single H100 (80GB) GPU costs $8,074.53 a month, or $11.061 an hour, on Google. Amazon has some $4-5-an-hour options that could do, but a use case that just needs very large VRAM doesn't need H100 levels of grunt; it could be vastly cheaper and still quick.

There is also robot/factory/vehicle usage, anything with real-time imagery, just to get rid of some latency versus a cloud solution.
Zero chance you have it on all the time. You stick it in a container, put that behind a load balancer with spot prices, and you're probably paying $300-400 a year.
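At an assumed ~$4/hr spot rate, that budget works out to:

```python
# Sanity check: what does $300-400/year buy at an assumed ~$4/hr spot rate?
spot_rate = 4.0                      # $/hr, assumed
for budget in (300, 400):
    hours = budget / spot_rate
    print(f"${budget}/yr buys ~{hours:.0f} spot GPU-hours (~{hours / 52:.1f} hrs/week)")
```

i.e. a couple of hours a week of actual GPU time.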
 
If it is a factory running some model, it can be on a lot of the time...
A vehicle or robot that uses it can be on a lot of the time.
Your own 70B code-generation helper can be on a lot of the time.
Used for training, there's not much of a limit on how many hours you'll run trying different datasets.
 
This is something I literally do. I have 2 3090s with NVLink. To try a bigger model, I make a local container and try it in a spot instance. Hugging Face has some nice tooling to drop memory needs. If I really want to run the model, I can put it on AWS and it's cheap. If you need it running constantly... that's different. Zero chance Digits is used to train models.
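(For the curious: the "nice tooling" is presumably 4-bit loading via transformers and bitsandbytes. A minimal sketch, with a placeholder model id, assuming accelerate is installed for device_map:)

```python
# Hedged sketch of the memory-saving workflow: 4-bit loading via
# transformers + bitsandbytes. Model id is a hypothetical placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "some-org/some-70b-model"   # placeholder, not a specific model
bnb = BitsAndBytesConfig(load_in_4bit=True,
                         bnb_4bit_compute_dtype=torch.bfloat16)

tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb,
    device_map="auto",   # lets accelerate shard layers across both 3090s
)
```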
 
Not that much more expensive than building your own 2x3090 setup (and less than what it would have cost to do one not so long ago), $1,500-$1,700 in GPUs alone.

A single 48GB RTX 6000 still sells for more than this whole computer, same for the 80GB A100 (and a lot of the 40GB ones, even). Not sure how close to those in performance it will get outside 4-bit, but if you need large RAM with modest performance, this could work.
 
Not that much more expensive than building your own 2x3090 setup (and less than what it would have cost to do one not so long ago), $1,500-$1,700 in GPUs alone.

A single 48GB RTX 6000 still sells for more than this whole computer, same for the 80GB A100 (and a lot of the 40GB ones, even)
Hold on now... you're not playing fair. One I bought new at retail during the pandemic, and it was paid for in ETH mining. The other was basically free through trading off old hardware. The NVLink was 200 bucks.

I am cheap as fuck.

Also, my PC is great for a variety of tasks.

Edit: Actually, very curious if that lil guy would be good at mining coins??? (probably not)
 
Anything with the word "AI" or god forbid "supercomputer" in the name and I am insta-buying as reflexively as Comix clicking Win A Free iPad.

Anything with both "AI" and "supercomputer" in the name and I'm entering cranial liquefaction. I am Elon every time he remembers he has enough money to make women not care how many cameras he's got set up. I'm very aware that there is no honor in this.
 