Nvidia making 1,000% profit on H100 accelerators

Agent_N

Gawd
Joined
Aug 20, 2004
Messages
862
This is likely why Nvidia is not terribly concerned with gamers or cards targeted at gamers: the profit margins are not in the same ballpark as those on AI accelerators.

"Nvidia is raking in up to 1,000% in profit for each H100 GPU accelerator it sells, according to estimates made in a recent social media post from Barron's senior writer Tae Kim. In dollar terms, that means that Nvidia's street-price of around $25,000 to $30,000 for each of these High Performance Computing (HPC) accelerators (for the least-expensive PCIe version) more than covers the estimated $3,320 cost per chip and peripheral (in-board) components. As surfers will tell you, there's nothing quite like riding a wave with zero other boards on sight."

https://www.tomshardware.com/news/nvidia-makes-1000-profit-on-h100-gpus-report
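For anyone who wants to check the arithmetic, here is a rough sketch using only the figures in the quote above (the $3,320 per-unit cost is itself an estimate, and the function names are just illustrative). It treats "profit" as markup over the estimated unit cost, which is an assumption about how the headline number was derived, and it ignores everything outside that per-unit cost.

```python
# Figures quoted in the post above; the unit cost is an estimate, not a confirmed number.
EST_COST_USD = 3_320
STREET_PRICES_USD = (25_000, 30_000)


def markup_pct(price: float, cost: float) -> float:
    """Profit expressed as a percentage of cost."""
    return (price - cost) / cost * 100


def gross_margin_pct(price: float, cost: float) -> float:
    """Profit expressed as a percentage of the selling price."""
    return (price - cost) / price * 100


def price_for_markup(cost: float, pct: float) -> float:
    """Selling price needed to reach a given markup over cost."""
    return cost * (1 + pct / 100)


for price in STREET_PRICES_USD:
    print(
        f"${price:,}: markup {markup_pct(price, EST_COST_USD):.0f}%, "
        f"gross margin {gross_margin_pct(price, EST_COST_USD):.1f}%"
    )

print(f"Price for a 1,000% markup: ${price_for_markup(EST_COST_USD, 1000):,.0f}")
```

Under these assumptions the quoted price range works out to a markup of roughly 650% to 800% over the estimated cost, and a full 1,000% markup would need a price of about $36.5k, so "up to 1,000%" reads as an upper bound rather than a typical figure.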
 
I was just saying I don't think AMD will be the only company skipping out on high-end graphics next generation...
Not the high end but the upper mid. I suspect that Nvidia will keep their 102 dies; those are, after all, the datacenter silicon that doesn't bin well enough, so they will have them regardless. It is the rest of the stack they will cut back on, but not do away with.
Traditionally Nvidia has the 102, 103, 104, 106, and 107 dies. I bet they will instead consolidate that down to three (High, Mid, and Low) and then use clock speeds, packaging, and VRAM to differentiate between them.
So the 90 and 80 will still get the 102s, but the 70 and 60 Ti will likely get the 103, which might also be used for the mobile 80 and 90 Max-Q parts, and the 104 would go to the non-Ti 60, the 50, and the 70-50 mobile Max-Q parts.

So instead of leaving those markets, they will consolidate: they build fewer distinct parts so they can allocate that silicon to Enterprise, whose binning should do well enough at feeding the 80 and 90 series parts, and then they fill in the rest with smaller, cheaper dies that they split up however they see fit.
 
I was just saying I don't think AMD will be the only company skipping out on high-end graphics next generation...
Most people cannot afford high end GPUs anyway, the bulk of sales is low end up to mid range at best. No reason for AMD to sink a lot of money into a high end GPU when they can make a lot more money making AI accelerators.
 
Not the high end but the upper mid, The 80 and 90 series are essentially made from the chips that don't bin well enough for their enterprise units, what I suspect Nvidia to do is skip over the 70 series and go for the 50 and 60 instead.
Not exactly true. The H100 was benchmarked somewhat recently and it sucks for gaming. The Enterprise cards have different designs for specific workloads, gaming isn't one of them. The high end GPUs are specific to gaming, not really designed for Enterprise use.
 
It would have been a bit shocking if it was significantly less (for the raw gross). $3,300 is already a lot: imagine a 4090 costs $700 to make, and think how much room that leaves in between to go to 814 mm² instead of 600-something, PCI Express 5.0 instead of 4.0, and HBM instead of GDDR.
 
Not exactly true. The H100 was benchmarked somewhat recently and it sucks for gaming. The Enterprise cards have different designs for specific workloads, gaming isn't one of them. The high end GPUs are specific to gaming, not really designed for Enterprise use.
Isn't the 4090 more of a binned-down RTX 6000 than an H100?

The H100 is quite a different chip; the GH100 is an 814 mm², 5120-bit HBM type of affair.

This is the up-binned pro line of Lovelace:
https://www.nvidia.com/en-us/design-visualization/rtx-6000/

It is 10% slower in 3DMark, with a much lower clock speed and wattage, and made to be stacked with others in a case.
 
Not exactly true. The H100 was benchmarked somewhat recently and it sucks for gaming. The Enterprise cards have different designs for specific workloads, gaming isn't one of them. The high end GPUs are specific to gaming, not really designed for Enterprise use.
Many datacenter and workstation parts use the exact same chip as the 4090; it is just binned differently, and they change the memory configuration and firmware.
Every Nvidia workstation card sucks for gaming and always has, but they have always used the exact same silicon, just with different drivers, firmware, and memory configuration.
The reason Nvidia gaming cards suck for most workstation loads is the same: the firmware, drivers, and memory are gimped on Nvidia's side to make them not work.

AMD physically separates their consumer and enterprise parts with the RDNA and CDNA architectures, with completely different production runs. Nvidia does not; they use software and VRAM to keep them distinct at an, I want to say artificial level, but it's not artificial because they really are separated. Somebody help me with the word I'm looking for, because it's not coming up.


Correction for clarity:
In my haste to answer, I said the H100 used the same silicon as the 4090, which is obviously not true. My brain muddled things up and somehow used H100 as a placeholder for the popular workstation and datacenter SKUs, so I am going to blame that on a lack of sleep caused by BG3.
 
Isn't the 4090 more an RTX 6000 binned down than a H100 ?

H100 are quite different chips, the GH100 is a 814 mm², 5120 bits HBM type of affair

that the up binned pro line of Lovelace:
https://www.nvidia.com/en-us/design-visualization/rtx-6000/

10% slower in 3dmark with the much lower clock speed- wattage, made to be stacked with others in a case.
The Enterprise graphics rendering cards like the RTX 6000 will do better at gaming as one of their use areas is graphics rendering and such, so it may be more like the 4000 series cards for gamers than the more specific accelerators such as the H100. Again, these chips are geared towards specific uses and are not strictly binned down gamer grade GPUs.
 
somebody help me with the word I'm looking for because it's not coming up.
Gimped. Nvidia actively gimps gpus to make sure you can't do *too* much with them.

*Edit* Yesterday I learned that mobile 3080s could actually have 16 GB of VRAM, and did. Imagine that. A mobile 3080 being what the 3070/Ti should have always been.
 
So instead of leaving those markets, they will consolidate, so they are building fewer parts to allocate that silicon to the Enterprise which their binning should do well enough at feeding the 80 and 90 series parts, then they can fill the rest with smaller cheaper dies that they separate out how they see fit.

My bet is that they'll do a limited run of halo gaming cards just to say they got the big win, then allocate wafer space for enterprise and call it a day. How many -90-series cards do they have to produce for it to not count as vaporware? 50K? 30K? You know they're doing the math already.
 
The H100 uses the exact same chip as the 4090, it is just binned differently and they change the memory pattern and firmware.
https://www.techpowerup.com/gpu-specs/h100-cnx.c4131
Die Size: 814 mm²
Density: 98.3M / mm²
Shading Units: 14,592

vs
https://www.techpowerup.com/gpu-specs/nvidia-ad102.g1005
Die Size: 609 mm²
Density: 125.3M / mm²
Shading Units: 18,432

Does that mean the central die of the H100 is the same thing as an AD102, just with six add-ons around it for the giant HBM memory?

Hopper has 9216 FP64 units and Lovelace 288; is that something that can be changed, a bit like an FPGA, by firmware after the fact once binned?
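As a quick sanity check on the two spec listings above, die area times transistor density gives an approximate transistor count for each chip. This is only a sketch built on the quoted figures (the function name is illustrative, and the density values are the rounded ones from the listings), but it shows how far apart the two dies are.

```python
def transistors_billions(die_mm2: float, density_m_per_mm2: float) -> float:
    """Approximate transistor count in billions from die area and density (M/mm²)."""
    return die_mm2 * density_m_per_mm2 / 1000


# Values copied from the two spec listings quoted above.
gh100 = transistors_billions(814, 98.3)
ad102 = transistors_billions(609, 125.3)

print(f"GH100: ~{gh100:.1f}B transistors on 814 mm²")
print(f"AD102: ~{ad102:.1f}B transistors on 609 mm²")
print(f"Area ratio GH100/AD102: {814 / 609:.2f}x")
```

The GH100 comes out about a third larger in area but noticeably less dense, and the shading unit counts differ too, which fits two separately designed dies rather than one die binned two ways.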
 
https://www.techpowerup.com/gpu-specs/h100-cnx.c4131
Die Size: 814 mm²
Density: 98.3M / mm²
Shading Units: 14,592


vs
https://www.techpowerup.com/gpu-specs/nvidia-ad102.g1005
Die Size: 609 mm²
density: 125.3M / mm²
Shading Units:18,432

Does it mean that the central die of the H100 is the same things than an AD102 with just 6 addon arround it for the giant HBM memory ?

Hopper has 9216 FP64 units, Lovelace 288, it is something that can be changed a bit like an FPGA by firmware after the fact once binned ?
The AD102 is used for the Titan Ada, RTX 6000 Ada, RTX 5000 Ada, the L40 series, the 4090, and 4080 cards.

The GH100 is used for the H800 and all the H100 SKUs in both SXM5 and PCIe variants.


The H100 may be the halo datacenter card, and it certainly overshadows the L40, but Nvidia moves more L40 units by a long shot, not to mention all the RTX workstation cards. Those datacenter and workstation cards get the better-binned 102 chips, with the consumer parts getting the rest. Once demand wanes for those workstation parts and the supply of better-binned chips opens up, that is when we would see the 4080 and 4090 Ti variants.
 
The AD 102, is used for the Titan ADA, RTX 6000 ADA, RTX 5000 ADA, and the L40 series.

The GH100 is used for the H800, and all the H100 SKU's in both SXM5 and PCIe variants.
That's why I thought "the H100 uses the exact same chip as the 4090" was not really the case. It's more that the RTX 6000 Ada/L40 use the exact same chip as the 4090, not the H100.
 
Thus why I thought that: H100 uses the exact same chip as the 4090, was not really the case ? more like the RTX 6000 ADA/L40 use the exact same chip as the 4090 not the H100.
Yeah no I didn't mean to say that, that is my hands working faster than my head.

I'll go strike that out to not confuse others.
 
The AD 102, is used for the Titan ADA, RTX 6000 ADA, RTX 5000 ADA, the L40 series, the 4090, and 4080 cards

The GH100 is used for the H800, and all the H100 SKU's in both SXM5 and PCIe variants.


The H100 may be the halo datacenter card and certainly overshadows the L40 but Nvidia moves more L40 units by a long shot, not to mention all the RTX workstation cards. Those datacenter and workstation cards get the better binned of the 102 chips with the consumer parts getting the rest, once demand wains for those workstation parts and the supply of the better-binned cards opens up is when we would see the 4080 and 4090 TI variants.
Hopper (H100) and Lovelace (AD1xx) are not the same architecture; they are completely different. Though they may share tensor core and SM designs, the quantity of those shared parts is vastly different. They are designed for specific workloads. One GPU design cannot be good at everything, so they are customized, just like AMD is doing with RDNA and CDNA.
 
As for the subject of whether or not they will give much of their precious fab space to a halo product:

I feel they will try to have the gaming champ for a while.
I feel like they will make a lot of them. The 4090 was a huge success. I am sure its owners are more likely to fill out the Steam survey and be active gamers than 2060 owners, so they could be overrepresented quite a bit, but the Steam hardware survey seems to point to 750k sales among Steam users.

In S Korea they even sold more 4090s (3.17%) than 4080s (3.02%) as a share of total dGPU sales in 2023. The 4090 sold more than half of all the Radeons combined in 2023 in some markets, if those numbers are true.

It depends a lot on how much of the worldwide fab build-out, the massive 2020-2021 "we need more chip capacity" push, manages to come online and compete with TSMC's established performance and yield, and on how long the AI craze (or the next one) lasts. It is possible that by 2025 the idea that they have to juggle how much they allocate, with everything clearly coming at the cost of another opportunity, will stop being true.

Between HBM memory, chiplets, the interconnect between them, and VRAM quantity, chances are the next Hopper will be different enough that the 5090 will not be in direct competition with it that much.
 
Isn't the 4090 more an RTX 6000 binned down than a H100 ?

H100 are quite different chips, the GH100 is a 814 mm², 5120 bits HBM type of affair

that the up binned pro line of Lovelace:
https://www.nvidia.com/en-us/design-visualization/rtx-6000/

10% slower in 3dmark with the much lower clock speed- wattage, made to be stacked with others in a case.
Yeah, I went back and corrected that, I should give BG3 a rest for a day or 2 and actually get to bed...
The RTX workstation cards also use GDDR ECC and that has a latency penalty in addition to the speed and bandwidth differences of the GDDR6X memory used on the consumer parts
 
I feel like they will make a lot of them. The 4090 was a huge success. I am sure its owners are more likely to fill out the Steam survey and be active gamers than 2060 owners, so they could be overrepresented quite a bit, but the Steam hardware survey seems to point to 750k sales among Steam users.

Depends on what the supply of high-end 4000 cards is at the point of launch. There are rumors that they have a lot more 4090s and 4080s, and they're holding them back and releasing them in small batches to keep prices high.
 
Hopper arch (H100) and Lovelace (AD1xx) are not the same, they are completely different though they may share tensor cores and SM, the quantity of those shared design parts are vastly different. They are designed for specific workloads, one GPU design cannot be good at everything, they are customized just like AMD is doing with RDNA and CDNA.
And I hope Nvidia continues to differentiate them, their current method of product segmentation for keeping their consumer parts from being viable enterprise ones is not working for anybody. Normally they just gimp the memory and call it a day, but they can't keep doing that because it's starting to show.
 
Depends on what the supply of high-end 4000 cards is at the point of launch. There are rumors that they have a lot more 4090s and 4080s, and they're holding them back and releasing them in small batches to keep prices high.
Yeah, it's not like they are making more at this point; their run at TSMC is done. It's also not difficult to actually get either the 4090 or 4080 right now. If they dumped them all, then next year they would just find themselves in the same boat AMD is in now with the 6900 series parts, which are out there steeply discounted and making the 7000 series look bad.
Nvidia doesn't want to risk having them out there at huge discounts competing against the 5000 series.
 
Do you know how long and how much money NVIDIA has been investing in AI?
Yeah it’s almost like they developed an entire language model for it or something and spent the last 17 years working with industry specialists to refine it?

Crazy how it just all happened to magically take off out of nowhere while Nvidia coincidentally had a product ecosystem that was near perfect for it just there waiting to be utilized…


Sorry if this is snarky…. It’s a weird day
 
Yeah it’s almost like they developed an entire language model for it or something and spent the last 17 years working with industry specialists to refine it?

Crazy how it just all happened to magically take off out of nowhere while Nvidia coincidentally had a product ecosystem that was near perfect for it just there waiting to be utilized…


Sorry if this is snarky…. It’s a weird day
Not snarky, just plain blunt fact :).
 
The mother of future 5000 series.

Hopefully will be a relevant jump for an "earthly" price.
 
Yeah, I went back and corrected that, I should give BG3 a rest for a day or 2 and actually get to bed...
The RTX workstation cards also use GDDR ECC and that has a latency penalty in addition to the speed and bandwidth differences of the GDDR6X memory used on the consumer parts
Same here, too much BG3, in Chapter 3 now. The cut-down AD103 is what the 4080 uses, not the AD102: 9728/112/304/76/304, out of 10752/120/336/84/336.

Your posts are always very informative, cheers.
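To put the cut-down figures in perspective, here is a minimal sketch of the enabled fraction for each of the five numbers quoted above. The post does not label the fields, so the sketch compares them purely by position; the variable and function names are illustrative.

```python
# Figures as quoted in the post; fields are unlabeled there, so compare by position.
RTX_4080 = (9728, 112, 304, 76, 304)
FULL_DIE = (10752, 120, 336, 84, 336)


def enabled_fractions(cut: tuple, full: tuple) -> list:
    """Fraction of each unit type left enabled on the cut-down part."""
    return [c / f for c, f in zip(cut, full)]


fractions = enabled_fractions(RTX_4080, FULL_DIE)
for c, f, frac in zip(RTX_4080, FULL_DIE, fractions):
    print(f"{c:>6} / {f:<6} = {frac:.1%}")
```

By these numbers the 4080 keeps a little over 90% of each unit type enabled, with the second field slightly higher at about 93%.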
 
Jensen will be one of those people who gets to list "right place, right time" among his major life accomplishments. Sucks for gamers but if you are a primary Nvidia investor, good on you, your guy is making the right moves. :)
Smart gamers would have bought shares in nvidia & a PS5 for gaming 😉
 
Same here,too much BG3,in Chapter 3 now. The cut down AD103 is what the 4080 uses,not the AD102. 9728/112/304/76/304, out of 10752/120/336/84/336.

Your posts are always very informative,cheers.
I'm one of those players who gets a ways in and then decides if only I was playing a character that could do 'This' and wants to make a new one. It's very hard for me to get anywhere in games like BG3, love the shit out of it so far but really hard for me to play.
Or worse, Oh god that's how that decision I made 20h ago played out.... I don't like that, should I load it back up, no that would be stupid, that would take me like 4 days to get back here, but would it I mean I wouldn't have to spend so much time searching, or would I have to spend more because I would remember doing things then forget to do them and miss items and loot, so I would have to be more careful and it would maybe take longer, wait no that's stupid why would it take longer... Well maybe this part isn't feeling right because one of my companions is built wrong, maybe I should look into that... A few hours later...

Anyways yeah I keep forgetting the 4080TI isn't a thing and any rumors about its existence are just that.
 
I worked for a company that sold specific products to hospitals. We had a control board we charged 3k for. It cost us $200 to make.
Basically the story across the entire healthcare and pharma industry. Part of the markup is theoretically for the robust QC that healthcare and pharma products are supposed to go through.
 
Nothing surprising here, being in tech sales. We sell 7.6 TB SSDs for 5k+. Par for the course: COGS, margins, etc., etc. Absolutely standard practice.
Not bad, Dell charging $3K CAD for 2TB right now for a "read intensive" one and it basically doubles if I want "mixed-use" so really $5K is almost a deal for that.
 
Nothing surprising here being in tech sales. We sell 7.6tb SSDs for 5k+ Par for course, cogs, margins, etc, etc ,etc. absolutely standard practice.
Sure, but who are your customers??? Lazy corporate buyers?
 