Radeon Instinct MI60 Benchmark Had Tesla V100 GPU Operating Without Tensor Cores Enabled

Megalith

24-bit/48kHz
Staff member
Joined
Aug 20, 2006
Messages
13,000
According to Wccftech, AMD was not being entirely honest when comparing its new 7nm Radeon Instinct MI60 GPU with NVIDIA’s Tesla V100. The company’s ResNet-50 benchmark suggested that the parts were comparable, but that was only because the Tensor cores in the Tesla were disabled: it was only running at a third of its potential. “The performance of the V100 is just over three times that of the Radeon Instinct MI60.”

The company had claimed inference performance comparable to NVIDIA’s Tesla V100 flagship GPU. I had seen ResNet-50 numbers before and distinctly remembered them being in the 1000s, so I looked through the footnotes and found the cause: the test was conducted in FP32 mode. The Tesla V100 contains Tensor cores and significantly more die space (the GCN architecture is hard-limited to 4,096 stream processors), and those cores can be used to accelerate inference and training performance by multiple factors.
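For anyone curious what that distinction looks like in practice, here is a rough PyTorch sketch (my own illustration, not the benchmark AMD or Wccftech ran; it assumes a CUDA-capable GPU with PyTorch and torchvision installed). The only difference between the two timings is the data type: in pure FP32 a V100 is stuck on its regular CUDA cores, while the FP16 path is what cuDNN/cuBLAS can hand off to the Tensor cores.

```python
# Illustrative only: time ResNet-50 inference in FP32 vs FP16 on the same GPU.
# On a Volta-class card the FP16 path is eligible for Tensor core kernels,
# which is the mode AMD's footnoted comparison left out.
import torch
import torchvision

def ms_per_batch(net, x, iters=50):
    """Average milliseconds per forward pass, measured with CUDA events."""
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    torch.cuda.synchronize()
    start.record()
    with torch.no_grad():
        for _ in range(iters):
            net(x)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

model = torchvision.models.resnet50().eval().cuda()
batch = torch.randn(64, 3, 224, 224, device="cuda")

fp32 = ms_per_batch(model, batch)                # FP32: regular CUDA cores
fp16 = ms_per_batch(model.half(), batch.half())  # FP16: Tensor-core eligible
print(f"FP32: {fp32:.1f} ms/batch  FP16: {fp16:.1f} ms/batch")
```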
 
I don't get why companies do this shit. Get over it, your products are subpar and lying about performance ALWAYS comes out (like learn your fucking lesson) and makes you look dumb.

Yeah, I dislike Nvidia as much as the next guy but facts are facts...their shit is better. There's a reason I'm rocking a 1080Ti and not something from AMD.
 
Not possible. Everyone knows only NV would do something like that. All the AMD guys told me so.

Oh stop it. No one has ever claimed one company is perfect, and let's not act for one fucking second like Nvidia hasn't clearly and obviously done WAYYY more anti-competitive, closed-source shenanigans than AMD. AMD lying about something like this hardly nullifies all of Nvidia's bullshit.
 
As an AMD fanboy (these days anyway) I am sorry to see gray-area behavior from them. On the surface what they did looks just like recent Intel shenanigans: bury key settings in a footnote with no justifying discussion. From the Wccftech article there is more to the story, but you'd have to know AI hardware to be convinced by AMD's response.

Unfortunately, while at Wccftech I slipped into the comments section so now I need a shower and a tetanus shot. Gawd.
 
“Regarding the comparison – our footnotes for that slide clearly noted the modes so no issues there. The rationale is that FP32 training is used in most cases for FaceID to have 99.99%+ accuracy, for example in banking and other instances that require high levels of accuracy.” – AMD

Given this is what is used in almost all cases atm, I can see why AMD would display it in this manner.

Much like Nvidia wanting us to give the 2000 series a boost for RTX and the new anti-aliasing methods that aren't in major use.

I don't consider this nearly as shady; it seems a bit overblown.
 
This is pretty shady, did not expect this from AMD. This is just as bad as the Intel 9900K review that disabled half the cores on Ryzen chips.
Wut? This isn't the first time AMD has pulled shit like this. AMD is not some righteous charity company. If roles were reversed with competing companies, they would no doubt pull the same bullshit Intel and Nvidia pull. The fact that they're even trying to pull this BS when they're lagging behind says a lot about their integrity.
 
Wut? This isn't the first time AMD has pulled shit like this. AMD is not some righteous charity company. If roles were reversed with competing companies, they would no doubt pull the same bullshit Intel and Nvidia pull.
Let's not pretend like any company is on the same level as Nvidia for anti-consumerism....
Nvidia puts even Intel to shame on that front.
Intel had dominant processors for the better part of a decade but did not price gouge like Nvidia.
 
Let's not pretend like any company is on the same level as Nvidia for anti-consumerism....
Nvidia puts even Intel to shame on that front.
Intel had dominant processors for the better part of a decade but did not price gouge like Nvidia.
You think AMD wouldn't price gouge if they were in Nvidia's shoes? A lot of people believe Intel price gouged over the past decade as well.
 
Let's not pretend like any company is on the same level as Nvidia for anti-consumerism....
Nvidia puts even Intel to shame on that front.
Intel had dominant processors for the better part of a decade but did not price gouge like Nvidia.
Your memory might need to be looked into... I'll just leave you with one name: "Extreme Edition".
 
Just in case anyone got the notion that AMD disabled something on the competing part, they did not.
Whether FP16 operations are accurate enough for any serious kind of face recognition software is completely beyond me, though. It doesn't seem that accuracy is what that part of the tech industry tends to put first.
 
Intel had dominant processors for the better part of a decade but did not price gouge like Nvidia.

Really? Intel's top-of-the-line CPUs were all over $1k, and that's just for a CPU. Whereas Nvidia prices a release of cards at over $1,200 for the whole card, power management, all of it, and Intel gets a pass for not being worse than Nvidia? Take the blinders off a bit and reread what you are saying. I think you will come to the same conclusion.
 
It's like they were trying to make an apples-to-apples comparison or something... as indicated.

That's like saying "hey, the only reason the Tesla Model 3 won against the monster truck is because they were on flat ground!" ... come on.
 
That Vega is supposed to have neural-network instructions that do FP32 accumulate with FP16 operands, so theoretically 2x better than with FP32.
That means Nvidia would still be about 1.5x faster.
I can see why they preferred just talking about FP32.
But now everyone thinks that Nvidia is 3x faster, while it probably isn't...
Maybe doing that comparison wasn't a smart move.
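Putting that reasoning into numbers (these are just the figures from the post and the article, not anything measured), a quick sanity check:

```python
# Hypothetical back-of-the-envelope math using the thread's own numbers.
v100_tensor_vs_mi60_fp32 = 3.0   # "just over three times", per the article
mi60_packed_fp16_speedup = 2.0   # theoretical gain from FP16 operands with FP32 accumulate
remaining_gap = v100_tensor_vs_mi60_fp32 / mi60_packed_fp16_speedup
print(remaining_gap)  # 1.5 -> the V100 would still be ~1.5x faster, not 3x
```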
 
Really? Intel's top-of-the-line CPUs were all over $1k, and that's just for a CPU. Whereas Nvidia prices a release of cards at over $1,200 for the whole card, power management, all of it, and Intel gets a pass for not being worse than Nvidia? Take the blinders off a bit and reread what you are saying. I think you will come to the same conclusion.

AMD did it too with their top-end parts, back when they had such things, and they'd be fools not to do it again should they find themselves in a similar position.
 
Just in case anyone got the notion that AMD disabled something on the competing part, they did not.
Whether FP16 operations are accurate enough for any serious kind of face recognition software is completely beyond me, though. It doesn't seem that accuracy is what that part of the tech industry tends to put first.

FP16 is half-precision. For a lot of models, this will be sufficient. For more serious models, single precision (FP32) or even double precision (FP64) could be required. It's essentially a question of at what point the data model starts to squash the information too much to infer anything of value, versus the advantage of using less memory and computing somewhat faster. For simple (mostly binary-choice) operations, FP16 is fine. For something like FaceRec, it'd probably need the extra precision.
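Here's a toy numpy sketch of that squashing (my own example, nothing to do with AMD's benchmark): FP16 carries only an 11-bit significand, so once a running total gets large enough, small contributions stop registering at all, which is exactly why hardware like the Tensor cores multiplies in FP16 but accumulates into FP32.

```python
# Toy example: accumulate 100,000 small values in FP16 vs FP32.
import numpy as np

values = np.full(100_000, 0.01, dtype=np.float16)  # true sum is ~1000

fp16_total = np.float16(0.0)
for v in values:
    fp16_total = np.float16(fp16_total + v)  # result rounds back to FP16 every step

fp32_total = values.astype(np.float32).sum()  # accumulate in FP32 instead

# The FP16 running total stalls around 32, because 32 + 0.01 rounds back to 32
# at FP16 precision; the FP32 accumulation lands near the true ~1000.
print(fp16_total, fp32_total)
```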

So AMD isn't necessarily being dishonest in that FP32 is likely the normal use case for this application. I'd guess they chose the application (Facial Recognition) that gave them the best parity with Nvidia's hardware (they probably tested a bunch).

That's why the benchmark doesn't have the performance across a ton of ML applications. They cherry picked.

Also, here's a decent write-up by an Uber engineer: https://www.quora.com/What-is-the-difference-between-FP16-and-FP32-when-doing-deep-learning
 
FP32 is the most common format used for machine learning tasks; in situations where you are running against strict datasets, like facial recognition, fingerprint analysis, or data mining, FP32 is fine. This is where the money is being made, and having a competitive product here is a big win for AMD, because currently their adventures into this product space are laughable.

FP64 is best used for large data sets that need to be analyzed, where new content is generated and every ounce of precision is required. This space is dominated by “content creators,” render farms, and supercomputers: things that are doing serious math day in and day out. It's a much smaller market with much larger margins but a very small set of buyers; it would be a tough sell for AMD’s cards here even if they were 10 or 15% faster, because they would be an unknown or would require retooling to incorporate into existing projects.
 
All the more reason I love the detailed history Kyle published last month. It clearly showed how there's no side of the fence this stuff doesn't happen on. End result, no surprise here.
 
Really? Intel's top-of-the-line CPUs were all over $1k, and that's just for a CPU. Whereas Nvidia prices a release of cards at over $1,200 for the whole card, power management, all of it, and Intel gets a pass for not being worse than Nvidia? Take the blinders off a bit and reread what you are saying. I think you will come to the same conclusion.
[attached image]

Back at the end of 2006 I was choosing between this and a QX6700. The latter won out because moar coars. It was also slightly cheaper at $950 instead of $1,080.
 
As much as I hate shady shit from either company (and yeah, I like AMD), wouldn't this be valid benchmarking for FP32 calculations then? As Spidey329 said.
Most of what I'm working with is FP16. Vega is ~twice as fast as a 1080Ti for this, so it's great value for that use. Obviously with Tensor cores that's now changed for FP16 in the RTX series.
FP32 is a totally different ballgame though, so if it is strictly an FP32 test, then these results are fine?
The 3x performance lead is only in FP16 loads, not FP32.
It's like saying the 9900K is twice as fast as the 2700K, but only in the two AVX-512 programs available lol.

Dodgy, and I do not support it, but it's no worse than what competitors are doing.
 
How dare they compare FP32 against FP32 and put a note saying as much!? :rolleyes:

Comparing mixed precision when you can only use FP32 or higher doesn't make sense. The people who use these for work would like to know strictly FP32 performance; the ones who use mixed precision likely already know Nvidia is kicking ass.
 
While I haven't read the article yet, from what I remember from around a year ago, the Tensor cores needed software designed to utilize them properly? For example, software written for CUDA in general wouldn't use them unless updated/reprogrammed.

That could explain things?
 