Sony Working on AMD Ryzen LLVM Compiler Improvements, Possibly for PlayStation 5

I was guesstimating; you might be right. But looking at the Wikipedia page for Jaguar, based on clock speed (which is not a great comparison, but I can't find apples-to-apples GFLOPS numbers), the Liverpool Jaguar (in the PS4) is 800MHz. So roughly 5x just on clock speed. I would like to see a GFLOPS rating for both, though. And this is just the CPU, not talking about the GPU... current Vega has to be 3-5x better than the PS4.

https://en.wikipedia.org/wiki/Jaguar_(microarchitecture)

I am also talking about the CPU. Zen IPC is ~2.5x greater than Piledriver's.



And Piledriver IPC is roughly equivalent to Jaguar IPC (±10%).

800MHz is the GPU clock in the first PS4. The Jaguar cores in the latest consoles (e.g. the PS4 Pro) run at >2GHz. Assuming Zen cores clocked around 4GHz (which is an over-optimistic assumption), the performance gap is roughly

2.5 * 4/2 = 5

This is the ~5x factor I mentioned in my earlier post between a console with 8 Jaguar cores and one with 8 Zen cores.
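Written out as a quick script, using nothing but the rough ratios from this thread (illustrative only, not measured data):

```python
# Back-of-the-envelope estimate from the ratios quoted above (not measurements).
zen_ipc_vs_jaguar = 2.5   # ~2.5x per-clock advantage claimed above
jaguar_clock_ghz = 2.0    # roughly the PS4 Pro Jaguar clock
zen_clock_ghz = 4.0       # optimistic Zen clock in a console

speedup = zen_ipc_vs_jaguar * (zen_clock_ghz / jaguar_clock_ghz)
print(f"Estimated per-core speedup: ~{speedup:.0f}x")  # ~5x
```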
 
So was Pascal, but I don't see anyone else complaining.
That's a good point actually. I guess my big issue is that AMD tends to disappoint. So I am keeping my expectations low intentionally. But you could well be right. Thanks for that.
 
Even if they had kept the same GPU in the PS4 Pro (roughly equivalent to an RX 480, from the examples I've seen) and just upgraded the CPU, the PS4 Pro would be capable of AAA games at native 4K 60fps.
The Jaguar CPU is what is holding the current generation back; a 2.3GHz clock is just not enough for that architecture to keep the GPU properly fed with data - maybe if it were clocked at 4.5GHz or faster, but those are really weak x86_64 cores.

I'm not knocking the Jaguar architecture at all; when it was developed back in 2013, it was designed for the low-power systems, thin clients, and embedded devices of that era - hardly a high-performance contender even then, let alone in 2018 or the 2020s!
I would not be surprised at all to see Sony and/or Microsoft move to ARM processors at this point, just like Nintendo did, regardless of whether they are streaming or dedicated systems.

Heck, even an ARMv8 A57 CPU core is around 1.5 times faster clock-for-clock than an x86_64 Jaguar CPU core, and it requires far less electrical power and produces much less heat; it might also be cheaper by avoiding x86_64 licensing, but that is pure speculation on my part.

Both Sony and Microsoft wanted ARM. x86-64 was chosen only because 64-bit ARM wasn't ready at the time. However, I think it is simpler for them now to continue with AMD and x86-64 instead of switching to ARM. Long term, consoles will not be x86.

Cortex cores are also smaller than Jaguar cores.
 
The CPU that'll end up in the PS5 would likely be clocked much lower than your traditional 8-core Ryzen. I wouldn't be shocked if it ran at 1.8GHz or thereabouts. HBM makes sense, even though it's expensive. Console makers like to avoid price fluctuations and would rather get a contract that guarantees them a fixed cost for HBM production. We're also years away from when a PS5 is released, so who knows what'll happen in the memory market by then.

1.8GHz looks too low. I would expect clocks in the 2-3GHz range.
 
Not really; Nvidia does not do semi-custom. If the next generation of consoles wants something other than what's already on the market, they can't commission it from Nvidia, and even then, if they want something more powerful than an ARM chip, they would need a separate chip commissioned for the CPU, as Nvidia only does ARM.

In about the same die area as one Zen core, Nvidia could put four Cortex cores. Four A72 cores @ 2.5GHz are much faster than one Zen core @ 2.5GHz.
 
Both Sony and Microsoft wanted ARM. x86-64 was chosen only because 64-bit ARM wasn't ready at the time. However, I think it is simpler for them now to continue with AMD and x86-64 instead of switching to ARM. Long term, consoles will not be x86.

Cortex cores are also smaller than Jaguar cores.
That makes total sense, and I remember the then high-end ARM CPUs back in 2013 being the A17/A15, which really were not that great overall for price-performance compared to the Jaguar-based x86-64 processors, though they were somewhat close in general-processing power.
You are right, though, the fact that they were not 64-bit was a total deal breaker back then, as 4GB of addressable RAM+hardware was not nearly enough... maybe back in 2010, but not in 2013.

As for the existing ARM processors, if I remember correctly (did the math on this a while back), I believe ARMv8 A57 cores are about 1.2 times faster clock-for-clock compared to Jaguar-based x86-64 cores.
It's kind of funny that, clock-for-clock, the Nintendo Switch's (four) A57 cores are technically faster than the Jaguar cores in the PS4/XBone, but because they are only clocked at 1GHz, Jaguar ends up winning due to its much higher clock speed. :)

That is impressive about the A72 cores having that amount of processing power - I will have to read up a bit more on those, thanks for the info!
 
Keep in mind that if we do go the route of ray tracing, then wouldn't AMD's GPUs, with their higher compute power, do better than Nvidia's?

Current architectures?

Sure!

But it'd be too slow. If ray-tracing is the way forward, new, faster hardware will need to be developed, and given AMD's development speed, I'd bet on Nvidia if I had to place one. Hell, given the relative simplicity of ray-tracing, I'd bet on Intel.
 
Current architectures?

Sure!

But it'd be too slow. If ray-tracing is the way forward, new, faster hardware will need to be developed, and given AMD's development speed, I'd bet on Nvidia if I had to place one. Hell, given the relative simplicity of ray-tracing, I'd bet on Intel.
Yes, Intel might actually be able to do it better. They demoed it on Larrabee many years ago. Remember Larrabee? Hopefully Intel can get it right this time. And God help AMD if they do.
 
800MHz is the GPU clock in the first PS4. The Jaguar cores in the latest consoles (e.g. the PS4 Pro) run at >2GHz. Assuming Zen cores clocked around 4GHz (which is an over-optimistic assumption), the performance gap is roughly

2.5 * 4/2 = 5

This is the ~5x factor I mentioned in my earlier post between a console with 8 Jaguar cores and one with 8 Zen cores.
Ah, now you are right about that as well.
Clock-for-clock, Zen is about 2.66 times faster than Jaguar, and if the clock speed were also doubled (2.0GHz to 4.0GHz) then that would most certainly result in a 5+ times boost in overall performance.

Again, the saddest part about the newest iteration of the current-generation consoles is the weak CPUs, which are heavily holding back both the GPU (the Destiny 2 devs stated this) and game logic (Fallout 4 has this issue), at least at 4K and, for some games, even at 1080p.
 
So far nobody has made an ARM CPU that even comes close to what Intel or AMD has with x86.

If you say so

[Cavium ThunderX2 SPECint rate chart (peak, GCC 7)]


This ThunderX2 chip is being used in a supercomputer built by Cray. Supposedly HPC benchmarks will be released at the Cray User Group conference this week.
 
Yes, Intel might actually be able to do it better. They demoed it on Larrabee many years ago. Remember Larrabee? Hopefully Intel can get it right this time. And God help AMD if they do.
Yes I remember Larrabee!
I was the person who was screwed out of getting one due to the seller gtechccs selling it to Linus Tech Tips after I had already paid for it! :confused:

Back on topic, yes, and I remember it was quite the GPGPU stepping stone, allowing for around 1 TFLOP of FP32 general-purpose processing power circa 2009, which was quite the achievement.
AMD had 1 TFLOP of FP32 processing power on the HD4850/HD4870 back in 2008, but next to no GPGPU applications existed at that time to seriously take advantage of it, save for brute-force crypto-lock breaking software and perhaps a few folding applications that were far from well optimized - AMD has come a long way since then, and so has Intel, at least in general compute.
 
This ThunderX2 chip is being used in a supercomputer built by Cray. Supposedly HPC benchmarks will be released at the Cray User Group conference this week.
As backed up by plenty of info in this thread as well: https://hardforum.com/threads/arm-server-status-update-reality-check.1893942/page-2

ARM is becoming a serious contender to x86-64 in nearly all areas except the desktop, and even that has made quite a few strides in the last few years - Apple, whether you like them or not, will be the ones to officially move this forward with their own in-house ARM designs.
I do have to say that their latest 6-core ARM CPU (3 big, 3 little), the Apple A10X, is clock-for-clock 40% faster than Jaguar x86-64 cores, and the A10X is considered a mobile processor.

I don't see why ARM would be so unreasonable, outside of development or licensing costs, as the main CPU in the next consoles - NVIDIA GPUs are another question, though, especially given their poor history with both Microsoft and Sony.
 
If you say so

[Cavium ThunderX2 SPECint rate chart (peak, GCC 7)]

This ThunderX2 chip is being used in a supercomputer built by Cray. Supposedly HPC benchmarks will be released at the Cray User Group conference this week.
Yea but they achieve this with many cores on two CPUs. Now use two Threadripper chips and see what happens. This is kinda what Intel did with Larrabee and that kinda failed. Kinda cause Larrabee turned into a CPU.

Also single threaded vs multithreaded. That's really good for servers but we need good single threaded performance for the desktop.

1.8GHz looks too low. I would expect clocks in the 2-3GHz range.
If it's an APU with a heavy duty GPU, the heat production would be high and consoles aren't known for amazing cooling. So I'd expect it to be clocked really low.

Current architectures?

Sure!

But it'd be too slow. If ray-tracing is the way forward, new, faster hardware will need to be developed, and given AMD's development speed, I'd bet on Nvidia if I had to place one. Hell, given the relative simplicity of ray-tracing, I'd bet on Intel.
Ray tracing has been done on older DX11 GPUs, and hybrid ray tracing has been around for a while. You can download a demo from 6 years ago and run it on your PC. I expect Nvidia to have a proprietary API, because Nvidia. You obviously don't need one, but this way it sells far more graphics cards, and you can't blame Nvidia for lacking compute power in their older GPUs because of the proprietary API.

Go ahead, download the demo, it looks amazing. I wouldn't be surprised if AMD put ray tracing on the CPU considering how little the CPU is used in games, and it's not like AMD doesn't benefit from people needing faster CPUs. Traditionally CPUs have done ray tracing. It would be even more amazing if AMD mixed GPU compute in with CPUs. If I recall correctly, that's something OpenCL can do, and maybe now Vulkan as well.
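To illustrate how plain the underlying math is, here is a toy, CPU-only ray/sphere test in Python (nothing to do with the demo above, and nowhere near a real renderer; just the kind of arithmetic that could in principle be handed to CPU cores or GPU compute queues alike):

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance to the nearest hit, or None if the ray misses the sphere."""
    # Solve |origin + t*direction - center|^2 = radius^2 for t (a quadratic in t).
    oc = tuple(o - c for o, c in zip(origin, center))
    a = sum(d * d for d in direction)
    b = 2.0 * sum(o * d for o, d in zip(oc, direction))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4 * a * c
    if disc < 0:
        return None
    t = (-b - math.sqrt(disc)) / (2 * a)
    return t if t > 0 else None

# Shoot one primary ray per pixel of a tiny 8x8 "image" at a single sphere.
width = height = 8
sphere_center, sphere_radius = (0.0, 0.0, -3.0), 1.0
for y in range(height):
    row = ""
    for x in range(width):
        # Map the pixel to a ray direction on a simple pinhole camera.
        u = (x + 0.5) / width * 2 - 1
        v = (y + 0.5) / height * 2 - 1
        hit = ray_sphere_hit((0.0, 0.0, 0.0), (u, v, -1.0), sphere_center, sphere_radius)
        row += "#" if hit else "."
    print(row)
```

Every pixel is independent, which is exactly why this kind of work can be split across however many CPU threads (or GPU work items) you have lying around.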



I am also talking about the CPU. Zen IPC is ~2.5x greater than Piledriver's.


And Piledriver IPC is roughly equivalent to Jaguar IPC (±10%).

Not really. Here you can see the Athlon 5350 is only about half as fast as the AMD A8 3850. The Athlon 5350 is a Jaguar chip while the A8 3850 is a Llano chip. And the FX-8350 is 20% faster in IPC than that chip, and it's based on the same cores as Trinity. So yeah, the Jaguar chips had roughly half the IPC of Trinity, and Trinity is 50% slower than Ryzen.

[Anandtech benchmark chart]

[Cinebench chart]
 
Ray tracing has been done on older DX11 GPUs, and hybrid ray tracing has been around for a while. You can download a demo from 6 years ago and run it on your PC. I expect Nvidia to have a proprietary API, because Nvidia. You obviously don't need one, but this way it sells far more graphics cards, and you can't blame Nvidia for lacking compute power in their older GPUs because of the proprietary API.

Go ahead, download the demo, it looks amazing. I wouldn't be surprised if AMD put ray tracing on the CPU considering how little the CPU is used in games, and it's not like AMD doesn't benefit from people needing faster CPUs. Traditionally CPUs have done ray tracing. It would be even more amazing if AMD mixed GPU compute in with CPUs. If I recall correctly, that's something OpenCL can do, and maybe now Vulkan as well.


I... what?

No, CPUs will not be fast enough. Adding them to the render loop will likely slow things down due to syncing issues (multi-GPU, but worse), and no current GPU is fast enough to render modern graphics with ray tracing.

It will take new hardware.
 
Not really. Here you can see the Athlon 5350 is only about half as fast as the AMD A8 3850. The Athlon 5350 is a Jaguar chip while the A8 3850 is a Llano chip. And the FX-8350 is 20% faster in IPC than that chip, and it's based on the same cores as Trinity. So yeah, the Jaguar chips had roughly half the IPC of Trinity, and Trinity is 50% slower than Ryzen.
I know this thread is getting big, but I have already done the general-performance clock-for-clock comparison directly between Ryzen and Jaguar (see post #30), and Ryzen is roughly 2.66 times faster than Jaguar.
Not saying you are wrong or anything, just thought you might find it an interesting read. :)
 
I am also talking about the CPU. Zen IPC is ~2.5x greater than Piledriver's.


And Piledriver IPC is roughly equivalent to Jaguar IPC (±10%).

800MHz is the GPU clock in the first PS4. The Jaguar cores in the latest consoles (e.g. the PS4 Pro) run at >2GHz. Assuming Zen cores clocked around 4GHz (which is an over-optimistic assumption), the performance gap is roughly

2.5 * 4/2 = 5

This is the ~5x factor I mentioned in my earlier post between a console with 8 Jaguar cores and one with 8 Zen cores.
That is NOT "IPC". :rolleyes:

The goal was to achieve a 40% IPC improvement, and they ended up with a hair over 50%.


EDIT: Yep, it took only a few seconds to find that: https://www.anandtech.com/show/1117...-7-review-a-deep-dive-on-1800x-1700x-and-1700


"Then it was realised that AMD were suggesting a +40% gain compared to the best version of Bulldozer, which raised even more question marks. At the formal launch last week, AMD stated that the end goal was achieved with +52% in industry standard benchmarks such as SPEC from Piledriver cores"

[AMD Ryzen Tech Day architecture keynote slide]
 
I... what?

No, CPUs will not be fast enough. Adding them to the render loop will likely slow things down due to syncing issues (multi-GPU, but worse), and no current GPU is fast enough to render modern graphics with ray tracing.

It will take new hardware.
I just showed you a hybrid ray-tracing demo from 6 years ago that you could run back then. Hardware from 6 years ago could do ray tracing, so why would it be impossible today? There have been modified Quake and Doom engines that actually used ray tracing.

Intel has shown you can do it in real time using just the CPU. You obviously wouldn't ray trace the entire game, because the hardware can't do that, but since ray tracing is just math, you could feed it to both the GPU and the CPU, assuming the ray tracing doesn't care which is faster. Nvidia wouldn't want assistance from the CPU, whereas AMD would.



 
Well first of all, Vega was delayed for well over a year. So even if AMD is saying early 2018 for Navi, I would say early 2019 is a safer and more reasonable estimate. So when is the next architecture after Navi coming out? 2020? Does it even have a name? And should we bank on a 2021 release due to delays? I really don't think Sony can wait that long. Perhaps they can. But really waiting on a new from the ground up architecture from AMD when their lead chip designer just left them seems extremely risky to me. Perhaps it would pay off. But perhaps not. You are right though. They are currently in a very good position.
You're hilarious in your tech expertise.
None of your above quotes matters. Sony/MS do not buy off-the-shelf GPUs. AMD designs what its contractors desire. You keep grasping at the retail PC world and equating it to future console constructs.
That is idiotic. None, I will repeat NONE, of the x86 consoles have ever used off-the-shelf, stock GPUs!
Everything is custom. x86 offers so much freedom and (KEY) compatibility. Intel is years out, and Nvidia has no license.
I hope you have worked out some issues in your public insanity though.
Shame if you haven't.
 
You're hilarious in your tech expertise.
None of your above quotes matters. Sony/MS do not buy off-the-shelf GPUs. AMD designs what its contractors desire. You keep grasping at the retail PC world and equating it to future console constructs.
That is idiotic. None, I will repeat NONE, of the x86 consoles have ever used off-the-shelf, stock GPUs!
Everything is custom. x86 offers so much freedom and (KEY) compatibility. Intel is years out, and Nvidia has no license.
I hope you have worked out some issues in your public insanity though.
Shame if you haven't.
I think he was talking about the GPU technology in general and how those specific architectures could potentially be used in the future consoles, assuming AMD and/or x86-64 is used.
He also never stated (from what I read) that Sony or Microsoft use off the shelf GPUs, so I'm not sure where that comment is coming from; if you can quote him saying that, please show me since I must have missed it.

I do have to ask just for general knowledge of processors in consoles, what does x86-64 offer that ARM does not in terms of freedom and compatibility?
There isn't too much that is "custom" about the existing Jaguar processors in the consoles, other than AMD putting two quad-core CPU modules onto one die (similar in design to the Pentium D and Atom 330) and perhaps changing the MMU to work with GDDR5 and its higher latency timings; I would assume this is why AMD opted to call it a "semi-custom" design, which makes sense I suppose.

The GPUs are most definitely custom, though, I will give that one to you in full.
Also, no need to insult him, this is a very interesting discussion and it is very intriguing to see everyone's viewpoints - I should also add, that at the end of the day for what Sony is eventually going to use, this is all pure speculation, so there aren't really any "right answers" for any of it, other than existing part specifications themselves. :)
 
Ah, now you are right about that as well.
Clock-for-clock, Zen is about 2.66 times faster than Jaguar, and if the clock speed were also doubled (2.0GHz to 4.0GHz) then that would most certainly result in a 5+ times boost in overall performance.

Again, the saddest part about the newest iteration of the current-generation consoles is the weak CPUs, which are heavily holding back both the GPU (the Destiny 2 devs stated this) and game logic (Fallout 4 has this issue), at least at 4K and, for some games, even at 1080p.

This is my beef with your statement: it implies that you know things through the Destiny developers that happen to coincide with your claim. Which is convenient, because if you look at the hardware and then at your statement, you would assume it is true.
The hardware itself is not at fault; it is that the Destiny 2 developers can't get more out of it, which is a limit of either their engine or their willingness to change things around.
Even the Xbox One X proves that. That it is harder on the PS4 Pro does not mean the limit is set by the hardware, but rather by the approach to the hardware.
I find it odd that game developers these days would blame anything hardware-related for the performance of their own engine. In the past they had to deal with hardware where you needed to optimize nearly everything to squeeze out the smallest performance gain just to make your game not suck.

It has been nearly 5 years since the landscape changed from "more cores are nice to have" to "you need to make use of all the cores to drive the GPU". That some companies are too lazy to adapt, or don't have the staff that excels at it, does not have to mean what you are claiming.

What is more unsettling is that you claim both Sony and MS have no idea why they are creating the hardware that drives their platforms, and worse, that they failed at it two times.

What you are claiming makes no sense whatsoever.
 
This is my beef with your statement: it implies that you know things through the Destiny developers that happen to coincide with your claim. Which is convenient, because if you look at the hardware and then at your statement, you would assume it is true.
The hardware itself is not at fault; it is that the Destiny 2 developers can't get more out of it.
Well, that's the polar opposite of what the Destiny 2 developers stated:
http://gearnuke.com/destiny-2-project-lead-confirms-30-fps-due-cpu-limits-evaluating-4k-xbox-one-x/
(article is from June 15, 2017)




The consoles are CPU-bound, which limits the frames per second at every resolution, whether the game runs at 720p, 1080p, 2K, or 4K.
Mark Noseworthy himself states just that in the third link.


EDIT:
I might also add that the XBone X Jaguar CPU is clocked at 2.3GHz compared to the PS4 Pro Jaguar CPU clocked at 2.13GHz.
So when they say the XBone X CPU is bottlenecking the GPU and game logic, just imagine how much worse it will be on the PS4 Pro.

I really would enjoy your response to this. :sneaky:


EDIT #2 - October 2020:
In an effort to not necro this thread, I would like to add this quote from the Scorn developers, who are now developing specifically for the XBSX:
I ask what, specifically, Series X offers the team that current-gen consoles couldn't: "It's mostly evolutionary improvements that are going to make the biggest difference. The most important one is elimination of the CPU bottleneck that exists in the current-gen consoles and much faster loading of assets thanks to the SSD. It's all about responsiveness and not having to wait on things."
 
Well, that's the polar opposite of what the Destiny 2 developers stated:
http://gearnuke.com/destiny-2-project-lead-confirms-30-fps-due-cpu-limits-evaluating-4k-xbox-one-x/
(article is from June 15, 2017)




The consoles are CPU-bound, which limits the frames per second at every resolution, whether the game runs at 720p, 1080p, 2K, or 4K.
Mark Noseworthy himself states just that in the third link.


EDIT:
I might also add that the XBone X Jaguar CPU is clocked at 2.3GHz compared to the PS4 Pro Jaguar CPU clocked at 2.13GHz.
So when they say the XBone X CPU is bottlenecking the GPU and game logic, just imagine how much worse it will be on the PS4 Pro.

I really would enjoy your response to this. :sneaky:
You are just repeating yourself. The developer is talking about their engine, not about what the hardware can or cannot do. In the design of both consoles the CPU is just there to send data to the GPU; why else would you use a processor that is 5 years old? Certainly not because it is the fastest.


He says his engine can't do more fps; nowhere does he mention that they have done anything different. Remember Battlefield on the PS4, which did a respectable job.

You also did not explain why MS, or Sony for that matter, would use the same hardware if the software would not be able to hit the resolutions they wanted.

That has to be the most awkward thing: releasing hardware that cannot push the GPU, two times in a row.

On both counts your arguments make no sense: one developer struggles with his engine, and manufacturers willfully ignore a serious bottleneck in their hardware design.
 
I just showed you a hybrid ray-tracing demo from 6 years ago that you could run back then. Hardware from 6 years ago could do ray tracing, so why would it be impossible today? There have been modified Quake and Doom engines that actually used ray tracing.

Intel has shown you can do it in real time using just the CPU. You obviously wouldn't ray trace the entire game, because the hardware can't do that, but since ray tracing is just math, you could feed it to both the GPU and the CPU, assuming the ray tracing doesn't care which is faster.

And I ignored it, because we're talking about AAA games, not tech demos. Of course you can run ray tracing on CPUs (that's how it's always been done), and of course you can run it on a GPU, because it is just math, but in neither case is current hardware up to par. We don't just need faster hardware, we need hardware that is optimized for the purpose!

Nvidia wouldn't want assistance from the CPU, whereas AMD would.

Your religion is showing.

Further, the idea of relying on the CPU for ray-tracing is just plain silly. You'll be burning resources that could be used for AI, or for keeping response times down, etc.
 
You are just repeating yourself. The developer is talking about their engine, not about what the hardware can or cannot do. In the design of both consoles the CPU is just there to send data to the GPU; why else would you use a processor that is 5 years old? Certainly not because it is the fastest.


He says his engine can't do more fps; nowhere does he mention that they have done anything different. Remember Battlefield on the PS4, which did a respectable job.

You also did not explain why MS, or Sony for that matter, would use the same hardware if the software would not be able to hit the resolutions they wanted.

That has to be the most awkward thing: releasing hardware that cannot push the GPU, two times in a row.

On both counts your arguments make no sense: one developer struggles with his engine, and manufacturers willfully ignore a serious bottleneck in their hardware design.
I'm kind of shocked that an experienced technology reviewer like yourself doesn't understand what "CPU bound" means.
You might want to look that up; it will answer all of your questions.
 
And I ignored it, because we're talking about AAA games, not tech demos. Of course you can run ray tracing on CPUs (that's how it's always been done), and of course you can run it on a GPU, because it is just math, but in neither case is current hardware up to par. We don't just need faster hardware, we need hardware that is optimized for the purpose!
What exactly is optimized for ray tracing? Are we talking about fixed-function units built into GPUs that can accelerate ray tracing? Because that's been tried and doesn't give real-time ray tracing.
Your religion is showing.

Further, the idea of relying on the CPU for ray-tracing is just plain silly. You'll be burning resources that could be used for AI, or for keeping response times down, etc.
Exactly how much of an 8-core, 16-thread CPU do you think is used in today's games? Also, my religion is competition. I'd totally rock a blue Intel graphics card. And guess what, Intel would be for using the GPU+CPU to do ray tracing. Guess why?

 
What exactly is optimized for ray tracing? Are we talking about fixed-function units built into GPUs that can accelerate ray tracing? Because that's been tried and doesn't give real-time ray tracing.

What else would we be talking about? And how does 'tried' equate to future hardware? Are you implying that ray-tracing should only be done in software on CPUs and as GPU shaders?

Exactly how much of an 8-core, 16-thread CPU do you think is used in today's games? Also, my religion is competition. I'd totally rock a blue Intel graphics card. And guess what, Intel would be for using the GPU+CPU to do ray tracing. Guess why?

Today's games: most of the CPU. Tomorrow's games? Likely all of today's CPUs, and then some.
 
What else would we be talking about? And how does 'tried' equate to future hardware? Are you implying that ray-tracing should only be done in software on CPUs and as GPU shaders?
How else would you do ray tracing? PowerVR does have an RTU (Ray Tracing Unit) in their GPU design that does ray tracing, but I would think that today's GPUs have enough compute power to do it anyway. The PowerVR chip is meant for tablets.

Here we can see real time ray tracing using OpenCL and OpenGL. Again, wasn't OpenCL an API developed by Apple and other companies to use both the CPU and GPU?



Today's games: most of the CPU. Tomorrow's games? Likely all of today's CPUs, and then some.
Like what games specifically? Far Cry 5 on a Ryzen 1700 used 24% of the CPU.

 
You are just repeating yourself. The developer is talking about their engine, not about what the hardware can or cannot do. In the design of both consoles the CPU is just there to send data to the GPU; why else would you use a processor that is 5 years old? Certainly not because it is the fastest.
Good lord, the developer directly states in the tweets:
"All consoles will run at 30fps to deliver D2's AI counts, environment sizes, and # of players. They are all CPU-bound."
"1080 doesn't help. CPU bound so resolution independent."

I'm pretty sure when they are talking about the CPU, they are talking about hardware, and even say that it is CPU bound, just like I stated - the game engine also runs on, you guessed it, the CPU of all things, so go figure.
This is a very basic concept, how do you not get this???


He says his engine can't do more fps nowhere does he mention that they have done anything different remember Battlefield on the PS4 that did a respectable job.
It's because the game engine (logic, AI, etc.) is too much for the XBone X (and by extension PS4 Pro) Jaguar CPU to handle while keeping the GPU properly fed with data, so they had to limit the frame rate to 30fps at all resolutions.
Again, do you not know what "CPU bound" means?

GPU bound:
720p - 100fps
1080p - 60fps
4k - 30fps

As the resolution increases, the GPU no longer has the processing power to handle what is happening without lowering the frame rate.
In this case the GPU is the bottleneck, and not the CPU.


CPU bound:
720p - 30fps
1080p - 30fps
4k - 30fps

No matter the resolution and no matter how powerful the GPU is, the CPU can only feed the GPU so much data per frame, so the frame rate stays pinned at the CPU's limit.
This is not a hard concept, and one which was originally talked about on this forum by Cannondale06 back in the late 2000s, so it isn't like this is "new" or "esoteric" technical knowledge.
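To put some toy numbers on the difference: a frame is not done until both the CPU work (game logic, AI, draw-call submission) and the GPU work (rendering) for it are finished, so the slower of the two sets the frame rate. A minimal sketch with made-up costs:

```python
# Toy frame-time model; all millisecond costs are made up, purely illustrative.
def fps(cpu_ms, gpu_ms):
    # The frame rate is set by whichever side takes longer per frame.
    return 1000.0 / max(cpu_ms, gpu_ms)

resolutions = [("720p", 10.0), ("1080p", 16.7), ("4K", 33.3)]  # hypothetical GPU cost per frame

print("GPU bound (fast CPU, 8 ms of work per frame):")
for res, gpu_ms in resolutions:
    print(f"  {res}: {fps(8.0, gpu_ms):.0f} fps")    # 100 / 60 / 30 fps

print("CPU bound (weak CPU, 33.3 ms of work per frame):")
for res, gpu_ms in resolutions:
    print(f"  {res}: {fps(33.3, gpu_ms):.0f} fps")   # 30 fps at every resolution
```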

Try playing a modern AAA game with a GTX 1080 Ti on an LGA 775 Pentium 4 @ 3.0GHz and tell me how well the game runs.
You will be lucky to get 5-10fps at best because the CPU is far too weak to provide proper data to the GPU, regardless of what the resolution and/or game settings are - there are countless YouTube videos to back this up, so don't act like there is no evidence of these claims.

Here is one with a Pentium 4 paired with a GTX 970 playing games at 1080p, which the GTX 970 itself can easily handle:



You did not explain why would MS or Sony for that matter use the same hardware if the software would not be able to do resolutions they wanted.
I'm not sure what this has to do with anything, or even why I should have to explain this, but by using logic and deduction (and history), I will tell you.
Back in 2013, the Jaguar x86-64 CPU, along with AMD's available GPU technology (as a semi-custom APU), was the most cost-effective option available to Sony and Microsoft, so they went with AMD.

Back then, though, the Jaguar CPU was acceptable for the games of that era, and the consoles were more GPU bound at that point, especially at 1080p when trying to get above 30fps.
Do you not remember both Sony and Microsoft marketing the "cinematic experience" gimmick for the 30fps limit?

That has to be the most awkward thing when you release hardware that can not push the gpu 2 times in a row.
The original GPUs were barely mid-range back in 2013, and the Jaguar CPUs in each console were actually enough to keep them fed for 30fps, and in some cases 60fps, depending on the game and resolution.
With the console refreshes, though, Jaguar was long in the tooth and could not deliver the vastly more powerful GPUs enough data for 4K 60fps gameplay in many (not all) cases.

A new CPU would change the overall architecture of the APU, thus requiring a new semi-custom APU from AMD (extremely costly on both fronts), a new SDK would be needed, and backwards compatibility with the earlier consoles would be more difficult, even with both CPUs sharing the x86-64 ISA.
So, logically thinking, this would have financially been a bad move for both Sony and Microsoft, and thus they stayed with the Jaguar CPU in both consoles.

That's why the XBone X and PS4 Pro were each a console refresh, not a new generation of consoles.
I can't even believe I'm having to explain this to YOU of all people, and I am seriously shocked by your lack of knowledge on this! o_O

Sorry man, no disrespect intended, but I expected quite a bit more from you on such a subject, sad to say, especially with your harsh comments towards me. :unsure:

On both counts your arguments make no sense , one developer struggles with his engine and manufacturers that wilfully ignores serious bottleneck in their hardware design.
Well, yeah, my arguments do make sense and your statements have literally nothing to do with anything.
This isn't just one developer, jeeze, look at FromSoftware and their PS4 exclusive Bloodborne - this was a masterfully crafted game by developers who knew how to squeeze every ounce of processing power out of a console, and that game is still limited to 30fps due to the PS4/Slim/Pro being CPU bound.

Even with the PS4 Pro's Boost Mode enabled and the patched version of Bloodborne which supports said Boost Mode, the game only runs at a more consistent 30fps than before.
The PS4 Pro's GPU is more than capable of running a game like Bloodborne at 60fps, especially at 1080p or 720p, but because even the PS4 Pro is CPU bound with its Jaguar clocked at 2.13GHz, the developers couldn't get the game to run at a consistent 60fps, so they limited it to 30fps on the Pro as well.

Again, this is by FromSoftware, who are extremely good developers in their own right, and for a title which was exclusive to the PS4, no argument can be made for a "cross-platform lack of optimization" since there was no need.
Again, I would really like to hear your response to all of this, because so far, your silence is deafening.
 
Stop replying, please; you lack the simplest understanding of what I am saying.

Gaming engines are either CPU or GPU bound; the hardware, however, is not.

In a game engine you can choose whether you make the CPU do all the work or the GPU do all the work. If I send all my data to the GPU, then the question becomes how much data I can send before the GPU starts being the bottleneck.

If I did the reverse and made the CPU do all the work, then the GPU would hardly do anything. Just visit some of the YouTube channels reviewing game frame rates where they show the CPU and GPU utilization percentages on screen; you can easily identify this (yes, you can!).

You will see that this is the reason why you cannot claim the hardware is bottlenecked, because the software dictates how much load lands on either the CPU or the GPU.

Good lord, the developer directly states in the tweets:
"All consoles will run at 30fps to deliver D2's AI counts, environment sizes, and # of players. They are all CPU-bound."
"1080 doesn't help. CPU bound so resolution independent."

I'm pretty sure when they are talking about the CPU, they are talking about hardware, and even say that it is CPU bound, just like I stated - the game engine also runs on, you guessed it, the CPU of all things, so go figure.
This is a very basic concept, how do you not get this???

He is talking about his game engine, the software that drives the hardware. What is so hard to understand?
Look, I'll even prove it to you: 60fps at 1080p on older hardware.



So explain it to me again: what is CPU bound?
 
Yea but they achieve this with many cores on two CPUs. Now use two Threadripper chips and see what happens. This is kinda what Intel did with Larrabee and that kinda failed. Kinda cause Larrabee turned into a CPU.

Also single threaded vs multithreaded. That's really good for servers but we need good single threaded performance for the desktop.

The benchmark I gave you is for ONE CPU: 32-core ARM vs 32-core AMD vs 28-core Intel.

No problem with single-thread.

Not really. Here you can see the Athlon 5350 is only about half as fast as the AMD A8 3850. The Athlon 5350 is a Jaguar chip while the A8 3850 is a Llano chip. And the FX-8350 is 20% faster in IPC than that chip, and it's based on the same cores as Trinity. So yeah, the Jaguar chips had roughly half the IPC of Trinity, and Trinity is 50% slower than Ryzen.

[Anandtech benchmark chart]

[Cinebench chart]

I was discussing average IPC, and I demonstrated my claim with several dozen benchmarks. Your Cinebench results will not change anything.

The FX-8350 is 25% faster than the A8 3850 in the second graph (1.11/0.89). So the FX-8350 would score about 4224 in the Anandtech graph, but the FX has 4.2GHz turbo, so

4224 / 4.2 = 1006

For the Athlon 5350

1967 / 2.05 = 959

So Jaguar has 95% of the IPC of Piledriver, which agrees with my point that "Piledriver IPC is roughly equivalent to Jaguar IPC (±10%)".

Moreover, this is for single thread. For multithread, you have to consider that the Piledriver chip has a module penalty of about 20%, whereas Jaguar doesn't, because Jaguar has full cores.

Finally, I was comparing Kabini to Richland, because both APUs lack L3. The Piledriver cores in the FX-8350 are a bit faster than the Piledriver cores in Richland, because the FX has L3. Adding L3 to Jaguar cores would also increase their performance.
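For anyone following along, the same arithmetic in script form (the scores and clocks are simply the figures quoted in this exchange, not fresh measurements):

```python
# Per-clock comparison using the scores quoted above.
fx8350_score, fx8350_clock_ghz = 4224, 4.2            # estimated Anandtech score, turbo clock
athlon5350_score, athlon5350_clock_ghz = 1967, 2.05   # quoted score, clock

piledriver_per_ghz = fx8350_score / fx8350_clock_ghz       # ~1006 points/GHz
jaguar_per_ghz = athlon5350_score / athlon5350_clock_ghz   # ~959 points/GHz

print(f"Jaguar per-clock is {jaguar_per_ghz / piledriver_per_ghz:.0%} of Piledriver's")  # ~95%
```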
 
That is NOT "IPC". :rolleyes:

The goal was to achieve a 40% IPC improvement, and they ended up with a hair over 50%.


EDIT: Yep, it took only a few seconds to find that: https://www.anandtech.com/show/1117...-7-review-a-deep-dive-on-1800x-1700x-and-1700


"Then it was realised that AMD were suggesting a +40% gain compared to the best version of Bulldozer, which raised even more question marks. At the formal launch last week, AMD stated that the end goal was achieved with +52% in industry standard benchmarks such as SPEC from Piledriver cores"

[AMD Ryzen Tech Day architecture keynote slide]

Per the basic equation of computer architecture

IPC = Performance / Frequency

If you use only single thread performance (as AMD does in that slide) then you get single-thread IPC.

If you use a more general definition of performance that includes both single thread and multithread performance then you get the average IPC.

This is a moot point about nomenclature and labels. We are discussing performance per clock. If you want to restrict the concept of IPC only to single thread, then you will have to use a more complicated expression such as

Performance = (IPC+SMT-CMT) * Frequency

to compare the different chips mentioned in the general case. The labels will change, but the microarchitectures and how good they are will not.
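As a made-up numerical illustration of the two labels (the figures are arbitrary; only the relationship between them matters):

```python
# The same hypothetical chip scored two ways; every number here is invented.
freq_ghz = 4.0
single_thread_perf = 8.0                   # benchmark points from one thread
st_ipc = single_thread_perf / freq_ghz     # single-thread "IPC": 2.0 points/GHz

cores, smt_uplift = 8, 1.25                # assume +25% throughput from SMT (hypothetical)
chip_perf = single_thread_perf * cores * smt_uplift
avg_per_clock = chip_perf / (freq_ghz * cores)   # aggregate per-core, per-clock: 2.5 points/GHz

print(st_ipc, avg_per_clock)   # 2.0 vs 2.5: same silicon, different label
```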
 
Further, the idea of relying on the CPU for ray-tracing is just plain silly. You'll be burning resources that could be used for AI, or for keeping response times down, etc.

Using a quad-core CPU for raytracing is silly. Using a manycore CPU is not.
 
That makes total sense, and I remember the then high-end ARM CPUs back in 2013 being the A17/A15, which really were not that great overall for price-performance compared to the Jaguar-based x86-64 processors, though they were somewhat close in general-processing power.
You are right, though, the fact that they were not 64-bit was a total deal breaker back then, as 4GB of addressable RAM+hardware was not nearly enough... maybe back in 2010, but not in 2013.

As for the existing ARM processors, if I remember correctly (did the math on this a while back), I believe ARMv8 A57 cores are about 1.2 times faster clock-for-clock compared to Jaguar-based x86-64 cores.
It's kind of funny that, clock-for-clock, the Nintendo Switch's (four) A57 cores are technically faster than the Jaguar cores in the PS4/XBone, but because they are only clocked at 1GHz, Jaguar ends up winning due to its much higher clock speed. :)

A57 Opteron vs Jaguar Opteron SPECint

https://www.anandtech.com/show/7724...arm-based-server-soc-64bit8core-opteron-a1100

That is impressive about the A72 cores having that amount of processing power - I will have to read up a bit more on those, thanks for the info!

You can find microarchitecture details and a rough IPC metric (Dhrystone MIPS per MHz) at the following link

https://en.wikipedia.org/wiki/Comparison_of_ARMv8-A_cores
 
Technically Sony could use ARM... realistically I doubt it makes real sense right now.

The switch from Cell in the PS3 to x86 was required... cause, well, Cell turned out to be a dead end. There was no new Cell chip to be had. x86 gave them the fastest in with developers. I am not sure they are looking to abandon x86 quite yet, as it would force a lot of pain on the software side. The industry is moving to an x86/ARM-agnostic world... but it's not quite there yet. Sony would be in for a lot of work, as would the developers themselves.

The only real option that could see them go ARM is Nvidia's Xavier SoC... however, Nvidia has no shortage of customers for those for AI stuff, and even though it runs 8 of their own ARMv8 cores and a 512-core Volta GPU... I am still not so sure a Ryzen/Vega APU isn't a lot faster for games, as things stand right now at least (unless Sony really was able to tweak the heck out of their compilers and APIs... that would require a lot of work with Nvidia directly, as the Carmel core in those chips is a fairly modified ARMv8 and no one really knows what's in that chip but NV). Also, Xavier has a ton of sensor DSP stuff I don't think Sony would care about... so NV would still be looking at spinning something custom. It's possible, but NV isn't known for doing custom stuff on the cheap.

At the end of the day, imo, if the choice is an AMD APU vs Nvidia Xavier... the AMD is likely to be far cheaper, cause Nvidia rolls like that. Performance is likely to be pretty much a wash. AMD's hardware will likely get cheaper down the road, and they will be more than happy to spin a cheaper version for Sony a few years in. Nvidia has a track record of not playing ball... Sony and MS have both been burned by NV on future chip costs; I think hell would have to freeze over before either went back to NV if they had other options.

Outside of NV... there just isn't any ARM manufacturer that could supply Sony. The high-performance guys like Marvell/Cavium are not going to sell their high-end server parts cheap. The mobile guys are not going to have the performance or GPU chops.

I know I have posted 100s of times about how ARM is going to take over everything at some point... and I believe ARM will. I'm just not sure there is anything in the pipe for the next 2-3 years that would fill the full-size console bill anyway. NV does have a chip that could perhaps manage... but Nvidia isn't going to sell it off cheap to Sony when they are selling them for crazy prices to the automotive industry, and even if they did, Sony wouldn't trust em. lol
 
Using a quad-core CPU for raytracing is silly. Using a manycore CPU is not.

I don't even think using a quad-core for ray tracing itself is silly; what's silly is trying to use a CPU at all for it, given that most of a CPU is dedicated to out-of-order processing that is totally unneeded for a pure compute job, if you have a choice.

I could see using excess CPU resources for ray-tracing, but this wouldn't be viable in a consumer AAA-game real-time environment. For desktop games, the developer would not be able to guarantee the available resources, and for consoles/mobile, those circuit and power resources would be better spent on dedicated hardware.
 
I don't even think using a quad-core for ray tracing itself is silly; what's silly is trying to use a CPU at all for it, given that most of a CPU is dedicated to out-of-order processing that is totally unneeded for a pure compute job, if you have a choice.

I could see using excess CPU resources for ray-tracing, but this wouldn't be viable in a consumer AAA-game real-time environment. For desktop games, the developer would not be able to guarantee the available resources, and for consoles/mobile, those circuit and power resources would be better spent on dedicated hardware.

There are some very big reasons why a CPU is actually the best option for ray tracing.

Raytracing requires a good bit of RAM. One of the main reasons a company like, say, Pixar renders their movies on CPUs to this day is that you simply can't load that amount of data into a GPU's RAM... and if it's not loaded into GPU RAM, a GPU actually isn't that fast. There are some edge-case cards like the Radeon Pro SSGs... which combine ECC memory, which is another big deal for production raytracing at least (not games per se), with a ton of on-board SSD space. I know some of the smaller 3D/VFX studios are starting to use those cards. Still, a farm of those for high-end rendering purposes would be a bit insane. They are great for actual real-time pre-render, editing-type work, as you can imagine.

Also, frankly, GPUs suck at doing ray tracing... it's why no one has really used it till now. No, despite the hype, Nvidia hasn't invented something new. Real-time ray tracing in APIs like OpenGL, and by extension Vulkan, has been around a long time now (years). NV and MS are simply trying, as usual, to lock developers down to their own hardware and API. Anyway, that's an aside... my point is GPUs are very good at doing lots and lots of highly coherent work: basically repeating the same simple math over and over, really fast. Ray tracing is NOT highly coherent work. The math involved is far less ordered and not really what GPUs were designed to crunch. A few companies have tried to build chips designed to accelerate specific bits of raytracing, such as Caustic Graphics... they developed hardware and GL extensions years back (like 10 years back), they were bought by Imagination Technologies, and the mobile raytracing Imagination was talking about a couple of years ago was the Caustic tech slotted into their ARM SoC. Nvidia and AMD haven't really been adding raytracing units specifically... we are just at a point where they have focused on compute performance enough that their cores can be massaged to do a much better job than previously. So both AMD and Nvidia have been showing off real-time partial ray tracing.

I'm not saying a GPU is useless for raytracing... but yeah, most current GPUs pretty much are. A fully programmable processor frankly is better suited to ray-tracing math. Not only is the CPU simply better at the less coherent, more random math... it doesn't tend to have the same issues with memory. When it comes to ray tracing in games it's going to be hybrid... mostly still raster graphics with some ray tracing thrown in for effect. In such cases I am sure using CPU cores for the rays, and system memory to store them, may well be the best way to go. I don't think developers are going to want to turn texture quality down a few notches, even on 16+GB cards, so they can do more ray tracing. Offloading the rays to system RAM may be ideal as long as the CPU has a few cores to spare.

PS... I think we are a long way (like 5-10 years) from seeing any game developers really add ray-tracing effects to major games, at least not in any "required" way. Any developer that does add them as a toggled effect... people are going to have to expect to see their FPS drop to single digits if they don't have top-end systems.
 
Alright, I'm not going to disagree there, and you do bring up a good point about memory usage, but it's still looking at current and past hardware.

Ray-tracing is still very simple math, which again means that CPUs are way, way overbuilt. Further, when talking about memory and cinematic-level renders, I think that the scale would likely be different for AAA-games and real-time renders. I don't know how different, no one does, but I'm betting that 'less' would be reasonable :).

So I expect that future GPU iterations will start including hardware specifically designed to adapt their architectures for ray tracing where they are currently deficient. I'd bet that the transition from pure fixed function to more flexible units will happen over time, like the DX7 -> DX8 -> DX9 -> DX10/11/12 transition from fixed-function to unified shaders did with raster graphics.

Further, I agree on the 'mix', where ray-tracing would likely be added to enhance certain effects in games as an introduction with greater utilization going forward.
 
Alright, I'm not going to disagree there, and you do bring up a good point about memory usage, but it's still looking at current and past hardware.

Ray-tracing is still very simple math, which again means that CPUs are way, way overbuilt. Further, when talking about memory and cinematic-level renders, I think that the scale would likely be different for AAA-games and real-time renders. I don't know how different, no one does, but I'm betting that 'less' would be reasonable :).

So I expect that future GPU iterations will start including hardware specifically designed to adapt their architectures for ray tracing where they are currently deficient. I'd bet that the transition from pure fixed function to more flexible units will happen over time, like the DX7 -> DX8 -> DX9 -> DX10/11/12 transition from fixed-function to unified shaders did with raster graphics.

Further, I agree on the 'mix', where ray-tracing would likely be added to enhance certain effects in games as an introduction with greater utilization going forward.

Well, I'm honestly not convinced CPUs are overbuilt for ray tracing. Over the years many companies have tried to build the "raytracing card"; the market is huge and it's a high-end, paying pro market. Lots of R&D money has been dumped in. A lot of them explained that the math wasn't that complicated until they tried to make hardware that did it fast and, I believe, realised the math wasn't really that simple. Having said that, of course you're right... if we are not worried about cinema-quality output and are OK with a percentage of errors, things get much easier. :)

The closest anyone seems to have really gotten was Caustic Graphics... Imagination bought them up, played with their tech, and that's about it. Neither they nor anyone else has been able to come up with the raytracing-ASIC drop-in cards that the movie/VFX industry would have gobbled up if they could.

I do agree with what you're saying though... the GPUs will get good enough. I'm not sure NV, AMD, or Intel (if they ever make a GPU) will really add specific ray-tracing units so much as they will make their compute units programmable enough to be much more useful. On the RAM front, we'll see if the fast GPU-type RAM prices don't fall... the SSG-type solution AMD is using could make sense on a smaller scale for gaming cards as well. Instead of the 2TB SSD on the pro card... I could see a much smaller, much less expensive SSD being one easy solution.
 
Designing large ASICs is hard. Producing them in a volume that is actually profitable?

I can see why no one has been able to do it yet competitively versus just stacking Xeons/Epycs. At the same time, AMD and Nvidia are both shipping some of the largest high-performance volume products on the market, and are likely to be the first that can really attack this market. And the VFX industry is likely to be a driver. Just as both companies have pushed their products into the industrial compute space, I expect them to develop ray-tracing acceleration both for consumer real-time uses and for industrial uses in Quadros/FirePros.
 