AMD possibly going to 4 threads per core

The fact is there is no pressing need for 4-way SMT any time soon except in the enterprise. When the need for it exists, then and only then will it become a part of consumer CPUs.

See below:

Actually, there is a case in point where SMT4+ is useful for consumers, especially once MS figures out how to assign busy threads to physical cores rather than to SMT/Hyper-Threaded virtual cores.

That is the lower-powered desktop systems or even laptops. Why put an octo-core processor in a laptop if you can just use a 4-core with SMT4? You get 16 threads to work with. All of your background tasks get assigned to those, and your gaming or more intensive productivity apps get distributed to your physical cores.
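The "background tasks on virtual cores, heavy apps on physical cores" idea is basically a CPU-affinity policy that the OS scheduler would apply for you. Just to illustrate the mechanism, here is a minimal sketch of pinning a process to one core's SMT siblings from user space, assuming Linux (os.sched_setaffinity and the sysfs topology files are Linux-specific):

```python
# Illustrative only: pin the current process to one core's SMT sibling threads,
# the way a scheduler might park "background" work while heavy apps keep whole
# physical cores. Assumes Linux; os.sched_setaffinity is unavailable on Windows.
import os

def smt_siblings(core=0):
    """Logical CPU IDs that share the given physical core (read from sysfs)."""
    path = f"/sys/devices/system/cpu/cpu{core}/topology/thread_siblings_list"
    with open(path) as f:
        raw = f.read().strip()            # e.g. "0,8" or "0-1", topology-dependent
    if "-" in raw:
        lo, hi = map(int, raw.split("-"))
        return list(range(lo, hi + 1))
    return [int(x) for x in raw.split(",")]

def pin_to_cpus(cpu_ids):
    """Restrict this process (all of its threads) to the given logical CPUs."""
    os.sched_setaffinity(0, set(cpu_ids))  # 0 = the calling process

if __name__ == "__main__":
    siblings = smt_siblings(0)
    print("core 0's logical CPUs:", siblings)
    pin_to_cpus(siblings)                  # park this process on one core's threads
    print("now restricted to:", sorted(os.sched_getaffinity(0)))
```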

Exactly. Much of the work done by most consumer systems is in no way 'heavy'. Even most games aren't, when you're looking at stuff like MOBAs and Overwatch and so on.

In reality, one could get away with dual-core CPUs again; you'd have eight threads, and hopefully most 'work' would be outsourced to the GPU cores if not handled by SIMD units like SSE and AVX, so those 'SMT cores' would mostly either be running through branching logic or shuttling commands for other subsystems around.
 
In reality, one could get away with dual-core CPUs again; you'd have eight threads, and hopefully most 'work' would be outsourced to the GPU cores if not handled by SIMD units like SSE and AVX, so those 'SMT cores' would mostly either be running through branching logic or shuttling commands for other subsystems around.
Exactly, and that is essentially what the current-gen consoles are doing right now.
Even though the 8-core AMD Jaguar CPU (assuming the 2.1GHz PS4 Pro CPU) has 8 physical cores (no SMT), all 8 of those Jaguar cores together are basically the equivalent of an Intel Haswell Core i3 dual-core @ ~3.0GHz in full SMP.

So essentially, if we were to take that i3 dual-core and give it 4-way SMT (so 2 physical cores and 8 logical cores/threads), it would perform identically to the 8-core Jaguar in terms of 8-threaded processing capabilities.
Like you said, if we were to have a modern dual-core with the equivalent processing power of 8 current-gen Intel/AMD cores, and include 4-way SMT, it would perform roughly the same; this is assuming the SMT is optimized and scales 100% without detriment to thread-switching.
 
So essentially, if we were to take that i3 dual-core and give it 4-way SMT (so 2 physical cores and 8 logical cores/threads), it would perform identically to the 8-core Jaguar in terms of 8-threaded processing capabilities.
Like you said, if we were to have a modern dual-core with the equivalent processing power of 8 current-gen Intel/AMD cores, and include 4-way SMT, it would perform roughly the same; this is assuming the SMT is optimized and scales 100% without detriment to thread-switching.

Except for needing to completely redesign the core to handle 4 potential threads at once... but yeah, I see your point...
 
1. Please link.
2. 'Background processes', while broadly applicable, is also pretty nebulous in definition, because it is different for every user and even different from instance to instance for most users.

And what you're really looking for is how the longest frametimes are affected, because that's what you 'feel'.

Yeah, I tried to post the link when I made the post but I can't find it; I don't remember if it was a full video (which should be easy to find) or a smaller test inside another GN or HU vid. Basically, the "background processes" they used in the test, IIRC, were the standard gaming-PC software that most gamers would have running.

Edit: Found it:

 
Even though the 8-core AMD Jaguar CPU (assuming the 2.1GHz PS4 Pro CPU) has 8 physical cores (no SMT), all 8 of those Jaguar cores together are basically the equivalent of an Intel Haswell Core i3 dual-core @ ~3.0GHz in full SMP.

So essentially, if we were to take that i3 dual-core and give it 4-way SMT (so 2 physical cores and 8 logical cores/threads), it would perform identically to the 8-core Jaguar in terms of 8-threaded processing capabilities...

Going to call BS on that one. Unless you helped design the Jaguar / Haswell CPUs, or have extensively programmed on both, that is a hell of a leap and no way to really know. You running Prime95 or SiSoft Sandra on your PS4?
 
Going to call BS on that one. Unless you helped design the Jaguar / Haswell CPUs, or have extensively programmed on both, that is a hell of a leap and no way to really know. You running Prime95 or SiSoft Sandra on your PS4?
I doubt they are even comparable to Haswell CPUs. Remember, PS4 came out in 2013. Haswell was released a couple months before the release of the PS4. Also, AMD CPUs were stinking it up back then.
 
Remember, PS4 came out in 2013.

It's not even that -- the cores used in the Jaguar CPU were designed for tablets using the Bulldozer architecture.

They are ass. They were ass when they were released. However, they're just less ass than what they replaced (ancient IBM Power stuff), they're full out-of-order x86-64 cores, and there are eight of them.
 
Going to call BS on that one. Unless you helped design the Jaguar / Haswell CPUs, or have extensively programmed on both, that is a hell of a leap and no way to really know. You running Prime95 or SiSoft Sandra on your PS4?
I've posted the results again and again throughout multiple threads on the PS4 and AMD Jaguar; here is a list of threads about all of it.
If you do the math from the results from PassMark on both CPUs, they are within a few percent of one another.

All CPUs, ISAs, and microarchitectures are comparable to one another; one just needs to know how to do it.
 
Going to call BS on that one. Unless you helped design the Jaguar / Haswell CPUs, or have extensively programmed on both, that is a hell of a leap and no way to really know. You running Prime95 or SiSoft Sandra on your PS4?
Agreed. Quite the leap and no real way of working out the math on that abstract scenario.
 
Agreed. Quite the leap and no real way of working out the math on that abstract scenario.
Here is an old quote from another thread where I did the math:

I wouldn't just compare the GFLOPS rating, as floating-point operations are only one part of the CPU (and GPU), and do not directly correlate with total CPU performance, as integer throughput and IPC both play a large factor.
This is one area where a synthetic benchmark can give a better rough guesstimate or ballpark of where CPUs and GPUs fit next to one another performance-wise in general.

The best comparison I can attempt to give between Jaguar and Ryzen is this:
https://www.cpubenchmark.net/compare/AMD-Ryzen-5-1500X-vs-AMD-GX-420CA-SOC/3001vs2121

Both the Ryzen 5 1500X and AMD GX-420CA are quad-cores, so we can do a nearly identical comparison of them, at least with IPC.
Please keep in mind that the synthetic benchmark scores in the link above don't mean anything by themselves; they are only a placeholder for where each CPU falls performance-wise relative to the other.

The Ryzen 5 1500X quad-core @ 3.5GHz scored 10685.
The AMD GX-420CA (Jaguar) quad-core @ 2.0GHz scored 2299.

Now let's do the math on this!


So, if we want an apples-to-apples GHz to GHz comparison, we want to bring the Ryzen 5 1500X down to 2.0GHz as well, along with the score itself:
2.0 ÷ 3.5 = ~0.571428571
(2.0GHz is roughly 57% of 3.5GHz; put another way, 3.5GHz is 75% faster than 2.0GHz)

Now, we want to take the score of the Ryzen 5 1500X and bring it down in the same manner:
~0.571428571 x 10685 = 6106 (rounding up from a long decimal)
(scaling the Ryzen CPU's score down to ~57% of its original, matching the equally reduced clock speed, to allow a direct comparison)

So, a Ryzen 5 1500X quad-core @ 2.0GHz would have a score of 6106.
How does this compare in IPC in general performance-wise to the AMD GX-420CA?

6106 ÷ 2299 = ~2.655937364

Thus, the general performance core-for-core and clock-for-clock of the Ryzen 5 1500X quad-core @ 2.0GHz would be roughly 2.66 times that of the AMD GX-420CA quad-core @ 2.0GHz.

If you take that knowledge and compare it to a ~3.0GHz Intel i3 Haswell, you can figure out, and see the results for yourself.
This isn't hard to do...
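If anyone wants to redo the arithmetic themselves, here is a minimal sketch of the same clock-normalization in Python, using only the PassMark numbers quoted above and assuming the score scales linearly with clock speed (a simplification; real scaling is rarely perfectly linear):

```python
# Rough clock-normalized comparison of two CPUs from aggregate benchmark scores.
# Assumes the score scales linearly with clock speed, which is a simplification.

def normalized_ratio(score_a, clock_a, score_b, clock_b):
    """How much faster CPU A is than CPU B, core-for-core and clock-for-clock."""
    score_a_at_b_clock = score_a * (clock_b / clock_a)   # rescale A's score to B's clock
    return score_a_at_b_clock / score_b

# PassMark numbers quoted above (both are quad-cores):
ryzen_score, ryzen_ghz = 10685, 3.5    # Ryzen 5 1500X
jaguar_score, jaguar_ghz = 2299, 2.0   # AMD GX-420CA (Jaguar)

ratio = normalized_ratio(ryzen_score, ryzen_ghz, jaguar_score, jaguar_ghz)
print(f"Ryzen 5 1500X vs GX-420CA, both at 2.0GHz: ~{ratio:.2f}x")   # ~2.66x
```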
 
no way to really know.

I doubt they are even comparable to Haswell CPUs.

no real way of working out the math on that abstract scenario.

Seriously?
It isn't hard to take the 2.0GHz quad-core Jaguar score, multiply it by 1.065 to get 2.13GHz (same clock as the PS4 Pro's CPU), and then multiply that score by 2 (converts the quad-core score to the 8-core score).

Come on guys, this is basic math. :meh:
You all are bitching and moaning about how "nothing works", yet are offering no solutions - that is incredibly telling of you all.
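For the record, carried through with the GX-420CA score quoted earlier, and assuming (as above) that the score scales linearly with clock speed and core count, the extrapolation looks like this:

```python
# Extrapolating the quad-core Jaguar score to the PS4 Pro's 8-core 2.13GHz CPU,
# under the same linear-scaling assumption described above.
gx_420ca_score = 2299                                  # quad-core Jaguar @ 2.0GHz (PassMark)
ps4_pro_estimate = gx_420ca_score * (2.13 / 2.0) * 2   # clock bump, then double the core count
print(round(ps4_pro_estimate))                         # ~4897
```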
 
Seriously?
It isn't hard to take the 2.0GHz quad-core Jaguar score, multiply it by 1.065 to get 2.13GHz (same clock as the PS4 Pro's CPU), and then multiply that score by 2 (converts the quad-core score to the 8-core score).

Come on guys, this is basic math.
Basic math is your expectation, but not reality. Clearly!
Dedicated low-level programming for one specific 8-core chip, combined with bare-metal GPU leverage, does NOT equal a broad estimation of x86 where cores, threads, and MHz are programmed for the largest CPU group ATM.
You're throwing away too much of the reality of console programming for specific hardware to fit your BASIC math extrapolations.
 
Basic math is your expectation, but not reality. Clearly!
Dedicated low-level programming for one specific 8-core chip, combined with bare-metal GPU leverage, does NOT equal a broad estimation of x86 where cores, threads, and MHz are programmed for the largest CPU group ATM.
You're throwing away too much of the reality of console programming for specific hardware to fit your BASIC math extrapolations.
Then by all means, please come up with something better - I'm all ears (eyes), and will give credit where credit is due.
So far, all I'm reading from you bunch is that "this is impossible", even though it so obviously is not, and has been answered in many threads prior to this.
 
You would have to be able to run a standardized CPU benchmark on both. Haven't ever heard of benchmarking apps being released for any consoles...
 
You would have to be able to run a standardized CPU benchmark on both. Haven't ever heard of benchmarking apps being released for any consoles...
See post 292 for the answer to that mystery.
Oh, but because this is a "console", and not just two quad-core Jaguar x86-64 CPUs tethered together as an 8-core CPU, I suppose we will never know. :rolleyes:
 
It's not even that -- the cores used in the Jaguar CPU were designed for tablets using the Bulldozer architecture.
Jaguar doesn't use the Bulldozer (CMT) architecture; outside of both being x86-64, they share little in common with one another, and clock-for-clock, Jaguar has a higher IPC than Bulldozer.
Jaguar CPUs were not designed for tablets, they were designed for embedded systems, thin clients (typing this on one now), and blade servers.

They are ass. They were ass when they were released. However, they're just less ass than what they replaced (ancient IBM Power stuff), they're full out-of-order x86-64 cores, and there are eight of them.
In 2013, Jaguar CPUs were designed to be low-power CPUs, and were basically (for the cost) all that was available for the consoles.
Intel didn't make APUs, which would have heavily increased the cost; NVIDIA left a bad taste in both Sony's and Microsoft's mouths from prior generations; and ARM CPUs were still completely 32-bit at the time, with ARM64 in its earliest implementations and totally unproven back then.

The 8-core CPUs were very modest, but it was the only way to get 8 threads at the time for a low cost, as no x86-64 design offered 4-way SMT in 2012/2013 when these were being designed and released, and Intel could not offer an all-in-one APU/SoC design at the time with a powerful enough GPU (HD2500 was standard back then for them).
So as much as you are complaining about it now, your opinion is incredibly biased from a 2019 standpoint, and not at all inclusive as to what was truly available and actually on the market in 2012/2013, nor the extreme costs, heat output, and design factors that would have been involved if they had gone with Intel or another company (IBM, Oracle, etc.) at the time.

Another thing, the IBM Cell processor was a powerhouse when it was released in 2006 and utterly quashed any x86 and x86-64 CPU of that era up until around 2010 in FPU-related performance, though it was weak in integer-related performance (really not what it was designed for, though).
The IBM Cell did not even officially go end-of-life until 2012 in enterprise markets, so I would hardly call it "ancient" by 2013.


So in your opinion, what would have been a better option for both Sony and Microsoft to go with around 2012/2013 that would have allowed for 8-threads and an APU/SoC design?
 
Jaguar doesn't use the Bulldozer (CMT) architecture

I'm mistaken here -- it was a Bulldozer-era CPU.

Jaguar CPUs were not designed for tablets, they were designed for embedded systems, thin clients (typing this on one now), and blade servers.

This is a similar TDP space. Jaguar is analogous to Intel's later Atom CPUs, before Intel transitioned that range over to Core... cores.

and were basically (for the cost) all that was available for the consoles.

I've made this point repeatedly in the past, and I purposefully didn't speak to it here. No, there was no other single-vendor alternative at the time, and between AMD's integration of CPU and GPU on the same die and the sharing of a memory controller as well as AMD's near-death bargaining position (another result of Bulldozer), AMD was far and away the least expensive option, and for the low-performance needs of consoles, certainly a viable choice.

So in your opinion, what would have been a better option for both Sony and Microsoft to go with around 2012/2013 that would have allowed for 8-threads and an APU/SoC design?

Taking into account the above, they didn't have to go with an SoC; an Intel CPU and Nvidia GPU would have produced higher performance levels and greater performance / watt unquestionably. That's a space that AMD's CPU division hasn't competed in since Intel was shipping Pentium IVs and since their graphics division was named ATI.

And if they'd used an Intel desktop solution for the CPU, they wouldn't have needed so many cores. An Intel dual-core at 4.0GHz would have been faster for the level of work that console games actually do.


But as noted, the cost for such a solution would look quite prohibitive compared to the starvation wages that Microsoft and Sony could pay (at the time) dying AMD.
 
Basic math is your expectation, but not reality. Clearly!
Dedicated low level programing for one specific chip specifically of 8 cores, combined with bare metal GPU leverage does NOT equal a broad estimation of x86 where cores & threads and MHz are programmed for the largest CPU group ATM.
Your throwing away too much of the reality of console programing for specific hardware to meet your BASIC math extrapolations.
Modern consoles are not "bare metal", and haven't been for nearly 20 years, and have far more layers between the hardware and the software/games at this point.
The only advantage is that it is a closed platform which can be further optimized for.

It still isn't hard to see how the CPU itself stacks up against other CPUs, which is what you are implying is "impossible" when it is so extremely possible.
Even if further optimizations are made, using those numbers and estimations will give us a good ballpark estimate of where the performance of that CPU will land.

Apparently basic logic concepts are out of your reach as well. :meh:
 
Taking into account the above, they didn't have to go with an SoC; an Intel CPU and Nvidia GPU would have produced higher performance levels and greater performance / watt unquestionably. That's a space that AMD's CPU division hasn't competed in since Intel was shipping Pentium IVs and since their graphics division was named ATI.

And if they'd used an Intel desktop solution for the CPU, they wouldn't have needed so many cores. An Intel dual-core at 4.0GHz would have been faster for the level of work that console games actually do.
I believe they wanted more threads, not just more processing power, and at the time nothing else offered 8 threads, at least nothing that wasn't cost-prohibitive or enterprise-based.
You are right, though, even a Sandy Bridge or Ivy Bridge dual-core at 4.0GHz with SMT (4-threads) would have been much more powerful.

At the time, while the Jaguar CPUs were analogous to Atom CPUs, the Jaguar CPUs were, even for that era, much much faster and more powerful than the best Atom processors on the market - while also having a slightly higher TDP as well, though.
Having 8 threads was important for Sony and Microsoft's software development of their platforms, though, and I do think that was mainly the deciding factor, along with the low cost as you mentioned.

For the first iteration of the consoles, the CPU was fine for driving the then-anemic GPUs at 720p and 1080p.
However, at 4K, those GPUs were far from adequate, and unfortunately in the refresh, while the GPUs themselves are very capable of driving 4K, the CPUs are too anemic to properly drive the GPU (feed it enough data) to reach higher frame rates. That is why most games, even optimized exclusives like Bloodborne and Spider-Man, are locked at 30fps regardless of the resolution; the GPU can handle it, but the CPU cannot, at least not consistently, hence those titles' fps being locked.

Agreed with everything else you said.
 
Modern consoles are not "bare metal", and haven't been for nearly 20 years, and have far more layers between the hardware and the software/games at this point.
The only advantage is that it is a closed platform which can be further optimized for.

It still isn't hard to see how the CPU itself stacks up against other CPUs, which is what you are implying is "impossible" when it is so extremely possible.
Even if further optimizations are made, using those numbers and estimations will give us a good ballpark estimate of where the performance of that CPU will land.

Apparently basic logic concepts are out of your reach as well. :meh:
Nope :ROFLMAO::ROFLMAO::ROFLMAO:
 
Modern consoles are not "bare metal", and haven't been for nearly 20 years, and have far more layers between the hardware and the software/games at this point.
The only advantage is that it is a closed platform which can be further optimized for.

It still isn't hard to see how the CPU itself stacks up against other CPUs, which is what you are implying is "impossible" when it is so extremely possible.
Even if further optimizations are made, using those numbers and estimations will give us a good ballpark estimate of where the performance of that CPU will land.

Apparently basic logic concepts are out of your reach as well. :meh:
How many "squeezed" oranges equal a grapefruit in your world? A CPU can perform so dramatically different, depending on what is asked of it. Ya?
You don't think MS and Sony haven't pushes that CPU/GPU to their breaking point? AKA close to metal?
 
Damn, that's one hell of a comeback, don't know how I can top that. /s

How many "squeezed" oranges equal a grapefruit in your world? A CPU can perform so dramatically different, depending on what is asked of it. Ya?
That's what makes PassMark so useful: it is a general-purpose synthetic benchmark, it gives a good rough estimate of where CPUs stand against one another, and it is completely apples-to-apples, one of the few synthetic benchmarks that performs as such.
Other than the Jaguar CPU in the consoles having a memory controller attached to GDDR5 unified memory/VRAM, and having a bridge that links both quad-core CPUs (it isn't a monolithic CPU), it is the exact same Jaguar CPU that is in everything else.

Do you even understand the technology in these consoles?
From everything you keep spouting, it looks like you sure as hell don't.

You don't think MS and Sony have pushed that CPU/GPU to their breaking point? AKA close to metal?
No, and if you knew anything about consoles in the last 20 years, you would know that isn't the case.
Closed platform optimization is not the same as "close to the metal" - ffs, this is common knowledge, and has been for well over a decade.

Both Sony and Microsoft have been using high-level APIs for the last few generations, so "close to the metal" hasn't been in practice in over 20 years.
Unless you think companies and programmers still program in assembly language. :rolleyes:
 
Apparently it is too hard for you to do... why don't you just go sit in the corner and let the adults talk. o_O
You know it would be nice to have an intelligent discussion with you if you could form posts with more than one word, but since that isn't possible, I'm just going to assume you are a troll, or mentally disabled, of which I hope it is the latter, because with the former, your ignorant posts have no excuse.
 
I believe they wanted more threads, not just more processing power, and at the time nothing else offered 8 threads, at least nothing that wasn't cost-prohibitive or enterprise-based.

As covered in this thread earlier by others, there's not really a reason for more threads if fewer threads will do. It's even faster, assuming that all other variables can be 'leveled'.

But that's the trick: as we noted above, while an Intel-based solution would be higher performing in every metric, and the lower number of cores would have been irrelevant, the price of an Intel CPU plus an external GPU on separate silicon, with separate memory for each and the control logic between them, would have been far higher.
 
As covered in this thread earlier by others, there's not really a reason for more threads if fewer threads will do. It's even faster, assuming that all other variables can be 'leveled'.

But that's the trick: as we noted above, while an Intel-based solution would be higher performing in every metric, and the lower number of cores would have been irrelevant, the price of an Intel CPU plus an external GPU on separate silicon, with separate memory for each and the control logic between them, would have been far higher.

It's not necessarily always true that fewer, faster cores are equivalent to more, slower cores, even if the performance should be equal on paper. Fewer cores mean more thread-management overhead due to more frequent thread switching. There's also a chance that a thread that would be able to run on a 5th core would have to wait longer when there are only 4, because other threads have higher scheduling priority. Hyperthreading could alleviate that some, but you're still going to have threads contending for the shared resources of a hyperthreaded core where they wouldn't on two independent cores.
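If anyone wants to see that shared-resource contention for themselves, a rough sketch is to time the same CPU-bound work pinned to two SMT siblings of one core versus two separate cores. This assumes Linux (for os.sched_setaffinity and the sysfs topology files) and that SMT is actually enabled; which logical CPU IDs share a core is topology-dependent:

```python
# Rough sketch: time two CPU-bound workers pinned to the SMT siblings of one core
# versus pinned to two (assumed) different physical cores. Assumes Linux with
# SMT enabled; os.sched_setaffinity is not available on Windows/macOS.
import os
import time
from multiprocessing import Process

def siblings_of(core=0):
    """Logical CPU IDs sharing the given physical core (read from sysfs)."""
    path = f"/sys/devices/system/cpu/cpu{core}/topology/thread_siblings_list"
    with open(path) as f:
        raw = f.read().strip()             # e.g. "0,8" or "0-1"
    if "-" in raw:
        lo, hi = map(int, raw.split("-"))
        return list(range(lo, hi + 1))
    return [int(x) for x in raw.split(",")]

def burn(cpu, n=20_000_000):
    os.sched_setaffinity(0, {cpu})         # pin this worker to one logical CPU
    x = 0
    for i in range(n):                     # pure-integer busy loop
        x += i * i

def timed_pair(cpus):
    procs = [Process(target=burn, args=(c,)) for c in cpus]
    start = time.perf_counter()
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    smt_pair = siblings_of(0)[:2]          # two logical CPUs on the same core
    # Assumption: logical CPUs 0 and 1 sit on different physical cores; on some
    # topologies they are siblings, so check siblings_of() output first.
    print("same core (SMT):", round(timed_pair(smt_pair), 2), "s")
    print("two cores      :", round(timed_pair([0, 1]), 2), "s")
```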
 
As covered in this thread earlier by others, there's not really a reason for more threads if fewer threads will do. It's even faster, assuming that all other variables can be 'leveled'.

But that's the trick: as we noted above, while an Intel-based solution would be higher performing in every metric, and the lower number of cores would have been irrelevent, price for both an Intel CPU and the price for an external GPU on separate silicon with separate memory for both and the control logic between would have been far more.
Though the only thing is that on the PS4 and XBone, some of the cores are dedicated strictly to running the OS itself, just as some of the unified memory is dedicated to the OS.
With fewer cores, that may have been more difficult to pull off with thread scheduling, rather than having a dedicated core (or two) for just the OS.

For all other systems and processes outside of consoles, you are correct.
I do think the bottom line, at least back in 2012/2013, was the cost and the limitation of the form-factor for Sony and Microsoft's options, and it was win-win for AMD contractually, even with minimal profit margins.
 
It's not necessarily always true that fewer, faster cores are equivalent to more, slower cores, even if the performance should be equal on paper. Fewer cores mean more thread-management overhead due to more frequent thread switching. There's also a chance that a thread that would be able to run on a 5th core would have to wait longer when there are only 4, because other threads have higher scheduling priority. Hyperthreading could alleviate that some, but you're still going to have threads contending for the shared resources of a hyperthreaded core where they wouldn't on two independent cores.

This is true, but it also depends significantly on the workload -- if the processes are completely independent, then they might run faster on separate cores; however, if they are related, then the overhead of managing them might easily eclipse any performance gains from running on separate cores. Generally speaking, two high-speed cores with SMT2 would be enough for nearly all consumer systems, if those two cores were fast enough and were properly fed by the rest of the system.

Though the only thing is that on the PS4 and XBone, some of the cores are dedicated strictly to running the OS itself, just as some of the unified memory is dedicated to the OS.

They did this in order to easily ensure a relatively consistent experience with respect to the OS. It makes sense when you have eight weak cores, as core / thread affinity isn't something that anyone has to guess at, but it's also not like operating systems aren't already good with process priority in general, nor that in a homogeneous environment would the kernels used in the consoles have any issue with fewer, faster cores.
and it was win-win for AMD contractually, even with minimal profit margins.

It was a huge win for AMD and for consumers; while AMD didn't rake in the profit, the revenue stream absolutely helped keep their operations going and their R&D moving forward -- and given that they had already supplied a solution for both Microsoft and Sony, it was in both companies' interest to ensure that AMD stayed hard at work on technology that would be used to power future iterations.

Only now, after the upcoming consoles are released, would Intel, Nvidia, Samsung, or possibly Qualcomm or ARM themselves be up to the task of providing a competent console CPU, and only Intel has x86 for a more seamless transition if that were to happen. And it can happen! But it's still a longshot :).
 
Speak for yourself pleb. :D

On the serious side, most of my CPU-hours go to highly scalable stuff that could probably use SMT4.
If SMT4 would just give me a 10% boost, that would mean I get to finish some things days faster.

Some of my scripts easily scale the workload to above 1000 threads without losing almost any efficiency.
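For context, the kind of workload that scales that far is usually one where each task mostly waits on I/O rather than computing; here is a minimal illustrative sketch, with a made-up fetch_one task standing in for the real work:

```python
# Minimal sketch of a workload that scales to very high thread counts: each task
# spends most of its time waiting (network, disk, subprocess), so the threads
# barely contend for the CPU. The task body here is illustrative only.
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_one(item):
    time.sleep(0.05)          # stand-in for an I/O wait (HTTP call, disk read, ...)
    return item * 2

items = range(5000)

with ThreadPoolExecutor(max_workers=1000) as pool:
    results = list(pool.map(fetch_one, items))

print(len(results), "items processed")
```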

Are you the typical user???? Not in a million years. I did not say there was no use for it for any consumers, I said it was not a pressing need where tens of thousands if not hundreds of thousands of consumers would benefit from 4-way SMT. You really think that a minority of one, so to speak, will dictate a consumer CPU product???? I suggest you have a better chance of building your own fab and engineering your own chip design for 4-way SMT than you have of getting this in a consumer-level chip in the next 2 to 3 years.
 
Are you the typical user???? Not in a million years. I did not say there was no use for it for any consumers, I said it was not a pressing need where tens of thousands if not hundreds of thousands of consumers would benefit from 4-way SMT. You really think that a minority of one, so to speak, will dictate a consumer CPU product???? I suggest you have a better chance of building your own fab and engineering your own chip design for 4-way SMT than you have of getting this in a consumer-level chip in the next 2 to 3 years.

LOL. Did you totally miss the sarcasm there?
Well, you also missed providing any evidence to support your one- and two-liner claims in this thread.

But let's actually look at your posts, since you brought them into the spotlight:
Oct 7, 2019 os2wiz : "4 way smt will NOT scale well, is totally useless for home use and will only be available for server chips"

totally [ toht-l-ee ]
adverb
wholly; entirely; completely.
https://www.dictionary.com/browse/totally
https://www.merriam-webster.com/dictionary/totally

So, to your question: yes, you did in fact say "... there was no use for it for any consumers".

I see you put just as much research into this comment as your prior ones.
This is what happens when you type first and think later. You end up disagreeing with yourself...
 