We found the Missing Performance: Zen 5 Tested with SMT Disabled

Wishful/borderline delusional thinking. Zen 6 might get 12 cores on a single CCD and I wouldn't expect 16 cores until Zen 8 or later.

TSMC's 3nm tech, which Zen 6 will likely use, increases transistor density by only 33% compared to the 4nm tech Zen 5 is currently using. 33% more transistors won't fit double the cores on the same sized die.

Additionally, even though TSMC's 5nm tech improved transistor density by 80% compared to 7nm, going from Zen 3 to Zen 4 we got bigger cores with larger caches instead of more cores. I expect the trend of spending die space on larger cores to continue. Single-threaded performance is still the main determinant of overall gaming performance: a hypothetical 7600X3D (6-core X3D Zen 4) would solidly beat a 7950X in gaming.
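For what it's worth, the density argument is easy to sanity-check with a back-of-envelope calculation. The 33% figure is the one quoted above; the assumption that core count scales directly with density is my own simplification (and a generous one, since SRAM and I/O shrink worse than logic):

```python
# Back-of-envelope: does a 33% density gain let you double core count
# in the same die area? Simplifying assumption: cores scale directly
# with transistor density (ignores SRAM/IO, which scale worse).
zen5_cores = 8
n3_density_gain = 1.33  # TSMC N3 vs N4, per the figure above

cores_in_same_area = zen5_cores * n3_density_gain
print(cores_in_same_area)  # ~10.6 cores' worth of transistors, nowhere near 16
```

Even under that optimistic assumption, you land around 10-11 cores in the same area, which is why 12 cores per CCD looks plausible for Zen 6 and 16 does not.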
My ideal solution would be an 8-core Zen X3D CCD with a 16-core Zen C CCD.

You have the 8 cores with all of the cache for latency-bound computing.

You have the 16 cores for throughput-bound computing.
 
Yeah but they have to do the whole P/E core thing to manage it and it’s a major PITA. It’s what’s caused the voltage issues, timing issues, AVX512 nonsense, and numerous other problems along the way.

FWIW AMD is going in the same direction. Apple is, too. I think we're just hitting bumps in the road that kernel devs are still struggling with.

Credit where credit's due, Intel should be recognized as the first to go down that road, although man did they hit the bumps hard.
 
FWIW AMD is going in the same direction. Apple is, too. I think we're just hitting bumps in the road that kernel devs are still struggling with.

Credit where credit's due, Intel should be recognized as the first to go down that road, although man did they hit the bumps hard.
The difference between Intel and AMD here is that AMD's little cores are simply shrunken Zen 4/Zen 5 cores: all the same features and IPC, just running a little slower. So they won't have to drop features or invent new scheduling to make it work.
 
FWIW AMD is going in the same direction. Apple is, too. I think we're just hitting bumps in the road that kernel devs are still struggling with.

Credit where credit's due, Intel should be recognized as the first to go down that road, although man did they hit the bumps hard.
ARM has been doing it for years; you can go back to 2014 with the Cortex-A17, which allowed a big.LITTLE pairing with the A7 architecture.
Dissimilar cores have been a thing for a long time, and the approach has strengths when implemented in a system tailored for the solution.
Microsoft is too generic, and as a result their system can't handle it well. That may change in time, but not until there are enough configurations out there to make it worthwhile for them to optimize for it.
This problem is why Intel has spent so much time trying to build a scheduler that presents the heterogeneous CPU to Windows as a single homogeneous one and leaves task assignment to the CPU itself. That project has hit some significant road bumps, but it's probably the correct solution, especially with GPUs and NPUs, and god only knows what else could be thrown into the mix. But Microsoft is not making it easy on anybody, not even themselves: they removed the little cores from their own ARM package because they didn't want to deal with the issue firsthand. Then again, the Surface platform as a whole sells like ass outside the enterprise market, and the enterprise market just needs functional silicon in employees' hands, so the new chips at least cover the required checkboxes.
 
ARM has been doing it for years; you can go back to 2014 with the Cortex-A17, which allowed a big.LITTLE pairing with the A7 architecture.

Yeah that's true, I was just giving Intel credit for bringing it to the desktop/laptop scene. And Microsoft deserves a bunch of the discredit when it comes to scheduling around it, since like you pointed out, this isn't exactly new.

I expect more bumps in the road with the addition of AI coprocessors, and I suspect we're going to see a lot of other specialized silicon for task-specific work in the hunt for better power efficiency.
 
Poor AMD was so blindsided with Windows coming out of nowhere. They didn't have enough warning in advance to test their product on such a niche OS. Don't worry people, AMD is just another driver/microcode/update/patch... away from the great performance.

Indeed and yet who got a new OS version for their new e-cores a couple of years ago?
 
My ideal solution would be an 8-core Zen X3D CCD with a 16-core Zen C CCD.

You have the 8 cores with all of the cache for latency-bound computing.

You have the 16 cores for throughput-bound computing.
I'd love to have the option for a 16 core with vcache on both dies, so there are no scheduling issues.
 
I'd love to have the option for a 16 core with vcache on both dies, so there are no scheduling issues.
I assure you that would be worse. Everything is great until core 4 on CCD0 needs to coordinate with core 9 on CCD1 and everything lags like hell because the latency gap between them is so bad.
What you want is a single CCD with 12 cores and cache to spare, stacked or otherwise.
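For what it's worth, on Linux you can enforce that "stay on one CCD" placement per process yourself with CPU affinity. A sketch, assuming CCD0 maps to logical CPUs 0-7 (the real mapping varies by SKU and SMT settings; check `lscpu` or `/sys/devices/system/cpu/cpu*/topology` on your own machine):

```python
import os

# Hypothetical mapping: CCD0 = logical CPUs 0-7. Verify the actual
# topology on your own system before relying on this.
CCD0 = set(range(8))

# Intersect with the CPUs we're actually allowed to run on, so this
# also works on machines with fewer cores.
allowed = CCD0 & os.sched_getaffinity(0)
os.sched_setaffinity(0, allowed)  # pin this process (pid 0 = self) to CCD0

print(sorted(os.sched_getaffinity(0)))
```

This is roughly what game-mode driver tweaks and `taskset` do; it keeps all of a game's threads on one CCD so no cache line ever has to hop over the IO die.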
 
Poor AMD was so blindsided with Windows coming out of nowhere. They didn't have enough warning in advance to test their product on such a niche OS. Don't worry people, AMD is just another driver/microcode/update/patch... away from the great performance.

Not sure if serious.

But hardware has ALWAYS driven Windows development. Hardware designers don't design to old code. They are at the forefront. It's extremely common for hardware to never achieve full potential because software/firmware never catches up before the next product cycle begins. Except on Linux.
 
I remember downloading the patch for my FX8350 for Windows 7. Of course hardly anyone retested the Bulldozer chips when the patch was released.
https://support.microsoft.com/en-us...-2008-r2-88574d02-f181-2a37-fee4-939b102ce89a
Plus, if you looked at the settings they were using, it was 2000/2000 for HT/NB, which is what Phenoms ran, when those Piledriver chips were supposed to be 2600/2200, and they tested them with 1600 DDR3 when they ran best with 2133. Not to mention the big overclocks you could get on them. Add in the garbage Windows scheduler and there was a lot of performance left on the floor. Guess why they started overclocking them out of the box these days? I ran a custom liquid cooled 8370 for a long time and never had problems running games, but I did see a massive improvement after getting it tuned vs out-of-the-box performance. That was half the fun of DIY builds. Not as fun as it used to be, if you ask me.
 
Not sure if serious.

But hardware has ALWAYS driven Windows development. Hardware designers don't design to old code. They are at the forefront. It's extremely common for hardware to never achieve full potential because software/firmware never catches up before the next product cycle begins. Except on Linux.
Architecture is designed years in advance, and finished CPUs tape out months before launch. If they can't work out support with MS in time, that's on them, and for microcode/BIOS doubly so.
 
Architecture is designed years in advance, and finished CPUs tape out months before launch. If they can't work out support with MS in time, that's on them, and for microcode/BIOS doubly so.
What gets me is Lisa was on stage with a Microsoft engineer, holding up a chip and bragging about how they had been working with Microsoft for months to ensure everything was working smoothly and would be seamless... "to avoid the problems our competitor has."
Only to launch the product and find out it's tied to the freaking Xbox Game Bar in a janky-ass workaround that sucks 4 flavors of balls.
 
What gets me is Lisa was on stage with a Microsoft engineer, holding up a chip and bragging about how they had been working with Microsoft for months to ensure everything was working smoothly and would be seamless... "to avoid the problems our competitor has."
Only to launch the product and find out it's tied to the freaking Xbox Game Bar in a janky-ass workaround that sucks 4 flavors of balls.
I mean, they partnered with Microsoft, what did you expect?
 
I mean, they partnered with Microsoft, what did you expect?
To get ravaged from behind by a drunken dev team one day while I was innocently trying to read /writingprompts, but no...
Instead, the older couple from the bar sent me a drink with a wink, and I don't know what happened next but I came to the next day with a splitting headache and one fewer kidney.

I mean it was so abrupt, they didn't even give me the chance to try and enjoy it.
 
I read months ago that the 9000 series were supposed to be a marginal improvement over the 7000 series. Like very marginal.

Looks about right to me. Plus AMD doesn't like to give people any reason to buy their new stuff; they seem to prefer to use their new stuff as a way to advertise their old stuff. At least until the price cuts start.

Personally I really like that, the mileage I've gotten out of my AM4 machines is fantastic, that's convinced me to stick with AMD for at least another generation.
 
There looks to be a sizable performance difference between Windows 11 and Ubuntu when it comes to Zen5.
https://www.phoronix.com/review/ryzen-9950x-windows11-ubuntu
 
The idea here is that Windows 11 isn't doing Zen5 any favors.
Hardware Unboxed tested Zen 4 and it's nearly the exact same boost, so it's not only Zen 5. They didn't have time to test an Intel chip, but Steve believes there would be a similar boost with Intel as well.
 
Architecture is designed years in advance, finished CPUs tape out months before. If they're not capable of working out support with MS on time, that's on them, and microcode/BIOS doubly so.
Nope. That's the entire point of these posts. On Linux you get the performance uplift because it's Linux and open source. Windows is NOT open source, so we're seeing these weird performance anomalies.
 
Nope. That's the entire point of these posts. On Linux you get the performance uplift because it's Linux and open source. Windows is NOT open source, so we're seeing these weird performance anomalies.
Exactly. CPUs from everyone, not just AMD, have gotten too complicated: too many disparate cores/CCXs/coprocessors (in the case of Qualcomm). The engineers that create these chips NEED to be the engineers looking over the software implementation, in terms of how the OS is scheduling cores and how the HAL/kernel is interfacing with the chip and its cores/cache systems etc. Playing telephone, sending specs and suggestions to Microsoft's people, isn't working out.

We have been seeing this for a few years now, with new CPUs from AMD AND Intel having issues the first few months that mostly get fixed eventually. I suspect the terrible performance of Windows on ARM on the new Qualcomm chips is probably a related issue. These companies are giving info BUT not code to microfloppy. Microsoft's engineering teams are implementing all these very different CPU designs on their end, with their closed-source scheduling and HAL bits. The results are not pretty, it's costing Microsoft, and it's also going to get much worse. We have more CPUs now with onboard AI bits, such as AMD's XDNA2 cores and Intel chips with AI bits. Qualcomm also had a crap launch, and I bet if someone runs tests on ARM Windows they will find the same performance uplifts running as admin. It's too bad Qualcomm hasn't been more directly supporting Linux with their new Snapdragon and, more importantly, their GPU. It would be interesting to see what those chips are fully capable of if the Qualcomm engineers handled the support for the entire pipe. I bet they would run pretty damn well.

For Microsoft, I still don't get why they are clinging to their Windows kernel. They could switch to a Microsoft fork of the Linux kernel and still close-source 90% of the rest of their OS. Switching to a Linux file system would be a minor issue; they could use a proper NTFS replacement anyway, and most devices people are plugging in are running Linux file systems already. They could hand off a bunch of the heavy lifting of hardware support directly to hardware MFGs. I honestly thought that was the direction Microsoft would go years ago when they started adding the Linux subsystem stuff. I think, really, they aren't going to have much choice in the future. IMO it has cost them their big ARM Windows push twice now; with their last push, the hardware on the market was better than the OS. (And from Linux support for some of those old chips we know that's true: Linux ran better on those machines, often by quite a wide margin.) It's now cost AMD a launch that is a bit of a face-plant. I think Qualcomm's Snapdragon X chip falling flat is probably also largely on MS. Anyone want to take bets that Intel's Arrow Lake has Windows issues as well?
 
Exactly. CPUs from everyone, not just AMD, have gotten too complicated: too many disparate cores/CCXs/coprocessors (in the case of Qualcomm). The engineers that create these chips NEED to be the engineers looking over the software implementation, in terms of how the OS is scheduling cores and how the HAL/kernel is interfacing with the chip and its cores/cache systems etc.

Electronic/microarchitecture engineers aren't OS/kernel engineers. These are two very different disciplines.
 
Electronic/microarchitecture engineers aren't OS/kernel engineers. These are two very different disciplines.
Of course. AMD and Intel do, however, have all those engineers in house, and they work hand in hand with the hardware design team to bring up Linux support. The kernel engineers at AMD are not shocked by some new AMD server setup. Same goes for Intel; both push updates for CPUs a good year or more out. It's fair to say that in both AMD's and Intel's case their teams are >>> Microsoft's. Both contribute tons of work to Linux, and the majority of that work is bringing up CPU support. There is good reason Zen 5 just works under Linux regardless of workload (including games) without doing anything special on the user end. This is also true for Intel and their core setups.

For kernel 6.10, AMD contributed 25,793 changed lines and Intel 88,245. Together, Intel and AMD accounted for 17.6% of the changes in the latest version of the kernel. I don't know how many of those lines were directly CPU related vs GPU or file system etc. I know this kernel introduced a bunch of Intel CPU load-state stuff that probably explains their massive contribution this round.
https://lwn.net/Articles/981559/
For the previous kernel, 6.9, AMD contributed 171,877 lines of code and Intel 70,800; 30.6% of all the code changes for that kernel version were from them. It's safe to say they have both been busy bringing up support for new upcoming CPUs.
https://lwn.net/Articles/972605/
The next kernel has the new gaming-optimized scheduler included, and I think it will be interesting to see how much code both AMD and Intel contributed. I know they both had people sending pulls, so it's probably going to be another 30%+.

Microsoft is swimming upstream. Essentially every other company in the world is working on the competing system. The companies that power Windows machines (including Qualcomm) contribute thousands of hours of manpower every month apiece; it seems crazy not to harness that like everyone else. Why have hardware implementation engineers at Microsoft when you can let the ones the hardware companies hired themselves work for you for free?
 
Nope. That's the entire point of these posts. On Linux you get the performance uplift because it's Linux and open source. Windows is NOT open source, so we're seeing these weird performance anomalies.
That is not an excuse, or at least it shouldn't be. It's not like they put out new CPUs every other month, and they are hardly some obscure company: they are the second largest player in Microsoft's most important market. Poor support is on them. Hell, if leaks are to be believed, they are not even capable of getting their own internal code done on time.
 
Hardware Unboxed tested Zen 4 and it's nearly the exact same boost, so it's not only Zen 5. They didn't have time to test an Intel chip, but Steve believes there would be a similar boost with Intel as well.
Yes, with that extra admin account. There's a lot more going on there that manages how the CPU works in the OS, particularly how AMD's Zen 5c cores are and aren't used per application.
 
Or is it a Windows 11 thing? Windows 10 sounds like it's similar...

And Intel is similar to AMD.
I’m thinking this has to be a fundamental permission problem inside the HAL and it’s likely been there since Windows 8.
 
Interesting find. Hoping to see a lot more data on this in the coming days (and not only from HUB).

And I suppose I can do my own tests too when I stop being lazy.

Though running as admin (in addition to security risks) does cause some unwanted things with various apps. Like if you run your game as admin, you will also need to run Discord as admin for the keybinds to mute/PTT to function, for example...

And Discord being this connected, social app, with a large number of not so well intended users, you sure as hell don't want to run it as admin. Daily flow of users getting fooled by malicious links and DMs spreading across servers.

None of this changes anything when it comes to the Zen 5 disappointment though.
 
Interesting find. Hoping to see a lot more data on this in the coming days (and not only from HUB).

And I suppose I can do my own tests too when I stop being lazy.

Though running as admin (in addition to security risks) does cause some unwanted things with various apps. Like if you run your game as admin, you will also need to run Discord as admin for the keybinds to mute/PTT to function, for example...

And Discord being this connected, social app, with a large number of not so well intended users, you sure as hell don't want to run it as admin. Daily flow of users getting fooled by malicious links and DMs spreading across servers.

None of this changes anything when it comes to the Zen 5 disappointment though.
I find it more interesting that AMD was using this account to generate its benchmarks to begin with. So they knew it’s a thing and has been for a long time.

So they have been using it to inflate their marketing numbers for how long?
 
I find it more interesting that AMD was using this account to generate its benchmarks to begin with. So they knew it’s a thing and has been for a long time.

So they have been using it to inflate their marketing numbers for how long?
Or they simply had too much faith that MS would have fixed the issue in an update prior to their launch. At this point I think it's safe to assume the quality issue AMD held the launch up for is as likely to be MS missing a promised update date as it is some CPUs being mislabeled. lol
 
Or they simply had too much faith that MS would have fixed the issue in an update prior to their launch. At this point I think it's safe to assume the quality issue AMD held the launch up for is as likely to be MS missing a promised update date as it is some CPUs being mislabeled. lol
Faith Microsoft would suddenly patch an issue that can be replicated going back OS releases for 9 years?

I know some people worship at the Church of Gates but that’s a level of faith I’d describe as fanatical.

We can’t even be certain this isn’t just considered a cost of doing business with the core of the Microsoft OS Abstraction Layers. This may not be something they actively consider a “bug”.
 
lol, years ago when I put in a Ryzen performance fix regarding SMT and CCX, people were hounding me for being an Intel fan (I have AMD atm).

Shameless self-plug, but I will say I have enjoyed being able to disable SMT or CCX switching in multiple games. I'm still testing disabling the E-cores for games (they still work for background load).




As CPUs become more and more asymmetric, proper handling of thread allocation becomes important for optimal performance.
Disabling SMT has been a potential performance boost since the first Core i7.
 
lol, years ago when I put in a Ryzen performance fix regarding SMT and CCX, people were hounding me for being an Intel fan (I have AMD atm).

Shameless self-plug, but I will say I have enjoyed being able to disable SMT or CCX switching in multiple games. I'm still testing disabling the E-cores for games (they still work for background load).



As CPUs become more and more asymmetric, proper handling of thread allocation becomes important for optimal performance.
Disabling SMT has been a potential performance boost since the first Core i7.
In my experience, the best overall gaming setting for Intel 12th, 13th, and 14th gen is no HT, with E-cores enabled.

Significantly lower power and heat, and it's essentially as good as using Intel's APO feature.
 
In my experience, the best overall gaming setting for Intel 12th, 13th, and 14th gen is no HT, with E-cores enabled.

Significantly lower power and heat, and it's essentially as good as using Intel's APO feature.
You make similar changes if you are running Hyper-V on one of them too; you get much better performance from the VMs that way.
 
In my experience, the best overall gaming setting for Intel 12th, 13th, and 14th gen is no HT, with E-cores enabled.

Significantly lower power and heat, and it's essentially as good as using Intel's APO feature.

It depends on the number of CPU-heavy threads, but yes, I agree: an extra thread executing on a logical SMT core gains way less performance than an extra thread executing on an E-core.
If your game/application has few enough CPU-heavy threads, disabling E-cores in addition to SMT can also give you an additional boost over just disabling SMT, again depending on the software's threading ability and the number of cores on your CPU.

A potential worst-case scenario for your suggestion would be a single-threaded game/application. SMT has no drawback for a single-threaded application, but E-cores surely do.

So your suggestion is good; the optimum is just more complicated than one answer.

--- edit ---
Some number fun:
In my quick and dirty test (don't take it too seriously), it seems a P-core on my laptop is around 40%-ish faster than my E-cores.
In reverse, that means an E-core is only around 70% of a P-core (not counting SMT).
Earlier testing (a long time ago) seems to indicate SMT gives around a 20-25% performance boost.

If I'm a piece of software with 2 threads and my choice is to run on a single physical core with SMT, I'm running at 120-125% of single P-core performance.
If I'm instead running on 2 E-cores, I'm running at 140% of single P-core performance.
So yes, getting rid of SMT before getting rid of E-cores is the right way to go for multithreaded software (if the thread count is low enough).
But if the thread count is low enough that the choice for each thread is a dedicated P-core vs a dedicated E-core, then it might be additionally beneficial to cut out the E-cores as well, to ensure the threads always have a 1:1 ratio (or better) with dedicated P-cores.
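The arithmetic in that comparison can be written out explicitly. The 0.70 and ~0.22 ratios below are the rough laptop estimates from the post above, not authoritative numbers:

```python
# Relative throughput model, everything normalized to one P-core = 1.0.
# Ratios are rough estimates from the post above, not measurements
# to rely on.
E_CORE = 0.70     # one E-core ~ 70% of a P-core
SMT_GAIN = 0.22   # a second SMT thread adds ~20-25% on a P-core

two_threads_one_p_with_smt = 1.0 + SMT_GAIN  # both threads share one P-core
two_threads_on_two_e = 2 * E_CORE            # one thread per E-core

print(two_threads_one_p_with_smt)  # the "120-125%" case
print(two_threads_on_two_e)        # the "140%" case
```

Under these assumptions, two E-cores beat one SMT-loaded P-core for throughput, which is the post's argument for shedding SMT before shedding E-cores.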

--- edit 2 ---
This is why I love my software: I can change it on the fly, with no reboot, and it only impacts the game I'm playing.
If I disable SMT to ensure my game's threads get dedicated full cores, the SMT siblings still linger to handle background load without disturbing the execution time of my game.
Same with E-cores: the E-cores are not used for the game but can still run background stuff, so they don't take away from the P-cores.
Much better than disabling SMT and E-cores in the BIOS, which totally removes them.
 
AMD brought it upon themselves. Intel has shown repeatedly that there is no need to change what you are doing. If you’ve been doing the same thing for 10 years, just keep doing it. Especially in the world of computer processing. 😏
 