AMD: Zen will offer 40% faster performance per clock than Carrizo

And Intel knows it... The return and re-implementation of OoOE (out-of-order execution) as a successor to Hyper-Threading is, I think, a good example and a good route for Intel in Cannonlake. I think that's the main reason they are planning to increase the core count of mainstream chips: not by using HT but by relegating the work to OoOE, which is supposed to offer +30% multi-thread performance at the same energy cost as HT while also allowing higher FP performance. So 6 logic cores with 12 OoOE threads will be great for the mainstream market.

Intel has had plenty of time to refine OoOE with those tiny Silvermont and Goldmont Atom chips, with Haswell IPC in 4-core/4-thread mobile chips at just 5W TDP or in 8c/8t server chips, which work great. So I guess all that experience and refinement added to Cannonlake will be interesting at the very least.

I'm confused what you mean by Out of Order in this context.

I mean just about every non-Atom Intel chip made since the introduction of P6 has used some form of reordering of the queue.
 
Zarathustra[H];1041891958 said:
I'm confused what you mean by Out of Order in this context.

I mean just about every non-Atom Intel chip made since the introduction of P6 has used some form of reordering of the queue.

I'm referring to OoOE as a re-implementation and a Hyper-Threading successor. Hyper-Threading itself is based, as you said, on a form of queue reordering. Let's say, take it as a new version of Hyper-Threading but with stronger multi-thread performance.
 
I'm referring to OoOE as a re-implementation and a Hyper-Threading successor. Hyper-Threading itself is based, as you said, on a form of queue reordering. Let's say, take it as a new version of Hyper-Threading but with stronger multi-thread performance.

What does that mean though? That sounds like HT/SMT just with better efficiency than intel has now.
 
What does that mean though? That sounds like HT/SMT just with better efficiency than intel has now.

HT and SMT are kind of different in a lot of aspects. SMT works better than HT by a wide margin, and OoOE is more similar to SMT than HT, so yes, in summary it's like HT/SMT with better efficiency, and that means better multi-thread performance in mainstream chips. 6 cores with the new OoOE at 350ish is great value in the mainstream segment in my opinion, especially for those like me with 8+ machines pushed 24/7. Increasing performance and getting faster results while also lowering the power bill is great. 2017 will be an interesting year in my opinion, as I'm looking for a high core count but also strong IPC performance. I'll buy the better option regardless of brand. My 2 FX platforms do a good job but are far from what the old 3930Ks can do.

Now, from a gaming perspective, it would also be great to have 8 or 12 stronger threads than what we have now on the Intel side. I could see that as a big benefit, as in 2017 most games will be DX12 already.
 
HT and SMT are kind of different in a lot of aspects. SMT works better than HT by a wide margin, and OoOE is more similar to SMT than HT, so yes, in summary it's like HT/SMT with better efficiency, and that means better multi-thread performance in mainstream chips..

You're not being clear here.

Hyperthreading is an implementation of simultaneous multithreading. They are not different; it's simply confusing when you realize that there are so many different implementations of SMT, none of which has been determined to be "best."

HT = SMT. It's thread-level parallelism. In the Intel implementation, it takes ADVANTAGE of superscalar core design by attempting to utilize more execution units. But it won't work if you don't have independent threads to run on the system.

Out of Order Execution also attempts to utilize more functional units with instruction-level parallelism. It reschedules instructions from a SINGLE THREAD. So long as there's no dependency, this can increase usage of execution units.

Both methodologies have the same end goal: better utilization of execution units. And both have completely different ways of going about that goal, both on software requirements, and hardware implementations!

HT = SMT.

OoOE is its own concept, separate from multiple threads.
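To make the single-thread point concrete, here's a toy sketch in Python. It is not a model of any real Intel or AMD core: it assumes every instruction takes exactly one cycle, a made-up 2-wide issue, and a hypothetical dependency list (all function names are mine). It only shows how reordering independent instructions from ONE thread fills execution slots that strict program order leaves empty.

```python
# Toy issue model (illustrative assumptions only):
# - every instruction takes one cycle
# - an instruction may issue only after all of its dependencies
#   completed in an EARLIER cycle
# - deps[i] lists the indices of earlier instructions that i depends on

def _ready(i, deps, done_cycle, cycle):
    return all(d in done_cycle and done_cycle[d] < cycle for d in deps[i])

def cycles_in_order(deps, width):
    """Issue strictly in program order, at most `width` per cycle."""
    done_cycle = {}
    cycle = 0
    i = 0
    while i < len(deps):
        cycle += 1
        issued = 0
        while i < len(deps) and issued < width:
            if not _ready(i, deps, done_cycle, cycle):
                break  # the stalled instruction blocks everything behind it
            done_cycle[i] = cycle
            i += 1
            issued += 1
    return cycle

def cycles_out_of_order(deps, width):
    """Each cycle, issue any ready instructions (oldest first), up to `width`."""
    done_cycle = {}
    pending = list(range(len(deps)))
    cycle = 0
    while pending:
        cycle += 1
        ready = [i for i in pending if _ready(i, deps, done_cycle, cycle)]
        for i in ready[:width]:
            done_cycle[i] = cycle
            pending.remove(i)
    return cycle

# One thread containing two independent 3-instruction chains:
# 0 -> 1 -> 2 and 3 -> 4 -> 5
two_chains = [[], [0], [1], [], [3], [4]]
# One thread that is a single dependent chain: nothing to reorder
one_chain = [[], [0], [1], [2]]

print(cycles_in_order(two_chains, 2), cycles_out_of_order(two_chains, 2))
print(cycles_in_order(one_chain, 2), cycles_out_of_order(one_chain, 2))
```

In this toy, the two-chain program takes 5 cycles in order and 3 out of order, while the fully dependent chain takes 4 either way. That second case is where SMT can still help, because a second thread supplies independent work that reordering within one thread cannot find.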
 
I run a 48 core Opty as my home PC and I can tell you that just about nothing is even remotely capable of utilizing it. 16 threads is plenty for desktop :) I wish AMD came out with an 8 core/16 thread APU with shared HBM2 and a Fiji on it :) I'd pay $1200 for it.
We were just goofing around; nothing scales on 32 cores, of course, we all know that. I think the engine that Street Fighter 4 used supposedly had really good parallelism: it would split off tasks as many times as it could.
Not to say that this isn't something we would like to see, because in general the only scaling left in CPUs is more cores, since we're not going forward on gigahertz these days.
So I've been doing multi-threaded coding recently and it's a lot of fun to have these huge core counts.... so... 128 cores or bust! :D
I wouldn't mind seeing a high core count, but it has to be "usable" in the desktop market rather than something exotic.

So 8 cores makes sense to me, but if there were a 16 core version for AM4 I would not think twice about purchasing that 16 core one :) .
 
Poor design for the consumer market ;)

Poor design for the fabrication process they used, and pretty much all markets, other than select virtualization/encryption servers.

BD chips are not all that bad. What really hurt them was the lack of a new chipset beyond AM3+ and the difficulty of getting high performance out of the 28nm node they used for Steamroller and, well, Excavator. The latter caused them to cancel plans for a Steamroller FX and an Excavator FX on the desktop. The process they used was suited to lower frequency and better power consumption. SR FX and EX FX would have been relatively good chips if they had been produced and had been able to hit the clock speeds of Piledriver.
 
Poor design for the fabrication process they used, and pretty much all markets, other than select virtualization/encryption servers.

BD chips are not all that bad. What really hurt them was the lack of a new chipset beyond AM3+ and the difficulty of getting high performance out of the 28nm node they used for Steamroller and, well, Excavator. The latter caused them to cancel plans for a Steamroller FX and an Excavator FX on the desktop. The process they used was suited to lower frequency and better power consumption. SR FX and EX FX would have been relatively good chips if they had been produced and had been able to hit the clock speeds of Piledriver.

It's not all process and chipset related. They were a huge disappointment even on their original process with their original chipset at the time they were launched. I don't see how a new chipset would have changed that.

I'm not suggesting that the Bulldozer architecture was without its merits. It does great (or at least did when it was newer; it's getting long in the tooth now) in highly threaded server environments, including virtualization. Heck, before I got actual server parts I ran an FX-8120 in my ESXi server for years, and it performed great in that role. (What eventually made me upgrade was that I needed more than 32GB of RAM, and the 990FX platform didn't support that.)

What I am saying is that the design was a very poor match for client workloads. It should never have been marketed as a client CPU. On the client side they would have been better off if they had just incrementally improved and die shrunk Phenom II. Heck, a friend of mine who does software development for a living (primarily in Scala these days) actually reverted to his old quad core Phenom II after a couple of months with his FX-8350, as his compile times INCREASED too much.

It's pretty clear what happened. The revenue problems caused by Intel's illegal business practices during the Athlon years and the money spent on the ATI acquisition put AMD in a tough spot financially. They didn't have the means to develop BOTH a new server architecture and a new client architecture, so someone decided that what they were designing as a semi-decent server platform would have to do as the next client chip. This was likely, in part, justified by the sense that client-side computing was going away in favor of mobile, so all they really needed was a stopgap product in that market anyway. (We now know that to be wrong, but at the time many believed this.)

And it was a disaster.
 
Beyond that, Bulldozer was really only good at specific server workloads. The cache was such that branching code performance was crap. The cost of a cache miss roughly doubled from the late Phenoms (Deneb?) to Bulldozer, and so Bulldozer only really does well when code operates on large datasets with few conditionals.
 
Dresdenboy is not finished blogging about Zen. Some speculation on his part here, but not that far-fetched ;)
http://dresdenboy.blogspot.nl/2015/10/how-many-days-until-zen.html

This chart shows the time delta in days between the publication of patches and the launch of a particular CPU containing a new core. For some launch dates only a month was given, so I took the last day of that month for the calculation.

The Zen bars show the timeline in months starting with publication of the specific patch. With this at hand, anyone can draw their own conclusions. The scenario of first Zen based server or desktop CPUs hitting the markets in 4Q16 doesn't seem unlikely.
 
Assuming this 40% figure is true, it is SO difficult to predict where it will land, given that Carrizo only comes in variable-TDP packages.

It's difficult to figure out what kind of performance it affords at what clock rate, when allowed to run free.

According to this bench on Jagatreview, the IPC appears to be 9.9% better than Kaveri, which means we should see (if this 40% is reliable) ~54% faster IPC than Kaveri.

Other links in this thread suggest a design that will run at ~4GHz on the desktop. So, doing some math, 7870K + 100MHz + 54% IPC places us between high-end Ivy Bridge and high-end Haswell performance in single-threaded environments.
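Here's that math spelled out as a small Python sketch. The 9.9% and 40% figures are the ones quoted above; the 3.9GHz Kaveri clock is my assumption (the 7870K's base clock, i.e. ~4GHz minus the 100MHz mentioned), and the function names are just for illustration. It treats single-threaded performance as IPC times clock, which ignores turbo, memory and everything else.

```python
# Back-of-envelope: single-threaded performance ~ IPC x clock.
# Inputs are the thread's figures except where marked as assumptions.

def compound_ipc(*gains):
    """Multiply fractional IPC gains, e.g. 0.099 and 0.40."""
    total = 1.0
    for g in gains:
        total *= 1.0 + g
    return total

def relative_single_thread(ipc_ratio, new_clock_ghz, old_clock_ghz):
    """Performance ratio of the new chip vs. the old one."""
    return ipc_ratio * new_clock_ghz / old_clock_ghz

carrizo_over_kaveri = 0.099   # from the Jagatreview bench cited above
zen_over_carrizo = 0.40       # AMD's claim

zen_ipc_vs_kaveri = compound_ipc(carrizo_over_kaveri, zen_over_carrizo)

kaveri_clock_ghz = 3.9        # ASSUMPTION: 7870K base clock
zen_clock_ghz = 4.0           # the ~4GHz rumor mentioned in this thread

estimate = relative_single_thread(zen_ipc_vs_kaveri, zen_clock_ghz, kaveri_clock_ghz)

print(f"IPC vs Kaveri: +{(zen_ipc_vs_kaveri - 1) * 100:.1f}%")
print(f"Single-thread vs 7870K: x{estimate:.2f}")
```

The IPC gains multiply rather than add, which is why 9.9% and 40% come out to roughly 54% and not 49.9%.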

If they can pull it off, they will not have caught up to Intel, but they will definitely have made themselves viable again, of course, depending on what Intel does in the next year before it launches.
 
But the discussion about those numbers holds no real value until something rolls off the production line at Foundry X or Y. While we know some other factors, the focus on numbers may be there for marketing reasons. We still need to hear where the manufacturing process is taking AMD Zen...
 
I'm expecting IB/Haswell level performance, which is a problem considering Intel will be two arches ahead by then. AMD's problem is they are always playing catch-up, and not getting their chips in OEM products kills them.
 
I'm expecting IB/Haswell level performance, which is a problem considering Intel will be two arches ahead by then. AMD's problem is they are always playing catch-up, and not getting their chips in OEM products kills them.

I think plenty of people would line up in droves to buy an AMD chip that was only 5-10% slower (assuming each iCore gen is about 5% faster IPC vs. the previous gen). AMD would likely be cheaper to boot. The biggest disadvantage, other than this being total "hopeful" speculation, is that AMD generally doesn't have the OC'ing headroom Intel does. Further, AMD has generally done a crap job on CPU L2/L3 cache performance.
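As a quick sanity check on that 5-10% range, here's a tiny Python sketch. It assumes a flat 5% IPC gain per Intel generation (the figure used above, not a measured number) and equal clocks; the function name is made up for illustration.

```python
# How far behind is a chip that matches Intel's IPC from N generations ago,
# if each generation adds a fixed fractional IPC gain? (Equal clocks assumed.)

def deficit_vs_current(gain_per_gen, generations_behind):
    """Fraction by which the older-level chip is slower than the current one."""
    current = (1.0 + gain_per_gen) ** generations_behind
    return 1.0 - 1.0 / current

for gens in (1, 2):
    pct = deficit_vs_current(0.05, gens) * 100
    print(f"{gens} generation(s) behind: about {pct:.1f}% slower")
```

Under those assumptions, one generation behind works out to a bit under 5% slower and two generations behind to a bit over 9%, so "two arches ahead" and "5-10% slower" are roughly the same statement.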

I'd love to be able to buy a new AMD chip that was "almost" as fast as Intel in all facets of performance. My 8320 is a decent chip, but it's IMO a new version of a P4 whereas Intel learned from the IPC/heat issues of the P4 and made a better AthlonXP with the Core2 series on up.
 
I'm expecting IB/Haswell level performance, which is a problem considering Intel will be two arches ahead by then. AMD's problem is they are always playing catch-up, and not getting their chips in OEM products kills them.

It is unrealistic to expect AMD to ever be competitive with Intel again. They have a tiny fraction of the funding, a tiny fraction of the Engineering man power, have to rely on external fabs, and - compared to Intel - older larger process nodes.

It was a miracle they were ever able to compete with Intel at all. It was a timely confluence of events that allowed that to happen.

- AMD buying out some Digital technology assets and hiring their engineers.
- The spectacular FAIL on Intel's part with the P4 Netburst architecture.

Because of this great fortune (essentially winning the lottery for AMD) they were able to produce competitive chips for a few years. They COULD have turned this into being a long term competitor, but they failed to do so, partially because they didn't have the know-how to take advantage of their surprising new position, and partially because Intel used illegal business practices to keep them from getting their foot in the doors with OEM's and to sabotage their performance on code compiled on the Intel compiler.

Intel eventually settled with AMD over this to the tune of $1B, but that settlement was a fraction of what it really should have been, considering the damage they illegally did, and it came way after the damage was already done.

Expecting AMD to catch up to the point where they tie or beat Intel in raw performance is, at this point, just as ridiculous an expectation as it would have been to predict the Athlon vs. P3/P4 wars during the AMD K5/K6 years.

The best they can hope for is to get back to the point where they play K6 to Intel's Pentium II again, and that's what Zen does.

I think plenty of people would line up in droves to buy an AMD chip that was only 5-10% slower (assuming each iCore gen is about 5% faster IPC vs. the previous gen).

Agreed.

I'd love to be able to buy a new AMD chip that was "almost" as fast as Intel in all facets of performance.

Agreed. To me it is irrelevant if Intel is a few percent faster than AMD. The relevant part is whether they are fast enough to do what I want to do. That is not currently the case, but I am hopeful that with Zen it will be in late 2016. (But you never know; we have heard this story from AMD before.)

Unless something happens to my rig, I will likely still be running my i7-3930k when Zen launches. At that point, I will consider replacing it with Zen, as long as Zen is an actual upgrade, and not a downgrade or side grade.
 
Agree 100% with you Zarathustra. They need to show me something that is close enough and a decent enough upgrade over what I have. One thing that would make me upgrade in a heartbeat would be PCIE 4.0 speeds to the point where you need that bandwidth to have low latency gaming.

Then it would be a simple task of picking out a chip that fit my budget.
 
AMD was always pretty competitive, including the whole Phenom II line.
Phenom II X6 was maybe still worse than Nehalem/Lynnfield i7 in both performance and power consumption, but the difference was not that great and AMD was cheaper, for both CPUs and mainboards. Then there was Bulldozer with its fake 8 cores, higher power consumption and worse IPC. AMD's competitiveness went down the drain and hit an all-time low. AMD processors are now a joke, a really bad joke. People buy them mainly because of good memories of the brand, not for their performance or price. Nowadays it is hard to recommend AMD, as in most cases Intel is better in all price ranges, even for the cheapest builds.

Zen doesn't need to be faster or more efficient than the newest Intels. It needs to be something that has a better reason to be chosen than "I had an Athlon K7/K8 many years ago and would like to have AMD now too", as is the case with Bulldozer now. 40% more performance than Carrizo sounds good if it's actually single-core performance (IPC), and would put AMD back in the game. If it is some multi-core bullshit like with FX, then AMD is as good as dead. They are already no real competition for Intel anyway. Intel's only competition is old Intel products that are in working condition and have enough performance, e.g. the infamous 2500K/2600K, which make thousands of people hold off on buying new builds and will circulate in the second-hand market, making people on a tight budget buy those used * Bridge chips rather than some i3.
 
Even the Zen diagrams are wrong in that article... as Zen has 6 integer pipelines not 4.
No, it doesn't. Zen apparently has 2 AGUs which are usually included in an integer execution resources count, but are not integer execution pipelines.
 
Thanks for the correction... 4 integer pipelines and 2 load/store pipelines... what i was getting at is their diagram for BD arch. had 4 pipes per core (2 integer and 2 l/s) see:

http://cdn.wccftech.com/wp-content/uploads/2015/10/AMD-Zen-vs-Steamroller-Block-Diagram-copy1.jpg

and

http://cdn.wccftech.com/wp-content/uploads/2013/07/AMD-Steamroller-vs-Bulldozer.jpg

and

http://www.realworldtech.com/bulldozer/8/

but for Zen they don't include the L/S pipes, only the integer pipelines. i thought that a bit strange.




No, it doesn't. Zen apparently has 2 AGUs which are usually included in an integer execution resources count, but are not integer execution pipelines.
 
AMD was always pretty competitive, including the whole Phenom II line.
Phenom II X6 was maybe still worse than Nehalem/Lynnfield i7 in both performance and power consumption, but the difference was not that great and AMD was cheaper, for both CPUs and mainboards.

From what I remember, worst case it was even with Conroe, and in best case scenarios it was toe to toe with i5 Nehalems. Still really good for the price.

Then there was Bulldozer with its fake 8 cores, higher power consumption and worse IPC. AMD's competitiveness went down the drain and hit an all-time low. AMD processors are now a joke, a really bad joke. People buy them mainly because of good memories of the brand, not for their performance or price. Nowadays it is hard to recommend AMD, as in most cases Intel is better in all price ranges, even for the cheapest builds.

Sad, but true. They really shot themselves in the foot getting ahead of themselves not once but twice in designing an architecture that really should have had a smaller process node to work with. Ergo - Barcelona should have been on 45 or 32nm and Zambezi should have been on 22nm. These days I only look at AMD CPUs for friends who are new to PC gaming or their APUs for most everything else. They do enough and the platforms are cheap. Bonus points for APUs being able to do casual gaming at 768p at less than half the price of the Iris Pro equipped Intel lineup in case the kiddos want to play something and actually have it run. I still remember a friend who didn't know shit about computers wondering why he couldn't play Medal of Honor Airborne on a Celeron Northwood 2.8GHz with i865 integrated graphics. APUs are great for those guys too. For everyone else that needs horsepower it's Intel or GTFO.

Zen doesn't need to be faster or more efficient than the newest Intels. It needs to be something that has a better reason to be chosen than "I had an Athlon K7/K8 many years ago and would like to have AMD now too", as is the case with Bulldozer now. 40% more performance than Carrizo sounds good if it's actually single-core performance (IPC), and would put AMD back in the game. If it is some multi-core bullshit like with FX, then AMD is as good as dead. They are already no real competition for Intel anyway. Intel's only competition is old Intel products that are in working condition and have enough performance, e.g. the infamous 2500K/2600K, which make thousands of people hold off on buying new builds and will circulate in the second-hand market, making people on a tight budget buy those used * Bridge chips rather than some i3.

My i7 2600 still whoops the 8350 in 95% of software, case in point. I'd only buy one as a curiosity if I had the cash to blow and didn't know what else to do with it. I really hope Zen delivers so that we can get progress rolling again.
 
I'm hoping for an AMD revival, not likely but I remember reading in 1998 on the Intel website with how Intel wanted to keep 32bit computing well into 2013...
 
Then there was Bulldozer with its fake 8 cores, bigger power consumption and worse IPC.

There was nothing "fake" about those cores.
The FX-8XXX has eight real integer cores, but with only four FPUs, as it is a CMT design, not an SMT design like we are normally used to with most processors.

On AMD's FX CPUs, each clustered module has two cores and one shared FPU, whereas a normal SMT design pairs one FPU with each core.
You can see on the image below (left) that there are two integer cores for each FPU:

[Image: AMD-Zen-vs-Steamroller-Block-Diagram-3.jpg]


It looks like Zen is going back to an SMT design, which in my opinion, is a lot more efficient and should have a good increase on IPC.
 
i've always thought that it would have been nice if AMD had made a Phenom X8 using updated K10 cores that ran at 3.7 GHz. i wonder what the die size of such a beast on the same node would have been vs. BD...
 
i've always thought that it would have been nice if AMD had made a Phenom X8 using updated K10 cores that ran at 3.7 GHz. i wonder what the die size of such a beast on the same node would have been vs. BD...

in retrospect, a Phenom X8 shrunk from 45nm to 32nm would have been no worse than bulldozer.
 
i've always thought that it would have been nice if AMD had made a Phenom X8 using updated K10 cores that ran at 3.7 GHz. i wonder what the die size of such a beast on the same node would have been vs. BD...

in retrospect, a Phenom X8 shrunk from 45nm to 32nm would have been no worse than bulldozer.

I would have preferred die shrunk updated K10 cores, but kept it where they were, at 6 of them, and instead used the extra thermal envelope to crank up the clocks a little, rather than add two additional cores that get little to no use in most client workloads.

A Phenom III x6 would still have trailed Intel, but would have fared much better than Bulldozer eventually did.

With Zen, they were reportedly given free rein to design it from the ground up to create the best chip they could, but I am sure there are a lot of lessons learned and reused little bits (where it makes sense) from both K10 and Bulldozer.

In any engineering business you rarely design anything COMPLETELY from scratch, unless you are entering a new market and you have nothing related. Doing so would not be cost effective.
 
I'm not biased, but just for the sake of laughter, watch Zen completely blow Intel out of the water and every naysayer will eat their words. But we'll see....
 
I'm not biased, but just for the sake of laughter, watch Zen completely blow Intel out of the water and every naysayer will eat their words. But we'll see....

I'd love to see that happen, but by AMD's own predictions that isn't going to happen, and typically predictions like these give a little when you reach a final product.

I think it will be a strong chip, and depending on how it performs when it launches, I may get one, but I don't expect it to get any closer to Intel's performance lead than ~10% or so.
 
^ Hopefully they can be viable in the budget market again, or be an alternative to i5s with their best chip.
It's sad to see Intel offer better products in the ~$100 price range. People cannot lose the budget perception of AMD; that would be a huge disaster for them.
 
I'm not biased, but just for the sake of laughter, watch Zen completely blow Intel out of the water and every naysayer will eat their words. But we'll see....

I really want the 8 core / 16 thread processor to be a $700 chip (meaning one that competes with Intel's 8 core / 16 thread processor). However, with the 40% IPC number AMD gave us and the 95W TDP limit, combined with the expectation of lower stock and overclocked frequencies on the 14nm process, I do not believe it will be that close. Hopefully it will be better than the i7 5820K. I am looking for an upgrade for my i7 970, which I paid $365 for in 2011 (just after the Bulldozer launch). For my tasks I need both better single-threaded and 8+ threaded performance.
 
Yeah i can see your point.

It's interesting comparing A10-7850K and A8-3870K scores on Cinebench 11.5 at the same 3.7 GHz clock rate. Llano (updated K10 without L3) scores 4.28 and Steamroller scores 3.63.

Llano was built on the same process node as Piledriver IIRC... I wonder how it would have done with an L3. With 8 cores it probably would have approached 9 on CB 11.5 at a time when Sandy Bridge was only in the low sevens.
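A rough Python sketch of where a number like that comes from. The two Cinebench scores are the ones quoted above; treating both chips as 4-core parts and assuming perfectly linear scaling with core count are my assumptions (real scaling is usually a bit worse, and the effect of adding an L3 is not modeled at all). Function names are hypothetical.

```python
# Scores quoted above: Cinebench 11.5, both chips at 3.7 GHz.
LLANO_SCORE = 4.28        # A8-3870K
STEAMROLLER_SCORE = 3.63  # A10-7850K
CORES_MEASURED = 4        # ASSUMPTION: both treated as 4-core parts

def per_clock_advantage(score_a, score_b):
    """Fractional advantage of A over B at equal clocks and core counts."""
    return score_a / score_b - 1.0

def linear_scale(score, cores_measured, cores_target):
    """Best-case projection: score grows in proportion to core count."""
    return score * cores_target / cores_measured

advantage = per_clock_advantage(LLANO_SCORE, STEAMROLLER_SCORE)
projected_x8 = linear_scale(LLANO_SCORE, CORES_MEASURED, 8)

print(f"Llano per-clock advantage: {advantage * 100:.1f}%")
print(f"Hypothetical 8-core K10 at 3.7 GHz: {projected_x8:.2f}")
```

With those inputs, the per-clock gap is about 18% in Llano's favor and the 8-core projection lands at 8.56 before any gain from an L3, so "approaching 9" would need the L3 to contribute a few percent on top of ideal scaling.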

Zarathustra[H];1041912060 said:
I would have preferred die shrunk updated K10 cores, but kept it where they were, at 6 of them, and instead used the extra thermal envelope to crank up the clocks a little, rather than add two additional cores that get little to no use in most client workloads.

A Phenom III x6 would still have trailed Intel, but would have fared much better than Bulldozer eventually did.

With Zen, they were reportedly given free rein to design it from the ground up to create the best chip they could, but I am sure there are a lot of lessons learned and reused little bits (where it makes sense) from both K10 and Bulldozer.

In any engineering business you rarely design anything COMPLETELY from scratch, unless you are entering a new market and you have nothing related. Doing so would not be cost effective.
 
There was nothing "fake" about those cores.
The FX-8XXX has eight real integer cores, but with only four FPUs, as it is a CMT design, not an SMT design like we are normally used to with most processors.

On AMD's FX CPUs, each clustered module has two cores and one shared FPU, whereas a normal SMT design pairs one FPU with each core.
You can see on the image below (left) that there are two integer cores for each FPU:

[Image: AMD-Zen-vs-Steamroller-Block-Diagram-3.jpg]


It looks like Zen is going back to an SMT design, which in my opinion, is a lot more efficient and should have a good increase on IPC.

Adding the FP execution units will certainly help in FP performance, but that by itself doesn't do much to move the needle on Integer based performance.

That's the primary reason I'm VERY skeptical of 40%. Sure, FP might see that much improvement because you remove the bottleneck, but for integer workloads, I don't see much movement on performance.
 