AMD Zen Performance Preview

Nah, it's not like that with nodes. The only variable is AMD, not the node itself.

The 14nm node they are using is the exact same node as Samsung's. Now if AMD elected to use custom libs or blocks, that would be on AMD and thus GF, but I don't think that is the case here.
 
People crash rented cars all the time.
Those are the best cars to crash too. Insurance covers it all and you get to have fun :p

To the point you're making; I think I agree to a point. It's possible GF is simply limiting clocks due to whatever manufacturing problem(s) they have, but if that's the case why does AMD keep renewing the wafer agreement? Does GF have any other customers using the 14nm node?
 
Not to sound like a broken record, but AMD are not the best at making good decisions. I'm pretty sure AMD is GF's only major customer.
 
I'm pretty sure they were forced to spin off their fab business into GF or their financial death spiral would have gone even quicker.

Besides, Vega going to TSMC and the complete lack of interest in GF from 2nd-tier SoC vendors like MediaTek speak volumes about GF's competitiveness.
 
Yeah, GF has not done well by AMD at all, unfortunately. Although FX did have design issues, the GF process really hurt it nonetheless.
 
Those are the best cars to crash too. Insurance covers it all and you get to have fun :p

To the point you're making; I think I agree to a point. It's possible GF is simply limiting clocks due to whatever manufacturing problem(s) they have, but if that's the case why does AMD keep renewing the wafer agreement? Does GF have any other customers using the 14nm node?

I don't know if this is still the case, but small node capacity has been very limited for some time. This is why GPUs were stuck on 28nm for so long. Smaller-node capacity was limited in large part due to yield, and the likes of Apple and Qualcomm buying up all the available supply.

It might be opening up now, but even so, if AMD didn't want to manufacture with Global Foundries, where else would they go? TSMC? They are pretty fully booked too. (In part due to their 16nm process being better than Samsung's 14nm process.)
 
They should have never spun their fabs off.
 
Meh, if they hadn't taken those measures to appease shareholders and investors, the investors might have let AMD go over the cliff. As we've seen with that recent stock move AMD did to push back its debt, if they don't hold investor confidence, they don't have enough to sustain themselves. I'm sure when they started declining, they tried to keep whatever possible. Didn't they let go of a lot of stuff that makes tons of money now? (Note, I know very little about AMD history so I might be wrong on everything.)
 
Yeah, GF has not done well by AMD at all, unfortunately. Although FX did have design issues, the GF process really hurt it nonetheless.

While they were ~5 months late, GF's 32nm process was actually very good. I'd wager it was one of their best nodes (performance-wise) and was actually considered slightly superior to Intel's 32nm.

Back on topic, I am hoping the silence from AMD is golden this time around. I love both companies, and while my 5GHz 3770K isn't anything to sneeze at, I am very interested in a platform upgrade. I want NVMe drive support (for booting), and am actually considering doing an mATX/ITX build on steroids instead of the full towers stuffed with rads I usually build.
 
Yeah AMD back to their marketing PR bullshit again. "Hey look we're faster than Intel!!!" Sure you are, when you artificially de-clock the Intel CPU. Show me your cpu beating Intel at stock clocks and I'll be impressed. Deliberately go out of your way to obfuscate the results because you know you can't really compete and you can go fuck yourself.

Edit: You know what? At this stage, until AMD can stop with the deliberate lies and misdirection in their marketing, I don't really care if they do bring out a CPU that's faster than Intel or a graphics card that's faster than nVidia. I'm not interested. Start telling the truth about your products and I'll look at you again. Keep up with the blatant lies and misdirection and, as I said, you can go fuck yourself.

Better start not using Intel or Nvidia for the same reasons then. They have a much, much, MUCH longer history of it. Might wanna ditch your car while you're at it; that whole industry is exactly the same. Oh, there's more... no, wait, never mind. I hope by now you get the point.
 
Indeed. Intel is not in a Pentium 4 state these days, where they were ready to jump ship (at least on the high end) to a new uarch. Intel has committed to x86 and has been executing year in and year out with improvement after improvement, despite the technical difficulties, which only get larger. Thankfully AMD can learn from Intel in making Zen and save on R&D. Intel does publish some good docs on what they do in each processor. This won't push AMD ahead of Intel, but it makes it easier to catch up or get close. AMD for their part does have some wind in their sails with rising sales, stock prices, and (I think) consumer perception.

Intel has desperately been trying to kill off x86 for almost 20 years now; that was the whole point of the Itanium, which unlike x86 was designed to scale upwards. The plan was, with the move to a new 64-bit architecture breaking software compatibility, to toss out x86 and build something that would meet future computing needs.

Then AMD came out with x86-64, and we're stuck with what's basically a stalled-out CPU architecture. And because of Windows (or more specifically, software lock-in), we're stuck with it for the next 20 years or so.
 
We are not stuck with anything; x86 just simply works well. More likely, Itanium was created to give Intel vendor lock-in and keep everyone else out of the game. However, when it ended up not working, Intel had to deal with reality, and they bribed OEMs instead. The reality of what Intel has done tends to hit home around these parts. (They did improve their own stuff, of course, but lately only just enough to keep the money coming in and nothing more.)
 
I say that is the single biggest mistake AMD made.

Nonsense. If AMD hadn't purchased ATi they wouldn't be anywhere right now. Their APUs are the only thing keeping them afloat. The price needed to be paid to survive.
 
Intel has desperately been trying to kill off x86 for almost 20 years now; that was the whole point of the Itanium, which unlike x86 was designed to scale upwards. The plan was, with the move to a new 64-bit architecture breaking software compatibility, to toss out x86 and build something that would meet future computing needs.

Then AMD came out with x86-64, and we're stuck with what's basically a stalled-out CPU architecture. And because of Windows (or more specifically, software lock-in), we're stuck with it for the next 20 years or so.

This is also nonsense. x86 is going EVERYWHERE. Once the process gets it small enough to be power efficient in phones, ARM is done. x86 is going to be everywhere. Businesses are moving out of mainframes and going x86/Linux more and more every day. x86 is the future. Hardly a dead architecture. It only seems that way because AMD hasn't been pushing Intel as hard as they had in the past.
 
Nonsense. If AMD hadn't purchased ATi they wouldn't be anywhere right now. Their APUs are the only thing keeping them afloat. The price needed to be paid to survive.

They could just have licensed designs from ATI/Nvidia/others for a tiny fraction. Or simply developed something in-house.

They took ATI and destroyed it completely, because it's a CPU company that doesn't really care about GPUs. While perhaps not as fast, Intel would have done the same to ATI, if that's any comfort.
 
Nonsense. If AMD hadn't purchased ATi they wouldn't be anywhere right now. Their APUs are the only thing keeping them afloat. The price needed to be paid to survive.

That price put them in the red and they have not been out of it since then. In fact, they sold off technologies that were later highly successful just to keep themselves "afloat". Yes, they definitely overpaid for it.
 
Nonsense. If AMD hadn't purchased ATi they wouldn't be anywhere right now. Their APUs are the only thing keeping them afloat. The price needed to be paid to survive.

My comment was about the price paid for ATI. They paid 5 billion dollars to acquire ATI, and in less than 10 years their entire company was worth less than 2 billion with the combined ATI.
 
Unfortunately they had to pay what ATI was worth then not what it's worth now. Maybe they should have waited to get a cheaper deal, but then ATI may have been bought up by someone else too.
 
Even if we look at the deal itself and set aside the part where they should never have gotten it, the biggest problem is they paid with a lot of cash. They didn't have to. They could simply have paid with shares.
 
x86/x86_64 is a very flawed arch. I would guess that it will be able to scale to about an 8-issue-wide pipeline before it becomes too difficult to decode the instructions fast enough, so there is a fair bit of room. The problem is the idiotic layout of an instruction word: multiple places in the instruction word can change the size of said instruction word, on top of the sheer stupid incremental sizing of said instruction word. An instruction word on 64-bit will have a minimum of 1 byte and a max of 15 bytes: an optional 1-byte prefix (includes part of the register address), the opcode bytes (1-5 bytes), a postfix byte which includes the rest of the register address and the address mode, and 0-8 immediate/displacement bytes. SSE/AVX further complicates that, but I don't remember the gist of the layout off the top of my head.

The real mess is in how the opcode section is laid out. You don't have a bit or a couple of bits to define the size; there are reserved opcodes that will cancel the first byte and use different opcodes on the second byte of the opcode word, and the second and third opcode bytes have things like this as well. It's a damned rat's nest. It is hard to create wide pipelines with x86 because the decoders can't work on a fixed instruction size, or at the very least a small set of instruction sizes. You can do non-fixed instruction sizes with wide pipes: IBM z-series has 3 sizes, but you can read the size using the first 2 bits of every instruction, no need to read a rat's nest.

Something to replace x86 would be a wonderful thing; software is getting close to the point that it might become possible before too long. Itanic was never a replacement, as much as Intel would like to think otherwise: it depended too much on the compiler for optimized code, and is built around the very-complicated-to-code-for VLIW/EPIC arch.

ARMv8 is interesting and could be the first step for the ARM arch to go places. It's a good compromise between CISC and RISC: it doesn't toss out opportunities for instruction-level parallelism by cutting out address modes for load and store like most RISCs do, it dumped the conditional instructions that make wide-issue pipes difficult, its instruction word layout is sensible, and it doesn't depend on stupidly wide and difficult-to-fill SIMD/MIMD registers (AVX). Sadly they don't have a firmware standard that makes it easy to just slap any operating system onto any system, most of their market is focused on ultra-low-power chips, and development of high-IPC parts is near non-existent. Granted, even with that said, there are 6-issue-wide cores out there, and there are some rather interesting cache fabrics out there with 384+ core configuration machines.
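
To make the z-series point concrete, here is a minimal C++ sketch of that kind of length decode. It assumes the z/Architecture convention that the top two bits of the first byte give the instruction length (00 is 2 bytes, 01 or 10 is 4 bytes, 11 is 6 bytes). The function names `insn_length` and `count_instructions` are made up for illustration and are not taken from any real decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Length in bytes of a z/Architecture-style instruction, read from the two
// most significant bits of its first byte:
//   00 -> 2 bytes, 01 or 10 -> 4 bytes, 11 -> 6 bytes.
inline std::size_t insn_length(std::uint8_t first_byte) {
    switch (first_byte >> 6) {
        case 0:  return 2;
        case 3:  return 6;
        default: return 4;
    }
}

// Hypothetical helper: count the instructions in a byte stream by hopping
// from one instruction start to the next. Each hop inspects a single byte.
// Returns 0 if the last instruction is cut off by the end of the buffer.
inline std::size_t count_instructions(const std::uint8_t* code, std::size_t size) {
    assert(code != nullptr || size == 0);
    std::size_t count = 0;
    std::size_t pos = 0;
    while (pos < size) {
        const std::size_t len = insn_length(code[pos]);
        if (len > size - pos) {
            return 0;  // truncated final instruction
        }
        pos += len;
        ++count;
    }
    return count;
}
```

Finding the next instruction boundary takes one byte of lookahead here. An x86 decoder has to walk prefixes, opcode escape bytes, and the addressing byte before it knows the same thing, which is the complaint in the post above.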
 
Something to replace x86 would be a wonderful thing; software is getting close to the point that it might become possible before too long. Itanic was never a replacement, as much as Intel would like to think otherwise: it depended too much on the compiler for optimized code, and is built around the very-complicated-to-code-for VLIW/EPIC arch.

You make it sound like code doesn't depend on compilers for optimal code; try compiling with optimizations set to off and marvel at the 80% performance boost that setting the -O2 flag gives. As an architecture, Itanium was fine, and far superior to what we're basically stuck with.
 
x86/x86_64 is a very flawed arch. I would guess that it will be able to scale to about an 8-issue-wide pipeline before it becomes too difficult to decode the instructions fast enough, so there is a fair bit of room. The problem is the idiotic layout of an instruction word: multiple places in the instruction word can change the size of said instruction word, on top of the sheer stupid incremental sizing of said instruction word. An instruction word on 64-bit will have a minimum of 1 byte and a max of 15 bytes: an optional 1-byte prefix (includes part of the register address), the opcode bytes (1-5 bytes), a postfix byte which includes the rest of the register address and the address mode, and 0-8 immediate/displacement bytes. SSE/AVX further complicates that, but I don't remember the gist of the layout off the top of my head.

The real mess is in how the opcode section is laid out. You don't have a bit or a couple of bits to define the size; there are reserved opcodes that will cancel the first byte and use different opcodes on the second byte of the opcode word, and the second and third opcode bytes have things like this as well. It's a damned rat's nest. It is hard to create wide pipelines with x86 because the decoders can't work on a fixed instruction size, or at the very least a small set of instruction sizes. You can do non-fixed instruction sizes with wide pipes: IBM z-series has 3 sizes, but you can read the size using the first 2 bits of every instruction, no need to read a rat's nest.

Something to replace x86 would be a wonderful thing; software is getting close to the point that it might become possible before too long. Itanic was never a replacement, as much as Intel would like to think otherwise: it depended too much on the compiler for optimized code, and is built around the very-complicated-to-code-for VLIW/EPIC arch.

ARMv8 is interesting and could be the first step for the ARM arch to go places. It's a good compromise between CISC and RISC: it doesn't toss out opportunities for instruction-level parallelism by cutting out address modes for load and store like most RISCs do, it dumped the conditional instructions that make wide-issue pipes difficult, its instruction word layout is sensible, and it doesn't depend on stupidly wide and difficult-to-fill SIMD/MIMD registers (AVX). Sadly they don't have a firmware standard that makes it easy to just slap any operating system onto any system, most of their market is focused on ultra-low-power chips, and development of high-IPC parts is near non-existent. Granted, even with that said, there are 6-issue-wide cores out there, and there are some rather interesting cache fabrics out there with 384+ core configuration machines.


Very good post! I have to agree with everything you just stated!
 
You make it sound like code doesn't depend on compilers for optimal code; try compiling with optimizations set to off and marvel at the 80% performance boost that setting the -O2 flag gives. As an architecture, Itanium was fine, and far superior to what we're basically stuck with.


Rather the opposite in fact, you absolutely had to depend on the compiler, as hand writing ASM on that thing is a bitch. And the compilers never really caught up with the oddities in that arch. Small things like having to manage register renaming in software (register windows), out-of-order execution handled by the compiler, and speculative execution while waiting on the results of a branch, which again the compiler had to do. The instruction packing changed from revision to revision of the processor and, unlike a more traditional arch, causes huge performance hits until you recompile ALL of your software, including the OS kernel.

These constraints are probably fine for a mainframe or ultra-high-end server, but would never have been acceptable outside that area. And in the mainframe/enterprise server space, they never came close to competing with IBM or Oracle/Fujitsu on RAS or scalability.

Itanic isn't even a half-assed arch; at best it's a quarter-assed arch.
 
True, the limitation of Itanic was the software. Even the 2 billion Intel pumped into software development just wasn't enough for others to come aboard: those kinds of changes took too much time and money for most software developers, and it didn't make sense anyway because of the x64 that AMD created. Of course MS said no, and rightfully so, because if they went with Itanic instead of x64, their software market would be worse than what Apple and Linux had, so it would have put them back into competition with competitors that they have overshadowed for years.

Added to this, the first iteration of Itanic was just not good lol; it couldn't keep up with x86 processors. Even the 2nd iteration, though marginally better in raw performance, didn't change anything, as it couldn't show that performance without the software being optimized for it, not just recompiled.
 
Rather the opposite in fact, you absolutely had to depend on the compiler, as hand writing ASM on that thing is a bitch.

Speaking as a software developer: NO ONE USES HARDCODED CPU OPCODES ANYMORE. It's all done compiler side. Unless you're writing a device driver, a benchmarking program, or working with a realtime system that actually cares about performance, NO ONE tries to outsmart the compiler anymore.

And the compilers never really caught up with the oddities in that arch. Small things like having to manage register renaming in software (register windows), out-of-order execution handled by the compiler, and speculative execution while waiting on the results of a branch, which again the compiler had to do. The instruction packing changed from revision to revision of the processor and, unlike a more traditional arch, causes huge performance hits until you recompile ALL of your software, including the OS kernel.

*cough*

You do know that pretty much everything you just pointed out exists in x86/x86-64, right? Kinda undermines your argument a bit.
 
Speaking as a developer: I USE, AND MANY OTHER DEVELOPERS USE, CPU OPCODES ALL THE TIME via inline ASM. Compilers still have a hard time with vectorization, and hand-coded SSE/AVX is pretty common.

Actually, compilers work pretty well on x86/x86_64, other than the specific uses stated above. And because a modern x86 CPU can reorder instructions, code compiled for one CPU does not generally hurt too much when run on another CPU. This is why most Linux distros pick the generic CPU profile rather than providing a different package repo for each CPU arch revision from each manufacturer.
 
What type of work do you do?

Most developers I know these days save themselves the trouble and use higher level languages (Java, Scala, etc) because for most programming performance doesn't matter anymore, because most users are sitting on computers with way more RAM and CPU cycles than they'll ever need.

There are exceptions where performance is important, but for the overwhelming majority of software code, that simply isn't the case.
 
Scientific calculations, but I have used it in game development too.

Generally C++ with inline ASM. Generally any place that has heavy vector math would benefit, and to an extent it depends on the compiler: Intel ICC > GCC > MS compiler >? LLVM (I haven't used LLVM a great deal, so forgive me on this one). But none of the compilers do that great with SIMD/MIMD. Generally, code produced by any of those compilers will use the unaligned move instructions (about 3 times slower than the aligned ones), have poor register utilization (the MS compiler wraps intrinsics with wads of push and pop ops for some reason?!?), or use scalar instructions even when it could easily pack the values together, or the compiler is unaware of special instructions (example: addss vs addps vs haddps).

You can still get close to an order of magnitude speed increase in certain critical sections of code vs the compiler when SIMD/MIMD is involved.
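
As a rough illustration of the addss vs addps difference mentioned above, here is a small C++ sketch using the SSE intrinsics from `<xmmintrin.h>`. It is not anyone's production code: the function names are hypothetical, the packed version assumes 16-byte-aligned pointers and a length that is a multiple of 4, and a plain-loop fallback is included (an assumption of this sketch) so the snippet still builds on non-x86 targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Packed add: out[i] = a[i] + b[i], four floats per instruction.
// Assumes n is a multiple of 4 and all pointers are 16-byte aligned,
// which is what makes the aligned load/store forms legal.
inline void add_packed(const float* a, const float* b, float* out, std::size_t n) {
    assert(n % 4 == 0);
#if SKETCH_HAS_SSE
    assert(reinterpret_cast<std::uintptr_t>(a) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(b) % 16 == 0);
    assert(reinterpret_cast<std::uintptr_t>(out) % 16 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
        const __m128 va = _mm_load_ps(a + i);       // aligned load
        const __m128 vb = _mm_load_ps(b + i);       // (_mm_loadu_ps is the unaligned form)
        _mm_store_ps(out + i, _mm_add_ps(va, vb));  // addps: four adds at once
    }
#else
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];  // portable fallback
#endif
}

// Scalar add: the same result, one float per instruction.
inline void add_scalar(const float* a, const float* b, float* out, std::size_t n) {
#if SKETCH_HAS_SSE
    for (std::size_t i = 0; i < n; ++i) {
        const __m128 va = _mm_load_ss(a + i);       // loads one float into lane 0
        const __m128 vb = _mm_load_ss(b + i);
        _mm_store_ss(out + i, _mm_add_ss(va, vb));  // addss: a single add
    }
#else
    for (std::size_t i = 0; i < n; ++i) out[i] = a[i] + b[i];  // portable fallback
#endif
}
```

Both functions produce the same values; the packed one simply issues a quarter as many add instructions. Whether that shows up as a real speedup, and by how much, depends on the CPU and on what the compiler already does with a plain loop, so measure before hand-tuning.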
 
MSVC is most likely the worst compiler right now. Also, last time I checked, a linear algebra lib auto-vectorized by LLVM was faster than the de facto hand-written assembler one.

But yeah, HPC is the one area where manual SIMDing makes sense still.
 
One of these days LLVM might finally be there. I love the concept of it as a whole, but I still don't buy that it can auto-vectorize better than hand-written code; perhaps close enough that the benefit of hand-written wouldn't be worth it. It's a common enough problem, though, that I am sure the compiler developers will eventually solve it, maybe when Intel finally learns they can't continually make the registers larger to solve all of their problems! Then again, filling 512-bit registers might be an excellent argument for letting the compiler do it!
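
For comparison, this is the kind of loop the auto-vectorization argument is about. A minimal C++ sketch, assuming a compiler that accepts the non-standard `__restrict` qualifier (GCC, Clang and MSVC do); the name `saxpy` is just the conventional label for this operation, not something from the thread.

```cpp
#include <cassert>
#include <cstddef>

// y[i] = alpha * x[i] + y[i] for i in [0, n).
// __restrict promises the compiler that x and y do not overlap, which removes
// one of the common obstacles to turning the loop into packed SIMD code.
inline void saxpy(float alpha, const float* __restrict x, float* __restrict y, std::size_t n) {
    assert((x != nullptr && y != nullptr) || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = alpha * x[i] + y[i];
    }
}
```

With no aliasing and a simple counted loop, mainstream compilers can usually vectorize this at higher optimization levels without any hand-written intrinsics. Checking the generated assembly is the only way to be sure for a given compiler and flag set.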
 
Speaking as a developer: I USE, AND MANY OTHER DEVELOPERS USE, CPU OPCODES ALL THE TIME via inline ASM. Compilers still have a hard time with vectorization, and hand-coded SSE/AVX is pretty common.

Funny, because I haven't regularly dropped CPU Opcodes since the MMX/3dNow days. With very few exceptions, the compiler does a good enough job of extracting performance gains.

MSVC is most likely the worst compiler right now. Also, last time I checked, a linear algebra lib auto-vectorized by LLVM was faster than the de facto hand-written assembler one.

The last compiler comparisons I saw [circa 2012 or so] had MSVC ahead of GCC, for what it's worth, but I imagine GCC is likely slightly in front now. Still, given MSVC's other benefits [ease of development], most production is done on MSVC, and code is ported to GCC if necessary.
 
Speaking as a software developer: NO ONE USES HARDCODED CPU OPCODES ANYMORE. It's all done compiler side. Unless you're writing a device driver, a benchmarking program, or working with a realtime system that actually cares about performance, NO ONE tries to outsmart the compiler anymore.



*cough*

You do know that pretty much everything you just pointed out exists in x86/x86-64, right? Kinda undermines your argument a bit.

Ah yeah sometimes still do ;) really depends on what the developer is going for, and if there is time (resources) to do it.
 