
Windows compiler that can optimize for AMD?

Gman1979

I was wondering if anyone has managed to create a compiler for Windows that can properly optimize for AMD CPUs yet? I've worked with GCC, LLVM, and Intel's compilers on Linux, but my son has picked up the programming bug early (he's a sophomore in high school) and wants to work within Windows, as one of his teachers is willing to give him extra credit for doing so.

He started asking me if there were compilers capable of properly optimizing for AMD CPUs on Windows and I told him I didn't think so. Has anyone made any progress in this realm, or do we still not even have AMD-specific optimized primitives and libraries for development on Windows?
 
There was a thing with Intel compilers where they could generate alternate code paths for functions. This can make sense, since you can dispatch at runtime to a potentially faster path, like a vectorized SSE version instead of scalar FPU code. But it was basically entirely bullshit in this case, because the dispatcher didn't give a shit whether your CPU actually supported the features if it didn't return an Intel processor string. So even though the AMD chips were perfectly capable, they'd get sent down the slow path anyway.

TL;DR: It was quite literally the case that the Intel compiler intentionally tried to cripple performance for non-Intel CPUs. It was possible to patch the dispatcher to not be biased and get large performance gains for AMD hardware as well.
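
To make the difference concrete, here is a minimal sketch in C of the two dispatch styles described above. It is not Intel's actual dispatcher: the struct, the enum, and both function names are made up for illustration, and a real dispatcher would fill in the fields from CPUID. The only real-world details are the vendor strings "GenuineIntel" and "AuthenticAMD".

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical summary of what CPUID reported; illustrative only. */
struct cpu_info {
    const char *vendor;  /* e.g. "GenuineIntel" or "AuthenticAMD" */
    bool has_sse2;       /* the feature the fast path actually needs */
};

enum code_path { PATH_SCALAR, PATH_SSE2 };

/* Biased dispatch: the fast path requires a matching vendor string,
 * even when the CPU reports the needed feature. */
enum code_path dispatch_by_vendor(const struct cpu_info *cpu)
{
    if (strcmp(cpu->vendor, "GenuineIntel") == 0 && cpu->has_sse2)
        return PATH_SSE2;
    return PATH_SCALAR;
}

/* Neutral dispatch: only the feature flag matters. */
enum code_path dispatch_by_feature(const struct cpu_info *cpu)
{
    return cpu->has_sse2 ? PATH_SSE2 : PATH_SCALAR;
}
```

With an SSE2-capable AMD chip, the first function picks the scalar path and the second picks the SSE2 path. "Patching the dispatcher" amounts to turning the first kind of check into the second.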


Basically, don't worry about it, you can trust that you'll get fast results from compilers that don't do sketchy shit like that.
 
Intel writes compilers for their own CPUs. I doubt they intentionally crippled performance; they're likely just building in support for their own products. I'm not sure this is unreasonable.

If there are some specialized instructions like SSE that are supported on both then it can be built into GCC or some other generic compiler.
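
As a rough illustration of vendor-neutral SSE code, here is a small C sketch that uses the standard SSE intrinsics from `<xmmintrin.h>` when the compiler targets a platform that has them, and falls back to plain scalar code otherwise. The function name `add_floats` and the `HAVE_SSE` macro are my own; the intrinsics themselves (`_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`) are the usual ones and run the same on Intel and AMD chips.

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   /* SSE intrinsics, same on Intel and AMD */
#define HAVE_SSE 1
#endif

/* Adds a[i] + b[i] into out[i] for n elements. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#ifdef HAVE_SSE
    /* Process four floats per iteration with unaligned loads/stores. */
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    /* Scalar loop handles the tail, or everything when SSE is absent. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

Nothing here looks at the vendor string; the choice is made purely by what the target supports.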
 
Yes, it was intentional. Any hardware that doesn't return the Intel string gets the slow path for no reason. I believe Intel lost the lawsuit and was supposed to have changed this "functionality"; I'm not sure what ever happened with that.
 
The thread starter's question isn't very clear to me. Are you asking if there are compilers which offer AMD specific optimizations? Are you asking if there are compilers that will optimize for CPUs in general, including AMD ones?

Yes, it was intentional, any hardware that doesn't return the Intel string gets the slow path for no reason.

No. Any hardware that doesn't return the Intel string simply doesn't get the benefit of all the time and resources Intel spent on the compiler's ability to optimize for Intel's own hardware. They're not intentionally slowing down AMD's CPUs. They're just not going out of their way to optimize for them, which is a 100% valid, fair, and reasonable thing to do. It's Intel's product, and Intel can make their product as feature-rich or as feature-lacking as they so choose. In this case, they chose to make their compiler lack the feature of optimizing for non-Intel hardware. Don't like it? Use a different compiler; there are plenty of others to choose from.

I believe Intel lost the lawsuit and were supposed to have changed this "functionality," not sure what ever happened with that.

Intel lost a lawsuit regarding the disclosure of their lack of optimization, not the actual lack of optimization itself. The lawsuit only required that they inform users that their compiler does not optimize for non-Intel hardware and pay damages to users of the compiler who had expected it to do so. As far as I know, that was mostly paid out to makers of benchmarking solutions and related software companies for the cost of recompiling their products on a compiler that optimizes code in a more vendor-independent fashion. AMD received no damages because they were not harmed in any unlawful way.
 
They aren't Intel specific optimizations, they work perfectly fine elsewhere. You can patch the routine and everyone will benefit.

I checked and they now have a notification about other vendors. They're more than capable of removing the bias. Cool, it's their compiler, they can do whatever they want, but that doesn't make it any less shitty behavior. They go out of their way to ensure only their processors get the fast path; it's intentional.
 
They go out of their way to ensure only their processors get the fast path, it's intentional.
I don't share the evil/jaded perspective of this. Intel is limiting their surface area to what they can control and support ("fast path"), and providing an alternative set of instructions to work in areas that they could not offer the same surface level of support/testing ("common"). The concept isn't evil, and is used in many other areas of software development.
 
To answer the OP's question, all compilers for Windows will optimize for AMD chips as well as Intel's. You have to be careful with enabling certain options in Intel's ICC (Intel C++ Compiler), as it does ship a few libraries which only work on certain Intel CPUs, as my colleague and I recently found out. For all other compilers (MinGW, MSVC) there's little that can go wrong.
 
Cool, it's their compiler, they can do whatever they want, but that doesn't make it less shitty of behavior. They go out of their way to ensure only their processors get the fast path, it's intentional.
It's business. Naturally their compiler is designed to favor their own hardware. It's still standards-compliant, and you genuinely have to go well out of your way to use the Intel compiler toolchain anyway.

AMD's old shader compiler did the exact same thing, so it's not like it's just Intel being mean to everyone: it's purely a function of a business doing good business.
 
Has anyone made any progress in this realm or do we still not even have optimized primitives and libraries that are AMD specific on Windows for development?
What is an example of an AMD-specific optimization?
 
What is an example of an AMD-specific optimization?

The AMD and Intel x86 ISA support is slightly different. For example, AMD supports SSE4a whereas Intel does not, and I presume the reverse holds for some other instructions.
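
For what it's worth, a program can check for SSE4a at runtime rather than guessing from the vendor. The sketch below assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`; on anything else it just reports "not available". The function names and the `CAN_QUERY_CPUID` macro are mine. The bit position follows AMD's documented layout: SSE4a is bit 6 of ECX for CPUID leaf 0x80000001.

```c
#include <stdbool.h>
#include <stdint.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define CAN_QUERY_CPUID 1
#endif

/* SSE4a is reported in bit 6 of ECX for CPUID leaf 0x80000001. */
bool ecx_reports_sse4a(uint32_t ecx)
{
    return ((ecx >> 6) & 1u) != 0;
}

/* Returns true only if the running CPU reports SSE4a support.
 * Falls back to false when CPUID cannot be queried here. */
bool cpu_has_sse4a(void)
{
#ifdef CAN_QUERY_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return ecx_reports_sse4a(ecx);
#endif
    return false;
}
```

A caller would branch on `cpu_has_sse4a()` before using any SSE4a-only code path and otherwise take the generic one.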

With good knowledge of the microarchitecture the compiler can do a variety of other optimizations - for example scheduling instructions to optimally hide their latencies.

I think you know all this though!
 
They aren't Intel specific optimizations, they work perfectly fine elsewhere. You can patch the routine and everyone will benefit.

I don't know specifically what optimizations you are talking about, but not all x86 processors have the same capabilities. Some optimizations may well be CPU specific.
 