Why AMD's Bulldozer "should" be far behind

Status
Not open for further replies.
Me, I am happy with my X6; with Bulldozer being 25-50% faster per clock, all the better. But gaming on my X6 I am GPU limited anyway, so am I bothered that I don't have Sandy Bridge? Nooo. I game at 1080p with FSAA, and CPU performance means very little at that resolution. I also do media encoding once in a while, where the extra cores come in handy.
 
Fused multiply-add? That's nothing really spectacular. Only a specific operation can get faster. Advanced Vector Extensions was introduced by Sandy Bridge.

AMD Bulldozer has AVX like Sandy Bridge, but it also has XOP and FMA4, which Sandy Bridge doesn't. It will be interesting to see how the instruction sets match up. Intel's instruction sets usually dominate AMD's, but their FMA4 and XOP sets are out way before Intel's.
 
Wtf_is_this_shit.jpg

This made me laugh.
 
Intel needs AMD to stick around, anti-trust investigations are expensive
 
AMD Bulldozer has AVX like Sandy Bridge, but it also has XOP and FMA4, which Sandy Bridge doesn't. It will be interesting to see how the instruction sets match up. Intel's instruction sets usually dominate AMD's, but their FMA4 and XOP sets are out way before Intel's.

As I said, FMA and XOP only make one (well, two here) specific instructions faster. Software would need to be changed to take advantage of that and must be compiled only for that CPU.
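
For anyone wondering what a fused multiply-add actually changes besides speed: it computes a*b + c with a single rounding at the end, instead of rounding the product and then rounding the sum. The sketch below is only an emulation in plain Python using exact fractions, not real FMA4 code; the function names and the input values are made up for illustration.

```python
from fractions import Fraction


def mul_then_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded, then the sum is rounded."""
    return a * b + c


def fused_mul_add(a: float, b: float, c: float) -> float:
    """Emulated FMA: evaluate a*b + c exactly, then round once to a double."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Illustrative inputs: the exact product is 1 - 2**-54, which falls
# halfway between two doubles and is rounded up to 1.0 by the plain multiply.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

unfused = mul_then_add(a, b, c)   # product rounded to 1.0 first, so the result is 0.0
fused = fused_mul_add(a, b, c)    # single rounding keeps the -2**-54 term

print("separate mul + add:", unfused)
print("fused (emulated): ", fused)
```

With these inputs the separate version loses the small term entirely while the fused version keeps it, so the benefit is one instruction instead of two plus one rounding step fewer. Whether real software sees any of that depends on it being compiled to use the instruction, as noted above.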


Yeah, cause the criteria for picking supercomputer CPUs is very similar to the one in picking home computer CPUs... those people want the highest RAM density, with the most performance, lowest power usage, all at the cheapest price. That market is a LOT of money, every cent/milliwatt costs.
 
I've never heard of this processor before, thank you for posting this info. Very good history lesson imo. Very cool find. :cool:

Also, there are far more AMD Opterons in supercomputers than there are Xeons, and that says something. AMD CPUs have always scaled far better in multi-processor systems.

This isn't true. Xeons always, even during 2004-2006, had more HPC wins than Opteron.

http://top500.org/charts/list/25/procfam
http://top500.org/charts/list/26/procfam


And the second is false also. Xeon scaled from 1P to 32P systems. Opteron was found in 1P to 8P systems.

Both scaled gluelessly to 4P (Opteron went up to 8P, but scaling was poor and performance subpar; some attempted to make a chipset, Horus, to fix that but failed), but typically over 4P you need a chipset to handle coherency. Xeon had such supporting chipsets from IBM, Unisys and Compaq and was able to scale up to 32 sockets.

So no, Opteron did not scale better to larger systems, in fact, it did not scale at all post 8 sockets while Xeons were found in 16 and 32 socket systems.

Where Opteron scaled better, and this is probably the starting point from which you falsely generalize, was 1P to 4P scaling. This was indeed better for Opterons, since with each added socket the system bandwidth and memory bandwidth increased, while for Xeons they stayed constant. Intel attempted to fix the issue by going to one FSB per socket and large L3 caches, which alleviated the problem somewhat. (FSB-based Xeon systems held a number of records for 1-4P systems, achieved by a sheer brute-force approach.)
Nehalem changed all that, obviously, and its QPI interconnect and IMCs are comparable to, if not better than, what Opteron has.
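
To put rough numbers on the 1P to 4P point, here is a toy model of total memory bandwidth in the two designs. It is only a back-of-the-envelope sketch: the function names are made up, the 6.4 GB/s figure is an illustrative per-controller and per-bus number, and it ignores caches, coherency traffic and remote-memory latency entirely.

```python
def imc_total_bandwidth(sockets: int, per_socket_gbs: float) -> float:
    """Each socket brings its own memory controller, so the total grows with socket count."""
    return sockets * per_socket_gbs


def shared_fsb_total_bandwidth(sockets: int, bus_gbs: float) -> float:
    """All sockets share one front-side bus to memory, so the total stays constant."""
    return bus_gbs


def shared_fsb_per_socket(sockets: int, bus_gbs: float) -> float:
    """Share of the single bus available to each socket."""
    return bus_gbs / sockets


ASSUMED_GBS = 6.4  # illustrative figure only, not a measurement

for n in (1, 2, 4):
    print(
        f"{n}P  IMC total: {imc_total_bandwidth(n, ASSUMED_GBS):5.1f} GB/s   "
        f"shared FSB total: {shared_fsb_total_bandwidth(n, ASSUMED_GBS):4.1f} GB/s   "
        f"FSB per socket: {shared_fsb_per_socket(n, ASSUMED_GBS):4.1f} GB/s"
    )
```

In this simplified picture, the integrated-controller design quadruples its total going from 1P to 4P, while the shared-bus design leaves each socket with a quarter of the original bandwidth. Moving to one FSB per socket, as described above, is the step that stops the per-socket share from shrinking.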
 