Sun or Oracle Sparc processors

Discussion in 'All non-AMD/Intel CPUs' started by scharfshutze009, Mar 17, 2015.

  1. Red Falcon

    Red Falcon [H]ardForum Junkie

    Messages:
    9,833
    Joined:
    May 7, 2007
    Define "business workload".
    Your descriptions are very very broad.

    I've seen x86 blades handle "business workloads" in full production, and on 4-5 year old E7 CPUs.
    Really, give us some applications that SPARC absolutely excels at that can't be done with a large cluster of x86 systems.

    Yes, modern SPARC CPUs have lots of cores, which, individually, are much "weaker" than x86 or POWER/PowerPC cores.
    I know that SPARC excels at databases and at queuing up lots of smaller tasks, almost akin to how modern GPUs work, only a bit more general-purpose and with less processing power than GPGPUs.

    So other than Oracle databases, please give us some examples.
    I need to give you links to this? Seriously? Have you been hiding under a rock since the G80 GPUs emerged back in 2006??? :confused:
    Really, I'm not trying to be insulting, but this is general/common knowledge that is in the news on a daily basis.

    Just search for "NVIDIA deep learning" and that should give you enough info to answer your question.
    Yes, the HPC market has absolutely saved and propelled NVIDIA, and has for nearly the last decade.
     
  2. jimmyb

    jimmyb 2[H]4U

    Messages:
    3,165
    Joined:
    May 24, 2006
    Intel's server business gets ~50% margins on a market size of approximately $50 billion. Their *profit* in x86 servers is larger than the entire RISC *revenue* size combined. The vast majority of the money is in x86 servers.

    IBM and Oracle might not make a whole lot of margin on x86 processors, but Intel certainly does.

    Right. And the majority of server applications, including many enterprise ones such as data centers, perform better on state-of-the-art x86 processors for less money. They're cheaper and they're faster for these applications, which is why almost everyone uses them.

    Intel's x86 architecture dominates in performance in all but the most niche applications. It's quality and quantity. You're equating SAP performance with processor "quality", while ignoring the other 95% of workloads where x86 is far better.

    Again, there's far more money in server x86. Intel designed their architecture to win in 90% of the market, and they do so with industry leading margins.

    IBM and Oracle know they can't compete profitably here against Intel, so they've gone for the remaining 10%.

    Server vendors want to move into a less competitive market where they don't have to compete against Intel and dozens of other server vendors. Meanwhile, the end-customers are moving in the opposite direction towards commodity x86 solutions.

    Look at the market data I shared earlier. It's becoming increasingly niche.


    Agreed. And Xeons have better performance in coarse-grained parallel or few-threaded workloads. Their execution pipeline, OOE, caching, and branch prediction are the most advanced in the market.

    You are yet again falsely equating business workloads with the workloads that SPARC wins. The majority of businesses do things other than run SAP, and this is why the majority of server sales go to Intel, and why Intel makes the most money in the server market.

    Right. SPARC and Power are awful for HPC. They cost more and are slower. Clearly a win for Intel.


    Profit is what counts. Intel makes the most profit by capturing 90% of the server market, and they do so with margin above the industry average.


    The fact that Sun couldn't compete in HPC is irrelevant to the fact that it is a larger market than RISC and other custom architectures.

    Cluster interconnect latency is going down, bandwidth is going up, fabric topologies and infrastructures are getting more advanced. The difference between a many-socket system and a closely coupled cluster will continue to shrink. This is where the industry is investing their money.

    Right, and the market has decided that they don't need 16 and 32 socket servers except in a very limited number of applications.

    The market share data I shared earlier. The majority of revenue *and* profit is in x86. The majority of businesses are buying x86 servers.

    So your entire argument is out of date then. Apparently x86 is competitive in the niche 16 socket applications that you're talking about, and presumably if there's a business case for it we'll see 32 socket x86 platforms in the near future.

    Many UNIXes run on x86. They're two independent concepts.
     
  3. jimmyb

    Fastest in a workload representing like 5% of the market. Not nearly as impressive with that caveat.

    It's so easy to selectively pick out benchmark statistics. I can do the same for Intel:
    • Encryption performance increased by 600% with the addition of AES instructions
    • Video encoding performance increased 500% over a couple generations of Quick Sync
    • LINPACK scores improved 200-300% with the addition of AVX2 instructions.
    • Some parallel workloads scale almost linearly in core count. 8->16 cores, that's almost 100% improvement.

    I could cherry pick lots of other selective improvements.
     
  4. brutalizer

    brutalizer [H]ard|Gawd

    Messages:
    1,593
    Joined:
    Oct 23, 2010
    Scale-up workloads that are not embarrassingly parallel. For instance, large database instances. SAP business software. I.e. things that are not suitable to run on a cluster. Clustered workloads are typically scientific computations, i.e. stuff that fits into the cpu cache and runs in a tight for loop, over and over again on the same grid points. OTOH, business source code tends to branch all over the place, so the cpu must go out to main RAM all the time.

    For instance, large databases. SAP. Enterprise business workloads.

    The new SPARC M7 does not have weak cores. It has new-generation cores called "S4". The earlier cores were called S1, S2 and S3. The S3 has the ability to dedicate all of one core's resources to a single thread. I.e. it runs one single strong thread, or many threads. This is decided on the fly. For instance, POWER7 can do something similar but you have to reboot and reconfigure the server.

    Large instances of databases. SAP.

    Well, this is not common knowledge to me. I know that NVIDIA has like 80% of the mass GPU market. So I thought that was why NVIDIA is more profitable than AMD. But you claim it is not the mass market, it is the HPC market?

    And what has deep learning to do with NVIDIA? Deep learning is a very small niche market that I doubt would bring in the big money to NVIDIA.

    So that NVIDIA owns 80% of the GPU market is irrelevant? To me this sounds like:
    -Don't you know that Intel is profitable only because of the HPC market? (That Intel owns the x86 market is irrelevant?)
     
  5. brutalizer

    True.

    IBM and Oracle are doing high margin business, Intel is not. Intel is making up the profit by sheer numbers, quantity. IBM and Oracle are doing quality. The margins on large Unix servers are sky high. Intel has much much much lower margins. It seems that you believe that Intel's low margin of... 10%(???) is good, whereas IBM and Oracle maybe have something like >50% margin or so. Intel sells many cpus in the $400 range, whereas IBM and Oracle sell cpus for $10,000-20,000 or so. There is a reason IBM and Oracle are branded as way more expensive than x86. It's because they do high margin business.

    True that cheap x86 servers have come a long way and are fine for smallish tasks.

    Where is x86 better? In the business workloads, SPARC/POWER is faster. In computations POWER is better. x86 has barely 400 GFLOPS, whereas POWER8 has above 400 GFLOPS. SPARC XIfx has 1,100 GFLOPS. Have you seen the benchmarks with POWER8? It dominates x86. SPARC dominates on large business workloads such as SAP, databases, etc.

    Sure, there are many cheap x86 servers in the datacenters, but they do virtualization, etc. They don't run large business software tackling extreme workloads. And for this highly lucrative niche (which is small - true) there is no x86. x86 does not exist. All large workloads are run on POWER/SPARC servers.

    What is important to IBM and Oracle is margin. Not profit. HP thinks this too, as their CEO Meg tried to sell off the x86 division because it was too low margin. They fought tremendously for their measly 5% margin or so, on x86. Not worth it, HP concluded. Just like IBM and Oracle.

    Low margin business is not sound. If you are a mere 5% from extinction, no - that is not good. You need some margin for bad times. If you have a company with 5% margin on everything it sells but with a large market - and a company with 30% margin with a smaller market - then the stock market will look more favorably on the smaller company because it is in better shape for the future. If sales tank for a quarter, the large company needs to sell off assets, and if the sales do not recover for a prolonged time, it will go bankrupt. Low margin business is too unstable for a company to rely on.
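
    To put rough numbers on the 5% versus 30% comparison, here is a minimal sketch. It assumes that costs stay completely fixed when sales drop, which is a simplification introduced here, and the revenue figure and the 10% drop are hypothetical.

```python
def profit_after_drop(revenue: float, margin: float, drop: float) -> float:
    """Profit after sales fall by `drop`, assuming costs do not shrink (a simplification)."""
    costs = revenue * (1.0 - margin)      # costs implied by the original margin
    new_revenue = revenue * (1.0 - drop)  # revenue after the bad quarter
    return new_revenue - costs


# Hypothetical companies: both start at revenue 100 and lose 10% of sales.
low_margin = profit_after_drop(100.0, margin=0.05, drop=0.10)
high_margin = profit_after_drop(100.0, margin=0.30, drop=0.10)
print(f"5% margin company:  {low_margin:+.1f}")
print(f"30% margin company: {high_margin:+.1f}")
```

    Under that fixed-cost assumption, a company turns unprofitable once the sales drop exceeds its margin, which is the point being argued here.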
    http://www.bloomberg.com/news/artic...ibm-server-unit-for-2-3-billion-amid-pc-slump
    "...While [IBM] will continue to sell a range of higher-end servers and mainframes, offloading the x86 division removes a low-margin business from its books....“We were no longer in a position to get the kinds of returns that we wanted,” Steve Mills, senior vice president of software and systems at Armonk, New York-based IBM, said in an interview. “We wouldn’t do this if we didn’t see the obvious taking place in the market...”

    Which do you think is preferable and sounder: to have a high salary and large mortgages on your house so you only barely keep your nose above water, or a lower salary and much smaller mortgages so you have a very good margin and can cope with problems, like losing your job? Have you heard about the subprime housing crisis that sank the entire USA in 2008 and plunged the world into a financial crisis? The common people could not cope with higher costs so everything tanked, despite the subprime market being very large with many billions. With low margin you are just a step away from very bad things. Everybody is trying to offload their low margin x86 division.

    http://www.businessinsider.com/what-hp-looks-like-without-its-pc-unit-2014-10?IR=T
    "...Margins for PCs are slim, only about 4%...."

    That is why HP, which is the planet's largest PC vendor, wanted to sell off its x86 division worth $56 billion.
    http://www.businessinsider.com/hp-is-planning-to-split-2014-10?IR=T
    "...The PC and computer segment is massive for HP. It accounts for half of the company's revenue. For the first six months this year, it reported $27.8 billion in revenue. That's about three times the size of HP's next biggest unit, the Enterprise Group, which makes servers, storage, and network hardware...."

    HP has to work so hard for the measly 4% margin: factories, supply chain, R&D, storage, etc etc etc. It is simply not worth it, HP said. But in the end HP changed their mind and kept their low margin stinking x86 division. But it is not the first time HP talked about getting rid of the x86 division.

    Wayne Gretzky said "I skate to where the puck is going to be, not where it has been". x86 margin is shrinking every year. And now it is 4%. In a few years it will be 3%. As IBM said: “We wouldn’t [sell our x86 division] if we didn’t see the obvious taking place in the market...”

    IBM and Oracle can go for the lucrative 10%, which Intel cannot. Who wants to do a 4% margin business?

    POWER8 is faster in every benchmark, I suspect. POWER8 has higher GFLOPS. So I don't see how Xeons have better performance than POWER/SPARC?

    SPARC is designed to do large scale business workloads, that is why SPARC is faster on that kind of workload. Even if Intel makes more money totally on the server market, that is not that important as a whole. Intel market cap is $139 billion, IBM is $142 billion and Oracle $159 billion so they are bigger. So, it seems you can do good business if you exclusively go for high margin business.

    Seriously, I don't understand why you believe this? POWER8 has higher GFLOPS than any Xeon, and SPARC XIfx has 1,100 GFLOPS. The next generation Xeon will probably increase performance 10% or so, so maybe Xeon will reach 440 GFLOPS in the next generation?

    Margin is what counts. Compare to Oracle and IBM market cap. I hope you don't believe that Intel selling the vast majority of their cpus for $400 is higher margin than RISC cpus for $10,000-20,000?

    Sun focused on the business enterprise market.

    No, it isn't. The big money is in large business servers. Again, IBM sells a few hundred mainframes per year, and mainframes account for something like 15% of IBM's total huge revenue. If we compare to SGI, which exclusively dabbles in the HPC market, SGI has a market cap of $0.15 billion. So, it seems that large HPC clusters such as the SGI UV2000 are a very small market compared to large business servers.

    True. x86 is getting stronger so today a 4 or 8-socket x86 gives plenty of power. You don't need a POWER/SPARC for most of the workloads today. But for the largest workloads, you have no other choice than POWER/SPARC. There are no x86 servers that can tackle large business workloads. And these large business workloads bring in a looooot of money.

    True. But x86 is low margin with 4%, so nobody wants to touch it and everybody tries to dump it. x86 is a far cry from being a loss business.

    Wrong. The 16-socket x86 servers do not offer the same performance as POWER/SPARC; they scale badly. And what is really important in the large server arena is RAS. RAS is extremely expensive. x86 does not have that good RAS. For instance, on Mainframes and POWER/SPARC you can routinely replace cpus, RAM etc on the fly. SPARC and Mainframes can replay instructions if they detect an error in the calculation. Some Mainframes have three cpus and all calculations are run on all of them simultaneously, and if any cpu's output differs, it is shut down. etc etc. These tailor-made reliability solutions are very very expensive, and x86 does not have that kind of technology, which has been perfected for decades. So large Unix servers and Mainframes are much much much more reliable than x86 servers, which is a huge selling point in the business Enterprise arena. They prefer a large Unix server, capable of tackling the largest workloads, reliable, that very very seldom goes down, to a new, unproven, immature 16-socket x86 server.

    True. But I meant that Unix OSes will scale the sh-t out of primary x86 OSes such as Linux/Windows because these desktop OSes have never been run on large servers earlier and need heavy redesign to be able to cope with large 16/32 socket servers. And also, SPARC/POWER will also scale the sh-t out of any x86 server.
     
  6. brutalizer

    Can you show us relevant benchmarks against POWER8 and SPARC? Encryption performance is not an important selling point when selling huge servers. gflops are interesting, and the most common business benchmarks such as the important SAP. Linpack, Lapack, etc.

    (I suspect that x86 is much slower on encryption than SPARC as they do it in real time, and SPARC M7 has built in hardware accelerators for that)
     
  7. Red Falcon

    "Embarrassingly parallel"... you act like parallel processes are a bad thing, or something to be ashamed of.
    I've seen a lot of production applications and workloads running full databases with parallel clusters, so it isn't as uncommon as you might think.

    Considering almost all HPC and GPGPU functions are massively parallel, I would say that scale-out, not scale-up, is the future; this is just going off of what has happened in the last ten years in technology.

    This is very, very niche, and fits with SPARC and POWER/PowerPC systems, so that makes sense.

    Actually, it is both.
    Not only is it leading in the mass GPU market (Tegra/GeForce/Quadro) but it is also leading in the HPC market (Quadro/Tesla).

    GPUs have changed massively from what they were ten years ago.
    The G80, or 8800GTX/GTS and their CUDA cores (NVIDIA stream processors) is what changed everything; even the wiki page talks about this.

    Right now it is, but another ten years from now, it will be huge.
    Automated vehicles, robotics, AI, etc. will all utilize this technology.

    NVIDIA is already prepping for this with their Maxwell (GTX900 series) GPUs, since they have removed all but a single double-precision FPU on their die, leaving the near-majority of FPU functionality as single-precision, which deep learning takes huge advantage of; also, they did this to not immediately cannibalize their existing Tesla GPU products.

    I never said it was irrelevant.
    In fact, it is quite the opposite.

    ...and GPGPUs have upwards of 6TFLOPS of SP compute capabilities, and that's just on the consumer GeForce line, not counting Tesla at all.
    This is why the world's top supercomputers use x86 processors paired with CUDA GPGPUs.

    So what is your point?
    If anything, this just shows how niche, specific, proprietary, and in some ways obsolete, that SPARC and POWER CPUs really are for modern tasks.

    This is why interconnects like InfiniBand exist, and if one's software is written to take advantage of this, then there is no issue.
    Perhaps on a 1:1 scale, SPARC and POWER do have more processing power than x86, but their extreme cost is what kills it.

    It is far cheaper and much more cost effective to invest into many x86 systems with interconnects, than it would be to invest in just one SPARC or POWER system, for the same price, and with far less computing power (depending on the software, task, etc.).
    Again, SPARC and POWER systems are expensive because they are niche and proprietary.

    You are right about the quality of their CPUs compared to x86, but the extreme cost is what kills this argument.

    Your data and info on x86 is 20 years out of date, at least.
    We can do all of this on x86 blade servers, and have been able to for years; I did just what you described here back in 2012 for crying out loud.

    Really, your knowledge on the topic of x86 and GPGPUs is from 1995-2005; things have changed massively since that time period.
     
    Last edited: Sep 8, 2015
  8. jimmyb

    No, Intel's server business margin is just over 50%. Where are you getting this 10% number from? Please provide some data for the numbers you're inventing about Intel's server margin.

    What's your source for that number? It seems like you're just guessing. Sun was losing money on the server business so they were operating with negative margins. I understand Oracle is doing better with their business, but I can't find any SPARC specific numbers. Please feel free to cite some actual sources, or alternatively stop guessing numbers and presenting them as fact.

    Even so, your guess is basically the same as Intel actually operates at.

    And yet somehow Intel has server business margins >50%.

    You're seriously pulling out a naive GFLOPs comparison? By this metric GPUs are the clear winners.

    Even on the x86 front we have the Xeon Phi coming in at greater than 1 Tflops, and the next version out later this year is supposed to exceed 3 Tflops.

    So if you're going to fall back on a Gflop comparison x86 still wins.

    http://www.anandtech.com/show/9193/the-xeon-e78800-v3-review/11
    And it does so on less power for less money.


    You've just agreed that data center, HPC, virtualization is better on Intel. These are business workloads. Google's workloads are way bigger than any SAP installation. You keep on conflating business workloads with your limited subset of workloads where SPARC/POWER might be better.


    Yes, see my link above.

    Look at CPU2006 results. The POWER8 and Xeon are basically the same per-socket, and the Xeon costs a lot less and consumes less power.

    Data centers and virtualization are business workloads. Extreme is a subjective term, but in terms of working data size and computational operations required, they can be just as high if not higher than SAP. Again, look at Google, Amazon, Facebook workloads: they are absolutely massive business workloads, and they use x86.

    I guess by extreme you actually mean SAP. It's a very selective definition that you're using, that is out of line with the rest of the computer industry.


    "All large workloads" include data centers, HPC, etc. These run on x86. Your statement is false.

    Low revenue business is also not sound. It's not clear to me why you're going on about business strategy. I agree that many companies want to move out of competition from Intel into less competitive markets; it's irrelevant when analyzing processor architecture.

    And Intel goes for the even more lucrative 90%. And, apparently they're releasing competitive products in the remaining 10% as well.


    The Gflop comparison is primarily a function of how many cores are on the chip. It's not particularly representative of actual performance. Clusters win on this metric, then GPUs, then the Xeon Phi, followed by the SPARCs, POWER and regular Xeon.

    The 10% number you keep quoting is average per-core throughput largely determined by IPC improvements. If for some reason you want to predict the Gflop scaling then you should be looking at core count increases. This is why GPUs do so well.
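
    As a rough illustration of why peak Gflops tracks core count, here is a minimal sketch of the usual theoretical-peak formula (cores x clock x floating-point operations per cycle per core). The chip parameters are hypothetical and are not taken from any vendor's datasheet.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak = cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


# Hypothetical chips: identical clock and per-core width, only the core count differs.
for cores in (8, 16, 32):
    print(cores, "cores ->", peak_gflops(cores, clock_ghz=2.5, flops_per_cycle=16), "GFLOPS peak")
```

    With everything else held equal the peak doubles with the core count, which says nothing about how a branchy, memory-bound workload actually performs.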


    And yet again you're equating "large workloads" with SAP. Data centers are large workloads, HPC is large workloads. Your definition of large workloads is wrong.


    Are you just inventing these claims? Please provide some evidence.

    I literally spent 2 minutes googling large socket count scaling on Xeons and pulled up these numbers. That's approximately 80% improvement as a function of socket count. That is considered very good scaling.

    With this scaling, going from 16 to 32 sockets the Xeons would occupy top spot according to SAP.
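
    The projection can be sketched as follows, reading "approximately 80% improvement" as an 80% gain per doubling of socket count and assuming that rate holds for the next doubling too. The starting score is a placeholder, since the actual benchmark figures are not reproduced in this post.

```python
def project_score(score: float, doublings: int, gain_per_doubling: float = 0.8) -> float:
    """Extrapolate a score, assuming each doubling of sockets adds the same relative gain."""
    return score * (1.0 + gain_per_doubling) ** doublings


def scaling_efficiency(gain_per_doubling: float) -> float:
    """Fraction of the ideal 2x throughput achieved when the socket count doubles."""
    return (1.0 + gain_per_doubling) / 2.0


score_16_socket = 100.0  # hypothetical placeholder, not a published result
print("projected 32-socket score:", project_score(score_16_socket, doublings=1))
print("efficiency per doubling:", scaling_efficiency(0.8))
```

    Whether the same gain really carries over from 8-to-16 to 16-to-32 sockets is exactly the assumption being extrapolated.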

    Literally what evidence do you have to suggest that the Xeons don't scale well? Provide specific sources.


    Another claim without any supporting evidence. What RAS features and metrics specifically does a state-of-the-art Xeon fall short on? Please provide specifics, and examples where this was a factor in a purchasing decision.

    Great. I never said otherwise. Note that x86 is the most common processor for UNIX operating systems.
     
  9. jimmyb

    By relevant I suppose you mean the tiny subset of workloads where they are competitive. I've said repeatedly, the majority of business workloads are not those applications. Can you provide any market data that suggests otherwise?

    Citation needed.

    Regardless of your lack of evidence,

    LINPACK:
    The top 2 positions are held by x86 systems.

    SAP:
    16 socket Xeon benchmark right up there with the 16 socket POWER and SPARC systems. Based on the 8 to 16 socket scaling of a Xeon, a 32 socket system is projected to take the top position. And again, for less money and power.

    What do you think the "AES instructions" are?
     
  10. brutalizer

    It is very difficult to parallelize some workloads. Some workloads are not even parallelizable in practice; they are widely believed to be inherently sequential (the so-called P-complete problems). Typically, business workloads are not parallelizable. As explained by SGI talking about their large Linux servers with 10,000s of cores, the UV2000 and Altix:
    http://www.realworldtech.com/sgi-interview/6/

    "....However, scientific applications have very different operating characteristics from commercial applications. Typically, much of the work in scientific code is done inside loops, whereas commercial applications, such as database or ERP software are far more branch intensive. This makes the memory hierarchy more important, particularly the latency to main memory. Whether Linux can scale well with a workload is an open question. However, there is no doubt that with each passing month, the scalability in such environments will improve. Unfortunately, SGI has no plans to move into this [scale-up] market, at this point in time. However, it would be very interesting to see how the low latency Altix systems would perform with commercial workloads...."

    Only recently, SGI has finally released the 16-socket UV300H x86 server which is designed to tackle scale-up business workloads. Scale-out clusters cannot do that. Embarrassingly parallel workloads are very specialized and typically only fit for scientific calculations, as explained by SGI.

    The problem is that large business servers, serve many clients simultaneously, as one client does something, the code jumps around wildly in the source code. Maybe the user is doing some accounting, and next he does something else. The business code branches off wildly, making clusters useless. You need a large scale-up server to tackle such workloads.

    People would really like to go scale-out, but most business workloads are not parallelizable - or very very tricky to parallelize. For instance, databases running over many nodes - it is very very difficult to guarantee data integrity and synch between many nodes, and rollback, etc. Clustered databases have many problems, and very few vendors can offer clustered databases. I suspect those offerings are crippled in some ways, which makes scale-up databases superior.
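
    One common way to put numbers on "hard to parallelize" is Amdahl's law, which is not mentioned in this thread and is brought in here only as an illustration. The sketch below uses hypothetical parallel fractions; real business workloads would have to be measured.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)


# Hypothetical workloads on a 64-node cluster.
for p in (0.50, 0.90, 0.99):
    print(f"parallel fraction {p:.2f}: speedup {amdahl_speedup(p, 64):.1f}x on 64 nodes")
```

    The serial part caps the speedup no matter how many nodes are added, so a workload that is only half parallel can never run more than twice as fast on a cluster.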

    If it were easy to do clustered scale-out databases Oracle would have no market anymore. Just release a 64-node x86 cluster and outperform the Oracle database for a tiny fraction of the cost. But no one has succeeded. And that is why Oracle earns the big bucks.

    These are business workloads. High margin. Sure it is niche, but it pays very very very well. SGI has tried for decades to venture into this highly lucrative arena.

    I do suspect that owning 80% of the consumer GPU market is worth more than delivering GPUs to some supercomputers each year? But you claim that the supercomputer market is why NVIDIA fares better than AMD? Not that NVIDIA owns 80% of the consumer market? Just like Intel owns the majority of the consumer CPU market.

    No one knows what the future will bring. Maybe deep learning will become obsolete. Maybe someone will create a specialized chip that outperforms GPUs, for a fraction of the wattage, and the cost? I would not be so sure on deep learning's abilities.

    It seemed that you claimed NVIDIA fares better than AMD because of the supercomputer market?

    The GPGPUs are not general purpose cpus; you need to rewrite your code. SPARC with 1,100 GFLOPS is general purpose: it runs the same code, just several times faster than x86.

    It is true that SPARC and POWER are niched toward business workloads. Because that is where the big money is. And that is why IBM and Oracle are more successful as companies than Intel or large x86 vendors such as SGI. No one wants to touch x86 servers.

    True. But business workloads can only run on large scale-up servers. And such x86 servers hardly exist. So you have no choice but SPARC/POWER.

    I find this very very veeeeery doubtful. Scale-up x86 servers simply do not have the same RAS as SPARC/POWER and Mainframes. Actually, I don't believe you. The superior RAS is one of the main selling points. The most probable thing is that you falsely believe that x86 RAS is comparable to SPARC/POWER and Mainframes.
    -Oh, this x86 server has ECC RAM. Well, that makes it as reliable as a SPARC/POWER/Mainframe.

    Not quite.
     
  11. brutalizer

    It is apparent that I was guessing because of all my question marks and "maybe" wordings. I did not claim those guesses as facts. I based my guesses on the fact that x86 vendors have very low margins at 4%, which is well known (that is why everybody tries to exit the x86 server market). I thought Intel must be doing better than 4% and guessed 10%. I was wrong on this, Intel apparently does far better as you have proved with 50%.

    Anyway, this does not change that the x86 server market stinks, being extremely low margin. That is why everybody tries to exit it.

    I suspect that if Intel with their $400 cpus have 50% margin, then the cpu divisions at IBM and Oracle, with their $10,000-20,000 cpus, have much higher margin. The cost of manufacturing a cpu should be roughly the same, x86 or SPARC/POWER. And still Intel ekes out 50%. So SPARC/POWER cpu divisions should be able to eke out much more than Intel.

    I am comparing general purpose cpus. SPARC/POWER vs x86. You cannot compare a specialized add-on GPU to a general purpose cpu. To take advantage of GPUs, you need to scrap all your code and rewrite everything using CUDA, and some workloads are not even parallelizable and cannot be run on GPUs, such as business workloads. On SPARC XIfx you run the same code, but just faster. So, no, x86 lags behind in sheer computing power.

    What is your point? That Haswell is slower than POWER8 on all(?) benchmarks?

    I did not agree on that. I wrote, "On business workloads SPARC/POWER is faster". That does not imply that Intel is better on other workloads, I have not even mentioned other workloads. Maybe I claim that SPARC/POWER is faster on these workloads as well? Which I do. Your logic is way off the road. With such a logic, it is no wonder you draw wrong conclusions and believe x86 is faster than SPARC/POWER and has better RAS.

    Hmmm.... This is a bit weird. You are showing links to CPU2006 and claim that x86 is comparable to POWER8?

    In your link, an 8-socket POWER8 server reaches 5,130 CFP2006 and the fastest 8-socket x86 reaches 3,980. So, how on earth can you claim that x86 is comparable to POWER8? Not only that, you claim that x86 is also FASTER than POWER and SPARC in the rest of your paragraphs. I have not seen a single benchmark posting from you where an x86 is faster than POWER8. So how can you continue to claim that x86 is faster? Oh, now I know. Wrong logic.

    And let's look at the highest scores. They are achieved by 16-socket servers, i.e. a POWER8 server reaches 14,400, whereas the fastest x86 reaches 3,980. I fail to see why you claim that x86 servers are faster than SPARC and POWER? The link you showed only has CFP2006 comparing both POWER8 and x86, that is why I looked at it. Imagine the score of a Fujitsu M10-4S with 64 sockets of 3.7GHz SPARC cpus. :)
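
    Since the disagreement is partly about per-socket versus whole-system numbers, here is a minimal sketch that normalizes the three CFP2006 figures quoted in this post (5,130, 3,980 and 14,400, reading the dots in the original figures as thousands separators). The labels are shorthand introduced here; no other results are assumed.

```python
def per_socket(score: float, sockets: int) -> float:
    """Divide a whole-system score by the number of sockets."""
    return score / sockets


# (score, sockets) pairs exactly as quoted in the post.
results = {
    "POWER8, 8 sockets": (5130, 8),
    "x86, 8 sockets": (3980, 8),
    "POWER8, 16 sockets": (14400, 16),
}

for name, (score, sockets) in results.items():
    print(f"{name}: {per_socket(score, sockets):.2f} per socket")
```

    This only normalizes by socket count; price and power, which the other side of the argument keeps raising, are not part of these figures.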

    I don't think I am clear enough. I am talking about business software, used for accounting and such business stuff, like SAP, databases, etc. The kind used by large business companies, banks, etc to do their everyday business.

    Google has a very large cluster with 900,000 servers, I've read. But no one is doing accounting or payroll or other business stuff on that cluster. Nor on Facebook's. Amazon does a lot of business, and that is served somewhere by a large database on a single scale-up server, most probably a SPARC or POWER server. So the database backend is not an x86 server - I suspect. The reason is that Amazon generates many transactions, so you need a large database for that. And the largest databases with best uptime are SPARC/POWER. I suspect a 4-socket or 8-socket x86 does not suffice to power the entire Amazon database backend.

    So when I say business software, I do not mean virtualization, HPC, etc. I mean accounting and such business stuff. And such code branches heavily, so you can not run such business software on clusters, as explained by SGI.

    Of course, if Google replaced the 900,000 servers in their cluster with SPARC or POWER servers, it would certainly not be slower, as each server is faster than x86. But the cost would kill it.
    I guess by extreme you actually mean SAP. It's a very selective definition that you're using, that is out of line with the rest of the computer industry.

    Large Business workloads.

    I am talking about business strategy because that is where the big money is, and that market is the most important one. It has the highest margins and that is where the big fight is. SPARC and POWER target that market. x86 tries to get in but fails. x86 does not have the ability: not the performance, not the scalability, not the RAS.

    Code:
    And Intel goes for the even more lucrative 90%.  And, apparently they're releasing competitive products in the remaining 10% as well.  
    Intel is not able to go into the most important market, where the margins are really high.

    If we talk about strongest general purpose cpus, then x86 lags behind.

    True, but core count is not as important as you think. The point is, an x86 Xeon has roughly 150 watts to burn. It does not matter if it has 18 cores or 10 cores, the performance output will be roughly the same because they only have 150 watts of juice. If the 18-core Xeon had a 300 watt TDP and the 10-core Xeon had 150 watts, then the 18-core Xeon would be roughly twice as fast. But that is not the case. Using 150 watts you can only extract so much performance, no matter how many cores. If you have many cores, they must be weak to fit into 150 watts. If you have few cores, they can be stronger. But still, 150 watts is 150 watts. Intel can not surpass that barrier, which hampers performance.

    SPARC and POWER probably use 250 watts or even more. They are sometimes water cooled, sitting in huge 1000kg 32-socket servers. They can burn a lot more juice.

    Have you seen benchmarks where they limit a GPU to 200 watts and compare it to the 300 watt version of the same GPU? The 300 watt version is easily faster, because it can use more power. The same goes for SPARC/POWER. I do suspect that if Intel also had 250-300 watts to burn, the x86 would be as fast as SPARC/POWER (even though SPARC/POWER spends more transistors on RAS and stuff other than performance).

    Code:
    And yet again you're equating "large workloads" with SAP.  Data centers are large workloads, HPC is large workloads.  Your definition of large workloads is wrong.
    
    Large business workloads.

    Regarding x86 scaling badly: the SAP link you showed me is the first scale-up 16-socket x86 server SAP benchmark ever. The bench is brand new, just a month old, and I had missed it. It is an HP Integrity Superdome server, which is just a tweaked Unix server. HP has long had 64-socket Integrity servers, running PA-RISC and Itanium CPUs with the HP-UX operating system. HP has long tried to replace Itanium with x86 in their Kraken server. Now they have succeeded after many years, but they stayed at 16 sockets instead of going up to 64 sockets as Itanium could.

    We see that HP reaches 460,000 SAPS with 16 sockets, which is roughly 40% better than the best 8-socket x86, which reaches 320,000 SAPS. This is quite good, so I must revise my earlier statement: "x86 is now able to handle up to medium sized SAP workloads". Thanks for correcting me. I would not want to say false things, and if you prove me wrong, I immediately change my mind.

    Regarding x86 scalability: according to Wikipedia, scalability is the ability to go up and tackle larger and larger workloads. Which x86 can not do, as it stays at 16 sockets, whereas SPARC goes up to 64 sockets. The top SAP spot has a 32-socket SPARC with 840,000 SAPS. The 8-socket POWER8 reaches 436,000 SAPS, almost as good as the 16-socket HP x86 server at 460,000 SAPS. So, I would not say that x86 is faster than SPARC or POWER, nor that it scales better.

    Another thing: business benchmarks such as SAP are very hard to scale. They don't scale linearly as parallel workloads do, so you can not say that a 16-socket reaches X so therefore a 32-socket will reach 2X. There is a reason the SPARC benchmark stayed at 32 sockets. I suspect that a 64-socket SPARC would only be 10% faster or so, because scaling is so hard. That would make the 64-socket SPARC look bad, which is my guess as to why there are no 64-socket entries.
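
    As a rough check on the SAPS figures quoted above, here is a minimal Python sketch that works out SAPS per socket and how close the x86 step from 8 to 16 sockets comes to linear scaling. The numbers are the rounded ones from this post, not re-checked against the published results, and the helper names are made up for illustration. The systems also differ in core counts and CPU generation, so the per-socket view is only a crude comparison.

```python
# SAPS figures as quoted in the post above (rounded, not re-checked against published results).
RESULTS = {
    "x86 8-socket": (8, 320_000),
    "x86 16-socket (HP)": (16, 460_000),
    "POWER8 8-socket": (8, 436_000),
    "SPARC 32-socket": (32, 840_000),
}


def saps_per_socket(sockets: int, saps: int) -> float:
    """Average SAPS delivered by each socket."""
    return saps / sockets


def scaling_efficiency(small: tuple[int, int], large: tuple[int, int]) -> float:
    """Measured speedup divided by the ideal (linear) speedup; 1.0 means linear."""
    (s_sockets, s_saps), (l_sockets, l_saps) = small, large
    return (l_saps / s_saps) / (l_sockets / s_sockets)


if __name__ == "__main__":
    for name, (sockets, saps) in RESULTS.items():
        print(f"{name:22s} {saps_per_socket(sockets, saps):>8.0f} SAPS/socket")
    eff = scaling_efficiency(RESULTS["x86 8-socket"], RESULTS["x86 16-socket (HP)"])
    print(f"x86 8 -> 16 sockets: {eff:.0%} of linear scaling")
```

    With these inputs, doubling the x86 socket count yields well under double the SAPS, which fits the point that this benchmark does not scale linearly. It says nothing by itself about what a 32-socket or 64-socket system would score.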

    I don't know exactly, as I have not studied RAS. But x86 servers are notorious for being unreliable among Unix and mainframe sysadmins. Have you heard of x86 servers running for decades? It is like you saying "how do you know Windows is unstable compared to Unix, what does Windows fall short on, be specific". Everybody knows that Windows is not the most stable OS. For instance, I hope you are not going to claim that you can hot swap everything in an x86 server, as you can do with SPARC/POWER/mainframes. Why would anyone build such very expensive RAS into a cheap 4-socket server?

    Yes, x86 is the cheapest. That is why.
     
  12. brutalizer

    brutalizer [H]ard|Gawd

    Messages:
    1,593
    Joined:
    Oct 23, 2010
    Business workloads, such as accounting. That is where the big money is, that market is the most important.


    Link please to Linpack?

    Again, you can not take a business software benchmark and project it scales linearly. If you could, it would be possible to run such workloads on clusters. It is not possible to scale linearly.

    BTW, an 8-socket POWER8 gets almost the same SAP benchmark score as the 16-socket x86 server. And SPARC holds the top spot with almost twice as high a score as the x86 server.

    Yes, I know. But I know that SPARC and POWER have more TDP to play with than x86. I have also seen benchmarks where SPARC was faster than x86 on crypto, compression, database queries, etc.
     
  13. jimmyb

    jimmyb 2[H]4U

    Messages:
    3,165
    Joined:
    May 24, 2006
    In the first sentence you agree that Intel has 50% margin on its x86 server market, which is quite good margin. In the second sentence you fall back on your claim that x86 server market is extremely low margin. How can you possibly reconcile these two statements?


    Check my link again. The single-threaded performance is faster on the Xeon than the POWER8.

    Integer performance is basically the same per socket, at ~5000. The business workloads you've been talking about are going to stress integer performance and leave the floating point units mostly idle. And again, the Xeon does it at less cost and power.


    Then don't keep saying "business workloads" or "large business workloads", because what you're talking about is only a tiny subset of "large business workloads".

    Ironically, payroll, accounting, etc. is computationally a tiny workload for Google and Facebook. Their largest workloads are run on their clusters. SAP is a tiny workload compared to a large data center.

    No. Amazon uses commodity x86. Stop speculating - these things are easy to look up.

    Amazon and most other sophisticated technology companies know how to architect large hardware and software systems so that they can be coarsely distributed. There is no reason that different users buying items on Amazon need to be served by a coherent database.

    Their software engineers have failed if me buying an item on Amazon is somehow stalling a thread for another user buying an item.
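
    To make "coarsely distributed" concrete, here is a minimal Python sketch of sharding orders by user. It is only an illustration under stated assumptions, not a description of how Amazon actually does it: `NUM_SHARDS`, `shard_for_user` and `place_order` are hypothetical names, and each in-memory dict stands in for what would be a separate database server.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count

# One independent order store per shard; in practice each would be a separate database server.
shards: list[dict[str, list[str]]] = [{} for _ in range(NUM_SHARDS)]


def shard_for_user(user_id: str) -> int:
    """Map a user to a shard with a stable hash (Python's built-in hash() is salted per process)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def place_order(user_id: str, item: str) -> int:
    """Record an order on the user's own shard and return the shard index."""
    idx = shard_for_user(user_id)
    shards[idx].setdefault(user_id, []).append(item)
    return idx


if __name__ == "__main__":
    for user, item in [("alice", "book"), ("bob", "laptop"), ("alice", "pen")]:
        print(user, item, "-> shard", place_order(user, item))
```

    Each order touches exactly one shard, so two users on different shards never contend for the same lock or the same coherent view of data. Things that genuinely are shared across users, such as inventory counts, need extra care and are outside this sketch.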

    But virtualization, HPC, web services, and data centers *are* business software. You are using an incorrect definition. If you are talking about SAP only, then say SAP only.

    Also, even among SAP workloads, you appear to be talking about the very tail-end. Most organizations don't require a multi-million dollar computer to run their SAP services.

    So when you say "business workloads", you actually mean a tiny fraction of SAP workloads.

    What? This simply isn't true. Can you provide a source and benchmarks for why branch heavy programs don't execute well on distributed systems? Can you even explain why this would be the case in terms of the program flow or system architecture?

    When clusters perform poorly it is generally because of slow coherency and data sharing, not branches. Branches and memory accesses are different.

    The Xeons are regarded as having the best branch prediction and ILP of any microprocessor on the market. These sorts of optimizations are evident when you look at benchmarks that are single-threaded, or more generally benefit from per-core IPC. See the Anandtech link again for an illustration of this performance; the Xeon appears to have higher per-core IPC.

    Utter speculation on your part. You have no idea how Google's data center would perform on SPARC or POWER. I am interested in actual facts.


    I am talking about all computing. I have no interest in arbitrarily narrowing the scope of discussion.

    In any case, the Xeon Phi will execute standard x86 binaries, and the upcoming Xeon Phi is general purpose and not add-in. So even by your arbitrary criteria, x86 wins on a naive flops comparison.

    1) Stop guessing about things like power usage. If you want to make an argument go look up the actual facts rather than speculating.

    2) Intel has higher performance per watt, so your point is irrelevant.

    3) Using more power is not a selling point. Ceteris paribus almost everyone prefers lower power.

    As you touch on, the comparison is only valid if the performance per watt is the same, which it isn't.

    The fact that x86 doesn't have a 32 socket system yet doesn't mean the architecture won't scale; it means nobody has built the system (presumably because the market size is so tiny). The trend from 8 socket to 16 socket showed good scaling, so by all evidence we can expect it to scale to 32 socket in a similar fashion.

    Also your definition of scaling is bizarre. x86 goes from sub-watt embedded applications all the way up to 16 sockets with unified memory, and further into distributed memory systems with thousands of cores. The range of systems and workloads that x86 is deployed in far exceeds what people are using SPARC in.

    Your argument that it won't scale because there's no 64 socket system is silly. This segment is like the 3-sigma tail of the SAP market. How many of these systems are sold every year? Setting 32 sockets as the cut-off point for what we consider scalable is arbitrary. Xeon, SPARC, and POWER all show up in the SAP top 10. They are all viable options in that workload.

    I could similarly say that SPARC doesn't scale because it doesn't have a 1024 socket system. Or it doesn't scale because it doesn't exist in milliwatt embedded silicon applications. Or it doesn't scale because it's not used in distributed systems.

    I never claimed the scaling would be linear. In fact, I explicitly said it was sub-linear. The trend still indicates that a 32 socket Xeon would take top spot or close to it.


    If you don't know about it, then don't make claims. I have no interest in hearing your speculation.


    Linpack Top 500 is here.
     
  14. jimmyb

    jimmyb 2[H]4U

    Messages:
    3,165
    Joined:
    May 24, 2006
    Your comments about power are confused.

    Intel's (CMOS) manufacturing process is considered 2 years ahead of the rest of the industry. The per-transistor power consumption is lower than anyone else's. This is a big factor in why Intel has lower power: it has better silicon.

    Intel targets a TDP based on market demands. It has nothing to do with the cores being "weak". It has to do with the manufacturing process, die area, supply voltage, and clock frequency, and how they think they can maximize profit. Performance scales sublinearly with respect to power consumption, so a careful tradeoff is made here in selecting a TDP.

    Look at:
    http://www.bit-tech.net/hardware/2014/09/03/intel-core-i7-5930k-and-core-i7-5820k-revie/8
    For one of the i7s, by increasing the clock frequency ~50% (3 to 4.5GHz) the power consumption under load jumps ~90% (~220W to 436W). Intel has done their market research and determined this is where the sweet spot is.

    If they wanted to sell parts with higher clock frequency, performance and power consumption, they could do so with existing silicon.
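
    To put a number on that tradeoff, here is a small Python sketch using the rounded figures quoted above (3 to 4.5GHz, ~220W to 436W). It assumes performance scales linearly with clock frequency, which is optimistic for most real workloads, and the function name is made up for illustration.

```python
def relative_perf_per_watt(f_base_ghz, f_oc_ghz, p_base_w, p_oc_w):
    """Perf/W of the overclocked point relative to stock, assuming performance scales linearly with clock."""
    speedup = f_oc_ghz / f_base_ghz
    power_ratio = p_oc_w / p_base_w
    return speedup / power_ratio


if __name__ == "__main__":
    # Rounded figures quoted in the post: 3 -> 4.5 GHz, ~220 W -> 436 W under load.
    ratio = relative_perf_per_watt(3.0, 4.5, 220.0, 436.0)
    print(f"perf/W at 4.5 GHz is {ratio:.2f}x the stock value")
```

    Even under that generous assumption, performance per watt at the higher clock comes out at roughly three quarters of the stock value, which is the sublinear behaviour described above.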
     
  15. jimmyb

    jimmyb 2[H]4U

    Messages:
    3,165
    Joined:
    May 24, 2006
    Are you seriously suggesting that most business workloads or most database procedures are P-complete problems? Or is this just a non-sequitur meant to distract us? Complexity theory speaks to the inherent serialization of instructions in a program based on unavoidable data dependencies; it says nothing about the optimal memory architecture of a system, ie. distributed vs. low latency shared memory space.

    The SAP "business workloads" you've been talking about don't constitute a single decision problem. They are multiple problems, and they are slow because of the traditional difficulties with coarse concurrency that requires 1) a coherent view of data, and 2) random data access over a large working data set. It's not because they are P-complete problems.

    Ironically, actual P-complete problems are going to run better on processors with a higher single-threaded throughput because they don't benefit from multiple cores/threads. The benchmarks I linked above indicate that this is the Xeon.
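
    The effect of an inherently serial portion can be sketched with Amdahl's law. This is a standard textbook model brought in here for illustration, not something taken from the benchmarks or the SGI quote, and the parallel fractions below are made-up values rather than measurements of any real workload.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: speedup = 1 / ((1 - p) + p / n)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / workers)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.95):  # illustrative fractions, not measurements
        print(p, [round(amdahl_speedup(p, n), 2) for n in (1, 8, 64)])
```

    When the parallel fraction is zero, adding cores or sockets gives no speedup at all, so the only way to go faster is higher single-threaded throughput.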

    In your quote SGI never said that only scientific workloads are parallelizable. You are misrepresenting what they said.

    There are lots of non-scientific workloads that are highly parallelizable, at different coarseness levels.

    Off the top of my head: running multiple programs at the same time, virtualizing multiple systems, separate threads for different components of a game engine (audio, rendering, AI, etc.).

    Your claim about only scientific computing being highly parallelizable is totally incorrect.
     
  16. Red Falcon

    Red Falcon [H]ardForum Junkie

    Messages:
    9,833
    Joined:
    May 7, 2007
    Not to resurrect a long dead thread, but I found this to be quite interesting in spite of what you have stated: https://groups.google.com/forum/#!topic/distsys-discuss/ZBPOe7pCV5M


    I find it interesting that a $2300 x86 system paired with a GPU can out-perform a nearly $1 million SPARC system with "far superior" specs, both from 2012; modern GPUs from 2015 would run circles around these systems, as well.
    Just thought I would share this for those that come across this.
     
  17. Red Falcon

    Red Falcon [H]ardForum Junkie

    Messages:
    9,833
    Joined:
    May 7, 2007
    nvm, thanks mods!
     
    Last edited: Nov 25, 2015
  18. pixelblue

    pixelblue [H]Lite

    Messages:
    113
    Joined:
    Aug 31, 2015
    IMO, nVidia and AMD could become big players in this kind of large scale computing market if they invested a little more in it. Particularly nVidia with their Tesla compute units.
     
  19. niomosy

    niomosy Limp Gawd

    Messages:
    247
    Joined:
    Nov 21, 2005
    Interesting but it's a test of a single SQL query. That PC isn't going to be running a massive database of any kind with that configuration. I'd rather see a comparison to an Intel system that can run a large database and do some real-world tests. That would provide more beneficial results. It seems there's a bit more to GPU-based databases though. Also from that link:

     
  20. LightningCrash

    LightningCrash 2[H]4U

    Messages:
    2,412
    Joined:
    Dec 29, 2000
    No need for real-world experience here: just read the Oracle marketing papers like whats-his-face.