AnandTech tests Calxeda's 24 node (96 core) ARM server

Discussion in 'All non-AMD/Intel CPUs' started by pxc, Mar 13, 2013.

  1. pxc

    pxc [H]ard as it Gets

    Messages:
    33,064
    Joined:
    Oct 22, 2000
    http://anandtech.com/show/6757/calxedas-arm-server-tested

    The main contenders:
    Calxeda 24 nodes, each with a quad core Cortex A9 (ECX-1000) @ 1.4GHz, 96GB RAM (4GB x 24, single channel per node)

    Dell PE R720 dual socket Xeon E5-2660 (8 cores per chip) @ 2.2GHz, 96GB RAM
    (other CPUs are compared, see the link)

    As Johan notes, Calxeda's server is more of a cluster than a typical rack server. The performance is interesting. In workloads that it targets, it performs well against the dual Xeon E5-2660 Dell server, beating it in web server tests in both response time and throughput for various loads. Power consumption is 33% lower than the Xeon server under average and peak load (idle power is nearly equal between the ARM and Xeon servers). And this is just with Cortex A9 chips. A15 and A50 should do even better in the future, although using significantly more power.

    Sounds great, but then there's the price: $20,000 for the Calxeda 24 node server vs about $8000 for the Dell dual Xeon E5-2660 server. However, if filling a whole rack (and this server is intended for that usage), each Calxeda server comes out to around $8500.
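
    To put those prices on a per-node and per-core footing, here is a rough sketch in Python. The dollar figures and core counts are the ones quoted in this post; the names are made up for illustration, and a per-core price says nothing about per-core performance.

```python
# Prices and core counts as quoted in the post; not independently verified.
CALXEDA_SINGLE_PRICE = 20_000   # one 24-node Calxeda server bought alone
CALXEDA_RACK_PRICE = 8_500      # per server when filling a whole rack
DELL_R720_PRICE = 8_000         # dual Xeon E5-2660 server

CALXEDA_NODES = 24
CALXEDA_CORES = 96              # 24 nodes x 4 Cortex A9 cores
XEON_CORES = 16                 # 2 sockets x 8 cores


def price_per_unit(price, units):
    """Price divided evenly across nodes or cores, in dollars."""
    return price / units


print(f"Calxeda, single server: ${price_per_unit(CALXEDA_SINGLE_PRICE, CALXEDA_NODES):.0f} per node, "
      f"${price_per_unit(CALXEDA_SINGLE_PRICE, CALXEDA_CORES):.0f} per core")
print(f"Calxeda, full rack:     ${price_per_unit(CALXEDA_RACK_PRICE, CALXEDA_NODES):.0f} per node, "
      f"${price_per_unit(CALXEDA_RACK_PRICE, CALXEDA_CORES):.0f} per core")
print(f"Dell R720:              ${price_per_unit(DELL_R720_PRICE, XEON_CORES):.0f} per core")
```

    At rack pricing each A9 core costs far less than a Xeon core, but each Xeon core also does far more work, so the number that matters is price per unit of throughput on the target workload.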

    Notice that "dirt cheap" ARM processors have close to zero advantage at server-level pricing. Those ECX-1000 chips, if they had GPUs and other things needed in a handheld or tablet processor, would sell for around $20. A Xeon E5-2660 lists for around $1300. Scale makes a huge difference here.

    If Intel weren't replacing the aging Atom with a similar very low power Atom processor for high density server nodes, based on the new uarch, this would be very alarming. The 2 pages of simple benchmarks including a dual core Atom are kind of interesting in this light. The ECX-1000's memory bandwidth is very poor, and the dual core Atom N2800 has a healthy lead over the quad core A9 based ECX-1000. It should be possible for Silvermont (or later) Atoms to remain power and performance competitive against future Cortex A50 server designs. The 32nm Atom N2800 used in the comparison is a bit of a power hog, relatively speaking, although not as bad as Johan suggests. (Not the same workloads, but 8.5W per ARM node vs 12W for an Atom system isn't too far off.)
     
  2. Digital Viper-X-

    Digital Viper-X- [H]ardForum Junkie

    Messages:
    13,741
    Joined:
    Dec 9, 2000
    Impressive numbers for the webserver part, disappointing everywhere else, but good power numbers too.

    I wonder how the A15 and the 64-bit ARMv8 will do.

    Well, a 3.5W difference might not look bad, but that's roughly a 40% increase over the 8.5W ARM node :p
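
    For anyone who wants to check the gap, here is a quick sketch in Python using the two wattages quoted earlier in the thread (8.5W per ARM node, 12W for the Atom system). The function name is hypothetical, and since the two figures come from different workloads the result is only a ballpark.

```python
# Figures quoted in the thread; measured under different workloads,
# so the comparison is approximate at best.
ARM_NODE_WATTS = 8.5
ATOM_SYSTEM_WATTS = 12.0


def power_gap(baseline_watts, other_watts):
    """Return (absolute gap in watts, gap as a percentage of the baseline)."""
    gap = other_watts - baseline_watts
    return gap, 100.0 * gap / baseline_watts


gap_watts, gap_percent = power_gap(ARM_NODE_WATTS, ATOM_SYSTEM_WATTS)
print(f"Atom draws {gap_watts:.1f} W more, about {gap_percent:.0f}% above the ARM node")
```

    Measured against the ARM node the Atom system draws about 41% more. Measured the other way around, the ARM node draws about 29% less than the Atom.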
     
  3. crispy79

    crispy79 [H]Lite

    Messages:
    111
    Joined:
    Jun 7, 2009
    64-bit ARM is going to kick some ass. I just hope that we see an ecosystem spring up of standard form factors. I hope AMD helps facilitate this.
     
  4. pelo

    pelo 2[H]4U

    Messages:
    2,911
    Joined:
    Apr 23, 2011
    pxc, don't judge it by its price too much. That's Calxeda charging a fortune for being the only one in the market. The chips themselves are absolutely tiny and on an older more mature node with less complex PCBs and features making them cheaper than the Intel parts. The prices should drop considerably once Samsung, Qualcomm, AMD, Marvell and nVidia enter the market.

    Bear in mind these are bog standard ARM SoCs that were designed primarily for smartphones. The A57/A53 64-bit cores are the first real ARM chips that will target the server market.
     
  5. pxc

    pxc [H]ard as it Gets

    Messages:
    33,064
    Joined:
    Oct 22, 2000
    Not really "bog standard", but it is almost certainly based on a standard licensed Cortex A9 core. http://www.calxeda.com/technology/products/processors/ecx-1000-series/

    The A57, ARM's first "server chip", has been described by the company as a server, tablet and phone processor. Clock frequency and inclusion of big.LITTLE will probably separate the mobile and server segments using the A57 model, not a separate uarch for mobile and servers. The Cortex A53 is targeted at Cortex A9 performance levels, but 40% smaller and using less power. IOW, there really isn't a "server" design even on announced future products.
     
  6. pelo

    pelo 2[H]4U

    Messages:
    2,911
    Joined:
    Apr 23, 2011
    pxc, ARM doesn't make 'specific' cores or SoCs. I think you're misinterpreting their business model here...

    What ARM does is license IP. That's pretty much it. What people do with that IP is up to them (there are certain limitations, but the ISA is fair game). For example, a company can make a 20-core ARM SoC utilizing A7s. That's not ARM's SoC, but rather just ARM's standard A7 core design. In order to cater to the multitude of markets that ARM cores find themselves in (everything from routers and SSDs to tablets and smartphones), they have to make a core that's very much vanilla. It has to perform well for any use that a potential customer might have for it. Whenever you see A9 or A7 or A#, it's very much a bog standard ARM core.

    The A57 is also a bog standard ARM core. It has to be. It's up to companies like Qualcomm or Marvell to tinker with it to fit their specific needs. For example, Qualcomm might widen the front end and increase IPC at the cost of higher power consumption because they'd only use two of them in a smartphone SoC, but Marvell would go in the opposite direction for embedded devices.

    The lack of 64-bit support is what's been holding back ARM processors from gaining significant market share in the server market. RAM is almost always the first bottleneck with most servers, and limiting your cores to 32-bit will certainly limit the number of clients you can appeal to =P Even with the current 32-bit limit, ARM actually has a higher market share than AMD's Opteron division.
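
    To show why 32-bit hurts on RAM, here is a small sketch in Python of how much memory a flat address space of a given width can reach. It assumes plain byte addressing with no extensions, and the function name is made up for illustration. The 24-node count comes from the first post.

```python
GIB = 2 ** 30  # bytes in one GiB


def addressable_gib(address_bits):
    """GiB reachable with a flat, byte-addressed space of the given width."""
    return 2 ** address_bits // GIB


NODES = 24  # node count of the Calxeda box from the first post
per_node_gib = addressable_gib(32)

print(f"32-bit addressing: {per_node_gib} GiB per node")
print(f"{NODES} nodes: {NODES * per_node_gib} GiB across the whole box")
```

    That lines up with the 4GB x 24 = 96GB configuration listed in the first post: each node is already at the ceiling of what 32-bit addressing allows under this assumption.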
     
  7. Fort_Major

    Fort_Major Limp Gawd

    Messages:
    425
    Joined:
    Apr 22, 2006
    The biggest problem I see is that while this is cool, nobody is really working on it, despite talk about "Server ARM" for about 4-5 years.

    Price is outrageous.
    There is no Windows Server + Stack for it
    OS/Server Stacks are probably another 5+ years away from being actually optimized. Re-compiling an application for this type of architecture is not optimizing.
     
  8. pxc

    pxc [H]ard as it Gets

    Messages:
    33,064
    Joined:
    Oct 22, 2000
    I think all these things were considered by Calxeda.

    1. single server price is crazy at $20,000, but more reasonable at $8500 per server by the rack full. Overall costs, if utilization is high at least, could be significantly lower with the ARM cluster over the long term (AT's tests show 33% lower power consumption at high load).

    2. These types of servers are unlikely to replace traditional office/business server roles. They're made for cloud servers or some type of physicalization-able tasks where host OS doesn't matter much. The stack, when used, will almost certainly be LAMP-ish (someone want to make a new acronym for *nix, Apache, Some SQL/SQL-ish server, Some scripting language ;)).

    3. Not all applications will be suitable for these types of servers. I've been pointing that out for a long time. These are high density servers made for data centers, where the bottlenecks could primarily be things other than CPU performance. Not servers that will replace your Windows domain controller, mail server, file server, etc... at least not until a lot more is done for compatibility with clients, Windows security and applications.

    ARM certainly has been hyping up servers for years, and the first CPU (Cortex A57) deemed suitable for servers (and tablets and handhelds) isn't coming out in production until next year. Calxeda admits in the marketing materials that ARM servers are in the "early adopter" phase.
     
  9. Fort_Major

    Fort_Major Limp Gawd

    Messages:
    425
    Joined:
    Apr 22, 2006
    Yep.

    Will have to see what unfolds over the next few years. At least this is a first stepping stone.
     
  10. pelo

    pelo 2[H]4U

    Messages:
    2,911
    Joined:
    Apr 23, 2011
    http://www.fudzilla.com/home/item/30947-arm-and-tsmc-tape-out-first-cortex-a57-chip

     
  11. Fort_Major

    Fort_Major Limp Gawd

    Messages:
    425
    Joined:
    Apr 22, 2006
    Why are the better chips being fabricated on a 16nm process
    while the lower end chips are expected to be fabricated at 20nm?
    From my understanding of the fabrication process, the more complex the chip is, the harder it is to fabricate on a smaller node, which is why memory/flash fabrication has always led the way in terms of smallest process.

    Am I missing/misunderstanding something?
     
  12. pelo

    pelo 2[H]4U

    Messages:
    2,911
    Joined:
    Apr 23, 2011
    Yes, but the difference in complexity between an A15 and an A57 isn't drastic. In fact, you can make the assumption that no core is inherently any more difficult than another core with respect to fabrication. The GPU portion of an SoC (or APU/CPU) is much more difficult due to the density and complexity.

    TSMC's 16nm process is a 20nm BEOL with FinFETs. GloFo is taking the same approach with their 14nm-XM process.
     
  13. Fort_Major

    Fort_Major Limp Gawd

    Messages:
    425
    Joined:
    Apr 22, 2006
    Okay, Thanks for clearing that last bit up.
     
  14. pelo

    pelo 2[H]4U

    Messages:
    2,911
    Joined:
    Apr 23, 2011