AMD's new EPYC Milan review from AnandTech

I thought Zen3 Threadrippers would come before Milan :(
 
Takeaways:
  • Milan leverages a new I/O die with higher performance and significantly higher power consumption
  • Single-threaded performance is up, by a fair amount (+25% over Rome at the same clock speed). This is likely a combination of raw IPC improvements and incremental improvements to the uncore
  • Multi-threaded performance is roughly constant; multi-threaded performance-per-watt has regressed due to increased uncore power consumption
  • Lightly-threaded performance is up thanks to aggressive boost clocks and huge caches
Overall, I think Milan is a good real-world improvement geared towards the market's needs, despite the fact that raw throughput hasn't increased much. While hyperscalers such as AWS are obsessed with compute density, not everyone is a hyperscaler and Milan is a good boost in performance for the smaller guys who aren't interested in consolidating into 128-core 1U nodes. In the smaller 16, 24, and 32 core configurations we're going to be seeing a 25% gen-on-gen boost at comparable price points, plus enough PCIe lanes for all-flash filers equipped with 200GbE. There's a lot of market for the smaller 1P servers handling utility services or workloads which don't scale to 3-digit core counts (i.e., most of them).

What's missing right now, I think, is an intermediate IOD configuration with quad-channel, 2DPC (L)RDIMM support and 64 PCIe lanes, possibly at lower data rates. The performance enhancements made on the Milan IOD have pushed it out of the realm of small servers - if you don't need eight memory channels or 128 lanes, you are ironically stuck buying Intel for the low end. Ryzen doesn't really work in servers (it doesn't support RDIMMs, and 16 lanes isn't enough), and Threadripper falls too far off the performance-per-watt cliff (everything is 280W). On the other hand, one could argue that Intel has the correct design at the low end - the modular concept doesn't really work out on a 16-core part because of the extra latencies and communication overhead, and a monolithic 16-core part isn't unmanageably large.
 
Man, I wish VMware hadn't crapped on licensing in v7 by moving to core count vs. physical socket, as we have a bunch of Xeon v4 boxes that could consolidate to one.
 
Good comments, but depending on how you define "small server", I am skeptical that AMD isn't attractive vs. Intel.

There are many components to TCO, and Intel just seems laughably behind on so many metrics. Any evaluation will eventually run into a deal-breakingly bad one.

Any cost-effective system for a smaller enterprise must embrace Hyper-Converged Infrastructure - it's just so liberating and cost-effective: just add another host to expand, instead of replacing with a bigger box.

The various hosts appear as one manageable unit.

AMD seems to have a killer edge - the key being where servers get bottlenecked: the interconnect speeds between hosts.

We see AMD win deal after deal in exascale contests, and that same magic applies to HCI on a smaller scale.

If by small you mean they have a small server for this and a small server for that - I suspect that's a very uneconomic model from here on; HCI will not only be best, but cheapest.
 
By small I mean anything with 16 cores or fewer, these days typically 1P servers with a bunch of I/O and drives, used for things such as consolidating services, storage, or networking. Milan isn't really viable at that level - a 155W part built out of five dies that idles at 100W is a poor choice for sporadically-loaded servers. Intel has some nice parts in this range; for example, the Xeon Gold 5218R is a 125W part with good turbo, a reasonable idle, and a good price.
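To put the idle-power point in rough numbers - a back-of-the-envelope Python sketch, where the 100W idle figure comes from the post above and the 30W comparison point and electricity rate are purely illustrative assumptions:

```python
# Back-of-the-envelope: annual idle energy for a sporadically-loaded 1P server.
# 100 W idle is the Milan figure quoted above; 30 W is an assumed placeholder
# for a small monolithic part, and 0.15 USD/kWh is an assumed electricity rate.
HOURS_PER_YEAR = 24 * 365
USD_PER_KWH = 0.15

def annual_idle_cost(idle_watts: float) -> float:
    kwh_per_year = idle_watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * USD_PER_KWH

for label, watts in [("100 W idle (figure above)", 100),
                     ("30 W idle (assumed comparison)", 30)]:
    print(f"{label}: ~{watts * HOURS_PER_YEAR / 1000:.0f} kWh/year, "
          f"~{annual_idle_cost(watts):.0f} USD/year at the assumed rate")
```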
 
I do not know the large S&P 500 business scene, but as a small business owner myself, I would argue that consolidating as many cores as possible into one 1U rack slot is very important to small and medium-sized companies. Renting 1U of rack space in the city centre (inside the bedrock, and backed up daily to the cloud somewhere) can run from USD 100 up to 300 per month - and that does not include data and electricity, only the space for a 1U server. With rental costs that high, you want to consolidate as much as possible, because it would be crazy to keep 4 separate servers and pay USD 100-300 per month each, when the whole sh*t hardware is probably not worth 12 months' rent.
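As a rough illustration of that rent math - a quick Python sketch using the USD 100-300/month range quoted above and the four-server example from the post:

```python
# Annual colocation rent: four separate 1U servers vs. one consolidated 1U box,
# using the USD 100-300 per month per-1U range quoted above.
for rent_per_u in (100, 300):
    four_boxes = 4 * rent_per_u * 12   # USD/year for four separate 1U servers
    one_box = rent_per_u * 12          # USD/year for a single consolidated 1U server
    print(f"At {rent_per_u} USD/month per 1U: four servers = {four_boxes} USD/yr, "
          f"one server = {one_box} USD/yr, saving {four_boxes - one_box} USD/yr")
```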
 
Right - unless your servers are incredibly old, you are probably going to consolidate into something with 48 or 64 cores, and a single-socket AMD system makes a lot of sense there. But there are times when you aren't looking for maximum density - for example, you might want redundant bare-metal firewall(s), a dedicated SAN that provides central storage for your virtualized infrastructure, or machines that have specialized expansion cards - and that's where AMD sort of gets pushed out of the market.
 
Man, I wish VMware hadn't crapped on licensing in v7 by moving to core count vs. physical socket, as we have a bunch of Xeon v4 boxes that could consolidate to one.
You still can. At modern consolidation rates you'll still cut the license count SIGNIFICANTLY. The last DC design I did took 84 nodes down to 18 - even doing that on EPYC (would have been... 14 I think, due to cluster constraints) would have been WAY fewer licenses than the original 84x2.
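The license math there works out roughly as follows - a sketch assuming every host is dual-socket (as the "84x2" implies) with one license per socket, and ignoring the per-32-core cap discussed further down the thread:

```python
# Per-socket license counts for the consolidation example above, assuming
# dual-socket hosts throughout (per "84x2") and one license per socket;
# the per-32-core cap discussed later in the thread is ignored here.
SOCKETS_PER_HOST = 2

for label, hosts in [("original", 84),
                     ("consolidated", 18),
                     ("consolidated on EPYC (estimated above)", 14)]:
    licenses = hosts * SOCKETS_PER_HOST
    print(f"{label}: {hosts} hosts x {SOCKETS_PER_HOST} sockets = {licenses} licenses")
```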
 
Right - unless your servers are incredibly old, you are probably going to consolidate into something with 48 or 64 cores, and a single-socket AMD system makes a lot of sense there. But there are times when you aren't looking for maximum density - for example, you might want redundant bare-metal firewall(s), a dedicated SAN that provides central storage for your virtualized infrastructure, or machines that have specialized expansion cards - and that's where AMD sort of gets pushed out of the market.
They have SoC-based EPYC (the Embedded 3000 series) for that - low cost, 8 cores or so. :D Generally ITX-sized boards.
https://www.supermicro.com/en/products/aplus/solutions/epyc3000-embedded

Others make them too - generally special order.
 
Just saw this - that board is tiny. I wonder if that ASRock ITX board, the ROMED4ID-2T, could handle this CPU, since it's the same socket as previous gens. That would be one hell of an HTPC, lol ><
 
Man, I wish VMware hadn't crapped on licensing in v7 by moving to core count vs. physical socket, as we have a bunch of Xeon v4 boxes that could consolidate to one.
We just bought a v7 license and we were still able to purchase it per socket; we bought two single-socket licenses for the dual CPUs in that box. Got it through CDW (technically CDW-G). Is per-core licensing the new VMware standard now?
 
Each per-socket license covers up to 32 cores.
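In other words, licenses are counted per socket in 32-core increments. A rough sketch of how that plays out - the configurations below are just illustrative:

```python
import math

def vsphere7_licenses(sockets: int, cores_per_socket: int) -> int:
    # One license per socket, with each license covering up to 32 cores
    # (the vSphere 7 rule described in the post above).
    return sockets * math.ceil(cores_per_socket / 32)

# Illustrative configurations:
print(vsphere7_licenses(sockets=2, cores_per_socket=32))  # dual 32-core box above -> 2
print(vsphere7_licenses(sockets=1, cores_per_socket=64))  # "single socket, 64 cores" -> still 2
print(vsphere7_licenses(sockets=1, cores_per_socket=24))  # small 1P host -> 1
```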
 
Ah, that makes sense. That box has two 32-core processors in it.
Yeah. Big selling point early on was "cut your VMware licensing costs in half! Single socket, 64 cores!"
VMware: "Uh... No."

edit: Love Milan/etc., but I'll also point out that for the high-core-count procs, clock speed is rather... low. 2.2 GHz is good for a lot of things, but not everything, and you'll rarely see the boost clocks. Make sure you think about what you're putting on it - doing mega-VDI on one of those, for instance, sucks without a LOT of GPUs (too slow for 30 fps desktops, which is still a single-threaded process). Same issue Intel had early on with some of the high-count v4 (Broadwell) and Skylake (x1xx) procs - some things still want GHz, while others are just fine with IPC increases and more threads. Choose your server processors knowing your workload(s).
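To make the clock-speed point concrete - a rough arithmetic sketch, where the 2.2 GHz figure comes from the post above and the 30 fps target and 3.7 GHz comparison clock are illustrative assumptions:

```python
# Single-threaded cycle budget per frame at a 30 fps remote-desktop target.
# 2.2 GHz is the base clock mentioned above; 3.7 GHz and the 30 fps target
# are illustrative assumptions for comparison.
TARGET_FPS = 30
frame_budget_s = 1.0 / TARGET_FPS  # ~33 ms per frame

for clock_ghz in (2.2, 3.7):
    cycles = clock_ghz * 1e9 * frame_budget_s
    print(f"{clock_ghz} GHz: ~{frame_budget_s * 1000:.1f} ms/frame, "
          f"~{cycles / 1e6:.0f} million cycles available to the single encode thread")
```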
 