AMD Quietly Launched 8 Core EPYC CPU with 64MB L3

AlphaAtlas

AMD quietly launched an interesting addition to the EPYC line, which ServeTheHome spotted yesterday. The EPYC 7261 has just 8 cores and 16 threads. But, unlike the existing EPYC 7251, the 7261 has all 64MB of its L3 cache enabled, as well as a significantly higher base clock, a TDP of 170W instead of 120W, and support for DDR4-2666. For reference, AMD's 8-core Ryzen desktop CPUs only have 16MB of L3, and Intel's new i9-9800X has 16.5MB. That is a truly massive amount of cache for an 8-core CPU, which makes the 7261 an interesting option for expensive, I/O-intensive software that's licensed by the core.

This chip quietly launched in June 2018, but we missed it since there was no announcement of the part. We have seen it available from major OEMs such as Dell EMC, HPE, and Supermicro. Pricing-wise, we are still looking to get the official figure, but it seems to be a ~$150 price increase over the EPYC 7251 on most configurators, which would put it roughly in the price range of the 16-core, 32MB AMD EPYC 7281.
 
With a base clock of 2.5GHz and a turbo of 2.9GHz, that's still not going to be an exciting workstation chip.

So AMD still has the weird dilemma in this generation. They have a workstation CPU in Threadripper that doesn't have any workstation-level motherboards available for it. There are, however, workstation-style motherboards available for Epyc, but none of the Epyc chips are clocked high enough to really make for a good workstation CPU.

They really ought to do something about this.
 
How many PCIe lanes?

If it doesn't have the memory latency issues its cousins have...
 
How many PCIe lanes?

If it doesn't have the memory latency issues its cousins have...
All of them.

The issue with this release, I imagine, will be the tray price. It fits a very nice spot, but I think it's priced right near the 16-core CPU with the same amount of cache. We'll have to see what the OEM options will be.
 
I need to see how something like this handles a multi-threaded, highly I/O-intensive workload, like a very busy transactional DB. If it really is faster than a 16-core part on a socket-per-socket basis, then this could be a good win. Throw four of these together for some insane I/O and great DB performance that exceeds the equivalent 16-core solutions.

When companies like Microsoft are charging $26,000 in licensing per 4 cores, IF you have a good enterprise contract, you can see where a chip like this would be appealing. AMD just really NEEDS to show how good this chip is in that sort of workload to make it worthwhile.
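
To put rough numbers on the per-core licensing argument, here is a minimal sketch in Python; the $26,000-per-4-cores figure is the one quoted above, and the core counts and 4-core pack assumption are just illustrative, not actual Microsoft pricing tiers.

```python
# Rough per-core licensing comparison (hypothetical figures, purely illustrative).
# The point: with per-core licensing, a fast 8-core part can win on total cost
# even when a higher-core-count chip is only modestly more expensive hardware.

PRICE_PER_4_CORES = 26_000  # enterprise rate quoted above, USD (assumption)

def license_cost(cores: int) -> int:
    """License cost assuming the software is sold in 4-core packs."""
    packs = -(-cores // 4)  # ceiling division
    return packs * PRICE_PER_4_CORES

for cores in (8, 16, 32):
    print(f"{cores:>2} cores -> ${license_cost(cores):,} in licensing")
```

At those figures, 8 cores comes to $52,000 and 16 cores to $104,000, so a ~$150 hardware premium for the 7261's extra cache is noise next to the licensing delta.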
 
I'm curious to see what it does with long data sets and big joins in Oracle. Mostly, I'd be curious what it does with data that hasn't been optimized, or maybe some ugly machine-generated queries. But those are pretty specialized, and I can't see someone buying a server that's optimized to save on disk use.

I have seen some SQL-backed GIS tasks that have really lit up a drive stack, though.
 
With a base clock of 2.5GHz and a turbo of 2.9GHz, that's still not going to be an exciting workstation chip.

So AMD still has the weird dilemma in this generation. They have a workstation CPU in Threadripper that doesn't have any workstation-level motherboards available for it. There are, however, workstation-style motherboards available for Epyc, but none of the Epyc chips are clocked high enough to really make for a good workstation CPU.

They really ought to do something about this.

Out of curiosity, what is missing from the current Threadripper motherboard lineup that makes them unsuitable for workstation use?
 
Out of curiosity, what is missing from the current Threadripper motherboard lineup that makes them unsuitable for workstation use?

They aren't completely unusable for workstation use, but they aren't ideal either.

1.) Workstation and server users tend to need more PCIe slots at 8x or narrower. The boards currently on the market are designed with gaming in mind, and thus optimized for as many 16x slots as possible.

2.) The above might be less of a problem if there were onboard 10G Ethernet on all of them. Some of them do have this feature, but it uses Aquantia chipsets instead of Intel chipsets, and is 10GBase-T rather than SFP+-based.

3.) While the chipset and CPU support ECC, it is unclear to me if TR boards out there actually enable ECC. A purpose-designed workstation board would have this validated and in place. (A quick way to check from Linux is sketched at the end of this post.)

4.) WS boards tend to have more advanced BIOS features, like per-slot disabling of boot ROMs, etc., that some find very useful.

5.) Current TR boards all have disco lights, elaborate fashion heatsinks and racy paint jobs. Sure, you can disable this stuff, and don't HAVE TO put the board in a case with a window, but it still sends the wrong message, and that message is that TR is a gaming product, not a WS product.
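
On point 3, one way to check from a running Linux box whether ECC is actually active is to look at the kernel's EDAC interface; this is a minimal sketch that assumes Linux with the appropriate EDAC driver loaded (e.g. amd64_edac), not a guarantee for any particular TR board.

```python
# Minimal sketch: report whether the Linux EDAC subsystem has registered any
# memory controllers, which is a reasonable sign that ECC is enabled end-to-end.
from pathlib import Path

def ecc_status() -> str:
    mc_root = Path("/sys/devices/system/edac/mc")
    if not mc_root.is_dir():
        return "EDAC not present: ECC likely disabled or driver not loaded"
    controllers = sorted(p.name for p in mc_root.iterdir() if p.name.startswith("mc"))
    if not controllers:
        return "EDAC present but no memory controllers registered"
    return "ECC reporting active on: " + ", ".join(controllers)

if __name__ == "__main__":
    print(ecc_status())
```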
 
They aren't completely unusable for workstation use, but they aren't ideal either.

1.) Workstation and server users tend to need more PCIe slots at 8x or narrower. The boards currently on the market are designed with gaming in mind, and thus optimized for as many 16x slots as possible.

2.) The above might be less of a problem if there were onboard 10G Ethernet on all of them. Some of them do have this feature, but it uses Aquantia chipsets instead of Intel chipsets, and is 10GBase-T rather than SFP+-based.

3.) While the chipset and CPU support ECC, it is unclear to me if TR boards out there actually enable ECC. A purpose designed workstation board would have this validated and in place.

4.) WS boards tend to have more advanced BIOS features, like per-slot disabling of boot ROMs, etc., that some find very useful.

5.) Current TR boards all have disco lights, elaborate fashion heatsinks and racy paint jobs. Sure, you can disable this stuff, and don't HAVE TO put the board in a case with a window, but it still sends the wrong message, and that message is that TR is a gaming product, not a WS product.


1) Well, keep in mind that you can (obviously) use lower 4x, 8x, etc. cards in the 16x slots... For me personally, I would rather have the option of 16x slots, as it gives me the option to use anything I want vs. being stuck with an 8x card and only a 4x slot.

2) You can get an Intel 10Gb SFP+ card with two SFP+ ports for $175. Small price to pay, especially since the current motherboards are priced below a typical "workstation" class board.
https://www.serversupply.com/produc...MIsaC6teyz3gIVCztpCh1G9w7dEAQYASABEgJe1PD_BwE

3) I agree with you 100% on this one. Having said this, I have 128GB in mine and often have that maxed out and haven't run into any issues (with the caveat that I did have some memory compatibility issues initially, even with QVL memory).

4) Zenith Extreme has this feature.

5) Meh, I don't care. Sending the "right" or "wrong" message via lighting is easily dismissed when in reality it comes down to getting your work done.

My machine has a 1950X, 128GB of RAM, a P900 Optane boot NVMe drive, an Intel P3608 NVMe drive, a 1080 Ti, and an Intel 10Gb network card. I have also used the same config with 2x Vega 64s for compute work, and this machine is basically under constant load at all times. It's super stable.

So all in all, if you base the term "workstation" on functionality and stability (which I would argue is the correct way to make the distinction), the current crop passes muster.
 
1) Well, keep in mind that you can (obviously) use lower 4x, 8x, etc. cards in the 16x slots... For me personally, I would rather have the option of 16x slots, as it gives me the option to use anything I want vs. being stuck with an 8x card and only a 4x slot.

Well, yes, but when you only have four slots, this limits how many cards you can use.

Sure, you could turn a 16x slot into two 8x slots with some sort of riser, but that's going to be difficult to fit properly.

I agree that it is nice to have 16x slots. To me, the ideal Threadripper workstation board would have eight full-length 16x slots. Now, we only have 64 lanes to work with, but you could use the good old auto-switching technique, sharing the 16x lanes across pairs of slots, to make it either of these (a quick lane-count sanity check is sketched after the last layout):

1 - 16x
2 - 0x
3 - 16x
4 - 0x
5 - 16x
6 - 0x
7 - 16x
8 - 0x

Or:

1 - 8x
2 - 8x
3 - 8x
4 - 8x
5 - 8x
6 - 8x
7 - 8x
8 - 8x

Or any combination of the above.

Mine would probably be:

1 - 16x (GPU)
2 - 0x
3 - 8x
4 - 8x
5 - 8x
6 - 8x
7 - 8x
8 - 8x
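
To sanity-check that these layouts fit in the 64 lanes assumed above, here is a minimal sketch; the layouts are the ones listed, and the helper is purely illustrative (real boards also spend lanes on the chipset link, M.2 slots, and onboard devices).

```python
# Minimal sketch: verify a proposed slot layout against the 64-lane budget
# assumed in the post above. Slot names are illustrative only.

TOTAL_LANES = 64

def fits(layout: dict[str, int]) -> bool:
    used = sum(layout.values())
    print(f"{used}/{TOTAL_LANES} lanes used -> {'OK' if used <= TOTAL_LANES else 'over budget'}")
    return used <= TOTAL_LANES

# Four x16 slots with the alternating slots switched off:
fits({"slot1": 16, "slot3": 16, "slot5": 16, "slot7": 16})

# Eight slots all switched down to x8:
fits({f"slot{i}": 8 for i in range(1, 9)})

# The mixed layout above: one x16 for the GPU plus six x8 slots:
fits({"slot1": 16, **{f"slot{i}": 8 for i in range(3, 9)}})
```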

2) You can get an Intel 10Gb SFP+ card with two SFP+ ports for $175. Small price to pay, especially since the current motherboards are priced below a typical "workstation" class board.
https://www.serversupply.com/produc...MIsaC6teyz3gIVCztpCh1G9w7dEAQYASABEgJe1PD_BwE

Of course, but now you've used up one of those scarce PCIe slots.

If I got my eight-slot switched layout above, I'd be perfectly fine using one of them for a 10G NIC.
 
The Zenith Extreme has four 16x PCIe slots and a 4x PCIe slot, so five of the eight you are looking for are already present. Use the 4x slot for your 10Gb Ethernet. It also has a 1x slot, but I'm not sure what anyone in this use case would use it for. Also keep in mind that you have three M.2 NVMe slots (which you can arrange so they **each** have full 4x bandwidth if you want). So depending on what type of storage you were intending to run, you could still effectively pull off what you are describing, assuming you would need to use some of those PCIe slots for storage anyway and could use M.2 NVMe solutions. So you could earmark at least one M.2 as another 4x PCIe "slot", and that gets you to six. Plus, keep in mind all your scenarios only need 64 PCIe lanes, so nothing is going through a PLX controller and everything is full bandwidth.

Given all this, if six theoretical slots is not enough, then yes, you are out of a current solution. But if that's truly the case, then you may need a server board at that point, and you have Epyc options... :)
 
1) Well, keep in mind that you can (obviously) use lower 4x, 8x, etc. cards in the 16x slots... For me personally, I would rather have the option of 16x slots, as it gives me the option to use anything I want vs. being stuck with an 8x card and only a 4x slot.

2) You can get an Intel 10Gb SFP+ card with two SFP+ ports for $175. Small price to pay, especially since the current motherboards are priced below a typical "workstation" class board.
https://www.serversupply.com/produc...MIsaC6teyz3gIVCztpCh1G9w7dEAQYASABEgJe1PD_BwE

3) I agree with you 100% on this one. Having said this, I have 128GB in mine and often have that maxed out and haven't run into any issues (with the caveat that I did have some memory compatibility issues initially, even with QVL memory).

4) Zenith Extreme has this feature.

5) Meh, I don't care. Sending the "right" or "wrong" message via lighting is easily dismissed when in reality it comes down to getting your work done.

My machine has a 1950X, 128GB of RAM, a P900 Optane boot NVMe drive, an Intel P3608 NVMe drive, a 1080 Ti, and an Intel 10Gb network card. I have also used the same config with 2x Vega 64s for compute work, and this machine is basically under constant load at all times. It's super stable.

So all in all, if you base the term "workstation" on functionality and stability (which I would argue is the correct way to make the distinction), the current crop passes muster.

I agree about the lack of workstation features on current motherboards.
My Checklist would be the following.

1. Dual 10GbE and/or SFP+ ports
2. OCuLink connectors for NVMe expansion, at least 4
3. Thunderbolt 3 connectivity (most new DAW and video editing hardware uses Thunderbolt)
4. Clean low latency USB ports for DAW use, at least 2
5. At least 3 full speed M.2 ports that have dedicated lanes, no lane sharing with other slots.
6. Change SATA to SAS-3/SAS-4 with SATA backwards compatibility
7. ECC Memory

Remove all the junk: no RGB, no unnecessary plastic shrouds, no onboard audio, no overclocking features, etc.
 
When companies like Microsoft are charging $26,000 in licensing per 4 cores, IF you have a good enterprise contract, you can see where a chip like this would be appealing. AMD just really NEEDS to show how good this chip is in that sort of workload to make it worthwhile.

When dealing with SQL, go with fewer cores and higher clock speeds whenever possible. The price of the hardware is generally less than the cost of the software.
 
When dealing with SQL, go with fewer cores and higher clock speeds whenever possible. The price of the hardware is generally less than the cost of the software.
Ain't that the truth. I'm really starting to hate the new MS pricing structure.
 
Finally, a chip that makes sense. Plenty of PCIe for I/O. Higher clocks and a lower core count to battle ridiculous per-core license agreements. Now if only these software vendors would list these chips as supported.
 
I agree about the lack of workstation features on current motherboards.
My Checklist would be the following.

1. Dual 10GbE and/or SFP+ ports
2. OCuLink connectors for NVMe expansion, at least 4
3. Thunderbolt 3 connectivity (most new DAW and video editing hardware uses Thunderbolt)
4. Clean low latency USB ports for DAW use, at least 2
5. At least 3 full speed M.2 ports that have dedicated lanes, no lane sharing with other slots.
6. Change SATA to SAS-3/SAS-4 with SATA backwards compatibility
7. ECC Memory

Remove all the junk: no RGB, no unnecessary plastic shrouds, no onboard audio, no overclocking features, etc.

That's a tall order. What would benefit from this? I rarely see workstations get taxed, and if they do, people usually offload the workload to another workstation instead of trying to spend so much money on one, especially at the speed technology changes. You would be better off accessing storage over Thunderbolt than dealing with SAS. Or use an LSI card and an external JBOD instead of hoping for connections internally. Or lessen your workstation specs and spend the money on a fast Storage Spaces server, which allows an easier upgrade path since you don't have to worry about storage internally.
 
That's a tall order. What would benefit from this? I rarely see workstations get taxed, and if they do, people usually offload the workload to another workstation instead of trying to spend so much money on one, especially at the speed technology changes. You would be better off accessing storage over Thunderbolt than dealing with SAS. Or use an LSI card and an external JBOD instead of hoping for connections internally. Or lessen your workstation specs and spend the money on a fast Storage Spaces server, which allows an easier upgrade path since you don't have to worry about storage internally.
Sounds like A/V production, with a large local working cache and fast Ethernet for site storage and backup.
 
I agree about the lack of workstation features on current motherboards.
My Checklist would be the following.

1. Dual 10GbE and/or SFP+ ports
2. OCuLink connectors for NVMe expansion, at least 4
3. Thunderbolt 3 connectivity (most new DAW and video editing hardware uses Thunderbolt)
4. Clean low latency USB ports for DAW use, at least 2
5. At least 3 full speed M.2 ports that have dedicated lanes, no lane sharing with other slots.
6. Change SATA to SAS-3/SAS-4 with SATA backwards compatibility
7. ECC Memory

Remove all the junk: no RGB, no unnecessary plastic shrouds, no onboard audio, no overclocking features, etc.

You would need more than 64 PCIe lanes to do all that without PLX, which traditionally meant going dual-socket (which in this case isn't a consideration and wouldn't make sense anyway). Having said this, you could easily pull that off with any Epyc-based solution.
 
A/V production is what I have in mind. I know a dual-CPU server motherboard would be best if I wanted all those specs, and it's a tall order. Unfortunately, most mobile racks are not deep enough for E-ATX rack chassis or cases. Some of the Threadripper motherboards are the perfect size but lack the necessary features without adding lots of expansion cards. What most people don't realize is that once you load up a bunch of expansion cards, you introduce latency and noise on the bus, injecting interference. This interference causes static in recordings or playback on DAW gear. This is why most of the industry uses Macs with Thunderbolt or even FireWire, but the lack of a modern Mac Pro is driving some of the industry to PCs. Plus, Apple is eliminating upgradeability, serviceability, and expandability without a mess of dongles.
For this type of work I use NVMe for the OS drive, scratch disk(s), and project drives, and a fast, large-capacity RAID for workstation storage. Offloading files over the network to a SAN/Avid via teamed 10GbE is a common thing in the industry. When a RED camera records at around 160 MB/s, that is about 240GB for around 25 minutes of footage, depending on the settings. With that in mind, think of the storage specs needed for just a commercial shoot using 2-3 cameras, then think of TV or film...
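
The footage math above checks out with a couple of lines; the 160 MB/s rate, 25-minute take, and 2-3 camera counts are the figures from the post, and everything else is straight arithmetic.

```python
# Quick check of the storage math: sustained camera data rate times record time,
# scaled for a multi-camera shoot. Figures are the ones quoted in the post.

MB_PER_SECOND = 160   # approximate RED camera data rate (from the post)
MINUTES = 25

per_camera_gb = MB_PER_SECOND * MINUTES * 60 / 1000   # ~240 GB, as stated
for cameras in (1, 2, 3):
    print(f"{cameras} camera(s), {MINUTES} min: ~{per_camera_gb * cameras:,.0f} GB")
```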
 
A single-socket Epyc system would be able to do everything you are wanting to do. Seeing how Epyc systems are almost exclusively OEM, you just need to find someone who sells one that will fit in your rack. Worst-case scenario, you buy a barebones Supermicro chassis and motherboard, then customize it the way you want.

And yes, the latency and corresponding noise is a real issue. I always turn off everything onboard that I am not using. That's one of the reasons I also use external DACs for sound, etc. But the lesson to be learned here is that there are current solutions to these problems. Your use case is the exact definition of an "edge case," though...
 
The cache will benefit it in some tasks, but in most, the comparatively low clocks will probably hurt it more than the cache helps.

Not sure about that; in many laptop CPUs, the difference between a dual-core i5 and a dual-core i7 is the cache, which does make a big difference.

Also, Wendell once said that what you can't have in clock speed you can make up for in cache.
 
A single-socket Epyc system would be able to do everything you are wanting to do. Seeing how Epyc systems are almost exclusively OEM, you just need to find someone who sells one that will fit in your rack. Worst-case scenario, you buy a barebones Supermicro chassis and motherboard, then customize it the way you want.

And yes, the latency and corresponding noise is a real issue. I always turn off everything onboard that I am not using. That's one of the reasons I also use external DACs for sound, etc. But the lesson to be learned here is that there are current solutions to these problems. Your use case is the exact definition of an "edge case," though...
I've been looking at the Supermicro H11SSL-NC; it ticks most of the boxes, it's just missing 10GbE and Thunderbolt. I hope to see some Thunderbolt cards that will work on any system soon, now that Intel has made it royalty-free.
 