X470 ECC Support

Discussion in 'Motherboards' started by drescherjm, May 6, 2018.

  1. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    In less than 2 weeks I am headed to MC to purchase an X470 / 2700 combo; however, I very much want ECC support for this (since it will be for a server application: zfs linux / pvr). I have spent a little time looking into this and found very few options. Of the Gigabyte boards only the X470-AORUS-GAMING-7-WIFI lists ECC support. The cheaper models say ECC memory operates in non-ECC mode.

    https://www.gigabyte.com/us/Motherboard/X470-AORUS-GAMING-7-WIFI-rev-10#sp

    I am having trouble finding info for ASUS.

    ASRock does mention ECC support.
     
  2. Aluminum

    Aluminum Gawd

    Messages:
    686
    Joined:
    Sep 18, 2015
    Good luck, none of this shit is well documented and you get endless finger pointing when you try to verify. It's all unofficial, hell it's hard enough just to verify it on threadripper which is official...maybe. Summit ridge worked on x370 on good boards, pinnacle ridge should work on good x470 boards; they really aren't much different. It does seem that raven ridge is fucked with no ecc possible though, probably something to do with gpu imc added (random amd reddit comments saying half-yes vs motherboard vendors saying no).

    The only true positive is ECC corrected errors logged by your OS. All the bit width, registry and other bullshit is unverifiable. It's also possible to have false negatives (no logging), and inducing errors without totally crashing or fucking up your system is a problem in itself.

    Still better than intel though, where you know the answer is "fuck you" + you just reminded them of another feature to remove/segment with the next generation.
     
  3. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
  4. Nobu

    Nobu 2[H]4U

    Messages:
    3,279
    Joined:
    Jun 7, 2007
    I think the only way to ensure you get ecc support without getting threadripper or epyc is to get a ryzen pro cpu (which only comes in made to order workstations as far as I can tell, at least for now).

    Don't know about support on x470, just that my gigabyte x370 aorus k5 only supports unbuffered ecc, and I don't know how well it's supported.
     
  5. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I expect that unbuffered ECC is the only hope for any zen based AMD system other than EPYC.
     
  6. Official Gigabyte

    Official Gigabyte Verified GBT Rep

    Messages:
    6
    Joined:
    Dec 19, 2017
    At least for us, ECC (unbuffered) support is based on PCB layers. You need 6-layer boards to support it, i.e. the Gaming 7. (Of course the CPU has to support it as well.)

    The full list for GIGABYTE is:

    X470 AORUS Gaming 7 WIFI
    X370 Gaming 5, Gaming K7, AB350N Gaming WIFI.
     
  7. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    Thanks, I really appreciate the official response.
     
  8. Peat Moss

    Peat Moss Limp Gawd

    Messages:
    308
    Joined:
    Oct 6, 2009
    What do you mean by, "All the bit width, registry and other bullshit is unverifiable."?

    Are there any systems that do this ?
     
  9. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I got my Ryzen 2700 and the ASUS X470 Prime PRO + 16 GB Crucial DDR4 2400 EUDIMM Kit (ECC Unbuffered). I put it together last night. In the BIOS / UEFI I did not see any mention of ECC. The specifications say ECC. I hope I was not lied to in the ASUS specifications. I did not have time to do any real testing since it was late.

    https://www.asus.com/us/Motherboards/PRIME-X470-PRO/specifications/
     
  10. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    After at least 1 day of kill-ryzen testing, I am pretty sure that my Ryzen 2700 does not have the linux / gcc bug discussed in this thread:

    https://community.amd.com/thread/215773

    Also I am very impressed with its energy efficiency. It's been running for hours on end compiling with all 16 threads at full load and it's staying around 60C with the stock heatsink at a low fan speed. I believe it is the ASUS normal profile I selected in the UEFI/BIOS.
     
    Last edited: May 27, 2018
  11. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    BTW, I believe ECC is enabled and working.

    Here is a little info on the CPU and OS / kernel.


    This tells me it's enabled.

    This tells me there have been 0 errors (from server experience, I expect ECC errors to be rare).
    This tells me the mode of error correction for the first rank
    Same goes for the 2nd rank
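
    For anyone who wants to script this kind of check, here is a minimal Python sketch. It assumes the kernel's EDAC driver is loaded and exposes the usual sysfs tree under /sys/devices/system/edac/mc; the function name `read_edac` and the report layout are made up for illustration and are not the commands used in the post above. It reads the corrected (CE) and uncorrected (UE) error totals per memory controller, plus the per-DIMM/rank correction mode where the kernel provides it.

```python
from pathlib import Path

EDAC_ROOT = Path("/sys/devices/system/edac/mc")  # standard Linux EDAC sysfs location


def read_edac(root=EDAC_ROOT):
    """Return {controller: {"ce": int, "ue": int, "modes": [str, ...]}}."""
    report = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        report[mc.name] = {
            # corrected (CE) and uncorrected (UE) error totals for this controller
            "ce": int((mc / "ce_count").read_text()),
            "ue": int((mc / "ue_count").read_text()),
            # per-DIMM/rank correction mode, e.g. "SECDED"; may be absent on some kernels
            "modes": [p.read_text().strip() for p in sorted(mc.glob("*/dimm_edac_mode"))],
        }
    return report


if __name__ == "__main__":
    report = read_edac()
    if not report:
        print("no EDAC memory controllers found (driver not loaded, or ECC not active)")
    for name, info in report.items():
        print(name, info)
```

    An empty report only means EDAC is not reporting anything; as noted earlier in the thread, that can be a false negative rather than proof that ECC is off.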

    Here is what SECDED means
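
    SECDED stands for "single error correction, double error detection": one flipped bit per word is repaired, two flipped bits are detected but cannot be repaired. The sketch below is a toy illustration only, using an extended Hamming (8,4) code on 4 data bits; ECC DIMMs protect 64 data bits with 8 check bits, and the function names here are invented for the example.

```python
DATA_POS = (3, 5, 6, 7)  # data bits; positions 1, 2, 4 hold Hamming parity, 0 holds overall parity


def encode(data):
    """Encode 4 data bits into an 8-bit SECDED codeword (list of 0/1)."""
    code = [0] * 8
    for pos, bit in zip(DATA_POS, data):
        code[pos] = bit
    for p in (1, 2, 4):
        # parity bit p covers every position whose index has bit p set
        code[p] = sum(code[i] for i in range(1, 8) if i & p and i != p) % 2
    code[0] = sum(code[1:]) % 2  # overall parity enables double-error detection
    return code


def decode(code):
    """Return (data_bits, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i  # XOR of set positions is 0 for a valid codeword
    overall = sum(code) % 2
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        code[syndrome] ^= 1  # single-bit error; syndrome 0 means the overall parity bit itself
        status = "corrected"
    else:
        return None, "uncorrectable"  # two bits flipped: detected but not fixable
    return [code[i] for i in DATA_POS], status
```

    A "corrected" result corresponds to what the OS logs as a corrected error (CE); "uncorrectable" corresponds to an uncorrected error (UE).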
     
    Last edited: Jun 4, 2018
  12. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I am thinking of trying to increase the memory frequency to see if I can trigger ECC corrections.

    Edit: I set it to 2666 to begin this (output is from lshw)

     
    Last edited: Jun 8, 2018
  13. Aluminum

    Aluminum Gawd

    Messages:
    686
    Joined:
    Sep 18, 2015
    That is one way, but in my experience the Zen IMC and/or AGESA memory trainer/timing boot check is usually the weaker link versus known good b-die. Don't forget to feed them decent voltage too.

    Another way is a hairdryer: start slowly at a low setting and far away.

    The foolproof way from ancient times is to tape over 1 r/w pin to force an error to be corrected every time but no idea if this is still a viable method with current ddr design. (if you can even boot this way past the first billion+ errors you're golden)
     
  14. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I actually thought about that.
     
    Last edited: Jun 4, 2018
  15. alienb

    alienb Gawd

    Messages:
    871
    Joined:
    Jun 10, 2004
    I have the ASROCK 470 ITX and it didn't work with the ECC 32gb registered DIMMs I have. (samsung chips)
     
  16. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I'm pretty sure you need unbuffered ECC.
     
  17. alienb

    alienb Gawd

    Messages:
    871
    Joined:
    Jun 10, 2004
    Yes I expected this. Just wanted to let someone else know if they were wondering. This stick did work inside of x99 and x299 Boards, though.
     
  18. vkfu

    vkfu Limp Gawd

    Messages:
    262
    Joined:
    Sep 20, 2009
    I have an Asus Prime X370 Pro running with 2 x 16GB DDR4-2400 Crucial ECC DIMMs and ECC is enabled. When I was shopping for my X470 motherboard, I considered the Asus X470 Prime Pro, but saw that its webpage didn't mention ECC and found no ECC DIMMs listed in its QVL. So I went with an ASRock X470 Taichi instead. I might have bought the X470 Prime Pro instead if I had known it would support ECC.
     
  19. osrk

    osrk [H]ard|Gawd

    Messages:
    1,879
    Joined:
    Jan 10, 2003
    That was a great response. I went with ASRock as ECC is listed in their specs specifically, but had I known this I would have considered Gigabyte too. Please communicate to Gigabyte engineering or marketing to put a little effort into this. With the lack of workstation boards from Supermicro or Tyan, people who want to use Ryzen for zfs or esxi workstations need to find boards with ECC. It will drive a few more board sales, I promise. My last two purchases of ryzen boards were driven by this feature, which a lot of boards likely already have.
     
  20. Jedibeeftrix

    Jedibeeftrix [H]Lite

    Messages:
    105
    Joined:
    Dec 1, 2016
    Do any B450 boards meet this 6-layer PCB requirement for unbuffered ECC support (running in ECC mode)?
     
  21. dexvx

    dexvx [H]ard|Gawd

    Messages:
    1,104
    Joined:
    Aug 14, 2002
    The ASRock B450M mentions ECC support as well.

    https://www.asrock.com/mb/AMD/B450M Pro4/index.us.asp

    - AMD Ryzen series CPUs (Pinnacle Ridge) support DDR4 3200+(OC) / 2933/2667/2400/2133 ECC & non-ECC, un-buffered memory*
    - AMD Ryzen series CPUs (Summit Ridge) support DDR4 3200+(OC) / 2933(OC) / 2667/2400/2133 ECC & non-ECC, un-buffered memory*

    May have to buy one to test it out. Need to put in a SAS HBA.
     
  22. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Jun 13, 2003
    Question here: will Ryzen Pro CPUs be needed for full ECC support, and would these boards support that?

    Also be interested if they could get that support down to ITX... :D
     
  23. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I think the biggest need is BIOS / motherboard support.
     
  24. dexvx

    dexvx [H]ard|Gawd

    Messages:
    1,104
    Joined:
    Aug 14, 2002
    Ryzen Pro is only needed for ECC on the APUs. The non-APUs should all have ECC support baked in.

    I was thinking about a B350/B450 for a transcoding storage server, but then I realized there's no display if you use Pinnacle/Summit Ridge. And also no IPMI.

    So it looks like AMD still has a ways to go for the 1S server market. Basically it seems like it's Epyc or nothing.
     
  25. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Jun 13, 2003
    Not that I'm disagreeing with your point, but AMD does have Ryzen Pro SKU equivalents for all consumer Ryzen CPUs, not just the APUs, which is the source of my question.

    I'm also poking at the idea that there's a difference between 'officially supported' and 'it probably works/AMD didn't try to break it'. And I would certainly be looking at the APUs!
     
  26. dexvx

    dexvx [H]ard|Gawd

    Messages:
    1,104
    Joined:
    Aug 14, 2002
    Didn't even realize AMD made a Ryzen Pro based on Summit Ridge. But those look MIA, can't find stock of them anywhere.

    I think the only difference between officially supported and 'AMD didn't disable it' is that if you are in need of support, then you're SOL. As people have mentioned before, actually forcing an ECC event and seeing how it is handled is the only way to actually verify it works.
     
  27. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Jun 13, 2003
    Then that's the case for all of them, really. They've been announced, talked about for a year, and they're really just Xeon'd-up consumer parts, so if they are a 'thing' we should be able to buy them, right?

    And having 'official' as in 'you know it's working' support for ECC is kind of a big deal for the stuff that more or less needs it the most, like file servers. The 'Pro' APUs are especially interesting because they would support a convergence of strong dependability for NAS while also providing performance for VM stuff and the ability to pass real GPU compute resources to said VMs or be used for say media transcoding.

    Of course, we'd also want a Supermicro-style board that provides 10Gbit for the NAS channel, a pair of 1Gbit ports for WAN/LAN (firewall/IDS/IPS and routing if you're brave) and a 1Gbit port for IPMI, along with an eight-channel SAS controller, all on mITX so it could be used as a converged workcenter appliance :D.
     
  28. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I have not had a single recorded ECC correction (not really expected at stock but possible if overclocking the ram). Although I only pushed it up to DDR4 2700 and the system has been off for most of the last few weeks due to an incompatibility (or other issue) with a G1 WDC Black 512GB NVMe drive. I just don't have time to debug at the moment.
     
    Last edited: Aug 2, 2018
  29. vkfu

    vkfu Limp Gawd

    Messages:
    262
    Joined:
    Sep 20, 2009
    I've actually had some issues with the G1 WDC Black 512GB PCIe drive as well. This is the first time I have seen anyone else report a similar experience.

    I have two and was able to confirm it was a device issue because they both would fall off the bus in AM4 motherboards but were fine in an Intel C236 motherboard. I found that lower quality PSUs without DC-DC regulation (Corsair CX430, CX430M) were more problematic but I still saw this issue on an ASRock Taichi X470 with better PSUs (Corsair RM550x, EVGA 550 G3). I installed a G2 WDC Black 500GB NVMe in the same system and had no problems. I also found a FW update for the G1 WD Black but haven't tested the drive since applying the update.
     
  30. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I have a reasonably high end 1000W modular PS on that system. The problem seems to have become less frequent (less than 1 time per week of 24/7 uptime); however, that is still no good for what I want to do with the system.
     
  31. vkfu

    vkfu Limp Gawd

    Messages:
    262
    Joined:
    Sep 20, 2009
    Did you try the updated firmware? Mine shipped with B35500WD but the latest is B35900WD

    Code:
    root@nanoluteus:~/server-status# nvme fw-log /dev/nvme0n1
    Firmware Log for device:nvme0n1
    afi  : 0x1
    frs1 : 0x4457303035353342 (B35500WD)
    
    I have also found that setting the BIOS "Power Supply Idle Control" option to "Typical Current Idle" helped with some idle lockup issues I experienced (see https://bugzilla.kernel.org/show_bug.cgi?id=196683).
     
  32. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    Thanks. I was going to move it to a Windows box to update the firmware. It looks like I have the latest, so no need for that. I have not updated the BIOS/UEFI on my X470 board, however.

    Code:
    jmd1 ~ # nvme fw-log /dev/nvme0n1
    Firmware Log for device:nvme0n1
    afi  : 0x1
    frs1 : 0x4457303039353342 (B35900WD)
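
    As an aside on reading this output: the frs1 value is the 8-character firmware revision string packed into a 64-bit integer, with the first character in the lowest byte. A small Python sketch (the helper name `decode_frs` is made up here) shows how the two hex values in this thread map to the revisions printed next to them:

```python
def decode_frs(value):
    """Decode a firmware-slot value printed as a 64-bit hex integer into ASCII."""
    raw = value.to_bytes(8, "little")  # first character sits in the lowest byte
    return raw.decode("ascii").rstrip("\x00 ")


print(decode_frs(0x4457303035353342))  # B35500WD (the older firmware)
print(decode_frs(0x4457303039353342))  # B35900WD (the updated firmware)
```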
    
     
  33. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    Since you mentioned Linux: the reduction in frequency seemed to be related to a kernel update. I noticed that there were some nvme updates a few revisions back, somewhere around 4.17.12.
     
  34. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I just finished reading the entire thread. I am not sure that this issue is what I have.

    What happens to me is that after some period of time the system hard locks up. After pressing the reset button, sometimes (not always) the nvme drive is not detected. When the nvme drive disappears, I have to first power off the machine, then boot from a sysrescue usb stick, reinstall grub on the nvme drive, and reboot.

    There is one other weird symptom that has happened 2 times. If I was logged in via ssh (GUI already locked up) when this happened, htop showed 1 thread using an entire core at 100% in a kernel task. Several tasks were unkillable; I could still read and write to the nvme drive, but eventually the ssh session would disconnect.

    Before the kernel update (that I mention in the previous post) I was able to trigger this behavior by rebuilding a lot of large packages in gentoo (like chromium).


    BTW, the nvme drive has ZOL installed and always scrubs without errors, even after it disappears from the system.
     
  35. Wingman_ice

    Wingman_ice n00b

    Messages:
    15
    Joined:
    Jul 18, 2018
    I had a compatibility problem with a FusionIO IODrive2 1.2TB card. I couldn't boot with it in the 3rd 16x PCIe slot, or in the 2nd 16x PCIe slot in Gen3 mode. It only works with the PCIe mode set to Gen2 for the first 2 PCIe slots. I imagine it didn't work in the 3rd slot due to that card needing at least 8x bandwidth. My Fusion IODrive Duo works just fine in the 3rd 16x slot. This was on the Gigabyte Aorus Gaming 7 Wifi MB.
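
    To put rough numbers on the lane-count guess, here is a small Python sketch of theoretical PCIe link bandwidth. The transfer rates and encodings (Gen2: 5 GT/s with 8b/10b, Gen3: 8 GT/s with 128b/130b) are the standard figures; treating the 3rd slot as an x4 link is purely an assumption for the comparison, since the thread does not state how that slot is wired.

```python
# (transfer rate in GT/s, encoding efficiency) per PCIe generation
PCIE_GEN = {2: (5.0, 8 / 10), 3: (8.0, 128 / 130)}


def link_gbytes_per_s(gen, lanes):
    """Theoretical one-direction payload bandwidth of a PCIe link in GB/s."""
    rate, efficiency = PCIE_GEN[gen]
    return rate * efficiency * lanes / 8  # bits -> bytes


for gen, lanes in [(2, 4), (2, 8), (3, 8)]:
    print(f"Gen{gen} x{lanes}: {link_gbytes_per_s(gen, lanes):.2f} GB/s")
```

    Under that assumption, a Gen2 x4 link offers half the bandwidth of the Gen2 x8 link the card is reported to work on. Whether the boot failure is really bandwidth related, rather than a link training or option ROM issue, is not something this arithmetic can show.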
     
  36. vkfu

    vkfu Limp Gawd

    Messages:
    262
    Joined:
    Sep 20, 2009
    I had the 2017 WD Black 512GB become inaccessible and be undetected by the BIOS even after I pressed the reset button. I would have to power cycle the machine to see the drive again. That matches your experience, although it would boot up again from the drive without any problems.

    I was hoping that the FW update would fix the issue. Oh well. All of my testing was with Ubuntu 16.04 and 18.04. So my kernels were considerably older than yours.

    I considered requesting support from WDC but it's been easier to just get new drives. In addition to the 2018 WD Black, I've also used ADATA NVME drives with Realtek controllers and they've all been trouble free.

    I'm about to give the 2017 WD Black with updated FW another try in a new system and will report what happens.
     
  37. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I just purchased a 1TB 960 EVO (on sale for $248 + tax as an Amazon lightning deal). Hopefully that fixes the issue.
     
  38. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I have got the 960 EVO installed / system cloned. It may be a placebo effect; however, it seems significantly faster in STR.

    One thing I did not mention is that since I am using zfs (on linux) for my root filesystem, I don't have TRIM (yet; it's under development). I am not sure if that caused any problems with the WDC Black; however, I had less than 100GB used of a 5XX GB G1 Black, and now have the same amount used on a 1TB 960 EVO.
     
    Last edited: Sep 1, 2018
  39. IdiotInCharge

    IdiotInCharge [H]ardForum Junkie

    Messages:
    12,028
    Joined:
    Jun 13, 2003
    If you're not using more than one drive, why use ZFS with its known limitations? It's a brilliant filesystem, but what does it do for you that say EXT4 wouldn't?
     
  40. drescherjm

    drescherjm [H]ardForum Junkie

    Messages:
    14,527
    Joined:
    Nov 19, 2008
    I have zfs on several single drive installations (home and work). I make good use of the other features like snapshots and send / receive. The lack of TRIM has not caused an issue except for possibly the WDC G1 Black. My 10+ year old core2quad system (which the ryzen is supposed to replace) has a 1TB Sandisk ssd single disk zfs root for probably 2 years now.