ZFS server, non-ECC to ECC

xXaNaXx

If I were to set up a ZFS server temporarily with non-ECC memory, then take that out and put in ECC memory when I can afford it, will there be any issues, or will it pretty much be like swapping any other RAM?
 
If your processor supports both ECC and non-ECC, then this should work without issue. Be very careful about buying registered RAM, though. I do not believe AM3+, LGA1150, LGA1155 or LGA1156 support registered (REG) ECC at all.
 
ECC memory itself does not increase the cost that much. What makes it more expensive is the other components it requires, i.e. the motherboard and CPU.
 

Some AM3+ motherboards have hidden ECC support that the vendor's tech support does not officially back.
ASUS has some AM3+ boards with ECC support (backed by tech support, too).

They take unbuffered ECC DDR3 RAM.
 
I ordered an ASUS SABERTOOTH 990FX R2.0 mobo and an AMD FX-6300 Vishera 3.5GHz (6-core) CPU.

I know these both support ECC RAM. I just didn't know whether any reconfiguration would be needed when I switch from non-ECC to ECC (unregistered), or if it was pretty much plug-n-play.

As previously posted:

Your motherboard will recognize both ECC and non-ECC RAM.
When it recognizes ECC, the ECC configuration menu shows up.
With non-ECC, the ECC configuration menu is hidden automatically :D
You cannot mix ECC and non-ECC; it will not boot (on my ASUS motherboard).
 
A quick, easy question: why MUST a NAS have ECC? I built a similar NAS without ECC and it has run 24/7 for months without any issue.
 
Silent corruption. Imagine the following scenario: You copy a file to your NAS. When it hits your memory, one of the 0's turns into a 1 due to... I don't know. Solar radiation, your wife running the microwave, static electricity in the air, something. The file completes the copy to the hard drive, and you're none the wiser that it did not copy perfectly... Until you try to use the file, and get a CRC error.

It's incredibly unlikely, but possible. Non-ECC RAM works just fine 99.9% of the time. Folks who buy ECC are preparing for that 0.1%.
 
Silent corruption. Imagine the following scenario: You copy a file to your NAS. When it hits your memory, one of the 0's turns into a 1 ... The file completes the copy to the hard drive, and you're none the wiser that it did not copy perfectly... Until you try to use the file, and get a CRC error.
Good explanation ... but you might never get any CRC error indication (from ZFS) if that bit glitch occurred before ZFS started protecting the data (i.e., over the network, or on its initial write to the ZFS host's OS buffer). I believe ZFS has some copy tools to help eliminate this possibility.
It's incredibly unlikely, but possible. Non-ECC RAM works just fine 99.9% of the time. Folks who buy ECC are preparing for that 0.1%.
Yup, Murphy is a sneaky b*st*rd.
 
Good explanation ... but you might never get any CRC error indication (from ZFS) if that bit glitch occurred before ZFS started protecting the data (i.e., over the network, or on its initial write to the ZFS host's OS buffer). I believe ZFS has some copy tools to help eliminate this possibility.
Correct. If ZFS is handed corrupted data, it will not repair it. It will store the corrupted data. This is the reason people recommend ECC RAM: ZFS does not protect against RAM corruption. RAM protection is not ZFS's responsibility.
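
To make that concrete, here is a minimal Python sketch of the two cases. It is not how ZFS is implemented: `store` and `verify` are hypothetical helpers, and SHA-256 from the standard library stands in for whatever checksum the pool actually uses. The only point is that a checksum computed after the bit flip happily validates the damaged data.

```python
import hashlib
from typing import Tuple


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def store(block: bytes) -> Tuple[bytes, str]:
    """Hypothetical checksumming write: keep the block plus its digest."""
    return block, hashlib.sha256(block).hexdigest()


def verify(block: bytes, digest: str) -> bool:
    """Hypothetical checksumming read: does the block still match its digest?"""
    return hashlib.sha256(block).hexdigest() == digest


original = b"important family photo"

# Case 1: the bit flips in RAM before the checksum is computed.
# The filesystem checksums the already-damaged block, so it verifies fine.
damaged_in_ram = flip_bit(original, 3)
stored, digest = store(damaged_in_ram)
undetected = verify(stored, digest)

# Case 2: the bit flips on disk after the checksum was computed.
# The stored digest no longer matches, so the damage is noticed.
stored, digest = store(original)
rotted_on_disk = flip_bit(stored, 3)
detected = not verify(rotted_on_disk, digest)

print("RAM flip goes unnoticed:", undetected)
print("Disk flip is caught:", detected)
```

Case 1 is the one ECC RAM is meant to cover; case 2 is the one ZFS already covers.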

Actually, as ZFS does not protect against bit flips in the CPU either, you should really use a CPU with good RAS (reliability, availability, serviceability) too, such as an IBM mainframe CPU or a high-end SPARC or POWER CPU. For instance, mainframe and some SPARC CPUs can roll back and replay instructions if there was an error. So they have good RAS.

ZFS fixes data corruption on disks.
ECC fixes data corruption in RAM.
A CPU with good RAS fixes data corruption in the CPU.
And then you should fix the rest of the PC too: the network card, etc.
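
For anyone wondering how ECC can fix a flipped bit rather than just notice it, here is a toy sketch in Python using a Hamming(7,4) code. This is an illustration only: real ECC DIMMs do this in hardware on wider words, and the function names here are made up for the example. Three parity bits protect four data bits, and the parity checks on read point at the position of a single flipped bit.

```python
def encode(data_bits):
    """Encode 4 data bits into a 7-bit Hamming codeword.

    Layout by position 1..7: p1, p2, d1, p3, d2, d3, d4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def decode(codeword):
    """Correct at most one flipped bit; return (data_bits, error_position).

    error_position is 0 when no error was seen, else the 1-based position.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # the syndrome names the bad position
    if position:
        c[position - 1] ^= 1  # flip it back
    return [c[2], c[4], c[5], c[6]], position


word = encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a single bit flip in "RAM" (position 5)
recovered, where = decode(word)
print("recovered data:", recovered, "fixed position:", where)
```

Two flips in the same word would fool this toy code, which is why it only illustrates the single-bit case.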

What makes a computer expensive is RAS. Performance is cheap to get, but reliability is very expensive. That is the one big reason mainframes, high-end SPARC/POWER servers, etc. are expensive.

In fact, Intel wanted to add a cosmic radiation detector to CPUs: if the detector noticed heavy radiation, the CPU would replay all instructions. So CPUs are not safe against data corruption either. You would need a CPU with good RAS for this.
 
It's incredibly unlikely, but possible. Non-ECC RAM works just fine 99.9% of the time. Folks who buy ECC are preparing for that 0.1%.

The problem with errors by chance:
In former times your RAM was small and slow. There were errors, but only one every few months or once a year.

Now you have a lot of very fast RAM, with the chance of an error every few weeks or months. The problem grows with the amount of RAM.

This is similar to disks and silent errors.
They are not a problem with small disks, but they are with large storage arrays.
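
A rough way to see the scaling, as a Python sketch. Big caveat: the error rate below is a made-up placeholder, not a measured figure, and the model assumes errors arrive independently at a constant rate per GiB (a Poisson assumption). Under those assumptions the expected error count grows linearly with RAM size, and the chance of seeing at least one error climbs toward certainty.

```python
import math

# Placeholder only: errors per GiB per year. NOT a measured value.
HYPOTHETICAL_RATE = 0.05


def expected_errors(ram_gib, years, errors_per_gib_year):
    """Expected error count, assuming a constant rate per GiB."""
    return ram_gib * years * errors_per_gib_year


def prob_at_least_one(ram_gib, years, errors_per_gib_year):
    """P(at least one error), assuming independent (Poisson) arrivals."""
    lam = expected_errors(ram_gib, years, errors_per_gib_year)
    return 1.0 - math.exp(-lam)


for ram_gib in (1, 8, 32, 128):
    lam = expected_errors(ram_gib, 1, HYPOTHETICAL_RATE)
    p = prob_at_least_one(ram_gib, 1, HYPOTHETICAL_RATE)
    print(f"{ram_gib:>4} GiB: expected {lam:.2f} errors/year, "
          f"P(at least one) = {p:.1%}")
```

Plug in whatever rate you trust; the shape of the curve is the point, not the numbers.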
 
So CPUs are not safe against data corruption either. You would need a CPU with good RAS for this.
For those curious, the current Intel CPUs that offer RAS protection are the E7 line. The E5s and E3s, sadly, do not. It's really amazing that we're operating at a level where alpha particles and cosmic rays are something we need to worry about.
 