brutalizer ([H]ard|Gawd) - Joined Oct 23, 2010 - 1,602 messages
Also, your link isn't loading.
Btrfs RAID 5/6 Code Found To Be Very Unsafe & Will Likely Require A Rewrite
Written by Michael Larabel in Linux Storage on 5 August 2016 at 11:00 AM EDT. 111 Comments
It turns out the RAID5 and RAID6 code for the Btrfs file-system's built-in RAID support is faulty, and users who care about their data should not be making use of it.
There has been a mailing list thread since the end of July about Btrfs scrub recalculating the wrong parity in RAID5. The wrong parity and the resulting unrecoverable errors have been confirmed by multiple parties. The Btrfs RAID 5/6 code has been described as fatally flawed -- "more or less fatally flawed, and a full scrap and rewrite to an entirely different raid56 mode on-disk format may be necessary to fix it. And what's even clearer is that people /really/ shouldn't be using raid56 mode for anything but testing with throw-away data, at this point. Anything else is simply irresponsible."
So hopefully you aren't making use of any Btrfs RAID 5/6 support, as it turns out to be in very bad shape and may even be ifdef'ed out of the mkfs code. Unfortunately, it could take some time to fix, especially with the potential for an on-disk format change being necessary to address the problem.
Coincidentally, I'm in the middle of some Btrfs RAID tests right now but will now be limited to 0/1/10 for the four SSDs.
RAID56 - btrfs Wiki
Status
The parity RAID code has multiple serious data-loss bugs in it. It should not be used for anything other than testing purposes.
From 3.19, the recovery and rebuild code was integrated. The one missing piece, from a reliability point of view, is that it is still vulnerable to the parity RAID "write hole", where a partial write as a result of a power failure will result in inconsistent parity data.
- Parity may be inconsistent after a crash (the "write hole")
- Parity data is not checksummed
- No support for discard? (possibly -- needs confirmation with cmason)
- The algorithm uses as many devices as are available: No support for a fixed-width stripe (see note, below)
The first two of these problems mean that the parity RAID code is not suitable for any system which might encounter unplanned shutdowns (power failure, kernel lock-up), and it should not be considered production-ready.
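To see why the write hole is so nasty, here is a toy Python sketch of RAID5 XOR parity. It is an illustration of the general parity-RAID problem, not of btrfs's actual on-disk layout: if power fails after a data block is rewritten but before the matching parity block is updated, a later rebuild using that stale parity silently reconstructs wrong data.

```python
def parity(blocks):
    """XOR parity over a stripe of equal-sized data blocks."""
    out = bytearray(len(blocks[0]))
    for b in blocks:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

def reconstruct(surviving, par):
    """Rebuild the one missing block from surviving blocks + parity."""
    out = bytearray(par)
    for b in surviving:
        for i, byte in enumerate(b):
            out[i] ^= byte
    return bytes(out)

# A healthy 3-disk stripe: two data blocks plus parity.
d0, d1 = b"AAAA", b"BBBB"
p = parity([d0, d1])
assert reconstruct([d1], p) == d0   # rebuild of a lost disk works

# The write hole: power fails after d0 is rewritten on disk,
# but before the parity block is updated to match it.
d0_new = b"CCCC"
stale_p = p                          # parity still describes the old d0

# Now lose the disk holding d1 and rebuild it from what's on disk:
bad_d1 = reconstruct([d0_new], stale_p)
assert bad_d1 != d1                  # reconstructed data is silently wrong
```

Because parity itself carries no checksum (the second bullet above), nothing flags `bad_d1` as garbage; the rebuild "succeeds" and hands back corrupt data.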
If you'd like to learn btrfs raid5/6 and rebuilds by example (based on kernel 3.14), you can look at Marc MERLIN's page about btrfs raid 5/6.
Note
Using as many devices as are available means that there will be a performance issue for filesystems with large numbers of devices. It also means that filesystems with different-sized devices will end up with differing-width stripes as the filesystem fills up, and some space may be wasted when the smaller devices are full.
Both of these issues could be addressed by specifying a fixed-width stripe, always running over exactly the same number of devices. This capability is not yet implemented, though.
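The mixed-device-size effect is easy to model. The sketch below is a deliberately simplified greedy allocator (the real btrfs chunk allocator is more involved): each stripe spans every device that still has free space, so stripes narrow as smaller devices fill, and whatever is left when fewer than two devices remain is unusable for parity RAID.

```python
def allocate_stripes(device_sizes):
    """Toy model: write full-width stripes across all devices with
    free space, recording each stripe's width (in devices).
    Sizes are in arbitrary chunk units."""
    free = list(device_sizes)
    widths = []
    while sum(1 for f in free if f > 0) >= 2:  # RAID5 needs >= 2 devices
        width = 0
        for i, f in enumerate(free):
            if f > 0:
                free[i] -= 1
                width += 1
        widths.append(width)
    wasted = sum(free)  # space stranded once fewer than 2 devices remain
    return widths, wasted

# Three devices of 4, 2, and 2 units: the first stripes are 3 wide,
# then the two small devices fill and 2 units on the big one are wasted.
widths, wasted = allocate_stripes([4, 2, 2])
# widths == [3, 3], wasted == 2
```

With a fixed-width stripe the allocator would instead always pick the same number of devices per stripe, which is exactly the not-yet-implemented capability the wiki note describes.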
I'm not surprised; the btrfs raid 6 on my test server has failed every time I've tried to test rebuild scenarios over the last few years. That said, the wiki warning was only added fairly recently, and I know people who have lost data to these bugs. Also, it should be mentioned that the devs themselves point out that their raid implementation is currently broken, so this really shouldn't come as a surprise.
Yup, I used to work with him at Fusion-io before he went to FB a few years ago... BTW, FB has hired Chris Mason, the author of Btrfs, and that is why they use Btrfs.