ZFS on Linux vs. MDADM ext4

bexamous

[H]ard|Gawd
Joined
Dec 12, 2005
Messages
1,670
Run xfs_fsr; it's often not installed by default, so look for the package that provides it (something like xfs-tools). Just set it to run every so often; if no defragging is needed it exits pretty quickly. It's probably debatable how useful it actually is, but there's no downside to running it.
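For reference, a scheduled run can be as simple as a cron entry like the one below (the path, schedule, and timeout are illustrative, not from the post; on many distros the binary ships in xfsprogs):

```shell
# /etc/cron.d/xfs_fsr -- hypothetical entry; adjust schedule and path for your distro:
#   0 3 * * * root /usr/sbin/xfs_fsr -t 7200
# -t takes seconds, so this reorganizes mounted XFS filesystems nightly for at most 2 hours
```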
 

kac77

2[H]4U
Joined
Dec 13, 2008
Messages
2,892
At work (medical imaging research) I am just about to begin slowly migrating my 40TB+ of data to that (btrfs on mdadm raid 6) after about 1.5 to 2 years of testing with a few TB of throwaway data, but with a similar usage pattern to the real data. My first array with real data will be just 6 x 2TB drives in mdadm raid6 with btrfs as the filesystem. I actually put the drives in the server today, but they must all pass a 4-pass badblocks read/write test before being put into use. With that said, I will back up this data, just like all other data in my raid arrays, to my 2-drive LTO2 autochanger.
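For anyone wanting to replicate that burn-in, the test described is roughly the following (the device name is a placeholder; -w is destructive and wipes the drive, so never run it on a disk holding data):

```shell
# DESTRUCTIVE sketch: badblocks -w writes four patterns (0xaa, 0x55, 0xff, 0x00) across
# the whole drive and reads each one back, i.e. the "4 pass" read/write test;
# -s shows progress, -v prints errors as they are found:
#   badblocks -wsv /dev/sdX
```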

LTO2? Sheesh. Either way, with at least a backup plan you can afford to push the envelope and try new things. There's nothing wrong with that. It's those who don't who should steer clear of bleeding-edge filesystems and software in general.
 

drescherjm

[H]F Junkie
Joined
Nov 19, 2008
Messages
14,937
LTO2? Sheesh.

$4000 to $5000 to replace the archive probably is not money best spent. I mean, I can buy quite a few $25 tapes for that, and it's not like backup speed is holding me back, or tape swapping either, since the autochanger holds 24 at a time. Most of our data comes in at a steady rate, gets backed up then, and never changes; it just gets added to. I'm not saying I haven't considered replacing it.
 
Last edited:

kac77

2[H]4U
Joined
Dec 13, 2008
Messages
2,892
$4000 to $5000 to replace the archive probably is not money best spent. I mean, I can buy quite a few $25 tapes for that, and it's not like backup speed is holding me back, or tape swapping either, since the autochanger holds 24 at a time. Most of our data comes in at a steady rate, gets backed up then, and never changes; it just gets added to. I'm not saying I haven't considered replacing it.

:) I know, I just haven't seen LTO2 in a minute.
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
Think it matters what you consider safe. Silent data corruption on mdadm + ext4 can go unnoticed for a long time, long enough that the corrupted files can make their way into backups and you lose any good copy. I worry about that more than anything. ZoL has been just fine for me... I don't use dedupe or anything fancy... I just want checksummed data + snapshots + scrubbing.

I'd rather my ZoL file server blow up, forcing me to start from scratch and restore from backups, than have mdadm/ext4 not catch my data being corrupted and lose it.

I still deploy mdadm+ext4 without data corruption :), everything is perfect..
Anything can get corrupted, even on ZFS; ZFS just adds an extra layer to minimize corruption.



Actually, snapshots do work on ZoL, with some caution :)...

Oh yeah, I never used reiserfs :p... I used to use ext3.

This is the reason, in my experience, that the safe way is mdadm+ext4 in production.

Data corruption can take many forms...

I use ZoL on a backup server to support ZFS on Linux...

Honestly, I don't think too much about "silent data corruption", since corruption can happen at any level...
You need to use a good server motherboard, network cards, ECC RAM, and so on. <-- please don't be cheap :p..
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
Well, the data being stored should be irrelevant. I could be storing randomly generated data that is totally worthless; it doesn't change how well the storage/backup system prevents data loss. I'm just arguing that ZoL + good backups is safer than mdadm + ext4 + good backups. I see the most likely source of data loss as silent data corruption that goes unnoticed and makes its way onto backups. Anything obvious you can easily restore from backups.

I only focused on this because the person I initially replied to said "the safest solution is mdadm, very mature and stable."

I don't think that is true. I've lost data using mdadm + reiserfs... backups would have let me restore it all, but I came very close to the scenario I describe, where data is corrupted, not noticed, and makes it to backups. In my case a memory DIMM started to fail; the system hung and was rebooted.. on startup fsck ran and thought the filesystem had errors... I said yes to correct them and it kept finding more and more. In the end I figured out one memory DIMM had failed and was causing fsck to think the filesystem was corrupted when it was not, and when it 'fixed' these errors it was just going through corrupting data. After replacing the DIMM the filesystem could still be mounted, but tons of files were fucked up. Had this DIMM failure not been so severe, it could have resulted in silent data corruption that made it unnoticed into the backups.

Anyway, this is just one reason why I think ZoL is better than ext4+mdadm. My file server also hosts VM images, and using ZFS snapshots is so much nicer than any alternative... eg compared against ext4+lvm+mdadm. Or just how nice it is that scrubbing the array means scrubbing only the data and not unused space, unlike with mdadm.

I guess the one advantage of mdadm is expanding arrays by 1 disk. That is the only thing I wish ZFS could do that mdadm can.

Anyway, for your question: I'm not 'that' concerned about anything getting corrupted, I just see no reason not to protect against it. Most of my file server is stuff I don't really care about, tv shows and movies... a small amount is VM images that I use... and then backups of 'my documents' and stuff that I care about, and then probably the most important stuff is pictures/videos I've taken. I'm not the greatest photographer, but I've spent lots of time taking pictures and post-processing them. The final images can be smaller, but at some intermediate steps the space is a bit larger; the initial raw files from a 5D2 are like 20MB or so? I dunno. I'd be pretty disappointed if I lost this type of stuff. I keep backups of everything because why not, but the smaller amount of stuff I really care about I'll keep two backups of. It's pretty unlikely I'll lose it... except for that one situation where silent data corruption ends up in the backups. It's so easy to guard against, just use a filesystem with checksums + scrubbing; I see no reason not to.
mdadm is just a layer below the filesystem (reiserfs, ext4, xfs, and others).

I stayed away from reiserfs.
Corruption can happen in many ways, and I don't worry much about "silent data corruption". You need good hardware!

When you are in a production environment, you have to pick whatever is best for your work.
Some random data corruption situations that happened to me -> a bad NIC, and a switch (unmanaged, for example).

ZoL is not ready for a production environment; it still has some hidden holes.
 

bexamous

[H]ard|Gawd
Joined
Dec 12, 2005
Messages
1,670
At work, in the lab I manage, we use Solaris w/ZFS on the servers, but they're all mirrored with backup systems running RHEL6 w/ZoL. Every 5 minutes the servers sync using snapshots + zfs send/receive. I've not had any problems.

And silent data corruption can happen for many reasons, but the point is you can catch it if your filesystem checksums data. At a previous job we were testing some new servers, a well-known brand, but ran into a problem. Some stress tests failed... we narrowed it down to simply reading from the HDD long enough. Basically, in a loop:
while true ; do
    echo 3 > /proc/sys/vm/drop_caches
    md5sum somebigfile
done

Let this run long enough and the md5sum would sometimes change. Nothing was detecting any errors. We could repro this on any of the servers, using any HDD, even plugged directly into the board with any SATA cable. The motherboard manufacturer ended up admitting the problem was theirs... I don't remember exactly what it was, but the resolution was that they paid for new add-in controllers for all our servers. Thing is, I'm sure we were not the only company to buy these servers, and I'm sure they didn't contact every customer and go "Oh, so hey, that server you bought is defective." Shady stuff. This was back when SATA was still pretty new, so it's been a while.
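The 5-minute snapshot sync described at the top of this post would look something like the sketch below (pool, dataset, and host names are made up; a real script would also have to prune old snapshots and handle a failed send):

```shell
# Hypothetical incremental replication step, run from cron every 5 minutes
PREV=$(cat /var/run/zfs-last-sync)        # snapshot name recorded by the previous run
NOW="sync-$(date +%Y%m%d%H%M%S)"
zfs snapshot tank/data@"$NOW"
zfs send -i "@$PREV" tank/data@"$NOW" | ssh backuphost zfs receive backup/data \
    && echo "$NOW" > /var/run/zfs-last-sync
```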
 

Red Falcon

[H]F Junkie
Joined
May 7, 2007
Messages
11,804
Whichever system your software RAID (mdadm) or ZFS will be running on, be sure to use ECC memory if possible, since these storage methods use system memory for their cache.

One can obviously use non-ECC and it will work, but there will be a higher potential for data corruption.
 

drescherjm

[H]F Junkie
Joined
Nov 19, 2008
Messages
14,937
Whichever system your software RAID (mdadm) or ZFS will be running on, be sure to use ECC memory if possible, since these storage methods use system memory for their cache.

One can obviously use non-ECC and it will work, but there will be a higher potential for data corruption.

At work I do that when possible (my last 2 server purchases were Xeons), although I don't worry so much about it. I mean, having servers set to log every single ECC correction in the machine-check log, and not seeing any corrections in years of operation, makes me believe this is quite a rare event even at current RAM density.

Although I do remember that from 2006 to 2008, at home, I had a dual-processor dual-core Opteron system that logged about one single-bit ECC correction per month.
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
At work, in the lab I manage, we use Solaris w/ZFS on the servers, but they're all mirrored with backup systems running RHEL6 w/ZoL. Every 5 minutes the servers sync using snapshots + zfs send/receive. I've not had any problems.

And silent data corruption can happen for many reasons, but the point is you can catch it if your filesystem checksums data. At a previous job we were testing some new servers, a well-known brand, but ran into a problem. Some stress tests failed... we narrowed it down to simply reading from the HDD long enough. Basically, in a loop:
while true ; do
    echo 3 > /proc/sys/vm/drop_caches
    md5sum somebigfile
done

Let this run long enough and the md5sum would sometimes change. Nothing was detecting any errors. We could repro this on any of the servers, using any HDD, even plugged directly into the board with any SATA cable. The motherboard manufacturer ended up admitting the problem was theirs... I don't remember exactly what it was, but the resolution was that they paid for new add-in controllers for all our servers. Thing is, I'm sure we were not the only company to buy these servers, and I'm sure they didn't contact every customer and go "Oh, so hey, that server you bought is defective." Shady stuff. This was back when SATA was still pretty new, so it's been a while.

Actually, md5sum cannot detect some corruptions.
Years back I read a published research paper on md5(sum) and sha(sum) claiming that md5 detects fewer errors/corruptions than sha1/sha2.

That is the reason (my own conclusion) I can't fully trust md5(sum).

As I said, get good hardware!! :). You can't bargain in a production environment by being cheap...
So far, I have never had corrupted files on mdadm+ext4 (was ext3).
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
At work I do that when possible (my last 2 server purchases were Xeons), although I don't worry so much about it. I mean, having servers set to log every single ECC correction in the machine-check log, and not seeing any corrections in years of operation, makes me believe this is quite a rare event even at current RAM density.

Although I do remember that from 2006 to 2008, at home, I had a dual-processor dual-core Opteron system that logged about one single-bit ECC correction per month.

You would see it over the long run, over years... ECC errors/corrections get logged in the system/OS log.
I wouldn't worry about a small ECC error pointing at a random location :D.
 

bexamous

[H]ard|Gawd
Joined
Dec 12, 2005
Messages
1,670
Actually, md5sum cannot detect some corruptions.
Years back I read a published research paper on md5(sum) and sha(sum) claiming that md5 detects fewer errors/corruptions than sha1/sha2.

That is the reason (my own conclusion) I can't fully trust md5(sum).

As I said, get good hardware!! :). You can't bargain in a production environment by being cheap...
So far, I have never had corrupted files on mdadm+ext4 (was ext3).

No, md5 will detect errors just fine; it works perfectly well as a hash function. The problem is that a method was found to generate collisions quite easily, but that only affects its use in security applications.

And the point is, saying 'good hardware' doesn't help. Hardware fails; good hardware can become bad hardware. The point of checksumming data is to verify that the hardware is, and continues to be, good. And saying you don't get corrupted files, while likely true... how do you know? If you don't have any method of verifying, then you actually don't know.
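A one-minute experiment backs this up: flip a single bit in a file and the md5 changes (the file path and contents here are arbitrary).

```shell
# md5sum over two inputs differing in one bit ('h' 0x68 vs 'i' 0x69) gives different hashes
printf 'hello world' > /tmp/md5demo
A=$(md5sum /tmp/md5demo | cut -d' ' -f1)
printf 'iello world' > /tmp/md5demo      # same length, lowest bit of the first byte flipped
B=$(md5sum /tmp/md5demo | cut -d' ' -f1)
[ "$A" != "$B" ] && echo "bit flip detected"
```

Collision attacks let an adversary deliberately construct two files with the same md5; random hardware corruption has no such intent, so for integrity checking against bit rot md5 remains effective.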
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
They've contributed back minor bugfixes, improvements, and extra functionality. The most major thing done recently is TRIM support. However, ZFS feature flags and LZ4 compression are a collaboration between Illumos, FreeBSD, and some other developers.
I know the Illumos team has been working on TRIM support for some time. If they got help from the BSD developers on that project, that is only good. And bug reports are always good.


And keep in mind that FreeBSD offers by far more flexibility to your storage, because of the GEOM framework. Specifically encryption and high availability.
What is GEOM? And high availability, how?

ZFS has encryption.
 

bexamous

[H]ard|Gawd
Joined
Dec 12, 2005
Messages
1,670
Not sure what GEOM is going to do for you? ZoL + GlusterFS would probably work for HA storage for cheap.
 

devman

2[H]4U
Joined
Dec 3, 2005
Messages
2,400
ZFS has encryption.

Correct me if I'm wrong, but ZFS only has encryption in newer pool versions on Solaris, which will not be open sourced. Illumos is planning to add encryption via some other mechanism in their fork, and it will likely not be compatible. Basically, ZFS has two divergent forks after v28.
 

olavgg

Limp Gawd
Joined
Oct 27, 2010
Messages
232
What is GEOM? And high availability, how?

GEOM can do disk device transformations, which means you can create a "virtual" device for your ZFS storage. This virtual device can be configured with HAST to provide high availability for your ZFS storage. GEOM supports proper disk flushes, so it's safe to use together with ZFS.
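For context, a HAST resource is described in /etc/hast.conf roughly like this (the node names, device, and addresses are illustrative, recalled from the FreeBSD Handbook's layout, so double-check against the hast.conf(5) man page):

```
# /etc/hast.conf -- illustrative two-node mirror; hastd exposes /dev/hast/shared
# on whichever node is primary, and ZFS is built on top of that device
resource shared {
        on nodeA {
                local /dev/da1
                remote 10.0.0.2
        }
        on nodeB {
                local /dev/da1
                remote 10.0.0.1
        }
}
```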
 

extide

2[H]4U
Joined
Dec 19, 2008
Messages
3,494
I have a ZoL box running at home; it's been going for ~6 months now and running very well. Initially I was building/installing it manually, but I have recently changed over to using the repository, so it works with apt-get now. I have gone through 3 or 4 versions of ZoL with no problems.
 

devman

2[H]4U
Joined
Dec 3, 2005
Messages
2,400
I have a ZoL box running at home; it's been going for ~6 months now and running very well. Initially I was building/installing it manually, but I have recently changed over to using the repository, so it works with apt-get now. I have gone through 3 or 4 versions of ZoL with no problems.

Are snapshots working well?
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
Correct me if I'm wrong, but ZFS only has encryption in newer pool versions on Solaris, which will not be open sourced. Illumos is planning to add encryption via some other mechanism in their fork, and it will likely not be compatible. Basically, ZFS has two divergent forks after v28.
True. But ZFS has encryption, provided by Oracle's ZFS version.


I still deploy mdadm+ext4 without data corruption :), everything is perfect..
Interesting. How can you be sure "everything is perfect"? How can you be sure you don't have silently corrupted data? Do you do an MD5 checksum every day, on all your data, to see that no bits have randomly flipped?

Which checksum algorithm do you use? How often do you scan your data for randomly flipped bits?
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
GEOM can do disk device transformations, which means you can create a "virtual" device for your ZFS storage. This virtual device can be configured with HAST to provide high availability for your ZFS storage. GEOM supports proper disk flushes, so it's safe to use together with ZFS.
I didn't get this. GEOM creates a "virtual" device for ZFS storage? Like iSCSI?

Do you mean GEOM is in charge of the disk, and sits between ZFS and the disk? In that case, it is like having a hw-raid card between ZFS and the disk, which is a bad thing.

And what HAST does can be done with ZFS too. As I understand it from your link, HAST lets you store data on two different servers simultaneously. That can also be done on Solaris, or any other OS I think.

Clustered filesystems are another thing, and ZFS is not clustered.
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,715
No, it's not like a 'hw raid card'; it's a virtualized block device. That's all. It has a local device and a remote one, but that is all hidden from ZFS.
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
Interesting. How can you be sure "everything is perfect"? How can you be sure you don't have silently corrupted data? Do you do an MD5 checksum every day, on all your data, to see that no bits have randomly flipped?

Which checksum algorithm do you use? How often do you scan your data for randomly flipped bits?

No checksums, just weekly backups..

I/we will find out whether any file is corrupted, since all files are scattered around many projects (consumers) that are based on the same base-level data (data/bin files, source files, and shared libraries):
base program/data/source -> extended program/data/source -> customer program/data/source.
All three levels must have no corrupted files, since corruption could break the whole system, ahem, the customer's system.

When I need checksumming or sha1/sha2-summing, it is only for certain files, such as licensing (many small/big text and binary files, but each customer has different licensing).
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
No, md5 will detect errors just fine; it works perfectly well as a hash function. The problem is that a method was found to generate collisions quite easily, but that only affects its use in security applications.

And the point is, saying 'good hardware' doesn't help. Hardware fails; good hardware can become bad hardware. The point of checksumming data is to verify that the hardware is, and continues to be, good. And saying you don't get corrupted files, while likely true... how do you know? If you don't have any method of verifying, then you actually don't know.

Nothing is perfect.
It should start with good hardware.. and backups :)

As far as I know, I have never had bad server hardware.. just in the past, from being cheap (buying a cheap NIC and an unmanaged switch) :p..

How do I know the data is not corrupted? Please read my previous post...

Running sha takes longer than md5, as far as I know...
As long as you are happy with md5, there is nothing to worry about.
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
I have a ZoL box running at home; it's been going for ~6 months now and running very well. Initially I was building/installing it manually, but I have recently changed over to using the repository, so it works with apt-get now. I have gone through 3 or 4 versions of ZoL with no problems.

See you again :)
I moved back to ZoL for the backup server (again) with CentOS months back, replacing a legacy Adaptec HW card (again).
Pretty solid as long as we know the limitations, and a lot better than the RC I tried last year.
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
No, it's not like a 'hw raid card'; it's a virtualized block device. That's all. It has a local device and a remote one, but that is all hidden from ZFS.
I don't really get it. Who is in charge of the disk? It hides the disk from ZFS? So ZFS is not in charge?






No checksums, just weekly backups..
Then you are not protected against random bit flips.


I/we will find out whether any file is corrupted, since all files are scattered around many projects (consumers) that are based on the same base-level data (data/bin files, source files, and shared libraries):
base program/data/source -> extended program/data/source -> customer program/data/source.
All three levels must have no corrupted files, since corruption could break the whole system, ahem, the customer's system.
This is a very bad strategy, and you should change how you protect your data.

I had lots of Amiga discs with games. You surely know that, with time, that data rots, so the Amiga discs don't work after a few years. It is like VHS cassettes, which also get more and more flickery after some years. The same with hard disks: they also get randomly flipped bits after some years. And RAM chips also get randomly flipped bits, which is why you need ECC.

I had 300 Amiga discs. How can I be sure that no bits were flipped randomly? Maybe some bits in an MP3 file were flipped, or in a JPEG file, but I would not notice it; it is too small a change. Or maybe an executable file is flipped, but only a few bits at the end, in code that only runs when the user tries to do something weird.

Not all bit flips are immediately noticed; they will not all crash your computer or say "file is unreadable", in which case you would know there is an error and go to the backup. Some bit flips are silent, and you don't notice them because they corrupt less important files, for instance "Readme.txt" or something else you never read.

I am trying to say that randomly flipped bits do not always affect important files, so you will not always notice corrupted data. Some bit flips are in unimportant files, or in a small part of a file that you never use, and when you finally use that part, everything crashes.

CERN, the big physics centre with the LHC collider, did a study on bit flips: how often do they occur? There is also much research on how often bit flips occur in RAM. The answer might surprise you, if you care to read the research papers. But maybe you are not interested in what researchers say about data corruption. Maybe the company you work at does not care about data corruption.

I have told you about data corruption several times, but you just ignore all the research on this subject. That is not really a good strategy. You should pay heed to what researchers say.
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,715
"I don't really get it. Who is in charge of the disk? It hides the disk from ZFS? So ZFS is not in charge?"

Like I said, it's a virtualized block device. The HAST driver takes a remote IP/port and a local block device and treats them (sorta) like a raid1, writing to both sides before telling the client that the write is complete. Reads are only done from the local disk unless it is dead or some such. So ZFS is in charge, but in charge of the virtual block device, and any and all checksumming, etc., is performed just as for a 'real disk'. Is that clearer?
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
Then you are not protected against random bit flips.



This is a very bad strategy, and you should change how you protect your data.

I had lots of Amiga discs with games. You surely know that, with time, that data rots, so the Amiga discs don't work after a few years. It is like VHS cassettes, which also get more and more flickery after some years. The same with hard disks: they also get randomly flipped bits after some years. And RAM chips also get randomly flipped bits, which is why you need ECC.

........
I have told you about data corruption several times, but you just ignore all the research on this subject. That is not really a good strategy. You should pay heed to what researchers say.

I never ignore data corruption; it is on my mind to make my work run smoothly.
The thing I do not understand is that you always make "ZFS" a god that can handle everything.
Please read my old posts; did I ignore data corruption in general?...


The bit-flip discussion could be minimized with ECC :)
Yes, ECC on servers :), did I mention servers in my previous posts :)...
Servers (not home servers) are 99% using ECC RAM.


Good or bad, it is not in my hands.. upper-level management handles that.
I just want to share that mdadm+ext4 (was ext3) is OK (safe) as long as we know the environment.


It has been running for many years... even before I started working on those projects :), the system has been running as I mentioned in my previous posts.


One backup server has been replaced with ZoL on CentOS :D... nothing fancy, no snapshots or rollback. That works smoothly..
Interested in ZoL?...

My job: developer (mostly), admin (yeah, when an upper-level manager tells me to :|), and maintenance (when the admin guys request it through upper-level management).
Kind of fun, and good experience to see how the whole company's workflow works.
 
Last edited:

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
I never ignore data corruption; it is on my mind to make my work run smoothly. The thing I do not understand is that you always make "ZFS" a god that can handle everything. Please read my old posts; did I ignore data corruption in general?...
ZFS is not a god that can handle everything. But the design goal of ZFS is safety, not speed.

I believe ZFS is safer than any other solution, because of the research I have shown here earlier. Researchers say that ext, XFS, NTFS, etc. are not safe and might corrupt your data. And researchers say that ZFS is safe.

I base my belief on research, not fanboyism. If researchers say that a new filesystem is safer, then I will immediately switch away from ZFS. I base everything on research and hard science. I don't stay with ZFS because Solaris tech is "best" or out of fanboyism; I stay with ZFS because of all the research on data corruption. Have you read the research on data corruption? No? But I have. Maybe you should read it too.



The bit-flip discussion could be minimized with ECC :) Yes, ECC on servers :), did I mention servers in my previous posts :)... Servers (not home servers) are 99% using ECC RAM.
Bit flips can be minimized, but not eliminated, with ECC. Typically ECC corrects a single flipped bit, but sometimes two or more bits flip, and plain ECC does not correct double-bit errors. So ECC is not completely safe. "Chipkill" tech protects against more flipped bits than plain ECC and is safer.

Do you agree that you need ECC for RAM, because there will be bit flips in RAM? Do you understand that you need the same kind of protection for disks too, because there will be bit flips on disks too?

ZFS is like ECC for disks, but much safer than ECC: ECC corrects a single flipped bit, while ZFS protects against many flipped bits, and against many other sorts of errors besides, for instance the raid write hole that hardware raid does not protect against.



It has been running for many years... even before I started working on those projects :), the system has been running as I mentioned in my previous posts.
And how do you know that you don't have random bit flips in some files after many years? One guy said he had old RAR files, and recently he wanted to read some documents in a RAR file, but it was corrupted! He did not know it was corrupted, because he did not checksum the RAR files every year. Now he has switched to ZFS, because ZFS automatically checksums everything all the time, and if a checksum is wrong then ZFS will automatically repair the file.

How do you know that old files are error free? Maybe they have bit flips. Have you checked every file, every year? You need to do an MD5 checksum of all files, every year, to detect bit flips.

Are random bit flips on disks a problem? According to research, they are a big problem. For instance, one bit in every 67 TB of data is silently flipped. Just read the research.



I don't understand why I keep telling you about data corruption, in post after post. And still you continue to claim that you have no bit flips, and when I ask how you know it, your answer is: "I just know it, I don't need to check every file." That is belief, religion; not hard science, not research. What is your data corruption strategy? Religion, not science.

-Do you have cancer, a small tumour in your body?
-No.
-How do you know?
-I am feeling well, therefore I don't have cancer.
-Have you checked whether you have cancer?
-No, I don't need to. I am feeling well!
-But you could have a very small beginning tumour and still feel well?
-No, if I am feeling well, then I don't have growing cancer, I don't have any tumours.
-Maybe you should check?
-No, I don't need to. I know I am feeling well.

That is RELIGION, the same as the old church used. No proof, no hard science, no research, no test.

I am telling you that your strategy of belief and religion is not good. You should checksum all your files, and every month compare the checksums to see if there are any bit flips.

If your software crashes, maybe it is a bit flip, not a bug? Microsoft said that 20% of all Windows crashes were because of bit flips in RAM, and therefore MS wanted everybody to use ECC, to make Windows more stable. You know, when Windows crashes, MS gets log data over the internet, so MS can see why the PC crashed.
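The monthly check being argued for here can be as simple as a stored manifest plus md5sum -c (the paths are throwaway examples; the demo corrupts its own test file to show the nonzero exit status):

```shell
# Create a manifest once, then re-verify on a schedule; exit status 0 means all files match
mkdir -p /tmp/demo && printf 'important data' > /tmp/demo/file1
( cd /tmp/demo && md5sum file1 > manifest.md5 )
( cd /tmp/demo && md5sum -c --quiet manifest.md5 ); OK_BEFORE=$?
printf 'important datA' > /tmp/demo/file1      # simulate a silent one-character flip
( cd /tmp/demo && md5sum -c --quiet manifest.md5 ) 2>/dev/null; OK_AFTER=$?
echo "before=$OK_BEFORE after=$OK_AFTER"       # 0 before the flip, nonzero after
```

This is essentially what a ZFS scrub does continuously and per block, with the added ability to repair from redundancy instead of just reporting.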
 

Aesma

[H]ard|Gawd
Joined
Mar 24, 2010
Messages
1,854
"I don't really get it. Who is in charge of the disk? It hides the disk from ZFS? So ZFS is not in charge?"

Like I said, it's a virtualized block device. The HAST driver takes a remote IP/port and a local block device and treats them (sorta) like a raid1, writing to both sides before telling the client that the write is complete. Reads are only done from the local disk unless it is dead or some such. So ZFS is in charge, but in charge of the virtual block device, and any and all checksumming, etc., is performed just as for a 'real disk'. Is that clearer?

Doesn't that kill performance, or even cause ZFS to think the drives are defective?

About data corruption: I don't have ECC at home yet. Each hardware part I own is tested and good, but some things I do or some software I use might be dodgy. Anyway, I only use drives separately so far, each paired with another of the same size as a backup. I started doing CRC32 hashes of the contents of some drives and their backups, and I find inconsistencies everywhere.
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,715
"Doesn't that kill performance, or even cause ZFS to think the drives are defective?"

If you have a gigabit link to the other host, you can get about 100MB/sec, about what a regular SATA disk can manage. Remember, this is meant for high availability with real-time block-level replication; there is no easy answer if you need to do that in real time so failover can be seamless. Why would it make ZFS think the drive is defective?
 

danswartz

2[H]4U
Joined
Feb 25, 2011
Messages
3,715
Since both local and remote have to be written, sure, you will have slower writes. Again, though, this is for a use case where you must have real-time, block-level replication.
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
I started doing CRC32 hashes of the contents of some drives and their backups, and I find inconsistencies everywhere.
Can you tell us more about this? Are you seeing silent data corruption on your files, that you thought were safe?
 

Aesma

[H]ard|Gawd
Joined
Mar 24, 2010
Messages
1,854
Can you tell us more about this? Are you seeing silent data corruption on your files, that you thought were safe?

I don't know if it's silent data corruption as in "the file was fine on the drive, then it was corrupted", or rather "the file was never fine, because it was corrupted during transfer". I just know that until recently I didn't hash my files, but after discovering that a simple move had corrupted a file, I started doing this, and I find differences between originals and backups. What is worse, I often swap the backup and original drives to even out wear, so I don't even know which file is good when I find an inconsistency.
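One way to at least enumerate the mismatches between a drive and its backup is to checksum both trees and diff the lists (the mount points here are invented; the demo uses two throwaway directories):

```shell
# Checksum every file in both trees and diff the sorted lists; identical trees diff clean
mkdir -p /tmp/treecmp/orig /tmp/treecmp/backup
printf 'abc' > /tmp/treecmp/orig/f && printf 'abc' > /tmp/treecmp/backup/f
( cd /tmp/treecmp/orig   && find . -type f -exec md5sum {} + | sort ) > /tmp/sums.orig
( cd /tmp/treecmp/backup && find . -type f -exec md5sum {} + | sort ) > /tmp/sums.backup
diff /tmp/sums.orig /tmp/sums.backup && echo "trees match"
```

Note this only tells you which files differ, not which copy is good; for that you need a checksum recorded while the file was known-good, which is exactly what a checksumming filesystem stores per block.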
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
I am really disappointed with speed in MDADM! Is it really that sucky?

That depends on your expectations :)
Sucky at what? Reads? Writes? NFS? Windows sharing? And what kind of hardware are you using: memory, IO (HBA) cards?

Software-based RAID can be good or bad, depending on many aspects..
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
I don't know if it's silent data corruption as in "the file was fine on the drive, then it was corrupted", or rather "the file was never fine, because it was corrupted during transfer". I just know that until recently I didn't hash my files, but after discovering that a simple move had corrupted a file, I started doing this, and I find differences between originals and backups. What is worse, I often swap the backup and original drives to even out wear, so I don't even know which file is good when I find an inconsistency.

What filesystem are you using?
Some filesystems pad bytes... oops... :p, which can look like a bad/corrupted file during checksumming.
Can you load those files into the program or app that requires them, to check whether they are really bad/corrupted?

Get better hardware to minimize your headaches...
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
ZFS is not a god that can handle everything. But the design goal of ZFS is safety, not speed.

I believe ZFS is safer than any other solution, because of the research I have shown here earlier. Researchers say that ext, XFS, NTFS, etc. are not safe and might corrupt your data. And researchers say that ZFS is safe.

I base my belief on research, not fanboyism. If researchers say that a new filesystem is safer, then I will immediately switch away from ZFS. I base everything on research and hard science. I don't stay with ZFS because Solaris tech is "best" or out of fanboyism; I stay with ZFS because of all the research on data corruption. Have you read the research on data corruption? No? But I have. Maybe you should read it too.


If your software crashes, maybe it is a bit flip, not a bug? Microsoft said that 20% of all Windows crashes were because of bit flips in RAM, and therefore MS wanted everybody to use ECC, to make Windows more stable. You know, when Windows crashes, MS gets log data over the internet, so MS can see why the PC crashed.

To avoid going around the loop again, I will conclude it.
1) I did NOT say you are using "fanboyism"; did I mention it?
2) I share my real experience and knowledge; I am not a god who knows everything.
3) I always love to learn with an open mind (not locked to a vendor, filesystem, or OS).
4) I like ZFS too, please check my previous posts, and I love other systems too, as long as they get my work done.
5) I have researched and read "data corruption" papers, and I love to compare them by author.
6) Did I SAY... ECC on servers? Why do you bring up bit flips again and again?...
7) I use Windows and Linux at work.. I have to work on those :D; making a living and saving money is my first priority :p...
8) I would like to stop the loop discussion, as seen before. Some hardforumers already know :).

End of posting.
 