ZFS on Linux vs. MDADM ext4

drescherjm

[H]F Junkie
Joined
Nov 19, 2008
Messages
14,937
So you contributed a kernel patch to them. Nice!

I bisected the kernel down to the exact patch that caused the misbehavior (which was basically a deadlock in the kernel on hot plugging a device) and then tested a couple of patches to help understand what exactly was the cause of the failure. It turned out to be not RAID related but related to the entire SCSI subsystem.

RAID 5/6 on LVM? Hmm.. that makes lvm2 (or newer) fat, aka complicated :D

Being on the lvm2 mailing list I read the commit logs from time to time to see these commits.. I am not sure why they are doing this. I guess they already had raid 0 and 1 and spanning so ..
 

omniscence

[H]ard|Gawd
Joined
Jun 27, 2010
Messages
1,311
One feature still missing in mdadm is proper spindown (MAID) handling. It's the reason I'm currently considering a hardware raid card. Apart from that it is really stable and feature complete. Just last week I moved 4 drives of a running 6 drive RAID6 from one controller to another without a complete rebuild. I'm not aware of a hardware controller allowing that.
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
....

Being on the lvm2 mailing list I read the commit logs from time to time to see these commits.. I am not sure why they are doing this. I guess they already had raid 0 and 1 and spanning so ..

Indeed, as far as I know, LVM2 has spanning (raid 0 style) and mirroring (raid 1 style).
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
Why is Oracle developing BTRFS for Linux? And I had actually never heard of it, seems very interesting, will it be a raid filesystem or do I have to use MDADM + BTRFS?
BTRFS is very similar to ZFS, some say it is a ZFS wannabe, just like Linux SystemTap is a Solaris DTrace wannabe, Linux Systemd is a Solaris SMF wannabe, Linux Open vSwitch is a Solaris Crossbow wannabe, Linux OpenVZ / LXC is a Solaris Container wannabe, etc etc. It seems that Linux devs are heavily cloning Solaris tech. It would be great to see Linux devs inventing something new and innovative themselves, instead of cloning from other OSes while at the same time insisting they never cloned anything.


Because BTRFS is modeled after ZFS, you don't need MDADM. BTRFS is actually a huge monolithic piece of software with built-in raid. However, because ZFS has no separate layers (for instance, it does not communicate with an external raid layer), ZFS is "badly designed" and a "rampant layering violation" according to Linux kernel hackers such as Andrew Morton:
https://blogs.oracle.com/bonwick/entry/rampant_layering_violation
This new innovative design was definitely not normal, and I understand that Linux devs got upset about that radical crazy idea, but it is exactly that design that allows ZFS to offer data integrity superior to normal filesystems: ZFS has control of the whole chain, from RAM down to disk. Normally you have several software layers, each handling its own separate part, for instance ext4 + MDADM, LVM2, etc.

(Of course, now that Linux devs have built BTRFS, the "rampant layering violation" is no problem anymore. It was only a problem as long as Solaris was alone in doing it. Now that Linux also has a monolithic filesystem, it is the best idea in the world. Of course, that crazy idea emerged from the Linux world, they never looked at ZFS, no matter how often the BTRFS creator mentions ZFS as a reference design that he takes good ideas from. And of course C# was no Java clone in the beginning, as Microsoft supporters like to claim)



One concern is that adding checksums here and there to provide data integrity does not cut it. CERN did a study on this, and their conclusion is that "adding checksums is not enough" to give data integrity. For instance, hard disks have many checksums and still they get read/write errors. Even the new DIF data checksum disks, meant to combat data corruption, fail to give data integrity. The conclusion is that it is difficult to get data integrity; you need to know how to do it. Many have tried, and all have failed (except ZFS, according to researchers). Apparently the ZFS devs alone seem to have succeeded with data integrity, judging by the research papers on ZFS data integrity. But until I see research on BTRFS too, it will be just as safe as DIF disks or raid solutions to me: in other words, no better data integrity than the rest of the bunch.
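
To make the "control of the whole chain" point concrete, here is a toy sketch of end-to-end verification: the checksum is computed by the top layer at write time, kept apart from the data block, and checked again on every read. The class and its names are made up for illustration, and this is not how ZFS (or BTRFS) actually lays data out on disk; it only shows why a checksum stored away from the block can catch corruption that happens in the layers below.

```python
import hashlib


class VerifiedStore:
    """Toy block store: checksums live in an index, not next to the data."""

    def __init__(self):
        self.blocks = {}     # block id -> raw bytes (stands in for the disk)
        self.checksums = {}  # block id -> SHA-256 recorded at write time

    def write(self, block_id, data):
        self.blocks[block_id] = data
        self.checksums[block_id] = hashlib.sha256(data).hexdigest()

    def read(self, block_id):
        data = self.blocks[block_id]
        # Verify against the checksum recorded by the top layer at write time.
        if hashlib.sha256(data).hexdigest() != self.checksums[block_id]:
            raise IOError(f"checksum mismatch in block {block_id}")
        return data


store = VerifiedStore()
store.write(7, b"important data")
store.blocks[7] = b"important dat4"  # simulate silent corruption below the filesystem
try:
    store.read(7)
except IOError as err:
    print(err)
```

A real filesystem would also need a redundant copy to repair the block from; this sketch only detects the problem.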
 

JoeComp

[H]ard|Gawd
Joined
Jan 23, 2012
Messages
1,036
nothing wrong with copying good designs.

Everyone copies everyone else. It is very rare to find something that is completely new and original. Certainly the Solaris engineers built much of their technology on things that came before, also.
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
Of course you can copy, I do that all the time. But then you should give credit, or else you are trying to steal from others.

For instance, the Linux Systemtap team talked lots about DTrace, and later removed every reference to DTrace in the logs. Great with all the cred they are giving, don't you think?
https://blogs.oracle.com/ahl/entry/dtrace_knockoffs

"Amusingly, in an apparent attempt to salvage their self-respect, the SystemTap team later renounced their inspiration. Despite frequent mentions of DTrace in their early meetings and email, it turns out, DTrace didn't actually inspire them much at all:
CVSROOT: /cvs/systemtap
Module name: src
Changes by: kenistoj@sourceware.org 2006-11-02 23:03:09
modified files:
. : stap.1.in
Log message:
Removed refs to dtrace, to which we were giving undue credit in terms of
"inspiration."



BTRFS is a ZFS wannabe, but there are people insisting BTRFS is new and original work, which it is not. Even the creator talks a lot about ZFS, and has admitted he looks at ZFS to get good ideas.

There are many similar stories.


Of course Solaris engineers built on other tech too, but Solaris is one of the original Unixes and has contributed much to Unix. I don't see all OSes drooling over, say, Linux tech and porting or cloning it. But everybody is cloning or porting Solaris tech. For instance:

If we talk about DTrace:
IBM AIX is now cloning it, and calls the clone AIX Probevue
Mac OS X has ported DTrace
FreeBSD has ported DTrace
VMware has cloned it, and calls it vProbes: http://x86vmm.blogspot.se/2007/09/presenting-vprobes.html
QNX has ported it
Linux has cloned it, and calls it Systemtap, see above for link
NetApp has talked about porting DTrace to OnTap: http://rajeev.name/2007/07/30/would-dtrace-make-sense-on-ontap/

Except for Windows, how many large OSes are left that do not want DTrace? None? Now, give me a similar list of Linux tech that everybody clones or ports. There is no such list. The innovation level is not that high among Linux devs. NIH syndrome makes them clone everything, meanwhile badmouthing other tech until Linux has similar tech, and then it is the best in the world. *sigh*
 

kac77

2[H]4U
Joined
Dec 13, 2008
Messages
2,893
Solaris Container sure as hell wasn't the first virtualization solution either. This notion that all things begin with Solaris really is an odd one to make. There are tons of programs that were ported from Linux to Solaris. Why you are making this point is beyond me.
 

klank

Killer of Killer NIC Threadz
Joined
Aug 22, 2011
Messages
2,177
This thread should be locked. It has gotten wayyyyy off topic.
 

madrebel

Gawd
Joined
Sep 23, 2011
Messages
724
Of course you can copy, I do that all the time. But then you should give credit, or else you are trying to steal from others.

how many lawsuits have happened in the last 3 years over stupidly tiny software features? i'm not saying the systemtap group aren't a bunch of chodes, but I don't have all the facts and could easily chalk up their backtracking to oracle now owning solaris and them being terrified of larry's army of lawyers.
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
WOW... this is out of control hehehe
the good thing is OP already had a decision! :)....

someone really hates linux :)....

this makes me want to reread the long write-ups of the SCO fiasco, where SCO was backed behind the scenes by microsoft and, ahem... sun...
I followed the SCO fiasco for years.. one site covered everything from the beginning to the last drop, when SCO got delisted from the stock market and filed for bankruptcy protection (I assume everyone remembers who was behind SCO at that time, and who got "big" money and bought some SCO patents very cheap, poor SCO)

indeed, Solaris is not "an angel", it copied/converted some tools from outside solaris and made them available for solaris.
linux, freebsd, and others are doing that too.
as long as we are in the open source community, sharing "how to" is supposed to be good.
copying the whole thing without any differences is bad for the "open source" community.

to those arguing about ZFS vs BTRFS: hey, cool down. BTRFS came from oracle before it bought the dying sun. please complain to oracle about BTRFS, .........

last posting, cya on other linux/solaris thread :).
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
Solaris Container sure as hell wasn't the first virtualization solution either. This notion that all things begin with Solaris really is an odd one to make.
I am not saying that Solaris Containers was the first virtualization solution. IBM has done lots in virtualization - many years ago, primarily with their Mainframes and later moved tech into IBM AIX. For instance, Solaris has cloned the old IBM LPAR virtualization technique and calls it Solaris LDOM. Today IBM AIX (or Mainframes) is not inventing anymore, merely copying others: for instance DTrace.

Solaris Containers are among the first (if not the first?) lightweight virtualization techniques. That means only one kernel is running. Say you install a Linux container into Solaris, then every Linux kernel call gets transformed to a Solaris kernel call. Thus, if you run 100 Linux containers, they all get mapped to the single Solaris kernel. So a Solaris container is very lightweight: 40MB extra RAM (just allocating some extra kernel structs). One guy booted 1,000 Containers on a PC with 1GB RAM. It was very slow, but it was possible.

Compare to VMware Workstation; if you boot 100 Linux VMs, then 100 Linux kernels are running, consuming much RAM.

Of course, IBM AIX has later copied Solaris Containers (as Linux has) and now calls it AIX WPAR. But according to IBM, they have invented a new and innovative technique. :)

In contrast, VMware has copied DTrace but gives official credit to the DTrace team. That is how it should be done: don't be a b*tch and claim the hard work as your own, give credit instead.



Together with ZFS, Containers create magic. Say you install a Container, configure software and test it. Then you can make a snapshot and deploy a new, fully configured and tested VM in one second. The new VM clone will have its own ZFS filesystem where it writes its own data, but it reads from the Master VM. Thus, only the changes are saved in the new ZFS filesystem. Now, mix in Crossbow, the Solaris network virtualization technique, and you can build up whole huge networks with a different exclusive IP address for each VM, firewalls, switches, etc in one single PC.
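
The "only the changes are saved" behaviour can be sketched with a layered lookup. This is only an analogy in Python, assuming a hypothetical `master` dict standing in for the snapshotted master filesystem; it is not how ZFS clones are implemented, and a real snapshot is immutable whereas this dict is not.

```python
from collections import ChainMap

# Hypothetical master "filesystem": path -> contents.
master = {"/etc/app.conf": "port=80", "/bin/app": "v1"}


def clone(snapshot):
    """Return a clone that reads from the snapshot but keeps writes in its own layer."""
    return ChainMap({}, snapshot)


vm1 = clone(master)
vm2 = clone(master)

# A write in vm1 lands in vm1's private layer only.
vm1["/etc/app.conf"] = "port=8080"

print(vm1["/etc/app.conf"])  # vm1 sees its own change
print(vm2["/etc/app.conf"])  # vm2 still reads the master copy
print(dict(vm1.maps[0]))     # the only data vm1 actually stores
```

Creating a clone costs almost nothing up front because nothing is copied; each clone grows only as it diverges from the master.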

Solaris is true to Unix: well working tools that can be combined in various awesome ways to do magic.


There are tons of programs that were ported from Linux to Solaris. Why you are making this point is beyond me.
Sure there are programs ported from Linux to Solaris, like media players, etc. But where are the innovative Linux programs that everybody must have? Everybody drools over DTrace and wants it, and has cloned or ported it, as we have seen.

Can you name any Linux software that everybody must have, drools over, and has ported or cloned? There is none. The level of innovation is not high among Linux devs: masters of cloning.
 

Red Falcon

[H]F Junkie
Joined
May 7, 2007
Messages
11,804
UNIX fanboi vs Linux fanboi, just, wow.
What you are both doing is completely foolish.
Both OS types and all of their distros are completely awesome, and both have their respective strengths and weaknesses.

UNIX = stable but rigid
Linux = flexible but divided

We all have our opinions on these OSes, but just let it go.
In the end, use what works for your specific needs.

Instead of fighting each other, why not go up against real fanbois like heatlesssun who fucking worships M$ and that pathetic excuse for an OS, Windows.
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
UNIX fanboi vs Linux fanboi, just, wow.
What you are both doing is completely foolish.
Both OS types and all of their distros are completely awesome, and both have their respective strengths and weaknesses.

UNIX = stable but rigid
Linux = flexible but divided

.....s.

solaris is UNIX, linux is UNIX too...(see picture, taken from the net)
[Image: 2xdsy.jpg]


the weakness we all share is that being narrow-minded makes us blind :p.
 

Red Falcon

[H]F Junkie
Joined
May 7, 2007
Messages
11,804
Linux is not exactly UNIX, per se, as you can see it was developed from Minix, aka a "free" version of UNIX.
I'm sure you all already know this, but saying Linux is UNIX is not correct.

Yes, it was definitely derived from UNIX and was certainly UNIX-based in its beginnings.
However, I have worked with many of both UNIX and Linux distros, and I can tell you, Linux is certainly not UNIX.


Don't change the subject!
We need to focus all of our *NIX hatred and rage at Microsoft!

UNIX and Linux, unite! :D
 

cantalup

Gawd
Joined
Feb 8, 2012
Messages
758
Linux is not exactly UNIX, per se, as you can see it was developed from Minix, aka a "free" version of UNIX.
I'm sure you all already know this, but saying Linux is UNIX is not correct.

Yes, it was definitely derived from UNIX and was certainly UNIX-based in its beginnings.
However, I have worked with many of both UNIX and Linux distros, and I can tell you, Linux is certainly not UNIX.


Don't change the subject!
We need to focus all of our *NIX hatred and rage at Microsoft!

UNIX and Linux, unite! :D

I would revise my words :) In some situations I blur all the variants into "UNIX" to make things easy to understand, and let people learn the details later.

Bell Labs is the grandfather of UNIX and all its variants :)

As far as I know, linux is not directly derived from minix, but borrowed some code from minix. I read a book focused on open source that discussed minix vs linux in some chapters. A long read, but fun.

Honestly, I am not changing the subject..., just making the simple point that "linux is not an alien in the UNIX world"

I have worked with several linux distros (RHEL/CentOS, Slackware, Debian, SuSE/SLEPOS, and some embedded Linux), solaris (during my studies at university), and windows (we cannot get rid of it)..

I disagree with directing "rage" at microsoft :) Why?
We are not in a perfect world. For example, I need to use Microsoft windows, AIX (rarely, since I always try to avoid it), linux and proprietary embedded systems.
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
Nope Xen predates it and only runs one kernel.
What do you mean? Xen is not as light weight as Solaris Containers. Xen is a hypervisor. Which means it fires up an entire OS instance.

Say I want to start 20 Solaris VMs with Xen, then I install Solaris 20 times, and need to use 20 x 4GB RAM = 80GB RAM in total. And 20 x 3GB disk = 60GB disk in total.

If I want to use 20 Solaris VMs with Solaris Containers, I snapshot the main Solaris install 19 times, which takes 100MB space on the disk for each VM, and then I fire up 19 VMs from my primary Solaris installation, which take 40MB RAM each. So this uses in total: 4GB RAM + 19 x 40MB = 4.76GB RAM and 3GB disk + 19 x 100MB = 4.9GB disk.

Xen with 20 Solaris VMs:
80GB RAM. 60GB disk.

Solaris with 20 solaris VMs:
4.76GB RAM. 4.9GB disk.
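
The arithmetic above can be checked with a few lines of Python. The per-VM figures (4GB RAM and 3GB disk for a full install, 40MB RAM and 100MB disk per extra container) are the ones assumed in this post, with 1GB counted as 1000MB; the function name is made up for illustration, and none of these numbers are measurements.

```python
def totals(vms, base_ram_gb=4.0, base_disk_gb=3.0, zone_ram_mb=40, zone_disk_mb=100):
    """Return ((ram, disk) for full installs, (ram, disk) for containers), all in GB."""
    # Full virtualization as described in the post: every VM is a complete install.
    full = (vms * base_ram_gb, vms * base_disk_gb)
    # Containers: one full install plus a small increment for each additional VM.
    extra = vms - 1
    zones = (
        round(base_ram_gb + extra * zone_ram_mb / 1000, 2),
        round(base_disk_gb + extra * zone_disk_mb / 1000, 2),
    )
    return full, zones


full, zones = totals(20)
print("Full installs:", full)   # (RAM GB, disk GB)
print("Containers:   ", zones)  # (RAM GB, disk GB)
```

This reproduces the 80GB / 60GB and 4.76GB / 4.9GB totals, but only under the idle, identical-VM assumptions stated above.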
 

devman

2[H]4U
Joined
Dec 3, 2005
Messages
2,400
What do you mean? Xen is not as light weight as Solaris Containers. Xen is a hypervisor. Which means it fires up an entire OS instance.

Say I want to start 20 Solaris VMs with Xen, then I install Solaris 20 times, and need to use 20 x 4GB RAM = 80GB RAM in total. And 20 x 3GB disk = 60GB disk in total.

If I want to use 20 Solaris VMs with Solaris Containers, I snapshot the main Solaris install 19 times, which takes 100MB space on the disk for each VM, and then I fire up 19 VMs from my primary Solaris installation, which take 40MB RAM each. So this uses in total: 4GB RAM + 19 x 40MB = 4.76GB RAM and 3GB disk + 19 x 100MB = 4.9GB disk.

Xen with 20 Solaris VMs:
80GB RAM. 60GB disk.

Solaris with 20 solaris VMs:
4.76GB RAM. 4.9GB disk.

If I was managing 20 VMs of the same OS, I'd probably share my /usr directory as a read-only network mount between them. While Xen does lack true memory overcommit, there are techniques that can partially accomplish that result.

I'm not disagreeing with you that zones are awesome and more lightweight, but your characterization of Xen is unfair.
 

madrebel

Gawd
Joined
Sep 23, 2011
Messages
724
that isn't really virtualization per se though. that is more like LPAR from the mainframe days. yes, they are unique instances but you can't run windows in a solaris zone without the aid of some type of hypervisor.

and i don't think you're entirely accurate either. fairly certain each zone does run its own (at least partial) kernel otherwise rebooting individual zones would be problematic. i could be wrong though i'm not well versed in the whole zoning thing.

also you're talking 0 state, where you newly provision 20 VMs with no workload or you're assuming identical workload.

if there are differing workloads your ability to share memory dwindles rapidly and you also start to grow disk space a great deal.
 

kac77

2[H]4U
Joined
Dec 13, 2008
Messages
2,893
What do you mean? Xen is not as light weight as Solaris Containers. Xen is a hypervisor. Which means it fires up an entire OS instance.

Say I want to start 20 Solaris VMs with Xen, then I install Solaris 20 times, and need to use 20 x 4GB RAM = 80GB RAM in total. And 20 x 3GB disk = 60GB disk in total.

If I want to use 20 Solaris VMs with Solaris Containers, I snapshot the main Solaris install 19 times, which takes 100MB space on the disk for each VM, and then I fire up 19 VMs from my primary Solaris installation, which take 40MB RAM each. So this uses in total: 4GB RAM + 19 x 40MB = 4.76GB RAM and 3GB disk + 19 x 100MB = 4.9GB disk.

Xen with 20 Solaris VMs:
80GB RAM. 60GB disk.

Solaris with 20 solaris VMs:
4.76GB RAM. 4.9GB disk.
That's because you aren't running Solaris in paravirtualized form on Xen while you ARE doing so for Solaris.
 

brutalizer

[H]ard|Gawd
Joined
Oct 23, 2010
Messages
1,600
That's because you aren't running Solaris in paravirtualized form on Xen while you ARE doing so for Solaris.
Ok, then maybe I misunderstood Xen. I thought it was a hypervisor, similar to ESXi? I don't want to say things that are not true, so please explain to me, and provide a credible link. And I will stop saying that Xen is heavyweight.




that isn't really virtualization per se though. that is more like LPAR from the mainframe days.
Solaris has copied LPAR and calls it LDOM. LDOM and LPAR are heavyweight virtualization. They use a lot of resources, like ESXi.

IBM has copied Solaris containers, and IBM calls their clone WPAR. Containers / WPAR are lightweight and don't use many resources, just as in my example with 20 Solaris VMs.


yes, they are unique instances but you can't run windows in a solaris zone without the aid of some type of hypervisor.
I didn't get this. "Run Windows"? You mean Microsoft Windows in a Solaris zone? You can run Linux in a Solaris zone, if you use Illumos. Every Linux kernel call is transformed to a Solaris kernel call.

and i don't think you're entirely accurate either. fairly certain each zone does run its own (at least partial) kernel otherwise rebooting individual zones would be problematic. i could be wrong though i'm not well versed in the whole zoning thing.
There are no kernels running inside a container. What has happened is that the one and only Solaris kernel has just duplicated some kernel structs that are used by the different Solaris VMs. Each set of kernel structs is 40MB or so. Every kernel call in a container is redirected to the single Solaris kernel, which uses that container's own 40MB of kernel data structs. Thus, there is only one Solaris kernel active, consuming 40MB for each Container.


if there are differing workloads your ability to share memory dwindles rapidly and you also start to grow disk space a great deal.
Of course. But the point is, this is more light weight. If you use ESXi, a full hypervisor, then it will fire up 20 entire complete Solaris kernels all running simultaneously. There is not one single kernel with just different data structs swapping in, but there are 20 full kernels active. Each consuming GBs of RAM. Heavy weight.
 

kac77

2[H]4U
Joined
Dec 13, 2008
Messages
2,893
Ok, then maybe I misunderstood Xen. I thought it was a hypervisor, similar to ESXi? I don't want to say things that are not true, so please explain to me, and provide a credible link. And I will stop saying that Xen is heavyweight.

It's best if you read this.
http://wiki.xen.org/wiki/Xen_Overview
http://en.wikipedia.org/wiki/Paravirtualization

When you use paravirtualized guests you are generally able to fit far more guests than you would running fully virtualized. There are some downsides in flexibility. But you'll see much the same thing you were highlighting earlier. In the case of Xen only 1 DOM0 is needed and its resources are shared with the other PV guests.

I didn't get this. "Run Windows"? You mean Microsoft Windows in a Solaris zone? You can run Linux in a Solaris zone, if you use Illumos. Every Linux kernel call is transformed to a Solaris kernel call.

Yeah but that's because there's a kernel of Linux built for Zones. Essentially what you are doing is comparing a PV-type implementation with full virtualization and then highlighting the difference.
 

klokwyze

Limp Gawd
Joined
Feb 10, 2011
Messages
153
Oh god... are we talking about ZFS & Nexenta/Illumos? No thanks. I've dealt with that garbage in a massive production environment. What a pain in the ass.

I'll take a legacy "normal" system any day of the week. HP, Adaptec, LSI, etc. RAID controllers with proper monitoring in place are so much more efficient once you realize the new sort of headaches that ZFS gives you.... even when running optimally. The slight speed boosts & extra features just simply aren't worth it for me personally.

I think it's just because it's relatively new though... Once some sort of ZFS becomes the standard, it will be easy as pie.
 

jwcalla

2[H]4U
Joined
Jan 19, 2011
Messages
3,628
Oh god... are we talking about ZFS & Nexenta/Illumos? No thanks. I've dealt with that garbage in a massive production environment. What a pain in the ass.

I'll take a legacy "normal" system any day of the week. HP, Adaptec, LSI, etc. RAID controllers with proper monitoring in place are so much more efficient once you realize the new sort of headaches that ZFS gives you.... even when running optimally. The slight speed boosts & extra features just simply aren't worth it for me personally.

I think it's just because it's relatively new though... Once some sort of ZFS becomes the standard, it will be easy as pie.

I guess you don't care about your data.
 

Red Falcon

[H]F Junkie
Joined
May 7, 2007
Messages
11,804
Oh god... are we talking about ZFS & Nexenta/Illumos? No thanks. I've dealt with that garbage in a massive production environment. What a pain in the ass.

I'll take a legacy "normal" system any day of the week. HP, Adaptec, LSI, etc. RAID controllers with proper monitoring in place are so much more efficient once you realize the new sort of headaches that ZFS gives you.... even when running optimally. The slight speed boosts & extra features just simply aren't worth it for me personally.

I think it's just because it's relatively new though... Once some sort of ZFS becomes the standard, it will be easy as pie.

It sounds like you don't really know what you're doing, considering everyone else here is having success with it.
Also, you res'd a 2 year old thread to say that? :rolleyes:
 