Help me find an utterly reliable chipset/board/OS combo

unhappy_mage

[H]ard|DCer of the Month - October 2005
Joined
Jun 29, 2004
Messages
11,455
Continuation from here, which I felt I'd disrupted enough. Anyone's welcome to give input.

Summary to this point: I'm looking into doing some testing of disk performance, and I want to put together a reputable platform to do so from. I'm not planning on doing tests under Windows, so "Alternative OS" support is important. I'm willing to buy a new card, and possibly a new system, to achieve this goal.

Now, to continue replying to the last post:
The retards fucked it up back in 2.2, LSI put out a proper one somewhere in 2.4, then they promptly fucked it up again. I mean come on, they can't even get TCP checksums on the EEPros right - STILL! It has to do with higher-level kernel functions that are horrifically broken, to put it mildly.
Yeah, 2.2 is a bit old for my tastes. "Other OS" it is.
I'll be happy to save you cash and shatter their idiotic dreams for free. Raptor performing on par with a 15K MAX, HA HA HA! Yeah, also, President Bush resigned, Osama Bin Laden apologized for the WTC, and THG published a factual review.
Wow, really?! That's great! However, some people ask for definitive proof of this, and since everyone "in the know" about scsi knows it so clearly, explaining it to everyone else is a waste of time. That's my theory, anyways, since I haven't seen any public, conclusive tests of sata versus scsi on desktop or server apps.
However.. I haven't done enough testing with the SATA MegaRAIDs and Solaris. I can't recommend it. They're not on the HCL. If it's not on the HCL yet, it's not safe. So, yeah, welcome to the land of you're just boned. *sigh*
Any opinions on the Marvell 6081s? I've got one of those, they're on the HCL for Sun, and Sun uses 6 of them in the 4500 (which is a pretty neat-looking machine - but the bus layout scares me, that many 8132s just isn't right :p), so they can't be *that* bad. I know the Linux support for them is shaky at best, though - DMA problems. No great surprise.
I don't screw around with the bus. Contention will destroy validity. So unless you've already got some very big boxes lying around, start planning on laying out somewhere around $2-3K for a test platform. At least.
I figured so. I've got a dual P3 with PCI-X I'll probably use for the initial tests, then one of these days it's an H8DCE for me. The newer Opterons are neat and all, but I haven't seen any of the 2000 series that strike my fancy. Supermicro's boards all seem to have NEC PCI-X chipsets or not enough pci express lanes. Any recommendations?
(which specifically excludes everything from Tyan.)
Why am I not surprised by that? :p
 
unhappy_mage said:
Wow, really?! That's great! However, some people ask for definitive proof of this, and since everyone "in the know" about scsi knows it so clearly, explaining it to everyone else is a waste of time. That's my theory, anyways, since I haven't seen any public, conclusive tests of sata versus scsi on desktop or server apps.

Oh, there are. I'm not sharing them because I get enough arguing with idiots in my diet from my clients. Depending on application and load, SATA continually gets smoked, and even U160 has time for a money shot. The only place SATA actually does any good is when serving limited amounts of static content - specifically, a few dozen files that can all fit in cache together. This trend, predictably, holds even with caching controllers added to the mix.

unhappy_mage said:
Any opinions on the Marvell 6081s? I've got one of those, they're on the HCL for Sun, and Sun uses 6 of them in the 4500 (which is a pretty neat-looking machine - but the bus layout scares me, that many 8132s just isn't right :p), so they can't be *that* bad. I know the Linux support for them is shaky at best, though - DMA problems. No great surprise.

ARRRRRRRRRRRRGH.
NO. EN. OH. DIE. BURN IN HELL. ROT.
The 6081 is a steaming pile of CRAP, just like the X4500 - which is a steaming pile of CRAP. The bus layout is so broken that you have NO hope of achieving full throughput anywhere. The 6081 itself.. NO. Just NO. There is SOMETHING with those that really is wrong, and I haven't had time to figure out what yet. I think they likely have a broken DMA engine. (Not that Linux actually does DMA, but there's other behavior I've seen.)

unhappy_mage said:
I figured so. I've got a dual P3 with PCI-X I'll probably use for the initial tests, then one of these days it's an H8DCE for me. The newer Opterons are neat and all, but I haven't seen any of the 2000 series that strike my fancy. Supermicro's boards all seem to have NEC PCI-X chipsets or not enough pci express lanes. Any recommendations?

Everybody jumped to the garbage that is the NEC uPD 400-series. It uses PCI-Express lanes to play PCI-X; this doesn't work. It causes a significant increase in latency, and adds another ugly layer between bus and CPU. The AMD8132 is an HT-connected bridge. Supermicro has some 8132 boards now; H8DM3, H8DMi, and H8DM8 are all MCP55P + AMD8132, as are the H8QM's. (Rumour mill also says to expect an H8DM3-similar with full dual 16x sometime after May.)

unhappy_mage said:
Why am I not surprised by that? :p

Because you've had the misfortune of dealing with Tyan's Billing Twice^W^W"Customer Service" department?

Robstar said:
Supposedly a firmware upgrade fixes disk corruption under load....?

Completely untrue - yet another Linux cluebie. The reason LSI hasn't replied? Because that's a load of crap; if it were a firmware issue, it would affect all operating systems. (The MegaRAID family uses a unified API.)
The latest firmware is 713R; here are the release notes:
http://www.lsilogic.com/files/support/rsa/sata/firmware/LSI_713R_Readme.txt
By the way, only two firmware releases for the 150-6.
There is a fix for a corruption-while-degraded bug, and a fix for Linux's inability to preserve pointer integrity (AGAIN) causing the controller to vanish. No such load-corruption issue. Typical ignorant whiner. The fact is that the driver is broken because of Linux kiddies screwing around with things they don't understand, and applying dozens of patches to drivers rather than actually fixing kernel pointer integrity - which is responsible for a number of very nasty and absolutely unacceptable bugs.
 
Oh, there are. I'm not sharing them because I get enough arguing with idiots in my diet from my clients. Depending on application and load, SATA continually gets smoked, and even U160 has time for a money shot. The only place SATA actually does any good is when serving limited amounts of static content - specifically, a few dozen files that can all fit in cache together. This trend, predictably, holds even with caching controllers added to the mix.
Hence the word "public", as distinguished from "private".

At school here even mail is on sata storage. How does this even begin to work? They threw a half dozen boxes with 4gb of memory (V20z's) each in front of the FC->sata backend. Lots and lots of buffer, cheap (well, as these things go; they needed the frontends anyways for AFS), and they can add more storage for cheaper than scsi/FC prices - they're only using a quarter of that memory right now, there's plenty of room for growth. Spending the money once, in their case, pays off; they can buy a couple more of the XServe Raid boxes they use for backend storage.
ARRRRRRRRRRRRGH.
NO. EN. OH. DIE. BURN IN HELL. ROT.
The 6081 is a steaming pile of CRAP, just like the X4500 - which is a steaming pile of CRAP. The bus layout is so broken that you have NO hope of achieving full throughput anywhere. The 6081 itself.. NO. Just NO. There is SOMETHING with those that really is wrong, and I haven't had time to figure out what yet. I think they likely have a broken DMA engine. (Not that Linux actually does DMA, but there's other behavior I've seen.)
I thought you might say something of the sort. Are the numbers they throw out (2GB/s to memory, 1GB/s to network) blatant lies or misdirection? If one spreads the access over all the buses, would that make it better or worse? Just curiosity, I don't want to buy one. At $60k, it's not gonna happen any time soon anyways.
Everybody jumped to the garbage that is the NEC uPD 400-series. It uses PCI-Express lanes to play PCI-X; this doesn't work. It causes a significant increase in latency, and adds another ugly layer between bus and CPU. The AMD8132 is an HT-connected bridge. Supermicro has some 8132 boards now; H8DM3, H8DMi, and H8DM8 are all MCP55P + AMD8132, as are the H8QM's. (Rumour mill also says to expect an H8DM3-similar with full dual 16x sometime after May.)
That'd be neat. It does seem silly to add extra latency that way. But then, I don't design motherboards for a living; perhaps there's some obvious reason to avoid the 8132 in favor of the NEC? Other than cost, that is?
Because you've had the misfortune of dealing with Tyan's Billing Twice^W^W"Customer Service" department?
Well, that, and I've read your posts before ;)
 
So I guess my choices are freebsd/openbsd? I am more proficient in OpenBSD but am open to suggestions. I just want something reliable/stable/non-windows for my fileserver.
 
What is a linux cluebie, and are you referring to me or the poster I linked to?
 
unhappy_mage said:
Hence the word "public", as distinguished from "private".

Bah. :p

unhappy_mage said:
At school here even mail is on sata storage. How does this even begin to work? They threw a half dozen boxes with 4gb of memory (V20z's) each in front of the FC->sata backend. Lots and lots of buffer, cheap (well, as these things go; they needed the frontends anyways for AFS), and they can add more storage for cheaper than scsi/FC prices - they're only using a quarter of that memory right now, there's plenty of room for growth. Spending the money once, in their case, pays off; they can buy a couple more of the XServe Raid boxes they use for backend storage.

... excuse me. I'm going to slam my head into the desk repeatedly now. It's like a contest for how much bad decision making you can pack into one configuration, short of using a sprinkler system in a data center. It's... just... ugh. Ugh. Ugh.

unhappy_mage said:
I thought you might say something of the sort. Are the numbers they throw out (2GB/s to memory, 1GB/s to network) blatant lies or misdirection? If one spreads the access over all the buses, would that make it better or worse? Just curiosity, I don't want to buy one. At $60k, it's not gonna happen any time soon anyways.

Partly fabrication, partly fact. The Marvell 6081 is a PCI-interfaced controller topping out at 66MHz, according to everything I've seen, drivers included. That said: what's the maximum speed of a PCI 32/66 bus? If you guessed less than 2GB/sec, you win! If you guessed less than 1GB/sec, you win even more! The only way you can get there is by, surprise, the same garbage design Sun did with the disks: cram a whole ton of these hot-running bastards onto the board - I would guess a dozen (2 drives per), which runs you out of PCI controllers very fast (you would need a minimum of 4 PCI controllers). They may be using port multipliers soldered onto the board to take it down to 6, but either way, it's a giant flaming hack of crap.
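(A quick sketch of that arithmetic, for anyone who wants to check it - theoretical peak is just bus width times clock, before any protocol or arbitration overhead:)

# Theoretical peak PCI/PCI-X rates: bus width in bytes times clock in MHz.
# These are ceilings before arbitration and protocol overhead eat into them.
def peak_mb_s(width_bits, clock_mhz):
    return (width_bits // 8) * clock_mhz

for width, clock in [(32, 33), (32, 66), (64, 66), (64, 100), (64, 133)]:
    print(f"PCI {width}/{clock}: {peak_mb_s(width, clock)} MB/s")
# PCI 32/66 tops out at 264 MB/s - a single bus gets nowhere near 1-2 GB/sec.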

unhappy_mage said:
That'd be neat. It does seem silly to add extra latency that way. But then, I don't design motherboards for a living; perhaps there's some obvious reason to avoid the 8132 in favor of the NEC? Other than cost, that is?

The theory is that where it's used, users won't notice the extra latency and timing errors, or will only use 100MHz cards. The uPD can only handle one card at 133MHz because PCIe is timed at 100MHz fixed and the clock generator is apparently crap. I was seeing repeated timing errors at 133MHz and 66MHz.
The primary reason for the NEC uPD? Cost, cost, and cost. Cheaper to design with, cheaper to put on the board, easier to put on the board. Comparatively, the 8132 is a big mother next to the uPD, and also has additional features the uPD doesn't.

unhappy_mage said:
Well, that, and I've read your posts before ;)

Yet you're still here? What am I doing wrong? ;)

Robstar said:
So I guess my choices are freebsd/openbsd? I am more proficient in OpenBSD but am open to suggestions. I just want something reliable/stable/non-windows for my fileserver.

If you're feeling froggy, Solaris x86/64 may or may not work. I have not really tested the PCIe's thoroughly enough. Check the HCL, it's updated constantly.
http://www.sun.com/bigadmin/hcl/data/sol/

Robstar said:
What is a linux cluebie, and are you referring to me or the poster I linked to?

Sorry; didn't mean for you to think I was referring to you. I was decidedly referring to the guy complaining about data corruption. Most likely he had a degraded array and turned off the alarm. The typical Linux user these days has become a drivelling idiot whose sole reason for using Linux is "I HATE MICRO$0FT!!!"
One of my favorite links to pass around with regards to Linux...
http://www.itbusinessedge.com/blogs/rob/?p=9
 
440BX + Slot1 pIII + Intel Pro 100 = Happy :D
Well, yes, if I wanted IDE or Ethernet ;)
... excuse me. I'm going to slam my head into the desk repeatedly now. It's like a contest for how much bad decision making you can pack into one configuration.
Expound? Given that the configuration handles ~5k users at once, almost all the time, it can't be too broken. Too expensive for what one gets, perhaps, but it works. What would you have put in place for 10k users, mail and userdirs? Just off the top of your head; I'm not expecting a case study. What part is the most broken?
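(Back-of-the-envelope on those numbers, just as a sketch - six 4 GB frontends, about a quarter of the memory in use, ~5k concurrent users, all figures straight from this thread:)

# Rough cache-per-user arithmetic from the numbers above (a sketch, not a
# sizing exercise): six V20z frontends with 4 GB each, about a quarter of
# that memory in use, roughly 5k concurrent users.
boxes, ram_gb_each = 6, 4
in_use_fraction = 0.25
concurrent_users = 5000

total_gb = boxes * ram_gb_each                    # 24 GB of frontend RAM
used_gb = total_gb * in_use_fraction              # ~6 GB actually in use
per_user_mb = used_gb * 1024 / concurrent_users   # ~1.2 MB of cache per user
print(f"{total_gb} GB total, ~{used_gb:.0f} GB in use, ~{per_user_mb:.1f} MB per concurrent user")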
Partly fabrication, partly fact. The Marvell 6081 is a PCI-interfaced controller topping out at 66MHz, according to everything I've seen, drivers included. That said: what's the maximum speed of a PCI 32/66 bus? If you guessed less than 2GB/sec, you win! If you guessed less than 1GB/sec, you win even more! The only way you can get there is by, surprise, the same garbage design Sun did with the disks: cram a whole ton of these hot-running bastards onto the board - I would guess a dozen (2 drives per), which runs you out of PCI controllers very fast (you would need a minimum of 4 PCI controllers). They may be using port multipliers soldered onto the board to take it down to 6, but either way, it's a giant flaming hack of crap.
A 64/66 device has 8*66 = 528 MB/s. Call 20% of that overhead and you've still got over 400 MB/s. And yes, they have 10 PCI-X buses - five 8132s. Take a look at the Architecture whitepaper, page 20. It's not pretty, but damned if they don't have enough PCI buses to get a GB/s or two.
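(Same thing in script form - a sketch only; the 20% overhead figure and the ten-bus count are the ones quoted in this thread, not measured numbers:)

# Aggregate estimate for the X4500-style layout under discussion: per-bus
# peak for 64-bit/66MHz PCI-X minus an assumed 20% overhead, summed over
# the buses. The 20% and the 10-bus count come from this thread, not from
# any measurement.
PEAK_64_66 = 8 * 66                                # 528 MB/s per 64/66 bus
OVERHEAD = 0.20
N_BUSES = 10

usable_per_bus = PEAK_64_66 * (1 - OVERHEAD)       # ~422 MB/s
total_gb_s = usable_per_bus * N_BUSES / 1000       # ~4.2 GB/s aggregate
print(f"~{usable_per_bus:.0f} MB/s per bus, ~{total_gb_s:.1f} GB/s across {N_BUSES} buses")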
The theory is that where it's used, users won't notice the extra latency and timing errors, or will only use 100MHz cards. The uPD can only handle one card at 133MHz because PCIe is timed at 100MHz fixed and the clock generator is apparently crap. I was seeing repeated timing errors at 133MHz and 66MHz.
Well, heck, I could force the card to 100 MHz if that'd resolve (or avoid) the issue. I'm not planning on testing more than half a dozen disks, so 800 MB/s should be enough. For a start, anyways :p
The primary reason for the NEC uPD? Cost, cost, and cost. Cheaper to design with, cheaper to put on the board, easier to put on the board. Comparatively, the 8132 is a big mother next to the uPD, and also has additional features the uPD doesn't.
Fair enough.
One of my favorite links to pass around with regards to Linux...
http://www.itbusinessedge.com/blogs/rob/?p=9
Response coming later on this one. Long story short, a lot of his points are more about the open source model than about Linux specifically - and *BSD and Solaris share that model to some extent.
 
I read an article recently (yeah, I don't have the link; Slashdot, I believe) that studied the reliability of SATA vs. SCSI; they found no direct correlation between interface type and failure rate.

We pound the crap out of SATA arrays with no issues to date. I'd test SATAs in our heavy DBs if they made 15K versions.

With today's newer SANs, like our Xiotech, that span all arrays across all containers, the necessity of "hardcore" drives is lessened IMO.
 
If you're feeling froggy, Solaris x86/64 may or may not work. I have not really tested the PCIe's thoroughly enough. Check the HCL, it's updated constantly.

Interesting as I ordered solaris "free" about a few months ago. Waiting for it to show up! The box I'm building it on does not have pci-e. It's a dual athlon-mp I have sitting around.

I have just started getting into linux because we use it here at work. I have been using openbsd since 1997 and freebsd since maybe mid 2005.
 
I read an article recently (yeah, I don't have the link; Slashdot, I believe) that studied the reliability of SATA vs. SCSI; they found no direct correlation between interface type and failure rate.
We pound the crap out of SATA arrays with no issues to date. I'd test SATAs in our heavy DBs if they made 15K versions.
With today's newer SANs, like our Xiotech, that span all arrays across all containers, the necessity of "hardcore" drives is lessened IMO.

Believe me, that's a load of crap. I read the supposed study, found faults everywhere, and realized the whole thing was garbage. SATA is nowhere near as reliable as SCSI and FC-AL under prolonged heavy load, period. If they were using SCSI/FC-AL armatures, heads, and motors then maybe. But that's where a large chunk of their cost savings in SATA is, so not happening any time soon. (Weaker armatures and slower motors are cheaper.)

Robstar said:
Interesting as I ordered solaris "free" about a few months ago. Waiting for it to show up! The box I'm building it on does not have pci-e. It's a dual athlon-mp I have sitting around.

I get special treatment from Sun for a variety of reasons, but I ordered my Solaris Express Dev Kit CD when they announced it, got it about two weeks later. Feb07 x86/64 media for free - can't beat it. :)

Robstar said:
I have just started getting into linux because we use it here at work. I have been using openbsd since 1997 and freebsd since maybe mid 2005.

Then you'll have an easy time of it. FreeBSD is OpenBSD with less denial and Theo. And better hardware support. (Most OpenBSD drivers come from NetBSD or FreeBSD anyways.)
 
Believe me, that's a load of crap. I read the supposed study, found faults everywhere, and realized the whole thing was garbage. SATA is nowhere near as reliable as SCSI and FC-AL under prolonged heavy load, period. If they were using SCSI/FC-AL armatures, heads, and motors then maybe. But that's where a large chunk of their cost savings in SATA is, so not happening any time soon. (Weaker armatures and slower motors are cheaper.)

I get special treatment from Sun for a variety of reasons, but I ordered my Solaris Express Dev Kit CD when they announced it, got it about two weeks later. Feb07 x86/64 media for free - can't beat it. :)

When it arrives, I'll give it a shot :)

Then you'll have an easy time of it. FreeBSD is OpenBSD with less denial and Theo. And better hardware support. (Most OpenBSD drivers come from NetBSD or FreeBSD anyways.)

Drive reliability>
What kind of study have you done? I think the one I'm referring to was the largest done to date and was very scientific - it was done by Google with over 100k drives, measuring temp 24/7, SMART, and a ton of other factors.

FreeBSD/OpenBSD
Yes, so it seems. Might end up using it for a desktop OS too. Just upgraded to 8G and this xp pro-32 is getting tired.

Argh, now I see that nvidia STILL doesn't have a 64-bit freebsd driver *beats head on desk*

I also just looked at my email - I asked for the solaris license/media on JANUARY 15th. Not sure why it never got here. Anyone know if we can just register & download it on their website?
 
Robstar said:
Drive reliability>
What kind of study have you done? I think the one I'm referring to was the largest done to date and was very scientific - it was done by Google with over 100k drives, measuring temp 24/7, SMART, and a ton of other factors.

They didn't measure load metrics, startup loads, or voltages, and SMART is known to lie on SATA and SCSI drives alike. There was no bus monitoring to rule out bus-induced failures, insufficient environmental monitoring, no random samples taken to inspect flow filters, and so on.
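(For reference, this is roughly the kind of data such a study leans on - a minimal sketch, assuming smartmontools is installed and /dev/sda happens to be the drive in question; both are assumptions on my part:)

# Dump a few SMART attributes of the sort a drive-reliability study would log.
# Assumes smartmontools is installed and that /dev/sda is the drive of
# interest - both assumptions, not anything from the posts above. Raw values
# are vendor-defined, which is part of why SMART can "lie".
import subprocess

ATTRS = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Temperature_Celsius")

out = subprocess.run(["smartctl", "-A", "/dev/sda"],
                     capture_output=True, text=True, check=False).stdout
for line in out.splitlines():
    if any(attr in line for attr in ATTRS):
        print(line)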

Robstar said:
FreeBSD/OpenBSD
Yes, so it seems. Might end up using it for a desktop OS too. Just upgraded to 8G and this xp pro-32 is getting tired.

Uhm, you do know that XP Pro32 won't do 8GB, right...?
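(The ceiling is just 32-bit addressing - one line of arithmetic, as a sketch:)

# A 32-bit address space is 2^32 bytes; 32-bit XP won't use physical memory
# beyond that, so 8 GB is out of reach.
print(2**32 / 2**30, "GiB addressable with 32-bit pointers")   # 4.0 GiB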

Robstar said:
Argh, now I see that nvidia STILL doesn't have a 64-bit freebsd driver *beats head on desk*
I also just looked at my email - I asked for the solaris license/media on JANUARY 15th. Not sure why it never got here. Anyone know if we can just register & download it on their website?

Take it up with nVidia. They won't provide even basic chipset docs because people like me are around, and we'll gladly have comments in the code like..
/*
* This code is to work around yet another internal corruption issue on nFarce.
*/

If you're registered, yes, you can download Solaris x86/64 for free. I forget the way to it though. It's under products and downloads I think.
 
Believe me, that's a load of crap. I read the supposed study, found faults everywhere, and realized the whole thing was garbage. SATA is nowhere near as reliable as SCSI and FC-AL under prolonged heavy load, period. If they were using SCSI/FC-AL armatures, heads, and motors then maybe. But that's where a large chunk of their cost savings in SATA is, so not happening any time soon. (Weaker armatures and slower motors are cheaper.)

Agreed. I've had both apart. But I'd ask: how much "beef" is actually required? From the pounding I've seen recent SATA drives take, perhaps the hardcore SCSI drives are way over-engineered beyond what's necessary?
 
Uhm, you do know that XP Pro32 won't do 8GB, right...?

I know xp pro won't do 8GB :) I was planning to move before I bought the ram, anyhow :)
 