10 Samsung 2TB drives, every drive writes at a different speed.

zsozso · n00b · Joined Apr 21, 2011 · Messages: 10
Hi guys,

I have run into an issue I can't seem to solve.
I have a machine with the following specs:
Code:
Disks:
12x2tb Samsung
10x3tb Toshiba
CPU:
G2030
Motherboard:
Gigabyte z77
IO:
2x LSI (m1015/h200)
OS:
Freebsd 10.2
I'm using the 10x3tb drives in a raidz2 pool which is capable of 1.3GB/s of reads and 950MB/s writes.
Code:
Pool            : tank (27.2T, 89% full)
Test size       : 64 GiB
normal read	: 1.317 GB/s
normal write	: 947.809 MB/s
I have added the 10x2tb Samsung drives in a raidz2 pool as well, and checked the speed.
It barely did 200MB/s writes and 300MB/s reads :eek:
Code:
Pool            : utank (18.1T, 0% full)
Test size       : 64 GiB
normal read	: 317.215 MB/s
normal write	: 232.401 MB/s
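For context, in a 10-disk raidz2 each block is striped across 8 data disks, so the pool figures imply a rough per-disk rate (a back-of-the-envelope model that ignores parity and metadata overhead):

```python
# Per-data-disk write rate implied by the two pool benchmarks above
# (10-disk raidz2 = 8 data disks; rough model, ignores overhead).
data_disks = 10 - 2
for pool, mb_s in [("tank", 947.809), ("utank", 232.401)]:
    print(f"{pool}: ~{mb_s / data_disks:.0f} MB/s per data disk")
```

The utank number already suggests its members are individually slow, not that the pool layout is at fault.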
I did a dd to see the individual speed of the disks and this is the result:
Code:
dd: /dev/da10: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 25611.117707 secs (78106663 bytes/sec)
dd: /dev/ada1: short write on character device
dd: /dev/ada1: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 28510.373941 secs (70163897 bytes/sec)
dd: /dev/ada2: short write on character device
dd: /dev/ada2: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 30463.944923 secs (65664474 bytes/sec)
dd: /dev/da14: short write on character device
dd: /dev/da14: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 32311.926303 secs (61908997 bytes/sec)
dd: /dev/da9: short write on character device
dd: /dev/da9: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 34419.804713 secs (58117672 bytes/sec)
dd: /dev/da12: short write on character device
dd: /dev/da12: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 37004.582590 secs (54058141 bytes/sec)
dd: /dev/da11: short write on character device
dd: /dev/da11: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 38248.556852 secs (52299985 bytes/sec)
dd: /dev/da8: short write on character device
dd: /dev/da8: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 39927.166415 secs (50101200 bytes/sec)
dd: /dev/da15: short write on character device
dd: /dev/da15: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 52705.992947 secs (37953918 bytes/sec)
dd: /dev/ada0: short write on character device
dd: /dev/ada0: end of device
1907730+0 records in
1907729+1 records out
2000398934016 bytes transferred in 53652.216914 secs (37284553 bytes/sec)
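The spread is easier to see if you convert those dd figures to MB/s (elapsed times copied from the output above, rounded to 0.1 s):

```python
# Elapsed seconds for the ten finished drives, from the dd output above.
secs = [25611.1, 28510.4, 30463.9, 32311.9, 34419.8,
        37004.6, 38248.6, 39927.2, 52706.0, 53652.2]
nbytes = 2000398934016                    # full 2 TB device
rates = [nbytes / s / 1e6 for s in secs]  # MB/s
print(f"fastest {max(rates):.0f} MB/s, slowest {min(rates):.0f} MB/s, "
      f"spread {max(rates)/min(rates):.1f}x")
```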
Three of the dds are still running; according to gstat their speed is less than 20MB/s.
Some of the fast drives are on the motherboard and some are on the LSI controllers, so we can rule out the controllers as well...
And since we are writing to the raw devices, we can rule out the filesystem too.
Has anyone seen something like this before?
 
When I do a dd on a slower disk, it sometimes drops to 7MB/s according to gstat.
I have no idea what could cause something like this.
New cables, no issues according to SMART, no apparent reason why it is so slow.
 
I would try one of these drives in a different computer and do the dd from a Linux live USB stick.
 
Yes, this is common in failing Samsung drives. You will see a lot of SMART errors in attribute 200 (Multi_Zone_Error_Rate).

The problem is that these drives produce a lot of latency if you look at the average service time per request (good ones do 3-6 ms, while bad ones do 12-200 ms). Based on my experience with them (I have 20), I think they are very prone to vibration disturbances.

While these were essentially "best value" drives when they came out roughly 3-4 years ago, by today's standards they are poor for array usage. They are 5400-ish rpm power-managed drives, equivalent to WD's Green line. Most people have phased them out of their arrays by now.

I would suggest using them in a non-striped configuration if you want to use them at all (JBOD, single disks).
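To put those service times in perspective: at queue depth 1, sequential throughput is roughly request size divided by average service time. A quick sketch with hypothetical numbers from the ranges above:

```python
# Throughput implied by per-request service time (queue depth 1,
# assuming 1 MiB sequential requests; service times from the ranges above).
req_bytes = 1024 * 1024
for label, svc_ms in [("healthy, 5 ms", 5), ("degraded, 50 ms", 50), ("worst, 200 ms", 200)]:
    mb_s = req_bytes / (svc_ms / 1000) / 1e6
    print(f"{label}: ~{mb_s:.0f} MB/s")
```

A drive stuck at 200 ms per request can only manage about 5 MB/s, which is in the same ballpark as the 7MB/s figures seen in gstat.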
 
I can't see any outstanding errors in SMART either.
Code:
cat /root/smart.txt |grep Multi_Zone_Error_Rate
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       31
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       6
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       76
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       16
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       8
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       20
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       9
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       86
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       7
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       15

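Those smartctl lines support that reading: every drive's normalized value is still 100 against a threshold of 0, so SMART will not flag attribute 200 regardless of the raw counts. A quick check with the values copied from the output above:

```python
# SMART judges an attribute by its normalized value vs. its threshold;
# the raw count (last column above) is vendor-specific information only.
value, thresh = 100, 0
raw_counts = [31, 6, 76, 16, 8, 20, 0, 9, 86, 7, 15]
flagged = value <= thresh  # SMART calls an attribute failing at/below threshold
print(f"flagged by SMART: {flagged}, raw counts range {min(raw_counts)}-{max(raw_counts)}")
```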
The whole thing is very strange to me because I was using these disks a while ago in an identical raidz2 pool and they weren't slow at all.
The disks sat on a shelf for some time, I built a 10-disk pool out of them, and now they are dead slow?
I will take some of the disks out and try them in a Windows environment to see if they behave the same.
 
Well, I've tried them with HD Tune and HDD Sentinel and the results are less than stellar...
Write:
[HD Tune write benchmark screenshot]

Read:
[HD Tune read benchmark screenshot]

HDD Sentinel thinks it's fine:
[HDD Sentinel health screenshot]
 
I think your problem may really be vibration. Test this theory by powering down your other array and benchmarking again. As you said, this array worked before (in another case? other mounting conditions affecting individual and overall vibration?).

A while ago I built a file server with 10x 6TB WD Reds. The spec sheet of those drives explicitly said "for arrays of up to 10 drives"; the concern is vibration from one drive disturbing the head positioning of the others.

This video shows impressively how prone hard drives are to vibration:
https://www.youtube.com/watch?v=tDacjrSCeq4
 

Well, I think this can't really be the issue, because I took a few drives out and tested them individually on a Windows workstation (pictures in my last post).
 
I would say it's time to retire the drives that are doing this, or just use them for backups.

I have three 2TB F4s in my Linux-based PVR that I've used every day since I installed them (Dec 2010?). I have not noticed this on any of them. The only problem I've had with these so far was the defective firmware bug that caused random data loss if you looked at the SMART data during a write to the disk.
 