Intel P3608 1.6TB vs. Amfeltec 4x 1TB 960 Pro

I have an Intel P3608 1.6 TB in my home workstation. Not sure why I really did this, but I decided to try out the Squid from Amfeltec and put four 1TB Samsung 960 Pro NVMe drives on there to do a comparison. Thought I would share the results:

P3608
[screenshot: P3608 CrystalDiskMark results]


Amfeltec:
[screenshot: Amfeltec Squid CrystalDiskMark results]


Anyone have comparison numbers to something similar so I can glean whether I am getting the full performance out of these?
 
Yes, try pg_test_fsync and you will see that the Samsung 960 Pro fails big time.
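For anyone who wants to reproduce that, here is a minimal sketch of one way to script the run and capture the output; it assumes pg_test_fsync (shipped with PostgreSQL) is on your PATH, and the test file path and duration below are just example values:

```python
# Minimal sketch: run pg_test_fsync against a file on the drive under test
# and print its report. Assumes PostgreSQL's pg_test_fsync is on PATH;
# the -f path and -s duration below are example values only.
import subprocess

result = subprocess.run(
    ["pg_test_fsync", "-f", r"D:\pg_test_fsync.tmp", "-s", "5"],
    capture_output=True,
    text=True,
)
print(result.stdout)  # per-method fsync ops/sec is reported here
print(result.stderr)
```

The interesting part is the ops/sec for each sync method; consumer drives without power-loss protection tend to fall way behind datacenter parts like the P3608 on that test.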
 
What does the 4K test show? Random reads/writes with 4K blocks? I would have expected those to be higher on a brand-new NVMe drive.
 
Nice. I looked into the Amfeltec card, but never bit. Did you buy direct? Mind disclosing what you paid? Thanks. :)
 
I'm tempted to get single-thread numbers for this when I'm home, from my desktop that runs a billion background apps, to see the difference.
 
OK, here are my results for a single 960 M.2 NVMe PCIe drive on the system in my sig.

-----------------------------------------------------------------------
CrystalDiskMark 5.2.1 x64 (C) 2007-2017 hiyohiyo
Crystal Dew World : http://crystalmark.info/
-----------------------------------------------------------------------
* MB/s = 1,000,000 bytes/s [SATA/600 = 600,000,000 bytes/s]
* KB = 1000 bytes, KiB = 1024 bytes

Sequential Read (Q= 32,T= 1) : 3315.483 MB/s
Sequential Write (Q= 32,T= 1) : 1943.791 MB/s
Random Read 4KiB (Q= 32,T= 1) : 760.903 MB/s [185767.3 IOPS]
Random Write 4KiB (Q= 32,T= 1) : 636.454 MB/s [155384.3 IOPS]
Sequential Read (T= 1) : 2243.936 MB/s
Sequential Write (T= 1) : 1933.159 MB/s
Random Read 4KiB (Q= 1,T= 1) : 50.028 MB/s [ 12213.9 IOPS]
Random Write 4KiB (Q= 1,T= 1) : 246.609 MB/s [ 60207.3 IOPS]

Test : 1024 MiB [C: 29.8% (277.5/931.0 GiB)] (x5) [Interval=5 sec]
Date : 2017/03/07 20:15:30
OS : Windows 10 [10.0 Build 14393] (x64)


Just to have a point of reference: my motherboard is the MSI Z270 Gaming M7.
 
Just got back into town and should have some time this weekend to do some more comparisons and 4K read/write tests. One other caveat here is that the drives were not installed in the same machine. The 4x 1TB 960 Pros were installed in a Gigabyte GA-Z270X-Gaming 9 motherboard with an overclocked 7700K. This motherboard has the added PLX chip, but I didn't realize until recently that the overall bandwidth back to the CPU is still limited, so that could very well be why the results are lower than expected. The P3608 is installed in a Haswell-E system, so I should definitely use that system for all the testing. Will do that this weekend and report back.

Also, any suggestions (and links) to benchmarks you want me to run would be appreciated.
 
Nice. I looked into the Amfeltec card, but never bit. Did you buy direct? Mind disclosing what you paid? Thanks. :)

I bought it through a local outfit, but I think the cost was around $350 for the card itself. I have my Haswell-E system up here and am looking forward to throwing it in there to see if the extra available bandwidth helps its performance.
 
OK, well, I moved the Amfeltec over to my Haswell-E system and there was quite a jump in performance thanks to the extra PCIe bandwidth available:

[screenshot: Amfeltec Squid on Haswell-E, CrystalDiskMark results]
 
I find it interesting that my Samsung M.2 PCIe blah blah blah was able to outperform your quad storage on a couple of metrics. On the rest it was trounced, as I would expect. Thanks for posting; it is food for thought.
 
Figured I might as well compare this to a RAM disk too. I used the Dataram RAMDisk utility to create a 24GB RAM disk and ran the same test. The system is an i7 5960X overclocked to 4.5GHz with 64GB of RAM (@ 2666):

[screenshot: 24GB RAM disk CrystalDiskMark results]


The main drawback of the Amfeltec as it stands right now is that you lose TRIM when you RAID the drives. As I was thinking about this, I figured one potential way to deal with it would be to create a VHDX on the array, mount that as a drive, and TRIM from within it. Of course there would be overhead from living inside the VHDX, but you would never have to worry about performance degrading. So here are the results from that run:
[screenshot: VHDX-on-array CrystalDiskMark results]


While TRIM does work in this scenario, there is a performance penalty that doesn't seem worth it. But now we know! :)
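For anyone who wants to try the same workaround, here is a minimal sketch of one way to script the VHDX setup, not necessarily how mine was set up. The VHDX path, size, and drive letters are placeholders, it drives diskpart (plus a manual retrim via defrag) from Python, and it needs an elevated prompt:

```python
# Rough sketch of the VHDX-on-the-array workaround: create a VHDX on the
# RAID-0 volume (R: here), attach and format it, then trigger a retrim on
# the mounted volume. Paths, sizes, and drive letters are placeholders.
# Must be run from an elevated (administrator) prompt.
import subprocess
import tempfile

DISKPART_SCRIPT = r"""
create vdisk file=R:\array.vhdx maximum=819200 type=expandable
select vdisk file=R:\array.vhdx
attach vdisk
create partition primary
format fs=ntfs label=VHDX_TRIM quick
assign letter=V
"""

def run_diskpart(script: str) -> None:
    # diskpart /s executes the commands from a script file
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
        f.write(script)
        path = f.name
    subprocess.run(["diskpart", "/s", path], check=True)

run_diskpart(DISKPART_SCRIPT)

# defrag /L issues a retrim on TRIM-capable volumes
subprocess.run(["defrag", "V:", "/L"], check=True)
```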
 
I find it interesting that my Samsung M.2 PCIe blah blah blah was able to outperform your quad storage on a couple of metrics. On the rest it was trounced, as I would expect. Thanks for posting; it is food for thought.

Yeah, I agree. It's rather odd how the array behaves. Some things scale really well, while other metrics tank. Things like databases and VMs would be right at home, but I'm not sure I would recommend someone doing something like this. It's a waste of money for 99% of people...
 
Yeah, if you're looking to spend a few grand on a fast solution, a single true PCIe card might be a better fit and wouldn't have the same issues, since its controller is designed to work with it.

I wouldn't ever put a database on something without redundancy, at least not in a professional environment.
 
I have an Intel P3608 1.6 TB in my home workstation. Not sure why I really did this, but I decided to try out the Squid from Amfeltec and put four 1TB Samsung 960 Pro NVMe drives on there to do a comparison. Thought I would share the results:

P3608
[screenshot: P3608 CrystalDiskMark results]

Amfeltec:
[screenshot: Amfeltec Squid CrystalDiskMark results]

Anyone have comparison numbers to something similar so I can glean whether I am getting the full performance out of these?


My 2x SM961 NVMe 1TB in software RAID-0 in Win 10 on X99:


[benchmark screenshot]
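In case anyone is curious how a stripe like that is put together, here is a minimal sketch of one way to script it with diskpart instead of clicking through Disk Management; the disk numbers and drive letter are placeholders, and the clean step wipes both disks:

```python
# Minimal sketch: build a Windows software stripe (RAID-0) across two disks
# with diskpart. Disk numbers and the drive letter are placeholders; "clean"
# destroys whatever is on the selected disks before they become dynamic.
# Must be run from an elevated (administrator) prompt.
import subprocess
import tempfile

DISKPART_SCRIPT = r"""
select disk 1
clean
convert dynamic
select disk 2
clean
convert dynamic
create volume stripe disk=1,2
format fs=ntfs label=STRIPE quick
assign letter=S
"""

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write(DISKPART_SCRIPT)
    script_path = f.name

# diskpart /s executes the commands from the script file
subprocess.run(["diskpart", "/s", script_path], check=True)
```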
 
These results don't make sense to me, but maybe I'm missing something. Aren't the sequential numbers far higher than the raw maximum PCIe speed? For PCIe 2.0 x4, that's 2000 MB/s (each direction), right?
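For what it's worth, the usual back-of-the-envelope math is below; the key point is that the 960 Pro is a PCIe 3.0 x4 device (and the P3608 is PCIe 3.0 x8), so the per-device ceiling is well above the 2 GB/s a PCIe 2.0 x4 link would give:

```python
# Back-of-the-envelope PCIe bandwidth per direction, ignoring protocol overhead.
# PCIe 2.0: 5 GT/s per lane, 8b/10b encoding    -> 500 MB/s per lane
# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding -> ~985 MB/s per lane
def pcie_mb_per_s(gen: int, lanes: int) -> float:
    raw_rate = {2: 5e9, 3: 8e9}            # transfers per second, per lane
    encoding = {2: 8 / 10, 3: 128 / 130}   # payload bits per bit on the wire
    return raw_rate[gen] * encoding[gen] / 8 * lanes / 1e6

print(pcie_mb_per_s(2, 4))    # ~2000 MB/s  (PCIe 2.0 x4)
print(pcie_mb_per_s(3, 4))    # ~3938 MB/s  (PCIe 3.0 x4, a single 960 Pro)
print(pcie_mb_per_s(3, 16))   # ~15754 MB/s (PCIe 3.0 x16, the Squid's slot)
```

Real-world sequential numbers land below those figures once packet and filesystem overhead are counted, which is roughly where the ~3.3 GB/s single-drive reads posted earlier sit.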
 
I appreciate others adding their benchmarks... I (like Luke M) was thinking something may be off here, so I started digging further and found something rather peculiar.

I monitored my DPC latency using LatencyMon and found I had several rogue drivers causing problems. I don't have screenshots of the results, but the main culprit was storport.sys, and I was being told my system would have problems playing back audio without stuttering. I have SLI Titan X (Pascal) cards and a 4K monitor but never noticed issues in games. The only thing I *have* noticed is my mouse seems to be off when playing Doom, but only in multiplayer (single player is fine). I went through my BIOS and confirmed I am still on the most recent version available (and I was). Basically I double and triple checked everything, but still had high latency. I cleared my BIOS and set everything back up, but no change. I took all my cards out of the system, turned off every add-on via the motherboard, and still had high latency (storport.sys). What made even less sense is that I have never used a storport driver! Why was it on my system? The only storage I have on this system is a P3608 and the 960 Pro, and I have the RSTe and Samsung NVMe drivers installed for those. Where is storport.sys coming from?!?

While I still don't have an exact answer as to how it got set up, I did find a way to fix it. I downloaded all the latest drivers and put them on a USB drive, then re-installed Windows 10 without the system being connected to the internet (so it couldn't find drivers and auto-install them). I installed and set up all the hardware without an internet connection, then looked at my latency. Bottom line is it's at or near 0 for everything now! I ran CrystalDiskMark, and sure enough, I got way better transfer rates! Here are new results from a single 960 Pro:

[screenshot: single 960 Pro CrystalDiskMark results after the clean install]


And this is WITHOUT the Samsung NVMe driver installed!

Long story short, I think what happened is that Windows was auto-installing certain drivers in the background during updates, and those conflicted with the drivers I installed. Disconnecting from the internet during install and setup fixed the issue. The system does indeed seem much quicker and snappier than before, too! I still need to get some things switched around before I can test the P3608, but I will report back with what I find.
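If anyone else wants to see which storage drivers actually ended up on their system, here is a quick sketch of one way to list them; driverquery ships with Windows, and the filter terms below are just examples of what to look for:

```python
# Quick sketch: list installed kernel drivers with driverquery (built into
# Windows) and print the storage-related ones, e.g. storport / stornvme /
# the Samsung NVMe miniport. The filter terms below are just examples.
import csv
import io
import subprocess

out = subprocess.run(
    ["driverquery", "/v", "/fo", "csv"],
    capture_output=True,
    text=True,
    check=True,
).stdout

interesting = ("stor", "nvme", "samsung", "iastor")
for row in csv.DictReader(io.StringIO(out)):
    module = row.get("Module Name", "")
    if any(term in module.lower() for term in interesting):
        print(module, "|", row.get("Display Name", ""), "|", row.get("Path", ""))
```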

**EDIT & UPDATE**
The stinking Samsung NVMe driver is what adds back storport.sys and kills my system with latency!!! Most info I have read says the Samsung driver performs *better* than the Windows one. Maybe only X99 chipsets are adversely affected?? Anyone else able to confirm their latency with the Samsung driver installed?
 
Checking in with some older technology.
Four 1.2TB ioDrive IIs in a stripe.

These are passed through to a VM, so my PCIe request size is gimped a little, but the performance is still decent.
The Fusion-io driver prefers multiple threads, so I had to tune that a tad to get max performance, plus I bet there's some Windows dynamic disk overhead with the stripe.
E5-2670 as the CPU, 8 cores passed through to the VM.

[screenshot: 4x ioDrive II stripe benchmark, 8 cores]


0ms latency for the whole test.

With 16 cores, I can do a bit better, but this is still a slower CPU.
[screenshot: 4x ioDrive II stripe benchmark, 16 cores]


ATTO gives me much better numbers throughput-wise since it's not doing all 4K transfers.
[screenshot: ATTO benchmark results]
 
So I'm on a Z270 chipset and I do not have high latency with that driver, though I do have it with storport.sys.

I have some old tech plugged into the USB ports on the back and wonder if that is the culprit. I'm going to do a reboot and see if disabling some other pieces in Windows will resolve the issue for me.
 
Now I'm all bent on figuring out why storport.sys is causing me crazy latency... in the 2000 range... GRRRR...
 
@Grinlaking

I know what you mean! I was pulling my hair out trying to track this down. Again, just to verify: you *are* or *are not* using the Samsung NVMe driver? If you are, you may want to uninstall it, as that is what was using storport.sys on my system. Removing it fixed all the latency problems. And believe me, I tried disabling EVERYTHING before figuring out that is what it was. I disabled every single unused port and only had a video card installed (I literally turned off spare PCIe ports, all USB, all NICs, all storage controllers, etc.) and even re-installed Windows because I didn't want to believe it was the Samsung NVMe driver.
 
Yeah, I may move off of that driver. BUT, I haven't had any real-world performance impact from this, and I am far from the only person with this problem with storport.sys. Though from what I've read, it is used for high-bandwidth controllers and is part of the OS, NOT the special Samsung NVMe driver.

And yes, I have the Samsung 960 EVO... I think it's called out in my sig.
 
I didn't have any apparent usability issues either, but it did turn out to be affecting the performance of my other storage devices, and it also contributed to some stuttering in very fast-paced gaming. I use an external USB DAC for audio too, and again, while I didn't have any issues, I "could" cause pops and cracks in my audio if I did certain things. Getting rid of the Samsung NVMe driver fixed these issues and the system does feel snappier.
 
Just a note about the Samsung NVMe driver: DPC latency was horrible when running the Samsung NVMe 2.1 driver, but just fine with the older 2.0 version. Running with just the Microsoft NVMe driver initially looked OK, but running something like the AS SSD benchmark showed some terrible performance. I'd recommend using the Samsung 2.0 NVMe driver. It seems fine, at least for me. Definitely avoid the 2.1 driver with its horrible DPC latency.
 
Just a note about the Samsung NVMe driver: DPC latency was horrible when running the Samsung NVMe 2.1 driver, but just fine with the older 2.0 version. Running with just the Microsoft NVMe driver initially looked OK, but running something like the AS SSD benchmark showed some terrible performance. I'd recommend using the Samsung 2.0 NVMe driver. It seems fine, at least for me. Definitely avoid the 2.1 driver with its horrible DPC latency.

Thanks for the info. I will give that version of the driver a shot and see what happens.

**EDIT** Tried out the 2.0 version and can confirm it also took care of all the latency issues. Stay away from the 2.1 version of the Samsung driver!
 