Large File Transfers with Gigabit Server & Gigabit NAS

GreenLED

I am supremely frustrated this morning. I have a very large file transfer to complete for a client, and everything was working smoothly at first.

Setup
1. Server with Gigabit Network Adapter
2. Western Digital MyCloud PR4100

BOTH ends of the connection have a gigabit-capable adapter.

When I first begin the file transfer, I get a nice, steady 110 MB/s transfer rate.

Then, after some time, it drops to 35 MB/s.

I have configured a software RAID array on Server 2012 (which is working without issue).

The physical link runs through an unmanaged Gigabit Ethernet switch.

What should I try? Flow control adjustment? Jumbo frames? This is unbelievably frustrating.
 
It could be several things. What are the file sizes and the quantities of small files? I have seen copies of thousands to hundreds of thousands of files under 16 KB drag a gigabit adapter down to 5 MB/s. Is the array ReFS? That will make the small-file issue even worse. If this is the case, you might want to zip the files locally, transfer the one large blob, and then expand it on the remote system. It could be a network issue as well; can you put Wireshark up on the wire and see how your network flow is going? The MyCloud is also not a high-performance box and may have problems keeping up at wire speed.
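
If you do go the bundle route, plain tar with no compression is usually the fastest way, since the goal is one big blob rather than a smaller one. Something like this, with the folder name as a placeholder:

# bundle the small files into one large file (no compression, so the CPU stays out of the way)
tar -cf bundle.tar ./small-files-folder

# copy bundle.tar across, then on the receiving end:
tar -xf bundle.tar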
 
It could be several things. What are the file sizes and the quantities of small files? I have seen copies of thousands to hundreds of thousands of files under 16 KB drag a gigabit adapter down to 5 MB/s. Is the array ReFS? That will make the small-file issue even worse. If this is the case, you might want to zip the files locally, transfer the one large blob, and then expand it on the remote system. It could be a network issue as well; can you put Wireshark up on the wire and see how your network flow is going? The MyCloud is also not a high-performance box and may have problems keeping up at wire speed.

I have seen the flow of these conversations before on other threads. I suppose we shall have to start checking off things.

Nearly all of the files I am transferring (98% of them) are very large (3-10 GB). You are (of course) correct about smaller files choking down the flow of data, but to be clear, that is not the issue here. I hate to go full-on Wireshark already without looking at other things first.

Should I test with a non-RAID drive just to see if the RAID array is at fault?

Modify flow control settings? Transmit buffers, receive buffers? What else can I tell you or check off in search of an answer?

Thank you in advance for all your suggestions so far.
 
If it starts fast and stays fast for at least a few minutes, it's most likely not your network or your network settings at the root of the problem. I'm the first one to turn to Wireshark for problems, but I really doubt it would be very useful in this case. A quick peek might show some major problems, but I'd look elsewhere.

What sort of drives are in the NAS -- any chance they're shingled? Long sequential writes to an empty drive should be great for shingled drives, but for that to work the host OS needs to know about it, and a lot of SMR drives lie to their host, so things get slower and slower. If you can get a terminal on the NAS box, you might try a long sequential write, something like dd if=/dev/zero of=somefile bs=1M count=512 (with the network copy stopped, so you're not competing). If that's fast and your network copy is slow, maybe it is the network; but if it's slow too, it's the drives.
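
One caveat on that dd test: if the write is small enough it can land entirely in the NAS's RAM and look much faster than the drives really are. Something like this (path and size are placeholders) forces the data to actually hit the disk before dd reports a speed:

# 4 GB sequential write, flushed to disk before the rate is reported
dd if=/dev/zero of=/path/on/nas/testfile bs=1M count=4096 conv=fdatasync
rm /path/on/nas/testfile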

Also, Western Digital says "Turning Off Cloud Access / Remote Access, DLNA Media Server and iTunes Server can help increase data transfer rates," so that might be worth trying if your large files are media files.
 
I have seen the flow of these conversations before on other threads. I suppose we shall have to start checking off things.

Nearly all of the files I am transferring (98% of them) are very large (3-10 GB). You are (of course) correct about smaller files choking down the flow of data, but to be clear, that is not the issue here. I hate to go full-on Wireshark already without looking at other things first.

Should I test with a non-RAID drive just to see if the RAID array is at fault?

Modify flow control settings? Transmit buffers, receive buffers? What else can I tell you or check off in search of an answer?

Thank you in advance for all your suggestions so far.
As I said above, the WD MyCloud unit is basically 2-4 (depending on model) drives and about $10 worth of the shittiest SOC they could get away with. It just might not be able to keep up. Do you have another box on the network you can mount to see if the problems disappear?
 
If it starts fast and stays fast for at least a few minutes, it's most likely not your network or your network settings at the root of the problem. I'm the first one to turn to Wireshark for problems, but I really doubt it would be very useful in this case. A quick peek might show some major problems, but I'd look elsewhere.

And herein lies the rub.

That is exactly what is happening. It starts out fast for a couple of minutes, then chokes down to around 35 MB/s average.

Therefore, we can rule out the capability of the link; it is definitely able to sustain 115 MB/s without issue.
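
If I want to prove that beyond doubt, an iperf3 run between the two ends (or between the server and any other box on the same switch) would show raw TCP throughput with no disks involved at all -- assuming I can get iperf3 installed on both sides:

# on one machine:
iperf3 -s

# on the other, pointing at the first (the address is just an example):
iperf3 -c 192.168.1.50 -t 30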

BOTH the drives in the NAS and the ones on the server are CMR.
 
Also, Western Digital says "Turning Off Cloud Access / Remote Access, DLNA Media Server and iTunes Server can help increase data transfer rates," so that might be worth trying if your large files are media files.

I would love to do that, but I am afraid I will interrupt the copying process while I'm shutting down the service and create a whole new problem for myself.

I was actually able to pause the process on the server side, update the network drivers, then continue the process -- upon which I experienced the exact same problem: amazing burst speed at first for about 1-2 minutes or so (guessing here), and then back down to slower speeds.
 
I would love to do that, but I am afraid I will interrupt the copying process while I'm shutting down the service and create a whole new problem for myself.

I was actually able to pause the process on the server side, update the network drivers, then continue the process -- upon which I experienced the exact same problem: amazing burst speed at first for about 1-2 minutes or so (guessing here), and then back down to slower speeds.
No, don't turn off services mid-transfer, because there is no way to know how the box would deal with it. How are you copying -- just drag/drop or xcopy? A better choice in the future is FreeFileSync, freeware that can see where you left off and just copy the rest, and it is generally faster than a plain drag-and-drop copy!
 
Just curious: is it possible the cache on the WD Red drives on the TARGET is filling up over and over? When I back off for a period of time and resume the transfer, suddenly I'm back to 100+ MB/s.

I'm willing to fork over some green if someone can figure this out. This will dog me on the next job even though I will finish this one just fine.
 
Well, the box will have a minimal memory cache, and once that is exhausted it should be writing the files directly to the drives, not creating a disk cache just to write it back out. Now, I don't know how it deals with a full cache; it might read from the net to RAM, write that out to the drives, and then wait to refill the cache. I have seen bizarre things with the WD and Seagate consumer/SOHO gear. It is designed to hit a particular price point, not for performance. Also, if you bought it populated, you might be getting inferior drives… drives that are new and manufacturer-supplied, but which could not run at full speed during burn-in and QVT; that is why they were diverted from the enterprise/retail channel -- they worked, but not at full speed. I saw this a lot during the drive availability crisis after the Thailand floods. Drives that worked, but not at the performance a retail drive of the same spec could deliver.
 
Well, the box will have a minimal memory cache, and once that is exhausted it should be writing the files directly to the drives, not creating a disk cache just to write it back out. Now, I don't know how it deals with a full cache; it might read from the net to RAM, write that out to the drives, and then wait to refill the cache. I have seen bizarre things with the WD and Seagate consumer/SOHO gear. It is designed to hit a particular price point, not for performance. Also, if you bought it populated, you might be getting inferior drives… drives that are new and manufacturer-supplied, but which could not run at full speed during burn-in and QVT; that is why they were diverted from the enterprise/retail channel -- they worked, but not at full speed. I saw this a lot during the drive availability crisis after the Thailand floods. Drives that worked, but not at the performance a retail drive of the same spec could deliver.

Seems like I run into the same issues over and over again: large data buses that promise the world, but when it comes down to it just fail.

Would increasing the NAS's RAM possibly help us in this situation?
 
BOTH the drives in the NAS and the ones on the server are CMR.
WD Red drives on the TARGET
Are you doubly sure the WD Red drives aren't SMR? There was that thing where some models of Red were SMR, but WD didn't tell anybody (least of all the host operating system). Got a model number to double-check?
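
If you can get a shell on the NAS, smartmontools will print the exact model string; on the Windows side, wmic does the same (device names here are examples):

# on the NAS or any Linux box with smartmontools installed
smartctl -i /dev/sda

# on the Windows server, from a cmd or PowerShell prompt
wmic diskdrive get model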

Would increasing the NAS's RAM possibly help us in this situation?
More RAM in the NAS will probably not help. If your maximum throughput to the drives is 35 MB/s, limited by the drives or the SATA interface or something, more RAM won't make that any faster; it could just let you do a longer initial burst of activity before the NAS OS write buffer fills up.
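
You can actually watch that buffer fill and drain if the NAS gives you a shell; the Dirty and Writeback counters in /proc/meminfo are the data waiting to be flushed to disk (Linux only, and assuming watch is available on the box):

watch -n 1 'grep -E "^(Dirty|Writeback):" /proc/meminfo'

If Dirty balloons during the fast burst and the transfer stalls while it drains, that's your throttle.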
 
Seems like I run into the same issues over and over again: large data buses that promise the world, but when it comes down to it just fail.

Would increasing the NAS's RAM possibly help us in this situation?
No, it would just give you a few seconds more before it throttled. And if it is doing a yo-yo receive/write/flush/receive cycle, it could actually make it worse, depending on how they actually buffer the data.
 
No, it would just give you a few seconds more before it throttled. And if it is doing a yo-yo receive/write/flush/receive cycle, it could actually make it worse, depending on how they actually buffer the data.

What troubleshooting steps would you take in this situation?
 
Well, if I didn't have a choice on the host, I would probably, after the copying is done, put two new known-good drives into the box, set them up as RAID 0, and see if you can replicate the problem.
 
Well, if I didn't have a choice on the host, I would probably, after the copying is done, put two new known-good drives into the box, set them up as RAID 0, and see if you can replicate the problem.
Even using "Storage Pools" on Windows Server 2012? That is what I'm currently using. It also seems as if there is some sort of difference between Storage Pools and the straight-up "software RAID" available within Disk Management. I could be wrong, but it just seems that way. I'm wondering if there is more overhead in the software RAID system. However, my mind keeps going back to the same problem: I don't think I would even get prolonged periods of 100+ MB/s if that were the case. I'm sure there is some sort of buffer here that is limiting me -- WHICH BUFFER IS IT? LOL. I can't tell, and it's driving me nuts. I'm also thinking of trying the USB 3 interface on the NAS as an A/B comparison, to see if I get the same throttling. My guess is I won't, but that's just a guess at this point.
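
Maybe the quickest sanity check on my end is to dump what the pool is actually configured as. PowerShell on 2012 should show whether the virtual disk is simple, mirror, or parity (these are the standard Storage module cmdlets; parity layouts are the ones with notoriously slow writes):

# from an elevated PowerShell prompt on the server
Get-StoragePool
Get-VirtualDisk | Format-List FriendlyName, ResiliencySettingName, ProvisioningType, WriteCacheSize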
 
As I said above, the WD MyCloud unit is basically 2-4 (depending on model) drives and about $10 worth of the shittiest SOC they could get away with. It just might not be able to keep up. Do you have another box on the network you can mount to see if the problems disappear?
This probably won't be helpful, but I have a WD NAS drive, and I've had it for about 400 years. It's a 2 TB single-drive unit, and that thing never throttles. I can transfer at gigabit speeds all the time, and while the drive is being accessed elsewhere, via DLNA or whatever, I still get reasonable speeds.

The only time it mimics this guy's situation is if there are a shit ton of small files.
 
This probably won't be helpful, but I have a WD NAS drive, and I've had it for about 400 years. It's a 2 TB single-drive unit, and that thing never throttles. I can transfer at gigabit speeds all the time, and while the drive is being accessed elsewhere, via DLNA or whatever, I still get reasonable speeds.

The only time it mimics this guy's situation is if there are a shit ton of small files.

I concur. I really don't think it's the WD either. Unfortunately, this means comparisons, which take time, and time costs money. I think I'm going to just let it ride for now until the job finishes so I can invoice. Afterwards, when the dust settles, I can compare all day long and figure out why. I just wish I could get this done faster. Luckily, the process is running on a server out of the way and not prohibiting me from running the rest of my business.
 
I concur. I really don't think it's the WD either. Unfortunately, this means comparisons, which take time, and time costs money. I think I'm going to just let it ride for now until the job finishes so I can invoice. Afterwards, when the dust settles, I can compare all day long and figure out why. I just wish I could get this done faster. Luckily, the process is running on a server out of the way and not prohibiting me from running the rest of my business.
Could you not put the entire transfer into a rar file and then transfer that? One single massive file.
 
Even using "Storage Pools" on Windows Server 2012? That is what I'm currently using. It also seems as if there is some sort of difference between Storage Pools and the straight-up "software RAID" available within Disk Management. I could be wrong, but it just seems that way. I'm wondering if there is more overhead in the software RAID system. However, my mind keeps going back to the same problem: I don't think I would even get prolonged periods of 100+ MB/s if that were the case. I'm sure there is some sort of buffer here that is limiting me -- WHICH BUFFER IS IT? LOL. I can't tell, and it's driving me nuts. I'm also thinking of trying the USB 3 interface on the NAS as an A/B comparison, to see if I get the same throttling. My guess is I won't, but that's just a guess at this point.
Yes, there is a huge difference. Storage Pools is accomplished with ReFS, and it is horrible for performance if you set up redundancy. Dog slow at the best of times, slow as a floppy at the worst of times. Writes are horrendous; reads not as much. It is most likely the inability of the WD box to keep up! Unfortunately, you have so many variables that without trying every combination of changes you might never find the right one. I would still spin up Wireshark just to remove the network connection as the issue. It might not be the network as a whole; it could be something as simple as a bad patch cable…
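
For what it's worth, it is easy to check what the pool volume is actually formatted as (assuming the volume is D:, swap in the real letter):

# from a prompt on the server; the "File System Name" line will say NTFS or ReFS
fsutil fsinfo volumeinfo D:\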
 
Yes, there is a huge difference. Storage Pools is accomplished with ReFS, and it is horrible for performance if you set up redundancy. Dog slow at the best of times, slow as a floppy at the worst of times. Writes are horrendous; reads not as much. It is most likely the inability of the WD box to keep up! Unfortunately, you have so many variables that without trying every combination of changes you might never find the right one. I would still spin up Wireshark just to remove the network connection as the issue. It might not be the network as a whole; it could be something as simple as a bad patch cable…

To be clear, the storage pool is on the Windows Server; the NAS is using a simple RAID 5 array with EXT4 as the file system -- I believe that is the standard configuration on those things.
 
I know. I was just commenting on other things in the chain that can decrease performance. Lol, if you could find a way to install Server 2022 on the WD I would be impressed! I know you got thrown into this by a client, but if you ever have one looking in the future for a more reliable set-it-and-forget-it NAS, look to Synology.
 
Sounds like a cache running out after a bit of transfer?
On the Windows side you can try turning off Remote Differential Compression (that is the feature whose exact name I was forgetting).
If it's a Windows Explorer copy you're doing, try Robocopy, or a Robocopy GUI.
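
Something along these lines as a starting point (paths are examples): /J uses unbuffered I/O, which helps with very large files, /MT copies several files in parallel, and rerunning the same command just skips whatever already made it over:

robocopy D:\source \\NAS\share /E /J /MT:8 /R:2 /W:5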
 
Well, the plot thickens. It's NOT the NAS, it's NOT the SOC in it. I think the culprit here is . . .

* drum roll please *

The software RAID configuration. Either the drives or the software layer trying to handle RAID is failing miserably.

Take a look at this image of a transfer I am performing right now that has been going for at least 1.5 hours. 110 MB/s solid. This transfer is now going from a single hard drive connected directly to the internal HBA (as the target drives were before) to the NAS. Absolutely no problem whatsoever.

File Transfer from Server to NAS at Gigabit Speeds.jpg
 
Well, the plot thickens. It's NOT the NAS, it's NOT the SOC in it. I think the culprit here is . . .

* drum roll please *

The software RAID configuration. Either the drives or the software layer trying to handle RAID is failing miserably.

Take a look at this image of a transfer I am performing right now that has been going for at least 1.5 hours. 110 MB/s solid. This transfer is now going from a single hard drive connected directly to the internal HBA (as the target drives were before) to the NAS. Absolutely no problem whatsoever.

Yes, most likely because the SOC is shit, and in anything but RAID 0 it can't handle it. Right now you are just pumping one drive to the net port, and the NAS isn't doing anything taxing like parity verification.
 