Large File over network from SAN to SAN

stunning

Limp Gawd
Joined
Sep 14, 2006
Messages
317
Hello Everyone.

Here is the scenario: users are trying to copy large files (roughly 10-30 GB each) over the WAN from Server A (county 1) to Servers B, C, and D (county 2), which are located in two different counties. All of these servers are virtual servers and are in a clustered environment.

Our users are on Windows XP. Their process is to open UNC paths to both file shares, keep the two windows open, and copy the files by dragging and dropping. Now, when they copy files from county 1 to county 2, the NIC on Server B shows roughly 750 Mbps of I/O.

When they start copying from Server A to Server C at the same time, that NIC reaches a maximum of 180 Mbps of I/O.


County 1: all ESX hosts are connected to a Cisco 6509
County 2: all ESX hosts are connected to a Nexus 5020


I then started to perform tests. I downloaded a program called LAN Speed Test and was getting anywhere from 16-40 Mbps for read/write. The wide range was due to user congestion during peak times of the day. However, there were times I was getting a Windows delayed-write failure.

After further research I disabled write caching (that did not work) and hard-coded the NIC on my PC and the access-layer switch port to 1 Gbps. This still did not fix the issue.



However, just a few minutes ago I decided to RDP into the server located at county 2, open the shared path to the files in county 1, and drag and drop the files over. I did this on multiple servers and was able to get 750+ Mbps of network I/O from all servers while copying the files at the same time. (To clarify: I was able to copy at 750 Mbps for each server, not 750 Mbps total.)


If someone can help or elaborate, I would appreciate it.
 

/usr/home

Supreme [H]ardness
Joined
Mar 18, 2008
Messages
6,160
Is this a 1 Gbps link? If so, copying files at 750 + 180 = 930 Mbps is good utilization on a 1 Gbps link, and that's normal.
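The arithmetic above can be checked with a quick sketch. The 1 Gbps capacity is an assumption carried over from the question, not a confirmed fact about this WAN:

```python
# Rough sanity check: do the two concurrent copies saturate a 1 Gbps link?
# LINK_CAPACITY_MBPS is an assumption from the thread, not a measured value.
LINK_CAPACITY_MBPS = 1000

transfers_mbps = {"A -> B": 750, "A -> C": 180}

total = sum(transfers_mbps.values())
utilization = total / LINK_CAPACITY_MBPS

print(f"combined: {total} Mbps, utilization: {utilization:.0%}")
# If utilization is near 100%, the link itself is the bottleneck and the
# second copy is just getting the leftover bandwidth.
```

On those numbers the two streams together already use about 93% of a gigabit link, which would make the 180 Mbps second stream expected rather than a fault.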
 

tangoseal

[H]F Junkie
Joined
Dec 18, 2010
Messages
9,330
Shit, 1 Gbps won't take long at all...

Or you can put the files on a brick and drive them over, since they are only one county over.
 

toast0

[H]ard|Gawd
Joined
Jan 26, 2010
Messages
1,471
A client-driven copy probably works like this: copy data from Server A to client memory, then copy data from client memory to Server B. Server to server, the data will only be on the network once.
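The two paths can be shown with a toy byte-count model. This is purely illustrative; the function names are made up for the sketch:

```python
# Toy model of the two copy paths. In a client-driven drag-and-drop, every
# byte is read from Server A to the client and then written from the client
# to Server B, so each byte is on the network twice. A copy initiated on one
# of the servers puts each byte on the network only once.

def network_bytes_client_driven(file_size: int) -> int:
    return 2 * file_size  # A -> client, then client -> B

def network_bytes_server_initiated(file_size: int) -> int:
    return file_size      # B reads straight from A

size_10gb = 10 * 1024**3  # a 10 GB file, per the original post
print(network_bytes_client_driven(size_10gb) // network_bytes_server_initiated(size_10gb))
# The client-driven path moves twice the bytes for the same file.
```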
 

bloodypulp

Gawd
Joined
May 17, 2010
Messages
903
toast0 is correct.

Also, SMB is ill-suited for file transfers over WAN links.
XP uses SMB 1. With Server 2008+, SMB 2.x is a little better. That's why your server-to-server copy is faster.

Split your large files into smaller chunks (to avoid timeout failures) or use a different transport protocol.
Also, a Riverbed Steelhead may accelerate the file transfers.
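The chunking suggestion could look something like this sketch. The chunk size and the `.partNNNN` naming are arbitrary choices for illustration, not anything the poster's tools require:

```python
def split_file(path: str, chunk_size: int = 512 * 1024**2) -> list[str]:
    """Split `path` into numbered .part files of at most `chunk_size` bytes,
    so a dropped WAN session only forces a re-send of one chunk."""
    parts = []
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            part_path = f"{path}.part{index:04d}"
            with open(part_path, "wb") as dst:
                dst.write(chunk)
            parts.append(part_path)
            index += 1
    return parts

def join_files(parts: list[str], out_path: str) -> None:
    """Reassemble the chunks, in order, on the receiving side."""
    with open(out_path, "wb") as dst:
        for part in parts:
            with open(part, "rb") as src:
                dst.write(src.read())
```

Transfer the `.part` files individually and reassemble with `join_files` on the far side; a failed session then only costs one chunk instead of the whole 10-30 GB file.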
 

Mackintire

2[H]4U
Joined
Jun 28, 2004
Messages
2,932
I'm assuming you're connecting to the SAN via an iSCSI target. If that's true, you should be connecting to the SAN's iSCSI target at the host level and presenting the volume to the guest as a hard drive.

If you are connecting via the guest OS, you are emulating a synthetic adapter and moving data through it twice. As you add connections, the entire system will choke and slow down dramatically.

For a file server, do not virtualize the FC or iSCSI connection. If you have the extra NIC ports, direct-connect the file server to its own dedicated NIC.

That's two dedicated NICs if you are using iSCSI.

There's a lot of data out there about how to properly connect a SAN to a file server in a virtual cluster or a VMware environment. If you do it right you should get excellent performance.


We are in the process of fixing our file servers where I work. Before I started, the file server averaged 30 Mbps to the client, with a max of 280 Mbps measured at the iSCSI virtual port.

The first server I converted now averages 92 Mbps over our 100 Mbps MetroE and shows 240-260 MB/s during internal disk backups over the three iSCSI Ethernet ports.

That's the difference between virtual hardware with one 1 Gbps iSCSI connection routed through a shared network (on its own VLAN) running standard-size frames, versus physical hardware with three 1 Gbps connections connected via MPIO on a dedicated iSCSI switch with jumbo frames enabled.
 

Mackintire

2[H]4U
Joined
Jun 28, 2004
Messages
2,932
I just reread your description, and I bet the confirmation of the routed data having to go all the way back to the client machine is what's slowing the entire process down. Initiating the transfer from one of the servers is always going to be faster, since the confirmation packets don't have far to travel. It's also hell on the I/O front end of the SANs. See my previous post for an explanation.


The point is, moving large files in this fashion is going to be a little slow at best, and clunky and annoying at worst.

As mentioned before, SMB 1.0 versus SMB 2.0 is also part of the speed issue.

Good Luck.
 
Joined
Sep 22, 2008
Messages
878
It's definitely the slowest way to copy files because, as everyone has already said, your data goes through the client to reach the other server. Logging into one of those servers is the best method for an SMB copy.
 

Electrofreak

[H]ard|Gawd
Joined
Aug 5, 2008
Messages
1,080
SMB copy sucks to begin with, and I can see some people have already beat me to the SMBv1 vs SMBv2 argument.

SMB is a very chatty protocol and was designed for use on LANs, not over WAN links. Even SMBv2 (Vista, 7, Server2008) isn't great, but a definite improvement on SMBv1 (XP / Server2003).

Use another method of file transfer, like FTP or HTTP.
 

stunning

Limp Gawd
Joined
Sep 14, 2006
Messages
317
SMB copy sucks to begin with, and I can see some people have already beat me to the SMBv1 vs SMBv2 argument.

SMB is a very chatty protocol and was designed for use on LANs, not over WAN links. Even SMBv2 (Vista, 7, Server2008) isn't great, but a definite improvement on SMBv1 (XP / Server2003).

Use another method of file transfer, like FTP or HTTP.

SMB is designed for LAN-based copies?

So let me get this straight:
- Host A initiates a copy of files from storage at county A to Server B (county B storage).
When Host A at county A initiates a copy to Server B at county B, it writes/caches to the desktop's memory, then to the server's memory, then to disk? So the slowness is caused by low memory and the shortcomings of the SMBv1 protocol?

But how come when I RDP into Server B at county B and initiate the copy from Server A / storage A, I get full throughput?
 

Mackintire

2[H]4U
Joined
Jun 28, 2004
Messages
2,932
SMB is designed for LAN-based copies?

So let me get this straight:
- Host A initiates a copy of files from storage at county A to Server B (county B storage).
When Host A at county A initiates a copy to Server B at county B, it writes/caches to the desktop's memory, then to the server's memory, then to disk? So the slowness is caused by low memory and the shortcomings of the SMBv1 protocol?

But how come when I RDP into Server B at county B and initiate the copy from Server A / storage A, I get full throughput?



I believe you are getting confused.

Via RDP or otherwise.....

If you are logged in and initiating the copy from Server A to Server B = No problem.

If you are logged in and initiating the copy from Server B to Server A = No problem

If you are logged in and initiating the copy from Server C to Server A = No problem

If you are logged in and initiating the copy from Server B to Server C = No problem.

If you log into the desktop of ANY machine other than one of the servers you are copying to or from, and you try to initiate a copy/move/delete/etc., the file transfer will relay through that third wheel (i.e., it will funnel the data through your local machine, the one you are typing on), causing a massive slowdown.
 

stunning

Limp Gawd
Joined
Sep 14, 2006
Messages
317
Basically, here is the reply I received from the end user:

The files will originally be copied to the SAN via FTP. Once they are there we catalog them and make them available to our software, NeoBatch.

They will never be copied directly to Server B (County B). That is software-driven. So in the case below, we execute a job in NeoBatch that knows where the file is located from when it was cataloged. In this case, it's the SAN. When we execute a job, NeoBatch (on Server B) then pulls the file over in order to work on it.

Let me know if that's not clear.


Could this mean the application itself has a TCP window size limitation? But how does that explain that the first session (A to B1) copies at 700+ Mbps, while the second and third sessions to Servers B2 and B3 only reach 200 Mbps each and the first one stays around 700?
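On the TCP window question: a single session's throughput is bounded by the window size divided by the round-trip time. A sketch with hypothetical numbers follows; the 64 KB window is a typical XP default without window scaling, and the 10 ms RTT is a guess, not a measurement of this WAN:

```python
# Window-limited TCP throughput: a sender can have at most one window of
# unacknowledged data in flight, so throughput <= window_size / RTT.
# This caps each session independently, which is one way separate sessions
# can sit at different speeds on the same link.

def window_limited_mbps(window_bytes: int, rtt_seconds: float) -> float:
    return window_bytes * 8 / rtt_seconds / 1e6

# Hypothetical inputs: 64 KB window at a 10 ms round trip.
print(round(window_limited_mbps(64 * 1024, 0.010), 1))  # 52.4 (Mbps)
```

If your sessions ran well above a bound like this, the window was not the limit at that RTT; if they sat right at it, increasing the window (or the RTT dropping) would raise the per-session ceiling.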
 

Mackintire

2[H]4U
Joined
Jun 28, 2004
Messages
2,932
Forgive me, but your descriptions fall apart from lack of clarity.

So let me dissect your statements.

"The files will originally be copied to the SAN via FTP."

Understood.

"Once they are there we catalog them and make them available to our software, NeoBatch."

Understood.

"They will never be copied directly to Server B (County B)."

This is the first vague statement. I am making the assumption that:
They = the end users?

And that they will never use Windows Explorer (SMB) to copy files from (somewhere you don't specify) to Server B (County B).


"That is software driven. So in the case below, we execute a job in NeoBatch that knows where it is located from when the file was cataloged. In this case, it’s the SAN."

Understood: NeoBatch is located on Server B and will initiate all transfers from its location on Server B, which is connected to a SAN at the county B location.

I believe you have multiple SANs: one in county A and one in county B.


"When we execute a job, NeoBatch (Server B's) then pulls the file over in order to work on it."


It was asked previously whether there is a 1 Gbps connection between the two counties' networks.

Your description sounds like there is, but you have not confirmed this as fact.

Can you confirm the connection speed of your WAN links at both locations?

Are you using a MetroE connection between the two locations? What is there? This is a huge omission, and we can only guess as long as you have not told us.


Assuming you have a gigabit connection (and even that is an optimistic number), you will have 1000 Mbps of throughput max. 700 + 200 + 200 is awfully close to the max you will see on a gigabit MetroE connection.

If you provide the missing details we can help you further.
 

stunning

Limp Gawd
Joined
Sep 14, 2006
Messages
317
County A / Server A side:
- All servers are connected to our distribution switch, which then connects to our core with 2x 10 Gb links. The core then connects to the firewalls with a 1 Gb link, which then connects to the OPT-E-MAN router at 100 Mb.

All three virtual servers at site B (B1, B2, B3) are on separate physical hosts, each of which has a dual-homed 1 Gb copper port channel.


Regarding the statement "they will never be copied...": the end user was referring to the files created by the application.
 