Streaming file

tdktank59 (Gawd) · Joined: Jan 23, 2007 · Messages: 590
So I've got a streaming file downloader script.

Basically, we let our clients click a link that streams a generically named file (normally named after the ID of a resource record we have), and we send it with the proper filename so they can download it.

File download normally works without a hitch. However, sometimes we glitch the system somehow and it does this: it seems to try to stream the file on the next page load, basically streaming the first part of the HTML and breaking the next page.


Here's the code we use to stream the downloads:
Code:
public function send_file($path, $file_name="", $vars=array()) {
	// RELEASE THE SESSION LOCK AND DISCARD THE CURRENT OUTPUT BUFFER
	session_write_close();
	@ob_end_clean();
	if (!is_file($path) || connection_status()!=0) {
		return(FALSE);
	}
	
	// PREVENT LONG FILES GETTING CUT OFF BY max_execution_time
	set_time_limit(0);
	
	// GET BASENAME
	if(empty($file_name)) {
		$file_name=basename($path);
	}
	
	// DEFAULT THE "force" OPTION TO ON (NOT READ AGAIN IN THIS FUNCTION)
	if(!isset($vars["force"])) {
		$vars["force"]=1;
	}
	
	// DECODE HTML ENTITIES
	$file_name = html_entity_decode($file_name, ENT_QUOTES);
	
	// STRIP CHARS
	require_once($GLOBALS["SCRIPT_CLASS_DIR"]."functions.class.inc");
	$functions = new functions;
	$file_name = $functions->strip_chars_filename($file_name);
	unset($functions);
	
	
	// IE MANGLES FILENAMES CONTAINING SEVERAL DOTS, SO ENCODE ALL BUT THE LAST DOT AS %2e
	if (isset($_SERVER["HTTP_USER_AGENT"]) && strstr($_SERVER["HTTP_USER_AGENT"], "MSIE")) {
		$file_name = preg_replace("/\./", "%2e", $file_name, substr_count($file_name, ".") - 1);
	}
	
	
	////////////////////
	// REQUIRED HEADERS
	////////////////////
	
	// IF SSL
	if(!empty($_SERVER["HTTPS"])) {
		// HEADERS IE REQUIRES FOR HTTPS
		header("Cache-Control: private");
		header("Pragma: private");
	} else {
		//header("Cache-Control: no-cache, must-revalidate");
		header("Pragma: public");
		
	}
	header("Expires: 0");
	
	
	header("Content-Length: ".@filesize($path));
	header("Content-Type: application/octet-stream");
	header('Content-Disposition: inline; filename="'.$file_name.'"');
	header("Content-Transfer-Encoding: binary");
	
	// SEND THE FILE BODY
	readfile($path);
	
	// TRUE ONLY IF THE CLIENT WAS STILL CONNECTED WHEN THE TRANSFER FINISHED
	return((connection_status()==0) and !connection_aborted());
}

So my question is: does anyone have a clue what's causing this to happen?
 

lol, what do you need explained?
First of all, there is no question posed in your first post. You seem to describe something that normally works, but doesn't anymore. There's no detail about when it started working inconsistently. Hence the confusion, and Whatsisname asking what everyone else is thinking.

So I'll assume that your first question is "Why is my downloader script acting inconsistently?" My first thought follows from that inconsistency: is the script hosted on a server farm, is there a proxy web server in between, are there any firewall error logs from around the same moment in time, etc.? Another thought to run down would be comparing this new code to past versions and identifying the differences. Not just in the script, but in code diffs site-wide.
 
First of all, there is no question posed in your first post. You seem to describe something that normally works, but doesn't anymore. There's no detail about when it started working inconsistently. Hence the confusion, and Whatsisname asking what everyone else is thinking.
Hahaha... yeah kinda forgot those parts... My bad

So I'll assume that your first question is "Why is my downloader script acting inconsistently?"
Yes, that would be correct. Well, more along the lines of what is causing this to happen.

And my first thought is along the inconsistency: is the script hosted on a server farm, is there a proxy web server between, are there any firewall error logs around the same moment in time, etc.?
No on the farm, no proxy, and firewall is clean.

Another thought to run down would be comparing this new code to past versions, and identifying the differences. Not just with the script, but code diffs site-wide.
Nothing has changed. I've included a copy of the actual code; all variables are being passed in properly.

 
Now we're getting some good feedback ;)

No on the farm, no proxy, and firewall is clean.
Then this being a single webserver environment rules out some ideas I had.

Another thought is that the inconsistent nature of your problem could be caused by load/stress...
- Have you been able to reproduce the issue in a QA or test environment?
- Have you gathered any metrics on server load, I/O, peak bandwidth times, etc.?
- How big are the files that are being streamed over HTTP or HTTPS?

If you aren't gathering analytics on your site (or at least this page), then now's the time to get that in place, too.
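
If nothing is recording per-download numbers yet, something along these lines could be a starting point. It is only a sketch: `download_metrics_line()` is a made-up helper name, not part of the posted class, and the log format is arbitrary. It records the file size, how many bytes were handed to the client, the elapsed time, the 1-minute load average, and the connection status, so that a broken download can be lined up against server load afterwards.

```php
<?php
// Hypothetical helper (not from the original script): builds one log line per download.
function download_metrics_line($path, $bytes_sent, $started_at, $load = null) {
    // sys_getloadavg() does not exist on Windows, so only call it when available.
    if ($load === null && function_exists('sys_getloadavg')) {
        $load = sys_getloadavg();
    }
    // filesize() results are cached per request; clear the cache for this path first.
    clearstatcache(true, $path);
    $size = is_file($path) ? filesize($path) : -1;

    return sprintf(
        "file=%s size=%d sent=%d complete=%s secs=%.3f load1=%s conn=%d",
        basename($path),
        $size,
        $bytes_sent,
        ($size === $bytes_sent) ? 'yes' : 'no',
        microtime(true) - $started_at,
        is_array($load) ? number_format($load[0], 2) : 'n/a',
        connection_status()
    );
}

// Possible use inside send_file(), in place of the bare readfile($path) call:
//   $started = microtime(true);
//   $sent = readfile($path);   // readfile() returns the number of bytes read, or false
//   error_log(download_metrics_line($path, (int) $sent, $started));
```

A line with `complete=no` or a non-zero `conn` value would point at the requests worth comparing against the Nagios graphs.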


Edit: Another thought... Try attaching an HTTP debugger, such as Fiddler, to inspect all of the data coming from the downloader script. If possible, post the content that Fiddler shows.
http://www.fiddler2.com/fiddler2/
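
To go with the client-side capture, it may help to log what PHP itself thinks it is about to send. The snippet below is a rough sketch under that assumption; `response_state_snapshot()` is a hypothetical name, and it only uses standard PHP functions (`headers_sent()`, `headers_list()`, `ob_get_level()`, `ob_get_length()`). Comparing its output with what Fiddler shows could reveal output that was already started, or buffers that are still open, before the file body goes out.

```php
<?php
// Hypothetical helper: summarises the state of the response before the file is sent.
function response_state_snapshot() {
    $file = '';
    $line = 0;
    $sent = headers_sent($file, $line);   // true once any output has left PHP

    return array(
        'headers_sent'   => $sent,
        'sent_from'      => $sent ? $file . ':' . $line : '',
        'buffer_level'   => ob_get_level(),          // number of output buffers still open
        'buffered_bytes' => (int) ob_get_length(),   // bytes waiting in the innermost buffer
        'headers'        => headers_list(),          // headers queued for this response
    );
}

// Possible use just before readfile($path) in send_file():
//   error_log(json_encode(response_state_snapshot()));
```

If `buffer_level` is above zero or `buffered_bytes` is non-zero at that point, the body on the wire may not match the `Content-Length` header the script sends.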
 

Then this being a single webserver environment rules out some ideas I had.

Another thought is that the inconsistent nature of your problem could be caused by load/stress...
Sorry lol, yes, it is a single webserver. Pretty beefy for what we are doing with it at the moment.

- Have you been able to reproduce the issue in a QA or test environment?
Well, it's the same server, but we do have a "dev" mode where we were testing different methods to stop the problems. I will work on throwing it onto a new server, though.

- Have you gathered any metrics on server load, I/O, peak bandwidth times, etc.?
All are well below our maxes. Load average (LA) is around 3-4 throughout the day; we notice slowdowns at about 8 or so. Bandwidth usage is super low. The one thing we do get is wait time on the disk, but it still isn't that bad most of the time.

However, we just tried to break the script again and were not able to, so we will try again tomorrow during peak times, since that seems to be when stuff is breaking.

- How big are the files that are being streamed over HTTP or HTTPS?
Just small image files, <1MB each. However, we also have documents and other files being streamed that are >1MB (we are testing with the images).
And we do the streaming over both HTTP and HTTPS. We have been having the issues with HTTPS and haven't tested HTTP yet.

If you aren't gathering analytics on your site (or at least this page), then now's the time to get that in place, too.
We have some analytics :) as well as Nagios monitoring load and other things to make sure everything is within operating norms.
 
The readfile() function's details seem straightforward, but some of the user comments posted under it in the PHP documentation do suggest limitations and reliability considerations. Definitely read through those comments to see if they apply or could help.
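
One approach that commonly comes up as an alternative to a single readfile() call is to read and send the file in fixed-size chunks. The following is a minimal sketch of that idea, not a confirmed fix for the problem in this thread; `stream_file_chunked()` and its 8192-byte default are assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for the readfile($path) call: sends the file in chunks
// and returns the number of bytes sent, or false if the file cannot be opened.
function stream_file_chunked($path, $chunk_size = 8192) {
    $handle = @fopen($path, 'rb');
    if ($handle === false) {
        return false;
    }

    $sent = 0;
    // Stop early if the client disconnects or the request times out.
    while (!feof($handle) && connection_status() === CONNECTION_NORMAL) {
        $chunk = fread($handle, $chunk_size);
        if ($chunk === false || $chunk === '') {
            break;
        }
        echo $chunk;
        flush();   // ask PHP to push its system write buffers towards the client
        $sent += strlen($chunk);
    }

    fclose($handle);
    return $sent;
}
```

Comparing the return value with filesize($path) after the call gives a simple way to tell whether the whole file went out, which readfile() alone reports less directly.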
 