stop hotlinking using .htaccess

sailor

I have a personal (vanity, not business) web site and I got tired of other sites hotlinking to my files so now I am using .htaccess RewriteEngine to prevent it.

Requests for graphics files with a referrer field from my own site get served and those from other sites get blocked. So far so good.

The question is what to do with requests with no referrer information. Googlebot and other crawlers have no referrer info and it is easy enough to enable them individually as wanted. No problem there either.

But what about other requests with no referrer? What is the standard way of dealing with them? I suppose I could allow them all, but that would favor those who block the referrer field: their hotlinks would work while the same hotlinks are blocked for visitors who send it. On the other hand, I see in the logs a few visitors who downloaded the HTML file but could not get the graphics because their browser was blocking the referrer info. Obviously I would want to avoid this. And if everybody blocked the referrer info, this method of blocking hotlinking would not work at all.

More puzzling is that I see an IP request the HTML and graphics providing the referrer but then the same IP immediately requests the graphics again with no referrer and a code 302 is returned. I have no idea why this happens. What might be a reason for this?

What do webmasters generally do? Just grant all requests with no referrer?

It would seem to me that the best way to deal with the issue would be to grant all requests for graphic files coming from an IP that had requested an HTML file from the same directory in the last (say) 30 seconds. This would obviate the need for using the referrer info. On the other hand, I have no idea if it is possible to do this with htaccess. I believe a GET request can only be evaluated on its own and not against any server logs or history. Right?
 
Could you please elaborate? I am not an expert in this matter. This is my first contact with .htaccess and I am struggling and learning htaccess as I go along. I googled "dot folder" but could not find anything relevant.

BTW, I am using a shared server on a commercial hosting service. RewriteEngine is enabled but I have no guarantee that other tools are enabled. I would have to check.
 
I think he means make a folder named .images

The . makes the folder a hidden folder. Then have the htaccess rewrite to that location?
 
/facepalm

Your thread now has a yellow icon with a black dot so I can follow when others respond constructively.

As it's relevant to me too.
 
Most of them just don't care. You're making this into a way bigger problem than it should be. Satisfy yourself that you've dealt with 90% of the problem and do something productive.
 
Well, I am interested in learning a bit about these things so in that sense it is productive for me. Obviously I am not worried about a few hotlinks but I am also interested in what is common practice, standards, etiquette, etc.

I can understand a site visitor saying "I just do not want to provide the referrer due to privacy reasons" and I can understand a web master saying "my site, my rules, if you want my content you need to provide the referrer". So I am interested in learning more about what is the general consensus if there is one.

It seems the referrer field is not part of HTML at all; it is the optional Referer request header in HTTP, so a browser is free to omit it. Maybe cookies are the better way to go, although I personally would try not to use cookies as I don't like them myself. Maybe sites that are serious use session logs at their own end but that is not worth it in my case.

I also notice quite a few cases of something like this: A user downloads the HTML and the associated graphics. A while later the same IP requests all the graphics but not the HTML and the referrer field is empty. In one example I am looking at, the first request gives user agent "Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20100101 Firefox/5.0" and referrer my own HTML page, while the second request quite some minutes later requests all the graphics but not the original HTML and gives user agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1) ; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; AskTB5.6)" and the referrer is empty. I cannot imagine what would cause this behavior. Maybe the user opened the page the first time in Firefox and somehow in Firefox (which I have never used) there is an "open in IE" button and the HTML gets sent to IE but then IE tries to download the graphics from the internet. The whole thing makes no sense to me.
 
Unless you can find a really fancy Apache mod to handle this stuff, .htaccess is not intended to do any really complex logic. You can limit things to username/password in a few different ways, you can do rewriting based on paths & headers and whatnot, but doing any non-stateless logic is pretty much out of the question. If you want to persist any concept of state between requests, you're going to have to go to an image serving script (e.g. PHP + MySQL or something).
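To make the "state between requests" point concrete, here is a rough sketch of the 30-second idea from earlier in the thread, written in Python rather than PHP purely for illustration. The class name RecentPageVisitors, its methods, and the in-memory dict are all made up for this example; a real image serving script on shared hosting would need storage that survives between requests (a database or similar).

```python
import posixpath
import time
from typing import Dict, Optional, Tuple

WINDOW_SECONDS = 30  # the "(say) 30 seconds" figure suggested in the thread


class RecentPageVisitors:
    """Remembers when each IP last fetched an HTML page from each directory."""

    def __init__(self, window: float = WINDOW_SECONDS) -> None:
        self.window = window
        # (ip, directory) -> timestamp of the last HTML request
        self._last_page: Dict[Tuple[str, str], float] = {}

    def record_page(self, ip: str, path: str, now: Optional[float] = None) -> None:
        """Call this whenever an HTML page is served."""
        stamp = time.time() if now is None else now
        self._last_page[(ip, posixpath.dirname(path))] = stamp

    def allow_image(self, ip: str, path: str, now: Optional[float] = None) -> bool:
        """True if this IP fetched a page from the image's directory recently."""
        stamp = time.time() if now is None else now
        seen = self._last_page.get((ip, posixpath.dirname(path)))
        return seen is not None and 0 <= stamp - seen <= self.window


visitors = RecentPageVisitors()
visitors.record_page("203.0.113.5", "/trip/index.html", now=1000.0)
print(visitors.allow_image("203.0.113.5", "/trip/photo.png", now=1010.0))
print(visitors.allow_image("203.0.113.5", "/trip/photo.png", now=1031.0))
```

The first check falls inside the window and the second one falls outside it. Note this keys on IP only, so visitors behind a shared proxy would be lumped together.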

As for trying to satisfy people that turn off referrer headers, they're in the same boat as those that run without JS and cookies - they can't expect the modern internet to work without them turned on.
 
More puzzling is that I see an IP request the HTML and graphics providing the referrer but then the same IP immediately requests the graphics again with no referrer and a code 302 is returned. I have no idea why this happens. What might be a reason for this?

Right click / save as
Open in new tab
Right click / view source
et cetera

people trying to get the image or find out how by viewing the source code possibly.
 
Thanks for all the responses so far. After much tinkering and experimenting so far I have an htaccess configuration which is working reasonably well for me. In summary here is what I do:
Code:
1- Using DENY block several IPs from link spammers, etc. 

RewriteEngine On 

2- Totally block a named list of referrers.  This catches some link 
   spammers which were not blocked by IP.

3- For a named list of particularly obnoxious hotlinkers I redirect any 
   request for a graphic file to a particularly obnoxious graphic file.

4- If the referrer field is empty and the USER AGENT is not in a list 
   (this lets search engine bots through) I redirect any graphic file 
   request to a graphic which says "your browser did not supply the 
   necessary referrer information in the request". 

5- If the referrer field is not included in a white list which includes 
   my own site then any request for a graphic file is redirected to a 
   graphic which says "No hotlinking".
HTML files are always served unless the request comes from the initial list of blocked IPs (1) or blocked referrers (2). Steps 3 to 5 only affect graphic files.
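For anyone who wants to reason about the order of those five steps, here is a small Python model of the decision chain. It is not htaccess syntax and not my actual configuration: every IP, host name and user agent keyword below is a placeholder, and the function name decide is made up. One assumption worth flagging: a whitelisted bot with an empty referrer is served at step 4 and never reaches step 5.

```python
from urllib.parse import urlparse

# Every value below is a placeholder for illustration only.
BLOCKED_IPS = {"192.0.2.10"}
BLOCKED_REFERRER_HOSTS = {"spam.example"}
OBNOXIOUS_HOTLINKER_HOSTS = {"leech.example"}
ALLOWED_AGENT_KEYWORDS = ("Googlebot",)
WHITELISTED_REFERRER_HOSTS = {"mysite.example", "www.mysite.example"}
IMAGE_EXTENSIONS = (".png", ".gif", ".jpg", ".jpeg")


def referrer_host(referrer: str) -> str:
    """Lower-cased host part of a referrer URL, or '' if there is none."""
    return (urlparse(referrer).hostname or "").lower()


def decide(ip: str, referrer: str, user_agent: str, path: str) -> str:
    """Return what would happen to one request under the five steps."""
    host = referrer_host(referrer)
    if ip in BLOCKED_IPS:                        # step 1
        return "deny"
    if host in BLOCKED_REFERRER_HOSTS:           # step 2
        return "deny"
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return "serve"                           # steps 3-5 only touch graphics
    if host in OBNOXIOUS_HOTLINKER_HOSTS:        # step 3
        return "redirect:obnoxious"
    if not referrer:                             # step 4
        if any(word in user_agent for word in ALLOWED_AGENT_KEYWORDS):
            return "serve"                       # assumed: listed bots get through
        return "redirect:no-referrer"
    if host not in WHITELISTED_REFERRER_HOSTS:   # step 5
        return "redirect:no-hotlinking"
    return "serve"


print(decide("203.0.113.5", "", "Firefox", "/img/a.png"))
print(decide("203.0.113.5", "http://other.example/x", "Firefox", "/index.html"))
```

The first call models a browser that hides the referrer asking for a graphic; the second models an HTML page reached from another site, which is always served.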

So far this is working quite well for me. It is blocking what I want blocked and serving what I want served.

The only question is about what I mentioned earlier in that I see an IP will request an HTML page and the included graphics and it will supply the referrer and everything is served fine but a bit later all the graphics are requested again by the same IP and with no referrer. Here is an example:

User Agent:
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; AskTbMPC2/5.11.0.15286)
Code:
          file    Result  bytes  
  Time    type    code     sent   referrer
1:42:52   HTML     200    44223   google.com
1:42:53   PNG      200      414   my site
1:42:53   PNG      200    13102   my site
1:42:53   PNG      200    19131   my site
1:42:53   GIF      200     1205   my site
1:42:53   PNG      200     9835   my site
1:42:53   PNG      200     4852   my site
1:42:53   JPG      200    11929   my site
1:44:45   PNG      302      256      -
1:44:46   PNG      302      256      -
1:44:46   PNG      302      256      -
1:44:47   JPG      302      256      -
1:44:47   JPG      302      256      -
1:44:48   PNG      302      256      -
1:44:48   PNG      302      256      -
1:44:48   PNG      302      256      -
1:44:48   PNG      302      256      -
1:44:49   PNG      302      256      -
1:44:49   GIF      302      256      -
1:44:49   GIF      302      256      -
The initial referral is from Google and the graphics for that page are served correctly, but a couple of minutes later the same IP requests all the graphics again, sometimes more than once, in just a few seconds. It does not seem like it could be right-clicks due to the speed, and I thought that when you right-click on a graphic you are saving it from the browser's cache, not from the original server.

At any rate, I am not too concerned about this. The page was served correctly the first time around and that is enough for me.

The cases of users which start out not supplying referrer info from the outset are few enough that I am not really concerned.

Analysis of the log files yields interesting information.
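In case it helps anyone doing the same kind of digging, here is a rough Python sketch for spotting the "graphics requested again with no referrer" pattern. It assumes the access log is in Apache's combined log format, which may not match what a given host provides. The function name and the sample lines are invented for the example, and it only flags graphics that the same IP was already served with a referrer.

```python
import re
from collections import defaultdict

# Assumed: Apache "combined" log format. Adjust the pattern if yours differs.
LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)
IMAGE_RE = re.compile(r"\.(png|gif|jpe?g)$", re.IGNORECASE)


def refetched_without_referrer(lines):
    """Map each IP to the graphics it re-requested with an empty referrer."""
    served = defaultdict(set)    # ip -> graphics served (200) with a referrer
    repeats = defaultdict(list)  # ip -> same graphics asked for again without one
    for line in lines:
        m = LINE.match(line)
        if not m or not IMAGE_RE.search(m["path"]):
            continue
        ip, path, referrer = m["ip"], m["path"], m["referrer"]
        if referrer not in ("", "-"):
            if m["status"] == "200":
                served[ip].add(path)
        elif path in served[ip]:
            repeats[ip].append(path)
    return dict(repeats)


# Made-up sample lines, not copied from a real log.
sample = [
    '203.0.113.5 - - [10/Jul/2011:01:42:53 +0000] "GET /img/a.png HTTP/1.1" '
    '200 414 "http://mysite.example/page.html" "MSIE 7.0"',
    '203.0.113.5 - - [10/Jul/2011:01:44:45 +0000] "GET /img/a.png HTTP/1.1" '
    '302 256 "-" "MSIE 7.0"',
]
print(refetched_without_referrer(sample))
```

Comparing the user agent on the first and second batch for each flagged IP would be a natural next step, since in at least one of my cases the two differed.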
 
It's been a while since I played with this but basically you want to make an image folder that has a .htaccess file in it which restricts access to these files if the referrer is not your website. This is not 100% foolproof as people can mess around with their browser to change/remove the referrer, but it will work for browsers with default settings. I forget exactly how but this may help:

http://dmr.ath.cx/notes/rewrite.html

Basically you redirect them to an image that says "no hot linking" or nude pics or w/e you want.

Not sure why the browser is requesting the images again though.
 
Well, using htaccess I have been quite successful in preventing hotlinking to the point where even attempts have pretty much disappeared. I guess people who try soon realize it does not work. On the other hand some may just copy the image elsewhere.

So this got me thinking about other sites which may be pirating my content wholesale and some googling revealed a few. I can report that some emails to the owners and to their hosting services were 100% effective in having the pirated content removed within a few short days.

In the future I think I will be watermarking my images and maybe introducing some intentional typos in the text which will make it easier to track down unauthorized copying.
 