Suggestions on DETAILED log analyzer

I have tried the programs on SourceForge, including AWStats and Webalizer. I also tried Advanced Log Analyzer.


NONE of these will break down how many visits I am getting from each search engine each DAY. Does anyone know of anything I can use other than going through my raw log file manually?
 
Wouldn't

grep "hit signature" filename.log | wc -l

do this? If you post a couple of lines from the log file (one for each search engine), I could do it for you with a few one-liners.
 
Here are examples straight from my raw logs:

4.40.2.85 - - [14/Feb/2004:02:31:37 -0500] "GET /book.php?id=12766 HTTP/1.1" 200 5615 "http://search.msn.com/pass/results.aspx?ps=ba%3d15(0.)0.......%26co%3d15(0.1)3.200.2.5.10.1.3.%26pn%3d1%26rd%3d0%26&q=bagel+recipe&ck_sc=1&ck_af=0" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.0.3705)"


67.83.164.132 - - [15/Feb/2004:17:09:49 -0500] "GET /book.php?id=12857 HTTP/1.1" 200 6005 "http://search.yahoo.com/search?p=%22Momo%22+author&ei=UTF-8&fr=fp-tab-web-t&cop=mss&tab=" "Mozilla/4.0 (compatible; MSIE 6.0; Windows 98)"

141.233.32.94 - - [15/Feb/2004:19:51:14 -0500] "GET /search.php?item%3DGustaf%2BFr%F6ding HTTP/1.1" 200 4420 "http://www.google.com/search?sourceid=navclient&ie=UTF-8&oe=UTF-8&q=site%3Awww%2Eiblist%2Ecom" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)"




Those 3 are the only ones I really care about. And my coding skills are nonexistent, so I have no clue what the command you showed does.
 
If you're on a Unix-based system, you're already set. If not, download and install Cygwin to get grep (a search tool) and wc (word count).

These'll count the hits. Change the 14/15 for the day, Feb for the month, and 2004 for the year.
Code:
grep "14\/Feb\/2004.*GET .* HTTP.* 200 [0-9][0-9]*.*search\.msn\.com" filename.log | wc -l
grep "15\/Feb\/2004.*GET .* HTTP.* 200 [0-9][0-9]*.*www\.google\.com" filename.log | wc -l
grep "15\/Feb\/2004.*GET .* HTTP.* 200 [0-9][0-9]*.*search\.yahoo\.com" filename.log | wc -l
In English, each command looks in the log file filename.log (in the current directory) for lines containing the given date, such as "14/Feb/2004", followed by a successful HTTP GET with the search engine as the referring URL. wc -l just counts the resulting lines; remove it if you want to see every line that matches. Will someone else be nice enough to double-check what I put down? It's been a while since I used regular expressions. I tested them on my machine (using Cygwin) as much as I could.
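For anyone who wants all three counts for a day in one go, here is a minimal sketch that wraps the same idea in a small shell function. The name count_engine_hits is made up for illustration, and the pattern assumes the log lines look like the three samples posted above (date in square brackets, then the quoted request, the status, the size, and the quoted referrer).

```shell
#!/bin/sh
# Hypothetical helper: count successful (status 200) GET hits per search
# engine for one day.
# Usage: count_engine_hits LOGFILE DAY
#   DAY is written the way it appears in the log, e.g. 15/Feb/2004
count_engine_hits() {
    logfile=$1
    day=$2
    for engine in search.msn.com www.google.com search.yahoo.com; do
        # Escape the dots so they match literally in the regular expression.
        pattern=$(printf '%s' "$engine" | sed 's/\./\\./g')
        # grep -c prints the number of matching lines (like piping to wc -l).
        # "|| true" keeps a zero-match day from being treated as a failure.
        count=$(grep -c "\[$day:.*\"GET .* HTTP[^\"]*\" 200 [0-9][0-9]* \"http://$pattern/" "$logfile" || true)
        printf '%s %s\n' "$engine" "$count"
    done
}
```

Calling it as count_engine_hits filename.log 15/Feb/2004 prints one line per engine with its hit count for that day. Anchoring the engine name right after the opening quote of the referrer is a slightly stricter match than the one-liners above, so the numbers could differ a little if the engine name also shows up elsewhere in a line.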
 
Originally posted by eloj
[0-9][0-9]* <=> [0-9]+

Yeah, I've seen (crappy) versions of grep that didn't support extended regular expressions, though. Or was it sed?

pcgamez: to use eloj's solution (which is cleaner), you might have to add a -E immediately after grep. So:

grep -E "stuff..."
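A quick way to convince yourself that the two spellings are equivalent is to run both against a few throwaway lines. This is only a sketch; sample_lines is a made-up helper that prints three test lines, one of which has a status of 200 followed by digits.

```shell
#!/bin/sh
# Hypothetical test data: only the first line has "200 " followed by digits.
sample_lines() { printf '%s\n' '200 5615' '200 ' '404 12'; }

# Basic regular expression: one digit, then zero or more digits.
bre_count=$(sample_lines | grep -c '200 [0-9][0-9]*')

# Extended regular expression (-E): one or more digits.
ere_count=$(sample_lines | grep -E -c '200 [0-9]+')

# Both counts are 1 for the sample lines above.
echo "basic: $bre_count  extended: $ere_count"
```

Without -E, a bare + is normally treated as a literal plus sign rather than "one or more", which is why the -E matters for the shorter form.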
 
I tried it both with and without the -E through SSH.

I got the following:
grep: 1: No such file or directory
grep: wc: No such file or directory
 
The | before wc is a pipe, not a 1. Type it with Shift-\
 
ah, that's right

I am just getting 0 now though.


If I take the day (14) or the month out of the pattern, I get 31252 for MSN, if that helps. I haven't tried the others.

edit: 0 on the other two as well.
 
hrm...

try changing the date portion to this:

14/Feb/2004

and remove the
| wc -l

for now (add it back in after it's working, just so that we're sure it's matching properly)
 
Removing the | wc -l part gives no errors and no output, as far as I can tell.


hmmm


btw, I am on under 35997903 or DeathByAnts on AIM
 
There should only be no output when grep matches nothing. Try changing the date like you did before. Unfortunately I have no Unix box to test this on. I suspect it's either the quotes around the regular expression (try changing them to single quotes; I seem to remember that Cygwin under NT's shell treats them differently than Unix shells do), or I was escaping things in the date that I shouldn't have been. I know it works as is under Cygwin. Anyone else with a Unix shell, can you help pcgamez debug this?
 
omg, I am a frigging moron. It works perfectly. I was accessing the wrong logfile that stopped at Feb 6th. DOH!!
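For anyone hitting the same "0 matches" problem, a quick sanity check is to see which days a log file actually covers before blaming the pattern. A minimal sketch, assuming the timestamp is the fourth whitespace-separated field as in the sample lines above; list_log_days is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: print each day found in the log, in order of first
# appearance, followed by the number of requests logged for that day.
# Usage: list_log_days LOGFILE
list_log_days() {
    awk '{
        split($4, parts, ":")          # $4 looks like [14/Feb/2004:02:31:37
        day = substr(parts[1], 2)      # drop the leading [
        if (!(day in count)) order[++n] = day
        count[day]++
    }
    END {
        for (i = 1; i <= n; i++) print order[i], count[order[i]]
    }' "$1"
}
```

If the day you are searching for is missing from the output of list_log_days filename.log, the grep commands will correctly report 0 no matter how the pattern is written.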
 