First, as I understand it, the script counts the number of .html or .php pages requested within the time frame
.
So I would need a config something like the one below
to catch the 1000s of hit-and-run bots
(40 pages in 90 seconds)
(10 pages in 30 seconds)
plus a lot of slower bots
.
$bInterval= 10; // seconds
$bMaxVisit= 4; // (MUST be > $bInterval)? number of .html pages requested
$bPenalty= 12000; // a very long time
$bTotVisit= 100; // basically all bad bots are OUT!
.
OK, your script probably does this already, so I'm open to your recommended config settings for my problem
.
From the log file it creates, I can then copy and paste the recorded IPs and addresses and add them to my iptables
.
thanks for your help
adrian
Let's take it in order:
In the default Apache setup, the script will catch accesses to .php pages only (by default, requests for .html pages never reach PHP). If you have added extra config so that Apache passes HTML pages through PHP, it will also catch too-fast accesses to .html pages, but only if you add this script as a prepended file for those HTML pages. The Apache config to do this is in a previous post.
The default setup in the script as supplied (link above) will work fine for PHP-supplied pages.
$bMaxVisit: if you wish to alter it, you need to take note of the comment that follows ("MUST be > $bInterval") (an explanation why is in a previous post - I forget now exactly why that is, but it is!). The default of 7 secs/14 visits for $bInterval/$bMaxVisit should be fine, since every fast scraper that I've seen exceeds the 14 visits in a couple of seconds or so.
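To make the $bInterval/$bMaxVisit pairing concrete, here is a minimal sketch of a fast-scraper test. It is not the code from the script: the function name exceeds_fast_limit() and the idea of keeping a list of hit timestamps per IP are assumptions made purely for illustration. It flags a visitor once more than $bMaxVisit hits fall inside any $bInterval-second window.

```php
<?php
// Hypothetical helper (NOT taken from the script under discussion).
// $hitTimes: Unix timestamps of one visitor's requests (assumed storage).
// Returns true when more than $bMaxVisit hits fall within $bInterval seconds.
function exceeds_fast_limit(array $hitTimes, int $bInterval, int $bMaxVisit): bool
{
    sort($hitTimes);                 // oldest first; also re-indexes from 0
    $need = $bMaxVisit + 1;          // one hit more than the allowed maximum
    $n    = count($hitTimes);

    // Slide a window of $need consecutive hits over the sorted list.
    for ($i = 0; $i + $need - 1 < $n; $i++) {
        $span = $hitTimes[$i + $need - 1] - $hitTimes[$i];
        if ($span <= $bInterval) {
            return true;             // too many hits, too close together
        }
    }
    return false;
}
```

With the defaults quoted above (7 and 14), fifteen hits in the same second trip this test, while a visitor loading one page every five seconds does not.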
$bPenalty: set it as long as you like!
$bTotVisit: 100 is probably not a good idea. It will allow 100 pages/day for each visitor, then block all further accesses from that IP. Probably not what you want! ($bTotVisit is the slow-scraper block).
Actually, it is better to use the setup given in the previous thread:
The problems with the G-bot etc are caused by the slow-scraper block, so do not use it. It is then a good idea to set the roll-over period to be shorter than 24 hours, so:
.
$bTotVisit= 0;// tot visits within $bStartOver (0==no slow-scraper block)
$bStartOver= 10800;// 3 hours (10800 secs; use 14400 for 4 hours); restart tracking
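How these two settings interact can be sketched as follows. This is only an illustration under stated assumptions, not the script's own logic: slow_scraper_check() and its per-visitor counter and window-start arguments are hypothetical names. It shows a $bTotVisit of 0 switching the slow-scraper block off, and the counter restarting once $bStartOver seconds have passed.

```php
<?php
// Hypothetical sketch (NOT the script's actual code).
// $count:       hits recorded so far for this visitor in the current period
// $windowStart: Unix time at which the current tracking period began
// Returns [blocked, newCount, newWindowStart].
function slow_scraper_check(
    int $count,
    int $windowStart,
    int $now,
    int $bTotVisit,
    int $bStartOver
): array {
    // Roll over: once the tracking period has elapsed, start counting afresh.
    if ($now - $windowStart >= $bStartOver) {
        $count       = 0;
        $windowStart = $now;
    }

    $count++;                        // record the current hit

    // 0 means "no slow-scraper block", so nobody is blocked on total visits.
    $blocked = ($bTotVisit > 0) && ($count > $bTotVisit);

    return [$blocked, $count, $windowStart];
}
```

With $bTotVisit set to 0, this never blocks however many pages are fetched; with $bTotVisit set to 100, the 101st hit inside one period is the first to be refused.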
End_of_Days initially had a problem with the server setup until a comment later in the script was spotted. S/he suggested moving the comment higher up the script. I agree. Here are the amended top-comments:
<?php
/* Items prepended with an underscore are Constants that need to be define()'d
* somewhere in your scripts before this snippet of code gets used:
* eg (*nix):
define( '_B_DIRECTORY', '/full/path/on/server/to/block_dir/' );
define( '_B_LOGFILE', 'logfile.name' );
define( '_B_LOGMAXLINES', '1000' );
* These Constants can be variables or even constant-values within the code - your choice.
*
* Make sure that _B_DIRECTORY is INSIDE script dir.
* eg: if /path/to/script/ <-- webserver accessible
* then: /path/to/script/logs/logfile.name
* not: /path/to/logs/logfile.name
* (thanks End_of_Days)
...
*/
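The placement rule in that comment can be checked mechanically. Below is a minimal sketch; b_dir_is_inside() is a hypothetical helper and not part of the script. It only compares path strings, so it does not resolve symlinks or ".." segments.

```php
<?php
// Hypothetical helper (NOT part of the script): true when $blockDir lies
// somewhere below $scriptDir. Pure string comparison; no filesystem access.
function b_dir_is_inside(string $blockDir, string $scriptDir): bool
{
    // Normalise separators and make both paths end in exactly one slash,
    // so that "/path/to/script2/" is not mistaken for a child of
    // "/path/to/script/".
    $block  = rtrim(str_replace('\\', '/', $blockDir), '/') . '/';
    $script = rtrim(str_replace('\\', '/', $scriptDir), '/') . '/';

    return $block !== $script
        && strncmp($block, $script, strlen($script)) === 0;
}
```

Using the paths from the comment, b_dir_is_inside('/path/to/script/logs/', '/path/to/script/') is true, while '/path/to/logs/' fails the check.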