gcc.gnu.org website blocking browser by user-agent?

Jason Molenda jason-swarelist@molenda.com
Tue Sep 9 20:49:00 GMT 2003


sources.redhat.com does switch on HTTP_USER_AGENT a little.  I have
a vain attempt to catch spam-harvester agents.  From the config file:

# RewriteEngine section.
# These rewrites exist to catch spambots who are harvesting e-mail addresses
# from the web mailing list archives.
#
# I got this Rewrite set from http://mosa.unity.ncsu.edu/brabec/antispam.html
# Other info at http://www.turnstep.com/Spambot/
# 
# www.mailing-list.com has to deal with the same problem in a more serious
# way; they've actually stopped putting e-mail addresses in web archives
# altogether.  That's a little harsh, so we'll try just this for now.
#

  RewriteEngine  on
  RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon       [OR]
  RewriteCond %{HTTP_USER_AGENT} ^EmailWolf         [OR]
  RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro      [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT     [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Crescent          [OR]
  RewriteCond %{HTTP_USER_AGENT} ^CherryPicker      [OR]
  RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit   [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.*  [OR]
  RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO         [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Telesoft          [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster     [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL     [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Mozilla/3.Mozilla/2.01 [OR]
  RewriteCond %{HTTP_USER_AGENT} ^.*;\ DigExt       [OR]
  RewriteCond %{HTTP_USER_AGENT} ^EmailCollector
  RewriteRule ^.*$ /badspammer.html  [L]

  RewriteLog logs/rewrite_log
  RewriteLogLevel 0



This code is present in both the generic section of httpd.conf and
the sources.redhat.com VirtualHost definition.  In Apache 1.3.x, the
settings in the generic part of httpd.conf had no effect on any of
the virtual hosts - I should have removed them.

But none of these should match Firebird.  I expect if the user had
been redirected to this page he would have mentioned that in his
message.
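
One rough way to check this is to run a user-agent string against the same
patterns outside Apache.  The sketch below is only an approximation: it uses
Python's re module rather than Apache's own regex engine (the patterns are
simple enough that the two should agree), the helper name
first_matching_pattern is made up for illustration, and the Firebird string is
an assumed sample of what such a browser sends, not the reporter's actual
header.

```python
import re

# Patterns copied from the RewriteCond lines in the config above.
BLOCKED_PATTERNS = [
    r"^EmailSiphon",
    r"^EmailWolf",
    r"^ExtractorPro",
    r"^Mozilla.*NEWT",
    r"^Crescent",
    r"^CherryPicker",
    r"^[Ww]eb[Bb]andit",
    r"^WebEMailExtrac.*",
    r"^NICErsPRO",
    r"^Telesoft",
    r"^Zeus.*Webster",
    r"^Microsoft.URL",
    r"^Mozilla/3.Mozilla/2.01",
    r"^.*;\ DigExt",
    r"^EmailCollector",
]

# Assumed sample user-agent strings, for illustration only.
FIREBIRD_UA = ("Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5a) "
               "Gecko/20030728 Mozilla Firebird/0.6.1")
DIGEXT_UA = "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)"


def first_matching_pattern(user_agent):
    """Return the first blocked pattern that matches, or None if none do."""
    for pattern in BLOCKED_PATTERNS:
        # The conditions are joined with [OR], so one match is enough.
        if re.search(pattern, user_agent):
            return pattern
    return None


if __name__ == "__main__":
    for ua in (FIREBIRD_UA, DIGEXT_UA, "EmailSiphon/1.0"):
        print(ua, "->", first_matching_pattern(ua))
```

With these sample strings the Firebird one matches nothing, while the DigExt
and EmailSiphon ones are caught, which is consistent with the expectation
above.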

AFAIK that's the only thing done based on HTTP_USER_AGENT on the site.

J


