SpamBot Administration

When a robot visits a web site, say http://www.jaaronanderson.com/, it first checks for http://www.jaaronanderson.com/robots.txt. If it finds this file, it analyzes the contents to see which documents it is allowed to retrieve. You can customize the robots.txt file to apply only to specific robots, and to disallow access to specific directories or files.
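As a rough sketch of that check, Python's standard urllib.robotparser module can read a robots.txt file and answer "may this robot fetch this URL?". The file contents, the /private/ and /drafts/ paths, and the robot name SomeCrawler are made up for illustration; only the site URL comes from the text above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one section for a specific robot,
# one catch-all section that disallows a directory and a single file.
SAMPLE_ROBOTS_TXT = """\
User-agent: EmailSiphon
Disallow: /

User-agent: *
Disallow: /private/
Disallow: /drafts/notes.html
"""

SITE = "http://www.jaaronanderson.com"

rules = RobotFileParser()
# parse() takes the file as a list of lines, so no network access is needed.
rules.parse(SAMPLE_ROBOTS_TXT.splitlines())

# A polite robot asks before every request.
for agent, path in [
    ("EmailSiphon", "/index.html"),
    ("SomeCrawler", "/private/plans.html"),
    ("SomeCrawler", "/drafts/notes.html"),
    ("SomeCrawler", "/index.html"),
]:
    allowed = rules.can_fetch(agent, SITE + path)
    print(f"{agent:12} {path:22} allowed={allowed}")
```

Keep in mind this only models a robot that chooses to obey the file. Nothing in robots.txt forces a robot to ask.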

There can only be a single “/robots.txt” on a site.

Robots can also be steered with META tags, although this is not very effective:
<META name="ROBOTS" content="NOINDEX, NOFOLLOW">
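To show what a cooperating robot might do with a ROBOTS META tag, here is a minimal sketch using Python's standard html.parser module. The class name RobotsMetaParser and the helper robots_directives are hypothetical names chosen for this example, and the sample page is invented.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of lowercase robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


page = '<html><head><META name="ROBOTS" content="NOINDEX, NOFOLLOW"></head></html>'
found = robots_directives(page)
print("index this page:", "noindex" not in found)
print("follow its links:", "nofollow" not in found)
```

As with robots.txt, the tag is only a request. A robot has to download the page to see it, and a bad robot is free to ignore it.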

As of August 2008, according to botsvsbrowsers, there were at least 192,894 user agents and 2,070 indexing bots!

Bots Used By Spammers

Unless you enjoy receiving lots of spam, you don’t want bad bots like the ones below on your web site. They scan web pages for email addresses to send their junk email to. These bots ignore the robots.txt file because they want to find new email addresses by any means possible. They don’t care whether you want them on your web site or not.

Start Of User-Agent String
EmailSiphon
EmailWolf
ExtractorPro
CherryPicker
NICErsPRO
Teleport
EmailCollector
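One way to put the list above to work is to scan your access log for User-Agent strings that start with one of these names. This is only a sketch: it assumes the log is in Apache's combined format, where the User-Agent is the last quoted field on each line, and the sample log lines are invented. Matching without regard to case is a choice made here, not something the list specifies.

```python
import re

# Prefixes taken from the list above.
BAD_BOT_PREFIXES = (
    "EmailSiphon",
    "EmailWolf",
    "ExtractorPro",
    "CherryPicker",
    "NICErsPRO",
    "Teleport",
    "EmailCollector",
)

# Assumption: combined log format, so the User-Agent is the last "..." field.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def user_agent_from_log_line(line):
    """Return the User-Agent from one log line, or None if it has none."""
    match = LAST_QUOTED_FIELD.search(line)
    return match.group(1) if match else None


def is_spam_bot(user_agent):
    """True if the User-Agent starts with a listed prefix (case-insensitive)."""
    lowered = user_agent.lower()
    return lowered.startswith(tuple(p.lower() for p in BAD_BOT_PREFIXES))


sample_log = [
    '203.0.113.7 - - [10/Aug/2008:13:55:36 -0700] "GET /contact.html HTTP/1.1" 200 2326 "-" "EmailSiphon"',
    '198.51.100.4 - - [10/Aug/2008:13:56:02 -0700] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U)"',
]

for line in sample_log:
    agent = user_agent_from_log_line(line)
    if agent is not None and is_spam_bot(agent):
        print("spam bot hit:", agent)
```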

If an entry shows up in your logs and you cannot identify it, try a User-Agent string analyzer.

There is still another way to refuse them access to your site: they can be blocked with a .htaccess file instead. This approach attempts to ban these bots from your web hosting account outright, and you add each crawler you want to target to the file.
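As a hedged illustration of the .htaccess approach, the short script below prints a set of mod_rewrite rules that answer 403 Forbidden to any User-Agent starting with a listed name. It assumes an Apache server with mod_rewrite enabled and .htaccess overrides allowed, which your host may or may not provide. The function name build_htaccess_rules is made up for this example.

```python
import re

# Prefixes from the "Start Of User-Agent String" list in this post.
BAD_BOT_PREFIXES = [
    "EmailSiphon",
    "EmailWolf",
    "ExtractorPro",
    "CherryPicker",
    "NICErsPRO",
    "Teleport",
    "EmailCollector",
]


def build_htaccess_rules(prefixes):
    """Build mod_rewrite rules that forbid User-Agents starting with a prefix."""
    if not prefixes:
        # A bare RewriteRule with no conditions would block every visitor.
        return ""
    lines = ["RewriteEngine On"]
    for index, prefix in enumerate(prefixes):
        # Conditions are chained with OR; the last one must not carry OR.
        flags = "[NC]" if index == len(prefixes) - 1 else "[NC,OR]"
        pattern = "^" + re.escape(prefix)
        lines.append(f"RewriteCond %{{HTTP_USER_AGENT}} {pattern} {flags}")
    # "-" means no substitution; F sends 403 Forbidden.
    lines.append("RewriteRule .* - [F]")
    return "\n".join(lines) + "\n"


print(build_htaccess_rules(BAD_BOT_PREFIXES))
```

To target another crawler, add its name to the list and regenerate the rules. Test the result on a copy of your site first, since a mistake in .htaccess can lock out real visitors.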

A robots.txt file has one and only one proper way to include comments: put them after a hash mark (#).
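A quick way to convince yourself that hash-mark comments are harmless is to feed a commented file to Python's standard urllib.robotparser and check that the rules still apply. The file contents and the robot name AnyBot are invented for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a full-line comment and a trailing comment.
COMMENTED_ROBOTS_TXT = """\
# Keep all robots out of the cgi-bin directory.
User-agent: *
Disallow: /cgi-bin/   # everything from the hash mark on is ignored
"""

parser = RobotFileParser()
parser.parse(COMMENTED_ROBOTS_TXT.splitlines())

blocked = parser.can_fetch("AnyBot", "http://www.jaaronanderson.com/cgi-bin/form.pl")
open_page = parser.can_fetch("AnyBot", "http://www.jaaronanderson.com/index.html")
print("cgi-bin allowed:", blocked)
print("index allowed:", open_page)
```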


