Spider:
A program that automatically fetches Web pages. Spiders are used to feed pages to search engines. They are called spiders because they crawl over the Web. Another term for these programs is web crawler.
Last week we looked at search engine “indexes” and how they are built. The automated way that a search engine builds its index is by using a “spider”, otherwise known as a “robot” or a “webspider”.
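To make the idea concrete, here is a minimal sketch of what a spider does: fetch a page, pull out its links, and queue up any it hasn't seen yet. This is an illustration only, not how any particular search engine works. The names `LinkExtractor` and `crawl` are made up for this example, and `fetch` is an assumed function you supply that takes a URL and returns the page's HTML (or None if the page can't be fetched).

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl starting at start_url.

    fetch is a hypothetical callable (URL -> HTML text, or None on failure).
    Returns a dict mapping each visited URL to its HTML.
    """
    seen = {start_url}          # URLs already queued, so nothing is fetched twice
    queue = deque([start_url])  # URLs waiting to be fetched
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links like "/about"
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

A real spider does a good deal more than this. Among other things, it checks each site's robots.txt file and paces its requests so it doesn't overload the server.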
Indexing the internet by hand is always going to be a game of catch-up. I’ve read that there are anywhere from 1,200 to 100,000 new websites added to the internet per day. Search engines have to keep these spiders running all of the time, just to keep up with those new sites.
One important thing to note is that when a spider visits your site, the visit is recorded by the web server just like a human visit. This is important to know if you ever look at the “hits” on your website - depending on how sophisticated your traffic reporting software is, you might be seeing some traffic that didn’t come from actual human eyeballs reading your site.
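If your reporting software doesn't separate the two, a rough way to do it yourself is to look at the User-Agent that each visitor reports. The sketch below assumes your server writes logs in the common "combined" format, where the User-Agent is the last quoted field on each line. The `BOT_MARKERS` list is my own guess at substrings that crawlers tend to include in their names, and `is_spider` and `count_hits` are names made up for this example. Spiders that don't identify themselves will slip through as ordinary traffic.

```python
import re

# Assumed substrings that crawlers commonly include in their User-Agent.
BOT_MARKERS = ("bot", "spider", "crawl", "slurp")

# In the combined log format, the User-Agent is the last quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def is_spider(log_line):
    """Return True if the line's User-Agent looks like a crawler."""
    match = USER_AGENT_RE.search(log_line)
    if not match:
        return False
    agent = match.group(1).lower()
    return any(marker in agent for marker in BOT_MARKERS)


def count_hits(log_lines):
    """Tally requests into spider traffic and everything else."""
    counts = {"spider": 0, "other": 0}
    for line in log_lines:
        counts["spider" if is_spider(line) else "other"] += 1
    return counts
```

To try it, pass an open log file straight in, for example `count_hits(open("access.log"))`, where access.log stands in for wherever your server keeps its log.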
Looking at the traffic here on Boyink.com, I get anywhere from 200 to 1300 page requests from spiders per day. This is a good thing, as I know the search engines are keeping a close eye on the new content I keep adding. It normally doesn’t take more than a day for my new content to be indexed.
Which brings up another good point - smart search engines like Google identify websites that update often, and have the spider visit more often. It might seem obvious, but if you want better search rankings, one way to get them is to simply develop and maintain a schedule for adding content to your website.
OK - now we have an index for a search engine to use in bringing search results back to you. How does it figure out how to order the results? This is known as search “ranking”, and we’ll look at it next week.