Spiders – How They Index The Web
Spiders are the elements of search engines that find and capture web pages. Though that’s pretty near the mark, if we’re being picky it’s not strictly true: they have nothing to do with the indexing process beyond the initial data capture of individual web pages. In fact, they’re not even really called spiders; that’s just the term we SEOs use for them.
The term the engines use for them is bots, short for robots, which is a good description. Why spiders? Well, the best description of what they do is spidering or crawling! If you imagine the websites a spider visits as dots on a piece of paper, then join up the dots, hey presto: you’ve got a spider’s web.
Bots are automated programs. That is, they’re told by the search engines to behave in a certain way given certain data. So a search engine gives its bots a list of websites, and the spiders trundle off at lightning speed capturing the source code of each site so the engine can start indexing it. A bot will report back to its engine on any links it finds, and when the engine comes across a link it hasn’t seen yet, it will add that link to the list of another of its spiders.
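The crawl loop described above (visit a page, capture it, report new links, queue them for another visit) is essentially a breadth-first traversal. Here’s a toy sketch in Python; the in-memory `web` dictionary and the site names are made up stand-ins for real HTTP fetches, which a real bot would perform instead:

```python
from collections import deque

# Toy stand-in for the web: each "page" maps to the links found on it.
# A real spider would download each page and extract links from its HTML.
web = {
    "site-a.com": ["site-b.com", "site-c.com"],
    "site-b.com": ["site-c.com"],
    "site-c.com": ["site-a.com"],
}

def crawl(seed_list):
    """Visit every reachable page once, breadth-first, like a spider."""
    queue = deque(seed_list)   # the engine's list of sites to crawl
    seen = set(seed_list)      # links the engine has already come across
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)     # "capture the source code" of this page
        for link in web.get(page, []):  # report links back to the engine
            if link not in seen:        # a link it hasn't seen yet...
                seen.add(link)
                queue.append(link)      # ...joins the crawl list
    return order

print(crawl(["site-a.com"]))
```

The `seen` set is what stops the spider going round in circles when sites link back to each other, as site-c.com links back to site-a.com here.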
If this all sounds a bit complicated, there are a few key learning points.
1. The search engines have a system of scouring the web for sites automatically.
2. As it’s done automatically, by machine, your website has to comply with certain rules, otherwise it can’t be read.
3. Bots/spiders love reporting links! Hint: get some links to your site!