Feb 3, 2010

How Search Engines Work

The first basic truth you need to learn about SEO is that search engines are not humans. While this might be obvious to everybody, the differences between how humans and search engines view web pages are not. Unlike humans, search engines are text-driven. Although technology advances rapidly, search engines are far from intelligent beings that can feel the beauty of a cool design or enjoy the sounds and movement in movies. Instead, search engines crawl the Web, looking at particular site items (mainly text) to get an idea of what a site is about. This brief explanation is not the most precise because, as we will see next, search engines perform several activities in order to deliver search results: crawling, indexing, processing, calculating relevancy, and retrieving.

First, search engines crawl the Web to see what is there. This task is performed by a piece of software called a crawler or a spider (or Googlebot, as is the case with Google). Spiders follow links from one page to another and index everything they find on their way. Having in mind the number of pages on the Web (over 20 billion), it is impossible for a spider to visit a site daily just to see if a new page has appeared or if an existing page has been modified. Sometimes crawlers will not visit your site for a month or two, so during this time your SEO efforts will not be rewarded. But there is nothing you can do about it, so just stay calm.
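To make the crawling step a bit more concrete, here is a minimal sketch in Python of how a spider might follow links from one page to the next. It is only an illustration: the start URL and the small page limit are hypothetical, and a real crawler also honours robots.txt, politeness delays and a vastly larger queue.

# Minimal breadth-first link-following sketch (illustration only).
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Collect the target of every <a href="..."> on the page.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(start_url, max_pages=10):
    """Follow links breadth-first and return the raw HTML of each page visited."""
    queue, seen, pages = [start_url], set(), {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="ignore")
        except Exception:
            continue  # unreachable pages are simply skipped
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        queue.extend(urljoin(url, link) for link in parser.links)
    return pages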

What you can do is check what a crawler sees on your site. As already mentioned, crawlers are not humans and they do not see images, Flash movies, JavaScript, frames, password-protected pages and directories, so if you have tons of these on your site, you'd better run a spider simulator to see if these goodies are viewable by the spider. If they are not viewable, they will not be spidered, not indexed, not processed, etc. - in a word, they will be non-existent for search engines.
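The core idea behind such a simulator is simple: keep only the plain text of a page and ignore everything a text-driven crawler cannot read. The Python sketch below is a rough, simplified take on that idea, assuming that scripts, styles and embedded objects are the main things to strip out.

# Rough "spider's-eye view" of a page: plain text only (illustration only).
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    # Container tags whose contents a text-driven crawler will not index.
    SKIPPED = {"script", "style", "object", "iframe"}

    def __init__(self):
        super().__init__()
        self.skipping = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skipping += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skipping:
            self.skipping -= 1

    def handle_data(self, data):
        if not self.skipping and data.strip():
            self.chunks.append(data.strip())

def spider_view(html):
    """Return roughly the text a crawler would see on a page."""
    parser = TextOnly()
    parser.feed(html)
    return " ".join(parser.chunks)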

After a page is crawled, the next step is to index its content. The indexed page is stored in a giant database, from where it can later be retrieved. Essentially, the process of indexing is identifying the words and expressions that best describe the page and assigning the page to particular keywords. A human would not be able to process such amounts of information, but search engines generally deal with this task just fine. Sometimes they might not get the meaning of a page right, but if you help them by optimizing it, it will be easier for them to classify your pages correctly and for you to get higher rankings.
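In miniature, indexing can be pictured as an inverted index: each word points to the pages that contain it. The pages dictionary below (URL to extracted text) is a hypothetical stand-in for the output of the crawling step sketched earlier; real indexes also store positions, phrases and far more detail.

# Toy inverted index: word -> set of URLs containing that word.
import re
from collections import defaultdict

def build_index(pages):
    """Build an inverted index from a dict of URL -> page text."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index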

When a search request comes, the search engine processes it, i.e. it compares the search string in the search request with the indexed pages in the database. Since it is likely that more than one page (practically, it is millions of pages) contains the search string, the search engine starts calculating the relevancy of each of the pages in its index to the search string.
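A very rough picture of this matching step, continuing the toy index above: collect every page that contains all the words of the search string. The simple set intersection below is of course a drastic simplification of what real engines do.

def matching_pages(index, query):
    """Return the URLs of all indexed pages containing every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result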

There are various algorithms used to calculate relevancy. Each of these algorithms gives different relative weights to common factors like keyword density, links, or metatags. That is why different search engines give different search results pages for the same search string. What is more, it is a known fact that all major search engines, like Yahoo!, Google, MSN, Ask.com etc., periodically change their algorithms, and if you want to stay at the top, you also need to adapt your pages to the latest changes. This is one reason (the other is your competitors) to devote ongoing effort to SEO, if you'd like to stay at the top.
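The effect of different weights is easy to see in a toy formula. The factor names and numbers below are invented purely for illustration; no search engine publishes its real weighting, and the real formulas involve hundreds of signals.

# Hypothetical weights for a handful of common ranking factors.
WEIGHTS = {"keyword_density": 0.5, "inbound_links": 0.3, "keyword_in_meta": 0.2}

def relevancy(factors):
    """factors maps a factor name to a normalised value between 0 and 1."""
    return sum(WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS)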

The last step in a search engine's activity is retrieving the results. Essentially, this is nothing more than displaying them in the browser, i.e. the endless pages of search results sorted from the most relevant to the least relevant sites.
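Put together, retrieval amounts to scoring the matching pages, sorting them and showing the best ones first. The sketch below reuses the toy pieces above; factors_for is a hypothetical helper that would compute the ranking factors for a given page and query.

def top_results(index, query, factors_for, limit=10):
    """Score every matching page and return the 'limit' most relevant URLs."""
    scored = [(relevancy(factors_for(url, query)), url)
              for url in matching_pages(index, query)]
    scored.sort(reverse=True)
    return [url for _score, url in scored[:limit]]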
