written 6.8 years ago by | • modified 2.8 years ago |
Subject: Advanced Internet Technology
Topic: Search Engine Optimization
Difficulty: Medium
written 6.7 years ago by |
To offer the best possible results, search engines must attempt to discover all the public pages on the World Wide Web and then present the ones that best match up with the user’s search query.
The first step in this process is crawling the Web. The search engines start with a seed set of sites that are known to be very high quality sites, and then visit the links on each page of those sites to discover other web pages.
The link structure of the Web serves to bind together all of the pages that have been made public as a result of someone linking to them. Through links, search engines’ automated robots, called crawlers or spiders, can reach the many billions of interconnected documents.
The search engine will then load those other pages and analyze that content as well. This process repeats over and over again until the crawling process is complete.
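The crawl-and-repeat process described above is essentially a breadth-first traversal of the link graph. A minimal sketch of that idea is below; the `LINK_GRAPH` dictionary is a hypothetical stand-in for the Web, since a real crawler would perform an HTTP fetch and parse each page instead of a dictionary lookup.

```python
from collections import deque

# Hypothetical in-memory link graph standing in for real pages;
# a real crawler would fetch each URL over HTTP and parse its links.
LINK_GRAPH = {
    "seed.example/": ["seed.example/a", "seed.example/b"],
    "seed.example/a": ["seed.example/c"],
    "seed.example/b": ["seed.example/"],
    "seed.example/c": [],
}

def crawl(seeds):
    """Breadth-first crawl: start from seed pages, follow their links,
    and repeat until no undiscovered pages remain."""
    discovered = set(seeds)
    frontier = deque(seeds)
    order = []
    while frontier:
        page = frontier.popleft()
        order.append(page)
        for link in LINK_GRAPH.get(page, []):
            if link not in discovered:
                discovered.add(link)
                frontier.append(link)
    return order

print(crawl(["seed.example/"]))
# → ['seed.example/', 'seed.example/a', 'seed.example/b', 'seed.example/c']
```

Note that the seed set bootstraps the whole process: every other page is discovered only because some already-visited page links to it, which is exactly why the link structure matters.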
Search engines do not attempt to crawl the entire Web every day. In fact, they may become aware of pages that they choose not to crawl because those pages are unlikely to be important enough to include in the index.
Search engines use links on web pages to help them discover other web pages and websites.
For this reason, we strongly recommend taking the time to build an internal linking structure that spiders can crawl easily. Many sites make the critical mistake of hiding their navigation in ways that limit spider accessibility, thus impacting their ability to get pages listed in the search engines’ indexes.
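To make the accessibility point concrete, here is a small sketch using Python's standard `html.parser` to approximate what a simple spider sees. The sample page and the `buildMenu` call are invented for illustration: links written as plain `<a href>` tags are discoverable, while navigation generated only by a script leaves nothing for a basic parser to follow.

```python
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects href targets from <a> tags, roughly what a spider sees."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical page: two crawlable links plus script-only navigation.
page = """
<nav>
  <a href="/products">Products</a>
  <a href="/about">About</a>
</nav>
<!-- Navigation built only by JavaScript leaves no <a href> in the HTML,
     so a simple parser (and many spiders) cannot discover those pages. -->
<script>buildMenu(['/hidden-1', '/hidden-2']);</script>
"""

extractor = LinkExtractor()
extractor.feed(page)
print(extractor.links)  # → ['/products', '/about']
```

The pages behind `/hidden-1` and `/hidden-2` never appear in the extracted list, which is the "hidden navigation" mistake the paragraph above warns about.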
Suppose Google’s spider has reached Page A and sees links to pages B and E. However, even though pages C and D might be important pages on the site, the spider has no way to reach them (or even to know they exist), because no direct, crawlable links point to those pages.
As far as Google is concerned, those pages might as well not exist: great content, good keyword targeting, and smart marketing won’t make any difference at all if the spiders can’t reach the pages in the first place.
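The A-through-E example can be checked with a short reachability sketch. The `site` graph below is a hypothetical encoding of that scenario: Page A links to B and E, and nothing links to C or D, so a spider starting at A never finds them.

```python
# Hypothetical site graph matching the example: Page A links to B and E,
# but no crawlable links point to C or D.
site = {
    "A": ["B", "E"],
    "B": [],
    "C": ["D"],
    "D": [],
    "E": [],
}

def reachable(graph, start):
    """Return the set of pages a spider can reach by following links from start."""
    seen = {start}
    stack = [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

print(sorted(reachable(site, "A")))  # → ['A', 'B', 'E']; C and D are invisible
```

Even though C links to D, that does not help: the subgraph containing C and D is disconnected from the crawl frontier, so neither page can ever be listed in the index.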