Tuesday, August 18, 2009

How easy is it for search engines to crawl your site?

URLs are like the bridges between your website and a search engine’s crawler: crawlers need to be able to find and cross those bridges in order to get to your site’s content. If your URLs are complicated or redundant, crawlers are going to spend time tracing and retracing their steps; if your URLs are organized and lead directly to distinct content, crawlers can spend their time accessing your content rather than crawling through empty pages, or crawling the same content over and over via different URLs.
Google also shares some best-practice tips for making sure your URLs are easy for the Google crawler to crawl and index.
There are a few things Google recommends you do to set up your URLs correctly, which will help crawlers find and crawl your content faster. These include:
Remove user-specific details from URLs.
URL parameters that don’t change the content of the page—like session IDs or sort order—can be removed from the URL and put into a cookie.
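As a sketch of that idea, the snippet below strips non-content parameters from a URL so every visitor (and crawler) ends up at one canonical address. The parameter names (`sessionid`, `sid`, `sort`) and the example URLs are illustrative assumptions, not a fixed standard; in practice the stripped values would be moved into a cookie.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumed parameter names that don't change the page's content.
NON_CONTENT_PARAMS = {"sessionid", "sid", "sort"}

def canonicalize(url):
    """Return the URL with non-content query parameters removed."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query)
            if k.lower() not in NON_CONTENT_PARAMS]
    # Rebuild the URL from the kept parameters only (fragment dropped).
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))

print(canonicalize("http://example.com/shop?category=shoes&sessionid=abc123&sort=price"))
# -> http://example.com/shop?category=shoes
```

With the session ID and sort order out of the URL, the crawler sees one address for the page instead of a different one per visitor.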
Disallow actions Googlebot can’t perform.
Using your robots.txt file, you can disallow crawling of login pages, contact forms, shopping carts, and other pages whose sole functionality is something that a crawler can’t perform.
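A minimal robots.txt for this might look like the fragment below, placed at the site root; the paths are illustrative assumptions for a typical login page, contact form, and shopping cart.

```
User-agent: *
Disallow: /login
Disallow: /contact
Disallow: /cart
```

This keeps Googlebot from spending its crawl time on pages it can't do anything useful with.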
One URL, one set of content.
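When the same content is still reachable at more than one URL, one way to consolidate the duplicates is a canonical link element pointing at your preferred URL. The URLs below are an assumed example:

```
<!-- On the duplicate page, e.g. http://example.com/shop?sort=price,
     tell crawlers which URL is the preferred one. -->
<link rel="canonical" href="http://example.com/shop">
```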

Here is a presentation, made in Google Docs, that explains the subject.
