01 February 2009

Blogging Tips: What Is a Sitemap? Do I Need It?

Rajasekharan asks:

I couldn’t find an XML sitemap in your blog. Do you have one? If no, why are you reluctant to include an XML sitemap?

First things first: what is a sitemap? According to Sitemaps.org:

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is, relative to other URLs in the site) so that search engines can more intelligently crawl the site.

Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data to allow crawlers that support Sitemaps to pick up all URLs in the Sitemap and learn about those URLs using the associated metadata. Using the Sitemap protocol does not guarantee that web pages are included in search engines, but provides hints for web crawlers to do a better job of crawling your site.

The key message is in the second paragraph: “Web crawlers usually discover pages from links within the site and from other sites. Sitemaps supplement this data.”

Google and the other search engines rely mainly on their bots and web crawlers to discover new pages around the web. Sitemaps help them in some situations, but they are neither compulsory nor necessarily beneficial.
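To make the quoted definition concrete, here is a minimal sketch of how such an XML file could be generated. The `build_sitemap` helper, the example.com URLs, and the metadata values are all hypothetical; only the element names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the Sitemap protocol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return a Sitemap XML string for a list of URL entries.

    Each entry is a dict with a required "loc" key and optional
    "lastmod", "changefreq" and "priority" keys (hypothetical input format).
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        # The metadata is optional: it is only a hint for crawlers.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages, for illustration only.
example_entries = [
    {
        "loc": "https://example.com/",
        "lastmod": "2009-02-01",
        "changefreq": "daily",
        "priority": 0.8,
    },
    {"loc": "https://example.com/about"},
]

print(build_sitemap(example_entries))
```

Only `loc` is required for each URL. The other three fields are the "additional metadata" mentioned in the definition, and crawlers are free to ignore them.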

Here is what Google itself says about Sitemaps:

Sitemaps are particularly helpful if:

  • Your site has pages that aren’t easily discovered by Googlebot during the crawl process - for example, pages featuring rich AJAX or Flash.
  • Your site is new and has few links to it. (Googlebot crawls the web by following links from one page to another, so if your site isn’t well linked, it may be hard for us to discover it.)
  • Your site has a large archive of content pages that are not well linked to each other, or are not linked at all.

Unless your site falls into one of those cases, then, a Sitemap will not be “particularly helpful” for it.

In fact, some people claim that a Sitemap can even have the opposite effect. There are reported cases where adding a Sitemap actually reduced the crawl rate of the website, possibly because Google no longer needed to crawl the site completely to discover its internal pages.

On most of my sites I don’t use Sitemaps because I always try to have an efficient internal link structure in place. For example, on Daily Blog Tips every single page of the site is accessible within two clicks of the homepage.
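If you want to check the same "within two clicks" property on your own site, a breadth-first search over the internal link graph is one way to do it. The sketch below is only an illustration: the `links` dictionary (page to pages it links to) is a hypothetical, hand-written input, and in practice you would have to build it by crawling your own pages.

```python
from collections import deque


def click_depths(links, homepage):
    """Return the minimum number of clicks from the homepage to each reachable page."""
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def pages_beyond(links, homepage, max_clicks=2):
    """Return pages that are unreachable or more than max_clicks from the homepage."""
    depths = click_depths(links, homepage)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(p for p in all_pages if depths.get(p, float("inf")) > max_clicks)


# Hypothetical link graph, for illustration only.
example_links = {
    "/": ["/archive", "/about"],
    "/archive": ["/post-1", "/post-2"],
    "/post-2": ["/deep"],
    "/orphan": [],
}

print(pages_beyond(example_links, "/"))
```

Any page this reports is either buried too deep or not linked at all, which is the kind of page a Sitemap (or, better, a fixed link structure) would help crawlers find.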

If your website has indexing problems, has a poor internal link structure, or falls into one of the three situations described by Google, then using a Sitemap would probably be a good idea.
