Are there any clear indicators that my sitemap file is beneficial?

Solution 1

A Sitemap file helps search engines to discover new and updated URLs on your website. In particular, if your website is fairly large, then this can help them to be able to focus on the new & updated content, instead of having to blindly crawl through everything to see if anything has changed. That can result in new content being found much faster, which can be quite noticeable especially if the site is larger or more complex.
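For illustration, here is a minimal sketch of a Sitemap file that flags new and updated URLs through the optional `lastmod` element of the sitemaps.org protocol. The example.com URLs, the dates, and the `build_sitemap` helper are assumptions made up for the example; a CMS or generator tool would normally produce this file for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    last_modified is a W3C date string such as "2012-05-01".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = last_modified
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; replace with the URLs and dates from your own site.
pages = [
    ("https://www.example.com/", "2012-05-01"),
    ("https://www.example.com/blog/new-post", "2012-05-03"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

The output would be saved as something like `sitemap.xml` and submitted to the search engine; keeping `lastmod` honest (only changing it when the page really changed) is what makes it useful as a hint.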

With Google in particular (I work at Google; I don't know how other search engines handle these), it also does the following:

  • Find the number of indexed URLs for your website: These statistics are recalculated daily and very accurate. You can find these in the Sitemaps detail page.
  • Discover canonicalization issues: If the numbers there don't match up, that's frequently a sign that you're specifying URLs in the Sitemap file that don't match what we find during our crawling. That's usually a sign that you need to work on canonicalization.
  • Help with canonicalization: When we find multiple URLs on your site that show identical content, we will give any URL that's listed in a Sitemap an extra edge, even if you don't use other canonicalization methods.
  • Find badly-indexed parts of your site: These counts are supplied per Sitemap file, so you can create separate Sitemap files for logical sections of your site, to discover areas where Google isn't indexing as much as you'd like.
  • Prioritize crawl errors: In the crawl errors section, URLs that were specified in Sitemaps files are listed separately. Since you specifically supplied these URLs, we assume that you want them indexed, and that any crawl errors there are important.
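
The idea of separate Sitemap files per logical section can be sketched roughly as follows. This assumes that the first path segment of a URL is a reasonable stand-in for a "section"; the `group_by_section` helper, the `sitemap-<section>.xml` file names, and the example URLs are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_section(urls):
    """Map a per-section sitemap file name to the URLs that belong in it.

    The section is taken from the first path segment; URLs at the site
    root are collected under "root".
    """
    sections = defaultdict(list)
    for url in urls:
        first_segment = urlsplit(url).path.strip("/").split("/")[0]
        section = first_segment or "root"
        sections[f"sitemap-{section}.xml"].append(url)
    return dict(sections)


# Hypothetical URLs for the example.
urls = [
    "https://www.example.com/",
    "https://www.example.com/blog/post-1",
    "https://www.example.com/blog/post-2",
    "https://www.example.com/shop/red-shoes",
]
for file_name, section_urls in group_by_section(urls).items():
    print(file_name, len(section_urls))
```

Each file then gets its own indexed count, so a section where the submitted and indexed numbers diverge sharply stands out.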

Additionally, you can use several extensions in Sitemaps files (eg for images, video, News, or internationalization), should you choose to do that. These extensions are all optional.

For most websites, the most visible element of Sitemaps files is that you can see the indexed URL count. It can take a day or so to appear, so if you just submitted a Sitemap for the first time, you may need to be a bit patient. While other ways (eg a site:-query) are very, very rough approximations, this count is extremely accurate.

Edited to add: another thing that I personally find extremely useful with regards to Sitemaps is that if you're not generating them directly with your CMS, you invariably find out a lot about how your website is crawlable, and what kind of URLs are discovered during that process. I've seen many cases where crawling a website with a tool on your side (eg a Sitemaps generator) will bubble up issues that you might miss otherwise, be that session-IDs in URLs, duplicate content through URL differences, infinite spaces (such as endless calendars), or even parts of a site that aren't linked at all.
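
As a rough sketch of the kind of check a crawl of your own site makes possible, the snippet below flags session IDs in query strings and groups URLs that differ only in ways that often point at duplicate content. The parameter names in `SESSION_PARAMS`, the normalization rules (ignoring `www.`, trailing slashes, and session parameters), and the example URLs are assumptions for illustration; whether two such URLs really serve the same content has to be verified on your site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlsplit

# Assumed query parameter names that commonly carry session IDs.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid"}


def has_session_id(url):
    """True if the query string contains a parameter from SESSION_PARAMS."""
    query = urlsplit(url).query
    return any(
        name.lower() in SESSION_PARAMS
        for name, _ in parse_qsl(query, keep_blank_values=True)
    )


def normalize(url):
    """Reduce a URL to a key that ignores www., trailing slash, session IDs."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    params = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    )
    return (host, path, tuple(params))


def likely_duplicates(urls):
    """Return groups of URLs that share the same normalized key."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical crawl output.
crawled = [
    "https://www.example.com/shoes/",
    "https://example.com/shoes",
    "https://example.com/shoes?sid=abc123",
    "https://example.com/hats",
]
print([url for url in crawled if has_session_id(url)])
print(likely_duplicates(crawled))
```

Anything this reports is only a candidate for cleanup; the usual fix is to settle on one URL form, list that one in the Sitemap, and link to it consistently.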

Solution 2

If you're not getting any errors then you can assume Google has parsed it and is aware of the contents. But that doesn't mean they will crawl and/or index those pages. Sitemaps are just another way to tell search engines about your pages. They are not obligated to crawl and index any or all of those pages. The same applies to them finding pages through links or URL submissions.

Solution 3

Google usually does a good job of crawling your website if you have a good number of quality links. If you're spending a lot of time looking at the number of pages you have indexed, I would suggest it's better to improve your site and get some quality links.

Author by

Stephen Ostermiller

Updated on September 18, 2022

Comments

  • Stephen Ostermiller
    Stephen Ostermiller almost 2 years

    I have recently created a sitemap.xml file and uploaded it to my Google Webmasters Tools account. Google didn't report any issues or errors with the uploaded sitemap of my site.

    Now my question is:

    • How do I know if my sitemap is working within Google Webmaster Tools?

    The reason I ask is I don't know what I'm supposed to be seeing or looking for, and it feels like I've uploaded a useless file.

  • Su'
    Su' about 12 years
    The file has already been validated. That's not the question.
  • Dan S
    Dan S about 12 years
    Great info. I have also heard that with some high-traffic, heavily crawled sites it's better not to use a sitemap because Google does a better job crawling, and if anything is missing from your sitemap it might stop getting indexed.
  • Su'
    Su' about 12 years
    @Chris_O You're mashing different problems together. In that example, it's not that Google is doing a "better" job crawling on its own; the sitemap itself is faulty. That isn't a direct line of argument to "don't use a sitemap at all." The solution to that situation is to fix the sitemap. Additionally, sitemaps are informative, not directives. Something missing from a sitemap doesn't mean Google won't find it on its own, or disregard it.
  • Dan S
    Dan S about 12 years
    The site in question has over 40k indexed pages, and new content gets indexed in less than 5 minutes (with no sitemap). Based on your answer we're going to start building them and break them into years.
  • Franz
    Franz about 12 years
    @john-mueller Hi Mr. M. I once submitted a test sitemap with 1000 URLs and got back an index count of about 700. We then tested all 1000 URLs via site:www.complete.org/url/to/the/page.html and got back a count way below the 700 URLs (more in the 200 region). What does this mean?
  • John Mueller
    John Mueller about 12 years
    @Franz There are sometimes details involved which make it hard to reproduce the indexed URL count with site:-queries. For example, there are situations where we might combine multiple URLs and only show one of them for a site:-query. So if you see a difference there, it's usually not worth worrying about.
  • cboettig
    cboettig over 11 years
    Great answer. Does it make a difference if the sitemap is an XML sitemap vs a simple .txt sitemap?
  • JCL1178
    JCL1178 over 11 years
    @cboettig Not for simple indexing of text content. The XML vs TXT sitemap mainly concerns metadata. See here: webmasters.stackexchange.com/questions/43407/…
  • Pepone
    Pepone over 8 years
    Bing/Yahoo also use sitemaps for discovery. When we implemented sitemaps on kellysearch a while back, we got a 15% boost in organic traffic from Bing.