How do I stop Google from indexing AJAX calls within JavaScript?


Solution 1

I would disallow the Ajax URL in robots.txt:

Disallow: /somepage/AjaxFunction.aspx

That will prevent Google from crawling it. Google doesn't typically index URLs it can't crawl; it will only index them if they are linked prominently, especially from numerous external links. Even if Google does index the URL, it won't index the content of the URL. Google will show "this page was blocked by robots.txt" in the search results.

In many cases it is actually desirable to allow Google to crawl Ajax URLs. Those URLs may provide content that you want Google to index after JavaScript writes it into another page. In that case, robots.txt is not appropriate. You just don't want Google including the Ajax URL itself in the search results. You can use a header directive for that:

X-Robots-Tag: noindex

On an Apache server you can add that header with .htaccess code like:

<Files "AjaxFunction.aspx">
    Header append X-Robots-Tag "noindex"
</Files>

In aspx code you can set it like this:

<% Response.AppendHeader("X-Robots-Tag", "noindex"); %>

Solution 2

Do not add an X-Robots-Tag "noindex" header to the AJAX response. It may block the main HTML page.

We thought it would be a good idea and did it in a project. What happened is that the "noindex" header on the AJAX portion of the page carried over to the "mother" HTML page: Google considered that we were serving a "noindex" header for the mother page itself.

This is the result in our GSC. Of course, we have since removed the "noindex" header and requested validation. We are now looking for a better solution.

GSC screenshot showing how "noindex" on AJAX portions affects the main HTML page

Solution 3

Your Ajax URLs should not return plain HTML (content-type: text/html; charset=utf-8). Returning JSON instead, for example, works well:

[....]
content-type: application/json

{"html":"<div>content<\/div>"}

Then you can easily parse the JSON response and add it to the DOM where needed.
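As a sketch, the client-side half might look like this; the endpoint path and target selector are illustrative placeholders (the path is borrowed from the question), not a tested setup:

```javascript
// Pull the HTML fragment out of the JSON body returned by the Ajax URL.
function extractHtml(jsonText) {
  return JSON.parse(jsonText).html;
}

// In the browser, fetch the endpoint and inject the fragment where needed.
// URL and selector are placeholders for your own page.
// fetch('/somepage/AjaxFunction.aspx')
//   .then((res) => res.text())
//   .then((text) => {
//     document.querySelector('#target').innerHTML = extractHtml(text);
//   });
```

Because the response is application/json rather than text/html, Google treats it as a page resource instead of an indexable page.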

That way, the URL will not appear in search results, and at the same time Google will still read it and understand that it is needed to render the page.



Author: Strontium_99

Updated on September 18, 2022

Comments

  • Strontium_99 over 1 year

    We are getting a number of odd results come up on Google EG:

    http://www.somedomain.com/somepage/AjaxFunction.aspx?stuff=XXX&other=XXX 
    

    When I looked at "somepage", the AJAX function is not mentioned anywhere in the HTML, which makes me assume Google is spidering the external JavaScript files and finding this AjaxFunction.aspx call.

    My question is: a) Is this possible? b) If so, how can I stop it?

  • A Biron about 4 years
    Thanks for sharing this Gari, I was about to try this exact solution but wasn't sure if it would affect the parent page. You've just confirmed it for me and saved me a lot of time and headache testing this solution. :)
  • Stephen Ostermiller over 3 years
    I've seen Google index weird content types before. Sometimes Google indexes the contents of robots.txt or sitemap.xml, for example. I'm not sure this would totally prevent indexing, but then again I've never run across JSON data in any search result I've seen.
  • the_nuts over 3 years
    I think that's because .txt (and .xml in some situations, in the XHTML era) could be files meant to be read by a human, but JSON files are like .js and .css, which are never shown in search results as far as I know.