Google spreadsheet ImportXML Error:"the XPath query did not return any data"

13,335

Solution 1

As a matter of fact, maybe it is a bug on the new google sheets or they have changed how the function works. I've activated mine and when I try to use the ImportXML it simply wont work. Since I have some old sheets here (on the old mechanism) they still work normally. If I copy and paste the script from the old to the new one it simply doesn't get any data.

Here a example:

=ImportXML("http://www.nytimes.com/pages/todayspaper/index.html";"//div[@class='columnGroup first']//h3")

If I run this on the old mechanism it works fine, but if I run the same on the new mechanism, first it will exchange my ";" for a "," and then it will bring a "#N/A" with a warning of "Error: Imported XML content cannot be parsed".

Edit (05/05/2015): I am happy to say that I tested this function again today on the new spreadsheets and they've fixed it. I was checking that every two months and now finally they have solved this issue. The example I've added above is now returning information.

Solution 2

I'm sorry, but you won't be able to easily parse Google result pages. The reason your function throws an error is because the content of the page you see in your browser is generated by javascript, and Google spreadsheet doesn't execute js.

Your ImportXML has the right syntax, it doesn't return anything because the node you're looking for isn't there (importXML Parse Error).

You will have to find another source if you want these result in your spreadsheet. For info some libraries already parse the usual result page (http://www.seerinteractive.com/blog/google-scraper-in-google-docs-update for example, if it still works), but I doubt finding one for your special case will be easy.

Solution 3

This gives the answer (importXML Parse Error), but it's not entirely obvious.

ImportXML doesn't load Javascript. When you're building ImportXML queries on Google results, make sure you're testing against a version of the page that has Javascript turned off. You can do this using the Chrome DevTools.

(But I agree that ImportXML is fickle, idiosyncratic, and generally rage-inducing).

Share:
13,335
Andrew Strathclyde
Author by

Andrew Strathclyde

Updated on November 25, 2022

Comments

  • Andrew Strathclyde
    Andrew Strathclyde over 1 year

    I continue to get this error when I try to run this XPath query

    //div[@iti='0']
    

    on this link (flight search from google)

    https://www.google.com/flights/#search;f=LGW;t=JFK;d=2014-05-22;r=2014-05-26
    

    I get something like this:

    =ImportXML("https://www.google.fr/flights/#search;f=jfk;t=lgw;d=2014-02-22;r=2014-02-26";"//div[@iti='0']")
    

    I verified and the XPath is correct (I get the answer wanted using XPath helper, the answer wanted are the data relative to the first flight selected).

    I guess that it is a problem of syntax, but I tried more or less all the combinations of lower/uppercase, punctuation (replacing ; , ' ") and I tried to link the URI and the XPath query stored in cells, but nothing works.

    Any help will be appreciated.

    • Robin
      Robin about 10 years
      Google uses a lot of javascript basically everywhere, and it can obviously modify significantly the base HTML. What are you using to retrieve the page? Did you check the HTML is as expected in whatever context you are trying to apply xpath from? And, btw, what is this context (language/tool...)?
    • Andrew Strathclyde
      Andrew Strathclyde about 10 years
      well, I tried it with several searches, the -iti='0'- is the best way to identify this item. Anyway i tried also with a similar query from kayak and I get the same problem. I really think the issue is in the syntax of ImportXML. Regarding the context, i'm using Google Spreadsheets and I want to find the best combination of destination/location for a trip.
    • Robin
      Robin about 10 years
      What I mean is, are you sure the HTML you receive in the spreadsheet is the same as the one you used to choose iti='0' for criteria. The output of chrome, after js execution and iframe retrieving, might be different from what the spreadsheet receives. Did you check the full page is as expected after imported in the spreadsheet? The syntax of ImportXML seems functional on other examples.
    • Andrew Strathclyde
      Andrew Strathclyde about 10 years
      ok, I understand what you mean, I'm exploring the xml seen from the spreadsheet, it's actually quite different, let's see if i find something.
    • Robin
      Robin about 10 years
      Yep I looked into it a little bit myself and it seems that the body of your webpage is almost empty, everything is generated through javascript in your browser (BTW if you're checking in chrome, F12 displays the interpreted HTML and Ctrl+U the raw code). Google doesn't like being parsed. Let me know if you find anything else but this probably is your problem.
    • Rafael Merlin
      Rafael Merlin almost 9 years
      @AndrewStrathclyde try it again now. I've checked this on May 5th and apparently they've fixed the ImportXML, at least the ones I was using on the old google sheets are now working on the new google sheets!