ExcelVBA - HttpReq via MSXML2.XMLHTTP - fetch page after loading page

12,475

finally you gave me the final link to solve my problem. I baked the code into my DownloadFile Function to stay with the IE Object until the end and then close it.

One Error i found is was that the readystate should be polled before anything is done with the HTMLObject.

Public Function DownloadFile(sSourceUrl As String, _
                             sLocalFile As String) As Boolean

Dim IE As InternetExplorer
Set IE = New InternetExplorer



Dim HtmlDoc As New MSHTML.HTMLDocument
Dim collTables As MSHTML.IHTMLElementCollection
Dim collSpans As MSHTML.IHTMLElementCollection
Dim objSpanElem As MSHTML.IHTMLSpanElement

Dim fnum As Integer

With IE
    'May changed to "false if you don't want to see browser window"
    .Visible = True   
    .Navigate (sSourceUrl)
    'this waits for the page to be loaded
     Do Until .readyState = 4: DoEvents: Loop 
End With

'Set HtmlDoc = wait_for_html(sSourceUrl, "text/css")
Do
    Set HtmlDoc = IE.Document

    'searching for the "Span" tag
    Set collSpans = HtmlDoc.getElementsByTagName("span") 

   'first Span element cotains...
    Set objSpanElem = collSpans(0) 

'... this if loading screen is display
Loop Until Not objSpanElem.innerHTML = "Your report is running." 

'just grab the tables and leave the rest    
Set collTables = HtmlDoc.getElementsByTagName("table") 

fnum = FreeFile()
Open sLocalFile For Output As fnum ' save the file and add html and body tags
Print #fnum, "<html>"
Print #fnum, "<body>"

Print #fnum, collTables(15).outerHTML 'title
Print #fnum, collTables(17).outerHTML 'Date
Print #fnum, collTables(18).outerHTML 'Part, Operation etc.
Print #fnum, collTables(19).outerHTML 'Measuerements

Print #fnum, "</body>"
Print #fnum, "</html>"

Close #fnum
IE.Quit 'close Explorer

DownloadFile = True

End Function
Share:
12,475
David Nottebohm-Knochenhauer
Author by

David Nottebohm-Knochenhauer

Updated on June 04, 2022

Comments

  • David Nottebohm-Knochenhauer
    David Nottebohm-Knochenhauer over 1 year

    i have a problem with fetching data from an internal web based Dataservice (cognos). Basically i put together a GET request like "blah.com/cognosapi.dll?product=xxx&date=yyy...", send it to the server and receive a webpage that i can store as HTML and parse into my excel form later.

    I build a VBA program which worked quite well in the past, but the webservice changed an now they are displaying a "your report is running" page in between that lasts from 1sec to 30sec. So when i call my function i always download this "your report is running" page insteat of the data. How can i catch the page that automatically loads up after the "report is running" page?

    This is the DownloadFile Function with the GETstring and the target path as parameters.

    Public Function DownloadFile(sSourceUrl As String, _
                                 sLocalFile As String) As Boolean
    
    
    Dim HttpReq As Object
    Set HttpReq = CreateObject("MSXML2.XMLHTTP")
    
    Dim HtmlDoc As New MSHTML.HTMLDocument
    
    
    HttpReq.Open "GET", sSourceUrl, False
    HttpReq.send
    
    
    If HttpReq.Status = 200 Then
        HttpReq.getAllResponseHeaders
        HtmlDoc.body.innerHTML = HttpReq.responseText
        Debug.Print HtmlDoc.body.innerHTML
    
    End If
    
      'Download the file. BINDF_GETNEWESTVERSION forces
      'the API to download from the specified source.
      'Passing 0& as dwReserved causes the locally-cached
      'copy to be downloaded, if available. If the API
      'returns ERROR_SUCCESS (0), DownloadFile returns True.
    
      DownloadFile = URLDownloadToFile(0&, _
                                        sSourceUrl, _
                                        sLocalFile, _
                                        BINDF_GETNEWESTVERSION, _
                                        0&) = ERROR_SUCCESS
    
    End Function
    

    Thanks David

  • David Nottebohm-Knochenhauer
    David Nottebohm-Knochenhauer over 10 years
    Point is that the page will jump to an file download after that. Is there a value i can check for?
  • wlgreg
    wlgreg over 10 years
    If I understood your original question, you are initially taken to one page (Page1), forced to wait, then redirected to a second page (Page2). In my function above, you identify some unique value in Page2 (I used the title tag text, which you would pass to the function as the strDesiredText argument), and the Do...Loop will keep running until the page containing that unique value has been loaded. The loop would then exit, and at that point, you could call your DownloadFile function (perhaps add it as the last line before 'End Function').
  • wlgreg
    wlgreg over 10 years
    Glad you got a working solution! Thanks for pointing out the error in my code.