C# WebBrowser control -- Get Document Elements After AJAX?


Solution 1

I solved this problem for my case.

The key is to attach a handler for the onpropertychange event of the div element that is populated by the AJAX call.

HtmlElement target = webBrowser.Document.GetElementById("div_populated_by_ajax");

if (target != null)
{
      target.AttachEventHandler("onpropertychange", handler);
}

And finally, the handler:

private void handler(Object sender, EventArgs e)
{
      HtmlElement div = webBrowser.Document.GetElementById("div_populated_by_ajax");
      if (div == null) return;
      String contentLoaded = div.InnerHtml; // get the content loaded via ajax
}
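
The two fragments above can be put together roughly as follows. This is a minimal sketch, not a definitive implementation: the class name AjaxWatcherForm, the LatestContent property, and the choice to attach the handler inside DocumentCompleted are assumptions added for illustration. Note also that onpropertychange is a legacy Internet Explorer event, so whether it fires may depend on the document mode the control renders in.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical form that hosts the browser and watches one element for changes.
class AjaxWatcherForm : Form
{
    private const string TargetId = "div_populated_by_ajax";
    private readonly WebBrowser webBrowser = new WebBrowser { Dock = DockStyle.Fill };

    // Last markup seen in the watched element (illustrative helper property).
    public string LatestContent { get; private set; }

    public AjaxWatcherForm(string url)
    {
        Controls.Add(webBrowser);
        webBrowser.DocumentCompleted += OnDocumentCompleted;
        webBrowser.Navigate(url);
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // DocumentCompleted can also fire for frames; only act on the top-level page.
        if (e.Url != webBrowser.Url) return;

        HtmlElement target = webBrowser.Document.GetElementById(TargetId);
        if (target != null)
        {
            target.AttachEventHandler("onpropertychange", OnTargetChanged);
        }
    }

    private void OnTargetChanged(object sender, EventArgs e)
    {
        // The event fires for any property change, so this may run several times.
        HtmlElement div = webBrowser.Document.GetElementById(TargetId);
        if (div == null) return;
        LatestContent = div.InnerHtml; // content loaded via AJAX
    }
}
```

The form could be started with Application.Run(new AjaxWatcherForm(someUrl)) from an [STAThread] Main method, where someUrl is whatever page is being inspected.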

Solution 2

using System;
using System.Windows.Forms;

namespace WebBrowserDemo
{
    class Program
    {
        public const string TestUrl = "http://www.w3schools.com/Ajax/tryit_view.asp?filename=tryajax_first";

        [STAThread]
        static void Main(string[] args)
        {
            WebBrowser wb = new WebBrowser();
            wb.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(wb_DocumentCompleted);
            wb.Navigate(TestUrl);

            while (wb.ReadyState != WebBrowserReadyState.Complete)
            {
                Application.DoEvents();
            }

            Console.WriteLine("\nPress any key to continue...");
            Console.ReadKey(true);
        }

        static void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            WebBrowser wb = (WebBrowser)sender;

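            HtmlElement document = wb.Document.GetElementsByTagName("html")[0];
            HtmlElement button = wb.Document.GetElementsByTagName("button")[0];

            // Dump the markup as initially loaded.
            Console.WriteLine(document.OuterHtml + "\n");

            // Simulate a click on the button that triggers the AJAX call.
            button.InvokeMember("Click");

            // Dump the markup again. If the request is asynchronous, the response
            // may not have arrived yet and this can still show the old markup.
            Console.WriteLine(document.OuterHtml);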
        }
    }
}
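
If the page issues its request asynchronously, reading the markup immediately after the click may return the old content. One possible workaround, sketched here under assumptions, is to keep pumping messages until the target element's markup changes or a timeout elapses. The AjaxWait class, its method names, and the targetId parameter are hypothetical and not part of the original answer; the id of the element that receives the AJAX response must be supplied by the caller.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Windows.Forms;

static class AjaxWait
{
    // Pumps the message loop until condition() returns true or the timeout elapses.
    // Returns true if the condition was met in time, false otherwise.
    public static bool PumpUntil(Func<bool> condition, TimeSpan timeout)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!condition())
        {
            if (clock.Elapsed > timeout) return false;
            Application.DoEvents(); // lets the control process pending messages
            Thread.Sleep(10);       // avoid spinning at full CPU
        }
        return true;
    }

    // Clicks the button, waits for the target element's markup to change,
    // and returns whatever markup the element holds afterwards (or null).
    public static string ClickAndRead(WebBrowser wb, HtmlElement button,
                                      string targetId, TimeSpan timeout)
    {
        HtmlElement target = wb.Document.GetElementById(targetId);
        string before = target == null ? null : target.InnerHtml;

        button.InvokeMember("Click");

        PumpUntil(() =>
        {
            HtmlElement now = wb.Document.GetElementById(targetId);
            return now != null && now.InnerHtml != before;
        }, timeout);

        HtmlElement after = wb.Document.GetElementById(targetId);
        return after == null ? null : after.InnerHtml;
    }
}
```

Like the loop in Main above, this relies on Application.DoEvents(), which is re-entrant; an event-driven or timer-based approach may be preferable in a real application.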

Solution 3

You will need to use the DOM for this. Cast WebBrowser.Document.DomDocument to IHTMLDocument? (one of the numbered MSHTML document interfaces, such as IHTMLDocument2). To do so, you will have to import some COM interfaces or reference the Microsoft.mshtml assembly.

Have a look at http://msdn.microsoft.com/en-us/library/aa752641(VS.85).aspx for more details.
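
As a rough illustration of the cast described above, the following sketch reads an element's current markup through IHTMLDocument3. It assumes the project references the Microsoft.mshtml interop assembly; the DomReader class and ReadLiveHtml method are hypothetical names chosen for this example.

```csharp
using System.Windows.Forms;
using mshtml; // assumes a reference to the Microsoft.mshtml interop assembly

static class DomReader
{
    // Returns the current markup of the element with the given id,
    // or null if the document or the element is not available.
    public static string ReadLiveHtml(WebBrowser browser, string elementId)
    {
        if (browser.Document == null) return null;

        // DomDocument exposes the underlying unmanaged document object.
        IHTMLDocument3 doc = browser.Document.DomDocument as IHTMLDocument3;
        if (doc == null) return null;

        IHTMLElement element = doc.getElementById(elementId);
        return element == null ? null : element.innerHTML;
    }
}
```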

Solution 4

I assume that, since you're reading content generated by Ajax requests, you require the user to progress the application to a point where the relevant data is loaded, at which point you run code to read the data.

If that's not the case, you'll need to automate this process, generating the click events which build out the DOM nodes you're interested in reading. I do this fairly often with the WebBrowser control and tend to write that layer of functionality in JavaScript and call it with .InvokeScript(). Another route would be to find the nodes which fire the Ajax functionality from C# and manually trigger their click events:

HtmlElement content = webMain.Document.GetElementById("content");
content.RaiseEvent("onclick");

An important aspect to note in the script above is that you can interact with DOM nodes natively in C# if you accept and work around the limitations of the HtmlElement object type.
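
For the JavaScript route mentioned earlier, a minimal sketch might look like the following. It passes a small script to the page's eval function through HtmlDocument.InvokeScript and returns the result. The ScriptBridge class and ReadInnerHtmlViaScript method are hypothetical names, and the sketch assumes the element id contains no quote characters.

```csharp
using System.Windows.Forms;

static class ScriptBridge
{
    // Asks the page's own script engine for an element's current innerHTML.
    // Returns null if the document is missing, the element does not exist,
    // or the script returns something other than a string.
    public static string ReadInnerHtmlViaScript(WebBrowser browser, string elementId)
    {
        if (browser.Document == null) return null;

        // Assumption: elementId contains no quote characters.
        string script =
            "(function(){var e=document.getElementById('" + elementId + "');" +
            "return e?e.innerHTML:null;})()";

        object result = browser.Document.InvokeScript("eval", new object[] { script });
        return result as string;
    }
}
```

Because the script runs inside the page, it sees the same live DOM that the page's own Ajax callbacks modify.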

Author: aikeru

Updated on April 14, 2020

Comments

  • aikeru about 4 years

    I'm writing an application that uses the WebBrowser control to view web content that can change with AJAX that adds new content/elements. I can't seem to get at the new elements any way I've tried. BrowserCtl.DocumentText doesn't have the up-to-date page and of course it's not in "view source" either.

    Is there some way to get this new data using this control? :( Please help. Thanks!

    For example:

    Browser.Navigate("www.somewebpagewithAJAX.com");
    //Code that waits for browser to finish...
    ...
    //WebBrowser control has loaded content and AJAX has loaded new content
    // (is visible at runtime on form) but can't see them in Browser.Document.All
    // or Browser.DocumentText :(
    
  • aikeru about 15 years
    Ouch! I'd like to avoid this if possible, I think. I'm fine working with HtmlElement.DomElement COM types, but would the IHTMLDocument have the now-changed elements post-JavaScript?
  • aikeru about 15 years
    Thanks for the informative post. Unfortunately, my problem is that once the JavaScript has run, there are new elements on the page that I need to interact with or check a value from ... the Document.x doesn't seem to have these new elements post-JavaScript :(
  • John Lewin about 15 years
    The .Document reference provides live access to the DOM, and elements created after the initial load are just as accessible as the original ones. Is there any chance that the generated elements exist in a frame? Can you share the page that presents the problem?
  • Hugo over 12 years
    Nice solution. I tried it out just for fun, and for simple things this looks pretty nice.
  • TheGateKeeper about 12 years
    I am trying to do this too. Did you ever find a solution?
  • Steve Radich-BitShop.com almost 12 years
    This continued rendering the old HTML. Tried in .NET 4.0.
  • Alok over 10 years
    Hi, I have been trying to do this with YouTube comments, but I can't figure out how this will work.
  • Chloraphil almost 10 years