Rendering HTML+JavaScript server-side

Solution 1

I found Awesomium. It does exactly what I need: a "windowless web-browser framework". Brilliant.

Solution 2

You can consider using WatiN. Generate your page, then use the WatiN API to capture the generated page.

http://fwdnug.com/blogs/ddodgen/archive/2008/06/19/watin-api-capturewebpagetofile.aspx

Author by

Dimitar Velitchkov

Updated on July 12, 2022

Comments

  • Dimitar Velitchkov
    Dimitar Velitchkov almost 2 years

    I need to render an HTML page server-side and "extract" the raw bytes of a canvas element so I can save it as a PNG. The problem is that the canvas element is created from JavaScript (basically, I'm using the jQuery plugin Flot to generate a chart). So I guess I need a way to "host" a browser's DOM+JavaScript functionality without actually using the browser. I settled on mshtml (but I'm open to any and all suggestions), as it seems that it should be able to do exactly that. This is an ASP.NET MVC project.

    I've searched far and wide and haven't seen anything conclusive.

    So I have this simple HTML (kept as minimal as possible to demonstrate the problem):

    <!DOCTYPE html>
    <html>
    <head>
        <title>Wow</title>
        <script src="http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.7.1.min.js" type="text/javascript"></script>
    </head>
    <body>
        <div id="hello">
        </div>
        <script type="text/javascript">
            function simple() 
            {
                $("#hello").append("<p>Hello</p>");
            }                    
        </script>
    </body>
    </html>
    

    which produces the expected output when run from a browser.

    I want to be able to load the original HTML into memory, execute the JavaScript function, and then manipulate the final DOM tree. I cannot use any System.Windows.WebBrowser-like class, as my code needs to run in a service environment.

    So here's my code:

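    using System.IO;
    using System.Net;
    using System.Threading;
    using mshtml;

    // Create an in-memory MSHTML document (no browser window).
    IHTMLDocument2 domRoot = (IHTMLDocument2)new HTMLDocument();

    // Download the page and write its markup into the document.
    using (WebClient wc = new WebClient())
    using (var stream = new StreamReader(wc.OpenRead((string)url)))
    {
        string html = stream.ReadToEnd();
        domRoot.write(html);
        domRoot.close();
    }

    // Wait until the document reports that it has finished loading.
    while (domRoot.readyState != "complete")
        Thread.Sleep(SleepTime);

    string beforeScript = domRoot.body.outerHTML;

    // Execute the script string "simple" in the document's window.
    IHTMLWindow2 parentWin = domRoot.parentWindow;
    parentWin.execScript("simple");

    while (domRoot.readyState != "complete")
        Thread.Sleep(SleepTime);

    string afterScript = domRoot.body.outerHTML;

    // Release the underlying COM object.
    System.Runtime.InteropServices.Marshal.FinalReleaseComObject(domRoot);
    domRoot = null;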
    

    The problem is, "beforeScript" and "afterScript" are exactly the same. The IHTMLDocument2 instance goes through the normal "uninitialized", "loading", "complete" cycle, no errors are thrown, nothing.

    Anybody have any ideas on what I'm doing wrong? Completely lost here.
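
One thing that may be worth checking, offered as a guess rather than a confirmed diagnosis: the string passed to execScript is "simple", which in JavaScript is only a reference to the function and does not call it. The sketch below illustrates that difference in plain JavaScript. The `page` object and the `runScript` helper are hypothetical stand-ins for the DOM and for a script host, not part of mshtml. Whether mshtml also loads the external jQuery file in this hosting mode is a separate question that this sketch does not address.

```javascript
// Hypothetical stand-in for the page: a tiny "DOM" plus the function from the question.
const page = { hello: [] };

function simple() {
  page.hello.push("<p>Hello</p>");
}

// Hypothetical helper that mimics handing a code string to a script engine.
function runScript(code) {
  return new Function("simple", code)(simple);
}

// A bare identifier is evaluated and discarded; nothing is invoked.
runScript("simple");
const afterBareName = page.hello.length;

// Adding parentheses turns the string into an actual call.
runScript("simple()");
const afterCall = page.hello.length;

console.log(afterBareName, afterCall);
```

If the same applies to the hosted document, passing "simple()" instead of "simple" would be the first change to try.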

  • Dimitar Velitchkov
    Dimitar Velitchkov over 12 years
    Thanks for taking the time to reply, but I disagree with what you said regarding "not intended". A browser's viewable area (the window) should be (and is) separate from its internal DOM/JavaScript engine. Why should it matter where the result is rendered? Why can't the DOM+CSS+JavaScript be rendered to any location in memory? In fact, there is an example of Node.js interacting with Flot to do exactly what I want to do: generate a chart server-side. Sadly, I can't use Node.js...
  • Dimitar Velitchkov
    Dimitar Velitchkov over 12 years
    Regarding what I need to do on the server: I need to periodically produce a chart (for now, using Flot) in a headless, non-interactive environment. The chart has to look exactly the same as it would if a user were viewing the web page in a browser. This is an ASP.NET MVC project, and I haven't found any server-side controls/components to do this easily.
  • Alexander Yezutov
    Alexander Yezutov over 12 years
    Yeah, basically you are right. It really does make sense to render DOM+CSS+JavaScript anywhere in memory. However, I would expect this to work with a browser, which cannot be loaded in a non-interactive session. Anyway, I wish you good luck if you proceed with this.
  • Dimitar Velitchkov
    Dimitar Velitchkov over 12 years
    I tried WatiN. It's nice, but it seems to be just a wrapper around IE/Firefox that actually launches them in a separate process with a window. So it's no good in a headless, non-interactive environment.