Get HTML source code from CefSharp web browser
Solution 1
I don't think I quite get this DispatcherTimer solution. I would do it like this:
public frmSelection()
{
    InitializeComponent();
    wb.FrameLoadEnd += WebBrowserFrameLoadEnded;
    wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";
}

private void WebBrowserFrameLoadEnded(object sender, FrameLoadEndEventArgs e)
{
    if (e.Frame.IsMain)
    {
        wb.ViewSource();
        wb.GetSourceAsync().ContinueWith(taskHtml =>
        {
            var html = taskHtml.Result;
        });
    }
}
I did a diff on the output of ViewSource and the text in the html variable and they are the same, so I can't reproduce your problem here.
That said, I noticed that the main frame finishes loading quite late, so you have to wait a while before Notepad pops up with the source.
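The same handler can also be written with async/await instead of ContinueWith. The following is only a sketch under the same assumptions as the code above (a ChromiumWebBrowser named wb declared in the XAML of a WPF UserControl). HtmlReady is a hypothetical placeholder, not part of CefSharp. As far as I know, FrameLoadEnd is not raised on the WPF UI thread, so the sketch goes through the Dispatcher before touching any controls.

```csharp
using System;
using System.Diagnostics;
using System.Windows.Controls;
using CefSharp;

public partial class frmSelection : UserControl
{
    public frmSelection()
    {
        InitializeComponent();
        wb.FrameLoadEnd += WebBrowserFrameLoadEnded;
        wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";
    }

    private async void WebBrowserFrameLoadEnded(object sender, FrameLoadEndEventArgs e)
    {
        // Child frames raise FrameLoadEnd as well; only react to the main frame.
        if (!e.Frame.IsMain)
            return;

        try
        {
            // Ask the frame that just finished loading for its source.
            string html = await e.Frame.GetSourceAsync();

            // Assumption: we are not on the WPF UI thread here, so marshal back
            // before doing anything that touches controls.
            await Dispatcher.InvokeAsync(() => HtmlReady(html));
        }
        catch (Exception ex)
        {
            // An async void handler must not let exceptions escape.
            Debug.WriteLine(ex);
        }
    }

    // Hypothetical placeholder: do whatever you need with the source here.
    private void HtmlReady(string html)
    {
        Debug.WriteLine("Received " + html.Length + " characters of HTML.");
    }
}
```

The try/catch is there because an unhandled exception in an async void method cannot be observed by the caller.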
Solution 2
I was having the same issue while trying to click on an item located in a child frame rather than in the main frame. Using the example in your answer, I wrote the following extension method:
public static IFrame GetFrame(this ChromiumWebBrowser browser, string FrameName)
{
    var identifiers = browser.GetBrowser().GetFrameIdentifiers();
    foreach (var i in identifiers)
    {
        // GetFrame can return null if the frame has gone away in the meantime.
        IFrame frame = browser.GetBrowser().GetFrame(i);
        if (frame != null && frame.Name == FrameName)
            return frame;
    }
    // No frame with that name was found.
    return null;
}
If your form has a "using" directive for the namespace that contains this method, you can do something like this (inside an async method):
var frame = browser.GetFrame("nameofframe");
if (frame != null)
{
    string HTML = await frame.GetSourceAsync();
}
Of course you need to make sure the page load is complete before using this, but I plan to use it a lot. Hope it helps!
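If you don't know the frame name in advance, the same calls can be used to collect the source of every frame. This is a rough sketch, not something taken from the CefSharp documentation: GetAllFrameSourcesAsync and FrameSourceExtensions are hypothetical names, and the code assumes the same CefSharp version as the extension method above (GetFrameIdentifiers and GetFrame have changed in later releases, as far as I know).

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using CefSharp;
using CefSharp.Wpf;

public static class FrameSourceExtensions
{
    // Hypothetical helper: returns (frame name, HTML) pairs for all frames.
    // The main frame is listed under the label "(main)".
    public static async Task<List<KeyValuePair<string, string>>> GetAllFrameSourcesAsync(
        this ChromiumWebBrowser browser)
    {
        var sources = new List<KeyValuePair<string, string>>();
        var cefBrowser = browser.GetBrowser();

        foreach (var id in cefBrowser.GetFrameIdentifiers())
        {
            var frame = cefBrowser.GetFrame(id);

            // Skip frames that disappeared between the two calls.
            if (frame == null || !frame.IsValid)
                continue;

            string label = frame.IsMain ? "(main)" : frame.Name;
            string html = await frame.GetSourceAsync();
            sources.Add(new KeyValuePair<string, string>(label, html));
        }

        return sources;
    }
}
```

It would be called the same way as GetFrame, for example `var sources = await browser.GetAllFrameSourcesAsync();`, and only after the page has finished loading. A list of pairs is used rather than a dictionary because frame names can be empty or repeated.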
Jim
Scott
Updated on July 09, 2022
Comments
-
Scott almost 2 years
I am using a CefSharp.Wpf.ChromiumWebBrowser (Version 47.0.3.0) to load a web page. At some point after the page has loaded I want to get the source code.
I have called:
wb.GetBrowser().MainFrame.GetSourceAsync()
however it does not appear to be returning all the source code (I believe this is because there are child frames).
If I call:
wb.GetBrowser().MainFrame.ViewSource()
I can see it lists all the source code (including the inner frames).
I would like to get the same result as ViewSource(). Could someone point me in the right direction please?
Update – Added Code example
Note: The address the web browser is pointing to will only work up to and including 10/03/2016. After that it may display different data, which is not what I would be looking at.
In the frmSelection.xaml file
<cefSharp:ChromiumWebBrowser Name="wb" Grid.Column="1" Grid.Row="0" />
In the frmSelection.xaml.cs file
public partial class frmSelection : UserControl
{
    private System.Windows.Threading.DispatcherTimer wbTimer = new System.Windows.Threading.DispatcherTimer();

    public frmSelection()
    {
        InitializeComponent();

        // This timer will start when a web page has been loaded.
        // It will wait 4 seconds and then call wbTimer_Tick which
        // will then see if data can be extracted from the web page.
        wbTimer.Interval = new TimeSpan(0, 0, 4);
        wbTimer.Tick += new EventHandler(wbTimer_Tick);

        wb.Address = "http://www.racingpost.com/horses2/cards/card.sd?race_id=644222&r_date=2016-03-10#raceTabs=sc_";
        wb.FrameLoadEnd += new EventHandler<CefSharp.FrameLoadEndEventArgs>(wb_FrameLoadEnd);
    }

    void wb_FrameLoadEnd(object sender, CefSharp.FrameLoadEndEventArgs e)
    {
        if (wbTimer.IsEnabled)
            wbTimer.Stop();

        wbTimer.Start();
    }

    void wbTimer_Tick(object sender, EventArgs e)
    {
        wbTimer.Stop();
        string html = GetHTMLFromWebBrowser();
    }

    private string GetHTMLFromWebBrowser()
    {
        // call the ViewSource method which will open up notepad and display the html.
        // this is just so I can compare it to the html returned in GetSourceAsync()
        // This is displaying all the html code (including child frames)
        wb.GetBrowser().MainFrame.ViewSource();

        // Get the html source code from the main Frame.
        // This is displaying only code in the main frame and not any child frames of it.
        Task<String> taskHtml = wb.GetBrowser().MainFrame.GetSourceAsync();
        string response = taskHtml.Result;
        return response;
    }
}
-
Szabolcs Dézsi about 8 years
Can you share some more code? I can't reproduce your problem; I get the same text with GetSourceAsync as with ViewSource. I tried it with Address set to http://stackoverflow.com (it has two frames, one iframe and the main frame).
-
Scott about 8 years
Thanks for taking a look. I have added example source to the original post.
-
Scott about 8 years
Thank you for the feedback on my code; I have since updated it to reflect your example. I have run the code on another computer since posting the example and I get the same results as you (both return the full source code). I can only conclude there is something weird going on with my machine, and I will consider reformatting it.