C# HtmlAgilityPack HtmlDocument() LoadHtml encoding


Solution 1

Use the DownloadData method of WebClient instead of DownloadString(), then decode the returned bytes as UTF-8 yourself:

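// Download the raw bytes so that WebClient does not choose an encoding for you.
WebClient client = new WebClient();
var data = client.DownloadData(url);
// Decode the bytes explicitly as UTF-8.
var html = Encoding.UTF8.GetString(data);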
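
To see why the explicit decoding step matters, here is a minimal sketch that needs no web server. The class and method names (EncodingDemo, DecodeUtf8, DecodeLatin1) are hypothetical and only for illustration, and ISO-8859-1 merely stands in for whatever non-UTF-8 encoding might otherwise be applied to the response.

```csharp
using System;
using System.Text;

public static class EncodingDemo
{
    // Decode raw bytes explicitly as UTF-8, as Solution 1 does after DownloadData.
    public static string DecodeUtf8(byte[] data)
    {
        return Encoding.UTF8.GetString(data);
    }

    // Decode the same bytes as ISO-8859-1 (assumed here as an example of a wrong encoding).
    public static string DecodeLatin1(byte[] data)
    {
        return Encoding.GetEncoding("ISO-8859-1").GetString(data);
    }
}
```

For example, the two bytes 0xC3 0xA9 are the UTF-8 form of "é". DecodeUtf8 returns that single character, while DecodeLatin1 returns the two characters "Ã©", which is the kind of garbling that shows up when a UTF-8 page is read with the wrong encoding.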

Solution 2

Wrap the downloaded bytes in a MemoryStream and pass the encoding to HtmlDocument.Load:

// Download the raw bytes and expose them as a stream.
WebClient client = new WebClient();
MemoryStream ms = new MemoryStream(client.DownloadData("http://localhost/rgm.php"));

HtmlDocument doc23 = new HtmlDocument();
doc23.Load(ms, Encoding.UTF8);

HtmlNode body23 = doc23.DocumentNode.SelectSingleNode("//body");
string content23 = body23.InnerHtml;
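
The stream-plus-encoding idea can be tried without HtmlAgilityPack or a server. The sketch below uses only the standard library. StreamDecodingSketch and ReadAllAsUtf8 are hypothetical names, and it is an assumption that Load(ms, Encoding.UTF8) behaves roughly like reading the stream through a StreamReader with that encoding.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamDecodingSketch
{
    // Hypothetical helper: reads the whole stream as UTF-8 text.
    public static string ReadAllAsUtf8(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The returned string could then be passed to LoadHtml, which parses text that has already been decoded.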
Author by

milesh

Updated on June 19, 2022

Comments

  • milesh, over 1 year ago
    Uri url = new Uri("http://localhost/rgm.php");
    WebClient client = new WebClient();
    string html = client.DownloadString(url);
    
    HtmlAgilityPack.HtmlDocument doc23 = new HtmlAgilityPack.HtmlDocument();
    doc23.LoadHtml(html);
    
    HtmlNode body23 = doc23.DocumentNode.SelectSingleNode("//body");
    
    string content23 = body23.InnerHtml;
    

    How can I force this to parse the web page with UTF-8 encoding?