Get a value of an attribute by XPath and HtmlAgilityPack
Solution 1
You can get it from the node's .Attributes collection:
var doc = new HtmlAgilityPack.HtmlDocument();
doc.Load("file.html");
var node = doc.DocumentNode.SelectNodes("//input")[0];
var val = node.Attributes["value"].Value; //10743
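The snippet above throws if the document has no input element or the element has no value attribute: in Html Agility Pack, SelectNodes returns null when nothing matches, and the Attributes indexer returns null for a missing attribute. Below is a minimal defensive sketch, assuming the markup from the question is loaded from a string. The AttributeReader and ReadInputValue names are illustrative and not part of the library.

```csharp
using System;
using HtmlAgilityPack;

public static class AttributeReader
{
    // Hypothetical helper: returns the "value" attribute of the first <input>,
    // an empty string if the attribute is missing, or null if no <input> exists.
    public static string ReadInputValue(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when the XPath matches nothing.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//input");
        if (node == null)
        {
            return null;
        }

        // GetAttributeValue falls back to the supplied default when the attribute is absent.
        return node.GetAttributeValue("value", string.Empty);
    }

    public static void Main()
    {
        // Markup taken from the question.
        string html = "<tbody><tr><td><input type=\"text\" name=\"item\" value=\"10743\" " +
                      "readonly=\"readonly\" size=\"10\"/></td></tr></tbody>";
        Console.WriteLine(ReadInputValue(html));
    }
}
```

Selecting the element with "//input" and then reading the attribute avoids relying on attribute-selecting XPath such as "//input/@value", which is what failed in the question.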
Solution 2
Update 2: Here is a code example showing how to get attribute values using Html Agility Pack:
http://htmlagilitypack.codeplex.com/wikipage?title=Examples
HtmlDocument doc = new HtmlDocument();
doc.Load("file.htm");
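// HtmlDocument exposes DocumentNode (not DocumentElement); the predicate needs its closing bracket.
foreach (HtmlNode link in doc.DocumentNode.SelectNodes("//a[@href]"))
{
    HtmlAttribute att = link.Attributes["href"];
    att.Value = FixLink(att);
}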
doc.Save("file.htm");
You will obviously need to adapt this code to your needs. For example, you will not modify the attributes but will just read att.Value.
Update: You may also look at this question:
Selecting attribute values with html Agility Pack
Your problem is most likely a default namespace problem. Search for "XPath default namespace c#" and you will find many good solutions (hint: use the overload of SelectNodes() that takes an XmlNamespaceManager argument).
The following code shows what one gets for an attribute in a document in "no namespace":
using System;
using System.IO;
using System.Xml;

public class Sample
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<input value='novel' ISBN='1-861001-57-5'>" +
                    "<title>Pride And Prejudice</title>" +
                    "</input>");
        XmlNode root = doc.DocumentElement;
        XmlNode value = doc.SelectNodes("//input/@value")[0];
        Console.WriteLine("Inner text: " + value.InnerText);
        Console.WriteLine("InnerXml: " + value.InnerXml);
        Console.WriteLine("OuterXml: " + value.OuterXml);
        Console.WriteLine("Value: " + value.Value);
    }
}
The result from running this app is:
Inner text: novel
InnerXml: novel
OuterXml: value="novel"
Value: novel
Now, for a document that is in a default namespace:
using System;
using System.IO;
using System.Xml;

public class Sample
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<input xmlns='some:Namespace' value='novel' ISBN='1-861001-57-5'>" +
                    "<title>Pride And Prejudice</title>" +
                    "</input>");
        XmlNode root = doc.DocumentElement;
        // Bind a prefix to the default namespace so the XPath can address the element.
        XmlNamespaceManager nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("x", "some:Namespace");
        XmlNode value = doc.SelectNodes("//x:input/@value", nsmgr)[0];
        Console.WriteLine("Inner text: " + value.InnerText);
        Console.WriteLine("InnerXml: " + value.InnerXml);
        Console.WriteLine("OuterXml: " + value.OuterXml);
        Console.WriteLine("Value: " + value.Value);
    }
}
Running this app again produces the desired results:
Inner text: novel
InnerXml: novel
OuterXml: value="novel"
Value: novel
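When registering a prefix is inconvenient, a common alternative is to match the element by its local name and ignore the namespace entirely. This is a minimal sketch of that variation, not part of the original answer. The LocalNameSample and ReadValue names are illustrative.

```csharp
using System;
using System.Xml;

public static class LocalNameSample
{
    // Hypothetical helper: reads the "value" attribute of the first element whose
    // local name is "input", whatever namespace it is in. Returns null if none matches.
    public static string ReadValue(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // local-name() compares the element name without its namespace,
        // so no XmlNamespaceManager is required.
        XmlNode value = doc.SelectSingleNode("//*[local-name()='input']/@value");
        return value == null ? null : value.Value;
    }

    public static void Main()
    {
        string xml = "<input xmlns='some:Namespace' value='novel' ISBN='1-861001-57-5'>" +
                     "<title>Pride And Prejudice</title>" +
                     "</input>";
        Console.WriteLine(ReadValue(xml));
    }
}
```

The trade-off is precision: this expression also matches input elements from any other namespace, so the XmlNamespaceManager overload remains the safer choice when several vocabularies are mixed in one document.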
Solution 3
You can also grab the attribute directly if you use the HtmlNodeNavigator.
// Load the document from some HTML string
HtmlDocument hdoc = new HtmlDocument();
hdoc.LoadHtml(htmlContent);
// Create a navigator for the current document
HtmlNodeNavigator navigator = (HtmlNodeNavigator)hdoc.CreateNavigator();
// Get the value with the given XPath
string xpath = "//input/@value";
string val = navigator.SelectSingleNode(xpath).Value;
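The same attribute-selecting XPath can also be evaluated against System.Xml.Linq.XDocument through the System.Xml.XPath extension methods. This is a minimal sketch and assumes the input is well-formed XML, which arbitrary HTML often is not. The XDocumentSample and ReadValue names are illustrative.

```csharp
using System;
using System.Collections;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XDocumentSample
{
    // Hypothetical helper: returns the first attribute value selected by the
    // XPath "//input/@value", or null when nothing matches.
    public static string ReadValue(string xml)
    {
        XDocument doc = XDocument.Parse(xml);

        // For a node-set expression, XPathEvaluate returns an enumerable of the
        // matched nodes; here each match is an XAttribute.
        var matches = (IEnumerable)doc.XPathEvaluate("//input/@value");
        XAttribute attribute = matches.Cast<XAttribute>().FirstOrDefault();
        return attribute == null ? null : attribute.Value;
    }

    public static void Main()
    {
        string xml = "<tbody><tr><td><input type='text' name='item' value='10743' " +
                     "readonly='readonly' size='10'/></td></tr></tbody>";
        Console.WriteLine(ReadValue(xml));
    }
}
```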
Comments
-
Chani Poz over 3 years
I have an HTML document and I parse it with XPath. I want to get the value of the input element, but it doesn't work.
My Html:
<tbody> <tr> <td> <input type="text" name="item" value="10743" readonly="readonly" size="10"/> </td> </tr> </tbody>
My code:
using HtmlAgilityPack;
HtmlAgilityPack.HtmlDocument doc;
HtmlWeb hw = new HtmlWeb();
HtmlNodeCollection node = doc.DocumentNode.SelectNodes("//input/@value");
string s = node[0].InnerText;
So I want to get the value "10743" (and I don't mind getting other tags along with the answer).
-
Chani Poz over 12 years
Thanks, but that is not the problem. My document is HTML, and my other XPath expressions work fine; only this one does not, because it is not the right XPath for what I intend. I need to find a different XPath, but I have no idea which.
-
Chani Poz over 12 years
Wasn't I clear? Anyway, I added all my code and wrote what I want: the string "10743" (the value of the input node).
-
Dimitre Novatchev over 12 years
@Chanipoz: Have a look at my second update: a code sample showing exactly how to obtain the value of an attribute using Html Agility Pack, something you can easily adapt to your needs.
-
Robert Synoradzki almost 5 years
THIS is the answer! System.Xml.Linq.XDocument works the same and allows for very legible XPath one-liners. Thank you, @Pierluc!