Force XDocument to write to String with UTF-8 encoding
Solution 1
Try this:
    using System;
    using System.IO;
    using System.Text;
    using System.Xml.Linq;

    class Test
    {
        static void Main()
        {
            XDocument doc = XDocument.Load("test.xml",
                                           LoadOptions.PreserveWhitespace);
            doc.Declaration = new XDeclaration("1.0", "utf-8", null);
            StringWriter writer = new Utf8StringWriter();
            doc.Save(writer, SaveOptions.None);
            Console.WriteLine(writer);
        }

        private class Utf8StringWriter : StringWriter
        {
            public override Encoding Encoding { get { return Encoding.UTF8; } }
        }
    }
Of course, you haven't shown us how you're building the document, which makes it hard to test... I've just tried with a hand-constructed XDocument and that contains the relevant whitespace too.
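A check with a hand-constructed document could look like the minimal sketch below. The class name `HandBuiltExample`, the helper `ToUtf8String`, and the sample `root`/`node` elements are illustrative assumptions, not part of the original answer; only the `Utf8StringWriter` idea comes from the code above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class HandBuiltExample
{
    // Reports UTF-8 so the XML declaration names utf-8 instead of utf-16.
    private sealed class Utf8StringWriter : StringWriter
    {
        public override Encoding Encoding => Encoding.UTF8;
    }

    // Hypothetical helper: serializes the document to a string.
    // SaveOptions.None leaves the default indentation switched on.
    public static string ToUtf8String(XDocument doc)
    {
        using (var writer = new Utf8StringWriter())
        {
            doc.Save(writer, SaveOptions.None);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Document built in code rather than loaded from a file.
        var doc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("root", new XElement("node", "blah")));

        Console.WriteLine(ToUtf8String(doc));
    }
}
```

With this approach the declaration should name utf-8 and the child element should appear indented on its own line, since the encoding attribute is taken from the writer's `Encoding` property.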
Solution 2
Try XmlWriterSettings:

    XmlWriterSettings xws = new XmlWriterSettings();
    xws.OmitXmlDeclaration = false;
    xws.Indent = true;

And pass it on like:

    using (XmlWriter xw = XmlWriter.Create(sb, xws))
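One caveat, assuming `sb` in the snippet is a `StringBuilder` (the answer does not say): `XmlWriter.Create(StringBuilder, ...)` writes through a plain `StringWriter`, so the declaration will name utf-16. A minimal sketch that keeps these settings but targets a UTF-8-reporting `StringWriter` subclass follows; the names `SettingsExample` and `ToIndentedUtf8String` are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class SettingsExample
{
    // Same trick as in Solution 1: advertise UTF-8 to the XmlWriter.
    private sealed class Utf8StringWriter : StringWriter
    {
        public override Encoding Encoding => Encoding.UTF8;
    }

    // Hypothetical helper combining XmlWriterSettings with the UTF-8 writer.
    public static string ToIndentedUtf8String(XDocument doc)
    {
        var xws = new XmlWriterSettings
        {
            OmitXmlDeclaration = false,
            Indent = true
        };

        using (var sw = new Utf8StringWriter())
        {
            // For a TextWriter target, the declared encoding comes from
            // sw.Encoding, not from xws.Encoding.
            using (XmlWriter xw = XmlWriter.Create(sw, xws))
            {
                doc.Save(xw);
            } // disposing the XmlWriter flushes it into sw

            return sw.ToString();
        }
    }

    public static void Main()
    {
        var doc = new XDocument(
            new XDeclaration("1.0", "utf-8", "yes"),
            new XElement("root", new XElement("node", "blah")));

        Console.WriteLine(ToIndentedUtf8String(doc));
    }
}
```

The explicit settings object is mainly useful when you want to change other options, such as the indent characters, since `XDocument.Save(TextWriter)` already indents by default.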
Chris
Updated on September 05, 2020
Comments
-
Chris over 3 years
I want to be able to write XML to a String with the declaration and with UTF-8 encoding. This seems mighty tricky to accomplish.
I have read around a bit and tried some of the popular answers for this, but they all have issues. My current code correctly outputs as UTF-8 but does not maintain the original formatting of the XDocument (i.e. indents / whitespace)!
Can anyone offer some advice please?
    XDocument xml = new XDocument(new XDeclaration("1.0", "utf-8", "yes"), xelementXML);
    MemoryStream ms = new MemoryStream();
    using (XmlWriter xw = new XmlTextWriter(ms, Encoding.UTF8))
    {
        xml.Save(xw);
        xw.Flush();
        StreamReader sr = new StreamReader(ms);
        ms.Seek(0, SeekOrigin.Begin);
        String xmlString = sr.ReadToEnd();
    }
The XML requires the formatting to be identical to the way .ToString() would format it, i.e.

    <?xml version="1.0" encoding="utf-8" standalone="yes"?>
    <root>
      <node>blah</node>
    </root>
What I'm currently seeing is
<?xml version="1.0" encoding="utf-8" standalone="yes"?><root><node>blah</node></root>
Update: I have managed to get this to work by adding XmlTextWriter settings... It seems VERY clunky though!

    MemoryStream ms = new MemoryStream();
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Encoding = Encoding.UTF8;
    settings.ConformanceLevel = ConformanceLevel.Document;
    settings.Indent = true;
    using (XmlWriter xw = XmlTextWriter.Create(ms, settings))
    {
        xml.Save(xw);
        xw.Flush();
        StreamReader sr = new StreamReader(ms);
        ms.Seek(0, SeekOrigin.Begin);
        String blah = sr.ReadToEnd();
    }
-
Chris over 13 years
Works a treat, thanks. Is there no way to get the encoding sorted without inheriting from StringWriter?
-
Jon Skeet over 13 years
@Chris: It's possible that there is some way of getting the TextWriter overload to ignore the encoding that the TextWriter advertises, but I've found this to be a really simple hack to get the job done. (You only need it in one place...)
-
Chris over 13 years
Yeah I like it, it's FAR better than the method I came up with. Thanks