How do I get Nokogiri to add the right XML encoding?
Solution 1
Are you using Nokogiri XML Builder? You can pass an encoding option to the new() method:
new(options = {})
Create a new Builder object. options are sent to the top level Document that is being built.
Building a document with a particular encoding for example:
Nokogiri::XML::Builder.new(:encoding => 'UTF-8') do |xml|
...
end
Also this page says you can do the following (when not using Builder):
doc = Nokogiri.XML('<foo><bar /></foo>', nil, 'EUC-JP')
Presumably you could change 'EUC-JP' to 'UTF-8'.
Solution 2
When parsing the doc you can set the encoding like this:
doc = Nokogiri::XML::Document.parse(xml_input, nil, "UTF-8")
For me that returns
<?xml version="1.0" encoding="UTF-8"?>
Luc
Updated on June 14, 2022
Comments
-
Luc about 2 years
I have created an XML doc with Nokogiri:
Nokogiri::XML::Document
The header of my file is
<?xml version="1.0"?>
but I'd expect to have
<?xml version="1.0" encoding="UTF-8"?>
Is there any option I could use so the encoding appears?
-
Luc over 13 years
In fact, I do not parse an existing file but create a new one using Nokogiri::XML::Document.new
-
LarsH over 2 years
It's funny that this has been one of my most highly upvoted answers. I have never used Nokogiri or Ruby, just XML and Google search.