Ruby: How to generate CSV files that have Excel-friendly encoding
Solution 1
The top voted answer from @joaofraga worked for me, but I found an alternative solution that also worked - no UTF-8 to ISO-8859-1 transcoding required.
From what I've read, Excel can indeed handle UTF-8, but for some reason it doesn't recognize it by default. If you add a BOM (byte order mark) to the beginning of the CSV data, Excel seems to realise that the file is UTF-8.
So, if you have a CSV like so:
csv_string = CSV.generate(csv_config) do |csv|
  csv << ["Text a", "Text b", "Text æ", "Text ø", "Text å"]
end
just prepend a BOM (the character U+FEFF, which is three bytes in UTF-8) like so:
"\uFEFF" + csv_string
In my case, my controller is sending the CSV as a file, so this is what my controller looks like:
def show
  respond_to do |format|
    format.csv do
      # add BOM to force Excel to realise this file is encoded in UTF-8, so it respects special characters
      send_data "\uFEFF" + csv_string, type: :csv, filename: "csv.csv"
    end
  end
end
I should note that UTF-8 itself does not require or recommend a BOM at all, but as I mentioned, adding it in this case seemed to nudge Excel into realising that the file was indeed UTF-8.
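As a minimal sketch of the idea above, the BOM can be added through a small helper that avoids prepending it twice. The helper name with_bom and the constant BOM are hypothetical, introduced only for illustration; the sketch assumes the source file is UTF-8 (the default since Ruby 2.0).

```ruby
require "csv"

BOM = "\uFEFF" # U+FEFF; encodes to the three bytes EF BB BF in UTF-8

# Hypothetical helper: prepend the BOM unless the string already starts with it.
def with_bom(csv_string)
  csv_string.start_with?(BOM) ? csv_string : BOM + csv_string
end

csv_string = CSV.generate(col_sep: ";") do |csv|
  csv << ["Text a", "Text æ", "Text ø"]
end

excel_csv = with_bom(csv_string)

# The first three bytes of the result are the UTF-8 encoding of the BOM.
first_bytes = excel_csv.bytes.first(3)
```

Calling with_bom on a string that already carries the BOM returns it unchanged, so the helper is safe to apply at the last step before send_data.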
Solution 2
Excel understands a UTF-8 CSV if it starts with a BOM. That can be done as follows:
Use CSV.generate
# the argument to CSV.generate is the initial string that rows are appended to
csv_string = CSV.generate("\uFEFF") do |csv|
  csv << ["Text a", "Text b", "Text æ", "Text ø", "Text å"]
end
Use CSV.open
filename = "/tmp/example.csv"
# Default output encoding is UTF-8
CSV.open(filename, "w") do |csv|
  csv.to_io.write "\uFEFF" # use CSV#to_io to write BOM directly
  csv << ["Text a", "Text b", "Text æ", "Text ø", "Text å"]
end
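If CSV#to_io is not available on the underlying object, one possible variation is to write the BOM through a plain File handle and then wrap that handle with CSV.new. This is only a sketch: the temporary path and the rows variable are made up for the example, and the read-back step just checks that the BOM is present on disk and is stripped again when the file is opened with "bom|utf-8".

```ruby
require "csv"
require "tmpdir"

rows = [["Text a", "Text æ"], ["Text ø", "Text å"]]
path = File.join(Dir.mktmpdir, "example.csv") # throwaway location for this sketch

# Write the BOM through the File handle, then let CSV append rows to the same handle.
File.open(path, "w:UTF-8") do |file|
  file.write("\uFEFF")
  csv = CSV.new(file)
  rows.each { |row| csv << row }
end

# The file on disk now starts with the UTF-8 BOM bytes EF BB BF.
bom_bytes = File.binread(path, 3).bytes

# Reading with "bom|utf-8" strips the BOM so it does not leak into the first cell.
round_trip = CSV.parse(File.read(path, mode: "r:bom|utf-8"))
```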
Solution 3
You should switch the encoding to ISO-8859-1, as follows:
CSV.generate(encoding: 'ISO-8859-1') { |csv| csv << ["Text á", "Text é", "Text æ"] }
For your context, you can do this:
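config = {
  col_sep: ';',
  row_sep: "\n", # rows end with a line break; ';' here would merge every row into one line
  encoding: 'ISO-8859-1'
}
# the double splat passes the hash as keyword options, which Ruby 3 requires
CSV.generate(**config) { |csv| csv << ["Text á", "Text é", "Text æ"] }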
I had the same issue and that encoding fixed it.
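One caveat worth illustrating: ISO-8859-1 only covers Western European characters, so this approach loses anything outside that range. The sketch below uses String#encode directly (not the CSV library) to show what transcoding does; the variable names are chosen for the example only.

```ruby
# é exists in ISO-8859-1 and becomes the single byte 0xE9.
latin1 = "Text é".encode("ISO-8859-1")
last_byte = latin1.bytes.last

# Japanese characters have no ISO-8859-1 equivalent, so a strict conversion raises.
begin
  "Text 日本語".encode("ISO-8859-1")
  unsupported = false
rescue Encoding::UndefinedConversionError
  unsupported = true
end

# A lossy conversion is possible, but every unsupported character is replaced.
lossy = "Text 日本語".encode("ISO-8859-1", undef: :replace, replace: "?")
```

If the data can contain characters outside Latin-1, the UTF-8 with BOM approach avoids this loss.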
Cjoerg
Updated on June 11, 2022
Comments
-
Cjoerg almost 2 years
I am generating CSV files that need to be opened and reviewed in Excel once they have been generated. It seems that Excel requires a different encoding than UTF-8.
Here is my config and generation code:
csv_config = { col_sep: ";", row_sep: "\n", encoding: Encoding::UTF_8 }
csv_string = CSV.generate(csv_config) do |csv|
  csv << ["Text a", "Text b", "Text æ", "Text ø", "Text å"]
end
When opening this in Excel, the special characters are not being displayed properly:
Text a Text b Text æ Text ø Text å
Any idea how to ensure proper encoding?
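The garbled output shown above is consistent with UTF-8 bytes being read as a single-byte Western encoding. A small sketch reproduces it, assuming (this is an assumption, not something stated in the question) that Excel fell back to Windows-1252:

```ruby
utf8 = "Text æ"

# Reinterpret the UTF-8 bytes as Windows-1252, then convert back to UTF-8 for display.
# æ is the two bytes C3 A6 in UTF-8, which Windows-1252 reads as two separate characters.
misread = utf8.dup.force_encoding("Windows-1252").encode("UTF-8")
```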
-
Amadan almost 9 years
Try putting # encoding: UTF-8 as your Ruby file's first line (second if you have a hash-bang line, #!/usr/bin/env ruby). I believe you are writing in UTF-8, but the Ruby source file is taken to be encoded as US-ASCII. (With Ruby 2.0+, source encoding defaults to UTF-8.)
-
Cjoerg almost 9 years
I am using ruby 2.0.0p247 (2013-06-27 revision 41674) [x86_64-darwin12.4.0], so I suppose that means that my installation is already defaulting to UTF-8.
-
Axel Richter almost 9 years
No experience with Ruby, but Excel can open semicolon-delimited CSV files that are UTF-8 encoded, provided the file has a BOM at its beginning. Whether the semicolon can be used as the delimiter is locale dependent, so the most locale-independent approach is a tab-delimited file encoded as UTF-16LE.
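A rough Ruby sketch of that suggestion, assuming the goal is a tab-delimited file in UTF-16LE with a leading BOM (the variable names are illustrative, and the result would be written to disk with File.binwrite so no further transcoding happens):

```ruby
require "csv"

# Build the rows with a tab as the column separator.
tsv = CSV.generate(col_sep: "\t") do |csv|
  csv << ["Text a", "Text æ", "Text ø"]
end

# Prepend U+FEFF, then transcode everything to UTF-16LE.
# In UTF-16LE the BOM becomes the two bytes FF FE.
utf16 = ("\uFEFF" + tsv).encode("UTF-16LE")
leading_bytes = utf16.bytes.first(2)
```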
-
Fred over 8 years
What Excel are you using? I had no trouble getting the special characters to display in Excel 2013.
-
knut about 8 years
Another hint: with the axlsx gem it is easy to create xlsx files directly.
-
deepmotion over 6 years
The answer above worked for me, but only after I removed the col_sep and row_sep arguments. Just the encoding: 'ISO-8859-1' was all I needed. For context, the specific issue I was having was é characters appearing as Ã©
-
joaofraga over 6 years
Good catch Greg, I will update the example without the context.
-
Yunnosch about 6 years
Would you like to augment your code-only answer with some explanation?
-
hajpoj over 5 years
Thanks, the CSV.open solution worked perfectly for me! Slightly cleaner solution than the other answers.
-
Samuel Heredia over 5 years
#<NoMethodError: undefined method `to_io' for #<StringIO:0x00000001d0c540> :(
-
mArtinko5MB over 4 years
Don't be confused like I was: older Excel does not work (2010 and above)
-
Tallboy over 4 years
Thanks for also saying how with CSV.open
-
Marcelo Pereira Rodrigues almost 4 years
This works for me even when I open it in Excel Online, thanks!
-
quetzalcoatl almost 4 years
Yeah, sure, 8859-1 is good for everything, yay. Try writing some Japanese or Arabic characters in a CSV file like that. OP asked specifically for UTF-8, so why advise him to go decades back in time?
-
quetzalcoatl almost 4 years
OP asked for UTF-8 specifically, not 8859-1 aka ANSI
-
Maxim over 2 years