How to save a text file (CSV) with UTF-8 without BOM encoding in VBA (Excel)?
Solution 1
So I came in the situation where I needed this code again, and I read the comments and Leonard's answer, which made me update my code together with better descriptions.
This code will convert your Excel sheet and save it as a CSV file with the UTF-8 without BOM encoding. I found this code on a website, so I will not take credit for it. CSV without BOM link
Option Explicit
Sub CSVFileAsUTF8WithoutBOM()
Dim SrcRange As Range
Dim CurrRow As Range
Dim CurrCell As Range
Dim CurrTextStr As String
Dim ListSep As String
Dim FName As Variant
Dim UTFStream As Object
Dim BinaryStream As Object
' ADO Constants
Const adTypeBinary = 1 ' The stream contains binary data
Const adTypeText = 2 ' The stream contains text data (default)
Const adWriteLine = 1 ' write text string and a line separator (as defined by the LineSeparator property) to the stream.
Const adModeReadWrite = 3 ' Read/write
Const adLF = 10 ' Line feed only - default is carriage return line feed (adCRLF)
Const adSaveCreateOverWrite = 2 ' Overwrites the file with the data from the currently open Stream object, if the file already exists
' Open this workbook location
ChDrive Left(ThisWorkbook.Path, 1)
ChDir ThisWorkbook.Path
' ask for file name and path
FName = Application.GetSaveAsFilename("", "CSV File (*.csv), *.csv")
' prepare UTF-8 stream
Set UTFStream = CreateObject("adodb.stream")
UTFStream.Type = adTypeText
UTFStream.Mode = adModeReadWrite
UTFStream.Charset = "UTF-8"
UTFStream.LineSeparator = adLF
UTFStream.Open
'set field separator
ListSep = ";"
'set source range with data for csv file
If Selection.Cells.Count > 1 Then
Set SrcRange = Selection
Else
Set SrcRange = ActiveSheet.UsedRange
End If
For Each CurrRow In SrcRange.Rows
'enclose each value with quotation marks and escape quotation marks in values
CurrTextStr = ""
For Each CurrCell In CurrRow.Cells
CurrTextStr = CurrTextStr & """" & Replace(CurrCell.Value, """", """""") & """" & ListSep
Next
'remove ListSep after the last value in line
While Right(CurrTextStr, 1) = ListSep
CurrTextStr = Left(CurrTextStr, Len(CurrTextStr) - 1)
Wend
'add line to UTFStream
UTFStream.WriteText CurrTextStr, adWriteLine ' Writes character data to a text Stream object
Next
'skip BOM
UTFStream.Position = 3 ' sets or returns a long value that indicates the current position (in bytes) from the beginning of a Stream object
'copy UTFStream to BinaryStream
Set BinaryStream = CreateObject("adodb.stream")
BinaryStream.Type = adTypeBinary
BinaryStream.Mode = adModeReadWrite
BinaryStream.Open ' Opens a Stream object
'Strips BOM (first 3 bytes)
UTFStream.CopyTo BinaryStream ' Copies a specified number of characters/bytes from one Stream object into another Stream object
UTFStream.Flush ' Sends the contents of the Stream buffer to the associated underlying object
UTFStream.Close ' Closes a Stream object
'save to file
BinaryStream.SaveToFile FName, adSaveCreateOverWrite
BinaryStream.Flush ' Sends the contents of the Stream buffer to the associated underlying object
BinaryStream.Close ' Closes a Stream object
End Sub
Solution 2
thank you for posting this question and also the solution. It helped me a lot. Yes, I also found that SaveAs does not save the CSV file in UTF8. In my case it uses shift-JIS. The adodb.stream worked well for me.
However, I am not sure why but I had to declare some constants (enum) you used in the code. (I am really new to VBA so maybe I missed something about why this happens). I added this in the beginning of the function, then it worked perfectly:
Const adTypeText = 2
Const adModeReadWrite = 3
Const adTypeBinary = 1
Const adLF = 10
Const adSaveCreateOverWrite = 2
Const adWriteLine = 1
I got the value from Microsoft docs. Once again, thanks!
Niclas
Engineer working as a consultant. Love to learn and help. I find automation fascinating.
Updated on June 26, 2022Comments
-
Niclas almost 2 years
So this was my initial question. The answer to my question below, seems to be that the only solution to get UTF-8 (and UTF-8 without BOM) encoding, is to use the ADODB.Stream object.
The answer to my new question in the subject line is posted as a code.
I am sitting here and trying toSave
an Excel sheet as a.CSV
-file with a VBA macro.
However, I am wondering if it matters whether I useADODB
/ADODB.Stream
or just.SaveAs
Fileformat:=xlCSV
. I have tried to Google it, and it seems like I cannot find an answer to which method is the "best". I would need it to be comma delimited, UTF-8, and double quotations ("") as text-identifier.
Is it correct that when you useFileformat:=
, it is not possible toSaveAs
UTF-8, since thexlCSV
is not using that encoding? YES, that is correct.See my answer for the solution.
-
Balwinder Singh about 6 yearsThis isn't an answer to the question and should be deleted. If you are facing any related issue, then you can ask a new question while mentioning this particular question
-
Leonard AB about 6 years@BalwinderSingh thank you for the feedback. What I wrote should actually be a comment to the question (or to be precise, to the answer, as you may have checked, OP answered his own question by editing it). I managed to use his solution with the modification I wrote above. I thought it can help somebody else too. Problem is, I do not have enough reputation to comment. If you think it is better for me to copy OP's code and added my modification, kindly let me know.
-
Balwinder Singh about 6 yearsMy bad. As per the original edit, I thought you were facing some related issue. Now it makes sense after you edited it and it might help others who face similar issue. Cheers