How to save a Unicode character to a text file

10,202

Solution 1

I hope this will fit VBA for Word on Mac as well, but on Windows I have the CreateTextFile method of the FileSystemObject (see MSDN doc). There I can define to create a unicode text file.

  Set fsObject = CreateObject("Scripting.FileSystemObject")
  Set xmlFile = fsObject.CreateTextFile("path/filename.txt", True, True) 'the second "true" forces a unicode file.

  xmlFile.write "YourUnicodeTextHere"
  xmlFile.close

Solution 2

VBA can't code text in UTF-8 this way. Use ADODB - yes, for text, not for database.

'ensure reference is set to Microsoft ActiveX DataObjects library
'(the latest version of it) under "tools/references"
Sub AdoTest()
    Dim adoStream As ADODB.Stream
    
    Set adoStream = New ADODB.Stream
    
    'Unicode coding
    adoStream.Charset = "Unicode" 'or any string listed in registry HKEY_CLASSES_ROOT\MIME\Database\Charset
    
    'open sream
    adoStream.Open
    
    'write a text
    adoStream.WriteText "Text for testing: ěšč", StreamWriteEnum.stWriteLine
    
    'save to file
    adoStream.SaveToFile "D:\a\ado.txt"
    
    adoStream.Close
End Sub

Reading is simplier, see my answer here:

Unicode and UTF-8 with VBA

Edited: I've inserted complete example.

Edited 2: Added refernce to list of coding in the registry

Solution 3

The question is for VBA on Mac, and I'm afraid none of the answers work on a Mac.

The question is about Unicode which comes in many flavours. I'll address the UTF-16 aspect of it. UTF-8 follows a different path, but it isn't difficult too. AFAIU, your question is about UTF-16 string.

The code below has no error handling, I'll let you take care of that.

Function writeUnicodeTextToFile(filePathName As String, myText As String)

`Dim myFileNumber As Long, I As Long, byteArray() As Byte

myFileNumber = FreeFile()
Open filePathName For Binary As #myFileNumber

ReDim byteArray(1)

' Create a BOM for your Unicode flavour
' (CHOOSE! one of the two, programmatically, or hard-code it)
 ' => Little Endian
    byteArray(0) = 255: byteArray(1) = 254
' => Big Endian
    'byteArray(0) = 254: byteArray(1) = 255

' now write the two-byte BOM
Put myFileNumber, 1, byteArray

' redimension your byte array
' note it works even if you don't Redim (go figure) but it's more elegant
I = (LenB(myText) / 2) - 1
ReDim byteArray(I)

' populate the byte array...
byteArray = myText

' ... and write you text AFTER the BOM
Put myFileNumber, 3, byteArray
Close #myFileNumber
End Function

Solution 4

Here is a VBA routine that takes a string as input (your text), and fills an array of bytes. Then you write that array to disk in binary mode, making sure you start writing it after the first three bytes (BOM).

You'll need those Public variables: byteArray() As Byte, regexUTF8 As String

Sub testing()

' creating the BOM

Dim bom(2) As Byte, someFile As Long

bom(0) = 239: bom(1) = 187: bom(2) = 191

' Writing something as utf-8 UTF16toUTF8 "L'élève de l'école"

someFile = FreeFile() Open "MacDisk:test.txt" For Binary As #someFile ' first, the BOM Put #someFile, 1, bom ' then the utf-8 text Put #someFile, 4, byteArray1 Close #someFile End Sub

Sub UTF16toUTF8(theString As String)

' by Yves Champollion ' Transforms a VB/VBA string (they're all 16-bit) into a byteArray1, utf-8 compliant

    If isStringUTF8(theString) Then Exit Sub

    Dim iLoop As Long, i As Long, k As Long

    k = 0
    ReDim byteArray1(Len(theString) * 4)
    For iLoop = 1 To Len(theString)
        i = AscW(Mid$(theString, iLoop, 1))
        If i < 0 Then i = i + 65536
        If i > -1 And i < 128 Then
            byteArray1(k) = i
            k = k + 1
        ElseIf i >= 128 And i < 2048 Then
            byteArray1(k) = (i \ 64) Or 192
            byteArray1(k + 1) = (i And 63) Or 128
            k = k + 2
        ElseIf i >= 2048 And i < 65536 Then
            byteArray1(k) = (i \ 4096) Or 224
            byteArray1(k + 1) = ((i \ 64) And 63) Or 128
            byteArray1(k + 2) = (i And 63) Or 128
            k = k + 3
        Else
            byteArray1(k) = (i \ 262144) Or 240
            byteArray1(k + 1) = (((i \ 4096) And 63)) Or 128
            byteArray1(k + 2) = ((i \ 64) And 63) Or 128
            byteArray1(k + 3) = (i And 63) Or 128
            k = k + 4
        End If
    Next

    ReDim Preserve byteArray1(k - 1)

End Sub

Function isStringUTF8(theString As String) As Boolean

    Dim i As Integer, j As Integer, k As Integer

    ' Prime the regex argument
    If Len(regexUTF8) <> 66 Then
        regexUTF8 = "*[" + Space$(62) + "]*"
        For i = 192 To 253
            Mid(regexUTF8, i - 189, 1) = Chr(i)
        Next
    End If

    ' First quick check: any escaping characters?
    If Not theString Like regexUTF8 Then Exit Function

    'longer check: are escaping characters followed by UTF-8 sequences?
    For i = 1 To Len(theString) - 3
        If Asc(Mid(theString, i, 1)) > 192 Then
            k = Asc(Mid(theString, i, 1))
            If k > 193 And k < 220 Then
                If (Asc(Mid(theString, i + 1, 1)) And 128) Then
                    isStringUTF8 = True
                    Exit Function
                End If
            End If
            If k > 223 Then
                If (Asc(Mid(theString, i + 1, 1)) And 128) And (Asc(Mid(theString, i + 2, 1)) And 128) Then
                    isStringUTF8 = True
                    Exit Function
                End If
            End If
            j = j + 1
            If j > 100 Then Exit For
        End If
    Next
End Function
Share:
10,202
Codename K
Author by

Codename K

Updated on June 05, 2022

Comments

  • Codename K
    Codename K almost 2 years

    This is in Word for MAC VBA. I want to save the Unicode character from a text box to text file. For example this character "⅛".

    I use this code.

    Dim N as Long
    N = FreeFile
    Dim strText as String
    strText = Textbox1.Text 'This is what is in the textbox "⅛"
    Open <file path> For Output As N 
         Print #N, strText
    Close N
    

    It does not save the Unicode character. I understand I have to change the text encoding format. How do I do that?

    Likewise, how to read the text file with the Unicode format?