Microsoft Excel UTF-8 encoding without importing

5,580

You can use a PowerShell script to open CSV files and automatically pass them to Excel. The script silently uses Excel's text import method which handles UTF-8 encoding and as a bonus treats values always as text

Put a shortcut to this script in your sendto folder so you can open all CSV files via Right click » SendTo » myScript (easy method)

(or)

You can use PS2EXE to convert your script to an executable (.exe). Now you can right click a .csv file and under 'Open with' you choose this .exe as your default program to open CSV files (advanced method)

What it does

  • UTF-8 characters are displayed correctly
  • Can open multiple CSV files. Stores each CSV content as new worksheet in the same workbook
  • All values are treated as pure text. No interpretation from Excel's buildin CSV handler

    • ie. 0001 stays 0001 and is not converted to 1
    • ie. -A122:23 or =AZ1+32 are not interpreted as formula and stay as they are

How to use

  • Create a new text file and paste the below script. A commented version can be found here
$CSVs = @()
$args.ForEach({
    If ((Test-Path $_) -and ($_ -Match ".csv$|.txt$")) {
        $CSVs += ,$_
    } 
})

if (-Not $null -eq $CSVs) {

    $excel = New-Object -ComObject excel.application 
    $excel.visible = $true
    $excel.SheetsInNewWorkbook = $CSVs.Count    
    $workbook = $excel.Workbooks.Add()

    for ($i=0; $i -lt $CSVs.Count; $i++){

        $csv = Get-Item $CSVs[$i]
        $worksheet = $workbook.worksheets.Item($i + 1)
        $worksheet.Name = $csv.basename

        $TxtConnector = ("TEXT;" + $csv.fullname)
        $Connector = $worksheet.QueryTables.add($TxtConnector,$worksheet.Range("A1"))
        $query = $worksheet.QueryTables.item($Connector.name)
        $query.TextFilePlatform = 65001
        $query.TextFileTextQualifier = 1
        $query.TextFileOtherDelimiter = $Excel.Application.International(5) 
        $query.TextFileParseType  = 1
        $arrFormats = ,2 * $worksheet.Cells.Columns.Count
        $query.TextFileColumnDataTypes = $arrFormats
        $query.AdjustColumnWidth = 1
        $query.Refresh()
        $query.Delete()
    }
}

  • Save it somewhere like C:\Tools\myScript.ps1. (Note the extension .ps1)
  • Open your sendto folder via WinR » shell:sendto » Enter
  • Create a new shortcut via Right click » New » Shortcut and paste this line. Don't forget to change the path to your own one where you've put your script

"%SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe " -NoProfile -NonInteractive -WindowStyle Hidden -File "C:\my\folder\myScript.ps1"

Comparison

  • New way: Opened via script (either through sendto or PS2EXE converted executable)

enter image description here

  • Old way: Opened via double click

enter image description here


Remarks

  • PS2EXE does not run on Windows 10 because the .NET version 4.5 or higher is not supported from PS2EXE. The program works on Windows 7 with .NET 4.0

  • In my first tests Excel did show gibberish when trying to display chinese UTF-8 characters. However during my tests this behavior switched and now both methods handle UTF-8 correctly. I have no idea what caused this

Share:
5,580

Related videos on Youtube

Nicolás Fernández Fioretti
Author by

Nicolás Fernández Fioretti

I like eating, sleeping and the KonMari Method.

Updated on September 18, 2022

Comments

  • Nicolás Fernández Fioretti
    Nicolás Fernández Fioretti over 1 year

    Before you tell me to search more and to look at other people's questions and their answers read my precise question.

    Most other topics here deal with how to make UTF-8 the default option within the text import wizard. However, I would like to avoid having to use the text import wizard at all.

    I have CSV files that I want to open by double clicking. Excel opens them automatically, but even if I did the regedit trick of using 65001 (UTF-8) as the default encoding, the files are still not correctly displayed.

    Is there some way to force this encoding in the opening of a file instead of having to import it?

    Thanks in advance

    (using Excel 2016 in US English and my Windows 7 regional configuration is set to Spanish (United States), I don't know if that is important, but it actually was for dealing with default commas as separators)

    EDIT for clarification: accents and characters from other languages still appear as weird codes when opening a CSV file after doing the regedit trick of forcing utf-8 for the import. I am not importing though (the trick works as expected there); I am opening the file directly.

  • Nicolás Fernández Fioretti
    Nicolás Fernández Fioretti almost 7 years
    Thanks @nixda This looks like an amazing. I can't make it work though. I tried the sendto script, and even though I followed the instructions, nothing happens. After I rightclick -> sendto -> script, nothing happens. What appears to be a cmd window appears very briefly and disappears, but excel does not run. I also tried the PS2EXE method, but when I run the script the message "THE POWERSHELL VERSION IS UNKNOWN", which, according to the ps2exe.ps1 file (opened in notepad) belongs to if( $psversion -eq 0 ) { write-host "THE POWERSHELL VERSION IS UNKNOWN!" exit -1 }
  • nixda
    nixda almost 7 years
    @NicolásFernándezFioretti Could it be that you haven't allowed Windows to execute Powershell script? Default is restricted. Check it in PoewerShell with Get-ExecutionPolicy. And if needed start PS as administrator and set it to set-executionpolicy remotesigned or Set-ExecutionPolicy unrestricted
  • Nicolás Fernández Fioretti
    Nicolás Fernández Fioretti almost 7 years
    I am using a corporate computer, so I may have special restrictions i am not aware of. I can run powershell as administrator, and get-executionpolicy returns unrestricted, but set-executionpolicy says I do not have access to modifiyng the registry. I am suspecting that even if I am right-clicking powershell and selecting 'run as administrator' I am running it normally, since there is no administrator indication on the title bar of the powershell window.
  • nixda
    nixda almost 7 years
    @NicolásFernándezFioretti To see where the problem is ,please try the following: Open Powershell ISE (that's the build in editor) and paste the above script into it. Now add as first line this $args = "C:\my\test.csv". Change the path to an existing test CSV on your drive. Does it open Excel and import the CSV?
  • Nicolás Fernández Fioretti
    Nicolás Fernández Fioretti almost 7 years
    It works! pasting the script in powershell and using the path to the file opened it with the correct treatment for utf8. I also tried again the "send to" method, and it worked well. I seem to have trouble crating an .exe with PS2EXE. Should I ask for help elsewhere?
  • nixda
    nixda almost 7 years
    @NicolásFernándezFioretti It could be that all executables created via PS2EXE can only be run on machines where the .NET version is not higher than 4.0. So the error is maybe not on your side
  • Nicolás Fernández Fioretti
    Nicolás Fernández Fioretti almost 7 years
    Well. Too bad. Just out of curiosity: why is it so hard to do this with the best and most popular spreadsheet editor in the world? LibreOffice handles CSV with UTF8 perfectly fine and has no trouble with list-separating semicolons or date formatting. Is this not a commonly requested feature?
  • nixda
    nixda almost 7 years
    @NicolásFernándezFioretti Yes, it is a commly requested feature - for many years now. Have a look at this old, but popular question
  • xtian
    xtian over 3 years
    +1, Not my issue; kudos for such a comprehensive answer.
  • ba_ul
    ba_ul about 2 years
    I'm pleasantly surprised, almost shocked, to see it work like a charm. Thank you for your incredible service to humankind.