Files with non-ASCII characters in file name in a Windows batch file

8,640

Your main problem is the font (see https://stackoverflow.com/questions/9321419/unicode-utf-8-text-file-gibberish-on-windows-console-trying-to-display-hebrew ). With the correct font you won't get question marks, so set the command prompt's font to Courier New. Then you'll be able to type, display, and echo such characters.

If you then find that some commands have issues, try chcp 65001. (In answer to your question: rest assured that chcp 65001 only affects that cmd prompt window.) You need chcp 65001 for redirection to work with characters beyond U+007F; for example, a command like dir >asdf, which writes a file containing those characters, needs chcp 65001. Your ren command, however, works fine without 65001.
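As a rough illustration of why redirected output can end up with question marks, here is a minimal Python sketch. It assumes cp850 as a stand-in for the OEM code page of a Western Windows machine, and it uses Python's codecs only to approximate the conversion; it is not a reproduction of what cmd does internally. The helper name to_code_page is hypothetical.

```python
# Sketch: emulate converting a Unicode file name to a console code page.
# Assumption: cp850 stands in for the OEM code page of a Western Windows
# machine; cmd's real conversion is only approximated by Python's codecs.


def to_code_page(name: str, code_page: str) -> bytes:
    """Encode name, substituting '?' for characters the code page lacks."""
    return name.encode(code_page, errors="replace")


if __name__ == "__main__":
    name = "файл.txt"
    # cp850 has no Cyrillic letters, so each one is replaced with '?'.
    print(to_code_page(name, "cp850"))
    # UTF-8 (what chcp 65001 selects) can represent every character.
    print(to_code_page(name, "utf-8"))
```

The first call yields the same ????.txt pattern the question reports for dir > log.txt, while the UTF-8 bytes decode back to the original name.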

Note: the OP points out a correction to this. His font was fine, but he needed chcp 65001.

Another case where one needs chcp 65001 is when the batch file itself is saved as UTF-8. Otherwise, even a batch file containing just letters such as привет will have them converted into question marks when it is executed.

The OP also points out a great workaround for the problem that Notepad saves UTF-8 with a BOM, whereas chcp 65001 expects UTF-8 without a BOM. If a batch file encoded as UTF-8 with a BOM contains just, e.g., dir or echo привет, it will not work even if cmd is set to code page 65001, because cmd mixes the BOM into the first line. The workaround is to put the command(s) starting from the second line. (Alternatively, one could use a text editor that saves UTF-8 without a BOM.)
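If no BOM-free editor is at hand, the BOM can also be removed after the fact. The following is a minimal Python sketch of that idea, not something from the original answer; the function names strip_utf8_bom and remove_bom_from_file are hypothetical, and the file is handled as raw bytes so that nothing else in it is altered.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def remove_bom_from_file(path: str) -> bool:
    """Rewrite the file at path without its BOM; return True if it changed."""
    with open(path, "rb") as f:
        data = f.read()
    stripped = strip_utf8_bom(data)
    if stripped == data:
        return False  # no BOM found, leave the file untouched
    with open(path, "wb") as f:
        f.write(stripped)
    return True
```

For example, remove_bom_from_file("mybat.bat") would rewrite the batch file from the question in place, leaving the rest of its bytes (including the UTF-8 encoded commands) unchanged.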

Author by

Alexander Gelbukh

Updated on September 18, 2022

Comments

  • Alexander Gelbukh
    Alexander Gelbukh over 1 year

    On a usual (Western) Windows computer, I have a file

    файл.txt
    

    with non-ASCII letters in the file name. How can I do the following from a .bat file?

    dir файл.txt
    ren файл.txt file.txt
    

    etc.?

    I tried placing the above commands into a file mybat.bat (using UTF-8 or UTF-16 encoding), but it does not work even if I run it as cmd /u /c mybat.bat.

    Note: the question is not how to put those letters in a batch file, but how to make the batch file do what is expected (in my example, to list the file and then rename it).

    Note: dir > log.txt command shows the file файл.txt as ????.txt. However, dir shows this file on the screen correctly as файл.txt.

    • miroxlav
      miroxlav about 6 years
      Use chcp 65001 before ren command.
    • Alexander Gelbukh
      Alexander Gelbukh about 6 years
      Do I understand correctly that this will change the codepage for the whole machine? Say, if in parallel there are other batch files executing, it will affect them?
    • barlop
      barlop about 6 years
      @AlexanderGelbukh the chcp command would only affect that command prompt window but that's not your main problem..
    • barlop
      barlop about 6 years
      @miroxlav ren does not require chcp 65001 it works fine without it. I just tested it. When things are really simple to test then please test things before telling people to do things. And if you haven't tested it then please state that you haven't tested it and you're not sure, otherwise you're giving misinformation.
    • Alexander Gelbukh
      Alexander Gelbukh about 6 years
      @barlop Thanks for the lesson. I did test it on the Spanish version of Windows 7. I saved the batch file in UTF-8 encoding, because the OEM encoding obviously cannot represent these letters. No, ren does not work. Probably your computer has an OEM encoding that includes Cyrillic; in that case you can test ren on, say, Chinese.
    • barlop
      barlop about 6 years
      @AlexanderGelbukh see pastebin.com/raw/k7SFwrCf ren works even when the codepage does not support the characters of the filename. Those characters in that link you can find in charmap \u05D0 hebrew letter aleph, and \u05D1 hebrew letter bet.
    • miroxlav
      miroxlav about 6 years
      @AlexanderGelbukh – if you answered the very first comment with "@miroxlav" I could see it and answer you, but without it I did not get notified to help you.
  • Alexander Gelbukh
    Alexander Gelbukh about 6 years
    Yes, chcp 65001 did the trick: ren worked. Yes, it does not affect other windows. No, ren does not work without chcp (try Chinese if your OEM is Cyrillic, etc.). For those who will use this answer: you can find your current code page by running chcp without parameters (to save it in a variable and restore it before exiting). You need a blank line at the start of a batch file in UTF-8, otherwise the BOM interferes with the first command in the file.
  • barlop
    barlop about 6 years
    @AlexanderGelbukh ren works for me without chcp 65001; see pastebin.com/raw/4Z218wST. Regarding the BOM, it's better not to have one in the file, but if you do have one, then whether there is a blank line at the top or not won't make much difference; see pastebin.com/raw/pPJhKY0r
  • barlop
    barlop about 6 years
    @AlexanderGelbukh If you want to save in UTF-8 without a BOM, you can use Notepad++ or Notepad2. Also, after chcp 65001, e.g. echo אאא> a.a will write those characters in UTF-8 without a BOM. If you download Vim 7, it comes with an xxd.exe command. The xxd command is amazing: pastebin.com/raw/VXCRmdhP (charmap shows UTF-16 codes; UTF-8 codes are different). xxd shows the hex.
  • Alexander Gelbukh
    Alexander Gelbukh about 6 years
    (1) "Works without chcp": Yes, works from command line, but the question was about batch files. Put your commands in a batch file, will it work? (2) Blank line: my comment was about batch file. Place echo привет as first line of the batch file and run it, it will not work. Place it as second line, and it will give you an error message about the BOM at the first line but will execute the second line normally. (3) Removing the BOM: right, easy, depending on your goals. I myself use Far Manager to remove the BOM: in the editor switch to a DOS codepage (F8) and remove those three letters.
  • Alexander Gelbukh
    Alexander Gelbukh about 6 years
    "Your main problem is font". Nope. As I said, the letters are shown OK on the screen, but placed as ?, chr(63), in a file. No font will show the ? differently :)
  • barlop
    barlop about 6 years
    @AlexanderGelbukh Oh, I see, thanks. For clarification, I'd note that ren was fine, though. Even if you had put those characters привет in a bat file with no command at all, and even on the second line, executing that bat file would display the characters wrongly if the file wasn't saved with an encoding that supports the characters and matches the encoding cmd had been set to. So you were giving ren question marks as input. xxd is also good because it isolates things well and lets you see whether the issue is in the hex of the file.