Even on Windows 7, can you do a "dir" and see filenames that have Unicode characters?
Solution 1
This is a very old question, but all of the answers given here are wrong.
You will never see Unicode output on the Windows command line (CMD.exe). The reason is that CMD cannot display Unicode. It can, however, display DBCS (Double-Byte Character Set).
If you want to see Japanese output, for example, you have to change your System Locale to Japanese and reboot. Then, you'll be able to see Japanese DBCS (i.e. Shift-JIS) characters on the command line. Windows supports Japanese Shift-JIS, Simplified Chinese, Korean, and Traditional Chinese "Big5" DBCS code pages.
Incidentally, you can pipe UTF-16 (inaccurately used interchangeably with "Unicode" by Microsoft) to a file, then open that file in, say, Notepad, and view the Unicode characters. You can also mark and copy the gibberish text from CMD.exe and paste it into Notepad and see the Unicode characters. In other words, CMD supports Unicode, but it doesn't display Unicode.
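The difference between storing the characters and showing them in a given code page can be sketched in Python. This is only an illustration, not how CMD.exe is implemented: the filename is made up, `cp437` is assumed as a stand-in for an English OEM console code page, and `cp932` is Python's name for the Windows Shift-JIS code page.

```python
# Hypothetical filename containing Japanese characters.
filename = "日本語.txt"

# UTF-16 LE is the encoding Microsoft tools usually call "Unicode".
utf16_bytes = filename.encode("utf-16-le")

# cp932 is the Windows Shift-JIS (Japanese DBCS) code page.
sjis_bytes = filename.encode("cp932")

# cp437 (assumed English OEM code page) has no Japanese characters,
# so every character it cannot represent is replaced with "?".
oem_text = filename.encode("cp437", errors="replace").decode("cp437")

print(len(utf16_bytes))  # two bytes per character for this name
print(len(sjis_bytes))   # two bytes per kanji, one per ASCII character
print(oem_text)          # the kanji come out as question marks
```

The UTF-16 and Shift-JIS bytes both round-trip to the original name; only the conversion to the code page that lacks the characters loses information.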
You can find more information in this blog post.
Solution 2
Based on your username I suspect you mainly work with asian languages.
Windows tools operate normally in unicode mode (as you saw by piping the output of dir into a file and opening that file with an editor):
1. the tool does its stuff
2. it outputs unicode characters
3. another program takes this output and has to display it.
to display any character on the screen, the program from step 3 has to look up the glyph appropriate for the given byte sequence. example:
0x61 'a' maps to a different glyph in each font (so the 'a' looks different from font to font)
0x3A9 'Ω' (greek 'omega') maps to a different glyph in each font as well
this mapping only works IF the font has a glyph for the given byte sequence. otherwise the visual result differs: sometimes you see '?', sometimes diamonds etc.
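A toy Python sketch of that lookup, assuming a font can be modelled as nothing more than the set of code points it has glyphs for. The two "fonts" and the `render` helper are hypothetical and exist only for this illustration; real font rendering is far more involved.

```python
OMEGA = "\N{GREEK CAPITAL LETTER OMEGA}"

# Code points of the two example characters.
print(hex(ord("a")))
print(hex(ord(OMEGA)))

# Hypothetical fonts, modelled as the set of code points each one covers.
LATIN_ONLY_FONT = {ord(c) for c in "abcdefghijklmnopqrstuvwxyz."}
LATIN_GREEK_FONT = LATIN_ONLY_FONT | {ord(OMEGA)}

def render(text, font, missing="\N{WHITE SQUARE}"):
    """Return text as it would appear if only glyphs in `font` can be drawn."""
    return "".join(ch if ord(ch) in font else missing for ch in text)

name = "a" + OMEGA + ".txt"
print(render(name, LATIN_ONLY_FONT))   # omega is shown as a placeholder box
print(render(name, LATIN_GREEK_FONT))  # omega is shown as itself
```

The string being rendered is identical in both calls; only the font's coverage decides whether the omega appears or a placeholder does.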
again: dir produces byte sequences, which sometimes are purely in the ASCII range, sometimes they are in the unicode range (depending on what filenames it finds). it sends these sequences to another program which is responsible for actually rendering the byte sequences. to be able to display these sequences, this program has to map the sequence to a glyph. to do that, it has to search in a font for the glyph. if the font does not have a glyph for the given sequence, then the program can not display the byte sequence produced by, for example, dir.
so, the solution to your problem (seeing any unicode-character in the 'console / terminal' of windows) is: use a font for the program which has (almost) every glyph for (almost) any given unicode bytesequence in it.
GeekAbhiGeek
Updated on September 17, 2022
Comments
-
GeekAbhiGeek over 1 year
This is somewhat related to question
On Windows 7, dir or tree can't show unicode characters, even starting cmd with cmd /U
Even on Windows 7, I found that the only way I can get unicode to go into a file is by
> cmd /U
> dir /B > files.txt
the file will be in "Unicode" when I open it in Notepad and try "Save As", and if I
dir /B > files.html
and open the HTML file in Firefox, it can show it using an encoding of UTF-16 (or UTF-16 LE). But if I want to see it on the screen instead of having it go to a file, it is still impossible. Is there a way to make it happen? Possibly somehow telling cmd not to show nonprintable characters as "?"
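For reference, a file produced this way can also be read back programmatically. The sketch below assumes the redirected output is UTF-16 LE, as the Firefox observation suggests; the `decode_cmd_u_output` helper and the sample filenames are made up for illustration, and the bytes are simulated rather than read from a real files.txt.

```python
import codecs

def decode_cmd_u_output(raw: bytes) -> str:
    """Decode bytes as UTF-16 LE, dropping a leading byte order mark if present."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        raw = raw[len(codecs.BOM_UTF16_LE):]
    return raw.decode("utf-16-le")

# Simulated file contents: two hypothetical filenames separated by CRLF.
sample = "résumé.txt\r\n日本語.txt\r\n".encode("utf-16-le")

for name in decode_cmd_u_output(sample).splitlines():
    print(name)
```

With a real file, the bytes would come from something like `open("files.txt", "rb").read()`.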
Update: I tried cmd.exe, cygwin's bash on windows, and PowerShell. They are the same. Except if I change the "Properties -> Font" to Consolas or Lucida Console, there is some improvement -- now it is not question mark but is either square border or square with a question mark in it.
The more expensive Mac computers with Mac OS X can do it. The free Ubuntu can do it too.
-
GeekAbhiGeek almost 14 years
hm, but the cmd, cygwin bash, and PowerShell all are limited to 3 fonts: Raster fonts, Lucida Console, and Consolas... actually Windows usually falls back to a unicode font when it can't display anything with the current font... also, if I redirect the output, like dir > file.txt, it is still a question mark in the file, even though it is a "square box" on the screen.
-
akira almost 14 years
@Jian Lin: yes, but that is essentially YOUR problem to provide a font which contains these glyphs. and even if windows falls back to "some" font which holds "some" unicode glyphs in it ... that is not enough to display some of your asian glyphs (you have problems with the asian glyphs, right?).
-
akira almost 14 years
according to some websites, "Ascender Uni Duo" seems to be the best font (even for "fixed") ascendercorp.de/fonts/multilingual/ascender-uni but maybe you find something better / cheaper en.wikipedia.org/wiki/Unicode_typefaces
-
GeekAbhiGeek almost 14 years
@akira there are many fonts on Windows 7 that can display the whole Unicode glyph set. But (1) the Cmd window won't let you choose any of them. (2) When Windows or the app falls back to a font that can display unicode, such as Lucida Sans Unicode, it can display almost any chinese character.
-
GeekAbhiGeek almost 14 years
Lucida Sans Unicode used to be much larger... now it is about 300kb on Windows 7. But anyway, even if you set any web browser to use this font or any other font such as Times New Roman, when you go to news.google.com/news?edchanged=1&ned=tw you can still see the chinese characters if you are using Vista or Win 7. Either the app, or more likely Windows, when it cannot find the glyph in the current font, will go find it in a font that has it.
-
GeekAbhiGeek almost 14 years
besides, when I redirect the output using CMD /U and then DIR /B > file.txt, I can see the correct glyph in Notepad automatically, even using a default English font. So unless Microsoft is saying, oh we just won't show unicode chars in Command Prompt, even though people still use it and it is part of Win 7, we will make it behave less well than Notepad. PowerShell too. Unicode? out of the question.
-
GeekAbhiGeek almost 14 years
hm... still won't work... cmd /U, chcp 65001, dir, and dir /B with the font already set to Lucida Console, still the same.
-
ta.speot.is almost 14 years
You may want to try adding more fonts to the console: support.microsoft.com/kb/247815 and blogs.msdn.com/b/oldnewthing/archive/2007/05/16/2659903.aspx (the latter for some discussion on the issue).
-
akira almost 14 years
as you said: cmd.exe only accepts fixed size fonts. it does not matter if you can see all the glyphs in your webbrowser, or in notepad, or in xyz. if the glyph is not in the font used by cmd.exe you can not see it, period. even if windows falls back to other (fixed size) fonts: if the glyph is not in there either, it can not be displayed. and that's why i said: find a fixed size font for cmd.exe which contains almost all glyphs (such as "ascender uni duo", so i was told)
-
akira almost 14 years
and no, you are not using an "english only" font in notepad. you are lucky that either the font itself or the fallback provides the glyphs the byte sequences require. anyhow, in notepad the default font is not the fixed size font.
-
akira almost 14 years
it all depends on the fonts you are giving the program to render the text. read the support article, good info in it.
-
GeekAbhiGeek almost 14 years
I think Mac OS X solved it by making the glyph 2 characters wide, so that no characters overlap. It works pretty well and at least people can see the unicode filenames. It is not trying to build a rocket here.
-
Philipp almost 14 years
It really has nothing to do with the operating system or encodings. The Windows console display simply uses just one font and doesn't look for alternatives if a glyph is missing. OTOH, the Windows text box (which Notepad uses) does look for alternative fonts.
-
Philipp almost 14 years
@taspeotis: The Windows console always uses Unicode internally, regardless of the codepage setting (which is obsolete anyway and only included for backwards compatibility). It is really just a font problem.
-
Philipp almost 14 years
@akira: Good answer, I'd just replace "byte sequences" by "16-bit strings" or "UTF-16 strings" since that is what Windows internally uses.
-
akira almost 14 years
@Philipp: i wanted to keep it more generic since the underlying mechanism is the same on every OS: byte sequence -> look up the glyphs in the font -> rendering.
-
akira almost 14 years
i've created a russian filename and cmd.exe displayed the glyphs correctly after switching to lucida. for asian fonts i think OP has to pick a "better" or more "unicode complete fixed font" (even if he does not like that answer :)).
-
GeekAbhiGeek almost 14 years
Can any of the included fonts on Win 7 be used? such as MingLiU, DFKai-SB
-
GeekAbhiGeek almost 14 years
There is no font I can change to and use, even though there are about 12 chinese fonts on Windows 7. The only font I can change to and use is Courier New, which is pure English. The font "Ascender Uni Duo" costs $149 and is almost as expensive as Windows 7 itself. And who knows whether it will work or not...
-
akira almost 14 years
i understand your pain. the last time i contacted ascender they were very friendly, just ask them if you can test the font.
-
Danzzz almost 14 years
ls in PowerShell is actually just an alias for "Get-ChildItem"
-
ta.speot.is almost 14 years
What's wrong with adding fonts? support.microsoft.com/kb/247815
-
GeekAbhiGeek almost 14 years
It is fine adding a font. But (1) I already have 7 or 8 default Chinese fonts on the system that should also have other unicode characters that I don't care about as much, but if you say add one more, sure, I can do that. (2) Which one to add? Is there any free one? Somebody suggested adding one that is $149.
-
GeekAbhiGeek over 13 years
and then? don't tell me you use Get-ChildItem on the command line every day instead of ls. For example, we usually drink water instead of dihydrogen monoxide.
-
Jeff over 9 years
@Philipp it's not just a font problem. The CMD window is an old-school DBCS program. The command line processor itself supports Unicode, but not the display portion. The only way to show Japanese, Chinese, Korean, and Trad. Chinese in the CMD window (or any old-school DBCS UI) is to change the System Locale.