UTF-8 characters are not displayed correctly in Debian
Solution 1
You've told bash and other applications that your terminal uses the UTF-8 encoding. That's good only if your terminal actually does use UTF-8. Bash doesn't get to decide what the terminal encoding is; the terminal does.
If you want to use UTF-8, configure your terminal to use UTF-8. Since you're using SSH, you need to configure whatever terminal you're running the SSH client in to use UTF-8. That's the default on most modern systems, but apparently yours isn't set up this way.
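One way to compare what the session claims with what the terminal actually does is sketched below. It assumes a POSIX shell and the `locale` utility; the function names `utf8_probe` and `check_utf8` are made up for this example.

```shell
# utf8_probe: emit the UTF-8 byte sequences for "ä €" followed by a newline.
# Octal escapes keep this file pure ASCII, so the encoding of the script
# itself cannot influence the result.
utf8_probe() {
    printf '\303\244 \342\202\254\n'
}

# check_utf8: show the encoding the locale claims, then send known-good
# UTF-8 bytes to the terminal so you can see how it renders them.
check_utf8() {
    printf 'locale charmap: %s\n' "$(locale charmap 2>/dev/null)"
    printf 'probe: '
    utf8_probe
}
```

Run `check_utf8` in the SSH session. If the probe line shows ä and €, the terminal is decoding UTF-8. If each symbol shows up as several garbage characters, the terminal is probably decoding a single-byte encoding such as ISO-8859-1, whatever the locale says.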
You should avoid setting LC_CTYPE explicitly in a terminal: ideally the terminal will set this. However, this doesn't always work, especially over SSH (on many systems, the SSH server forbids the client from setting LC_CTYPE).
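Whether the server accepts locale variables from the client is controlled by the `AcceptEnv` directive in the sshd configuration. A small sketch for checking it, assuming the usual path /etc/ssh/sshd_config; the helper name `show_accept_env` is hypothetical.

```shell
# show_accept_env: print the active (uncommented) AcceptEnv lines of an
# sshd configuration file.
# Usage: show_accept_env [file]   (defaults to /etc/ssh/sshd_config)
show_accept_env() {
    grep -i '^[[:space:]]*AcceptEnv' "${1:-/etc/ssh/sshd_config}"
}
```

If `show_accept_env` prints nothing on the server, or the printed patterns do not cover LC_CTYPE, the variable sent by the client is dropped. The client side also has to send it, which is what the `SendEnv` directive in the ssh client configuration is for.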
If you need to set the environment variable, the right place would be .profile, not .bashrc.
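A minimal sketch of such a .profile entry, assuming the en_US.UTF-8 locale from the question has been generated on the server. It only supplies a default and leaves any value that the terminal or SSH client already passed in untouched.

```shell
# ~/.profile fragment (read by login shells, including SSH logins).
# Assumption: en_US.UTF-8 exists on this system (see `locale -a`).
# Only set a default when nothing more specific has been provided.
if [ -z "$LC_ALL" ] && [ -z "$LC_CTYPE" ]; then
    LANG=en_US.UTF-8
    export LANG
fi
```

This only changes what applications assume about the terminal. It does not change the encoding the terminal itself uses.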
Solution 2
It sounds as if you are using the Linux console (rather than one of the X-based terminal emulators), and that it is not running in UTF-8 mode. I would use this script to turn it on (and investigate to see why it is off):
#!/bin/sh
# send character-string to enable UTF-8 mode
if test ".$1" = ".off" ; then
    printf '\033%%@'
else
    printf '\033%%G'
fi
That is, call the script utf8, and type
utf8 on
To investigate the error messages, I made a script like this, in two flavors (one in UTF-8, and the other in ISO-8859-1):
#!/bin/bash
printf "ä\n"
echo "ä"
ä
The UTF-8 script says
$ ./foo
ä
ä
./foo: line 4: ä: command not found
and the ISO-8859-1 script says (in a terminal using a locale with UTF-8 encoding):
$ ./foo2
�
�
./foo2: line 5: $'\344': command not found
The point is that bash adjusts its error message to correspond to the locale, and seeing that it cannot show the ISO-8859-1 character in the UTF-8 locale, shows it as an octal number.
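To see which bytes are really involved, it can help to dump them. A small sketch, assuming `od` and `tr` are available; `hexbytes` is a hypothetical helper name.

```shell
# hexbytes: print standard input as one continuous string of hex byte values.
hexbytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# ä is two bytes in UTF-8 but a single byte in ISO-8859-1.
printf '\303\244' | hexbytes   # prints c3a4
printf '\344' | hexbytes       # prints e4
```

The octal escape \344 in the error message is the byte e4, the ISO-8859-1 encoding of ä. A terminal sending UTF-8 would produce c3 a4 instead, which bash would report as \303\244 if it had to escape it.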
Steffen
Updated on September 18, 2022
Comments
-
Steffen over 1 year
Short description of my problem:
I ran into an issue lately where I am unable to make bash/nano/irssi/etc. display "special" UTF-8 characters like the German umlauts (äüö), the euro sign (€) and some other UTF-8 characters like ß, §, etc.
What I already tried:
- dpkg-reconfigure locales and only generated en_US.UTF-8
- setting LC_ALL, LANG and LANGUAGE to en_US.UTF-8 within the .bashrc for both my user and root
- re-installed locales and libx11-data (which seems to contain all the language data)
Of course I re-logged in via ssh after all these changes and even tried restarting the server even though I know it doesn't solve any problem in Linux in 99,9875% of all cases.
Information on my system:
OS: Debian stretch -> Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.63-2 x86_64 GNU/Linux
locales: v.2.22-7
Output of locale:
LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=en_US.UTF-8
When I type, for example, ä into the console and press Enter, I get
-bash: $'\344': command not found
Honestly, I am out of ideas. Can anyone help me out with this?
-
Marius almost 8 years
stretch is Debian/testing, which has bash 4.3-14+b1, and that does not open any interesting files as seen with strace.
-
Steffen almost 8 years
So this is possibly a bug in bash itself then? I have to admit, to my shame, that it didn't occur to me to check it with strace. EDIT: I tested it on another machine with stretch, which seems to have the very same problem (bash 4.3-14+b1).
-
Marius almost 8 years
It behaves as you show in an older version of bash (I've Debian 7 running), and was probably introduced as a feature enhancement rather than bug-fix. I used strace to check if bash is reading some relevant locale files, but found no sign of that.
-
Steffen almost 8 years
I just realized that it can't be a bug in bash itself, since every other application I tested behaves the very same way (nano, irssi, dpkg-reconfigure [the UTF-8 blocks are just some garbage characters here]), so it needs to be some systemwide "thing" (bug/setting/whatever).
-
Marius almost 8 years
Well... the $'\344' hints that it may not be UTF-8. In Debian 7, the message shows $'\303\244'. If I change the input character to Latin-1 ä, I get the same message that you are seeing. Perhaps whatever "console" you are using is set to non-UTF-8 mode, but the locale still uses UTF-8.
-
thenakulchawla about 7 years
I am struggling with almost the same issue, and none of the answers below seem to be working for me. What solution did you use?
-
Ken Sharp over 6 years
Did you ever solve this?
-
Steffen almost 8 years
Hello, first I'd like to thank you very much for your investigations. I followed your steps exactly and all it responds is $'\344': command not found. I even created the script on another machine and transferred it afterwards to make sure its encoding is set properly. Of course I executed utf8 on first, but this does not seem to do anything other than print a capital G. I've tried it with both SecureCRT and PuTTY as the client and made sure both use UTF-8 as encoding and "Xterm" as emulation. Additionally, I checked that the font has those UTF-8 characters.
-
Steffen almost 8 years
I've tried SecureCRT and PuTTY as SSH clients and ensured that both use UTF-8 as encoding and Xterm as emulation; the font has the necessary characters as well. Actually (as I mentioned in a comment above) I'm able to reproduce the very same behaviour on a machine running Debian Stretch, but not on a machine running Debian Wheezy or Debian Jessie, while using the exact same session options. So to me it seems something on the system side changed with the upgrade to Stretch, or am I interpreting that wrong?
-
Marius almost 8 years
Yes... it's not due to a difference in bash but rather in how you are entering the characters.
-
Steffen almost 8 years
I've entered the characters the exact same way.