Linux not interpreting UTF8 encoded characters
Solution 1
The problem here doesn't seem to be in your browser or your Apache configuration. You need to double-check the locale settings of your system.
You need to check whether the locale Apache is running under is UTF-8 enabled. To do so, you may run the command:
$ sudo su -l -c locale www-data
where www-data is the Apache user (on CentOS this is typically apache). If the locale returned doesn't look like, for example, es_ES.UTF-8, your locale doesn't have UTF-8 enabled.
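The check described above can be sketched as a small shell helper. This is only an illustration: is_utf8_locale is a hypothetical name, and it assumes you pass it the LANG value that the locale command printed for the Apache user.

```shell
#!/bin/sh
# is_utf8_locale is a hypothetical helper, not part of any system tool.
# It succeeds when the locale name ends in a UTF-8 codeset, optionally
# followed by an @modifier (for example ca_ES.UTF-8@valencia).
is_utf8_locale() {
    name=${1%%@*}            # strip any @modifier
    case "$name" in
        *.[Uu][Tt][Ff]-8 | *.[Uu][Tt][Ff]8) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: test the LANG value of the current shell.
if is_utf8_locale "${LANG:-}"; then
    echo "LANG=${LANG} is UTF-8 enabled"
else
    echo "LANG=${LANG:-unset} is not UTF-8 enabled"
fi
```

Both spellings of the codeset (UTF-8 and utf8) are accepted, since the two appear interchangeably in locale output.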
If this is the case, on a CentOS machine you can change this setting in /etc/sysconfig/i18n, changing the line LANG="es_ES" to LANG="es_ES.UTF-8". For this to work, though, your system still needs the locale file for this language. To check whether it exists, use locale -a to get a list of the available locales.
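Because locale -a usually lists the codeset as utf8 while configuration files often say UTF-8, a plain grep can miss a locale that is actually installed. A minimal sketch of a tolerant lookup follows; normalize_locale and locale_in_list are hypothetical helper names, and es_ES.UTF-8 is just the example locale from this answer.

```shell
#!/bin/sh
# Hypothetical helper: normalise a locale name so that es_ES.UTF-8 and
# es_ES.utf8 compare equal (lowercase everything, drop hyphens).
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -d '-'
}

# Hypothetical helper: succeeds if the locale named in $1 appears in the
# list of locale names read from stdin (one per line, as locale -a prints).
locale_in_list() {
    wanted=$(normalize_locale "$1")
    while IFS= read -r name; do
        [ "$(normalize_locale "$name")" = "$wanted" ] && return 0
    done
    return 1
}

# Example: look for es_ES.UTF-8 among the locales installed on this machine.
if locale -a 2>/dev/null | locale_in_list "es_ES.UTF-8"; then
    echo "es_ES.UTF-8 is available"
else
    echo "es_ES.UTF-8 is not available; it would need to be generated"
fi
```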
If your system doesn't have a UTF-8 enabled locale, you may generate one using the command:
$ sudo localedef -i es_ES -f UTF-8 es_ES.utf8
and set it as your default language.
Hope this helps!
Solution 2
In addition to fboaventura's answer:
Check the locale Apache is running under
$ sudo su -l -c locale www-data
To change the i18n configuration at /etc/sysconfig/i18n:
Go to the CentOS system configuration directory
$ cd /etc/sysconfig
Make a backup copy of your language settings file
$ cp i18n i18n.backup
Edit the language settings file using nano
$ nano i18n
Edit the file to include your configuration
For example:
LANG="en_US.utf8"
SYSFONT="latarcyrheb-sun16"
SUPPORTED="en_US.utf8:en_US:en:fr_FR.utf8:fr_FR:fr:es_ES.utf8:es_ES:es:de_DE.utf8:de_DE:de:sv_SE.utf8:sv_SE:sv:zh_CN.utf8:zh_CN:zh:zh_TW.utf8:zh_TW:zh:ja_JP.utf8:ja_JP:ja:ko_KR.utf8:ko_KR:ko"
Save the file and restart the system.
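Before restarting, it can help to confirm what the file now says. The following is a minimal sketch, assuming the KEY="value" layout shown above; read_lang is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the LANG value from a sysconfig-style file
# made of KEY="value" lines. Surrounding double quotes are optional.
# If LANG is set more than once, the last assignment is printed.
read_lang() {
    sed -n 's/^LANG="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$1" | tail -n 1
}

# Example: show the value configured on this machine, if the file exists.
if [ -r /etc/sysconfig/i18n ]; then
    echo "Configured LANG: $(read_lang /etc/sysconfig/i18n)"
fi
```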
Additional Resources
- https://unix.stackexchange.com/questions/74618/how-to-change-locale-environment-variable
- How to change my commandline locale after CentOS decided to change it?
w0rldart
Updated on September 18, 2022
Comments
-
w0rldart over 1 year
So, having the following file, Adán-y-Eva-50x50.jpg, when I try to access it, Apache translates it to Ad\xc3\xa1n-y-Eva-50x50.jpg and won't find it, even though it exists. This happens only for filenames that contain UTF-8 characters.
I already have the following configuration in my /etc/httpd/conf/httpd.conf:
...
AddDefaultCharset UTF-8
...
IndexOptions FancyIndexing VersionSort NameWidth=* HTMLTable +Charset=UTF-8
...
I also added this as the first line of my root .htaccess:
IndexOptions +Charset=UTF-8
None of this has had any effect on loading those files. Any suggestions?
UPDATE
Just to mention it: I'm running the websites on a CentOS server with the Plesk panel preconfigured.
-
Zubair over 11 years
The options you list hint at the character set of the page content, not of the URL, which is the problem you are describing. Any problem in encoding the URL may be due to the browser and not Apache. It worked fine for me with Apache 2.2.3 on CentOS 5 with LANG=en_US.UTF8
-
Andrew B over 11 years
I'm inclined to agree with mtinberg, but just in case, can you elaborate on what you mean by "apache translates it to"? Is this in the URL bar of the browser, or a log file? Have you tried grabbing the index page from the console (assuming you have a UTF-8 enabled LANG variable and terminal) with wget or curl to verify that that is indeed what the webserver itself is sending?
-
Rosty Koryaha over 11 years
What browser are you using, and what language / character set is the browser configured to use? See: Bug: Apache 2.0 Breaks Non-UTF-8 Encoded URLs on Windows
-
w0rldart over 11 years
Happens on Chrome, Firefox and Safari... it's not a browser issue, as the older server had no issue with the mentioned
-
Michal S over 7 years
For those unable to solve a similar problem with the answers below: check the normalization form (C or D) of your UTF-8 file names. For example, when you transfer files with UTF-8 names from Mac to Linux, the names may not be in the proper form for the new environment. This can be changed with convmv and its --nfc flag.
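To make the two normalization forms concrete, here is a small sketch that builds the name "Adán" both ways and prints the bytes. hex_bytes is a hypothetical helper; the only facts assumed are the standard UTF-8 encodings of U+00E1 (c3 a1) and of the combining acute accent U+0301 (cc 81). Note that \xc3\xa1 in the question is the form C encoding.

```shell
#!/bin/sh
# Form C (NFC): "á" is the single code point U+00E1, UTF-8 bytes c3 a1.
nfc=$(printf 'Ad\303\241n')
# Form D (NFD): "a" followed by the combining acute accent U+0301,
# UTF-8 bytes 61 cc 81.
nfd=$(printf 'Ada\314\201n')

# Hypothetical helper: print the bytes of a string as lowercase hex.
hex_bytes() {
    printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

echo "NFC bytes: $(hex_bytes "$nfc")"
echo "NFD bytes: $(hex_bytes "$nfd")"

# The two names look identical on screen but are different byte strings,
# so a request for one will not match a file stored under the other.
if [ "$nfc" != "$nfd" ]; then
    echo "The two forms do not compare equal"
fi
```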
-
-
w0rldart over 11 years
Tried su -l -c locale apache and su -l -c locale my-user, and neither returns the desired output. I also ran system-config-language and set it to Spanish utf8, rebooted, and got the same result... I also set i18n to es_ES.UTF-8
-
fboaventura over 11 years
In order to generate the locale file, and this will be done globally, you have to run locale-gen es_ES.utf8 as root.
-
w0rldart over 11 years
I'm running CentOS, so I don't have locale-gen. I have searched, and everything pointed to system-config-language
-
fboaventura over 11 years
I've changed the command to generate the locale file. After generating the file, you may change the i18n file inside sysconfig, restart Apache (or reboot your system) and test it out.
-
w0rldart over 11 years
I keep having the same issue... and su -l -c locale site-user still doesn't output anything
-
fboaventura over 11 years
Sorry mate! This is as far as I can go without seeing your system. I set up a CentOS machine yesterday just to test and validate the commands above.