File encoding not preserved after saving in Notepad++


Solution 1

This is expected behaviour.

A file can be saved with one encoding and then be detected as a different encoding when it is reopened in Notepad++. This is a technical limitation: the bytes written to disk can be identical under different encodings, so the file itself does not reveal which one was used. It is most noticeable when the file is saved without a BOM (Byte Order Mark) indicating the encoding used.

ANSI and UTF-8 share their first 128 characters (ASCII), making them indistinguishable if those are all you use. With a plain text file, there is no metadata indicating the encoding, so all Notepad++ (and other editors) can do is look at the characters/data in the file and take a guess.
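A small Python sketch of why ASCII-only content is indistinguishable. It assumes that "ANSI" means Windows-1252 (`cp1252`), which depends on the system code page; the sample strings are made up for illustration.

```python
# Assumption: "ANSI" is Windows-1252 (cp1252); the actual code page
# depends on the Windows locale.
ascii_text = "<?php echo 'Hello'; ?>"  # only characters below 128

ansi_bytes = ascii_text.encode("cp1252")
utf8_bytes = ascii_text.encode("utf-8")

# For ASCII-only text both encodings write exactly the same bytes,
# so nothing in the file tells an editor which one was chosen.
same_for_ascii = ansi_bytes == utf8_bytes

# As soon as a non-ASCII character appears, the bytes differ.
accented = "caf\u00e9"  # "café"
ansi_accented = accented.encode("cp1252")  # one byte for é
utf8_accented = accented.encode("utf-8")   # two bytes for é
differs_for_non_ascii = ansi_accented != utf8_accented
```

So a PHP script containing only ASCII will look the same on disk whether it was saved as ANSI or as UTF-8 without BOM.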

  • If the file has a BOM, NP++ detects it and knows about the encoding.
  • If the file is HTML or XML, the encoding is read from the first line of the file.
  • Otherwise, NP++ takes a guess between UCS-2 LE, UCS-2 BE and ANSI. It cannot reliably tell the difference between a file encoded in UTF-8 without BOM and a file in ANSI with plenty of high ASCII characters.
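The BOM check from the list above can be sketched roughly as follows. This is not Notepad++'s actual detection code; `detect_bom` is a hypothetical helper, "ANSI" is again assumed to be `cp1252`, and UTF-32 BOMs are ignored for brevity.

```python
import codecs

def detect_bom(data: bytes):
    """Return an encoding name if data starts with a known BOM, else None."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return None  # no BOM: the editor has to guess

text = "caf\u00e9"  # "café"
with_bom = codecs.BOM_UTF8 + text.encode("utf-8")
without_bom = text.encode("utf-8")

bom_result = detect_bom(with_bom)        # BOM found
no_bom_result = detect_bom(without_bom)  # nothing to go on

# Without a BOM the same bytes are valid in both encodings:
as_utf8 = without_bom.decode("utf-8")    # the intended text
as_ansi = without_bom.decode("cp1252")   # also decodes, but as mojibake
```

Both decodes succeed without an error, which is why an editor can only guess when there is no BOM.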

http://sourceforge.net/projects/notepad-plus/forums/forum/331754/topic/3822723


In theory, PRacicot's answer should open all ANSI/UTF-8 files (without a BOM) as UTF-8. This is also recommended in an answer to a similar Stack Overflow question. If this doesn't work for you, I'm not sure what can be done. For me, the encoding is indicated as "ANSI as UTF-8" in the status bar.

Solution 2

You may want to change this setting in your Notepad++ preferences.

Go to the menu Settings -> Preferences -> New Document/Default Directory. In the Encoding sub-section, check UTF-8 without BOM and check Apply to opened ANSI files.

When Apply to opened ANSI files is checked, this preference is also applied to documents currently open in Notepad++.

Since I don't have enough points yet to post an image, here is a link to postimage where I uploaded the screenshot: http://postimage.org/image/4qza0bkv9/

Good luck and happy programming.

Solution 3

You have to use the Convert to ... option instead of the Encode to ... option.
You may also want to change this option in the settings, so that all your new files are created with your chosen encoding.
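As a rough illustration of why the two menu options behave differently, here is a Python sketch. It rests on the assumption that Convert re-encodes the text into new bytes while Encode only reinterprets the existing bytes; it is not Notepad++'s code, and "ANSI" is assumed to be `cp1252`.

```python
# Assumption: the file on disk is ANSI (cp1252) and contains "café".
original_bytes = "caf\u00e9".encode("cp1252")

# "Convert to UTF-8" (assumed behaviour): decode with the old encoding,
# then re-encode with the new one. The bytes change, the text does not.
converted_bytes = original_bytes.decode("cp1252").encode("utf-8")

# "Encode in UTF-8" (assumed behaviour): keep the same bytes and merely
# read them as UTF-8. The lone 0xE9 byte is not valid UTF-8, so it is
# shown as the replacement character here.
reinterpreted_text = original_bytes.decode("utf-8", errors="replace")
```

Note that for a file containing only ASCII, both paths leave the bytes unchanged, which matches the behaviour described in Solution 1.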


Author: Fuxi

Updated on September 18, 2022

Comments

  • Fuxi
    Fuxi over 1 year

    I'm using Notepad++ for editing my PHP scripts.

    However, I found a strange problem: when changing the encoding from ANSI to UTF-8 (without BOM), saving, closing, re-loading – then checking encoding: is still ANSI.

    Any ideas what's wrong? It always worked for me in the past.

    • Vince Bowdren
      Vince Bowdren almost 11 years
      If you change the encoding then make some change before saving, does that make a difference?
  • Fuxi
    Fuxi almost 12 years
    thanks for your comment, already did that but didn't help. when selecting "convert to ..", saving, reloading - it's still ANSI. i also tried creating a new file and pasting.
  • Jonathon
    Jonathon about 11 years
    Makes me wonder why BOM is not recommended, when it appears that using UTF8 is highly annoying and easily subject to error without it.
  • tvdo
    tvdo about 11 years
    @JonathonWisnoski I believe it's because the use of a BOM can break backwards compatibility with legacy (and other) programs expecting ASCII. In particular, many script parsers (possibly including PHP) and the shebang on some POSIX systems would trip up.
  • oefe
    oefe almost 9 years
    ANSI and UTF-8 are not more or less the same - apart from the ASCII subset, they are completely different. Trying to read a UTF-8 file (with non-ASCII characters) as ANSI will result in garbage, trying to read an ANSI file (with non-ASCII characters) as UTF-8 will most likely result in decoding errors (but can also result in garbage in some cases).
  • tvdo
    tvdo almost 9 years
    @oefe Edited ANSI => ASCII.
  • oefe
    oefe almost 9 years
    much better!