In Notepad++ the encoding of a file is set to UTF-8 but the encoding is actually ASCII

notepad++ encoding utf-8 ascii

Files which contain only ASCII characters are represented identically in ASCII and UTF-8 encodings. There's no difference between the two unless the file contains at least one non-ASCII character.

Whatever is causing your problem isn't the encoding.
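
The claim that ASCII-only text is byte-for-byte identical in both encodings can be checked directly. The following is a minimal Python sketch, not something from the original thread; the sample strings are made up for illustration.

```python
# Encode the same text both ways and compare the resulting bytes.
ascii_only = "Plain wiki text, no special characters."
with_accent = "Caf\u00e9"  # "Cafe" with an acute accent on the e (U+00E9)

# For ASCII-only text the two encodings produce identical bytes.
same_bytes = ascii_only.encode("ascii") == ascii_only.encode("utf-8")
print(same_bytes)

# One non-ASCII character is enough to make the encodings differ:
# UTF-8 uses two bytes for U+00E9, and ASCII cannot represent it at all.
utf8_bytes = with_accent.encode("utf-8")
print(utf8_bytes)

try:
    with_accent.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)
```

So a tool that inspects only the bytes of an ASCII-only file has no way to tell the two encodings apart, and reporting "ASCII" for it is not an error.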

azim58

Updated on September 18, 2022

Comments

azim58 over 1 year

I need some text files to be encoded as UTF-8 when I use them with Notepad++. However, sometimes I have UTF-8 selected as the encoding in Notepad++, but the file is actually in ASCII. I have confirmed this in two ways.

First, I use a simple wiki engine called Mobiki on XAMPP, which displays special characters correctly only when they are UTF-8 encoded. My text file does not display correctly in Mobiki. Second, I checked the file with the http://www.checkfiletype.com webpage. When I upload the problem files to that page, the website tells me that they are encoded as ASCII. The other pages, which do work with Mobiki, are reported by the website as UTF-8.

Why isn't Notepad++ forcing the file to be a UTF-8 file and/or how can I make Notepad++ do this? I tried selecting "Convert to UTF-8" even though Notepad++ shows that the file is encoded in UTF-8 already, but forcing this conversion did not help.

I found some other forum posts which describe a similar problem, but their solution was just to create a new text file. I hope to find a solution without creating a new text file.
- azim58 almost 8 years
  
  Thanks for the information. Yes, for that file Notepad++ displays UTF-8 in the lower right corner, but this seems to be wrong. The file has problems with my wiki, and the checkfiletype.com website shows that it is ASCII. So perhaps this is a bug, as you indicated might be possible.
- Gerald almost 6 years
  
  Possible duplicate of Unable to convert file to UTF-8
- Zan Lynx about 5 years
  
  Is it possible that the file encoded as UTF-8 has a "BOM" (byte order mark) encoded in the first three bytes? This BOM is meaningless for UTF-8 because only UCS-2 / UTF-16 / UCS-4 care about byte order. But some editors abuse it as a UTF encoding mark.
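
The BOM question in the last comment can be answered by looking at the first three bytes of the file, which are EF BB BF when a UTF-8 BOM is present. Below is a minimal Python sketch of that check; `has_utf8_bom` is a hypothetical helper name, and the sample bytes stand in for the contents of a real file.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the bytes start with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# Python's "utf-8" codec writes no BOM; "utf-8-sig" prepends one.
plain = "wiki page".encode("utf-8")
marked = "wiki page".encode("utf-8-sig")

print(has_utf8_bom(plain))
print(has_utf8_bom(marked))
print(marked[:3].hex())
```

For a file on disk, the same check applies to the bytes returned by reading it in binary mode (`open(path, "rb").read()`).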