UTF-8 characters missing or displayed as boxes in Notepad, but fine in a web browser and other text editors
If the file looks fine in other editors, then the text itself is fine. If it also looks fine in the browser, the HTTP response is probably fine too (though it is worth checking the page info in the browser to see which encoding it reports). The problem is most likely Notepad itself: it sometimes needs a BOM (byte order mark) at the start of the file to detect the encoding properly. Be aware, though, that a BOM can break other applications that don't support it. You should also try Notepad on a different version of Windows; I have just opened a UTF-8 file in Notepad on Windows 7 and it looks fine to me.
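To make the BOM suggestion concrete, here is a minimal Python sketch of saving text as UTF-8 with a BOM. The function names and the file name are hypothetical and only for illustration; the original application may well be written in another language, where the same idea applies (write the bytes EF BB BF before the UTF-8 payload).

```python
import codecs
import os
import tempfile


def save_with_bom(text: str, path: str) -> None:
    """Write text as UTF-8 preceded by the 3-byte BOM (EF BB BF)."""
    # The "utf-8-sig" codec emits the BOM once, at the start of the file.
    # newline="" keeps line endings exactly as they are in `text`.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        f.write(text)


def starts_with_bom(path: str) -> bool:
    """Return True if the file begins with the UTF-8 BOM."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


if __name__ == "__main__":
    # Hypothetical file name, written to a temporary directory.
    demo_path = os.path.join(tempfile.mkdtemp(), "notepad_test.txt")
    save_with_bom("caf\u00e9, na\u00efve, \u0416", demo_path)
    print(demo_path, "starts with BOM:", starts_with_bom(demo_path))
```

Programs that read this file back should decode it with "utf-8-sig" (or otherwise skip the BOM), since the trade-off mentioned above still holds: tools that do not expect a BOM may treat it as stray characters.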
JAVAGeek
Updated on June 13, 2022
Comments
-
JAVAGeek almost 2 years
I have UTF-8 text stored in a DB and served as
text/plain; charset=utf-8
in a web application. Everything works fine there: I can see the UTF-8 text in the browser window without any problem. But when I save that text to a file and open it in Windows Notepad, some characters are missing or displayed as small rectangular boxes. However, the text file looks fine in other editors such as EditPlus and Notepad++.
How is this caused and how can I solve it?
-
JAVAGeek over 11 years
I can see the encoding in Notepad++: it is ANSI, whereas I want it to be UTF-8.
-
Sergei Tachenov over 11 years
@JAVAGeek, if it's really ANSI, then Notepad shouldn't have any problem reading it. That means Notepad++ is wrong and the file is not ANSI. By "UTF-8", Notepad++ means "UTF-8 with BOM", which isn't strictly correct, as UTF-8 without a BOM is UTF-8 too. To be sure, look at your file in a hex viewer: if characters outside 7-bit ASCII are encoded as 2 or more bytes, then it's really UTF-8.
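If no hex viewer is at hand, the same check can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function names and labels are made up here, and a strict UTF-8 decode is used as a stand-in for eyeballing the bytes.

```python
import codecs


def classify(data: bytes) -> str:
    """Roughly classify raw file bytes, mirroring the hex-viewer check."""
    try:
        data.decode("utf-8")  # strict decode: fails on invalid sequences
    except UnicodeDecodeError:
        return "not valid UTF-8 (probably a legacy ANSI code page)"
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 with BOM"
    if all(b < 0x80 for b in data):
        return "pure ASCII"
    return "UTF-8 without BOM"


def non_ascii_bytes(data: bytes):
    """Yield (character, hex bytes) for every non-ASCII character."""
    # "utf-8-sig" drops a leading BOM so it is not reported as a character.
    for ch in data.decode("utf-8-sig"):
        if ord(ch) > 0x7F:
            yield ch, " ".join(f"{b:02x}" for b in ch.encode("utf-8"))


if __name__ == "__main__":
    sample = "caf\u00e9".encode("utf-8")  # stand-in for open(path, "rb").read()
    print(classify(sample))
    for ch, hex_bytes in non_ascii_bytes(sample):
        print(repr(ch), "->", hex_bytes)
```

A file reported as "UTF-8 without BOM" matches the situation described in this comment: the text is valid UTF-8, but an editor that relies on the BOM may label it as ANSI.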
-
CallumDA over 7 years
Please consider adding more explanation to your answer, for example, explaining where OP went wrong or why your solution works.