Problem writing UTF-8 encoded file in PHP
Solution 1
First off, don't depend on mb_detect_encoding. It's not great at figuring out what the encoding is unless the string contains a bunch of encoding-specific byte sequences (meaning sequences that are invalid in other encodings).
Try just getting rid of the mb_detect_encoding line altogether.
Oh, and utf8_encode turns a Latin-1 string into a UTF-8 string (not from an arbitrary charset to UTF-8, which is what you really want)... You want iconv, but you need to know the source encoding (and since you can't really trust mb_detect_encoding, you'll need to figure it out some other way).
Or you can try using iconv with an empty input encoding, $str = iconv('', 'UTF-8', $str); (which may or may not work)...
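As a rough sketch of the advice above: instead of guessing the encoding, check whether the string is already valid UTF-8 and only convert when it is not. The helper name toUtf8 and the default source encoding ISO-8859-1 are assumptions made for illustration; they are not part of the original answer, and you would substitute whatever source encoding your data really uses.

```php
<?php
// Hypothetical helper: leaves valid UTF-8 untouched, otherwise converts
// from an ASSUMED source encoding (ISO-8859-1 here is only a guess).
function toUtf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    // mb_check_encoding validates the bytes; it does not try to guess.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // iconv returns false if the conversion fails.
    $converted = iconv($assumedSource, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Could not convert from $assumedSource to UTF-8");
    }
    return $converted;
}

$alreadyUtf8 = "J\xC3\xA4rvamaa"; // "Järvamaa" as UTF-8 bytes
$latin1      = "J\xE4rvamaa";     // "Järvamaa" as Latin-1 bytes

var_dump(toUtf8($alreadyUtf8) === $alreadyUtf8); // unchanged, no double encoding
var_dump(toUtf8($latin1) === $alreadyUtf8);      // converted once
```

Both var_dump calls print bool(true). The check is not foolproof: a short Latin-1 string can occasionally happen to be valid UTF-8, so knowing the real source encoding is still the safer route.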
Solution 2
It doesn't work like that. Even if you utf8_encode($theString) you will not CREATE a UTF8 file.
The correct answer has something to do with the UTF-8 byte-order mark.
Read these to understand the issue:
- http://en.wikipedia.org/wiki/Byte_order_mark
- http://unicode.org/faq/utf_bom.html
The solution is the following: the UTF-8 byte-order mark is the byte sequence EF BB BF ("\xEF\xBB\xBF" in a double-quoted PHP string), so we should write it at the very beginning of the file.
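<?php
function writeStringToFile($file, $string){
    // Open in binary mode so the bytes are written unchanged.
    $f = fopen($file, "wb");
    // Prepend the UTF-8 byte-order mark to the content.
    $string = "\xEF\xBB\xBF" . $string;
    fputs($f, $string);
    fclose($f);
}
?>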
The $file can be any text file (plain text, XML, ...). The $string is your UTF-8 encoded string.
Try it now and it will write a UTF8 encoded file with your UTF8 content (string).
writeStringToFile('test.xml', 'éèàç');
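To confirm what actually ended up on disk, you can read back the first three bytes of the file and compare them with the BOM. This is only a minimal sketch: the helper name hasUtf8Bom and the temporary file are made up for illustration. Keep in mind that the Unicode FAQ linked above treats the BOM as optional for UTF-8, so a file without one can still be valid UTF-8.

```php
<?php
// Hypothetical helper: true if the file starts with the UTF-8 BOM (EF BB BF).
function hasUtf8Bom(string $path): bool
{
    // Read at most the first 3 bytes of the file.
    $head = file_get_contents($path, false, null, 0, 3);
    return $head === "\xEF\xBB\xBF";
}

// Demo on a throwaway temp file (the path is illustrative only).
$path = tempnam(sys_get_temp_dir(), 'bom');

file_put_contents($path, "\xEF\xBB\xBF" . "J\xC3\xA4rvamaa");
var_dump(hasUtf8Bom($path)); // file written with a BOM

file_put_contents($path, "J\xC3\xA4rvamaa");
var_dump(hasUtf8Bom($path)); // same text, no BOM

unlink($path);
```

The first call prints bool(true) and the second bool(false); in both cases the text itself is the same UTF-8 byte sequence.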
user387302
Updated on June 04, 2022
I have a large file that contains world countries/regions that I'm separating into smaller files based on individual countries/regions. The original file contains entries like:
EE.04 Järvamaa EE.05 Jõgevamaa EE.07 Läänemaa
However when I extract that and write it to a new file, the text becomes:
EE.04 JÃ¤rvamaa EE.05 JÃµgevamaa EE.07 LÃ¤Ã¤nemaa
To save my files I'm using the following code:
mb_detect_encoding($text, "UTF-8") == "UTF-8" ? : $text = utf8_encode($text);
$fp = fopen(MY_LOCATION,'wb');
fwrite($fp,$text);
fclose($fp);
I tried saving the files with and without utf8_encode() and neither seems to work. How would I go about preserving the original encoding (which is UTF-8)?
Thank you!