How can I convert "Western (Mac OS Roman)" formatted text to UTF-8 with PHP?
The mb-functions can't handle "macintosh", which is the IANA-defined name for Mac Roman. You have to use iconv instead:
$line = iconv('macintosh', 'UTF-8', $line);
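For what it's worth, here is a slightly more defensive sketch of the same call. The helper name `macRomanToUtf8` and the exception handling are my own additions for illustration, not part of the original answer. It assumes your iconv build recognizes the `macintosh` name (glibc and GNU libiconv do; minimal builds may not), and it relies on `iconv()` returning `false` on failure.

```php
<?php
// Hypothetical helper: the name and the error handling are illustrative only.
function macRomanToUtf8(string $text): string
{
    // iconv() takes (from_encoding, to_encoding, string) and returns false on failure.
    $converted = iconv('macintosh', 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException('iconv could not convert the input from Mac OS Roman');
    }
    return $converted;
}

// In Mac OS Roman the byte 0x9A is "ö", so these bytes spell "schön".
// In Windows-1252 the same byte is "š", which is where "schšn" comes from.
echo macRomanToUtf8("sch\x9An"), PHP_EOL;
```

In a loop over exported lines you would call it as `$dataset[$k] = macRomanToUtf8($line);`.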
Angry Dan
Updated on July 25, 2022
Angry Dan almost 2 years
I have files being exported by Excel for Mac 2011 VBA in Western (Mac OS Roman) as shown here:
I haven't been successful in getting Excel for Mac VBA to export directly to UTF-8, so I want to convert these files with PHP before I save them to MySQL. I have tried these commands:
$dataset[$k] = mb_convert_encoding($line, 'ASCII', 'UTF-8');        // not correctly converted
$dataset[$k] = mb_convert_encoding($line, 'ISO-8859-8', 'UTF-8');   // not correctly converted
$dataset[$k] = mb_convert_encoding($line, 'macintosh', 'UTF-8');    // unrecognized name
$dataset[$k] = mb_convert_encoding($line, 'Windows-1251', 'UTF-8'); // changes "schön" to "schљn"
$dataset[$k] = mb_convert_encoding($line, 'Windows-1252', 'UTF-8'); // changes "schön" to "schšn"
I found this list of valid encoding formats from 2008, but none of them seem to represent Western (Mac OS Roman).
* UCS-4
* UCS-4BE
* UCS-4LE
* UCS-2
* UCS-2BE
* UCS-2LE
* UTF-32
* UTF-32BE
* UTF-32LE
* UTF-16
* UTF-16BE
* UTF-16LE
* UTF-7
* UTF7-IMAP
* UTF-8
* ASCII
* EUC-JP
* SJIS
* eucJP-win
* SJIS-win
* ISO-2022-JP
* JIS
* ISO-8859-1
* ISO-8859-2
* ISO-8859-3
* ISO-8859-4
* ISO-8859-5
* ISO-8859-6
* ISO-8859-7
* ISO-8859-8
* ISO-8859-9
* ISO-8859-10
* ISO-8859-13
* ISO-8859-14
* ISO-8859-15
* byte2be
* byte2le
* byte4be
* byte4le
* BASE64
* HTML-ENTITIES
* 7bit
* 8bit
* EUC-CN
* CP936
* HZ
* EUC-TW
* CP950
* BIG-5
* EUC-KR
* UHC (CP949)
* ISO-2022-KR
* Windows-1251 (CP1251)
* Windows-1252 (CP1252)
* CP866 (IBM866)
* KOI8-R
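Since that list dates from 2008, it may be worth asking the running PHP build what it supports instead of relying on it. The snippet below is only a sketch: it uses `mb_list_encodings()` from the mbstring extension and filters for names containing "mac". The variable names are made up for illustration, and I make no claim about what your build will report.

```php
<?php
// List the encodings this PHP build's mbstring actually supports.
$supported = mb_list_encodings();

// Keep only names that mention "mac" (case-insensitive match).
$macLike = array_values(array_filter(
    $supported,
    static fn(string $name): bool => stripos($name, 'mac') !== false
));

if ($macLike === []) {
    echo 'mbstring reports no Mac encodings on this build', PHP_EOL;
} else {
    echo implode(', ', $macLike), PHP_EOL;
}
```

If nothing Mac-related shows up, mbstring cannot do the conversion on that build and another library is needed.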
What format do I need to use to convert "Western (Mac OS Roman)" to UTF-8?