iconv - Detected an illegal character in input string


Solution 1

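The illegal character is not in $matches[1] but in $xml: the notice refers to the data being converted, not to the charset name.

Try appending //TRANSLIT to the output charset: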

iconv($matches[1], 'utf-8//TRANSLIT', $xml);

Showing us the input string would also help us give a better answer.
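To see where the illegal bytes in $xml are, something like the following might help. It is only a sketch, and it assumes the input is meant to be Windows-1252 and that your iconv rejects the five byte values Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), which is what glibc-based builds do as far as I know. The function name findUndefinedCp1252Bytes is made up for this example.

```php
<?php
// Hypothetical helper (not part of PHP): returns [byte offset => hex value]
// for every byte that Windows-1252 leaves undefined.
function findUndefinedCp1252Bytes(string $input): array
{
    $offsets = [];
    // No /u modifier, so the pattern matches raw bytes rather than UTF-8 characters.
    if (preg_match_all('/[\x81\x8D\x8F\x90\x9D]/', $input, $matches, PREG_OFFSET_CAPTURE)) {
        foreach ($matches[0] as [$byte, $offset]) {
            $offsets[$offset] = sprintf('0x%02X', ord($byte));
        }
    }
    return $offsets;
}
```

Calling print_r(findUndefinedCp1252Bytes($xml)) before the iconv() call would then list the positions worth inspecting in a hex dump.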

Solution 2

Even if you use //TRANSLIT as in the accepted answer, you will still get the PHP notice when a character in your input string cannot be transliterated:

<?php
$cp1252 = '';

for ($i = 128; $i < 256; $i++) {
    $cp1252 .= chr($i);
}

echo iconv("cp1252", "utf-8//TRANSLIT", $cp1252);

PHP Notice:  iconv(): Detected an illegal character in input string in CP1252.php on line 8

Notice: iconv(): Detected an illegal character in input string in CP1252.php on line 8

So you should use //IGNORE, which drops whatever cannot be converted:

echo iconv("cp1252", "utf-8//IGNORE", $cp1252);
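Because the notice is easy to miss in a log, one option is to turn it into an exception so that a failed conversion cannot pass silently. This is a minimal sketch, not something from the answers here; iconvOrFail is a hypothetical name, and it assumes you would rather stop than continue with a partial result.

```php
<?php
// Hypothetical wrapper (not part of PHP): runs iconv() and throws
// instead of emitting a notice or returning false.
function iconvOrFail(string $from, string $to, string $input): string
{
    $message = null;
    // Capture any notice or warning raised while iconv() runs.
    set_error_handler(static function (int $severity, string $text) use (&$message): bool {
        $message = $text;
        return true; // skip PHP's default error handling for this error
    });
    try {
        $result = iconv($from, $to, $input);
    } finally {
        restore_error_handler();
    }
    if ($result === false || $message !== null) {
        throw new RuntimeException($message ?? 'iconv() failed');
    }
    return $result;
}
```

The caller can then catch RuntimeException and decide whether to retry with //IGNORE or to reject the input.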

Solution 3

Be very careful: the problem may come from a multibyte encoding combined with PHP functions that are not multibyte-safe.

That was the case for me, and it took me a while to figure it out.

For example, I get a string from MySQL using utf8mb4 (very common now, since it is needed to store emojis):

$formattedString = strtolower($stringFromMysql);
$strCleaned = iconv('UTF-8', 'utf-8//TRANSLIT', $formattedString); // WILL RETURN THE ERROR 'Detected an illegal character in input string'

In this case the problem is not in iconv() but in strtolower().

The appropriate way is to use the Multibyte String function mb_strtolower() instead of strtolower():

$formattedString = mb_strtolower($stringFromMysql);
$strCleaned = iconv('UTF-8', 'utf-8//TRANSLIT', $formattedString); // works fine
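To illustrate how a byte-oriented function can break a UTF-8 string, here is a small sketch. It uses substr() as a stand-in, because whether strtolower() corrupts anything depends on the PHP version and locale, while substr() always works on bytes. mb_check_encoding() reports whether the result is still valid UTF-8, which is a cheap check to run before iconv().

```php
<?php
$word = "\xC3\xA9"; // "é" encoded as UTF-8 (two bytes)

$byteCut = substr($word, 0, 1);             // keeps only the first byte
$charCut = mb_substr($word, 0, 1, 'UTF-8'); // keeps the first character

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($charCut, 'UTF-8')); // bool(true)
```

A string like $byteCut is the kind of input that makes iconv() complain about an illegal or incomplete character.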

MORE INFO

More examples of this issue are available at this SO answer

PHP Manual on Multibyte String functions

Solution 4

PHP 7.2

iconv('UTF-8', 'ASCII//TRANSLIT', 'é@ùµ$`à');
// "e@uu$`a"

iconv('UTF-8', 'ASCII//IGNORE', 'é@ùµ$`à');
// "@$`"

iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', 'é@ùµ$`à');
// "e@uu$`a"

PHP 7.4

iconv('UTF-8', 'ASCII//TRANSLIT', 'é@ùµ$`à');
// PHP Notice:  iconv(): Detected an illegal character

iconv('UTF-8', 'ASCII//IGNORE', 'é@ùµ$`à');
// "@$`"

iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', 'é@ùµ$`à');
// "e@u$`a"

// Transliterator requires the intl extension
iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', Transliterator::create('Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC')->transliterate('é@ùµ$`à'));
// "e@uu$`a" -> same as PHP 7.2

Solution 5

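I found one solution: pass the string through utf8_encode() first:

echo iconv('UTF-8', 'ASCII//TRANSLIT', utf8_encode($string));

Note that utf8_encode() treats its input as ISO-8859-1, so this only helps when $string really is ISO-8859-1. The function is also deprecated as of PHP 8.2.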
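Calling utf8_encode() on a string that is already UTF-8 double-encodes it, so a guarded version may be safer. The sketch below is one way to do that with mbstring; toUtf8AssumingLatin1 is a hypothetical name, and the assumption that any non-UTF-8 input is ISO-8859-1 is mine, so adjust it to your data.

```php
<?php
// Hypothetical helper (not part of PHP): returns UTF-8, assuming that
// any input which is not already valid UTF-8 is ISO-8859-1.
function toUtf8AssumingLatin1(string $input): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input; // already valid UTF-8, leave it untouched
    }
    // Same conversion utf8_encode() performs, without the deprecated function.
    return mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
}
```

The result can then be passed to iconv('UTF-8', 'ASCII//TRANSLIT', ...) as in the line above.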

Author: Ben

Updated on July 09, 2022

Comments

  • Ben, almost 2 years

    I don't see anything illegal - any suggestions on what might be the problem?

        if (strtolower($matches[1]) != 'utf-8') {
            var_dump($matches[1]);
            $xml = iconv($matches[1], 'utf-8', $xml);
            $xml = str_replace('encoding="'.$matches[1].'"', 'encoding="utf-8"', $xml);
        }
    

    Below is my debug/error

    string(12) "windows-1252"
    Notice (8): iconv() [http://php.net/function.iconv]: Detected an illegal character in input string [APP/models/sob_form.php, line 16]
    

    I've verified that the above code is indeed line 16

  • Erel Segal-Halevi, over 10 years
    I get the same notice even when I put "//IGNORE" on both sides
  • Erel Segal-Halevi, over 10 years
    I get the same notice even when I put "//TRANSLIT" on both sides
  • NobleUplift, over 10 years
    What do you mean on both sides?
  • NobleUplift, almost 9 years
    And @ErelSegal-Halevi, I would like to see your code.
  • NobleUplift, almost 9 years
    @Mantas But Erel was replying to the //IGNORE text in my answer, which is why I was confused by your praising of him.
  • Erel Segal-Halevi, almost 9 years
    It was a long time ago, but from what I remember, my code was something like: echo iconv("cp1252//IGNORE", "utf-8//IGNORE", $cp1252);
  • NobleUplift, almost 9 years
    That might explain it. You can't add flags to the in_charset of iconv, but you're right; this question is pretty old lol. Good thing I love necroposts.
  • Juha Untinen, almost 8 years
    In my case, using //IGNORE seems to delete the entire string? The hex values of the string: 4f 62 65 72 6b 72 c3 a4 6d 65 72 (= "Oberkrämer"), which becomes empty if I use iconv(mb_detect_encoding($string), 'ISO-8859-1//TRANSLIT', ($string));
  • NobleUplift, almost 8 years
    @JuhaUntinen Can you debug/output the result of mb_detect_encoding($string)? And you say //TRANSLIT is resulting in an empty string, or //IGNORE? You also don't need to put the second instance of $string in parentheses.
  • Grokking, almost 7 years
    Use //IGNORE instead
  • Wonko the Sane, over 3 years
    ^^^this (ASCII//TRANSLIT//IGNORE) is exactly what I needed. Thanks