PHP json encode - Malformed UTF-8 characters, possibly incorrectly encoded

Solution 1

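The error occurs when the string contains some bytes that are not valid UTF-8, even if most of it is valid. The call below converts the string from UTF-8 to UTF-8, which replaces the invalid byte sequences (with mbstring's substitute character, "?" by default) so that json_encode works: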

$data['name'] = mb_convert_encoding($data['name'], 'UTF-8', 'UTF-8');
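A minimal sketch of the effect, assuming the mbstring extension is loaded. The sample string and its stray 0xFF byte are made up for illustration; they are not the asker's data.

```php
<?php
// Hypothetical sample: valid Cyrillic text followed by a byte (0xFF)
// that can never appear in well-formed UTF-8.
$data = ['name' => "ра\xFF"];

// Without cleaning, json_encode fails and returns false.
$before = json_encode($data);
var_dump($before);                 // bool(false)
var_dump(json_last_error_msg());   // the "Malformed UTF-8 characters..." message

// UTF-8 to UTF-8 conversion keeps valid sequences and swaps the invalid
// byte for mbstring's substitute character.
$data['name'] = mb_convert_encoding($data['name'], 'UTF-8', 'UTF-8');

$isValid = mb_check_encoding($data['name'], 'UTF-8');
$after = json_encode($data, JSON_UNESCAPED_UNICODE);
var_dump($isValid);                // bool(true)
var_dump($after);
```

Note that the bad byte is lost either way; this makes the string encodable, it does not recover the original character.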

Solution 2

If you need to encode a multidimensional array as JSON, you can pass it through the utf8ize function below when JSON_ERROR_UTF8 occurs:

$encoded = json_encode( utf8ize( $responseForJS ) );

The function below converts the strings in an array recursively:

/* Clean up corrupt UTF-8 strings before passing data to json_encode.
 * Useful when json_encode fails with "Malformed UTF-8 characters, possibly incorrectly encoded".
 */
function utf8ize( $mixed ) {
    if (is_array($mixed)) {
        foreach ($mixed as $key => $value) {
            $mixed[$key] = utf8ize($value);
        }
    } elseif (is_string($mixed)) {
        return mb_convert_encoding($mixed, "UTF-8", "UTF-8");
    }
    return $mixed;
}
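One way to apply the function only when it is needed is to try a plain json_encode first and fall back on JSON_ERROR_UTF8. The sketch below assumes mbstring is loaded; safe_json_encode is a hypothetical wrapper name, and the payload is made up.

```php
<?php
// Recursive helper from the answer above, repeated so this sketch runs on its own.
function utf8ize($mixed) {
    if (is_array($mixed)) {
        foreach ($mixed as $key => $value) {
            $mixed[$key] = utf8ize($value);
        }
    } elseif (is_string($mixed)) {
        return mb_convert_encoding($mixed, 'UTF-8', 'UTF-8');
    }
    return $mixed;
}

// Hypothetical wrapper: encode normally, and only clean the data
// when json_encode reports a UTF-8 problem.
function safe_json_encode($value, int $flags = 0) {
    $encoded = json_encode($value, $flags);
    if ($encoded === false && json_last_error() === JSON_ERROR_UTF8) {
        $encoded = json_encode(utf8ize($value), $flags);
    }
    return $encoded;
}

// Made-up nested payload with one invalid byte (0xFF) in a nested string.
$responseForJS = ['user' => ['name' => "ра\xFF", 'id' => 7]];
$encoded = safe_json_encode($responseForJS);
var_dump($encoded !== false); // bool(true)
```

This only converts string values, not array keys, and it assumes JSON_THROW_ON_ERROR is not among the flags, since that flag makes json_encode throw instead of returning false.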

Solution 3

Make sure to create your PDO object with the charset set to utf8 in the DSN. This should fix the problem at the source and avoid any re-encoding dance.

$pdo = new PDO("mysql:host=localhost;dbname=mybase;charset=utf8", 'user', 'password');

Solution 4

As of PHP 7.2, two flags let you handle invalid UTF-8 directly in json_encode:

https://www.php.net/manual/en/function.json-encode

json_encode($text, JSON_INVALID_UTF8_IGNORE);

Or

json_encode($text, JSON_INVALID_UTF8_SUBSTITUTE);
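To illustrate the difference between the two flags, here is a small sketch that requires PHP 7.2 or later. The input string is made up: the letter "a", one invalid byte (0xFF), then the letter "b".

```php
<?php
// Made-up input with a single invalid byte in the middle.
$text = "a\xFFb";

// No flag: encoding fails.
$plain = json_encode($text);
var_dump($plain);        // bool(false)

// IGNORE drops the invalid byte from the output.
$ignored = json_encode($text, JSON_INVALID_UTF8_IGNORE);
var_dump($ignored);      // string(4) ""ab""

// SUBSTITUTE replaces the invalid byte with U+FFFD, the replacement character.
$substituted = json_encode($text, JSON_INVALID_UTF8_SUBSTITUTE);
var_dump(json_decode($substituted) === "a\u{FFFD}b"); // bool(true)
```

SUBSTITUTE keeps a visible marker where data was damaged, which can make problems easier to spot later; IGNORE silently shortens the string.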

Solution 5

Just add charset=utf8 to your PDO connection string, as in the line below:

$pdo = new PDO("mysql:host=localhost;dbname=mybase;charset=utf8", 'user', 'password');

Hope this helps.

Author: sparkmix

Updated on July 08, 2022

Comments

  • sparkmix
    sparkmix almost 2 years

    I'm using json_encode($data) on a data array, and one field contains Russian characters.

    I used mb_detect_encoding() to check the encoding of that field, and it reports UTF-8.

    I think json_encode fails because of some bad characters in it, like "ра▒". I tried a lot of things, including utf8_encode on the data; that bypasses the error, but then the data no longer looks correct.

    What can be done with this issue?

  • Alexandru Topală
    Alexandru Topală about 5 years
    This solved my problem. It also works for other connection types, like dblib for MSSQL Server.
  • elnezah
    elnezah about 4 years
    mb_convert_encoding does the recursive work itself, as the documentation states: "If val is an array, all its string values will be converted recursively." So the utf8ize function is not needed. All you need is json_encode(mb_convert_encoding($responseForJS, "UTF-8", "UTF-8"));
  • mylesmg
    mylesmg about 4 years
    mb_convert_encoding is only able to convert arrays if you are running PHP 7.2 or above, just for clarification. Otherwise, this function works perfectly.
  • yurguis
    yurguis almost 4 years
    Was given an old project to fix encoding issues and this helped me a lot. Only difference is that this project was using ADO and solution was a little bit different, solved it by using setCharset(), info here adodb.org/dokuwiki/…
  • Justin
    Justin over 3 years
    You might want to add this as well: $mysqli->set_charset("utf8");
  • pilat
    pilat about 3 years
    I've tried to find that invalid string by adding the following code: ` foreach ($addresses as $address) { $converted = mb_convert_encoding($address, 'UTF-8', 'UTF-8'); if ($converted !== $address) { dd($addresses); } }` Two points: 1. The $converted !== $address condition is never met. I suppose this is because === is a "binary-safe" operator… 2. I don't get error in the end, even though I never assign $converted to anything! It's like mb_convert_encoding() accepted string by reference, although it's not…
  • Haritsinh Gohil
    Haritsinh Gohil over 2 years
    Thanks, it works for me; my API response has an emoji in the title string. One thing confuses me, though: I have read that emoji are UTF-8 characters, so why does an emoji in a string cause this malformed UTF-8 characters error?
  • hugsbrugs
    hugsbrugs over 2 years
    @HaritsinhGohil perhaps some emojis are valid UTF-8 chars and others are not ...
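
For tracking down which value is actually malformed, as one of the comments above attempts, mb_check_encoding() reports validity directly without converting anything. A sketch, assuming mbstring is loaded; find_invalid_utf8 is a hypothetical helper name and the addresses are made up.

```php
<?php
// Hypothetical helper: walk a nested array and collect the key paths
// of every string that is not valid UTF-8.
function find_invalid_utf8($value, string $path = ''): array {
    if (is_string($value)) {
        return mb_check_encoding($value, 'UTF-8') ? [] : [$path];
    }
    $bad = [];
    if (is_array($value)) {
        foreach ($value as $key => $item) {
            // Build a dotted path such as "0.city" for reporting.
            $childPath = $path === '' ? (string) $key : $path . '.' . $key;
            $bad = array_merge($bad, find_invalid_utf8($item, $childPath));
        }
    }
    return $bad;
}

// Made-up data: the first city contains an invalid 0xFF byte.
$addresses = [
    ['street' => 'Main St', 'city' => "Mo\xFFscow"],
    ['street' => 'ул. Ленина', 'city' => 'Москва'],
];
$invalid = find_invalid_utf8($addresses);
print_r($invalid);
```

Knowing the exact field makes it easier to fix the source of the bad bytes (for example a connection charset or a byte-based truncation) instead of cleaning every value on output.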