Json Parser Error - Invalid UTF-8 start byte 0xa0

Solution 1

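0xa0 is the Unicode character 'NO-BREAK SPACE' (U+00A0), not the usual space (0x20).

The two are visually hard to tell apart, but some frameworks do not accept it. A lone 0xa0 byte is how single-byte encodings such as ISO-8859-1 write this character, so a strict UTF-8 parser rejects it.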
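Because the character looks like an ordinary space, it can help to search the payload for it explicitly. The following is a minimal sketch, not part of the original answer; the class name `NbspFinder`, the method `firstNbspIndex`, and the sample payload are hypothetical.

```java
public class NbspFinder {

    /** Returns the index of the first NO-BREAK SPACE (U+00A0) in text, or -1 if there is none. */
    static int firstNbspIndex(String text) {
        return text.indexOf('\u00A0');
    }

    public static void main(String[] args) {
        // Hypothetical payload with a no-break space after "link".
        String payload = "{\"text\":\"This is a link\u00A0\"}";
        int index = firstNbspIndex(payload);
        if (index >= 0) {
            System.out.println("NO-BREAK SPACE found at index " + index);
        } else {
            System.out.println("No NO-BREAK SPACE found");
        }
    }
}
```

The index is a position in the Java string, so it may not match the byte column reported by the server.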

Solution 2

Convert the offending characters (for example, replace the no-break space with a regular space), then test again.

Helpful link: http://www.ietf.org/rfc/rfc4627.txt (RFC 4627, the JSON media type specification)
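One way to apply this conversion is sketched below, assuming the JSON is available as a Java `String` before it is sent. The class `JsonBodyEncoder` and its methods are hypothetical names used only for illustration. Replacing the character changes the text, so this is only appropriate when a regular space is acceptable; encoding the body explicitly as UTF-8 may already be enough on its own.

```java
import java.nio.charset.StandardCharsets;

public class JsonBodyEncoder {

    /** Replaces every NO-BREAK SPACE (U+00A0) with a regular space (U+0020). */
    static String normalizeSpaces(String json) {
        return json.replace('\u00A0', ' ');
    }

    /** Encodes the body with an explicit charset instead of the platform default. */
    static byte[] toUtf8(String json) {
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical body containing a no-break space after "link".
        String json = "{\"text\":\"This is a link\u00A0\"}";
        byte[] body = toUtf8(normalizeSpaces(json));
        System.out.println(body.length + " bytes ready to send");
    }
}
```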

Author by

sha

Android developer

Updated on July 28, 2022

Comments

  • sha, over 1 year ago

    I'm struggling with an issue while sending JSON data to the server. I suspect the payload contains bad bytes that are not valid at the start of a UTF-8 sequence.

    I used CharsetDecoder to replace all malformed UTF-8 input; here is the code:

        // Construct the decoder
        CharsetDecoder utf8Decoder = Charset.forName("UTF-8").newDecoder();
        utf8Decoder.onMalformedInput(CodingErrorAction.REPLACE);
        utf8Decoder.onUnmappableCharacter(CodingErrorAction.REPLACE);
        // Replace malformed input with a space
        utf8Decoder.replaceWith(" ");

        // Wrap the text's bytes, encoded explicitly as UTF-8 rather than the platform default
        ByteBuffer byteBuff = ByteBuffer.wrap(text.getBytes(Charset.forName("UTF-8")));
        try {
            // Decode the bytes back to text
            CharBuffer parsed = utf8Decoder.decode(byteBuff);
            // toString() returns only the decoded characters; array() may include unused capacity
            return parsed.toString();
        } catch (CharacterCodingException e) {
            e.printStackTrace();
        }
    

    This is not helping. When I look at the column of the JSON post data that the parser complains about, it is a space character.

    The JSON to post is:

    {"body":{"messageSegments":[{"type":"Text","text":"This is a link "},{"type":"Mention","id":"005GGGGGG02g6MMIAZ"},{"type":"Text","text":" ish"}]},"capabilities":{"questionAndAnswers":{"questionTitle":"https:\/\/www.google.co.nz\/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-16"}}}
    

    The error is:

    [{"errorCode":"JSON_PARSER_ERROR","message":"Invalid UTF-8 start byte 0xa0 at [line:1, column:139]"}]

    Any leads, please?

    Thanks,
    Sree

  • Jason Young, over 4 years ago
    Note that 'NO-BREAK SPACE' in UTF-8 is two bytes, 0xc2 0xa0 (see utf8-chartable.de/unicode-utf8-table.pl?utf8=0x). I cannot find a plain 0xA0 in UTF-8, which makes sense given that it is not a valid start byte.
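The byte sequences mentioned in this comment can be checked with a short sketch. The class `NbspBytes` and its helper methods are hypothetical names; the comparison with ISO-8859-1 is an added illustration of where a lone 0xa0 byte can come from, not something stated in the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NbspBytes {

    /** Formats the bytes of s in the given charset as lowercase hex, separated by spaces. */
    static String hex(String s, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    static String utf8Hex(String s) {
        return hex(s, StandardCharsets.UTF_8);
    }

    static String latin1Hex(String s) {
        return hex(s, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0";
        System.out.println("UTF-8:      " + utf8Hex(nbsp));   // c2 a0
        System.out.println("ISO-8859-1: " + latin1Hex(nbsp)); // a0
    }
}
```

If the client encodes the body with a single-byte charset while the server decodes it as UTF-8, the server sees the bare 0xa0 and reports it as an invalid start byte.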