curl with Umlaut causes "JSON parse error: Invalid UTF-8 middle byte 0x22"


Try replacing the character ä with its JSON Unicode escape, \u00e4, so that the request body contains only ASCII:

curl -s -X POST -H "Content-Type: application/json" -H "Accept: application/json" -d '{"testField":"\u00e4"}' https://someurl..
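If the text comes from users and escaping by hand is not practical, the escape can be generated programmatically. The following is only a minimal sketch in Python, not part of the original answer; the helper name `to_ascii_json` is hypothetical, and it assumes the input has already been decoded into a proper Unicode string.

```python
import json


def to_ascii_json(payload: dict) -> str:
    """Serialize payload so every non-ASCII character becomes a \\uXXXX escape."""
    # ensure_ascii=True is already the default for json.dumps;
    # it is spelled out here to make the intent visible.
    return json.dumps(payload, ensure_ascii=True)


body = to_ascii_json({"testField": "ä"})
print(body)  # {"testField": "\u00e4"}
```

Because the resulting body is pure ASCII, it is the same byte sequence whether the shell treats it as Windows-1252 or as UTF-8, so it can be passed to `curl -d` unchanged.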

Author: Bernie Lenz

Updated on September 18, 2022

Comments

  • Bernie Lenz
    Bernie Lenz over 1 year

    I'm running the curl command below from the command line (Git Bash on Windows) or as part of a Bash script.

    curl -s -X POST -H "Content-Type: application/json" -H "Accept: application/json" -d "{\"testField\":\"ä\"}" https://someurl...
    

    The body of the curl command has an Umlaut ä.

    The server, which is a Spring Boot REST API running in an AWS Elastic Beanstalk container, returns the following error:

    JSON parse error: Invalid UTF-8 middle byte 0x22; nested exception is com.fasterxml.jackson.core.JsonParseException: Invalid UTF-8 middle byte 0x22\n at [Source: (PushbackInputStream); line: 1, column: 17]
    

    The same curl command imported into Postman works just fine, so I believe it's a curl issue rather than a server problem.

    • Ignacio Vazquez-Abrams
      Ignacio Vazquez-Abrams about 6 years
      Why not use a Unicode escape?
    • Bernie Lenz
      Bernie Lenz about 6 years
      The inputs come from users either as a copy-paste or a text file, so I'd like to avoid having to manually convert any extended ASCII characters.
    • Fox
      Fox about 6 years
      The issue isn't really the accented character; it's that the data encoding does not match the parse encoding. How would your system react if the input were, say, ISO 2022-JP encoded instead of (I assume) Windows code page 1252? Just ignoring text within quotes doesn't cut it, because 0x22 is not always " in ISO 2022-JP. Can you enforce a specific input encoding? If not, can you reliably determine the encoding given?
  • Bernie Lenz
    Bernie Lenz about 6 years
    The inputs come from users either as a copy-paste or via a text file, so I'd like to avoid having to manually convert any extended ASCII characters.
  • jayhendren
    jayhendren about 6 years
    Then you'll need to use a Unicode encoding library of some kind. There are ones available in most popular languages (Python, PHP), so just choose your favorite...
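
The encoding mismatch Fox describes, and the library-based conversion jayhendren suggests, can be sketched in Python. This is only an illustration under an assumption: it takes Fox's guess that the shell sends the body as Windows code page 1252, which the thread does not confirm. The helper name `to_utf8_json` is hypothetical.

```python
import json


def to_utf8_json(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode a JSON body from an assumed source encoding to UTF-8."""
    text = raw.decode(source_encoding)
    json.loads(text)  # raises ValueError if the body is not valid JSON
    return text.encode("utf-8")


# What the shell would send if it encodes the body as cp1252 (assumption).
# "\u00e4" is the character ä.
raw = '{"testField":"\u00e4"}'.encode("cp1252")

# In cp1252, ä is the single byte 0xE4. Read as UTF-8, 0xE4 announces a
# three-byte sequence, but the next byte is 0x22 (the closing quote).
print(hex(raw[14]), hex(raw[15]))

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print("not valid UTF-8:", exc.reason)

fixed = to_utf8_json(raw)
print(fixed)  # ä is now the two bytes 0xC3 0xA4
```

The byte pair 0xE4, 0x22 matches the "Invalid UTF-8 middle byte 0x22" in the server's message. The sketch only works if the source encoding is known or can be enforced, which is exactly the question Fox raises. The converted bytes could then be written to a file and posted with curl's `--data-binary @file` option instead of being passed inline with `-d`.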