"Invalid byte 1 of 1-byte UTF-8 sequence" occurs when posting XML from a .jar but not from Eclipse

You need to choose the encoding used by your PrintWriter. Outside of Eclipse, your platform is presumably defaulting to something other than UTF-8.
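
A minimal sketch of why the default matters. It assumes, purely for illustration, that the platform default outside Eclipse is a legacy charset such as GBK (the question does not say which charset was actually in use); the class and method names are hypothetical. It encodes the same XML string through a PrintWriter twice and checks whether a strict UTF-8 decoder accepts the resulting bytes. Eclipse typically launches programs with the encoding configured for the workspace or launch configuration, which would explain why the problem only shows up in the exported jar. On Java 18 and later the default charset is UTF-8 unless overridden, so the mismatch is less likely there.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {

    // Writes text through a PrintWriter that encodes with the given charset
    // and returns the bytes that would be sent over the wire.
    static byte[] encodeWith(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(sink, charset));
        pw.write(text);
        pw.close(); // flushes the encoder into the sink
        return sink.toByteArray();
    }

    // Strict check: true only if the bytes form well-formed UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is a two-character Chinese sample string.
        String xml = "<xml><![CDATA[\u4e2d\u6587]]></xml>";

        System.out.println("Platform default charset: " + Charset.defaultCharset());
        System.out.println("Encoded as UTF-8, valid UTF-8? "
                + isValidUtf8(encodeWith(xml, StandardCharsets.UTF_8)));

        // GBK is an assumption standing in for "some non-UTF-8 default".
        if (Charset.isSupported("GBK")) {
            System.out.println("Encoded as GBK, valid UTF-8? "
                    + isValidUtf8(encodeWith(xml, Charset.forName("GBK"))));
        }
    }
}
```

A server that parses the request body as UTF-8 would reject the second kind of byte sequence, which is consistent with the exception in the question.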

Try this code:

PrintWriter pw = new PrintWriter(new OutputStreamWriter(
    conn.getOutputStream(), "UTF-8"));
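
For context, here is a fuller sketch of the client-side POST with the encoding pinned. It is a variation on the snippet above rather than the same code: it encodes the string to UTF-8 bytes up front and writes them directly, instead of wrapping the stream in a PrintWriter. The class and method names (`Utf8XmlPoster`, `utf8Body`, `postXml`) are hypothetical, and the `charset=UTF-8` parameter on the Content-Type header follows the suggestion in the comments.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class Utf8XmlPoster {

    // Encodes the XML explicitly as UTF-8, independent of the platform default.
    static byte[] utf8Body(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    // Posts the XML to the given address and returns the HTTP status code.
    static int postXml(String address, String xml) throws IOException {
        byte[] body = utf8Body(xml);

        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        // Declare the encoding that the body really uses.
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
        conn.setFixedLengthStreamingMode(body.length);

        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        return conn.getResponseCode();
    }
}
```

Either approach works as long as the bytes on the wire are produced with an explicitly named charset.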
JaskeyLam
Author by

JaskeyLam

blog: jaskey.github.io

Updated on June 05, 2022

Comments

  • JaskeyLam
    JaskeyLam over 1 year

    I have been struggling with a problem. I have read many threads about "Invalid byte 1 of 1-byte UTF-8 sequence", such as "XML Invalid byte 1 of 1-byte UTF-8 sequence" and "MalformedByteSequenceException Invalid byte 1 of 1-byte UTF-8 sequence", but none of them solved my problem.

    I have a web application (hosted in a PaaS cloud) that handles requests from mobile clients (not developed by me) without problems.

    To test the server application, I wrote a Swing-based client application (call it test-client) that posts XML data to my server over HTTP.

    The strange thing is that when I run test-client in Eclipse, it works fine: it submits the POST and gets the message back from my server.

    But when I export it as a runnable JAR and post XML data containing Chinese characters, my server logs show the exception "org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception: Invalid byte 1 of 1-byte UTF-8 sequence."

    I believe this is related to a difference in encoding between my computer and Eclipse.

    Please note that:

    1. I do not read the XML from a file; instead I construct the XML data from an object.

    2. My Eclipse workspace encoding (Preferences > General > Workspace) is set to UTF-8, and I call request.setCharacterEncoding("UTF-8"); in my doPost.

    3. I would like to fix this in my test-client code, since the server already works fine in production with mobile users.

    Below is how I post the XML data:

            URL url = new URL(address);
            URLConnection uc = url.openConnection();
            HttpURLConnection conn = (HttpURLConnection) uc;
            conn.setDoInput(true);
            conn.setDoOutput(true);
            conn.setRequestMethod("POST");
            conn.setRequestProperty("Content-type", "text/xml");
            System.out.println("before POST:\n"+xmlstr);
            PrintWriter pw = new PrintWriter(conn.getOutputStream());
            pw.write(xmlstr);
            pw.close();
    

    And xmlstr comes from the code below

    (RequestTextMessage is a very simple class that only has getters; one of its fields holds an input String, which may be Chinese):

     xmlstr = XMLRequest.textMessageToXml(msg);

    public static String textMessageToXml(RequestTextMessage textMsg) {
        xstream.alias("xml", textMsg.getClass());
        return xstream.toXML(textMsg);
    }

    private static XStream xstream = new XStream(new XppDriver() {

        @Override
        public HierarchicalStreamWriter createWriter(Writer out) {
            return new PrettyPrintWriter(out) {
                boolean cdata = true;

                // Wrap every text node in a CDATA section when cdata is true.
                @Override
                protected void writeText(QuickWriter writer, String text) {
                    if (cdata) {
                        writer.write("<![CDATA[");
                        writer.write(text);
                        writer.write("]]>");
                    } else {
                        writer.write(text);
                    }
                }
            };
        }
    });
    

    For reference, the exception from the server is below (sorry, the stack trace is printed in reverse order):

    at java.lang.Thread.run(Thread.java:724)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:603)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
    at wodinow.weixin.jaskey.co.CoreServlet.doPost(CoreServlet.java:161)
    at wodinow.weixin.jaskey.service.CommandService.generateResponseXML(CommandService.java:76)
    at wodinow.weixin.jaskey.util.MessageUtil.parseXml(MessageUtil.java:52)
    at org.dom4j.io.SAXReader.read(SAXReader.java:335)
    at org.dom4j.io.SAXReader.read(SAXReader.java:439)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:568)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1210)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:123)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:764)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:835)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:489)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:116)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:607)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2947)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanCDATASection(XMLDocumentFragmentScannerImpl.java:1614)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.scanData(XMLEntityScanner.java:1252)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1753)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:557)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:687)
    com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
    Nested exception:
    at java.lang.Thread.run(Thread.java:724)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:603)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1041)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:100)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:728)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:647)
    at wodinow.weixin.jaskey.co.CoreServlet.doPost(CoreServlet.java:161)
    at wodinow.weixin.jaskey.service.CommandService.generateResponseXML(CommandService.java:76)
    at wodinow.weixin.jaskey.util.MessageUtil.parseXml(MessageUtil.java:52)
    at org.dom4j.io.SAXReader.read(SAXReader.java:335)
    at org.dom4j.io.SAXReader.read(SAXReader.java:458)
    org.dom4j.DocumentException: Invalid byte 1 of 1-byte UTF-8 sequence. Nested exception:     Invalid byte 1 of 1-byte UTF-8 sequence. 
    
  • Joop Eggen
    Joop Eggen about 9 years
    Plus conn.setRequestProperty("Content-type", "text/xml; charset=UTF-8"); which might be redundant as this seems configured elsewhere.
  • JaskeyLam
    JaskeyLam about 9 years
    Thank you very much! It seems to work now; I will try converting it into an exe and retry. Before I got your answer, I had tried the following four things: 1. xmlstr = URLEncoder.encode(xmlstr, "utf-8"); 2. chineseContent = URLEncoder.encode(chineseContent, "utf-8"); 3. conn.setRequestProperty("Accept-Charset", "utf-8"); 4. conn.setRequestProperty("contentType", "utf-8"); None of them helped, and some even made things worse. Could you explain the difference between your answer and my attempts?
  • Duncan Jones
    Duncan Jones about 9 years
    @Jaskey I'm not an expert on these classes, but I would suggest: 1. This escapes Chinese characters but is not equivalent to creating a UTF-8 string. 2. Same thing. 3. Not sure. 4. Looks like this advertises your text is UTF-8, but doesn't actually cause it to be UTF-8 encoded.
  • JaskeyLam
    JaskeyLam about 9 years
    @Duncan, thank you! I tried it as an exe file and it works too. One more question: when we call response.setCharacterEncoding("UTF-8"), does response.getWriter() then return a PrintWriter that encodes as UTF-8?
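
To illustrate the point made in the comments about attempts 1 and 2: URLEncoder does not produce UTF-8 encoded XML. It produces an ASCII string of percent escapes, and it escapes the markup characters too, so the server's XML parser no longer receives XML at all. The sketch below contrasts the two; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlEncodeVsUtf8 {

    // Form-style percent encoding: the result is plain ASCII text.
    static String urlEncoded(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Character encoding: the same characters expressed as UTF-8 bytes.
    static byte[] utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String xml = "<a>\u4e2d</a>";
        // The angle brackets and slash are escaped along with the Chinese character.
        System.out.println(urlEncoded(xml));
        // The markup is preserved; only the byte representation is fixed.
        System.out.println(new String(utf8Bytes(xml), StandardCharsets.UTF_8));
    }
}
```

Setting request headers (attempts 3 and 4) only describes the data; it does not change which bytes the PrintWriter actually writes.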