String length differs from JavaScript to Java code

Solution 1

This isn't really a JavaScript (or Java) problem - both layers report an accurate length for the string they are dealing with. The problem in your case is that the string gets transformed during the HTTP transmission.

If you absolutely must ensure that the string doesn't exceed a certain length, you can mimic this transformation on the client by replacing every instance of "\n" with "\r\n", but only for length verification purposes:

textarea.value.replace(/\n/g, "\r\n").length
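
A slightly more defensive variant of the one-liner above, as a minimal sketch. The helper names `submittedLength` and `fitsLimit` are hypothetical, and the sketch assumes the server receives every line break as "\r\n". It matches an existing "\r\n" first so that a break is never counted twice:

```javascript
// Hypothetical helper: length of the text as the server is assumed to see it,
// i.e. with every line break transmitted as CRLF ("\r\n").
function submittedLength(text) {
  // "\r\n" is listed first so an existing CRLF pair is replaced as one unit
  // instead of being expanded twice.
  return text.replace(/\r\n|\r|\n/g, "\r\n").length;
}

// Hypothetical limit check built on the helper above.
function fitsLimit(text, maxLength) {
  return submittedLength(text) <= maxLength;
}
```

In the form's submit handler this could be called as fitsLimit(textarea.value, 2000), where 2000 is the limit mentioned in the question.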

Solution 2

Do you particularly care which line endings are used? Why not just have the Java side convert "\r\n" to "\n"? (Note that "\r\n" is the Windows style; "\n" is the Unix style.)

Alternatively, do the reverse when checking the length within the JavaScript.
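
The normalization suggested here is a single replace. The sketch below shows it in JavaScript with hypothetical helper names (`normalizeLineEndings`, `normalizedLength`); it assumes the Java side applies the same rule, so that both layers count each line break as one character:

```javascript
// Hypothetical helper: collapse every line break to a single "\n".
function normalizeLineEndings(text) {
  // "\r\n?" matches a CR optionally followed by LF, so both Windows-style
  // "\r\n" and a lone "\r" become "\n"; an existing "\n" is left untouched.
  return text.replace(/\r\n?/g, "\n");
}

// Hypothetical helper: length with each line break counted as one character.
function normalizedLength(text) {
  return normalizeLineEndings(text).length;
}
```

Whichever convention is chosen, the point is that the client and the server measure the string after the same normalization.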

Solution 3

Are you limiting it to 2000 chars so it fits inside an nvarchar(2000) column in a database? Otherwise maybe just allow a 2% overrun to be flexible on the Java side.

And Java should be using Unicode (UTF-16) to represent Strings. That \r must have got in there somewhere else, maybe in a conversion done by the web browser when submitting? Have you tried different browsers? On different platforms? You might just have to strip out the \r characters.

Author by

François P.

I'm a junior IT Engineer from Montreal, Canada who currently works as a native iPhone applications developer. I call myself an über-geek and tech enthusiast with a deep passion for programming. I'm a devoted Mac and Open Source fan. I've contributed to a few FLOSS projects (such as Wifidog and the MIT iFIND project). Stack Overflow has been part of my daily routine since November 2008 and I love learning with you guys.

Updated on June 17, 2022

Comments

  • François P.
    François P. almost 2 years

    I've got a JSP page with a piece of JavaScript validation code which limits the input to a certain number of characters on submit. I'm using a <textarea>, so I can't simply use a maxlength attribute as with an <input type="text">.

    I use document.getElementById("text").value.length to get the string length. I'm running Firefox 3.0 on Windows (but I've tested this behavior with IE 6 also). The form gets submitted to a J2EE servlet. In my Java servlet the string length of the parameter is larger than 2000!

    I've noticed that this can easily be reproduced by adding carriage returns in the <textarea>. I've used Firebug to assert the length of the <textarea> and it really is 2000 characters long. On the Java side though, the line breaks get converted to Windows style (\r\n instead of \n), thus the string length differs!

    Am I missing something obvious here or what? If not, how would you reliably (cross-platform / browser) make sure that the <textarea> length is limited?

    • Tomalak
      Tomalak over 15 years
      @François: Always enclose things in tag brackets in back-ticks (e.g. format them as code), or they will be stripped out on display of your question.
    • varnie
      varnie over 11 years
      just stumbled across such situation. your topic made my day, sir! thanks a lot!
  • François P.
    François P. over 15 years
    OK. I get it. I guess that means that JavaScript always represents carriage returns the UNIX way internally and through its APIs (i.e. length). Somehow it gets converted to \r\n because the Java VM is running on Windows. I wish it was more uniform...
  • Jon Skeet
    Jon Skeet over 15 years
    I don't know what rules different browsers and servlet engines will apply, but normalization should remove the differences either way. Btw, it's worth trying on Macs too, where \r is the normal line break.
  • user2161301
    user2161301 about 14 years
    Note: The code should be textarea.value.replace(/\n/g, "\n\r").length to find all occurrences. The original code only looks for the first match.
  • Christoffer Hammarström
    Christoffer Hammarström about 14 years
    It should be "\r\n", not "\n\r".
  • Vincent Robert
    Vincent Robert almost 14 years
    +1 Just remove all "\r" and everybody will be happy, whatever platform you are using. Macs included.
  • Camilo Martin
    Camilo Martin over 13 years
    To remember that the right order is \r\n, remember \r stands for Carriage Return (Cr) and \n stands for (new)Line Feed (Lf) in CrLf.
  • Ryan
    Ryan over 12 years
    I think you need to do the line ending conversion / check on both client and server because you don't know what line endings the client browser is going to use (Firefox submits \n even on Windows) and if you develop server side on Windows and deploy on Linux then the line endings will be handled differently.