Replacement for javascript escape?


escape() is defined in section B.2.1.2 (escape) of the ECMAScript specification, and the introductory text of Annex B says:

... All of the language features and behaviours specified in this annex have one or more undesirable characteristics and in the absence of legacy usage would be removed from this specification. ...

For characters whose code unit value is 0xFF or less, escape() produces a two-digit escape sequence, %xx (apart from a small set of ASCII characters, such as letters and digits, that it leaves unescaped). In effect, escape() converts a string containing only characters from U+0000 to U+00FF into a percent-encoded string using the Latin-1 encoding.
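
As a small illustration of the difference (the variable names are only for this example), the same three characters come out as one Latin-1 byte each with escape() and as two UTF-8 bytes each with encodeURIComponent():

```javascript
const text = "åäö";

// escape(): one %XX per code unit, which equals the Latin-1 byte value here.
const latin1Encoded = escape(text);
console.log(latin1Encoded); // %E5%E4%F6

// encodeURIComponent(): one %XX per UTF-8 byte, two bytes per character here.
const utf8Encoded = encodeURIComponent(text);
console.log(utf8Encoded); // %C3%A5%C3%A4%C3%B6
```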

For characters with a greater code unit value, the four-digit format %uxxxx is used. This format is not allowed within the hfields part (where subject and body are stored) of a mailto: URI, as defined in RFC 6068:

mailtoURI    = "mailto:" [ to ] [ hfields ]
to           = addr-spec *("," addr-spec )
hfields      = "?" hfield *( "&" hfield )
hfield       = hfname "=" hfvalue
hfname       = *qchar
hfvalue      = *qchar
...
qchar        = unreserved / pct-encoded / some-delims
some-delims  = "!" / "$" / "'" / "(" / ")" / "*"
               / "+" / "," / ";" / ":" / "@"

unreserved and pct-encoded are defined in STD 66 (RFC 3986):

unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded   = "%" HEXDIG HEXDIG

A percent sign is only allowed if it is directly followed by two hex digits; a percent sign followed by u is not allowed.
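
The grammar above can be turned into a quick check. The following is only a sketch: the names QCHAR_STRING and isValidHfvalue are made up for this example, and the regular expression is a direct transcription of the qchar rule quoted above, not a full mailto validator.

```javascript
// qchar = unreserved / pct-encoded / some-delims, repeated zero or more times.
const QCHAR_STRING = /^(?:[A-Za-z0-9\-._~]|%[0-9A-Fa-f]{2}|[!$'()*+,;:@])*$/;

// Hypothetical helper: true if value is a valid hfname/hfvalue per the rule above.
function isValidHfvalue(value) {
  return QCHAR_STRING.test(value);
}

const emoji = "😎";
console.log(escape(emoji));                             // %uD83D%uDE0E
console.log(isValidHfvalue(escape(emoji)));             // false: "%u" is not pct-encoded
console.log(encodeURIComponent(emoji));                 // %F0%9F%98%8E
console.log(isValidHfvalue(encodeURIComponent(emoji))); // true
```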

Using a self-implemented version that behaves exactly like escape() doesn't solve anything. Instead, just continue to use escape(); it won't be removed anytime soon.



To summarise: your previous usage of escape() generated Latin-1 percent-encoded mailto URIs as long as all characters were in the range U+0000 to U+00FF; otherwise it generated an invalid URI (which some applications might still interpret correctly, if they were written with JavaScript encode/decode compatibility in mind).

It is more correct (no risk of creating invalid URIs) and more future-proof to generate UTF-8 percent-encoded mailto URIs using encodeURIComponent() (don't use encodeURI(): it does not escape ?, /, ...). RFC 6068 requires UTF-8 in many places (but allows other encodings for "MIME encoded words and for bodies in composed email messages").
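
A minimal sketch of that approach follows. The helper name buildMailto is made up for this example, and it assumes the `to` argument is already a valid (or empty) address list, so only the header names and values are encoded. The characters that encodeURIComponent() leaves unescaped (letters, digits and - _ . ! ~ * ' ( )) are all permitted as qchar, so the result stays within the grammar.

```javascript
// Hypothetical helper: builds a mailto: URI with UTF-8 percent-encoded hfields.
// Assumes `to` is already a valid address list (or an empty string).
function buildMailto(to, fields) {
  const query = Object.entries(fields)
    .map(([name, value]) => encodeURIComponent(name) + "=" + encodeURIComponent(value))
    .join("&");
  return "mailto:" + to + (query ? "?" + query : "");
}

console.log(buildMailto("", { subject: "Swedish åäö", body: "Emoji 😎" }));
// mailto:?subject=Swedish%20%C3%A5%C3%A4%C3%B6&body=Emoji%20%F0%9F%98%8E
```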

Example:

const text_latin1 = "Swedish åäö";
const text_other = "Emoji 😎";

document.getElementById('escape-latin-1-link').href = "mailto:?subject=" + escape(text_latin1);
document.getElementById('escape-other-chars-link').href = "mailto:?subject=" + escape(text_other);
document.getElementById('utf8-link').href = "mailto:?subject=" + encodeURIComponent(text_latin1);
document.getElementById('utf8-other-chars-link').href = "mailto:?subject=" + encodeURIComponent(text_other);

function mime_word(text) {
  const q_encoded = encodeURIComponent(text) // to UTF-8 percent-encoding
    .replace(/[_!'()*]/g, function (c) { return '%' + c.charCodeAt(0).toString(16).toUpperCase(); }) // also percent-encode a few ASCII characters that encodeURIComponent leaves alone
    .replace(/%20/g, '_') // MIME Q-encoding uses underscore for space
    .replace(/%/g, '='); // MIME Q-encoding uses "=" instead of "%"
  return encodeURIComponent('=?utf-8?Q?' + q_encoded + '?='); // add the encoded-word wrapper and escape for the URI
}

// don't use mime_word for the body!
document.getElementById('mime-word-link').href = "mailto:?subject=" + mime_word(text_latin1);
document.getElementById('mime-word-other-chars-link').href = "mailto:?subject=" + mime_word(text_other);

<a id="escape-latin-1-link">escape()-latin1</a><br/>
<a id="escape-other-chars-link">escape()-emoji</a><br/>
<a id="utf8-link">utf8</a><br/>
<a id="utf8-other-chars-link">utf8-emoji</a><br/>
<a id="mime-word-link">mime-word</a><br/>
<a id="mime-word-other-chars-link">mime-word-emoji</a><br/>
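
To see what a mail client has to do with the MIME-word variant, here is a rough sketch of the reverse direction. The name decodeQWord is made up for this example; it only handles a single UTF-8 Q-encoded word, and it expects the subject value after the URI percent-encoding has already been removed (for instance with decodeURIComponent()).

```javascript
// Hypothetical helper: decodes one UTF-8 Q-encoded word such as "=?utf-8?Q?...?=".
function decodeQWord(word) {
  const match = /^=\?utf-8\?Q\?(.*)\?=$/i.exec(word);
  if (!match) return word; // not an encoded word, return it unchanged
  const percentEncoded = match[1]
    .replace(/%/g, "%25")                  // protect literal percent signs
    .replace(/_/g, "%20")                  // underscore stands for a space
    .replace(/=([0-9A-Fa-f]{2})/g, "%$1"); // "=XX" becomes "%XX"
  return decodeURIComponent(percentEncoded); // interpret the bytes as UTF-8
}

// This is what mime_word("Swedish åäö") yields before its final URI escaping.
const subjectHeader = "=?utf-8?Q?Swedish_=C3=A5=C3=A4=C3=B6?=";
console.log(decodeQWord(subjectHeader)); // Swedish åäö
```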

For me, the UTF-8 links and the MIME-word links work in Thunderbird. Only the plain UTF-8 links work in the built-in Windows 10 Mail app and in my up-to-date version of Outlook.

Author: gusjap

Updated on June 24, 2022

Comments

  • gusjap
    gusjap almost 2 years

    I know that the escape function has been deprecated and that you should use encodeURI or encodeURIComponent instead. However, encodeURI and encodeURIComponent don't do the same thing as escape.

    I want to create a mailto link in JavaScript with Swedish åäö. Here is a comparison between escape, encodeURIComponent and encodeURI:

    var subject="My subject with åäö";
    var body="My body with more characters and swedish åäö";
    
    console.log("mailto:?subject="+escape(subject)+"&body=" + escape(body));
    console.log("mailto:?subject="+encodeURIComponent(subject)+"&body=" + encodeURIComponent(body));
    console.log("mailto:?subject="+encodeURI(subject)+"&body=" + encodeURI(body));  
    Output:
    mailto:?subject=My%20subject%20with%20%E5%E4%F6&body=My%20body%20with%20more%20characters%20and%20swedish%20%E5%E4%F6
    mailto:?subject=My%20subject%20with%20%C3%A5%C3%A4%C3%B6&body=My%20body%20with%20more%20characters%20and%20swedish%20%C3%A5%C3%A4%C3%B6
    mailto:?subject=My%20subject%20with%20%C3%A5%C3%A4%C3%B6&body=My%20body%20with%20more%20characters%20and%20swedish%20%C3%A5%C3%A4%C3%B6 
    

    Only the mailto link created with "escape" opens a properly formatted mail in Outlook using IE or Chrome. When using encodeURI or encodeURIComponent the subject says:

    My subject with Ã¥Ã¤Ã¶
    

    and the body is also looking messed up.

    Is there some other function besides escape that I can use to get the working mailto link?

    • Cyclonecode
      Cyclonecode over 9 years
      What encoding are you using, have you tried using utf-8?
    • gusjap
      gusjap over 9 years
      I'm using UTF-8 encoding.
    • gusjap
      gusjap over 9 years
      I did notice now that escape is not working in Firefox, so I'll have to use encodeURIComponent in the Firefox case. Error in Firefox: NS_ERROR_ILLEGAL_VALUE: Component returned failure code: 0x80070057 (NS_ERROR_ILLEGAL_VALUE) [nsIDOMLocation.href]
    • gusjap
      gusjap over 9 years
      The best solution I've come up with is to define my own escape function instead of using the deprecated one. Here is one example of how the escape function could be implemented: cwestblog.com/2011/05/23/escape-unescape-deprecated
  • Ralph King
    Ralph King about 6 years
    Those two functions don't do the same thing as escape. The question even states this. Also, W3 schools is not a great resource to link to.
  • localhostdotdev
    localhostdotdev about 5 years
    s = 'a?b/c<d>e"f\'g'; console.log(escape(s), encodeURI(s), encodeURIComponent(s)) #=> a%3Fb/c%3Cd%3Ee%22f%27g a?b/c%3Cd%3Ee%22f'g a%3Fb%2Fc%3Cd%3Ee%22f'g, only escape protects against XSS correctly (not possible to escape a quotes attribute)
  • C Perkins
    C Perkins almost 5 years
    In a test of Chrome 74.0.3729.169 (64-bit) on Windows 10, only the UTF-8 links worked to produce the correct unicode characters. The other resulted in either undetermined characters (�) or just kept the string of %hex escape codes.
  • T S
    T S almost 5 years
    @CPerkins You specify the browser, you used to click on the links, but AFAIK the behavior is irrelevant of the browser. The result does however depend on the mail-client that is opened when interpreting mailto:-links. It would therefore be nice, if you could specify which mail-client is opened, when you click on the links. If the links open a new tab in your browser, that means a webmail service is registered as mailto:-handler. In that case, please specify, which webmail service is used. (gmail.com or outlook.com or roundcube or ... ?)
  • C Perkins
    C Perkins almost 5 years
    I intended to mention that I was using a browser-based mail client, so in that case the browser's behavior and initial interpretation of the URI is definitely relevant. To explain further, the mailto URIs were produced and clicked in Firefox 67 which opened Chrome. The browser is not irrelevant anyway, since this is all about producing the correct link within the browser, so between producing the URI to passing it off to the mailto client, I suppose there is a dependence on the browser there too. FYI, I confirmed that Thunderbird correctly interprets the UTF-8 and mime-word URIs.