HTML-Entity escaping to prevent XSS

Solution 1

I use the OWASP ESAPI library as well. To escape strings for different output contexts, use:

String html = ESAPI.encoder().encodeForHTML("hello < how > are 'you'");
String html_attr = ESAPI.encoder().encodeForHTMLAttribute("hello < how > are 'you'");
String js = ESAPI.encoder().encodeForJavaScript("hello < how > are 'you'");

HTML (assume jsp)

<tag attr="<%= html_attr %>" onclick="alert('<%= js %>')"><%= html %></tag>
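To make the JavaScript context a little more concrete, here is a rough sketch in plain Java of one conservative approach to encoding a value for a JavaScript string: every character that is not an ASCII letter or digit is replaced with a hexadecimal escape. This is only an illustration under my own assumptions, not ESAPI's actual codec. The names `JsStringEscapeSketch` and `escapeForJsString` are hypothetical, and real code should call the library.

```java
// Illustrative sketch only: NOT the ESAPI implementation.
// Class and method names are hypothetical.
final class JsStringEscapeSketch {

    private JsStringEscapeSketch() { }

    /**
     * Replaces every character that is not an ASCII letter or digit with a
     * hexadecimal escape, so the result cannot close a quoted JavaScript string.
     */
    static String escapeForJsString(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric) {
                out.append(c);
            } else if (c < 256) {
                // Two-digit hex escape for code units below 256.
                out.append(String.format("\\x%02X", (int) c));
            } else {
                // Four-digit hex escape for every other UTF-16 code unit.
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeForJsString("hello < how > are 'you'"));
    }
}
```

Because the output contains only letters, digits and backslashes, a quote in the input can no longer terminate the surrounding string literal.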

Update (2017)

The ESAPI encoders are now considered legacy, and a better, actively maintained alternative exists. I strongly recommend using the OWASP Java Encoder instead.

If your project already uses ESAPI, an integration has been added that lets you use this library for encoding instead.

The usage is explained on their wiki page, but for the sake of completeness, this is how you can use it to contextually encode your data:

import org.owasp.encoder.Encode;

// HTML Context
String html = Encode.forHtml("u<ntrus>te'd'");

// HTML Attribute Context
String htmlAttr = Encode.forHtmlAttribute("u<ntrus>te'd'");

// JavaScript Attribute Context
String jsAttr = Encode.forJavaScriptAttribute("u<ntrus>te'd'");

HTML (assume jsp)

<div data-attr="<%= htmlAttr %>" onclick="alert('<%= jsAttr %>')">
    <%= html %>
</div>

PS: The library supports more contexts than the ones shown here.
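For readers who want to see roughly what HTML-context encoding produces, here is a minimal sketch in plain Java. It is not the OWASP Java Encoder implementation. The names `HtmlEscapeSketch` and `escapeHtml` are hypothetical, and the character set is the one quoted from the OWASP guidance in the comments (`&`, `<`, `>`, `"`, `'`, `/`).

```java
// Illustrative sketch only: NOT the OWASP Java Encoder implementation.
// Class and method names are hypothetical.
final class HtmlEscapeSketch {

    private HtmlEscapeSketch() { }

    /** Escapes & < > " ' / for output in an HTML body or a quoted attribute. */
    static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                case '/':  out.append("&#x2F;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: u&lt;ntrus&gt;te&#x27;d&#x27;
        System.out.println(escapeHtml("u<ntrus>te'd'"));
    }
}
```

A sketch like this only covers the HTML body and quoted-attribute contexts. It does nothing useful for JavaScript, CSS or URL contexts, which is why a contextual encoder library is the safer choice.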

Solution 2

I recommend using the Apache Commons Lang library to escape strings. For example, to escape HTML:

String escapedString = org.apache.commons.lang.StringEscapeUtils.escapeHtml(str);

The library has many useful methods for escaping HTML, XML, and JavaScript.
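One caveat, raised in the comments about `StringEscapeUtils`: an escaper that leaves the single quote untouched is only safe inside double-quoted attributes. The sketch below illustrates why, using two hypothetical hand-written escapers in plain Java (neither is the Commons Lang implementation) and an assumed payload string.

```java
// Illustrative sketch only: both escapers are hypothetical stand-ins,
// not the Apache Commons Lang implementation.
final class QuoteContextSketch {

    private QuoteContextSketch() { }

    /** Escapes & < > and the double quote, but leaves ' untouched. */
    static String escapeWithoutApostrophe(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Same as above, and additionally escapes the single quote. */
    static String escapeWithApostrophe(String s) {
        return escapeWithoutApostrophe(s).replace("'", "&#x27;");
    }

    public static void main(String[] args) {
        // Assumed attacker-controlled value aimed at a single-quoted attribute.
        String payload = "x' onmouseover='alert(1)";

        // The quote survives, so the value closes the attribute and injects a handler:
        // <input value='x' onmouseover='alert(1)'>
        System.out.println("<input value='" + escapeWithoutApostrophe(payload) + "'>");

        // The quote is encoded, so the value stays inside the attribute:
        // <input value='x&#x27; onmouseover=&#x27;alert(1)'>
        System.out.println("<input value='" + escapeWithApostrophe(payload) + "'>");
    }
}
```

If every attribute in the templates is enclosed in double quotes, the first escaper is enough for that context. Encoding the single quote as well removes the dependency on that convention.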

Author by

Christian Kuetbach

Hello my name is Christian Kütbach. I'm a software developer from Germany.

Updated on September 07, 2020

Comments

  • Christian Kuetbach
    Christian Kuetbach over 3 years

    I have some user input. Within my code, I ensure that the following symbols are escaped:

    & -> &amp; 
    < -> &lt; 
    > -> &gt;
    

    OWASP states that there are more chars to be escaped.

    For attributes, I do another kind of escaping:

    & -> &amp; 
    " -> &quot;
    

    All attributes are guaranteed to be enclosed in ". This makes me confident about my HTML attributes, but not about the HTML itself.

    I wonder if my escaping is sufficient. I've read this post, but I'm still not sure about my concern.

    (JavaScript is escaped with the OWASP library.)

    • Joop Eggen
      Joop Eggen over 12 years
      ' -> &apos; and % -> &perc; (for XSS, encoding chars per %34 etc.)
    • Gumbo
      Gumbo over 12 years
      @JoopEggen In what case would replacing % by &perc; be useful?
    • Joop Eggen
      Joop Eggen over 12 years
      @Gumbo &perc; is indeed less useful against XSS, but it can obfuscate urls. Browsers do not take a % code for its char, i.e.: <a href="%6Aavascript:alert('hi')"> does not invoke javascript.
  • Christian Kuetbach
    Christian Kuetbach over 12 years
    As I mentioned, I use OWASP for escaping JavaScript strings. But I have some legacy code, which is produced by Apache Cocoon. This code does the escaping as I described. My question is: is that escaping sufficient? If not (and only if not), I will have to modify ~200 XSL stylesheets line by line.
  • epoch
    epoch over 12 years
    IMHO, I do not think it is sufficient. Just by checking this site (ha.ckers.org/xss.html) you can already determine that your escaping is not sufficient.
  • epoch
    epoch over 12 years
    @ckuetbach, does that answer your question?
  • Christian Kuetbach
    Christian Kuetbach over 12 years
    I think my escaping should be enough. Attributes and JavaScript are escaped as described at OWASP. Only within pure HTML is my escaping less strict than OWASP says it should be. But at ha.ckers.org I can't find any HTML-body-only XSS which would work if < and > are escaped.
  • avgvstvs
    avgvstvs almost 10 years
    I don't think the common lang lib is tested for deliberately malicious input in the same way that ESAPI is.
  • Preston Badeer
    Preston Badeer over 9 years
    2014/2015 Update: I would highly recommend using this as a reference for avoiding XSS attacks. It's by the OWASP people as well: owasp.org/index.php/…
  • Guillaume Polet
    Guillaume Polet about 9 years
    Actually, StringEscapeUtils does not escape single-quote ' into &apos;, so it is not suited for HTML escaping to prevent XSS
  • Sarah
    Sarah over 7 years
    @PrestonBadeer Yes, I'm reading this now. Can I ask you a question about this article? It says to escape the following characters: & --> &amp; < --> &lt; > --> &gt; " --> &quot; ' --> &#x27; / --> &#x2F; I am using htmlspecialchars($string, ENT_QUOTES) to do this, but it is not escaping the /. Should I do this manually? Thanks for the help.
  • Preston Badeer
    Preston Badeer over 7 years
    @sarah ESAPI is an open source library that provides a function that is safer than using htmlspecialchars(). Their example in the link I posted is: String safe = ESAPI.encoder().encodeForHTML( request.getParameter( "input" ) );
  • Sarah
    Sarah over 7 years
    @PrestonBadeer Cool thanks. so I might use that instead.. Is it the latest zip file here (released in sept 2013) that i should download?.. thanks code.google.com/p/owasp-esapi-java/downloads/list
  • Sarah
    Sarah over 7 years
    @PrestonBadeer nevermind my question. I was looking for the javascript one and I've just found it here: code.google.com/archive/p/owasp-esapi-js/downloads for anyone else that is looking... thanks
  • Venkaiah Yepuri
    Venkaiah Yepuri over 7 years
    @WillieScholtz: Hi WillieScholtz, I have gone through your solution and it is really helpful, but in my case I am doing the attribute encoding directly at the attribute level in my JSP, like <input id="rlv" type="text" name="releaseVersion" maxlength="250" size="25" value="<%=$ESAPI.encoder().encodeForHTMLAttribute(request.getAttribute("relV"))%>" style="font-size:12px;font-family:arial,helvetica,sans-serif;" onkeyup = "releaseVersionSearch();" onblink="releaseVersionSearch();" class="modulecontent" />. When I do this, my page does not load in the browser.
  • Venkaiah Yepuri
    Venkaiah Yepuri over 7 years
    @WillieScholtz: And in the browser console I can see this error: net::ERR_INCOMPLETE_CHUNKED_ENCODING. Can you please help me find what I am doing wrong here? Ty.
  • epoch
    epoch over 7 years
    @Venki, it would be best to ask a new question
  • Venkaiah Yepuri
    Venkaiah Yepuri over 7 years
    @WillieScholtz: Sure, Willie Scholtz. Let me create a new question for this. Ty.