MvcHtmlString.ToHtmlString() not encoding HTML?

Solution 1

MvcHtmlString (or HtmlString, or anything else that implements IHtmlString) is for strings that should be emitted as HTML verbatim. By wrapping a value in an MvcHtmlString you're telling ASP.NET that you actually want those HTML tags in the output.

The difference shows up when you emit the string into an ASP.NET page using <%: .. %> (new in ASP.NET 4). In that case the ASP.NET engine automatically HTML-encodes regular strings (and anything else that doesn't implement IHtmlString), whereas an MvcHtmlString is emitted into the page verbatim, unencoded.
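
To make that rule concrete, here is a minimal sketch of the dispatch that <%: .. %> performs. It is not the framework's actual code: IRawHtml, RawHtml and PageWriter are hypothetical stand-ins for IHtmlString, MvcHtmlString and the page's output step, declared locally so the example compiles without a reference to System.Web.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for System.Web.IHtmlString (assumption, not the real type).
public interface IRawHtml
{
    string ToHtmlString();
}

// Hypothetical stand-in for MvcHtmlString: stores the value and returns it unchanged.
public sealed class RawHtml : IRawHtml
{
    private readonly string _value;

    public RawHtml(string value) { _value = value ?? string.Empty; }

    public string ToHtmlString() { return _value; }

    public override string ToString() { return _value; }
}

public static class PageWriter
{
    // Mimics <%: value %>: trust anything marked as raw HTML, encode everything else.
    public static string Emit(object value)
    {
        if (value == null) return string.Empty;

        var raw = value as IRawHtml;
        if (raw != null) return raw.ToHtmlString();

        return WebUtility.HtmlEncode(Convert.ToString(value));
    }
}

public static class Program
{
    public static void Main()
    {
        const string input = "<b>hi</b>";

        Console.WriteLine(PageWriter.Emit(input));              // &lt;b&gt;hi&lt;/b&gt;
        Console.WriteLine(PageWriter.Emit(new RawHtml(input))); // <b>hi</b>
    }
}
```

The point of the sketch is that the wrapper never encodes anything itself. If the input is untrusted, it has to be encoded before it is wrapped.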

So I think the documentation is wrong. There's a Connect ticket for the equivalent error in the HtmlString constructor documentation, which they did fix. (I thought I filed that :-/ Maybe mine got closed as a duplicate of someone else's?) I didn't notice that the MvcHtmlString documentation was wrong too.

Solution 2

The MSDN documentation is correct, but perhaps a bit confusing. The MvcHtmlString class and the IHtmlString interface are used to represent a string that has already been HTML-encoded. MSDN says:

Returns an HTML-encoded string that represents the current object.

The string you passed in to the MvcHtmlString object is treated as already HTML-encoded, so both .ToString() and .ToHtmlString() merely return the string you passed in, unchanged.

Please note that the MSDN docs do clearly state that:

The ToHtmlString and ToString methods return the same value.

So why have all this? Two reasons:

  1. In the Razor view engine and in ASP.NET Web Forms v4 an object that implements IHtmlString is written out as raw data. The view engines assume that the person creating the IHtmlString has already sanitized the data.
  2. The IHtmlString has its own stringify method so that it need not have the same implementation as ToString(). While ToHtmlString() must return the HTML, you could easily imagine that ToString() might return some developer-friendly debug information.
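
The second reason can be illustrated with a small sketch in which the two methods deliberately differ. LinkHtml and the local IRawHtml interface are hypothetical names invented for this example (IRawHtml stands in for IHtmlString so the code compiles without System.Web).

```csharp
using System;
using System.Net;

// Hypothetical stand-in for System.Web.IHtmlString (assumption, not the real type).
public interface IRawHtml
{
    string ToHtmlString();
}

// Hypothetical type whose HTML form and debug form are different strings.
public sealed class LinkHtml : IRawHtml
{
    private readonly string _url;
    private readonly string _text;

    public LinkHtml(string url, string text)
    {
        _url = url;
        _text = text;
    }

    // The markup a view engine would write to the page; the parts are encoded here,
    // because the caller of ToHtmlString() will not encode the result again.
    public string ToHtmlString()
    {
        return "<a href=\"" + WebUtility.HtmlEncode(_url) + "\">"
             + WebUtility.HtmlEncode(_text) + "</a>";
    }

    // Developer-friendly description, e.g. for logs or the debugger.
    public override string ToString()
    {
        return "LinkHtml(url=" + _url + ", text=" + _text + ")";
    }
}

public static class Program
{
    public static void Main()
    {
        var link = new LinkHtml("/cartoons?id=1", "Tom & Jerry");

        Console.WriteLine(link.ToHtmlString()); // <a href="/cartoons?id=1">Tom &amp; Jerry</a>
        Console.WriteLine(link.ToString());     // LinkHtml(url=/cartoons?id=1, text=Tom & Jerry)
    }
}
```

MvcHtmlString itself does not make this distinction (both methods return the stored value), but the interface leaves room for types that do.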
Author by

Robert Muehsig

Updated on September 15, 2022

Comments

  • Robert Muehsig
    Robert Muehsig over 1 year

    Related to this question: I'm playing around with XSS issues in my ASP.NET MVC project and I'm confused by the MvcHtmlString.ToHtmlString() method. The documentation says "Returns an HTML-encoded string that represents the current object.", but it doesn't work in my case:

        var mvcHtmlString = MvcHtmlString.Create("<SCRIPT/XSS SRC=\"htpp://ha.ckers.org/css.js\">").ToHtmlString();
    
        var encoded = HttpUtility.HtmlEncode("<SCRIPT/XSS SRC=\"htpp://ha.ckers.org/css.js\">");
    

    Output of mvcHtmlString

    <SCRIPT/XSS SRC="htpp://ha.ckers.org/css.js">
    

    Output of encoded <-- this is the behaviour I would expect!

    &lt;SCRIPT/XSS SRC=&quot;htpp://ha.ckers.org/css.js&quot;&gt;
    

    Did I miss something?

  • Robert Muehsig
    Robert Muehsig about 12 years
    If it's just a documentation issue, what's the purpose of the ToHtmlString method? From my example: there is no difference between calling the "ToHtmlString()" method and the "ToString()" method. Both will output the bad, unencoded HTML.
  • Rup
    Rup about 12 years
    It's the method on the IHtmlString interface. It really means "return HTML content to be inserted into the page" - i.e. this is what ASP.NET MVC 4 will call and emit the results without further encoding. I guess it's so you can do something different in other classes but you're right there's no difference here.
  • Rup
    Rup about 12 years
    IMO it's "Returns a string of HTML content": it's not necessarily an HTML-encoded string, and it's mentioning encoding here which was misleading to the OP. I do think there's room for improvement here - in particular the "same value" remark (which is only on the MvcHtmlString docs and not HtmlString) should clarify that this is the value used to construct the object untransformed.