Why do <%= note.html_safe %> and <%= h note.html_safe %> give the same result in Rails 3?

ruby-on-rails ruby-on-rails-3 xss html-escape html-safe


As you can see, calling html_safe on a string turns it into an HTML-safe SafeBuffer:

http://github.com/rails/rails/blob/89978f10afbad3f856e2959a811bed1982715408/activesupport/lib/active_support/core_ext/string/output_safety.rb#L87

Any operation on a SafeBuffer that could affect the string's safety is passed through h().

h uses this flag to avoid double escaping:

http://github.com/rails/rails/blob/89978f10afbad3f856e2959a811bed1982715408/activesupport/lib/active_support/core_ext/string/output_safety.rb#L18
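To make the point about operations on a SafeBuffer concrete, here is a minimal plain-Ruby sketch. `SketchSafeBuffer` is a hypothetical stand-in written for illustration, not the Rails source, and it uses `ERB::Util.html_escape` from the standard library in place of Rails' `h`. It only models appending; the real class handles more cases.

```ruby
require "erb"

# Hypothetical stand-in for ActiveSupport::SafeBuffer (illustration only).
class SketchSafeBuffer < String
  # The "flag": instances of this class always report themselves as safe.
  def html_safe?
    true
  end

  # Appending could compromise safety, so anything that is not already
  # marked safe is escaped before it is added to the buffer.
  def concat(value)
    safe = value.respond_to?(:html_safe?) && value.html_safe?
    super(safe ? value : ERB::Util.html_escape(value))
  end
  alias << concat
end

buffer = SketchSafeBuffer.new("<p>")
buffer << "<b>bold</b>"                 # plain String: escaped on the way in
buffer << SketchSafeBuffer.new("</p>")  # already safe: appended as-is
puts buffer
```

The buffer stays trustworthy as a whole because untrusted pieces are escaped at the moment they are appended, not when the buffer is finally printed.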

The behavior did change, and I think you are mostly correct about how it works. In general, you should not call html_safe on a string unless you are sure it has already been sanitized. Like anything, you have to be careful when using it.
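As a rough illustration of "already sanitized", the sketch below builds markup around an untrusted value in plain Ruby. The variable names and the sample input are made up for the example, and `ERB::Util.html_escape` stands in for `h`. The only string here that would be reasonable to mark html_safe in Rails is the one whose untrusted part was escaped first.

```ruby
require "erb"

# Hypothetical user-supplied value containing an injection attempt.
user_name = "<script>alert(1)</script>"

# Risky: the raw value lands inside the markup. Marking this string
# html_safe would send the script tag to the browser untouched.
unsafe_markup = "<b>#{user_name}</b>"

# Safer: escape the untrusted part first, so only the markup we wrote
# ourselves stays live.
safe_markup = "<b>#{ERB::Util.html_escape(user_name)}</b>"

puts unsafe_markup
puts safe_markup
```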


Author by

nonopolarity


Updated on July 26, 2022

Comments

nonopolarity over 1 year
It feels like html_safe adds an abstraction to the String class that you have to understand in order to use it correctly. For example:
```
<%= '1 <b>2</b>' %>               # gives 1 &lt;b&gt;2&lt;/b&gt; in the HTML source code
<%= h '1 <b>2</b>' %>             # exactly the same as above
<%= '1 <b>2</b>'.html_safe %>     # 1 <b>2</b> in the HTML source code
<%= h '1 <b>2</b>'.html_safe %>   # exactly the same as above
<%= h (h '1 <b>2</b>') %>         # 1 &lt;b&gt;2&lt;/b&gt;, won't escape twice
```
On line 4, we are saying that we trust the string and it is safe, but why can't we still escape it? It seems that for h to escape a string, the string has to be unsafe.

So on line 1, a string that is not escaped by h is escaped automatically. On line 5, h cannot escape the string twice: once < has been changed to &lt;, h won't turn it into &amp;lt;.

So what's happening? At first, I thought html_safe just sets a flag on the string saying it is safe. Then why does h not escape it? It seems that h and html_safe actually cooperate through the flag:

1) If a string is html_safe, then h will not escape it.

2) If a string is not html_safe, then when the string is added to the output buffer, it will be automatically escaped by h.

3) If h has already escaped a string, the result is marked html_safe, so escaping it again with h has no effect (as on line 5). That behavior is the same in Rails 2.3.10, but in Rails 2.3.5 h can actually escape a string twice. So in Rails 2.3.5 h is a simple escape method, but somewhere along the line to 2.3.10 it became less simple. Rails 2.3.10 won't auto-escape a string, yet for some reason the method html_safe already exists there (for what purpose?).
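The three guessed rules can be tried out in plain Ruby. This is only a sketch of the behavior described above, not how Rails implements it: `SketchSafe`, `safe?` and `emit` are hypothetical names, `h` is redefined locally on top of `ERB::Util.html_escape`, and the sample string with a `<b>` tag is an assumption.

```ruby
require "erb"

# Hypothetical stand-in for a string flagged as safe (SafeBuffer in Rails).
class SketchSafe < String
  def html_safe?
    true
  end
end

# Plain Ruby strings have no html_safe? method, so check for it first.
def safe?(value)
  value.respond_to?(:html_safe?) && value.html_safe?
end

# Rules 1 and 3: safe input is returned untouched, and the escaped
# result is itself flagged safe, so a second call changes nothing.
def h(value)
  return value if safe?(value)
  SketchSafe.new(ERB::Util.html_escape(value))
end

# Rule 2: what happens when a value reaches the output buffer via <%= %>.
def emit(value)
  h(value).to_s
end

raw = "1 <b>2</b>"
results = [
  emit(raw),                     # line 1: auto-escaped
  emit(h(raw)),                  # line 2: escaped once by h, not again
  emit(SketchSafe.new(raw)),     # line 3: trusted, passed through
  emit(h(SketchSafe.new(raw))),  # line 4: h leaves a safe string alone
  emit(h(h(raw)))                # line 5: second h is a no-op
]
puts results
```

Under these assumptions, lines 1, 2 and 5 produce the escaped form and lines 3 and 4 produce the raw markup, which matches the five template results in the question.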

Is that exactly how it works? Nowadays, when we don't get the output we want, we often just add html_safe to the variable, which can be quite dangerous because it can introduce an XSS vulnerability. So understanding exactly how it works can be quite important. The above is only a guess at how it works. Could it actually be a different mechanism, and is there any documentation that supports it?
nonopolarity over 13 years

Interesting... it uses a new class, SafeBuffer, as the "flag", so "foobar".html_safe actually creates and returns a new instance of SafeBuffer with the content of the original string.