Is there an HTML safe truncate method in Rails?

23,914

Solution 1

There are two completely different solutions, both with the same name: truncate_html.

  1. https://github.com/ianwhite/truncate_html : This is a gem that uses an HTML parser (Nokogiri).
  2. https://github.com/hgmnz/truncate_html : This is a file you put in your helpers directory. It uses regular expressions and has no dependencies.
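
To make the regex-based idea concrete, here is a minimal sketch in plain Ruby with no dependencies. It is not the code of either truncate_html project. The name truncate_markup and the :length option are borrowed from the question's example, the :omission option is my own addition, and the sketch assumes reasonably well-formed HTML (no ">" inside attribute values or comments). It counts only visible characters, treats each entity as one character, and closes any tags left open at the cut.

```ruby
# Hypothetical helper for illustration only; not taken from either truncate_html project.
VOID_TAGS = %w[area base br col embed hr img input link meta source track wbr].freeze

# A token is a tag, an entity, a run of plain text, or a stray "<" or "&".
TOKEN = /<[^>]+>|&#?\w+;|[^<&]+|[<&]/

def truncate_markup(html, length:, omission: "")
  remaining = length   # visible characters we may still emit
  open_tags = []       # names of tags opened but not yet closed
  out = +""
  truncated = false

  html.scan(TOKEN) do |token|
    if (m = token.match(%r{\A</(\w+)}))
      # Closing tag: always emitted, costs nothing.
      out << token
      open_tags.pop if open_tags.last == m[1].downcase
    elsif remaining.zero?
      # Budget used up and more content follows: stop here.
      truncated = true
      break
    elsif (m = token.match(/\A<(\w+)/))
      # Opening tag: emitted, costs nothing, remembered unless it is void.
      out << token
      name = m[1].downcase
      open_tags << name unless VOID_TAGS.include?(name) || token.end_with?("/>")
    elsif token.start_with?("<") && token.length > 1
      # Comments, doctype and similar: passed through uncounted.
      out << token
    elsif token.match?(/\A&#?\w+;\z/)
      # An entity renders as a single character.
      out << token
      remaining -= 1
    else
      piece = token[0, remaining]
      out << piece
      remaining -= piece.length
      if piece.length < token.length
        truncated = true
        break
      end
    end
  end

  out << omission if truncated
  open_tags.reverse_each { |name| out << "</#{name}>" }
  out
end

html = "123<a href='#'>456</a>7890"
puts truncate_markup(html, length: 5)
```

With the question's input this prints 123<a href='#'>45</a>. For real-world HTML a parser-based gem is the safer choice, since a regex tokenizer like this one will trip over unusual markup.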

Solution 2

The regular truncate helper works fine; just pass :escape => false as an option to keep the HTML intact, e.g.:

truncate(@html_text, :length => 230, :omission => "", :escape => false)

RubyOnRails.org

*Edit: I didn't read the question very carefully (or at all, TBH), so this answer does not solve this question. It IS the answer I happened to be looking for, though, so hopefully it helps one or two people :)
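
To show why this does not answer the original question, here is a small plain-Ruby stand-in for character-count truncation. The method naive_truncate is hypothetical and only mimics the basic behaviour (no separator, no escaping). The point is that the markup counts toward the length, so the cut can land inside a tag or leave a tag unclosed.

```ruby
# Hypothetical stand-in for plain character-count truncation.
# Assumption: no separator handling and no HTML escaping.
def naive_truncate(text, length:, omission: "...")
  return text if text.length <= length

  text[0, length - omission.length] + omission
end

html = "123<a href='#'>456</a>7890"

# The cut lands in the middle of the opening tag.
puts naive_truncate(html, length: 10, omission: "")

# The cut lands after "45", leaving the <a> tag unclosed.
puts naive_truncate(html, length: 17, omission: "")
```

Either result will break the surrounding layout if it is rendered as HTML, which is what the comments on this answer point out.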

Solution 3

You should solve this problem with CSS rather than Ruby. You are doing something that affects the DOM layout, and there is no way to programmatically devise a solution that will work consistently.

Let's say you get your HTML parser gem working, and you find a lowest common denominator character count that will work most of the time.

What happens if you change font sizes, or your site layout? You'll have to recalculate the character count again.

Or let's say your HTML has something like this in it: <p><br /></p><br /> That is zero characters, yet it would insert a big chunk of blank space. It could even be a <blockquote> or <code> tag with enough padding or margin to throw your layout totally out of whack.

Or the inverse: let's say you have this: 3&nbsp;&#8773;&nbsp;&#955; (3 ≅ λ). That is 26 characters long, but for display purposes it is only 5.
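
The two counts are easy to check in plain Ruby. This is only a sketch: visible_text is a hypothetical helper that decodes just the entity forms used in this example (decimal numeric entities and &nbsp;), not a general HTML entity decoder.

```ruby
# Hypothetical decoder for this example only:
# handles &nbsp; and decimal numeric entities, nothing else.
def visible_text(html)
  html
    .gsub("&nbsp;", "\u00A0")
    .gsub(/&#(\d+);/) { [$1.to_i].pack("U") }
end

raw = "3&nbsp;&#8773;&nbsp;&#955;"

puts raw.length                # length of the markup as stored
puts visible_text(raw).length  # length of what the browser displays
```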

The point is that character count tells you nothing about how something will render in the browser. Not to mention the fact that HTML parsers are hefty pieces of code that can at times be unreliable.

Here is some good CSS to deal with this. The :after pseudo-element adds a white fade to the last line of content, which makes for a very nice transition.

body { font-size: 16px; }
p { font-size: 1em; line-height: 1.2em; }
/* Maximum height math is:
   line-height * number of lines - 0.4
   the 0.4 offset is to make the cutoff look nicer */
.lines-3 { height: 3.2em; }
.lines-6 { height: 6.8em; }
.truncate { overflow: hidden; position: relative; }
.truncate:after {
    content: "";
    height: 1em;
    display: block;
    width: 100%;
    position: absolute;
    background-color: white;
    opacity: 0.8;
    bottom: -0.3em;
}

You can add as many .lines-x classes as you see fit. I used em but px is just as good.
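
If you want more of these classes, the height math from the CSS comment can be scripted. This is just a sketch in Ruby: lines_rule and the two constants are hypothetical names, and the values 1.2 and 0.4 are the line-height and cutoff offset used above.

```ruby
LINE_HEIGHT_EM = 1.2    # must match the line-height of the truncated text
CUTOFF_OFFSET_EM = 0.4  # cosmetic offset from the CSS comment above

# Builds one .lines-N rule: height = line-height * number of lines - offset.
def lines_rule(lines)
  height = (LINE_HEIGHT_EM * lines - CUTOFF_OFFSET_EM).round(1)
  ".lines-#{lines} { height: #{height}em; }"
end

puts [3, 6, 10].map { |n| lines_rule(n) }
```

For 3 and 6 lines this gives the same 3.2em and 6.8em heights as the hand-written rules.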

Then apply this to your element: <div class="truncate lines-3">....lots of stuff.. </div>

and the fiddle: http://jsfiddle.net/ke87h/

Solution 4

You could use the truncate_html plugin for this. It uses nokogiri and htmlentities gems and does exactly what the plugin name suggests.

Solution 5

We had this need at zendone.com. The problem was that the existing solutions were very slow when truncating long HTML documents (MBs) into shorter ones (KBs). I ended up coding a library based on Nokogiri called truncato. The library includes some benchmarks comparing its performance with other libs.

Author: Kmaczek

Updated on July 09, 2022

Comments

  • Kmaczek
    Kmaczek almost 2 years

    I have a string of HTML in Rails. I'd like to truncate the string after a certain number of characters, not including the HTML markup. Also, if the split happens to fall between an opening and a closing tag, I'd like to close the open tag(s). For example:

    html = "123<a href='#'>456</a>7890"
    truncate_markup(html, :length => 5) --> "123<a href='#'>45</a>"
    
  • Don Cruickshank
    Don Cruickshank almost 11 years
    This is how I approached the problem in the site I work for. When JavaScript is available, I truncate characters off the end until it fits, with an ellipsis at the end. Doing the truncate server-side by a number of characters can lead to jagged results when a line has a lot of thin or wide characters.
  • Arcolye
    Arcolye almost 11 years
    This would strip out all the html, wouldn't it?
  • Fernando Kosh
    Fernando Kosh over 10 years
    Add 'separator' param to prevent word crop: truncate(html.gsub(/(<[^>]+>)/, ''), length: 5, separator: ' ')
  • Alaric
    Alaric over 10 years
    Are there any newer gems that are still maintained and support Rails 4?
  • alexpls
    alexpls over 10 years
    Might as well use Rails' strip_tags helper to do this.
  • penner
    penner about 10 years
    Google brought me here and it was what I was looking for. Thanks.
  • Yuval Karmi
    Yuval Karmi about 10 years
    Thank you! This is exactly what I was looking for.
  • Jesse Fisher
    Jesse Fisher about 10 years
    You helped me. Quotes were showing up as their entity codes, such as &ldquo;, but with the escape option set to false it works how I want. Thank you.
  • phillyslick
    phillyslick about 10 years
    This is great. You can sanitize / strip html tags on the server side if you need to strip specific elements as well.
  • Daniel
    Daniel about 10 years
    @RyanClark I would go with hgmnz/truncate_html. It's based on regular expressions and should work with any Rails version as long as the Ruby versions are compatible.
  • Adriano Resende
    Adriano Resende over 8 years
    "...not including the HTML markup": this code works with HTML.
  • Adriano Resende
    Adriano Resende over 8 years
    This question is for Rails, check title: "Is there an HTML safe truncate method in Rails?"
  • Alexis
    Alexis over 8 years
    Nice, I used it like this: sanitize(sanitize("my text <em>with html</em>", tags: ['h1']).truncate(150)). The outer sanitize will complete the missing markup, the inner sanitize will remove my unwanted tags, and the truncate at the end will truncate like an html_safe. Thanks.
  • Vlad
    Vlad over 6 years
    So much cleaner! Good job
  • dcangulo
    dcangulo over 5 years
    Unfortunately, hgmnz/truncate_html removes special characters. The code truncate_html('<p>aàáâäæãåāeèéêëēėęiîïíīįìoôöòóœøōõuûüùúū</p>', :length => 10) has an expected output of <p>aàáâäæãåāe</p>, but instead it returns <p>aeiou</p>.
  • Abram
    Abram almost 5 years
    Cool, but if there's an open tag on the truncated HTML, you will end up breaking your layout.
  • Abram
    Abram almost 5 years
    Using this one.