Is there an HTML safe truncate method in Rails?
Solution 1
There are two completely different solutions, both named truncate_html:
- https://github.com/ianwhite/truncate_html : This is a gem and uses an html parser (nokogiri)
- https://github.com/hgmnz/truncate_html : This is a file you put in your helpers directory. It uses regular expressions and has no dependencies.
Solution 2
The regular truncate function works fine; just pass :escape => false as an option to keep the HTML intact, e.g.:
truncate(@html_text, :length => 230, :omission => "" , :escape => false)
Edit: I didn't read the question very carefully (or at all, TBH), so this answer does not solve the question. It is the answer I happened to be looking for, though, so hopefully it helps one or two people :)
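To see why a character-count cut on raw HTML does not answer the original question, here is a minimal sketch. It assumes plain String slicing as a stand-in for a character-count truncate, and the sample markup is made up for illustration. The tag bookkeeping is deliberately naive (it ignores void tags and repeated tag names).

```ruby
# Plain String slicing stands in for a character-count truncate (an assumption).
html = "<p>Hello <strong>wonderful</strong> world</p>"

cut = html[0, 20]
puts cut

# Naive bookkeeping: names of tags opened and closed inside the cut string.
opened = cut.scan(/<([a-z][a-z0-9]*)\b[^>]*>/i).flatten
closed = cut.scan(/<\/([a-z][a-z0-9]*)>/i).flatten
unclosed = opened - closed

puts "unclosed tags: #{unclosed.inspect}"
```

The cut string still contains opening tags with no matching close, which is the layout-breaking case mentioned in the comments.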
Solution 3
You should solve this problem with CSS rather than Ruby. You are doing something that affects the DOM layout, and there is no way to programmatically devise a solution that will work consistently.
Let's say you get your HTML parser gem working, and you find a lowest common denominator character count that will work most of the time.
What happens if you change font sizes, or your site layout? You'll have to recalculate the character count again.
Or let's say your HTML has something like this in it: <p><br /></p><br />
That is zero characters, yet it inserts a big chunk of blank space. It could even be a <blockquote> or <code> tag with enough padding or margin to throw your layout totally out of whack.
Or the inverse: let's say you have character entities that render as 3 ≅ λ. The source is 26 characters long, but for display purposes it is only 5.
The point is that character count tells you nothing about how something will render in the browser. Not to mention that HTML parsers are hefty pieces of code that can at times be unreliable.
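The gap between source length and displayed length can be checked with a small sketch. The helper name decode_numeric_entities and the sample string are hypothetical (this is not the answer's original markup), and only decimal numeric character references are decoded.

```ruby
# Decode decimal numeric character references such as &#955;.
# Named entities (&amp;, &lambda;, ...) are intentionally not handled here.
def decode_numeric_entities(text)
  text.gsub(/&#(\d+);/) { [Regexp.last_match(1).to_i].pack("U") }
end

source   = "3 &#8773; &#955;" # hypothetical markup for illustration
rendered = decode_numeric_entities(source)

puts source.length   # 16 characters of source
puts rendered.length # 5 characters once rendered
```

Any truncation that counts source characters would treat this short string as more than three times its visible length.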
Here is some good CSS to deal with this. The :after pseudo-element adds a white fade over the last line of content, which makes for a very nice transition.
body { font-size: 16px; }
p { font-size: 1em; line-height: 1.2em; }
/* Maximum height math is:
   line-height * number of lines - 0.4
   the 0.4 offset is to make the cutoff look nicer */
.lines-3 { height: 3.2em; }
.lines-6 { height: 6.8em; }
.truncate { overflow: hidden; position: relative; }
.truncate:after {
  content: "";
  height: 1em;
  display: block;
  width: 100%;
  position: absolute;
  background-color: white;
  opacity: 0.8;
  bottom: -0.3em;
}
You can add as many .lines-x classes as you see fit. I used em, but px is just as good.
Then apply this to your element: <div class="truncate lines-3">....lots of stuff.. </div>
and the fiddle: http://jsfiddle.net/ke87h/
Solution 4
You could use the truncate_html plugin for this. It uses nokogiri and htmlentities gems and does exactly what the plugin name suggests.
Solution 5
We had this need at zendone.com. The problem was that the existing solutions were very slow when truncating long HTML documents (MBs) into shorter ones (KBs). I ended up coding a library based on Nokogiri called truncato. The library includes some benchmarks comparing its performance with other libs.
Kmaczek
Updated on July 09, 2022
Comments
-
Kmaczek almost 2 years
I have a string of HTML in Rails. I'd like to truncate the string after a certain number of characters not including the HTML markup. Also, if the split happens to fall in the middle of an opening and closing tag, I'd like to close the open tag/s. For example;
html = "123<a href='#'>456</a>7890"
truncate_markup(html, :length => 5) --> "123<a href='#'>45</a>"
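The behavior asked for above can be sketched in plain Ruby with a regular-expression tokenizer. This is a minimal illustration and not the code of any gem mentioned in the answers: truncate_markup here is a hypothetical helper, it counts each entity as several characters, and it assumes reasonably well-formed markup.

```ruby
# Tags that never take a closing tag, so they are not tracked as open.
VOID_TAGS = %w[area base br col embed hr img input link meta source track wbr].freeze

# Hypothetical helper: keeps at most `length` visible characters
# and closes any tags that are still open at the cut point.
def truncate_markup(html, length:)
  output    = +""
  open_tags = []
  remaining = length

  # Split the input into tags (<...>) and the runs of text between them.
  html.scan(/<[^>]+>|[^<]+/) do |token|
    if token.start_with?("</")
      name = token[%r{\A</\s*([a-zA-Z0-9]+)}, 1]
      open_tags.pop if open_tags.last == name
      output << token
    elsif token.start_with?("<")
      name = token[/\A<\s*([a-zA-Z0-9]+)/, 1]
      output << token
      unless name.nil? || token.end_with?("/>") || VOID_TAGS.include?(name.downcase)
        open_tags.push(name)
      end
    else
      output << token[0, remaining]
      remaining -= token.length
      break if remaining <= 0
    end
  end

  # Close whatever is still open, innermost first.
  open_tags.reverse_each { |name| output << "</#{name}>" }
  output
end

puts truncate_markup("123<a href='#'>456</a>7890", length: 5)
# prints 123<a href='#'>45</a>
```

A parser-based approach (such as the Nokogiri-backed gems above) is the safer choice for malformed or untrusted markup; the regular expression here only recognizes simple tags.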
-
Don Cruickshank almost 11 years
This is how I approached the problem on the site I work for. When JavaScript is available, I truncate characters off the end until the text fits, with an ellipsis at the end. Truncating server-side by a number of characters can lead to jagged results when a line has a lot of thin or wide characters.
-
Arcolye almost 11 years
This would strip out all the HTML, wouldn't it?
-
Fernando Kosh over 10 years
Add the 'separator' param to prevent cropping a word: truncate(html.gsub(/(<[^>]+>)/, ''), length: 5, separator: ' ')
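The strip-then-truncate idea in this comment can be tried outside Rails with a rough plain-Ruby stand-in. The helper strip_and_truncate and its keyword arguments are hypothetical, and it only approximates what a word-boundary truncate does: drop the tags, then cut at the last separator that leaves room for the omission marker.

```ruby
# Hypothetical stand-in: strip tags, then truncate on a word boundary.
def strip_and_truncate(html, length:, separator: " ", omission: "...")
  text = html.gsub(/<[^>]+>/, "") # drop anything that looks like a tag
  return text if text.length <= length

  room = [length - omission.length, 0].max # leave space for the omission marker
  stop = text.rindex(separator, room) || room
  text[0, stop] + omission
end

puts strip_and_truncate("<p>The <em>quick</em> brown fox jumps</p>", length: 18)
# prints The quick brown...
```

Note that this discards all markup, so it suits plain-text previews rather than the tag-preserving truncation the question asks for.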
-
Alaric over 10 years
Are there any newer gems that are still maintained and support Rails 4?
-
alexpls over 10 years
Might as well use Rails' strip_tags helper to do this.
-
penner about 10 years
Google brought me here and it was what I was looking for. Thanks.
-
Yuval Karmi about 10 years
Thank you! This is exactly what I was looking for.
-
Jesse Fisher about 10 years
You helped me. Quotes were showing up as their character codes (such as “), but with the escape option set to false it works how I want. Thank you.
-
phillyslick about 10 years
This is great. You can sanitize / strip HTML tags on the server side if you need to strip specific elements as well.
-
Daniel about 10 years
@RyanClark I would go with hgmnz/truncate_html. It's based on regular expressions and should work with any Rails version as long as the Ruby versions are compatible.
-
Adriano Resende over 8 years
"..not including the HTML markup": this code works with HTML.
-
Adriano Resende over 8 years
This question is for Rails; check the title: "Is there an HTML safe truncate method in Rails?"
-
Alexis over 8 years
Nice, I used it like this:
sanitize(sanitize("my text <em>with html</em>", tags: ['h1']).truncate(150))
The outer sanitize completes the missing markup, the inner sanitize removes my unwanted tags, and the truncate at the end truncates the result like an html_safe string. Thanks.
-
Vlad over 6 years
So much cleaner! Good job
-
dcangulo over 5 years
Unfortunately, hgmnz/truncate_html removes special characters. The code
truncate_html('<p>aàáâäæãåāeèéêëēėęiîïíīįìoôöòóœøōõuûüùúū</p>', :length => 10)
has an expected output of <p>aàáâäæãåāe</p>; instead it returns <p>aeiou</p>.
-
Abram almost 5 years
Cool, but if there's an open tag in the truncated HTML, you will end up breaking your layout.
-
Abram almost 5 years
Using this one.