Ruby post title to slug

27,821

Solution 1

slug = title.downcase.strip.gsub(' ', '-').gsub(/[^\w-]/, '')

downcase makes the string lowercase. strip removes any leading or trailing whitespace. The first gsub replaces spaces with hyphens. The second gsub removes every character that is not alphanumeric, a dash, or an underscore (the removed set is very close to \W, except that the dash is kept as well, which is why it's spelled out here).
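A minimal sketch of the chain above, wrapped in a helper so each step can be tried on sample input. The method name slugify is made up for illustration and is not part of the original answer.

```ruby
# Hypothetical helper (name chosen for illustration) wrapping the one-liner above.
def slugify(title)
  title
    .downcase            # "Hello, World!" -> "hello, world!"
    .strip               # drop leading and trailing whitespace
    .gsub(' ', '-')      # every single space becomes a hyphen
    .gsub(/[^\w-]/, '')  # remove anything that is not an ASCII letter, digit, "_" or "-"
end

puts slugify("  Hello, World!  ")   # prints "hello-world"
puts slugify("Ruby's post_title")   # prints "rubys-post_title"
puts slugify("hello - world")       # prints "hello---world"
```

The last call shows a limitation of this version: runs of hyphens are not collapsed, which is what the comments further down try to address.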

Solution 2

Is this Rails? (It also works in Sinatra, provided ActiveSupport is loaded.)

string.parameterize

That's it. For even more sophisticated slugging, see ActsAsUrl. It can do the following:

"rock & roll".to_url => "rock-and-roll"
"$12 worth of Ruby power".to_url => "12-dollars-worth-of-ruby-power"
"10% off if you act now".to_url => "10-percent-off-if-you-act-now"
"kick it en Français".to_url => "kick-it-en-francais"
"rock it Español style".to_url => "rock-it-espanol-style"
"tell your readers 你好".to_url => "tell-your-readers-ni-hao"

Solution 3

to_slug is a great Rails plugin that handles pretty much everything, including funky characters, but its implementation is very simple. Chuck it onto String and you'll be sorted. Here's the source condensed down:

String.class_eval do
  def to_slug
    value = self.mb_chars.normalize(:kd).gsub(/[^\x00-\x7F]/n, '').to_s
    value.gsub!(/[']+/, '')
    value.gsub!(/\W+/, ' ')
    value.strip!
    value.downcase!
    value.gsub!(' ', '-')
    value
  end
end
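The condensed source above depends on ActiveSupport for mb_chars. As a rough plain-Ruby sketch of the same steps, the version below swaps in Ruby's built-in String#unicode_normalize (available since Ruby 2.2) for the normalization step. The name plain_to_slug is hypothetical and not part of the plugin, and UTF-8 input is assumed.

```ruby
# Sketch only: a plain-Ruby take on the condensed source above.
# Assumes Ruby 2.2+ (String#unicode_normalize) and UTF-8 input.
# plain_to_slug is a hypothetical name, not part of the to_slug plugin.
def plain_to_slug(title)
  value = title.unicode_normalize(:nfkd)   # decompose accented letters, e.g. c-cedilla -> "c" + combining mark
  value = value.gsub(/[^\x00-\x7F]/, '')   # drop everything outside ASCII (including the combining marks)
  value = value.gsub(/[']+/, '')           # remove apostrophes
  value = value.gsub(/\W+/, ' ')           # each run of other non-word characters -> one space
  value.strip.downcase.gsub(' ', '-')      # trim, lowercase, hyphenate
end

puts plain_to_slug("kick it en Fran\u00e7ais")   # prints "kick-it-en-francais"
puts plain_to_slug("test")                       # prints "test"
```

Using the non-bang gsub here means every step returns a string even when nothing matches, so the nil return of gsub! never comes into play.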

Solution 4

I've used this gem. It's simple but helpful.

https://rubygems.org/gems/string_helpers

Author by

ma11hew28

Updated on July 15, 2022

Comments

  • ma11hew28
    ma11hew28 almost 2 years

    How should I convert a post title to a slug in Ruby?

    The title can have any characters, but I only want the slug to allow [a-z0-9-_] (Should it allow any other characters?).

    So basically:

    • downcase all letters
    • convert spaces to hyphens
    • delete extraneous characters
  • Bruce
    Bruce over 13 years
    Your character class could be expressed more concisely as /[^\w-]/.
  • ma11hew28
    ma11hew28 over 13 years
    Thanks, Ben. I added some more complexity to account for . \ / and to remove multiple -'s in a row and remove them from the end: slug = title.strip.downcase.gsub(/[\s\.\/\\]/, '-').gsub(/[^\w-]/, '').gsub(/[-_]{2,}/, '-').gsub(/^[-_]/, '').gsub(/[-_]$/, ''). I stopped after realizing it's pretty darn complicated to get it perfect. Also, tr is faster than gsub, so it's better to do: tr(' ', '-') than gsub(' ', '-').
  • Ben Lee
    Ben Lee over 13 years
    @MattDiPasquale. There is a Ruby method called String#squeeze that collapses every run of two or more of the passed character into one. So you could write the above as slug = title.downcase.gsub('/[\s.\/_]/, ' ').squeeze(' ').strip.gsub(/[^\w-]/, '').tr(' ', '-'). This first turns all whitespace, ., /, and '_' to spaces. Then it squeezes spaces (all sequences of 2 or more spaces become a single one), then it strips spaces (removes leading and trailing spaces), then it converts the remaining spaces back to dashes.
  • Ben Lee
    Ben Lee over 13 years
    As for the speed of gsub versus tr, you're really just talking about processor cycles, which means nanoseconds. Unless you are creating hundreds of thousands of posts per second, that speed difference will make absolutely no difference. What you should take into account is personal style and clarity of code. In this case, I think tr may still be better, but for those two reasons, not because it's faster.
  • Ben Lee
    Ben Lee over 13 years
    Oh, and this is the answer for plain Ruby. If you are using Rails, you can just do slug = title.parameterize as Mark Thomas pointed out. Even if you are not using Rails, you can get the same support from the ActiveSupport gem by doing: require 'active_support'; $KCODE = 'UTF8';
  • Ben Lee
    Ben Lee over 13 years
    Oh and I just noticed that in my comment a few above, I put the things in the wrong order. It should have been: slug = title.downcase.gsub('/[\s.\/_]/, ' ').squeeze(' ').strip.tr(' ', '-').gsub(/[^\w-]/, '')
  • ma11hew28
    ma11hew28 over 13 years
    @Ben Lee, Thanks for your recommendation to use String#squeeze. However, the refactoring doesn't work exactly as I have it. E.g., it returns "hello---world" when title = "hello - world", but I want it to return "hello-world". Also, delete the single quote after the first opening parenthesis.
  • ma11hew28
    ma11hew28 over 13 years
    It's not Rails, but it looks like that gem will work with plain Ruby as well. Thanks! I like how it converts & to and, but I want it to convert / and . to -. It converts them to slash and dot, respectively. Also, in this case, to keep things simple, I'd rather not require extra gems. So, I updated my solution to slug = title.strip.downcase.gsub(/(&amp;|&)/, ' and ').gsub(/[\s\.\/\\]/, '-').gsub(/[^\w-]/, '').gsub(/[-_]{2,}/, '-').gsub(/^[-_]/, '').gsub(/[-_]$/, '').
  • Ben Lee
    Ben Lee over 13 years
    @Matt, I didn't actually test the code I posted, so there's probably a bug or two in there, but you get the idea, right? You can play around with gsub, squeeze, and tr to get the desired result. Or you can stick with what you already had that worked =).
  • Ben
    Ben over 12 years
    With this plugin, how do you call the method that slugs your string?
  • Niyaz
    Niyaz almost 12 years
    @BenLee I think you should put the gsub(/[^\w-]/, '') part before you strip the string. That way this expression will handle strings like ' ? ? hello ? ? world ? ?' correctly
  • Yarin
    Yarin over 10 years
    @JamieRumbelow: Your sample code had an error. You need to explicitly return value, because gsub! returns nil when no substitutions are performed (e.g. "test".to_slug would return nil). I fixed the code for you.
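Pulling the comment thread together, here is one possible plain-Ruby sketch that uses squeeze and tr as suggested, removes stray punctuation before stripping (Niyaz's point), and returns "hello-world" for "hello - world". The name slug_from is hypothetical. As in Ben Lee's comment, this version treats underscores as separators rather than keeping them.

```ruby
# Sketch combining suggestions from the comments; slug_from is a hypothetical name.
def slug_from(title)
  title
    .downcase
    .gsub(/[\s.\/\\_\-]+/, ' ')  # whitespace, ".", "/", "\", "_" and "-" all become spaces
    .gsub(/[^\w ]/, '')          # drop remaining punctuation before trimming
    .squeeze(' ')                # collapse runs of spaces into one
    .strip                       # trim leading and trailing spaces
    .tr(' ', '-')                # remaining spaces -> hyphens
end

puts slug_from("hello - world")              # prints "hello-world"
puts slug_from(" ? ? hello ? ? world ? ?")   # prints "hello-world"
puts slug_from("Rock & Roll / v2.0")         # prints "rock-roll-v2-0"
```

Because every separator is turned into a space first, a single squeeze and strip are enough to avoid doubled, leading, or trailing hyphens.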