How to URL encode a string in Ruby


Solution 1

require 'cgi'

str = "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a".force_encoding('ASCII-8BIT')
puts CGI.escape(str)


=> "%124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A"
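The ArgumentError in the question comes from a string that is tagged as UTF-8 while holding bytes that are not valid UTF-8. Here is a small sketch of how that can be checked before escaping. The variable names are illustrative and not part of the original answer.

```ruby
# Same bytes as in the question, explicitly tagged as UTF-8 so the sketch
# does not depend on the source file's encoding.
raw = (+"\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a")
      .force_encoding(Encoding::UTF_8)

# Tagged as UTF-8, the byte sequence is not valid text.
utf8_ok = raw.valid_encoding?

# String#b returns a copy tagged as binary (ASCII-8BIT); the bytes are unchanged.
binary = raw.b
binary_ok = binary.valid_encoding?

puts utf8_ok    # false
puts binary_ok  # true
```

Retagging with force_encoding (or String#b) changes only the encoding label, not the bytes.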

Solution 2

Nowadays, you should use ERB::Util.url_encode or CGI.escape. The primary difference between them is their handling of spaces:

>> ERB::Util.url_encode("foo/bar? baz&")
=> "foo%2Fbar%3F%20baz%26"

>> CGI.escape("foo/bar? baz&")
=> "foo%2Fbar%3F+baz%26"

CGI.escape follows the CGI/HTML forms spec and gives you an application/x-www-form-urlencoded string, which requires spaces be escaped to +, whereas ERB::Util.url_encode follows RFC 3986, which requires them to be encoded as %20.

See "https://stackoverflow.com/questions/2824126/whats-the-difference-between-uri-escape-and-cgi-escape/13059657#13059657" for more discussion.
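To see the two space conventions side by side using only the standard library, here is a minimal sketch. It uses URI.encode_www_form_component as a stand-in for the form-encoded style; that choice is an assumption of this example and is not part of the answer above.

```ruby
require 'erb'
require 'uri'

input = "foo/bar? baz&"

percent_style = ERB::Util.url_encode(input)           # space becomes %20
form_style    = URI.encode_www_form_component(input)  # space becomes +

puts percent_style  # foo%2Fbar%3F%20baz%26
puts form_style     # foo%2Fbar%3F+baz%26

# The form decoder turns + back into a space.
decoded = URI.decode_www_form_component(form_style)
puts decoded        # foo/bar? baz&
```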

Solution 3

str = "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a"
require 'cgi'
CGI.escape(str)
# => "%124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A"

Taken from @J-Rou's comment
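For readers who want to see what the escaping does at the byte level, here is a rough sketch. The helper percent_encode_bytes is hypothetical, and it treats only the RFC 3986 unreserved characters as safe, which is an assumption; the safe set of CGI.escape may differ slightly.

```ruby
# Bytes that are emitted unchanged: letters, digits, and - . _ ~
UNRESERVED_BYTES = (('A'..'Z').to_a + ('a'..'z').to_a + ('0'..'9').to_a + %w[- . _ ~])
                   .map(&:ord).freeze

# Hypothetical helper: percent-encode every byte outside the unreserved set.
# It works on raw bytes, so the string's encoding tag does not matter.
def percent_encode_bytes(str)
  str.each_byte.map do |byte|
    UNRESERVED_BYTES.include?(byte) ? byte.chr : format('%%%02X', byte)
  end.join
end

bytes = "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a".b
encoded = percent_encode_bytes(bytes)
puts encoded  # %124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A
```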

Solution 4

You can use the addressable gem for that:

require 'addressable/uri'
# Note: single quotes keep the backslashes literal, which is why they show up as %5C below.
string = '\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a'
Addressable::URI.encode_component(string, Addressable::URI::CharacterClasses::QUERY)
# => "%5Cx12%5Cx34%5Cx56%5Cx78%5Cx9a%5Cxbc%5Cxde%5Cxf1%5Cx23%5Cx45%5Cx67%5Cx89%5Cxab%5Cxcd%5Cxef%5Cx12%5Cx34%5Cx56%5Cx78%5Cx9a"

It uses a more modern format than CGI.escape. For example, it encodes a space as %20 rather than as a + sign. You can read more in "The application/x-www-form-urlencoded type" on Wikipedia.

>> CGI.escape('Hello, this is me')
=> "Hello%2C+this+is+me"
>> Addressable::URI.encode_component('Hello, this is me', Addressable::URI::CharacterClasses::QUERY)
=> "Hello,%20this%20is%20me"
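If adding a gem is not an option, a %20-style result can be approximated with the standard library alone. The helper name encode_with_percent20 is hypothetical, and the sketch uses URI.encode_www_form_component as the form-style encoder. Unlike the Addressable call above, it also escapes the comma.

```ruby
require 'uri'

# Hypothetical helper: form-encode, then rewrite the + that stands for a space.
# A literal + in the input has already become %2B, so the gsub cannot touch it.
def encode_with_percent20(str)
  URI.encode_www_form_component(str).gsub('+', '%20')
end

greeting = encode_with_percent20('Hello, this is me')
puts greeting  # Hello%2C%20this%20is%20me
```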

Solution 5

I was originally trying to escape special characters in a file name only, not on the path, from a full URL string.

ERB::Util.url_encode didn't work for my use:

helper.send(:url_encode, "http://example.com/?a=\11\15")
# => "http%3A%2F%2Fexample.com%2F%3Fa%3D%09%0D"

Based on two answers in "Why is URI.escape() marked as obsolete and where is this REGEXP::UNSAFE constant?", it looks like URI::RFC2396_Parser#escape is better than URI::Escape#escape. However, both behave the same for me (note that URI.escape was removed in Ruby 3.0, so only the parser form works there):

URI.escape("http://example.com/?a=\11\15")
# => "http://example.com/?a=%09%0D"
URI::Parser.new.escape("http://example.com/?a=\11\15")
# => "http://example.com/?a=%09%0D"
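For the original goal of escaping only the file name and leaving the rest of the URL alone, one possible sketch is to split on the last slash and encode just the final segment. The helper escape_file_name and the sample URL are hypothetical, and the sketch assumes the URL has no query string or fragment.

```ruby
require 'erb'

# Hypothetical helper: percent-encode only the part after the last "/".
# Assumes the URL carries no query string or fragment.
def escape_file_name(url)
  head, separator, file_name = url.rpartition('/')
  "#{head}#{separator}#{ERB::Util.url_encode(file_name)}"
end

escaped = escape_file_name("http://example.com/files/my report (final).pdf")
puts escaped  # http://example.com/files/my%20report%20%28final%29.pdf
```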

Author: HRÓÐÓLFR

Updated on April 26, 2020

Comments

  • HRÓÐÓLFR
    HRÓÐÓLFR about 4 years

    How do I URI::encode a string like:

    \x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a
    

    to get it in a format like:

    %124Vx%9A%BC%DE%F1%23Eg%89%AB%CD%EF%124Vx%9A
    

    as per RFC 1738?

    Here's what I tried:

    irb(main):123:0> URI::encode "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a"
    ArgumentError: invalid byte sequence in UTF-8
        from /usr/local/lib/ruby/1.9.1/uri/common.rb:219:in `gsub'
        from /usr/local/lib/ruby/1.9.1/uri/common.rb:219:in `escape'
        from /usr/local/lib/ruby/1.9.1/uri/common.rb:505:in `escape'
        from (irb):123
        from /usr/local/bin/irb:12:in `<main>'
    

    Also:

    irb(main):126:0> CGI::escape "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a"
    ArgumentError: invalid byte sequence in UTF-8
        from /usr/local/lib/ruby/1.9.1/cgi/util.rb:7:in `gsub'
        from /usr/local/lib/ruby/1.9.1/cgi/util.rb:7:in `escape'
        from (irb):126
        from /usr/local/bin/irb:12:in `<main>'
    

    I looked all over the internet and haven't found a way to do this, although I am almost positive that the other day I did this without any trouble at all.

  • mu is too short
    mu is too short almost 13 years
    force_encoding('binary') might be a more self-documenting choice.
  • J-Rou
    J-Rou almost 12 years
    They deprecated that method; use CGI.escape instead. -> http://www.ruby-forum.com/topic/207489#903709. You should also be able to use URI.encode_www_form or URI.encode_www_form_component, but I have never used those.
  • pje
    pje over 10 years
    No need to require 'open-uri' here. Did you mean require 'uri'?
  • Alexander.Iljushkin
    Alexander.Iljushkin over 8 years
    @J-Rou, CGI.escape escapes the whole string; it does not selectively escape query parameters. For instance, if you pass 'a=&!@&b=&$^' to CGI.escape, it will escape the whole thing, including the & query separators, so it should be used only on query values. I suggest using the addressable gem, which is smarter about working with URLs.
  • Raccoon
    Raccoon over 5 years
    You can also do this: CGI.escape('Hello, this is me').gsub("+", "%20") => "Hello%2C%20this%20is%20me" if you don't want to use any gems.
  • Tashows
    Tashows about 5 years
    I needed to access files on remote server. Encoding with CGI didn't work, but URI.encode did the work just fine.
  • cesartalves
    cesartalves almost 4 years
    If the receiving server is old, it might not respond well to CGI.escape. This is still a valid alternative.
  • akostadinov
    akostadinov about 3 years
    The only actual answer that I could find. Thank you.
  • Russell Fulton
    Russell Fulton about 3 years
    This whole issue is a mess! Thanks for shedding some real light on it. I wasted at least a day chasing my tail before finding this!