Ruby: How to escape url with square brackets [ and ]?
Solution 1
URI.encode doesn't escape brackets because they aren't special: they have no special meaning in the path part of a URI, so they don't actually need escaping.
If you want to escape chars other than just the "unsafe" ones, pass a second arg to the encode method. That arg should be a regex matching, or a string containing, every char you want encoded (including chars the function would otherwise already match!).
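As a rough sketch of that second argument: URI.encode and URI.escape were removed in Ruby 3.0, so the example below assumes the same behavior can be reached through URI::RFC2396_Parser#escape, which takes the same optional argument. The helper name escape_chars and the example.com URL are made up for illustration.

```ruby
require "uri"

# Hypothetical helper: `unsafe` is either a Regexp or a String listing
# the characters to percent-encode, as described above.
def escape_chars(url, unsafe)
  URI::RFC2396_Parser.new.escape(url, unsafe)
end

# A String argument encodes only the characters it lists, so the space
# below is left alone even though the default pattern would encode it.
puts escape_chars("http://example.com/a b[c]", "[]")
# prints http://example.com/a b%5Bc%5D

# A Regexp must match every character you want encoded. This one encodes
# anything outside the listed set, which covers the space and the brackets.
puts escape_chars("http://example.com/a b[c]", /[^\-_.!~*'()a-zA-Z\d;\/?:@&%=+$,]/)
# prints http://example.com/a%20b%5Bc%5D
```

Because the pattern replaces the default rather than extending it, the regex form is usually the safer choice when you still want spaces and similar characters encoded.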
Solution 2
You can escape [ with %5B and ] with %5D.
Your URL will be:
URL.gsub("[","%5B").gsub("]","%5D")
I don't like this solution, but it works.
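The same replacement can be done in a single pass by giving gsub a hash of replacements. This is only a sketch; the helper name escape_brackets is made up, and it rewrites every bracket in the string, including any in an IPv6 host.

```ruby
# Hypothetical helper: percent-encode every [ and ] in one gsub call.
# When the second argument is a Hash, gsub looks up each match in it.
def escape_brackets(url)
  url.gsub(/[\[\]]/, { "[" => "%5B", "]" => "%5D" })
end

puts escape_brackets("http://example.com/path-[nsfw]")
# prints http://example.com/path-%5Bnsfw%5D
```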
Solution 3
If using a third-party gem is an option, try addressable.
require "addressable/uri"
url = Addressable::URI.parse("http://[::1]/path[]").normalize!.to_s
#=> "http://[::1]/path%5B%5D"
Note that the normalize! method will not only escape invalid characters but also case-fold the hostname, unescape characters that were escaped unnecessarily, and so on:
uri = Addressable::URI.parse("http://Example.ORG/path[]?query[]=%2F").normalize!
url = uri.to_s #=> "http://example.org/path%5B%5D?query%5B%5D=/"
So, if you just want to normalize the path part, do as follows:
uri = Addressable::URI.parse("http://Example.ORG/path[]?query[]=%2F")
uri.path = uri.normalized_path
url = uri.to_s #=> "http://Example.ORG/path%5B%5D?query[]=%2F"
Solution 4
IPv6 literal addresses are written inside square brackets in the host part, so URLs like this are possible:
http://[1080:0:0:0:8:800:200C:417A]/index.html
Because of this, we should escape [ and ] only after the host part of the URL:
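if url =~ %r{\[|\]}
  # Split into "http:", the host, and everything after the host.
  protocol, host, path = url.split(%r{/+}, 3)
  # path is nil when nothing follows the host, hence to_s.
  path = path.to_s.gsub('[', '%5B').gsub(']', '%5D') # Or URI.escape(path, /[^\-_.!~*'()a-zA-Z\d;\/?:@&%=+$,]/)
  url = "#{protocol}//#{host}/#{path}"
end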
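Wrapped in a method, the idea can be tried against both kinds of URL. This is a minimal sketch, assuming an absolute URL of the form scheme://host/path; the name escape_brackets_after_host is invented for the example.

```ruby
# Hypothetical helper: percent-encode [ and ] only after the host part,
# so an IPv6 literal such as [::1] in the host is left untouched.
def escape_brackets_after_host(url)
  return url unless url =~ /[\[\]]/

  # "http://host/a/b" splits into "http:", "host", "a/b".
  protocol, host, path = url.split(%r{/+}, 3)
  return url if path.nil? # nothing after the host, nothing to escape

  escaped = path.gsub("[", "%5B").gsub("]", "%5D")
  "#{protocol}//#{host}/#{escaped}"
end

puts escape_brackets_after_host("http://[1080:0:0:0:8:800:200C:417A]/index[1].html")
# prints http://[1080:0:0:0:8:800:200C:417A]/index%5B1%5D.html
```

Relative URLs or URLs without a scheme would be split incorrectly, so they would need separate handling.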
foobar
Updated on June 05, 2022
This url:
http://gawker.com/5953728/if-alison-brie-and-gillian-jacobs-pin-up-special-doesnt-get-community-back-on-the-air-nothing-will-[nsfw]
should be:
http://gawker.com/5953728/if-alison-brie-and-gillian-jacobs-pin-up-special-doesnt-get-community-back-on-the-air-nothing-will-%5Bnsfw%5D
But when I pass the first one into URI.encode, it doesn't escape the square brackets. I also tried CGI.escape, but that escapes all the '/' as well.
What should I use to escape URLs properly? Why doesn't URI.encode escape square brackets?