Add http(s) to URL if it's not there?
Solution 1
Use a before filter to add it if it is not there:
before_validation :smart_add_url_protocol
protected
def smart_add_url_protocol
unless url[/\Ahttp:\/\//] || url[/\Ahttps:\/\//]
self.url = "http://#{url}"
end
end
Leave the validation you have in place; that way, if they make a typo, they can correct the protocol.
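As a rough illustration outside of Rails, the same check can be sketched as a plain Ruby method. The name add_url_protocol is hypothetical, and leaving blank values untouched is an assumption made here so that a presence validation can still fail on them:

```ruby
# Hypothetical helper, not part of the answer above.
def add_url_protocol(url)
  return url if url.nil? || url.strip.empty?   # keep blank values blank
  return url if url.match?(%r{\Ahttps?://}i)   # protocol already present
  "http://#{url}"
end

add_url_protocol("example.com")          # => "http://example.com"
add_url_protocol("https://example.com")  # => "https://example.com"
```

In a model, the callback body would then reduce to something like self.url = add_url_protocol(url).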
Solution 2
Don't do this with a regex; use URI.parse to pull it apart and then see if there is a scheme on the URL:
require 'uri'
u = URI.parse('/pancakes')
if u.scheme.nil?
# prepend http:// and try again
elsif %w[http https].include?(u.scheme)
# you're okay
else
# you've been given some other kind of
# URL and might want to complain about it
end
Using the URI library for this also makes it easy to clean up any stray nonsense (such as userinfo) that someone might try to put into a URL.
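One possible way to fill in those branches is sketched below. The method name normalize_web_url is hypothetical, and several choices are assumptions rather than part of the answer: leading slashes are dropped before prepending http:// (so protocol-relative input still parses with a host), URLs carrying userinfo are rejected rather than stripped, and anything that is not http(s) yields nil.

```ruby
require "uri"

# Hypothetical helper: returns an http(s) URL string, or nil when the input
# cannot be treated as one.
def normalize_web_url(raw)
  candidate = raw.to_s.strip
  return nil if candidate.empty?

  # No "scheme://" prefix: drop leading slashes (protocol-relative input
  # such as "//example.com") and assume http.
  unless candidate.match?(%r{\A[a-z][a-z0-9+.-]*://}i)
    candidate = "http://#{candidate.sub(%r{\A/+}, '')}"
  end

  uri = URI.parse(candidate)
  return nil unless %w[http https].include?(uri.scheme.to_s.downcase)
  return nil if uri.host.nil? || uri.host.empty?
  return nil if uri.userinfo # assumption: reject user:password@ instead of stripping it

  uri.to_s
rescue URI::InvalidURIError
  nil
end

normalize_web_url("example.com/pancakes")  # => "http://example.com/pancakes"
normalize_web_url("ftp://example.com")     # => nil
```

Note that a bare path such as /pancakes would be read as a host name by this sketch, so it only makes sense for fields that are meant to hold web addresses.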
Solution 3
The accepted answer is quite okay.
But if the field (url) is optional, it may raise an error such as undefined method + for nil class.
The following should resolve that:
def smart_add_url_protocol
if self.url && !url_protocol_present?
self.url = "http://#{self.url}"
end
end
def url_protocol_present?
self.url[/\Ahttp:\/\//] || self.url[/\Ahttps:\/\//]
end
Solution 4
Preface, justification and how it should be done
I hate it when people change a model in a before_validation hook. If someday, for some reason, models need to be persisted with save(validate: false), a filter that was supposed to always run on assigned fields does not get run. Sure, having invalid data is usually something you want to avoid, but there would be no need for such an option if it wasn't used. Another problem is that these modifications also take place every time you ask a model whether it is valid. The fact that simply asking if a model is valid may result in the model getting modified is unexpected, perhaps even unwanted. Therefore, if I had to choose a hook, I'd go for the before_save hook. However, that won't do it for me, since we provide preview views for our models, and the URIs in the preview view would be broken because the hook would never get called. Therefore, I decided it's best to separate the concept into a module or concern and provide a nice way to apply a "monkey patch" ensuring that changing the field's value always runs through a filter that adds a default protocol if it is missing.
The module
#app/models/helpers/uri_field.rb
module Helpers::URIField
def ensure_valid_protocol_in_uri(field, default_protocol = "http", protocols_matcher="https?")
alias_method "original_#{field}=", "#{field}="
define_method "#{field}=" do |new_uri|
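# Prepend the default protocol when a value is present but has none.
# The alternatives are grouped so that "https?|ftp" matches as a whole.
if new_uri.present? && new_uri !~ /\A(?:#{protocols_matcher}):\/\//
new_uri = "#{default_protocol}://#{new_uri}"
end
send("original_#{field}=", new_uri)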
end
end
end
In your model
extend Helpers::URIField
ensure_valid_protocol_in_uri :url
#Should you wish to default to https or support other protocols e.g. ftp, it is
#easy to extend this solution to cover those cases as well
#e.g. with something like this
#ensure_valid_protocol_in_uri :url, "https", "https?|ftp"
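To see the alias-and-redefine technique in isolation, here is a plain Ruby sketch that runs without Rails or Mongoid. DefaultProtocol, ensure_protocol and Bookmark are hypothetical names, attr_accessor stands in for the ORM-generated attribute methods, and the nil/empty check replaces ActiveSupport's present?:

```ruby
# Plain-Ruby stand-in for the module above (hypothetical names).
module DefaultProtocol
  def ensure_protocol(field, default_protocol = "http", protocols = "https?")
    pattern = /\A(?:#{protocols}):\/\//

    # Keep the existing writer reachable under a new name, then redefine it.
    alias_method "original_#{field}=", "#{field}="
    define_method("#{field}=") do |value|
      if value.is_a?(String) && !value.empty? && value !~ pattern
        value = "#{default_protocol}://#{value}"
      end
      send("original_#{field}=", value)
    end
  end
end

class Bookmark
  extend DefaultProtocol
  attr_accessor :url   # must exist before ensure_protocol aliases it
  ensure_protocol :url
end

bookmark = Bookmark.new
bookmark.url = "example.com"
bookmark.url # => "http://example.com"
```

Because the wrapping happens in the writer itself, the value is normalized as soon as it is assigned, regardless of whether validations or save callbacks ever run.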
As a concern
If, for some reason, you'd rather use the Rails Concern pattern, it is easy to convert the above module to a concern (it is used in exactly the same way, except that you use include Concerns::URIField):
#app/models/concerns/uri_field.rb
module Concerns::URIField
extend ActiveSupport::Concern
included do
def self.ensure_valid_protocol_in_uri(field, default_protocol = "http", protocols_matcher="https?")
alias_method "original_#{field}=", "#{field}="
define_method "#{field}=" do |new_uri|
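# Prepend the default protocol when a value is present but has none.
# The alternatives are grouped so that "https?|ftp" matches as a whole.
if new_uri.present? && new_uri !~ /\A(?:#{protocols_matcher}):\/\//
new_uri = "#{default_protocol}://#{new_uri}"
end
send("original_#{field}=", new_uri)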
end
end
end
end
P.S. The above approaches were tested with Rails 3 and Mongoid 2.
P.P.S. If you find this method redefinition and aliasing too magical, you could opt not to override the method and instead use the virtual field pattern, much like password (virtual, mass-assignable) and encrypted_password (persisted, not mass-assignable): use a sanitize_url (virtual, mass-assignable) and a url (persisted, not mass-assignable).
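A minimal plain Ruby sketch of that virtual field idea might look like the following. The Link class is hypothetical, attr_reader stands in for the persisted url column, and mass-assignment protection is left out because it depends on the ORM:

```ruby
class Link
  attr_reader :url # stands in for the persisted, non-mass-assignable field

  # Virtual writer: forms and mass assignment would target this attribute.
  def sanitize_url=(value)
    @url =
      if value.nil? || value.empty? || value.match?(%r{\Ahttps?://}i)
        value
      else
        "http://#{value}"
      end
  end

  # Virtual reader, so forms can display the stored value.
  def sanitize_url
    url
  end
end

link = Link.new
link.sanitize_url = "example.com"
link.url # => "http://example.com"
```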
Solution 5
Based on mu's answer, here's the code I'm using in my model. This runs whenever :link is assigned, without the need for model filters. The call to super is required to invoke the default attribute writer.
def link=(_link)
u=URI.parse(_link)
if (!u.scheme)
link = "http://" + _link
else
link = _link
end
super(link)
end
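The setter above raises if it is handed nil or a string that URI.parse cannot parse. A more defensive variant could look like this sketch, where Record is a hypothetical stand-in for the attribute writer that ActiveRecord would normally generate, and passing unparseable input through unchanged is an assumption that leaves rejection to the validations:

```ruby
require "uri"

# Stand-in for the ORM-generated attribute writer (hypothetical).
class Record
  attr_accessor :link
end

class Post < Record
  def link=(value)
    super(with_default_scheme(value))
  end

  private

  def with_default_scheme(value)
    return value if value.nil? || value.empty?
    URI.parse(value).scheme ? value : "http://#{value}"
  rescue URI::InvalidURIError
    value # leave unparseable input for the validations to reject
  end
end

post = Post.new
post.link = "example.com"
post.link # => "http://example.com"
```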
Admin
Updated on July 22, 2022
Comments
-
Admin almost 2 years
I'm using this regex in my model to validate a URL submitted by the user. I don't want to force the user to type the http part, but would like to add it myself if it's not there.
validates :url, :format => { :with => /^((http|https):\/\/)?[a-z0-9]+([-.]{1}[a-z0-9]+).[a-z]{2,5}(:[0-9]{1,5})?(\/.)?$/ix, :message => " is not valid" }
Any idea how I could do that? I have very little experience with validation and regex.
-
d11wtq over 12 years
We use Addressable for exactly this. It's a little weird if there's no scheme in the URL, however, since it considers the host to be the path.
-
Tony Beninate about 12 years
unless self.url[/^http?s:\/\//]
wasn't quite working for me. I had to do
unless self.url[/^http:\/\//] || self.url[/^https:\/\//]
-
koffeinfrei over 10 years
This shouldn't be a validation, but merely a before_save hook. The purpose of validations is to invalidate the instance (preventing it from saving), which is not the case here.
-
Douglas F Shearer over 10 years
You're getting confused. This isn't a validation. It's a method that's run before the validations are run.
-
mu is too short over 10 years
@d11wtq I just switched to Addressable to get sane and consistent UTF-8 support in URLs.
-
Timo about 10 years
I must advise against this dirty trick. It "works" with simple paths like /pancakes, but why would anybody want to enforce a protocol on paths? However, if we're talking about "web addresses" as normal human beings understand and write them, then URI will not parse them correctly. That is because most people leave the double forward slashes, which indicate the beginning of the authority definition, out of "web addresses". In fact, I believe many think they belong to the protocol definition, but they do not. To be continued... (sorry for the super long double post)
-
Timo about 10 years
When "URLs" like these are then parsed using some specification-abiding component like Ruby's URI, the result is that there is no host in the URI object; the whole input is inferred to be a mere path. This is unlikely to be what is intended, and it gives the false impression that you have a properly parsed URI object at your disposal, but if someone were to modify any of its components, the results would be surprising. To properly parse a "web address" as a URI, you should always first ensure that the forward slashes are in place.
-
mu is too short about 10 years
@TimoLehto: I don't get your point. Do you have an example where this fails?
-
Timo about 10 years
@muistooshort Well, your example lacks some details, but I can see two obvious implementations for the "# prepend http:// and try again" part. The more natural solution:
u.scheme = "http"; u.to_s == "http:www.example.com" #Ooops, what happened?
Or you prepend it to the original string and parse it again:
u = "http://#{orig_string}"
This, however, fails if anybody gives you a protocol-relative URI, as you'll then end up with "http:////www.example.com" and, after reparsing,
u.host == nil && u.path == "//www.example.com"
which is not right, and that's why I consider this quite dangerous.
-
mu is too short about 10 years
@TimoLehto: Fair enough, but simply replacing "prepend http:// and try again" with "remove leading slashes, prepend http://, and try again" would handle that. One big advantage of going through URI over simple regex wrangling is that it is easier to strip out almost-always-nefarious things such as userinfo, normalize case, ... And modifying the scheme using URI doesn't do anything useful; the library really shouldn't offer a mutator for that property, but that's a separate issue.
-
Timo about 10 years
@muistooshort Yes, I'm not saying your solution is all horrible or anything. It just looks like a much better solution than it actually is, and that's why I consider it dangerous and deceptive. I don't understand what you mean when you say "modifying scheme doesn't do anything useful". Why shouldn't it offer a mutator for that? It works like a charm so long as you feed it proper URLs:
u = URI.parse("http://www.example.com/"); u.scheme = "https"; u.to_s == "https://www.example.com/"; u.path = "/index.html"; u.to_s == "https://www.example.com/index.html"
-
mu is too short about 10 years
@TimoLehto:
u = URI.parse('/pancakes'); u.scheme = 'http'; u.class
vs
URI.parse('http://example.com/').class
The design of the URI library conflates scheme and class, but changing the scheme cannot change the class.
-
Alter Lagos over 9 years
I agree with @koffeinfrei in the sense that this should be applied after the field has already been validated, not before. For instance, with that solution, something like
validates :url, :presence => true
will never fail, because the field will always have at least the http:// value.
-
Nuno Silva over 9 years
Does not work with Rails 4 due to attribute methods not being defined until api.rubyonrails.org/classes/ActiveModel/… happens. Related: stackoverflow.com/questions/16727976/…
-
Earl Jenkins about 5 years
As an aside, I'd like to point out that interpolation is always more performant in Ruby than concatenation. Thus, it should be
link = "http://#{_link}"
-
ryaz about 4 years
\Ahttp(s)?:\/\/
one regexp to catch http and https