Website/URL Validation Regex in JAVA


Solution 1

You need to make the (http://|https://) part of your regex optional:

^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$


Solution 2

You can use the Apache Commons Validator library (org.apache.commons.validator.routines.UrlValidator) to validate a URL:

String[] schemes = {"http", "https"};
UrlValidator urlValidator = new UrlValidator(schemes);

Then call:

 urlValidator.isValid(url); // url is the String to check

With this approach, no regex is needed.

Link: https://commons.apache.org/proper/commons-validator/apidocs/org/apache/commons/validator/routines/UrlValidator.html

Solution 3

If you use Java, I recommend this regex (I wrote it myself):

^(https?:\/\/)?(www\.)?([\w]+\.)+[\w]{2,63}\/?$
"^(https?:\\/\\/)?(www\\.)?([\\w]+\\.)+[\\w]{2,63}\\/?$" // as Java-String

to explain:

  • ^ = line start
  • (https?:\/\/)? = "http://" or "https://" may occur.
  • (www\.)? = "www." may occur.
  • ([\w]+\.)+ = a word (\w, i.e. [a-zA-Z0-9_]) followed by a dot has to occur one or more times. (Extend the character class here if you need special characters like ü, ä, ö or others in your URL, and remember to use IDN.toASCII(url) if you use special characters. If you need to know which characters are legal in general: https://kb.ucla.edu/articles/what-characters-can-go-into-a-valid-http-url)
  • [\w]{2,63} = a word (\w, i.e. [a-zA-Z0-9_]) of 2 to 63 characters has to occur exactly once. (A TLD (top-level domain, for example .com) cannot be shorter than 2 or longer than 63 characters.)
  • \/? = a "/" character may occur. (Some people or servers put a / at the end.)
  • $ = line end
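
To make the breakdown above concrete, here is a minimal sketch that compiles the pattern in Java and tries it on a few inputs. The class name SimpleUrlRegex, the method isValidUrl, and the sample URLs are assumptions made for illustration; they are not part of the original answer.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the class name, method name and samples are illustrative only.
final class SimpleUrlRegex {
    // Same pattern as in the answer. A forward slash needs no escaping in Java.
    private static final Pattern URL =
            Pattern.compile("^(https?://)?(www\\.)?([\\w]+\\.)+[\\w]{2,63}/?$");

    // Returns true when the whole input matches the pattern.
    static boolean isValidUrl(String candidate) {
        return candidate != null && URL.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "google.com",
            "www.google.com",
            "https://www.google.com/",
            "localhost",
            "https://www.google.com/123"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValidUrl(sample));
        }
    }
}
```

With this pattern the first three samples should be accepted. "localhost" is rejected because at least one dot is required, and the last sample is rejected because the pattern does not allow a path.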


If you extend it with special characters, it could look like this:

^(https?:\/\/)?(www\.)?([\w\Q$-_+!*'(),%\E]+\.)+[\w]{2,63}\/?$
"^(https?:\\/\\/)?(www\\.)?([\\w\\Q$-_+!*'(),%\\E]+\\.)+[\\w]{2,63}\\/?$" // as Java-String
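
One visible effect of the extended character class is that hyphenated host names, including the Punycode form produced by java.net.IDN.toASCII, can match. The sketch below is only an illustration: the class name ExtendedUrlRegex and the sample hosts are assumptions, and it relies on Java treating everything between \Q and \E as literal characters, also inside a character class.

```java
import java.net.IDN;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample hosts are illustrative only.
final class ExtendedUrlRegex {
    // Basic pattern: labels may only contain \w characters.
    static final Pattern BASIC =
            Pattern.compile("^(https?://)?(www\\.)?([\\w]+\\.)+[\\w]{2,63}/?$");

    // Extended pattern: labels may also contain the quoted special characters.
    static final Pattern EXTENDED =
            Pattern.compile("^(https?://)?(www\\.)?([\\w\\Q$-_+!*'(),%\\E]+\\.)+[\\w]{2,63}/?$");

    // Converts an internationalized host name to ASCII before matching.
    static boolean matchesAfterIdn(String host) {
        String ascii = IDN.toASCII(host);
        return EXTENDED.matcher(ascii).matches();
    }

    public static void main(String[] args) {
        String hyphenated = "my-site.com";
        System.out.println("basic:    " + BASIC.matcher(hyphenated).matches());
        System.out.println("extended: " + EXTENDED.matcher(hyphenated).matches());
        // "m\u00fcller.de" is "müller.de" written with a Unicode escape.
        System.out.println("idn:      " + matchesAfterIdn("m\u00fcller.de"));
    }
}
```

The hyphen matters here because the ASCII form of an internationalized name contains hyphens, which \w alone does not cover.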

The answer by Avinash Raj is not fully correct:

^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$

The dots are not escaped, which means each of them matches any character. My version is also simpler, and I have never heard of a domain like "test..com" (which that regex actually matches).
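
The difference is easy to check in Java. The following sketch compares the unescaped-dot pattern with the escaped one on two inputs; the class name DotEscapeCheck and the sample strings are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical comparison class; names and samples are illustrative only.
final class DotEscapeCheck {
    // Pattern with unescaped dots: each "." matches any single character.
    static final Pattern LOOSE = Pattern.compile(
            "^(http://|https://)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$");

    // Pattern with escaped dots: "\\." only matches a literal dot.
    static final Pattern STRICT = Pattern.compile(
            "^(https?://)?(www\\.)?([\\w]+\\.)+[\\w]{2,63}/?$");

    public static void main(String[] args) {
        for (String sample : new String[] {"test..com", "google_com", "google.com"}) {
            System.out.println(sample
                    + " loose=" + LOOSE.matcher(sample).matches()
                    + " strict=" + STRICT.matcher(sample).matches());
        }
    }
}
```

Both patterns accept "google.com", but only the loose one accepts "test..com" and "google_com", because its dots stand in for arbitrary characters.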

Demo: https://regex101.com/r/vM7wT6/279


Edit: Since some people needed a regex that also matches server directories, I wrote this:

^(https?:\/\/)?([\w\Q$-_+!*'(),%\E]+\.)+(\w{2,63})(:\d{1,4})?([\w\Q/$-_+!*'(),%\E]+\.?[\w])*\/?$

This may not be the best one, since I didn't spend much time on it, but maybe it helps someone. You can see how it works here: https://regex101.com/r/vM7wT6/700 It also matches URLs like "hello.to/test/whatever.cgi".
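
As a rough Java sketch of the directory-aware pattern, the class below runs it against a few URLs with and without paths. The class name PathUrlRegex, the method name, and all samples other than "hello.to/test/whatever.cgi" are assumptions for illustration; the sketch also assumes that Java treats the \Q...\E sections inside the character classes as literal characters.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and most samples are illustrative only.
final class PathUrlRegex {
    // Directory-aware pattern from the answer, written as a Java string.
    static final Pattern URL_WITH_PATH = Pattern.compile(
            "^(https?://)?([\\w\\Q$-_+!*'(),%\\E]+\\.)+(\\w{2,63})(:\\d{1,4})?"
            + "([\\w\\Q/$-_+!*'(),%\\E]+\\.?[\\w])*/?$");

    // Returns true when the whole input matches the pattern.
    static boolean matches(String candidate) {
        return URL_WITH_PATH.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "hello.to/test/whatever.cgi",
            "https://www.google.com/123",
            "example.com:8080/index.html",
            "https://www.google.com?key1=value1"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + matches(sample));
        }
    }
}
```

The first three samples should match. The last one should not, since "?" and "=" are not in either character class, so query strings would need a further extension.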

Solution 4

A Java-compatible version of @Avinash's answer would be:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Pattern to check if this is a valid URL address
Pattern p = Pattern.compile("^(http://|https://)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$");
// urlAddress is the String to validate
Matcher m = p.matcher(urlAddress);
boolean matches = m.matches();

Solution 5

pattern="w{3}\.[a-z]+\.?[a-z]{2,3}(|\.[a-z]{2,3})"

This only accepts addresses such as www.google.com and www.google.co.in.
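
A minimal Java sketch of this pattern is shown below. It assumes the pattern is meant to match the whole input, so it uses Matcher.matches(); the class name WwwOnlyRegex and the rejected samples are illustrative assumptions.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the rejected samples are illustrative only.
final class WwwOnlyRegex {
    // Pattern from the answer, with backslashes doubled for a Java string.
    static final Pattern WWW_ONLY =
            Pattern.compile("w{3}\\.[a-z]+\\.?[a-z]{2,3}(|\\.[a-z]{2,3})");

    // matches() requires the whole input to fit the pattern.
    static boolean matches(String candidate) {
        return WWW_ONLY.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "www.google.com",
            "www.google.co.in",
            "google.com",
            "http://www.google.com"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + matches(sample));
        }
    }
}
```

Addresses without the "www." prefix or with a scheme in front are rejected, so this pattern is narrower than what the question asks for.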

Author: Hao Ting

Updated on August 01, 2022

Comments

  • Hao Ting
    Hao Ting almost 2 years

    I need a regex string that matches URLs starting with "http://", "https://", "www.", or "google.com".

    The code I tried is:

    //Pattern to check if this is a valid URL address
        Pattern p = Pattern.compile("(http://|https://)(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?");
        Matcher m;
        m=p.matcher(urlAddress);
    

    But this code can only match URLs such as "http://www.google.com".

    I know this may be a duplicate question, but I have tried all of the regexes provided and none of them suits my requirement. Will someone please help me? Thank you.

  • Avinash Raj
    Avinash Raj almost 10 years
    Even simpler: ^(https?:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}.?([a-z]+)?$
  • Ananda
    Ananda about 8 years
    The correct one is ^(http:\/\/|https:\/\/)?(www.)?([a-zA-Z0-9]+).[a-zA-Z0-9]*.[a-z]{3}\.([a-z]+)?$
  • Chargnn
    Chargnn over 6 years
    You might need a regex to avoid an exception if someone tries to enter "http:\\" or "http:/"
  • Udit Kumawat
    Udit Kumawat about 6 years
    This validator doesn't allow underscores in host names.
  • Akash Tomar
    Akash Tomar about 5 years
    This regex does not accept slash eg. https://www.google.com/123. It also does not accept multiple key value pairs, Eg: https://www.google.com?key1=value1&&key2=value2.