Regular Expression - Extract subdomain & domain

102,332

Solution 1

Your regex doesn't seem correct. Try this regex:

/^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+)/img
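
A minimal sketch of how this pattern might be used in Node.js. The helper name `extractHost` is hypothetical, and the `g` and `m` flags are dropped on the assumption that each call handles a single URL string (with `g`, `String.prototype.match` returns only the whole matches and omits the capture groups).

```javascript
// Hypothetical helper around the pattern above; not part of the original answer.
// Only the `i` flag is kept because one URL is tested per call.
const HOST_RE = /^(?:https?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n?]+)/i;

function extractHost(url) {
  const m = HOST_RE.exec(url);
  // Group 1 is the host without protocol, "user@" prefix, "www.", port, path or query.
  return m ? m[1] : null;
}

console.log(extractHost('https://play.google.com/store/apps/details?id=com.skgames.trafficracer')); // play.google.com
console.log(extractHost('https://mail.google.com/mail/u/0/#inbox')); // mail.google.com
console.log(extractHost('http://www.example.com:8080/index.html')); // example.com
```

The result is read from capture group 1 rather than element 0, because element 0 also contains the protocol and any "www." prefix.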


Solution 2

The same RegExp as in anubhava's answer, only added support for protocol-relative URLs like //google.com:

/^(?:https?:)?(?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n]+)/im
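
As a rough illustration, the protocol-relative variant could be exercised like this. The name `extractHostRelative` is an assumption made for the example, and the `m` flag is dropped because a single URL is passed per call.

```javascript
// Hypothetical helper; the pattern is the one shown above without the `m` flag.
const HOST_RE_REL = /^(?:https?:)?(?:\/\/)?(?:[^@\n]+@)?(?:www\.)?([^:\/\n]+)/i;

function extractHostRelative(url) {
  const m = HOST_RE_REL.exec(url);
  return m ? m[1] : null;
}

console.log(extractHostRelative('//google.com/search'));      // google.com
console.log(extractHostRelative('https://www.google.com/'));  // google.com
// The character class here does not exclude "?", so a query string that
// directly follows the host is captured as well.
console.log(extractHostRelative('http://google.com?x=1'));    // google.com?x=1
```

If query strings directly after the host matter, adding `?` to the final character class (as in Solution 1) would be one way to handle them.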


Solution 3

Here's a solution ignoring everything before ://

.*\://?([^\/]+)

In case you want to ignore www. as well (the dot is escaped so that it matches only a literal "."):

.*\://(?:www\.)?([^\/]+)
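
A small sketch of this approach in JavaScript, using the "www." form with the dot escaped. The helper name `hostAfterScheme` is made up for the example.

```javascript
// Hypothetical helper: capture everything between "://" (plus optional "www.")
// and the next "/".
const AFTER_SCHEME_RE = /.*:\/\/(?:www\.)?([^\/]+)/;

function hostAfterScheme(url) {
  const m = AFTER_SCHEME_RE.exec(url);
  return m ? m[1] : null;
}

console.log(hostAfterScheme('ftp://www.example.org/pub/file.txt')); // example.org
console.log(hostAfterScheme('http://example.com:8080/x'));          // example.com:8080
console.log(hostAfterScheme('example.com/path'));                   // null
```

Because the capture stops only at "/", a port number stays in the result, and input without "://" does not match at all.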

Solution 4

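Your regular expression is almost right. You only need to remove the square brackets: [^...] is a negated character class, not a group, so it matches a single character that is not in the listed set. Without the brackets, the ^ becomes a start-of-string anchor. The final expression is: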

^(?:http:\/\/|www\.|https:\/\/)([^\/]+)
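
To see the difference, the bracketed pattern from the question and the expression above can be compared side by side. This is only an illustrative sketch; the variable names are chosen for the example.

```javascript
// Pattern from the question: the outer [...] is a negated character class that
// excludes the characters ( ? : h t p / | w . s ) from the first matched position.
const original = /[^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)/i;
// Same pattern with the square brackets removed.
const fixed = /^(?:http:\/\/|www\.|https:\/\/)([^\/]+)/i;

const url = 'http://play.google.co.in/sadfask/asdkfals?dk=10';

// "p" is in the excluded set, so the match can only start at "l".
console.log(url.match(original)[0]); // lay.google.co.in
// With the fixed pattern, read capture group 1, not element 0.
console.log(url.match(fixed)[1]);    // play.google.co.in
```

This also explains why "mplay" and "lplay" appeared to work: "m" and "l" are not in the excluded set, so the match starts at the right place. Note that the alternation consumes only one prefix, so for 'http://www.google.com/' group 1 would still be "www.google.com".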

Hope it's useful!

Author by

sunilkumarba

Updated on July 09, 2022

Comments

  • sunilkumarba
    sunilkumarba almost 2 years

    I'm trying to form a regular expression (javascript/node.js) which will extract the sub-domain & domain part from any given URL. This is what I ended up with:

    [^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)
    

    Right now, I'm only considering http and https as protocols and excluding the "www." portion from the subdomain+domain part of a URL. I checked the expression and it almost works. But here is the issue:

    Success

    'http://mplay.google.co.in/sadfask/asdkfals?dk=10'.match(/[^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)/i)
    
    'http://lplay.google.co.in/sadfask/asdkfals?dk=10'.match(/[^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)/i)
    

    Failure

    'http://play.google.co.in/sadfask/asdkfals?dk=10'.match(/[^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)/i)
    
    'http://tplay.google.co.in/sadfask/asdkfals?dk=10'.match(/[^(?:http:\/\/|www\.|https:\/\/)]([^\/]+)/i)
    

    I just use the first element of the result array. I can't understand why "play." and "tplay." don't work. Could anyone please help me with this?

    Do "/p" and "/t" have any special meaning for the regular expression evaluator?

    Is there any other way of extracting sub-domain & domain from any given URL using a regular expression?

    Edit -

    Example:

    https://play.google.com/store/apps/details?id=com.skgames.trafficracer => play.google.com

    https://mail.google.com/mail/u/0/#inbox => mail.google.com