Finding URLs in a text string with PHP and regex?
$pattern = '#(www\.|https?://)?[a-z0-9]+\.[a-z0-9]{2,4}\S*#i';
preg_match_all($pattern, $str, $matches, PREG_PATTERN_ORDER);
Sisir
Updated on June 04, 2022
Comments
-
Sisir almost 2 years
I know the question title looks very repetitive, but I did not find a solution here that covers my case.
I need to find URLs in a text string:
$pattern = '`.*?((http|https)://[\w#$&+,\/:;=?@.-]+)[^\w#$&+,\/:;=?@.-]*?`i';
if (preg_match_all($pattern, $url_string, $matches)) {
    print_r($matches[1]);
}
Using this pattern I was able to find URLs that start with http:// and https://, which is fine. But I have user input where people enter URLs like www.domain.com or even domain.com.
So I need to preprocess the string first, prefixing www.domain.com and domain.com with a common protocol, http://. Or do I need to come up with a better pattern? I am not good with regex and don't know what to do.
My idea is to first find the URLs with http:// and https:// and put them in an array, then replace those URLs with a space (" ") in the text string, and then run other patterns on what remains. But I am not sure what pattern to use. I am using this:
$url_string = preg_replace($pattern, ' ', $url_string );
but that also removes any www.domain.com or domain.com URL that sits between two valid URLs with http:// or https://.
If you can help, that would be great.
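A possible explanation for the lost URLs, sketched under one assumption: that the character class in the pattern above is `[\w#$&+,\/:;=?@.-]`. The leading `.*?` is part of the overall match, so `preg_replace` replaces everything from the end of the previous match up to and including the next URL with a scheme. The input string and variable names below are hypothetical.

```php
<?php
// Assumption: the character class in the question's pattern is [\w#$&+,\/:;=?@.-].
$questionPattern = '`.*?((http|https)://[\w#$&+,\/:;=?@.-]+)[^\w#$&+,\/:;=?@.-]*?`i';

// Same URL part, without the leading ".*?" and the trailing negated class.
$urlOnlyPattern = '`(http|https)://[\w#$&+,\/:;=?@.-]+`i';

// Hypothetical input: a bare domain sitting between two URLs that have a scheme.
$text = 'http://a.com www.b.com https://c.com tail';

$withPrefix = preg_replace($questionPattern, ' ', $text);
$urlOnly    = preg_replace($urlOnlyPattern, ' ', $text);

// The second match of $questionPattern starts right after "http://a.com",
// so " www.b.com " is consumed by ".*?" and replaced along with the URL.
var_dump(strpos($withPrefix, 'www.b.com') !== false); // bool(false)

// Without ".*?" only the URLs themselves are replaced.
var_dump(strpos($urlOnly, 'www.b.com') !== false);    // bool(true)
```

If the two-pass idea is kept, replacing only the URL part (the second pattern) should leave the text between the URLs available for a later pass.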
To make things clearer:
I need a pattern or some other method that finds all URLs in a text string. Examples of the URLs are:
- domain.com
- www.domain.com
- http://www.domain.com
- http://domain.com
- https://www.domain.com
- https://domain.com
Thanks!
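A minimal sketch of the "add http:// where it is missing" idea, using the pattern shown at the top of this page. The sample string and the `$urls` variable are hypothetical, and the pattern is deliberately loose: it would also pick up tokens such as file names (`notes.txt`) and any punctuation glued to the end of a URL.

```php
<?php
// Pattern from the answer: optional "www." or scheme, a host label, a dot,
// a 2-4 character suffix, then any trailing non-whitespace characters.
$pattern = '#(www\.|https?://)?[a-z0-9]+\.[a-z0-9]{2,4}\S*#i';

// Hypothetical input covering the six forms listed in the question.
$str = 'Try domain.com or www.domain.com or http://www.domain.com '
     . 'or http://domain.com or https://www.domain.com or https://domain.com today';

preg_match_all($pattern, $str, $matches, PREG_PATTERN_ORDER);

// Prepend "http://" only to matches that do not already carry a scheme.
$urls = array_map(
    function ($url) {
        return preg_match('#^https?://#i', $url) ? $url : 'http://' . $url;
    },
    $matches[0]
);

print_r($urls);
```

With this input, all six forms are found in a single pass, so the separate replace-then-search step may not be needed.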
-
Sisir about 13 years
Thanks! It almost worked! I still need a pattern that also matches domain.com
-
Jonathan Kuhn about 13 years
@Sisir replace the {1} with a ? to make the http:// or www optional.
-
Shane almost 11 years
This does not work for me. I receive empty results.
$pattern = '#(www\.|https?:\/\/){?}[a-zA-Z0-9]{2,254}\.[a-zA-Z0-9]{2,4}(\S*)#i';
$count = preg_match_all($pattern, 'http://www.Imaurl.com', $matches, PREG_PATTERN_ORDER);
And there is no error from preg_last_error()
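A likely cause of the empty result, as a small sketch: `{?}` is not a quantifier. PCRE treats a `{` that does not start a valid quantifier as a literal character, so `{?}` reads as "an optional `{` followed by a required `}`". The input contains no `}`, so nothing matches, and since the pattern is still valid there is no error to report. The variable names below are hypothetical.

```php
<?php
$input = 'http://www.Imaurl.com';

// As posted in the comment above: "{?}" demands a literal "}" in the input.
$broken = '#(www\.|https?:\/\/){?}[a-zA-Z0-9]{2,254}\.[a-zA-Z0-9]{2,4}(\S*)#i';
$brokenCount = preg_match_all($broken, $input, $brokenMatches, PREG_PATTERN_ORDER);

// With a bare "?" the prefix group itself becomes optional.
$fixed = '#(www\.|https?:\/\/)?[a-zA-Z0-9]{2,254}\.[a-zA-Z0-9]{2,4}(\S*)#i';
$fixedCount = preg_match_all($fixed, $input, $fixedMatches, PREG_PATTERN_ORDER);

var_dump($brokenCount);      // int(0)
var_dump($fixedCount);       // int(1)
var_dump($fixedMatches[0]);  // one element: "http://www.Imaurl.com"
```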
-
chmac over 10 years
Copying and pasting this into an interactive PHP shell, I also get blank results. Also, the {2,254} limit doesn't support domains like t.co, which are gaining popularity these days. I tried to edit the answer, but an edit must be more than 6 characters, apparently :-( Oh, and I don't think this will match domains like me-too.com.
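One way to cover the two cases raised here (single-character labels such as t.co and hyphenated labels such as me-too.com) might look like the sketch below. The pattern is hypothetical, not the one from the answer: each host label must start and end with a letter or digit and may contain hyphens in between, and the final label is two or more letters. It is still a loose matcher and will also accept things like file names or trailing punctuation.

```php
<?php
// Hypothetical pattern: optional scheme, one or more dot-terminated labels
// (hyphens allowed inside a label), a 2+ letter final label, then the rest
// of the non-whitespace run (path, query string, and so on).
$pattern = '#(?:https?://)?(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\S*#i';

// Hypothetical sample text.
$text = 'See t.co and me-too.com or https://www.domain.com/path?x=1 now';

preg_match_all($pattern, $text, $matches, PREG_PATTERN_ORDER);

print_r($matches[0]);
```

For this sample the matches should be `t.co`, `me-too.com`, and `https://www.domain.com/path?x=1`.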