Get URL from a text
Solution 1
Try this regex; it also captures the query string:
(http|ftp|https)://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?
You can test it on gskinner
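As a minimal sketch of how the pattern above could be applied from C#, the snippet below copies the regex unchanged into a verbatim string and collects every match. The class name `Solution1Demo` and the method name `ExtractUrls` are illustrative assumptions, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Solution1Demo
{
    // The pattern from Solution 1, copied unchanged into a verbatim string.
    private const string Pattern =
        @"(http|ftp|https)://([\w+?\.\w+])+([a-zA-Z0-9\~\!\@\#\$\%\^\&\*\(\)_\-\=\+\\\/\?\.\:\;\'\,]*)?";

    // Hypothetical helper: returns every substring of the message that matches the pattern.
    public static List<string> ExtractUrls(string message)
    {
        var urls = new List<string>();
        foreach (Match match in Regex.Matches(message, Pattern))
        {
            urls.Add(match.Value);
        }
        return urls;
    }

    public static void Main()
    {
        foreach (string url in ExtractUrls("Hey! try this http://www.test.com/test.aspx?id=53"))
        {
            Console.WriteLine(url); // http://www.test.com/test.aspx?id=53
        }
    }
}
```

Note that the pattern only recognises links that start with a scheme (`http`, `https` or `ftp`), so a bare `www.` link is not matched.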
Solution 2
// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
public List<string> GetLinks(string message)
{
    List<string> list = new List<string>();
    // The dot in "www\." is escaped so that it matches a literal dot only.
    Regex urlRx = new Regex(@"((https?|ftp|file)\://|www\.)[A-Za-z0-9\.\-]+(/[A-Za-z0-9\?\&\=;\+!'\(\)\*\-\._~%]*)*", RegexOptions.IgnoreCase);
    MatchCollection matches = urlRx.Matches(message);
    foreach (Match match in matches)
    {
        list.Add(match.Value);
    }
    return list;
}
var list = GetLinks("Hey yo check this: http://www.google.com/?q=stackoverflow and this: http://www.mysite.com/?id=10&author=me");
It will find the following types of links:
http:// ...
https:// ...
ftp:// ...
file:// ...
www. ...
Solution 3
If you use these URLs later in your code (extracting a part, the query string, etc.), consider using the Uri class combined with the HttpUtility helper.
Uri uri;
string strUrl = "http://www.test.com/test.aspx?id=53";
// Use UriKind.Absolute: PathAndQuery throws on a relative Uri.
bool isUri = Uri.TryCreate(strUrl, UriKind.Absolute, out uri);
if (isUri) {
    Console.WriteLine(uri.PathAndQuery);
} else {
    Console.WriteLine("invalid");
}
It can help you with these operations.
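To illustrate the kind of operations meant here, the following sketch pulls individual parts out of a URL with `Uri` and reads one query-string parameter with `HttpUtility.ParseQueryString`. The class name `UriPartsDemo` and the helper `GetQueryValue` are assumptions made for the example; on .NET Framework, `HttpUtility` requires a reference to System.Web.

```csharp
using System;
using System.Collections.Specialized;
using System.Web; // HttpUtility

public static class UriPartsDemo
{
    // Hypothetical helper: returns the value of one query-string parameter,
    // or null if the URL is not absolute or the parameter is missing.
    public static string GetQueryValue(string url, string key)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
        {
            return null;
        }
        // ParseQueryString accepts the query with its leading '?'.
        NameValueCollection query = HttpUtility.ParseQueryString(uri.Query);
        return query[key];
    }

    public static void Main()
    {
        var uri = new Uri("http://www.test.com/test.aspx?id=53");
        Console.WriteLine(uri.Host);         // www.test.com
        Console.WriteLine(uri.AbsolutePath); // /test.aspx
        Console.WriteLine(uri.Query);        // ?id=53
        Console.WriteLine(GetQueryValue("http://www.test.com/test.aspx?id=53", "id")); // 53
    }
}
```

Parsing the query string this way avoids writing a second regex just to split `id=53` into a name and a value.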
PrateekSaluja
Updated on July 28, 2022
Comments
-
PrateekSaluja almost 2 years
Possible Duplicate: regex for URL including query string
I have a text or message.
Hey! try this http://www.test.com/test.aspx?id=53
Our requirement is to extract the link from the text. We are using the following code:
List<string> list = new List<string>();
Regex urlRx = new Regex(@"(?<url>(http:|https:[/][/]|www.)([a-z]|[A-Z]|[0-9]|[/.]|[~])*)", RegexOptions.IgnoreCase);
MatchCollection matches = urlRx.Matches(message);
foreach (Match match in matches)
{
    list.Add(match.Value);
}
return list;
It returns a URL, but not the complete one. The output of the code is
http://www.test.com/test.aspx
But we need the complete URL, like
http://www.test.com/test.aspx?id=53
Please suggest how to resolve this issue. Thanks in advance.
-
Sam Greenhalgh over 12 years
Seems a little overly explicit. Wouldn't
(ftp|https?)://[^\s]+
work?
-
Amar Palsapure over 12 years
+1 @zapthedingbat This will also work
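
For comparison, the simpler pattern suggested in the comments can be tried with a sketch like the one below. The class name `SimplePatternDemo` is an assumption for the example, and the message is the one used in Solution 2.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SimplePatternDemo
{
    // Collects everything from a scheme (ftp, http, https) up to the next whitespace.
    public static List<string> GetLinks(string message)
    {
        var links = new List<string>();
        foreach (Match match in Regex.Matches(message, @"(ftp|https?)://[^\s]+"))
        {
            links.Add(match.Value);
        }
        return links;
    }

    public static void Main()
    {
        string message = "Hey yo check this: http://www.google.com/?q=stackoverflow and this: http://www.mysite.com/?id=10&author=me";
        foreach (string link in GetLinks(message))
        {
            Console.WriteLine(link);
        }
    }
}
```

Because `[^\s]+` consumes everything up to the next whitespace, punctuation that directly follows a link (a trailing full stop or closing bracket, for example) becomes part of the match, and links without a scheme, such as a bare `www.` address, are not found.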